💻 Custom code evaluators in LangSmith

DATE:
AUTHOR: The LangChain Team

You can now write custom code evaluators and run them in the LangSmith UI!

Custom code evaluators let you score experiments against deterministic, code-defined criteria - such as checking that an output is valid JSON or exactly matches a reference answer. More advanced use cases are supported too, since you can import packages like numpy and pandas. Use these alongside LLM-as-a-Judge evaluators to test and evaluate your LLM applications.
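
To make this concrete, here is a minimal sketch of two such evaluators. It follows the LangSmith SDK's (run, example) evaluator convention; the exact interface the UI expects, and the output key names used here ("output"), are assumptions that depend on how your application names its outputs - see the docs linked below for the definitive reference.

```python
import json

def valid_json_evaluator(run, example):
    """Score 1 if the run's output parses as JSON, else 0."""
    # `run.outputs` holds the traced outputs; the "output" key name is an
    # assumption - it depends on how your chain names its outputs.
    output = (run.outputs or {}).get("output", "")
    try:
        json.loads(output)
        score = 1
    except (TypeError, ValueError):  # non-string input or invalid JSON
        score = 0
    # Returning a dict with "key" and "score" follows the LangSmith SDK
    # convention for custom evaluators.
    return {"key": "valid_json", "score": score}

def exact_match_evaluator(run, example):
    """Score 1 if the run's output exactly matches the reference output."""
    predicted = (run.outputs or {}).get("output")
    expected = (example.outputs or {}).get("output")
    return {"key": "exact_match", "score": int(predicted == expected)}
```

Because these checks are plain code rather than model calls, they are fast, free to run, and fully reproducible across experiments.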

Once defined, these custom evaluators can be run across datasets in LangSmith's Playground with no additional coding. This makes it easy for developers to set up evaluators in the UI and collaborate with other team members - such as prompt engineers or product managers - when iterating on and running experiments in the Playground.

Try it out in LangSmith today: smith.langchain.com

Learn more in the docs: https://docs.smith.langchain.com/how_to_guides/evaluation/bind_evaluator_to_dataset#custom-code-evaluators