DATE:
AUTHOR: The LangChain Team
LangSmith

๐Ÿ Pairwise Evaluations in LangSmith

For LLM use cases like text generation or chat (where there may not be a single "correct" answer), picking a preferred response with pairwise evaluation can be an effective approach.

LangSmith's pairwise evaluation lets you (1) define a custom pairwise LLM-as-judge evaluator with any desired criteria and (2) compare two LLM generations using this evaluator.
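To make the two steps concrete, here is a minimal, library-free sketch in Python. It does not use the LangSmith SDK: the names `pairwise_scores`, `conciseness_judge`, and `win_rates` are hypothetical, and the "judge" is a deterministic stand-in (it prefers the shorter answer) for what would normally be an LLM call with your own criteria.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical judge signature: (question, answer_a, answer_b) -> "A" or "B".
# In practice this would wrap an LLM prompt that encodes your criteria.
Judge = Callable[[str, str, str], str]


def pairwise_scores(
    question: str, answer_a: str, answer_b: str, judge: Judge
) -> Dict[str, int]:
    """Step 2: compare two generations; the preferred one scores 1, the other 0."""
    preferred = judge(question, answer_a, answer_b)
    if preferred not in ("A", "B"):
        raise ValueError(f"judge must return 'A' or 'B', got {preferred!r}")
    return {"A": int(preferred == "A"), "B": int(preferred == "B")}


def conciseness_judge(question: str, answer_a: str, answer_b: str) -> str:
    """Step 1: a custom criterion. Stand-in for an LLM: prefer the shorter answer."""
    return "A" if len(answer_a) <= len(answer_b) else "B"


def win_rates(rows: List[Tuple[str, str, str]], judge: Judge) -> Dict[str, float]:
    """Aggregate per-example preferences into a win rate for each variant."""
    if not rows:
        raise ValueError("need at least one example to compare")
    totals = {"A": 0, "B": 0}
    for question, answer_a, answer_b in rows:
        scores = pairwise_scores(question, answer_a, answer_b, judge)
        totals["A"] += scores["A"]
        totals["B"] += scores["B"]
    return {name: wins / len(rows) for name, wins in totals.items()}


# Illustrative data only: (question, output of variant A, output of variant B).
rows = [
    (
        "Summarize pairwise evaluation.",
        "It compares two outputs and picks one.",
        "It is a method in which two candidate outputs are compared and one is picked.",
    ),
    (
        "When is it useful?",
        "It is useful when a task has no single correct answer to grade against.",
        "When there is no single correct answer.",
    ),
]

rates = win_rates(rows, conciseness_judge)
print(rates)
```

Swapping `conciseness_judge` for a function that prompts a model changes the criterion without touching the comparison or aggregation logic, which is the separation the two steps above describe.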

  • Read the blog post to learn more about pairwise evaluations

  • Dive into our video tutorial to walk through an example of how to use custom pairwise evaluators in LangSmith

  • Check out the docs

Bonus: Need to backtest on your production logs? This video shows how pairwise evaluation can also help you compare runs from different versions of your app against the baseline production app.
