DATE:
AUTHOR: The LangChain Team
LangSmith SaaS

📦 Bulk data export from LangSmith for offline analysis

LangSmith now supports bulk data export, available in beta for LangSmith Plus and Enterprise plans. If you need to analyze your trace data offline, you can export it in Parquet format to your own S3 bucket or any S3-compatible storage, then query it in external tools such as:

  • BigQuery

  • Snowflake

  • Redshift

  • DuckDB

  • Jupyter Notebooks

  • ClickHouse

By combining LangSmith traces with other data sources, you can gain deeper insights into performance trends, data quality, and costs.
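As an illustration of the kind of offline analysis this enables, here is a minimal sketch that summarizes exported runs by run type. It assumes the rows have already been loaded from the Parquet files into plain dictionaries (for example with pyarrow's `read_table(...).to_pylist()`), and that columns named `run_type`, `start_time`, `end_time`, and `total_tokens` are present. Those column names are assumptions about the export schema, so check them against your own files.

```python
from collections import defaultdict
from datetime import datetime


def summarize_runs(rows):
    """Group run rows by run_type; report run count, token total, mean latency."""
    totals = defaultdict(
        lambda: {"runs": 0, "timed": 0, "seconds": 0.0, "tokens": 0}
    )
    for row in rows:
        t = totals[row["run_type"]]  # assumed column name
        t["runs"] += 1
        # Treat a missing or null token count as zero.
        t["tokens"] += row.get("total_tokens") or 0
        start, end = row.get("start_time"), row.get("end_time")
        # Only runs with both timestamps contribute to the latency average.
        if start is not None and end is not None:
            t["timed"] += 1
            t["seconds"] += (end - start).total_seconds()
    return {
        run_type: {
            "runs": t["runs"],
            "total_tokens": t["tokens"],
            "mean_latency_s": t["seconds"] / t["timed"] if t["timed"] else None,
        }
        for run_type, t in totals.items()
    }


# Hand-written sample rows standing in for exported data (not real traces).
sample_rows = [
    {"run_type": "llm", "start_time": datetime(2024, 1, 1, 0, 0, 0),
     "end_time": datetime(2024, 1, 1, 0, 0, 2), "total_tokens": 100},
    {"run_type": "llm", "start_time": datetime(2024, 1, 1, 0, 0, 0),
     "end_time": datetime(2024, 1, 1, 0, 0, 4), "total_tokens": 50},
    {"run_type": "tool", "start_time": datetime(2024, 1, 1, 0, 0, 0),
     "end_time": None, "total_tokens": None},
]
summary = summarize_runs(sample_rows)
print(summary)
```

The same summary could then be joined against billing or product data held in whichever warehouse or notebook you use.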


How It Works

  • Launch Exports by Project and Date Range:
    You can target specific LangSmith projects and define a custom date range for your export.

  • Automated Orchestration & Resilience:
    Once an export is initiated, the system manages concurrency, retries, and runtime timeouts (set at 24 hours), so the export can run unattended even for large datasets.

  • Parquet Output Format:
    Your data is exported in Parquet, a columnar storage format optimized for analytics, which tools like BigQuery and Snowflake can import directly. All exported data keeps the same structure as LangSmith's Run data format.
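To make the project and date-range targeting concrete, here is a hedged sketch of how an export request body might be assembled and validated before it is submitted. The helper `build_export_request` and the field names `session_id`, `bulk_export_destination_id`, `start_time`, and `end_time` are illustrative assumptions, not something this announcement confirms; consult the documentation linked below for the actual request format.

```python
import json
from datetime import datetime, timezone


def build_export_request(project_id, destination_id, start, end):
    """Return a JSON-serializable body describing one export job.

    Field names are assumptions for illustration only.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "session_id": project_id,  # the LangSmith project to export
        "bulk_export_destination_id": destination_id,  # the S3 target
        # Normalize both bounds to UTC ISO 8601 strings.
        "start_time": start.astimezone(timezone.utc).isoformat(),
        "end_time": end.astimezone(timezone.utc).isoformat(),
    }


# Placeholder identifiers; substitute your own project and destination IDs.
body = build_export_request(
    "my-project-id",
    "my-destination-id",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

Validating the date range locally catches a reversed or timezone-naive range before any export job is started.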


Getting Started

  • Availability: Currently in beta for LangSmith Plus and Enterprise plans.

  • How to Enable: Contact support@langchain.dev to get started.

For more details, check out our documentation: https://docs.smith.langchain.com/how_to_guides/tracing/data_export
