- AUTHOR: The LangChain Team
📦 Bulk data export from LangSmith for offline analysis
LangSmith now supports bulk data exports, available in beta for LangSmith Plus and Enterprise plans. If you need to analyze your trace data offline, you can export it in Parquet format to your own S3 bucket or any S3-compatible storage. With bulk data export, you can query your LangSmith data in external tools like:
BigQuery
Snowflake
Redshift
DuckDB
Jupyter Notebooks
ClickHouse
By combining LangSmith traces with other data sources, you can gain deeper insights into performance trends, data quality, and costs.
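As a rough illustration of querying exported files from one of these tools, here is a minimal sketch using DuckDB from Python. The bucket name, prefix, and directory layout are placeholders, and the `run_type` column is an assumption based on the Run data format; check your actual export before relying on either. S3 credential setup for DuckDB is omitted.

```python
def parquet_glob(bucket: str, prefix: str) -> str:
    """Build a glob matching every Parquet file under an S3 prefix.

    The bucket/prefix layout is an assumption; adjust it to your export.
    """
    return f"s3://{bucket}/{prefix.strip('/')}/**/*.parquet"


def build_summary_query(bucket: str, prefix: str) -> str:
    """SQL that counts exported runs per run type.

    `run_type` is assumed to be a column in the exported data.
    """
    return (
        "SELECT run_type, count(*) AS runs "
        f"FROM read_parquet('{parquet_glob(bucket, prefix)}') "
        "GROUP BY run_type ORDER BY runs DESC"
    )


def run_summary(bucket: str, prefix: str):
    """Execute the summary query with DuckDB.

    Requires `pip install duckdb` and S3 credentials already configured
    for DuckDB (not shown here).
    """
    import duckdb  # third-party dependency, imported lazily

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension that enables s3:// paths
    con.execute("LOAD httpfs")
    return con.execute(build_summary_query(bucket, prefix)).fetchall()
```

Calling `run_summary("my-bucket", "langsmith-exports")` would return a list of `(run_type, runs)` tuples, which you could then join against other tables loaded into the same DuckDB session.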
How It Works
Launch Exports by Project and Date Range:
You can target specific LangSmith projects and define a custom date range for your export.
Automated Orchestration & Resilience:
Once an export is initiated, the system manages concurrency and retries, and handles runtime timeouts (set at 24 hours). This helps your data export run smoothly, even with large datasets.
Your data will be exported in Parquet, a columnar storage format optimized for analytics. This simplifies import into tools like BigQuery, Snowflake, and other databases. All exported data will maintain the same structure as LangSmith's Run data format.
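To make the "project plus date range" idea concrete, here is a small sketch of how an export request might be described and validated before it is submitted. The class name and the field names (`project_id`, `start_time`, `end_time`) are illustrative placeholders, not the confirmed LangSmith API; refer to the documentation for the actual request format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExportRequest:
    """Hypothetical description of one bulk export: a project and a date range."""

    project_id: str       # placeholder for the LangSmith project identifier
    start_time: datetime  # start of the export window (timezone-aware)
    end_time: datetime    # end of the export window (timezone-aware)

    def to_payload(self) -> dict:
        """Validate the range and serialize timestamps as ISO 8601 in UTC."""
        if self.start_time.tzinfo is None or self.end_time.tzinfo is None:
            raise ValueError("use timezone-aware datetimes")
        if self.start_time >= self.end_time:
            raise ValueError("start_time must be before end_time")
        return {
            # Key names are assumptions for illustration only.
            "project_id": self.project_id,
            "start_time": self.start_time.astimezone(timezone.utc).isoformat(),
            "end_time": self.end_time.astimezone(timezone.utc).isoformat(),
        }
```

Validating the range up front is a cheap way to avoid launching a long-running export that covers the wrong window.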
Getting Started
Availability: Currently in beta for LangSmith Plus and Enterprise plans.
How to Enable: Contact support@langchain.dev to get started.
For more details, check out our documentation: https://docs.smith.langchain.com/how_to_guides/tracing/data_export