Find failures fast with agent observability.
LangSmith is a unified observability & evals platform where teams can debug, test, and monitor AI app performance — whether building with LangChain or not.
Quickly debug and understand non-deterministic LLM app behavior with tracing. See what your agent is doing step by step, then fix issues to improve latency and response quality.
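For a concrete picture, here is a minimal sketch using the LangSmith Python SDK (assuming `LANGSMITH_API_KEY` is set and tracing is enabled via `LANGSMITH_TRACING=true`); each nested `@traceable` call shows up as a child step in the trace:

```python
# Minimal tracing sketch; assumes LANGSMITH_API_KEY is set and
# LANGSMITH_TRACING=true in the environment.
from langsmith import traceable

@traceable(run_type="tool")
def search(query: str) -> str:
    # Stand-in for a real tool call
    return f"results for {query!r}"

@traceable(run_type="chain")
def agent(question: str) -> str:
    # Each nested @traceable call becomes a child step in the trace
    context = search(question)
    return f"answer based on {context}"

agent("Why is my agent slow?")
```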
Evaluate your app by saving production traces to datasets — then score performance with LLM-as-Judge evaluators. Gather human feedback from subject-matter experts to assess response relevance, correctness, harmfulness, and other criteria.
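As a sketch of that flow: the dataset name and toy target function below are illustrative, and the `correctness` function is a simple stand-in for a real LLM-as-Judge evaluator.

```python
# Sketch of the dataset-then-evaluate flow. The dataset name, the toy
# target function, and the heuristic `correctness` evaluator (a stand-in
# for a real LLM-as-Judge call) are all illustrative assumptions.
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()
dataset = client.create_dataset("prod-samples")
client.create_example(
    inputs={"question": "What does LangSmith trace?"},
    outputs={"answer": "every step of the app"},
    dataset_id=dataset.id,
)

def correctness(run, example):
    # A real LLM-as-Judge evaluator would grade run.outputs with a model
    score = int(example.outputs["answer"] in str(run.outputs).lower())
    return {"key": "correctness", "score": score}

evaluate(
    lambda inputs: {"answer": "every step of the app"},  # your app here
    data="prod-samples",
    evaluators=[correctness],
)
```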
Experiment with models and prompts in the Playground, and compare outputs across different prompt versions. Any teammate can use the Prompt Canvas UI to directly recommend and improve prompts.
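Those prompt versions are also reachable from code. A sketch, assuming the SDK's `pull_prompt` method and a saved prompt named `my-agent-prompt` (the commit hash is a placeholder):

```python
# Sketch of working with versioned prompts from code. The prompt name and
# commit hash are placeholders; pull_prompt is assumed from the LangSmith
# SDK's prompt-hub support.
from langsmith import Client

client = Client()

# Latest version of a team-shared prompt
prompt = client.pull_prompt("my-agent-prompt")

# A specific earlier version, to compare outputs across prompt versions
old_prompt = client.pull_prompt("my-agent-prompt:abc1234")
```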
Track business-critical metrics like costs, latency, and response quality with live dashboards, then get alerted when problems arise and drill into the root cause.
LLM app traces are complex, packed with text, tool calls, audio, and images. You need to find the signal in the noise so you can debug faster and explain behavior with confidence.
There are no guarantees with LLMs. Unified testing & observability lets you turn real user data into evaluation datasets and catch issues that traditional monitoring & testing tools would miss.
From PMs to subject matter experts, everyone’s involved in building GenAI apps. Close the gap between ideas and working software by making it easy to collaborate across teams — whether it’s through writing prompts or providing feedback on experiments.
Works with or without LangChain
Hybrid and self-hosted deployment options
API-first and OTEL-compliant to complement existing DevOps investments
Yes, you can log traces to LangSmith using a standard OpenTelemetry client to access all LangSmith features, including tracing, running evals, and prompt engineering. See the docs.
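A minimal sketch with the standard OpenTelemetry Python SDK follows; the OTLP endpoint path and `x-api-key` header are taken from the docs as of writing, so treat them as assumptions to verify there:

```python
# Sketch: exporting spans to LangSmith with a standard OpenTelemetry
# client. The endpoint path and x-api-key header follow the LangSmith
# docs at the time of writing; verify both against the current docs.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://api.smith.langchain.com/otel/v1/traces",
    headers={"x-api-key": "<your-langsmith-api-key>"},
)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("llm-call"):
    pass  # your model call goes here
```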
LangSmith traces capture the full inputs and outputs of each step of the application, giving users complete visibility into their agent or LLM app behavior. LangSmith also lets users instantly run evals to assess agent or LLM app performance, including LLM-as-Judge evaluators for auto-scoring and the ability to attach human feedback. Learn more.
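As a sketch, attaching human feedback programmatically might look like this (the `run_id` is a placeholder for the ID of a real traced run):

```python
# Sketch: attaching human feedback to a traced run; the run_id is a
# placeholder for the ID of a real trace.
from langsmith import Client

client = Client()
client.create_feedback(
    run_id="00000000-0000-0000-0000-000000000000",
    key="relevance",
    score=1,  # e.g., a reviewer's thumbs-up
    comment="Answer cited the correct source.",
)
```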
Yes, we allow customers to self-host LangSmith on our enterprise plan. We deliver the software to run on your Kubernetes cluster, and data will not leave your environment. For more information, check out our documentation.
For Cloud SaaS, traces are stored in GCP us-central1 or GCP europe-west4, depending on your plan. Learn more.
No, LangSmith does not add any latency to your application. In the LangSmith SDK, there’s a callback handler that sends traces to a LangSmith trace collector which runs as an async, distributed process. Additionally, if LangSmith experiences an incident, your application performance will not be disrupted.
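As a rough illustration of that pattern (a generic sketch, not the SDK's actual internals):

```python
# Generic illustration of the non-blocking pattern described above, not
# the SDK's actual internals: events go onto an in-memory queue and a
# background thread ships them, so the app never waits on the network.
import queue
import threading

events: "queue.Queue[dict]" = queue.Queue()

def export_loop() -> None:
    while True:
        event = events.get()
        try:
            pass  # send `event` to the trace collector over HTTP
        except Exception:
            pass  # a collector incident never propagates to the app
        finally:
            events.task_done()

threading.Thread(target=export_loop, daemon=True).start()

def record(event: dict) -> None:
    events.put_nowait(event)  # returns immediately: no added latency
```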
We will not train on your data, and you own all rights to your data. See LangSmith Terms of Service for more information.
See our pricing page for more information, and find a plan that works for you.
Get started with tools from the LangChain product suite for every step of the agent development lifecycle.