LangServe: Deploy Your LLM Chain to Production in 15 Minutes
LangServe wraps your LangChain runnables as a production REST API. Built on FastAPI, it provides batching, streaming, and a playground UI out of the box, and auth can be added through FastAPI dependencies.
Your first LangServe endpoint
# app.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langserve import add_routes
from fastapi import FastAPI
app = FastAPI(title="My LLM")
model = ChatOpenAI(model="gpt-4o-mini")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | model
add_routes(app, chain, path="/joke")
# Run: uvicorn app:app --host 0.0.0.0 --port 8000
That's it. You now have:
- POST /joke/invoke: sync call
- POST /joke/stream: SSE streaming
- POST /joke/batch: batch
- GET /joke/playground/: web UI
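To make the routes above concrete, here is a minimal client-side sketch using only the standard library. It assumes the server is running locally on port 8000 (as in the uvicorn command) and that the request bodies follow LangServe's usual JSON shape: a single `input` for /invoke and a list of `inputs` for /batch. The helper names `build_request` and `send` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the app from app.py is served locally by the uvicorn command above.
BASE_URL = "http://localhost:8000/joke"


def build_request(route: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one route under /joke."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# /invoke takes one input under the "input" key.
invoke_req = build_request("invoke", {"input": {"topic": "cats"}})

# /batch takes a list of inputs under the "inputs" key.
batch_req = build_request("batch", {"inputs": [{"topic": "cats"}, {"topic": "dogs"}]})
```

With the server up, `send(invoke_req)` should return a JSON object whose `output` field holds the serialized model message; the exact fields can vary between LangServe versions.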
Auth (add API key)
from langserve import add_routes
from fastapi import Depends, HTTPException
from fastapi.security import APIKeyHeader
API_KEY = "your-secret-key"
header = APIKeyHeader(name="X-API-Key")
def auth(api_key: str = Depends(header)):
    if api_key != API_KEY:
        raise HTTPException(status_code=401)
add_routes(app, chain, path="/joke", dependencies=[Depends(auth)])
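One possible hardening of the key check, sketched with the standard library only: load the key from the environment instead of hard-coding it, and compare it with `secrets.compare_digest` so the comparison is resistant to timing attacks. The environment variable name `JOKE_API_KEY` and the helper `is_valid_key` are hypothetical, not part of LangServe or FastAPI.

```python
import os
import secrets

# Hypothetical variable name; falls back to the placeholder key if it is unset.
API_KEY = os.environ.get("JOKE_API_KEY", "your-secret-key")


def is_valid_key(candidate: str, expected: str = API_KEY) -> bool:
    """Return True only if the keys match, using a timing-attack-resistant compare."""
    return secrets.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))
```

Inside `auth`, the condition would then become `if not is_valid_key(api_key):` followed by the same `HTTPException`.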
Monitoring (LangSmith)
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "lsv2_pt_..."
# All calls now traced in LangSmith dashboard
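In a deployed service you would probably not write the key into source code. A small sketch of an alternative, assuming the key is injected into the environment by your deployment tooling: fail fast if it is missing, then switch tracing on. The function name `enable_langsmith_tracing` and the project name used in the usage note are made up for illustration; `LANGCHAIN_PROJECT` is the variable LangSmith reads to group traces under a project.

```python
import os


def enable_langsmith_tracing(project=None):
    """Turn on tracing; requires LANGCHAIN_API_KEY to already be in the environment."""
    if not os.environ.get("LANGCHAIN_API_KEY"):
        raise RuntimeError("LANGCHAIN_API_KEY is not set")
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    if project:
        # Optional: group traces under a named project.
        os.environ["LANGCHAIN_PROJECT"] = project
```

Calling `enable_langsmith_tracing(project="joke-api")` near the top of app.py, before any chain runs, would have the same effect as the two assignments above.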
AWS Lambda deployment
# Install Mangum (ASGI-to-Lambda bridge)
pip install mangum
# lambda_handler.py
from mangum import Mangum
from app import app
handler = Mangum(app, lifespan="off")
Bundle with AWS SAM or the AWS CDK and deploy.
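Before deploying, it can help to exercise the handler locally with a hand-built event. The sketch below builds a minimal event in the shape of API Gateway's HTTP API payload format 2.0. That trigger type is an assumption, since the article does not say which one is used, and the exact set of fields Mangum reads may differ between versions. The helper `make_http_api_event` is hypothetical.

```python
import json


def make_http_api_event(path: str, payload: dict) -> dict:
    """Build a minimal API Gateway HTTP API (payload format 2.0) POST event."""
    return {
        "version": "2.0",
        "routeKey": "$default",
        "rawPath": path,
        "rawQueryString": "",
        "headers": {"content-type": "application/json"},
        "requestContext": {
            "http": {
                "method": "POST",
                "path": path,
                "protocol": "HTTP/1.1",
                "sourceIp": "127.0.0.1",
            },
        },
        # API Gateway delivers the request body as a string.
        "body": json.dumps(payload),
        "isBase64Encoded": False,
    }


event = make_http_api_event("/joke/invoke", {"input": {"topic": "cats"}})
```

Passing `event` and a stub context object to `handler` from lambda_handler.py is one way to smoke-test the app; that call is left out here because it needs Mangum, the app, and a model API key.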