Python for LLM Apps: From Beginner to Production
Python is the lingua franca of LLM development. This tutorial takes you from your first API call to production-grade code in 30 minutes.
Install
pip install openai anthropic
Your first API call
import openai
client = openai.OpenAI() # uses OPENAI_API_KEY env var
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is Python?"}]
)
print(response.choices[0].message.content)
Streaming (for chatbots)
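stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True
)
for chunk in stream:
    # delta.content is None on chunks that carry no text, such as the final one
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()  # end the line once the stream is finished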
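A chatbot usually needs the complete reply as well as the live output, for example to append it to the conversation history. The following is a minimal sketch, assuming chunks shaped like the ones in the loop above. `collect_stream` and `fake_chunk` are hypothetical helpers, not part of the openai package; the fake chunks only exist so the sketch runs offline.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Print each text delta as it arrives and return the full reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # defensively skip chunks that carry no choices
            continue
        text = chunk.choices[0].delta.content or ""
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in with the same attribute path as a real chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Offline demo: in real use, pass the `stream` object returned by the API call.
demo_stream = [fake_chunk("Hello"), fake_chunk(", "), fake_chunk(None), fake_chunk("world")]
reply = collect_stream(demo_stream)
```

In real use the returned string would typically be stored as the assistant message for the next turn.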
Error handling
from openai import RateLimitError, APIConnectionError
import time
def safe_call(messages, max_retries=3):
    """Call the API with the sync client from above, retrying transient errors."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o-mini", messages=messages
            )
        except RateLimitError:
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
        except APIConnectionError as e:
            print(f"Connection error: {e}")
            time.sleep(5)  # fixed pause before retrying the connection
    # Only reached when every attempt raised one of the errors above
    raise RuntimeError(f"All {max_retries} retries failed")
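When many workers hit a rate limit at the same moment, plain exponential backoff makes them all retry at the same moment too. A common refinement is to add random jitter to the delay. The sketch below shows one way to do that with the standard library only; `backoff_delay` and `retry` are hypothetical helper names, and the `flaky` function simulates a failing call so the example runs without network access.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Return a random delay between 0 and min(cap, base * 2**attempt) seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def retry(func, retryable=(Exception,), max_retries=3, sleep=time.sleep):
    """Call func(); on a retryable error wait and try again, re-raising the last error."""
    for attempt in range(max_retries):
        try:
            return func()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the original error
            sleep(backoff_delay(attempt))


# Offline demo: a call that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated outage")
    return "ok"


# sleep is replaced with a no-op so the demo does not actually wait
result = retry(flaky, retryable=(ConnectionError,), sleep=lambda seconds: None)
print(result, calls["n"])
```

With the real client, `retryable` would be a tuple such as `(RateLimitError, APIConnectionError)` and `func` a lambda wrapping the API call. Re-raising the last error instead of a generic exception keeps the original traceback for debugging.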
Async batch processing
import asyncio
from openai import AsyncOpenAI
client = AsyncOpenAI()  # async client; rebinds the name used by the sync examples
async def process(prompt):
    """Send one prompt and return the text of the reply."""
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
async def main():
    prompts = ["Q1", "Q2", "Q3", "Q4", "Q5"]
    # gather runs all requests concurrently and returns results in input order
    results = await asyncio.gather(*[process(p) for p in prompts])
    return results
results = asyncio.run(main())
print(f"Got {len(results)} responses")
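`asyncio.gather` starts every request at once, which is fine for five prompts but can trigger rate limits for larger batches. One way to cap the number of in-flight requests is an `asyncio.Semaphore`. This is a sketch under stated assumptions: `gather_limited` is a hypothetical helper name, and `fake_process` stands in for the `process` coroutine above so the example runs offline.

```python
import asyncio


async def gather_limited(worker, items, limit=3):
    """Run worker(item) for every item with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:  # waits here while `limit` workers are already active
            return await worker(item)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_process(prompt):
    """Stand-in for the API call so the sketch needs no network access."""
    await asyncio.sleep(0.01)
    return prompt.upper()


limited_results = asyncio.run(
    gather_limited(fake_process, ["q1", "q2", "q3", "q4", "q5"], limit=2)
)
print(limited_results)
```

A suitable `limit` depends on the account's rate limits, so it is best treated as a tunable setting rather than a constant.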
Key takeaways
- Always set the OPENAI_API_KEY env var (never hardcode keys)
- Use streaming for user-facing UIs
- Always have retry + backoff for production
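The first takeaway can be enforced at startup, so a missing key fails with a clear message instead of an authentication error deep inside a request. A minimal sketch, assuming the key lives in `OPENAI_API_KEY`; `require_api_key` is a hypothetical helper, and the `env` parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the key from the environment, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key


# Demo with an explicit mapping; in an app, call require_api_key() with no arguments.
demo_key = require_api_key(env={"OPENAI_API_KEY": "sk-example"})
```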