Instructor: Structured LLM Outputs in Python
Instructor: Python's Top Library for Structured LLM Outputs
Extracting reliable, structured data from Large Language Models (LLMs) remains a significant challenge. Instructor is an open-source Python library designed to simplify this task. With over 3 million monthly downloads and an active community, it has become a go-to solution for developers who want type-safe, validated outputs from their AI models, with automatic retries when validation fails.
What is Instructor?
Instructor is built on top of Pydantic, a powerful data validation library, to provide an intuitive and robust framework for interacting with LLMs. It allows developers to define the exact structure of the data they need using Pydantic models. This ensures that the outputs from LLMs are not only consistent but also adhere to predefined schemas, significantly reducing errors and improving data quality.
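To make the schema idea concrete, here is a minimal sketch that uses Pydantic alone, with no LLM call. The `Person` fields mirror the quick-start model; the two JSON payloads are made up for illustration and stand in for raw model output.

```python
import json

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int
    occupation: str


# Hypothetical raw outputs, standing in for text returned by an LLM.
good_output = '{"name": "Ada", "age": 36, "occupation": "engineer"}'
bad_output = '{"name": "Ada", "age": "unknown"}'

# A payload that matches the schema becomes a typed Person object.
person = Person(**json.loads(good_output))
print(person.name, person.age)

# A payload that violates the schema is rejected with a ValidationError.
try:
    Person(**json.loads(bad_output))
except ValidationError as exc:
    # Collect the names of the fields that failed validation.
    failed_fields = sorted(str(err["loc"][0]) for err in exc.errors())
    print("rejected:", failed_fields)
```

The second payload fails on two counts: `age` is not an integer and `occupation` is missing. This is the kind of mismatch a schema catches before the data reaches the rest of an application.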
Key Features:
- Structured Outputs: Define Pydantic models to precisely specify the desired data format from your LLMs.
- Automatic Retries: Built-in logic re-attempts the request when validation fails, reducing the need for hand-written retry and error-handling code.
- Data Validation: Leverages Pydantic's robust validation capabilities to guarantee the quality and integrity of LLM responses.
- Streaming Support: Real-time processing of partial responses and lists for enhanced application responsiveness.
- Multi-Provider Compatibility: Seamlessly works with over 15 LLM providers, including OpenAI, Anthropic, Google (Gemini), Mistral, Cohere, Ollama, DeepSeek, and many more, offering a unified API.
- Type Safety: Enjoy full IDE support with proper type inference and autocompletion, boosting developer productivity.
- Open Source Support: Run any open-source model locally using frameworks like Ollama, llama-cpp-python, or vLLM.
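The validation and retry features above can be pictured as a loop: request output, validate it against the model, and on failure pass the error back and ask again. The sketch below is a conceptual illustration only, not Instructor's actual implementation. `extract_with_retries` and `fake_llm` are hypothetical names, and the canned replies replace a real model call.

```python
import json
from typing import Callable, List, Optional

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int
    occupation: str


def extract_with_retries(
    ask: Callable[[Optional[str]], str], max_retries: int = 2
) -> Person:
    """Call `ask` until its output validates as a Person or attempts run out."""
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):
        raw = ask(feedback)
        try:
            return Person(**json.loads(raw))
        except (ValidationError, json.JSONDecodeError) as exc:
            # Pass the error text back so the next attempt can correct it.
            feedback = str(exc)
    raise RuntimeError("no valid output after retries")


# Hypothetical stand-in for a model: the first reply is invalid, the second valid.
replies = iter([
    '{"name": "Ada", "age": "thirty-six", "occupation": "engineer"}',
    '{"name": "Ada", "age": 36, "occupation": "engineer"}',
])
feedback_seen: List[Optional[str]] = []


def fake_llm(feedback: Optional[str]) -> str:
    feedback_seen.append(feedback)  # record what the "model" was told
    return next(replies)


person = extract_with_retries(fake_llm)
print(person)
```

The first reply fails because `age` is not an integer, so the loop calls the stand-in model a second time with the validation error as feedback. A library that builds this loop in spares each caller from rewriting it.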
Quick Start: Extract Structured Data in 3 Lines
Getting started with Instructor is incredibly simple. After installation, you can extract structured data almost instantly:
```python
import instructor
from pydantic import BaseModel
from openai import OpenAI

class Person(BaseModel):
    name: str
    age: int
    occupation: str

client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-4o-mini"