RAGbits: The Toolkit for Rapid GenAI Application Development
In the rapidly evolving landscape of Generative AI, developers constantly seek robust and efficient tools to bring their innovative applications to life. Enter RAGbits, an open-source framework by deepsense.ai, purpose-built to accelerate the creation of reliable and scalable GenAI solutions, particularly those leveraging Retrieval-Augmented Generation (RAG).
What is RAGbits?
RAGbits is a comprehensive set of building blocks designed to streamline the entire GenAI application development lifecycle. It offers a modular and flexible architecture, allowing developers to integrate only the components they need, thereby reducing dependencies and optimizing performance. The framework is heavily focused on practical application, providing robust features for managing Large Language Models (LLMs), handling diverse data types, and deploying sophisticated RAG pipelines.
Key Features of RAGbits:
RAGbits stands out with its powerful feature set, empowering developers to build sophisticated AI applications with ease:
Build Reliable & Scalable GenAI Apps
- Flexible LLM Integration: Seamlessly swap between over 100 LLMs via LiteLLM or integrate local models, offering unparalleled flexibility (see the sketch after this list).
- Type-Safe LLM Calls: Utilize Python generics to enforce strict type safety during model interactions, ensuring robustness and reducing errors.
- Bring Your Own Vector Store: Connect with popular vector stores like Qdrant, PgVector, and more, or easily integrate custom solutions.
- Developer Tools Included: Access a suite of command-line tools for managing vector stores, configuring query pipelines, and testing prompts directly from your terminal.
- Modular Installation: Install only the necessary components, tailoring the framework to your specific project needs and improving efficiency.
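To make the first two points concrete, here is a minimal sketch of how the pieces compose, using only the constructors that appear in the quickstart examples further down. The model names are illustrative, and which providers actually work depends on your LiteLLM credentials.

```python
# Minimal composition sketch: swapping providers or vector stores is a
# constructor-level change. Model names below are illustrative.
from ragbits.core.embeddings import LiteLLMEmbedder
from ragbits.core.llms import LiteLLM
from ragbits.core.vector_stores import InMemoryVectorStore
from ragbits.document_search import DocumentSearch

# Any model LiteLLM can route to is a one-line change of the model name.
llm = LiteLLM(model_name="gpt-4.1-nano", use_structured_output=True)
# llm = LiteLLM(model_name="claude-3-5-haiku-20241022", use_structured_output=True)

# The embedder and vector store plug in the same way; a different backend
# (e.g. a Qdrant or pgvector adapter) would replace InMemoryVectorStore here
# without touching the rest of the pipeline.
embedder = LiteLLMEmbedder(model_name="text-embedding-3-small")
vector_store = InMemoryVectorStore(embedder=embedder)
document_search = DocumentSearch(vector_store=vector_store)
```

Because prompts are declared with Python generics (see the full examples below), the object returned by `llm.generate(prompt)` is the typed output model rather than raw text.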
Fast & Flexible RAG Processing
- Extensive Data Ingestion: Process over 20 data formats, including PDFs, HTML, spreadsheets, and presentations. Leverage powerful parsers like Docling and Unstructured, or implement custom parsers.
- Complex Data Handling: Extract structured content, tables, and images with built-in Visual Language Model (VLM) support.
- Any Data Source Connectivity: Use pre-built connectors for cloud storage services like S3, GCS, and Azure, or develop your own connectors.
- Scalable Ingestion: Process large datasets efficiently using Ray-based parallel processing for rapid data onboarding.
Deploy & Monitor with Confidence
- Real-time Observability: Track application performance and gain insights using OpenTelemetry and comprehensive CLI analytics.
- Built-in Testing: Validate and refine your prompts with integrated promptfoo testing before deploying your applications.
- Auto-Optimization: Continuously evaluate and optimize model performance through systematic evaluation processes.
- Chat UI: Deploy a ready-to-use chatbot interface complete with API, data persistence, and user feedback mechanisms.
Getting Started with RAGbits
Installation is straightforward. You can get started quickly with a simple `pip` command:

```bash
pip install ragbits
```

This command installs a starter bundle, including `ragbits-core` (fundamental tools), `ragbits-agents` (for agentic systems), `ragbits-document-search` (retrieval and ingestion), `ragbits-evaluate` (unified evaluation), `ragbits-chat` (conversational AI), and `ragbits-cli` (command-line interface). Alternatively, individual components can be installed as needed.
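For instance, a retrieval-only service could install just the pieces it needs; the subset below is illustrative:

```bash
# Install only selected building blocks (illustrative subset)
pip install ragbits-core ragbits-document-search
```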
Practical Examples:
The RAGbits documentation provides clear quickstart guides, demonstrating common use cases. Here's a glimpse into its simplicity:
- Defining and Running LLM Prompts: Easily define type-safe prompts and generate responses from your chosen LLM.

```python
# Example of LLM prompt generation
import asyncio

from pydantic import BaseModel

from ragbits.core.llms import LiteLLM
from ragbits.core.prompt import Prompt


class QuestionAnswerPromptInput(BaseModel):
    question: str


class QuestionAnswerPromptOutput(BaseModel):
    answer: str


class QuestionAnswerPrompt(Prompt[QuestionAnswerPromptInput, QuestionAnswerPromptOutput]):
    system_prompt = """
    You are a question answering agent.
    Answer the question to the best of your ability.
    """
    user_prompt = """
    Question: {{ question }}
    """


llm = LiteLLM(model_name="gpt-4.1-nano", use_structured_output=True)


async def main() -> None:
    prompt = QuestionAnswerPrompt(
        QuestionAnswerPromptInput(question="What are high memory and low memory on linux?")
    )
    response = await llm.generate(prompt)
    print(response.answer)


if __name__ == "__main__":
    asyncio.run(main())
```
- Building a Vector Store Index: Ingest documents and query your custom knowledge base.

```python
# Example of document search
import asyncio

from ragbits.core.embeddings import LiteLLMEmbedder
from ragbits.core.vector_stores import InMemoryVectorStore
from ragbits.document_search import DocumentSearch

embedder = LiteLLMEmbedder(model_name="text-embedding-3-small")
vector_store = InMemoryVectorStore(embedder=embedder)
document_search = DocumentSearch(vector_store=vector_store)


async def run() -> None:
    await document_search.ingest("web://https://arxiv.org/pdf/1706.03762")
    result = await document_search.search("What are the key findings presented in this paper?")
    print(result)


if __name__ == "__main__":
    asyncio.run(run())
```
- Constructing a RAG Pipeline: Combine LLMs with retrieved context for accurate and relevant responses.

```python
# Example of a RAG pipeline
import asyncio

from pydantic import BaseModel

from ragbits.core.embeddings import LiteLLMEmbedder
from ragbits.core.llms import LiteLLM
from ragbits.core.prompt import Prompt
from ragbits.core.vector_stores import InMemoryVectorStore
from ragbits.document_search import DocumentSearch


class QuestionAnswerPromptInput(BaseModel):
    question: str
    context: list[str]


class QuestionAnswerPromptOutput(BaseModel):
    answer: str


class QuestionAnswerPrompt(Prompt[QuestionAnswerPromptInput, QuestionAnswerPromptOutput]):
    system_prompt = """
    You are a question answering agent.
    Answer the question that will be provided using context.
    If in the given context there is not enough information, refuse to answer.
    """
    user_prompt = """
    Question: {{ question }}
    Context: {% for item in context %}
    {{ item }}
    {%- endfor %}
    """


embedder = LiteLLMEmbedder(model_name="text-embedding-3-small")
vector_store = InMemoryVectorStore(embedder=embedder)
document_search = DocumentSearch(vector_store=vector_store)
llm = LiteLLM(model_name="gpt-4.1-nano", use_structured_output=True)


async def run() -> None:
    question = "What are the key findings presented in this paper?"

    await document_search.ingest("web://https://arxiv.org/pdf/1706.03762")
    result = await document_search.search(question)

    # Pass the retrieved chunks as prompt context (each returned element is
    # assumed to expose its text via `text_representation`).
    prompt = QuestionAnswerPrompt(
        QuestionAnswerPromptInput(
            question=question,
            context=[element.text_representation for element in result],
        )
    )
    response = await llm.generate(prompt)
    print(response.answer)


if __name__ == "__main__":
    asyncio.run(run())
```