Dash: A Self‑Learning Data Agent with Six Layers of Context

Introduction

In the age of large language models, generating accurate SQL from natural‑language queries is still a challenge. Dash (repo: https://github.com/agno-agi/dash) tackles this problem by combining hybrid search, a rich knowledge base, and a lightweight learning loop. It is an open‑source Python library that you can drop into a project or run as a standalone service.

Why Dash?

  • Context‑Grounded Answers: Six layers of context (schema, business rules, query patterns, institutional docs, learnings, runtime schema) mean every response is informed by real data.
  • Self‑Improving Without Retraining: The learning loop records error patterns and their fixes on the fly, making the agent smarter after each query.
  • Zero‑Cost Continuous Learning: Instead of expensive GPU fine‑tuning, Dash improves through a lightweight, CPU‑friendly loop.
  • Open‑Source & Extensible: 100+ stars on GitHub, Apache‑2.0 licence, and a clear folder structure for adding knowledge.

Core Architecture

flowchart TD
  A[User Question] --> B[Retrieve Knowledge + Learnings]
  B --> C[Reason About Intent]
  C --> D[Generate Grounded SQL]
  D --> E[Execute & Interpret]
  E -->|Success| F[Return Insight]
  E -->|Error| G[Diagnose → Fix → Save Learning]
  G --> C

The Six Layers of Context

  1. Table Usage – JSON files describing schemas, columns, and relationships.
  2. Human Annotations – Business metrics, definitions, and rules.
  3. Query Patterns – Proven SQL snippets stored in .sql files.
  4. Institutional Knowledge – External docs or wikis (optional).
  5. Learnings – Machine‑generated error fixes.
  6. Runtime Context – Live schema introspection via the introspect_schema tool.
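The sixth layer, runtime introspection, can be illustrated with a minimal sketch. The function name mirrors the introspect_schema tool, but this is illustrative code against SQLite's catalog, not Dash's actual implementation:

```python
import sqlite3

def introspect_schema(conn):
    """Return {table: [(column, type), ...]} by querying the live database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {
        t: [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({t})")]
        for t in tables
    }

# Demo against an in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER, status TEXT)")
print(introspect_schema(conn))
# {'orders': [('id', 'INTEGER'), ('amount', 'INTEGER'), ('status', 'TEXT')]}
```

Because this layer is queried at runtime, it stays correct even when the static JSON metadata lags behind a schema migration.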

Quick Start

# Clone the repository
git clone https://github.com/agno-agi/dash && cd dash

# Set your OpenAI key
cp example.env .env
# Edit .env to add OPENAI_API_KEY

# Build and start the service
docker compose up -d --build

# Load sample data and knowledge
docker exec -it dash-api python -m dash.scripts.load_data
docker exec -it dash-api python -m dash.scripts.load_knowledge

Now your API is available at http://localhost:8000/docs.

Using the Web UI

  1. Open https://os.agno.com and log in.
  2. Add a local endpoint pointing to http://localhost:8000.
  3. Click Connect.
  4. Ask questions like:
     • Who won the most F1 World Championships?
     • How many races has Lewis Hamilton won?

The agent automatically uses the six layers of context to produce a meaningful answer.

Sample Query Flow

User: What was the total revenue in the last quarter?

1. Dash locates the `orders` table schema and business rules.
2. It picks a saved query pattern that calculates revenue.
3. It generates SQL with proper filtering (excluding refunded orders).
4. It executes the query against the database.
5. It returns a human‑readable summary.
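The filtering rule in step 3 can be sketched in plain Python. The rows and field names here are invented sample data, not Dash output:

```python
# Sample rows as they might come back from the orders table (amounts in cents).
orders = [
    {"amount": 125_00, "status": "completed"},
    {"amount": 80_00, "status": "refunded"},
    {"amount": 45_50, "status": "completed"},
]

# Business-layer rule: exclude refunded orders, convert cents to dollars.
revenue = sum(o["amount"] for o in orders if o["status"] == "completed") / 100.0
print(revenue)  # 170.5
```

Encoding rules like "exclude refunds" in the business layer is what keeps the generated SQL from silently overstating revenue.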

Self‑Learning Loop Explained

  1. Execution Success – Insight is returned.
  2. Execution Failure – Dash diagnoses the bug (e.g., wrong data type), applies a fix, and stores it in learnings.
  3. The next time a similar query is run, the fix is automatically applied.
  4. Because this loop runs in the background, no model retraining or extra engineering effort is required.
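The loop above can be sketched as a tiny error‑fix store. All names here are hypothetical; Dash's real implementation lives in its learnings layer:

```python
class LearningStore:
    """Maps error signatures to remembered fixes (illustrative sketch)."""

    def __init__(self):
        self._fixes = {}

    def save(self, error_signature: str, fix: str) -> None:
        """Record the fix discovered after a failed execution."""
        self._fixes[error_signature] = fix

    def lookup(self, error_signature: str):
        """Return a previously saved fix, or None if this error is new."""
        return self._fixes.get(error_signature)

store = LearningStore()

# First failure: diagnose the bug, apply a fix, and save the learning.
store.save("type mismatch: amount vs TEXT", "CAST(amount AS INTEGER)")

# Next similar query: the stored fix is applied automatically.
print(store.lookup("type mismatch: amount vs TEXT"))  # CAST(amount AS INTEGER)
```

The key design point is that learnings are keyed by error pattern rather than by exact query text, so one fix generalizes to a family of similar queries.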

Adding or Updating Knowledge

Dash expects a well‑structured knowledge/ folder:

knowledge/
├── tables/          # JSON table metadata
├── queries/         # SQL patterns
└── business/        # Business rules and metrics

Run:

python -m dash.scripts.load_knowledge
# For a clean start
python -m dash.scripts.load_knowledge --recreate

Example: Table Metadata

{
  "table_name": "orders",
  "table_description": "Customer orders with denormalized line items",
  "use_cases": ["Revenue reporting", "Customer analytics"],
  "data_quality_notes": ["created_at is UTC", "status values: pending, completed, refunded", "amount stored in cents"]
}
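Since malformed metadata silently weakens the context layers, it can help to validate files before loading. This is a sketch; the required keys below reflect the example above, not a schema Dash itself enforces:

```python
import json

# Keys taken from the example metadata file above (assumed, not a Dash contract).
REQUIRED_KEYS = {"table_name", "table_description", "use_cases", "data_quality_notes"}

def validate_table_metadata(raw: str) -> dict:
    """Parse a table-metadata JSON string and check the expected keys exist."""
    meta = json.loads(raw)
    missing = REQUIRED_KEYS - meta.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return meta

raw = (
    '{"table_name": "orders", "table_description": "Customer orders", '
    '"use_cases": ["Revenue reporting"], "data_quality_notes": ["amount in cents"]}'
)
meta = validate_table_metadata(raw)
print(meta["table_name"])  # orders
```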

Example: Query Pattern

-- <query name>monthly_revenue
-- <query description>
-- Monthly revenue calculation.
-- Converts cents to dollars.
-- Excludes refunded orders.
-- </query description>
SELECT
  DATE_TRUNC('month', created_at) AS month,
  SUM(amount) / 100.0 AS revenue_dollars
FROM orders
WHERE status = 'completed'
GROUP BY 1
ORDER BY 1 DESC;
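The comment header above is machine‑readable, so a loader can index patterns by name and description. Here is a hypothetical parser for that format; Dash's actual loader may differ:

```python
import re

def parse_query_header(sql: str):
    """Extract the name and description from the comment-header format above."""
    name = re.search(r"<query name>(\S+)", sql)
    desc = re.search(
        r"<query description>\n(.*?)\n-- </query description>", sql, re.DOTALL)
    description = " ".join(
        line.lstrip("- ").strip() for line in desc.group(1).splitlines()
    ) if desc else ""
    return (name.group(1) if name else None), description

sql = """-- <query name>monthly_revenue
-- <query description>
-- Monthly revenue calculation.
-- </query description>
SELECT 1;"""

name, desc = parse_query_header(sql)
print(name, desc)  # monthly_revenue Monthly revenue calculation.
```

Keeping the description as SQL comments means the file stays directly executable while still carrying the metadata the agent retrieves against.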

Deploying to Railway

# Login to Railway
railway login

# Spin up the required services
./scripts/railway_up.sh

# Load data/knowledge
railway run python -m dash.scripts.load_data
railway run python -m dash.scripts.load_knowledge

# Inspect logs
railway logs --service dash

The Railway dashboard makes it easy to view metrics, update environment variables, and redeploy after code changes.

Best Practices

  • Keep Knowledge Fresh: Regularly review and refine table metadata and query patterns.
  • Monitor Learnings: Inspect the learnings directory to see what fixes the agent is applying.
  • Secure the API: Add authentication layers for production deployments.
  • Leverage the EXA API (Optional): Add web search for extra context beyond your database.

Conclusion

Dash demonstrates how a minimal, self‑learning loop combined with structured knowledge can dramatically improve natural‑language to SQL workflows. As an open‑source project under Apache‑2.0, it invites community contributions—whether adding new query patterns or expanding the learnings engine. If you’re looking for a practical, lightweight AI assistant that keeps learning without retraining, Dash is a compelling starting point.

Happy querying! 🚀
