Mirage: A Unified Virtual Filesystem for AI Agents

Building AI agents that can interact with real-world data often feels like a fragmented nightmare. You end up juggling a dozen different SDKs, managing authentication for every service, and writing custom glue code just to get an agent to read a file from S3 and post a summary to Slack.

Mirage takes a different approach: it treats every backend as part of a single, unified virtual filesystem (VFS). Instead of teaching your agent how to use the Slack API, the GitHub API, and the S3 SDK, you mount each service as a directory. Your agent can then use the Unix tools it already knows (ls, cat, grep, cp) to interact with the entire stack.

Why Mirage Matters

Modern LLMs are trained on large corpora of code and documentation, which makes them fluent in bash and filesystem semantics. By presenting remote services as a filesystem, Mirage leans on that existing knowledge to reduce hallucinations and API-specific errors.

Key benefits include:

Unified Abstraction: Whether it's Google Drive, Redis, or a local directory, it all looks like a standard file tree.
Bash-Native: Agents use standard command-line tools to perform complex operations across services.
Portable Workspaces: You can snapshot your entire environment (including remote data state) into a single file, making agent runs reproducible and easy to debug.
Framework Agnostic: It integrates with the OpenAI Agents SDK, the Vercel AI SDK, LangChain, and Pydantic AI.

How It Works

Mirage acts as a middleware layer between your agent and your infrastructure. It uses a dispatcher and a two-layer caching system so that repeated operations do not have to go back to the remote service every time.
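
The source does not describe how Mirage's dispatcher is implemented. As a rough illustration of the general idea of routing a path to whichever backend is mounted there, the following Python sketch resolves paths by longest matching mount prefix. The MountTable class, its resolve method, and the backend names are hypothetical and are not part of Mirage's API.

```python
from pathlib import PurePosixPath


class MountTable:
    """Maps mount points to backends and resolves paths by longest prefix.

    Hypothetical illustration only; not Mirage's actual dispatcher.
    """

    def __init__(self, mounts):
        # Normalize mount points so "/s3/" and "/s3" are treated alike.
        self._mounts = {PurePosixPath(p): backend for p, backend in mounts.items()}

    def resolve(self, path):
        """Return (backend, path relative to the mount) for an absolute path."""
        target = PurePosixPath(path)
        # Try the deepest mount points first so nested mounts win.
        for mount in sorted(self._mounts, key=lambda m: len(m.parts), reverse=True):
            if target == mount or mount in target.parents:
                return self._mounts[mount], str(target.relative_to(mount))
        raise FileNotFoundError(f"no backend mounted for {path}")


# Backends are plain strings here; a real system would hold resource objects.
table = MountTable({
    "/s3": "s3-backend",
    "/s3/archive": "archive-backend",
    "/data": "ram-backend",
})
```

Matching on whole path components rather than raw string prefixes means a path such as /s3x/file is not mistaken for something under /s3.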

The Caching Layer

Repeatedly hitting remote APIs is slow and costly. Mirage implements a robust two-layer cache:

Index Cache: Stores directory listings and metadata. Subsequent ls or find commands hit the local index until the TTL (Time-To-Live) expires.
File Cache: Stores object bytes. The first read streams from the origin, while subsequent reads are served from the local cache.

You can configure these to use RAM for ephemeral tasks or Redis for persistent, multi-worker environments:

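// Imports omitted: Workspace, S3Resource, and the Redis cache stores come from the Mirage TypeScript SDK.
const ws = new Workspace(
  // Mount table: path prefix -> backing resource
  { '/s3': new S3Resource({ bucket: 'my-bucket' }) },
  {
    // File cache: object bytes
    cache: new RedisFileCacheStore({ url: 'redis://localhost:6379/0', limit: '8GB' }),
    // Index cache: directory listings and metadata, refreshed when the TTL expires
    index: new RedisIndexCacheStore({ url: 'redis://localhost:6379/0', ttl: 600 }),
  }
);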
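
To make the two-layer behavior described above concrete, here is a minimal in-memory Python sketch. It is not Mirage's implementation: the TwoLayerCache and CountingOrigin classes, their method names, and the injectable clock are all assumptions made for illustration, and the sketch leaves out the size limit that a real file cache would enforce.

```python
import time


class TwoLayerCache:
    """Index cache with a TTL for listings, plus a byte cache for file contents.

    Hypothetical illustration only; not Mirage's API.
    """

    def __init__(self, origin, ttl_seconds=600, clock=time.monotonic):
        self.origin = origin      # assumed to expose list(path) and read(path)
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so expiry can be exercised without sleeping
        self._index = {}          # path -> (expires_at, listing)
        self._files = {}          # path -> bytes

    def list(self, path):
        entry = self._index.get(path)
        if entry is not None and entry[0] > self.clock():
            return entry[1]       # still fresh: served locally, no origin call
        listing = self.origin.list(path)
        self._index[path] = (self.clock() + self.ttl, listing)
        return listing

    def read(self, path):
        if path not in self._files:
            # First read goes to the origin; later reads are served locally.
            self._files[path] = self.origin.read(path)
        return self._files[path]


class CountingOrigin:
    """Stand-in for a remote service that counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def list(self, path):
        self.calls += 1
        return ["report.csv"]

    def read(self, path):
        self.calls += 1
        return b"ok,error\n"
```

Keeping listings and bytes in separate stores lets directory metadata expire on a short TTL while large object bodies stay cached until evicted.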

Getting Started

Mirage provides SDKs for both Python and TypeScript, making it easy to embed into your existing FastAPI or Express applications.

Quick Python Example

import asyncio

from mirage import Workspace
from mirage.resource.s3 import S3Resource
from mirage.resource.ram import RAMResource


async def main():
    # Mount multiple sources under one workspace
    ws = Workspace({
        "/data": RAMResource(),
        "/s3": S3Resource(bucket="my-logs"),
    })

    # Copy a file from S3 into the in-memory mount, then count matching lines
    await ws.execute("cp /s3/report.csv /data/local.csv")
    await ws.execute("grep 'error' /data/local.csv | wc -l")


# ws.execute is awaited, so it has to run inside an event loop
asyncio.run(main())

Why You Should Use It

If you are building autonomous agents, you are likely spending too much time writing "tool-calling" logic that maps natural language to specific API calls. Mirage allows you to define a workspace once and let the agent navigate it naturally.

By using Mirage, you aren't just building an agent; you are building a reproducible environment. Because the workspace can be snapshotted, you can take a failing agent run, export the demo.tar snapshot, and inspect exactly what the agent saw at the moment of failure.
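
The source does not document the internal layout of a snapshot. Assuming that demo.tar is an ordinary tar archive, as its name suggests, a first look at its contents might use Python's standard tarfile module, as in the sketch below. The helper names list_snapshot and read_member are hypothetical.

```python
import tarfile


def list_snapshot(path):
    """Return (member name, size in bytes) for every regular file in a tar archive."""
    with tarfile.open(path, "r") as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


def read_member(path, member_name):
    """Return the raw bytes of one file stored in the archive."""
    with tarfile.open(path, "r") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            # extractfile returns None for members that are not regular files
            raise FileNotFoundError(member_name)
        return extracted.read()
```

Calling list_snapshot("demo.tar") would then give a quick inventory of what was captured, and read_member can pull out a single file for closer inspection without unpacking the whole archive.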

For developers working with Claude Code or similar CLI-based agents, Mirage provides a lightweight daemon that lets these agents reach into your cloud infrastructure as if it were a local folder. This expands what they can do without requiring a custom tool definition for every service.

Source

strukto-ai/mirage: A Unified Virtual Filesystem For AI Agents