Peekaboo: AI‑Powered macOS CLI for Screenshots & GUI

In a world where AI is increasingly woven into our workflows, having a lightweight, scriptable tool that can see and act on your desktop is a game‑changer. Peekaboo is that tool. It’s a free, MIT‑licensed, macOS‑only command‑line interface (CLI) and optional MCP server that lets you capture screenshots, inspect UI elements, and send precise commands—all while using GPT‑style or local Ollama models to reason about the screen.

Why Peekaboo? What Makes It Stand Out

| Feature | What It Does | Why It Matters |
| --- | --- | --- |
| Pixel‑accurate capture | Screenshots of windows, menus, or the entire screen, optionally Retina‑scaled | Gives AI the fidelity it needs for reliable visual understanding |
| Natural‑language navigation | Commands like peekaboo "Open Notes and create a TODO list" | Lets non‑technical users author automation in plain English |
| Rich toolset | see, click, type, scroll, menu, dock, etc. | Each tool maps to a UI action, enabling complex workflows |
| Multi‑provider AI | GPT‑5.1, Claude 4.x, Grok 4‑fast, Gemini 2.5, local Ollama | Pick the model that fits your privacy or budget |
| CLI + MCP server | One binary works for command‑line scripts and as a plug‑in for Claude Desktop or Cursor | Versatility without double‑tooling |
| Open source, community‑friendly | 2k+ stars, active contributors, MIT license | No lock‑in, you can fork or add features |

Getting Started

1. Install the macOS App & CLI

brew install steipete/tap/peekaboo

The Homebrew formula installs the native Swift CLI binary, managed by Homebrew, along with a macOS application for drag‑and‑drop usage.

2. Install as an MCP Server (Node 22+)

If you prefer to run Peekaboo from a JavaScript environment or integrate it with Claude Desktop/Cursor’s MCP interface:

npx -y @steipete/peekaboo

This launches the Peekaboo MCP server so that an MCP client such as Claude Desktop or Cursor can connect to it.
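
Because the MCP route needs Node 22 or newer, a setup script can check the version first and fail with a clear message. The sketch below is one way to do that. The function name node_is_supported is made up for illustration, and it assumes node --version prints a string of the form v22.3.0.

```shell
# Hypothetical helper: succeeds when a Node version string (e.g. "v22.3.0")
# has a major version of at least 22.
node_is_supported() {
  local version="${1#v}"        # drop the leading "v"
  local major="${version%%.*}"  # keep the text before the first dot
  # Reject anything that is not a plain number (e.g. an empty string).
  case "$major" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$major" -ge 22 ]
}

# Usage (not executed here):
#   if node_is_supported "$(node --version)"; then
#     npx -y @steipete/peekaboo
#   else
#     echo "Peekaboo's MCP server needs Node 22 or newer" >&2
#   fi
```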

3. Quick‑Start Examples

Below are a few command‑line snippets that demonstrate Peekaboo’s most common use cases.

# Capture the entire screen at Retina 2x and save it
peekaboo image --mode screen --retina --path ~/Desktop/screen.png
# Capture a screenshot of Safari, extract the snapshot id, and click a label
snapshot_id=$(peekaboo see --app Safari --json-output | jq -r '.data.snapshot_id')
peekaboo click --on "Reload this page" --snapshot "$snapshot_id"
# Run a full natural‑language automation script
peekaboo "Open Notes and create a TODO list with three items"
# Use the CLI to list all current windows
peekaboo list windows
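
The see-then-click pattern above can be wrapped in a small function so that a failed capture does not turn into a click against an empty snapshot id. This is a sketch only: click_label is a hypothetical name, it relies on the same peekaboo flags and .data.snapshot_id JSON path used in the snippet above, and it assumes jq is installed.

```shell
# Hypothetical helper: capture an app's UI, then click an element by label.
# Usage: click_label <app> <label>
click_label() {
  local app="$1" label="$2" snapshot_id

  # Same call as in the quick-start snippet; -r makes jq print a raw string.
  snapshot_id=$(peekaboo see --app "$app" --json-output | jq -r '.data.snapshot_id')

  # jq prints "null" when the field is missing, so treat that as a failure too.
  if [ -z "$snapshot_id" ] || [ "$snapshot_id" = "null" ]; then
    echo "click_label: no snapshot id returned for $app" >&2
    return 1
  fi

  peekaboo click --on "$label" --snapshot "$snapshot_id"
}

# Usage (not executed here):
#   click_label Safari "Reload this page"
```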

4. Writing .peekaboo.json Automation Scripts

Peekaboo’s run subcommand lets you create deterministic, testable workflows:

{
  "steps": [
    {"click": {"on": "Google Search", "app": "Safari"}},
    {"type": {"text": "OpenAI API", "delay_ms": 200}},
    {"press": {"key": "Enter", "repeat": 1}}
  ]
}

Save the steps to a file such as script.json, then run them with peekaboo run script.json.
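
Since the script is plain JSON, it can also be generated from a shell script when one value (here the search text) changes between runs. The sketch below reuses the exact step structure shown above. write_search_script and json_escape are hypothetical helper names, and the escaping covers only backslashes and double quotes, which is assumed to be enough for simple search strings.

```shell
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: write a three-step script with the given search text.
# Usage: write_search_script <output-file> <search-text>
write_search_script() {
  local out="$1" text
  text=$(json_escape "$2")
  cat > "$out" <<EOF
{
  "steps": [
    {"click": {"on": "Google Search", "app": "Safari"}},
    {"type": {"text": "$text", "delay_ms": 200}},
    {"press": {"key": "Enter", "repeat": 1}}
  ]
}
EOF
}

# Usage (not executed here):
#   write_search_script script.json "OpenAI API"
#   peekaboo run script.json
```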

Extending the Tool with Custom AI Models

Peekaboo defaults to GPT‑5.1 but you can point it at any OpenAI, Anthropic, xAI, Gemini, or local Ollama model simply by setting the PEEKABOO_AI_PROVIDERS environment variable or using peekaboo config add:

peekaboo config add openai/gpt-5.1
peekaboo config add anthropic/claude-opus-4
peekaboo config add ollama/llava
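
For scripts and CI jobs, the environment-variable route can be easier than editing the stored configuration. The sketch below joins provider/model pairs into a single value. It assumes PEEKABOO_AI_PROVIDERS takes a comma-separated list, which should be checked against the Peekaboo documentation, and join_providers is a hypothetical helper name.

```shell
# Hypothetical helper: join provider/model pairs with commas.
# Assumption: PEEKABOO_AI_PROVIDERS expects a comma-separated list.
join_providers() {
  local IFS=','
  printf '%s\n' "$*"
}

# Usage (not executed here):
#   export PEEKABOO_AI_PROVIDERS="$(join_providers openai/gpt-5.1 ollama/llava)"
#   peekaboo "Open Notes and create a TODO list"
```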

Inside your JSON scripts or interactive prompts, you can now ask the AI to locate on-screen coordinates, interpret what a screenshot shows, or suggest next actions.

Common Use Cases

| Scenario | How Peekaboo Can Help |
| --- | --- |
| Automated UI testing | Use see to capture the DOM‑like tree, click and type actions to simulate user flows, and assertion scripts to compare snapshots |
| Voice‑controlled workflows | Pipe speech recognition output to a Peekaboo prompt and let the AI decide which UI element to target |
| Desktop bots | Combine Peekaboo with frameworks like robotjs or expect for end‑to‑end automation across macOS and AI |
| Accessibility audits | Inspect the accessibility tree via see and feed it to the AI to produce audit reports |
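
For the UI-testing scenario, the simplest possible snapshot assertion is a byte-for-byte file comparison. The sketch below shows that idea only. assert_same_image is a hypothetical helper, and exact comparison is an assumption that tends to be too strict for real screens (clocks, cursors, and animations change pixels), so a perceptual diff tool may be a better fit in practice.

```shell
# Hypothetical helper: fail when a fresh capture differs from a baseline file.
# Usage: assert_same_image <baseline.png> <current.png>
assert_same_image() {
  local baseline="$1" current="$2"
  if [ ! -f "$baseline" ]; then
    echo "missing baseline: $baseline" >&2
    return 2
  fi
  # cmp -s is silent and exits non-zero when the files differ.
  if ! cmp -s "$baseline" "$current"; then
    echo "screenshot differs from baseline: $current" >&2
    return 1
  fi
}

# Usage (not executed here), reusing the capture command from the quick-start:
#   peekaboo image --mode screen --path /tmp/current.png
#   assert_same_image tests/baseline.png /tmp/current.png
```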

Contributing & Community

Peekaboo is actively maintained by @steipete and a handful of contributors. If you’d like to add a new feature, open a pull request, or simply raise an issue, please refer to the CONTRIBUTING.md for guidelines.

The project’s MIT license lets you fork, modify, and distribute the code with very few conditions (chiefly, keeping the copyright and license notice in copies), which suits both hobbyists and professional developers.

Wrap‑Up

Peekaboo transforms a raw screenshot into a programmable UI. Whether you’re writing a one‑liner automation or building a full‑blown AI‑driven desktop assistant, this open‑source CLI gives you the raw power and the AI intelligence you need—all for free. Grab it, try it on macOS, and watch your productivity sky‑rocket.

Happy automating!
