BabelDOC: Open-Source PDF Translator Built for AI-Powered Docs

BabelDOC – The Open‑Source AI‑Powered PDF Translator

In the age of global research and rapid business expansion, the ability to translate complex PDF documents while preserving layout and formatting has become essential. Traditional OCR‑based tools often chop up text, break tables, or lose formatting, leaving translators to do a lot of manual cleanup. Enter BabelDOC, a community‑driven project that turns AI‑powered translation into a seamless, single workflow.

What Is BabelDOC?

BabelDOC is "yet another document translator", written in Python. It accepts a PDF, extracts text using layout-analysis parsers, feeds sentences into an LLM (OpenAI-compatible by default), and stitches the translated text back into a new PDF that mirrors the original design.
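
To make the extract, translate, and reassemble flow concrete, here is a minimal sketch of the core idea: the text changes while the position it came from is kept. The TextBlock type, the translate_blocks function, and the lookup-table translator are all hypothetical stand-ins for illustration and are not BabelDOC's actual internals.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TextBlock:
    """A run of text plus the page position it was extracted from (hypothetical type)."""
    page: int
    bbox: Tuple[float, float, float, float]  # x0, y0, x1, y1 in page coordinates
    text: str


def translate_blocks(blocks: List[TextBlock],
                     translate: Callable[[str], str]) -> List[TextBlock]:
    """Translate each block's text while leaving its page and bounding box untouched."""
    return [replace(block, text=translate(block.text)) for block in blocks]


# Stand-in for an LLM call: a fixed lookup table, so the sketch runs offline.
toy_dictionary = {"Hello": "你好", "World": "世界"}


def toy_translate(text: str) -> str:
    """Return the dictionary entry for text, or the text unchanged if unknown."""
    return toy_dictionary.get(text, text)


original = [
    TextBlock(page=0, bbox=(72.0, 700.0, 200.0, 720.0), text="Hello"),
    TextBlock(page=0, bbox=(72.0, 670.0, 200.0, 690.0), text="World"),
]
translated = translate_blocks(original, toy_translate)
```

Because each translated block carries the same page index and bounding box as its source, a renderer could place the new text where the old text sat. A real tool also has to handle fonts, text that grows or shrinks after translation, and formulas, which this sketch ignores.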

Key features include:

- Dual-page output: original and translated pages side-by-side, or alternating order.
- Rich text support: formulas, tables, and complex formatting remain intact.
- Offline asset generation: create a ZIP of fonts and model weights for air-gapped environments.
- Extensible CLI and Python API: easy integration into scripts or larger applications.
- Glossary support: keep terminology consistent across documents.

Getting Started – Installation

BabelDOC can be installed in two ways:

  1. PyPI + UV (recommended)

    uv tool install --python 3.12 BabelDOC
    babeldoc --help
    
    UV automatically resolves dependencies and places the babeldoc binary on your PATH.

  2. From source (for developers)

    git clone https://github.com/funstory-ai/BabelDOC
    cd BabelDOC
    uv run babeldoc --help
    
    The uv run command creates the project's virtual environment if it does not exist yet, installs the dependencies, and runs BabelDOC directly from the checkout.

Basic Usage

Translate a single PDF from English to Chinese (the command sets no language flags, so it relies on BabelDOC's default language pair):

babeldoc --openai --openai-model "gpt-4o-mini" \
  --openai-base-url "https://api.openai.com/v1" \
  --openai-api-key "YOUR_KEY" \
  --files example.pdf

For multiple documents, simply repeat the --files flag:

babeldoc --files paper1.pdf --files paper2.pdf --openai ...

Output is written to the current working directory unless you supply --output /path/to/dir.
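
When the file list comes from a script rather than the keyboard, the same command can be assembled programmatically. The sketch below uses only the flags shown above; the function names build_babeldoc_command and run_translation are made up for this example, and it assumes the babeldoc executable is already on your PATH.

```python
import subprocess
from typing import Iterable, List, Optional


def build_babeldoc_command(files: Iterable[str],
                           api_key: str,
                           model: str = "gpt-4o-mini",
                           base_url: str = "https://api.openai.com/v1",
                           output_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for one babeldoc run over several PDFs."""
    cmd = [
        "babeldoc", "--openai",
        "--openai-model", model,
        "--openai-base-url", base_url,
        "--openai-api-key", api_key,
    ]
    for pdf in files:
        cmd += ["--files", str(pdf)]  # one --files flag per document
    if output_dir is not None:
        cmd += ["--output", str(output_dir)]
    return cmd


def run_translation(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError if babeldoc exits non-zero."""
    subprocess.run(cmd, check=True)


command = build_babeldoc_command(
    ["paper1.pdf", "paper2.pdf"],
    api_key="YOUR_KEY",
    output_dir="translated",
)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. Call run_translation(command) to start the translation.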

Advanced Options

BabelDOC’s CLI is packed with flags that give you fine-grained control:

- --disable-rich-text-translate – Skip rich text for improved compatibility.
- --watermark-output-mode – Choose between watermarked, no watermark, or both.
- --max-pages-per-part – Split large PDFs into manageable chunks.
- --openai-model – Swap in any OpenAI-compatible LLM such as glm-4-flash or deepseek-chat.
- --glossary-files – Load CSV term lists to force consistent translation.

These options are perfect for production pipelines where speed, size, and consistency matter.
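
As an illustration of the glossary and chunking flags, the sketch below writes a small term list to CSV and builds the matching arguments. The column names "source" and "target" are an assumption about the expected CSV layout, so check the project documentation before relying on them; write_glossary and advanced_flags are hypothetical helper names.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def write_glossary(path: Path, terms: Dict[str, str]) -> Path:
    """Write source/target term pairs as a CSV file and return its path."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # ASSUMPTION: header names are a guess, not confirmed by this article.
        writer.writerow(["source", "target"])
        writer.writerows(terms.items())
    return path


def advanced_flags(glossary: Path, max_pages_per_part: int) -> List[str]:
    """Return extra CLI arguments for a glossary and for splitting large PDFs."""
    return [
        "--glossary-files", str(glossary),
        "--max-pages-per-part", str(max_pages_per_part),
    ]


glossary_path = write_glossary(
    Path(tempfile.mkdtemp()) / "terms.csv",
    {"transformer": "Transformer", "attention": "注意力"},
)
flags = advanced_flags(glossary_path, max_pages_per_part=50)
```

The resulting flags list can be appended to any babeldoc argument list before it is run.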

Offline Asset Packaging

If you’re working in an environment without network access, BabelDOC can generate a self‑contained asset package:

babeldoc --generate-offline-assets ./offline_assets

Later, restore it on another machine:

babeldoc --restore-offline-assets ./offline_assets/package.zip
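
The archive name in the restore command is best treated as a placeholder. Assuming the generate step leaves a .zip file in the target directory, a small helper can locate it instead of hard-coding the name. Both find_asset_package and restore_command are hypothetical helpers written for this example, and the demo uses an empty stand-in file rather than a real asset package.

```python
import tempfile
from pathlib import Path
from typing import List


def find_asset_package(assets_dir: Path) -> Path:
    """Return the most recently modified .zip file in assets_dir."""
    packages = sorted(assets_dir.glob("*.zip"), key=lambda p: p.stat().st_mtime)
    if not packages:
        raise FileNotFoundError(f"no .zip package found in {assets_dir}")
    return packages[-1]


def restore_command(package: Path) -> List[str]:
    """Build the babeldoc argument list that restores assets from package."""
    return ["babeldoc", "--restore-offline-assets", str(package)]


# Demo with an empty placeholder file; a real package comes from
# `babeldoc --generate-offline-assets`.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "package.zip").write_bytes(b"")
command = restore_command(find_asset_package(demo_dir))
```

On the offline machine, pass the resulting list to subprocess.run(command, check=True) after copying the archive over.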

Integrating with Zotero

Academic researchers often store PDFs in Zotero. BabelDOC supports direct integration via:

  1. Immersive Translate plugin (for Immersive Translate Pro members).
  2. pdf2zh-next wrapper (for self-deployed users).

These plugins automatically translate PDFs on download or when you press a contextual menu item, adding translated versions to your library.

Self‑Deployment with PDFMathTranslate

For users who want full control over the server stack, BabelDOC can be embedded into PDFMathTranslate‑next. The resulting application includes a web UI, batch queues, and a RESTful API.

Roadmap & Community

The project is actively maintained, with over 200 releases and more than 6k stars. Upcoming milestones include:

- Native line support
- Extensive table handling
- Drop-cap support
- Cross-page paragraph merging
- Improved OCR detection for scanned PDFs

Contributors are welcome through pull requests. The code quality is maintained via pre‑commit hooks, automated tests, and continuous integration.

TL;DR

BabelDOC delivers:

- One-stop PDF translation with AI
- Full control over output formatting
- CLI and Python API for automation
- Offline asset support for air-gapped environments
- Community-driven development and an active roadmap

Whether you’re translating research papers, technical manuals, or business contracts, BabelDOC offers the flexibility and power to keep the original layout while localizing content at the speed of AI. Give it a try today and transform how you handle multilingual PDFs.
