Flux 2 in Pure C: Zero‑Dependency Image Generation

January 25, 2026

Category: Practical Open Source Projects

Tags: Open Source, Image Generation, flux2, pure‑C, ml‑inference

Flux 2 is a new family of latent‑diffusion models that can turn prompts into images in as few as four sampling steps. Black Forest Labs released the small “Klein” variant with just 4 B parameters, but running it is still heavy: the model files total 16 GB, and the usual inference path needs Python and a full ML stack. flux2.c is a fast, lightweight, pure C implementation that runs on macOS or Linux without requiring Python, CUDA, or a full ML stack.

Why a C implementation?

Zero external runtime – Only the C standard library is required, making the binary portable across environments that can compile C.
Memory‑mapped weights – The default --mmap mode keeps the 16 GB of safetensors weights on disk and loads them on demand, bringing peak memory down to ~4–5 GB.
Optional acceleration – Apple Metal (MPS) on Apple Silicon, or BLAS on Intel Macs and Linux, gives a speedup of roughly 30× over the pure‑C baseline. MPS runs on the GPU; BLAS accelerates the CPU path.
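
To illustrate the memory‑mapping idea above, here is a minimal POSIX sketch of mapping a file read‑only so that the kernel pages data in only when it is touched. It is not taken from flux2.c: the `mapped_file` type and both helper names are hypothetical, and the real loader presumably also parses the safetensors header, which is omitted here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper type (not part of flux2.c): a read-only file view. */
typedef struct {
    const unsigned char *data; /* NULL if the mapping failed */
    size_t size;               /* file size in bytes */
} mapped_file;

/* Map `path` read-only. No file data is read here; pages are faulted in
 * lazily on first access and clean pages can be dropped by the kernel
 * under memory pressure. */
static mapped_file map_file_readonly(const char *path) {
    mapped_file mf = { NULL, 0 };
    int fd = open(path, O_RDONLY);
    if (fd < 0) return mf;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            mf.data = p;
            mf.size = (size_t)st.st_size;
        }
    }
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return mf;
}

/* Release the mapping and reset the view. */
static void unmap_file(mapped_file *mf) {
    if (mf->data) munmap((void *)mf->data, mf->size);
    mf->data = NULL;
    mf->size = 0;
}
```

A caller would map the weights file once, read tensors through `mf.data` at their byte offsets, and call `unmap_file` at shutdown. Only the pages actually read count toward resident memory, which is the general mechanism behind a lower peak than the file size.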

Quick Start

# 1. Clone the repo
git clone https://github.com/antirez/flux2.c
cd flux2.c

# 2. Choose a backend
make mps          # Apple Silicon (fastest)
# make blas      # Intel Mac / Linux with OpenBLAS
# make generic   # Pure C, no dependencies

# 3. Download the 16 GB model
./download_model.sh          # shell script (curl)
# or: pip install huggingface_hub && python download_model.py

# 4. Run the demo
./flux -d flux-klein-model \
       -p "A woman wearing sunglasses" \
       -o output.png

If you add --show and your terminal supports the Kitty graphics protocol, the image is also displayed directly in the terminal.
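
For readers curious how an in‑terminal preview can work, the sketch below base64‑encodes data and sends it using the Kitty graphics protocol's escape sequence (`f=100` marks the payload as PNG, `a=T` means transmit and display, `m=1` signals that more chunks follow). This is an independent illustration, not flux2.c's code; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard base64 with '=' padding. `out` must hold 4*((n+2)/3)+1 bytes.
 * Returns the number of characters written (excluding the NUL). */
static size_t base64_encode(const unsigned char *in, size_t n, char *out) {
    size_t o = 0;
    for (size_t i = 0; i < n; i += 3) {
        unsigned v = (unsigned)in[i] << 16;
        if (i + 1 < n) v |= (unsigned)in[i + 1] << 8;
        if (i + 2 < n) v |= (unsigned)in[i + 2];
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = (i + 1 < n) ? B64[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < n) ? B64[v & 63] : '=';
    }
    out[o] = '\0';
    return o;
}

/* Send base64-encoded PNG data to `tty` in chunks of at most 4096
 * characters. Every chunk except the last sets m=1. */
static void kitty_send_png(FILE *tty, const char *b64, size_t len) {
    const size_t chunk = 4096; /* multiple of 4, as the protocol requires */
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        int more = (off + n < len);
        if (off == 0)
            fprintf(tty, "\033_Gf=100,a=T,m=%d;", more);
        else
            fprintf(tty, "\033_Gm=%d;", more);
        fwrite(b64 + off, 1, n, tty);
        fputs("\033\\", tty); /* string terminator closes each chunk */
    }
    fflush(tty);
}
```

In use, a program would read the PNG file into memory, encode it with `base64_encode`, and pass the result to `kitty_send_png(stdout, ...)`. Terminals that do not implement the protocol will generally ignore the escape sequence or print garbage, so a real tool would gate this behind a flag.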

Features at a Glance

Feature	Description
Text‑to‑Image	Generate images from prompts at resolutions from 64 to 1792 px.
Image‑to‑Image	Use existing images as reference via in‑context conditioning.
Multi‑Reference	Combine up to 16 reference images for composites.
Interactive CLI	Omit `-p` to start a REPL that remembers reference IDs.
Terminal image preview	`--show` / `--show-steps` display images directly in Kitty/Ghostty/iTerm2.
Zero‑Dependency API	Export a simple C library (`libflux.a`) with `flux_generate()` and `flux_img2img()`.
Metadata‑rich PNGs	Images include seed and model info in PNG metadata for reproducibility.
Benchmarks	On an M3 Max with 4 steps, the MPS build generates a 512×512 image in ~13 s, on par with PyTorch.
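
The "Metadata‑rich PNGs" row can be made concrete with a small sketch of how a key/value pair is stored in a PNG file. The article does not say which chunk type or keywords flux2.c uses, so the `tEXt` chunk and the `seed` keyword below are assumptions chosen for illustration, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CRC-32 as used by PNG chunks (reflected polynomial 0xEDB88320),
 * computed bit by bit for brevity rather than with a lookup table. */
static uint32_t png_crc32(const unsigned char *buf, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Store a 32-bit value in big-endian order, as PNG requires. */
static void put_be32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Serialize one tEXt chunk: length, type, "keyword\0text", CRC.
 * Returns the number of bytes written, or 0 if the keyword is invalid
 * (PNG allows 1 to 79 bytes) or `out` is too small. */
static size_t png_text_chunk(unsigned char *out, size_t cap,
                             const char *keyword, const char *text) {
    size_t klen = strlen(keyword), tlen = strlen(text);
    size_t dlen = klen + 1 + tlen;      /* data = keyword, NUL, text */
    size_t total = 4 + 4 + dlen + 4;    /* length + type + data + CRC */
    if (klen == 0 || klen > 79 || total > cap) return 0;

    put_be32(out, (uint32_t)dlen);
    memcpy(out + 4, "tEXt", 4);
    memcpy(out + 8, keyword, klen);
    out[8 + klen] = 0;
    memcpy(out + 9 + klen, text, tlen);
    /* The CRC covers the chunk type and data, but not the length field. */
    put_be32(out + 8 + dlen, png_crc32(out + 4, 4 + dlen));
    return total;
}
```

A writer would emit such a chunk anywhere after the IHDR chunk and before IEND, for example `png_text_chunk(buf, sizeof buf, "seed", "42")`. Standard PNG tools can then read the value back, which is what makes a generated image reproducible from the file alone.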

Building From Source

Prerequisites – a C compiler, plus optionally OpenBLAS (Linux) or Apple's Accelerate framework (macOS) for the blas target. The mps target uses Apple Metal.

# On macOS
clang -framework Accelerate -o flux flux.c flux_h*.c ...
# On Linux via OpenBLAS
gcc -fopenmp -o flux flux.c flux_h*.c ... -L/usr/lib -lopenblas

Run make with no target to list the available backends. Each target builds a flux binary for that backend, so pick the fastest one your platform supports.

Using the Library in Your Own Project

#include "flux.h"
#include <stdio.h>

int main(void) {
    /* Load the model from the directory created by the download step. */
    flux_ctx *ctx = flux_load_dir("flux-klein-model");
    if (!ctx) { fprintf(stderr, "Failed: %s\n", flux_get_error()); return 1; }

    /* Start from the library defaults, then override size and seed. */
    flux_params p = FLUX_PARAMS_DEFAULT;
    p.width = 512; p.height = 512; p.seed = 42;

    flux_image *img = flux_generate(ctx, "A fluffy orange cat", &p);
    if (!img) {
        fprintf(stderr, "Error: %s\n", flux_get_error());
        flux_free(ctx);  /* release the model before bailing out */
        return 1;
    }

    flux_image_save(img, "cat.png");
    flux_image_free(img);
    flux_free(ctx);
    return 0;
}

Compile with gcc -o myapp myapp.c -L. -lflux -lm -framework Accelerate (macOS) or the OpenBLAS equivalent.

Debugging and Comparing With Python

The repo ships a debug/ folder with Python scripts that dump the exact tensors from the Black Forest Labs reference implementation. Run the C binary with --debug-py to verify pixel‑perfect parity with that reference.

./flux -d flux-klein-model --debug-py -o c_debug.png

This is invaluable for sanity checks when tweaking model weights or tokenizers.
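
When two implementations are not expected to match bit for bit (for example after changing a BLAS backend), a tolerance‑based comparison of tensor dumps is a common fallback. The sketch below is a generic illustration, not the repo's debug tooling: the function names are hypothetical, and it assumes both tensors are already loaded as equally sized `float` arrays.

```c
#include <stddef.h>

/* Largest absolute element-wise difference between two buffers.
 * NaN differences never compare greater, so they are ignored here;
 * use tensors_close() below when NaNs must count as a mismatch. */
static float max_abs_diff(const float *a, const float *b, size_t n) {
    float worst = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}

/* Returns 1 if every element satisfies |a - b| <= atol + rtol * |b|,
 * otherwise 0. A NaN in either input fails the check. */
static int tensors_close(const float *a, const float *b, size_t n,
                         float atol, float rtol) {
    for (size_t i = 0; i < n; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f) d = -d;
        float ref = (b[i] < 0.0f) ? -b[i] : b[i];
        if (!(d <= atol + rtol * ref)) return 0;
    }
    return 1;
}
```

Reporting `max_abs_diff` per layer helps locate where two pipelines first diverge, while `tensors_close` gives a simple pass/fail for automated checks. The tolerances are a judgment call and depend on the numeric precision in use.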

FAQ

Question	Answer
Can I use it on Windows?	Yes – the binary builds with a C compiler and, optionally, a BLAS library; MPS is exclusive to macOS.
Do I need a GPU?	Not strictly; the generic build runs on CPU. GPU backends simply accelerate inference.
What about the 16 GB download?	The binary can run with memory‑mapped weights (`--mmap`), letting 8‑GB systems generate images without loading the entire model into RAM.

Conclusion

flux2.c brings state‑of‑the‑art image generation to developers who want performance without the bloat of a Python runtime or CUDA. Its small footprint, optional hardware acceleration, and the ability to embed the library in C/C++ projects make it a good fit for embedded systems, edge computing, or just a fun CLI tool for artists. Clone, build, and start creating: your terminal can now paint masterpieces for free.
