Posts tagged with: MLX

Content related to MLX

oMLX: Mac Menu Bar LLM Server with SSD Cache

March 10, 2026

Discover oMLX, a local LLM server for Apple Silicon Macs. Run LLMs, VLMs, and embedding models from your menu bar with continuous batching, tiered KV caching (RAM + SSD), and multi-model serving. It includes an admin dashboard, OpenAI API compatibility, Claude Code optimization, and one-click model downloads from Hugging Face. Install it via DMG, Homebrew, or from source; it suits developers who want production-grade local AI without cloud costs.
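The tiered KV caching the post mentions can be illustrated with a minimal two-tier sketch: hot entries live in RAM with LRU eviction, evicted entries spill to files on SSD, and SSD hits are promoted back into RAM. All names, capacities, and the pickle-to-file spill mechanism here are illustrative assumptions, not oMLX's actual implementation.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class TieredKVCache:
    """Illustrative two-tier (RAM + SSD) cache sketch.

    Hypothetical example; not oMLX's actual code.
    """

    def __init__(self, ram_capacity=2, spill_dir=None):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # key -> value, in LRU order
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="kvcache_")

    def _path(self, key):
        # Assumes keys are filesystem-safe strings (illustrative shortcut).
        return os.path.join(self.spill_dir, f"{key}.pkl")

    def put(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)  # mark as most recently used
        while len(self.ram) > self.ram_capacity:
            old_key, old_val = self.ram.popitem(last=False)  # evict LRU entry
            with open(self._path(old_key), "wb") as f:       # spill to SSD
                pickle.dump(old_val, f)

    def get(self, key):
        if key in self.ram:                 # RAM hit
            self.ram.move_to_end(key)
            return self.ram[key]
        path = self._path(key)
        if os.path.exists(path):            # SSD hit: promote back to RAM
            with open(path, "rb") as f:
                value = pickle.load(f)
            os.remove(path)
            self.put(key, value)
            return value
        return None                         # miss
```

The point of the second tier is that an evicted prefix cache costs a disk read to recover instead of a full prefill recomputation.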

Build Your Own LLM Server in a Week

September 12, 2025

Dive into the world of Large Language Models with Tiny LLM, a practical, open-source course designed for systems engineers. Learn to build and optimize LLM serving infrastructure from scratch using MLX on Apple Silicon. This week-long journey covers everything from fundamental matrix operations to advanced C++/Metal kernels and request batching for high throughput. Whether you're curious about LLM internals or aiming to deploy your own server, Tiny LLM offers clear guidance and community support to demystify LLM serving.
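The request batching mentioned above can be sketched in a few lines: drain a queue of pending prompts in fixed-size batches and make one model call per batch, amortizing per-call overhead across requests. The function and parameter names below are hypothetical, and `fake_model_forward` stands in for a real MLX forward pass; the course's actual code differs.

```python
from collections import deque

def fake_model_forward(batch):
    """Stand-in for a real batched model forward pass (illustrative only)."""
    return [f"reply:{prompt}" for prompt in batch]

def serve_batched(requests, max_batch_size=4):
    """Process queued prompts in batches of up to max_batch_size.

    One model call per batch instead of one per request is the basic
    throughput win that batching provides (hypothetical sketch).
    """
    queue = deque(requests)
    responses = []
    while queue:
        size = min(max_batch_size, len(queue))
        batch = [queue.popleft() for _ in range(size)]
        responses.extend(fake_model_forward(batch))
    return responses
```

Continuous batching, as covered later in such serving stacks, refines this by admitting new requests into an in-flight batch between decode steps rather than waiting for a whole batch to finish.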