MediaPipe Articles

Android Local LLM Inference: LiteRT, MediaPipe, Quantization, and Production Trade-offs

A practical guide to Android local LLM inference across LiteRT, ONNX Runtime Mobile, MediaPipe LLM Inference, INT4 quantization, GPU delegates, KV cache memory, and device fallback.

Android On-device RAG: From Local Vector Databases to LLM Inference

A practical walkthrough of on-device Android RAG, covering document chunking, local vector search with SQLite, MediaPipe LLM Inference, and performance trade-offs.