Journal
Engineering notes, product stories, and ideas from the team.
Designing Flux Engine: one runtime for every product
A peek at the architecture choices behind our shared on-device inference runtime — and the constraints that shaped them.
WhisperFlux preview: speech, speakers, summaries — all local
Our flagship enters internal beta. A walkthrough of the streaming pipeline and what we learned shipping ASR on real phones.
Squeezing a 7B model onto your phone: a quantization field guide
Q4_K_M, AWQ, GPTQ, SmoothQuant — what actually matters when you only have 4 GB of RAM and a 4 W power budget.
Private by architecture, not by promise
A privacy policy is a promise. An app that has no servers is a fact. Here is the difference, in code.
Speaker diarization on a phone: a deep dive
How we run end-to-end speaker diarization in real time on a 4 W power budget — without uploading any audio.
VisionFlux: a roadmap for local visual understanding
Our second product takes shape. What local vision-language models can do today — and where we are betting they go next.
TranslateFlux: building a private, offline universal translator
Notes on translation latency, quality, and the engineering tricks that let a small model feel competitive with a much larger cloud one.
NoteFlux: design philosophy for an AI notebook that respects you
NoteFlux is a notebook with a quiet local model living inside. Here is what we want it to feel like — and what we are explicitly not building.
CodeFlux: an offline pair programmer that respects your repo
A 7B-class code model running locally with project-aware retrieval. Why a smaller model with the right context beats a larger one without.
No account, by design
You will never sign up for an OmniFlux app. We argue why "no account" is a feature, not a missing one — and how it changes everything downstream.
Building for airplane mode
A simple design rule shapes everything we ship: every feature has to work without a network. Here is what that does to a product.