Translate Your App Locally with LLMs and MLX

Cloud translation APIs send your app strings to external servers. For teams building health, finance, or government apps, that is a compliance problem. Apple's MLX framework runs large language models directly on your Mac's Apple Silicon chip. LocaleKit uses MLX to translate your entire project locally - no API key, no internet, no data leaving your machine.

This post walks through how on-device LLM translation works, when it makes sense over cloud APIs, and how to set it up with LocaleKit.

Docs Reference

Full CLI documentation is available at docs.localekit.app

What is MLX and why use it for translation?

MLX is Apple's machine learning framework for Apple Silicon. It runs transformer models (the same architecture behind GPT, Qwen, and Gemma) directly on your M-series chip, using unified memory shared between CPU and GPU.

For app translation, this means you can run a model like Qwen3, with its 119-language support, right on your MacBook - entirely offline, with nothing ever sent to a server.

When on-device MLX makes sense

  • Your project handles sensitive data (medical records, financial info, classified content)
  • Company policy prohibits sending source code to third-party services
  • You want zero ongoing cost - no per-word or per-token billing
  • You work offline or on restricted networks
  • You need to translate large batches without rate limits

When cloud APIs are still the better choice

  • You need top accuracy for European languages (DeepL is hard to beat for EN-DE, EN-FR)
  • Speed matters more than privacy - cloud APIs process faster than on-device inference
  • You are translating a small number of strings and already have an API key set up

How LocaleKit uses MLX

LocaleKit wraps the MLX inference pipeline into a single CLI command. You pick a model, point it at your project, and it translates every missing key using on-device inference. The model downloads once and caches locally.

Terminal
# Translate with the default MLX model (Qwen3 4B)
$ localekit translate --engine mlx

Downloading mlx-community/Qwen3-4B-4bit... done (2.5 GB)
Loading model into memory... done (3.1s)

Translating Localizable.xcstrings:
en-US -> de-DE: 42 keys... done (28s)
en-US -> fr-FR: 42 keys... done (26s)
en-US -> ja-JP: 42 keys... done (31s)

126 translations written.
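Under the hood, each string gets wrapped in an instruction prompt before inference. Here is a minimal sketch in Python - the prompt template and the build_translation_prompt helper are illustrative assumptions, not LocaleKit's actual internals; the commented mlx_lm calls show what loading and generation look like with the open-source mlx-lm package.

```python
# Sketch of how an app string might be wrapped in a translation prompt
# before being handed to the model. The exact template LocaleKit uses
# is internal; this prompt shape is an assumption for illustration.

def build_translation_prompt(text: str, source: str, target: str) -> str:
    """Build a single-string translation prompt for an instruction-tuned LLM."""
    return (
        f"Translate the following {source} UI string to {target}. "
        "Preserve format placeholders like %@ and %d exactly. "
        "Reply with only the translation.\n\n"
        f"{text}"
    )

prompt = build_translation_prompt("Welcome back, %@!", "en-US", "de-DE")
print(prompt)

# With mlx-lm installed on an Apple Silicon Mac, inference would look like:
#   from mlx_lm import load, generate
#   model, tokenizer = load("mlx-community/Qwen3-4B-4bit")
#   translation = generate(model, tokenizer, prompt=prompt, max_tokens=128)
```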

Choosing the right model for your Mac

Bigger models produce better translations but need more RAM and run slower. Here is how to pick based on your hardware:

MLX models by Mac configuration

8 GB Mac

localekit translate \
--engine mlx

# Default: Qwen3-4B-4bit
# Download: 2.5 GB
# RAM usage: ~3 GB

Qwen3 4B is the default. Good quality across most languages. Fits on an 8 GB machine with enough headroom to keep Xcode open at the same time.

16 GB Mac

localekit translate \
--engine mlx \
--mlx-model \
mlx-community/Qwen3-30B-A3B-4bit

Qwen3 30B is a mixture-of-experts model. Only 3B parameters activate per token, so it runs fast despite its size. Best quality-to-speed ratio on 16 GB.

32 GB+ Mac

localekit translate \
--engine mlx \
--mlx-model \
mlx-community/Qwen3-32B-4bit

Qwen3 32B is the full dense model. Highest translation quality for production releases, especially for less common language pairs.
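The download sizes above follow from simple arithmetic: a 4-bit quantized model stores roughly half a byte per parameter. A quick sketch (the extra space in the real files comes from embeddings, quantization scales, and tokenizer data):

```python
# Back-of-envelope weight sizes for the download figures above:
# a 4-bit quantized model stores roughly 0.5 bytes per parameter.

def approx_weight_gb(params_billions: float, bits: int = 4) -> float:
    """Approximate on-disk weight size in GB for a quantized model."""
    bytes_total = params_billions * 1e9 * bits / 8
    return bytes_total / 1e9

print(f"Qwen3 4B  ~{approx_weight_gb(4):.1f} GB")   # raw weights; actual download ~2.5 GB
print(f"Qwen3 30B ~{approx_weight_gb(30):.1f} GB")  # raw weights; actual download ~16 GB
```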

💡 First run vs repeat runs

The model downloads once from Hugging Face and caches at ~/Library/Caches/huggingface/. After that, localekit translate --engine mlx works fully offline.

Complete workflow: translate an Xcode project with MLX

Full MLX translation workflow

1. Initialize your project

Run localekit init in your Xcode project root. This creates .localekitrc.yml with auto-detected settings.

2. Check current translation status

Run localekit status --detailed to see coverage per language and how many keys are missing.

3. Translate with MLX

Run localekit translate --engine mlx --languages de-DE,fr-FR,ja-JP. The model loads, translates all missing keys, and writes directly to your .xcstrings file.

4. Validate the results

Run localekit validate to check for placeholder mismatches (e.g., a %@ in English missing in the German translation).

5. Review and commit

Open your .xcstrings file in Xcode to review translations in context. Commit both the translations and .localekit-snapshot.json.

Terminal
$ localekit status --detailed

Localizable.xcstrings (42 entries)
Base language: en-US

de-DE: 34/42 translated (81%)
fr-FR: 38/42 translated (90%)
ja-JP: 20/42 translated (48%)

$ localekit translate --engine mlx --languages de-DE,fr-FR,ja-JP

de-DE: translating 8 missing keys... done
fr-FR: translating 4 missing keys... done
ja-JP: translating 22 missing keys... done

34 translations written.

$ localekit validate

0 errors, 0 warnings.

Translation quality: MLX vs cloud engines

We tested Qwen3 30B against DeepL and GPT-4o on 200 iOS UI strings across 5 languages.

For European languages (German, French, Spanish), DeepL still produces the most natural output. The gap is small - Qwen3 30B scores within 5-10% on human evaluation.

For Asian languages (Japanese, Korean, Chinese), Qwen3 performs on par with GPT-4o and sometimes better for short UI strings where context is limited.

For all engines, the biggest error source is not the translation itself but placeholder handling. That is why localekit validate exists - it catches format string mismatches regardless of engine.
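A placeholder check of this kind is straightforward to sketch. The regex and comparison rules below are illustrative, not the actual implementation of localekit validate:

```python
import re

# Sketch of the kind of check a validator could perform: compare format
# placeholders between source and translation. The regex covers common
# iOS-style specifiers (%@, %d, %1$@, %.2f); the real tool's rules are
# an assumption here.

PLACEHOLDER = re.compile(r"%(?:\d+\$)?[@dioxXeEfgGsc%]|%\.\d+f")

def placeholder_mismatch(source: str, translation: str) -> bool:
    """Return True if the two strings carry different placeholder multisets."""
    return sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(translation))

print(placeholder_mismatch("Hello, %@! You have %d items.",
                           "Hallo, %@! Du hast %d Artikel."))   # False: placeholders match
print(placeholder_mismatch("Hello, %@! You have %d items.",
                           "Hallo! Du hast %d Artikel."))       # True: %@ missing in German
```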

The best translation engine is the one your security team approves. For regulated industries, that means on-device only.

Mixing engines for best results

You can run multiple engines on the same project. LocaleKit only translates missing keys, so engines do not overwrite each other.

Terminal
# Use MLX for Asian languages (Qwen3 excels here)
$ localekit translate --engine mlx --languages ja-JP,ko-KR,zh-CN

# Use DeepL for European languages (best accuracy)
$ localekit translate --engine deepl --languages de-DE,fr-FR,es-ES

# Both runs fill in different gaps. Nothing gets overwritten.
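The fill-only-missing behavior boils down to a merge in which existing values always win. A minimal sketch - the data shapes are assumptions, not LocaleKit's internal format:

```python
# Sketch of the merge rule described above: a translation run only fills
# keys that have no value yet for a language, so runs with different
# engines never overwrite each other.

def fill_missing(existing: dict[str, str], new_translations: dict[str, str]) -> dict[str, str]:
    """Merge new translations in, keeping every key that already has a value."""
    merged = dict(new_translations)
    merged.update(existing)  # existing values win
    return merged

de = {"welcome": "Willkommen"}                        # already translated by DeepL
mlx_run = {"welcome": "Hallo", "logout": "Abmelden"}  # MLX output for the same file

print(fill_missing(de, mlx_run))  # {'welcome': 'Willkommen', 'logout': 'Abmelden'}
```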

MLX in CI/CD

MLX requires Apple Silicon, so your CI runner needs to be a Mac. GitHub Actions offers macos-latest runners with M-series chips.

.github/workflows/mlx-translate.yml
name: Translate with MLX
on:
  push:
    paths:
      - "**/Localizable.xcstrings"

jobs:
  translate:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install LocaleKit
        run: |
          brew tap hexagone-studio/localekit https://github.com/hexagone-studio/LocaleKit.git
          brew install localekit-cli

      - name: Login
        run: localekit login --email ${{ secrets.LOCALEKIT_EMAIL }} --password ${{ secrets.LOCALEKIT_PASSWORD }}

      - name: Translate and validate
        run: |
          localekit translate --engine mlx
          localekit validate --strict

      - name: Commit
        run: |
          git config user.name "LocaleKit Bot"
          git config user.email "bot@localekit.app"
          git add -A
          git diff --staged --quiet || git commit -m "Update translations (MLX)"
          git push

⚠️ CI cost

macOS GitHub Actions runners cost 10x more per minute than Linux. For frequent jobs, translate locally with localekit sync instead - it runs MLX on your Mac and pushes a PR.

MLX Translation FAQ

How much disk space do MLX models need?

Qwen3 4B uses 2.5 GB. Qwen3 30B uses 16 GB. Models cache at ~/Library/Caches/huggingface/ and persist across runs. Delete them anytime to free space.

Can I use a custom fine-tuned model?

Yes. Any MLX-compatible model on Hugging Face works. Pass the model ID with --mlx-model. Fine-tuning on your domain terminology gives the best results.

Does MLX support more languages than DeepL?

Yes. Qwen3 supports 119 languages vs DeepL's 33. Quality varies by language pair - European and CJK languages work well. Less common pairs may need a larger model.

Is the translation deterministic?

Not exactly. LLMs are probabilistic, so running the same translation twice may produce slightly different wording. The meaning stays consistent. DeepL is deterministic if you need exact reproducibility.
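The variability comes from token sampling. A toy sketch of greedy versus temperature-based decoding (illustrative only, not the decoding code MLX or LocaleKit actually uses):

```python
import random

# Why LLM output can vary: each token is drawn from a probability
# distribution. At temperature 0 (greedy decoding) the most likely token
# is always picked, which is deterministic; with sampling it is not.
# The probabilities below are a toy distribution for illustration only.

def pick_token(probs: dict[str, float], temperature: float, rng: random.Random) -> str:
    if temperature == 0:
        return max(probs, key=probs.get)  # greedy: always the argmax
    # crude temperature-sampling sketch: sharpen/flatten, then draw
    weights = {t: p ** (1 / temperature) for t, p in probs.items()}
    total = sum(weights.values())
    return rng.choices(list(weights), [w / total for w in weights.values()])[0]

probs = {"Hallo": 0.6, "Guten Tag": 0.3, "Servus": 0.1}
print(pick_token(probs, 0.0, random.Random(0)))  # always 'Hallo'
```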

Can I mix MLX with cloud engines on the same project?

Yes. Run localekit translate --engine mlx --languages ja-JP first, then --engine deepl --languages de-DE. LocaleKit only fills in missing keys, so nothing gets overwritten.

Stop managing translation files manually

LocaleKit detects, translates, and syncs all your localization files — iOS, Android, Flutter, and more. Everything runs locally on your machine.

Privacy-first. No cloud required.