Burmese-first language models, datasets, and tools — built by the community, for the community. Constructing the digital heritage infrastructure that our community should own.
[ Latest Build ]
v0.9.2-alpha
[ Parameters ]
7B / 1.5B (Quantized)
They built the revolution in English. We are rebuilding it for us.
For 50 million voices, the AI revolution arrived with a "Tokenization Tax": Big Tech tokenizers spend roughly 13x more tokens on Burmese text than on English, so every request costs more and every context window holds less. Meanwhile, the few working alternatives remain locked behind closed doors.
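The tax is mechanical, not metaphorical. A minimal sketch of one contributing factor, using illustrative strings rather than a real benchmark: tokenizers with poor Burmese coverage fall back to raw UTF-8 bytes, and every Myanmar-script codepoint occupies 3 bytes in UTF-8 versus 1 byte per English letter, so byte fallback alone roughly triples the token count before vocabulary effects are even counted.

```python
# Sketch: UTF-8 byte counts as a floor for byte-fallback token counts.
# Myanmar-script codepoints (U+1000-U+109F) are 3 bytes each in UTF-8;
# ASCII letters are 1 byte each. The example strings are illustrative.

def utf8_len(text: str) -> int:
    """Number of UTF-8 bytes in `text`."""
    return len(text.encode("utf-8"))

english = "hello"
burmese = "မင်္ဂလာပါ"  # "mingalaba", a common Burmese greeting

print(utf8_len(english), len(english))  # 5 bytes for 5 characters
print(utf8_len(burmese), len(burmese))  # 27 bytes for 9 codepoints
```

A real tokenizer's gap is larger still, because common English words map to single vocabulary tokens while Burmese text rarely does; the interactive visualization below measures the full effect.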
We didn't wait for permission. We started in 2016, building Burmese NLU in Yangon long before the ChatGPT hype. Our engine processed 100 million conversations for giants like Samsung and Unilever—proving it could be done when others ignored us.
Now, we are taking control. We are open-sourcing everything. Led by an EB-1A "Extraordinary Ability" founder—recognized by the U.S. for rising to the top of this field—we are building the public utility we wish we had a decade ago.
We aren't just building models. We are securing our digital heritage.
Pre-LLM Era
Built from scratch in Yangon. Before transformers existed.
LLM Revolution
The breakthrough was in English. Burmese was an afterthought.
Sovereignty Era
Open-sourcing everything. We own our future.
Language access is a human right. Language sovereignty is how we protect it.
When your conversations flow through foreign APIs, you're renting access to your own language. We're building infrastructure that belongs to the community — open, private, permanent.
Every model we release. Every dataset we share. Yours to deploy on your terms.
No API lock-in
Apache 2.0 forever
Your data stays yours
No extraction
Runs offline
On your hardware
Fully customizable
Fine-tune for your needs
| Big Tech | Own It |
|---|---|
| Terms change tomorrow | Apache 2.0 forever |
| Your prompts train them | Data stays yours |
| Requires internet | Runs offline |
| One-size-fits-all | Fine-tune it |
| Pricing increases | Free forever |
What You Can Deploy Today
Llama-2-7B fine-tuned on 52K Burmese instructions. GGUF quantized for local deployment.
Chatbot-based tool for crowdsourcing translation data. How we build datasets with the community.
Myanmar's MNIST. 60K training samples, 27.5K test samples. Foundation for Burmese OCR.
See the 1300% tokenization tax in action. Interactive visualization of how LLMs handle Burmese.
This isn't just about one language. It's a blueprint for any community that wants to own its AI future.
In Development
Next-gen LLM, trained from the ground up for Burmese.
Burmese-English parallel corpus, production-quality.
Billion-token intent classification dataset.
Regional Expansion
Global Peers
We're not alone. Around the world, communities are building language AI they own.
AI for African languages, by African researchers. 2000+ languages.
Protecting Māori data sovereignty in New Zealand.
Different languages. Same principle.
This infrastructure exists because people contributed. Here's how you can be part of it.
For Developers
Run Burmese AI on your own infrastructure. Apache 2.0. Fork it. Fine-tune it. Ship it.
For Burmese Speakers
Help build the datasets that train these models. Use Echopod to contribute translations.
▶ Try Echopod
For Organizations
Fund development. Collaborate on research. Bring this infrastructure to your community.
▶ Contact Us