Breaking Down AI's Language Barriers: Why Local Languages Matter for African Development
By Prof. Vukosi Marivate, Chair of Data Science, Professor of Computer Science, University of Pretoria
When artificial intelligence systems fail to understand local languages, they don't just create technical inconveniences—they perpetuate digital exclusion that threatens cultural identity and economic opportunity across Africa. The challenge facing leaders from Cape Town to Cairo is stark: how do we ensure AI serves all our people, not just those fluent in English?
The Current Reality
Today's AI systems perform poorly in African languages. When asked to count from one to ten in isiZulu, ChatGPT responds with "one good, two good, three four"—a response that would be laughable if it weren't so revealing of the broader problem. Translation attempts fare even worse, with systems converting simple phrases about airplane travel costs into accusations of cowardice.
This isn't merely a technical glitch. Research shows that major AI models score between 29% and 60% on African language tasks, while African-focused models achieve significantly better results. The root cause lies in data inequality: English Wikipedia contains 7 million articles, while isiZulu Wikipedia has just 11,000, a gap of more than 600 to one, and lesser-resourced languages like Sesotho have fewer than 1,000.
Beyond Technical Challenges
The implications extend far beyond software performance. Language carries culture, knowledge, and human dignity. When AI systems cannot process local languages, they effectively exclude millions from participating in the digital economy. Small business owners cannot use voice assistants in their mother tongue. Students cannot access educational AI tools in languages they understand best. Healthcare workers cannot leverage AI diagnosis systems that comprehend local symptom descriptions.
This digital divide creates a feedback loop of marginalisation. As AI becomes increasingly integrated into economic and social systems, communities whose languages are unsupported risk being further left behind. The problem resembles what Professor Vukosi Marivate calls a "high interest credit card"—the longer we delay addressing language representation, the more expensive and difficult the solution becomes.
Practical Investment Strategies
African researchers and institutions are pioneering solutions that provide clear roadmaps for other regions. The Masakhane research network, with over 3,000 contributors across the continent, demonstrates how grassroots collaboration can drive meaningful progress while winning international awards and securing millions in funding.
Leaders can support local language AI through several concrete measures:
Data Infrastructure: Convert existing multilingual dictionaries and glossaries from PDFs into machine-readable formats. Follow South Africa's Next Voices model of systematically recording and transcribing thousands of hours of local speech. Work with data organisations to ensure local content isn't automatically filtered out of global training datasets.
Technical Development: Focus on smaller, resource-efficient models rather than competing with billion-dollar systems. Build mathematical representations that help AI understand relationships between words across multiple languages. Create accessible APIs that let local developers integrate language tools into their applications.
Policy Reform: Implement equitable licensing frameworks like the NOODL license, which allows free use by developing countries while ensuring benefits flow back to communities. Reform academic funding to recognise collaborative research and conference publications; funding rules that reward only journal articles discourage the international cooperation this work requires.
Institutional Support: Fund pan-African research collaborations and make cross-border academic partnerships easier. Train more local researchers through grants and capacity-building programmes. Create institutions like Professor Marivate's new African Institute for Data Science and AI to anchor long-term development.
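To make the Data Infrastructure step more concrete, here is a minimal sketch of turning glossary text into a machine-readable format. It assumes the text has already been extracted from the PDF (that step needs a PDF or OCR tool and is not shown) and that each entry sits on its own line as "English term : isiZulu term". That layout, the three sample entries, and the names parse_glossary and to_jsonl are illustrative assumptions, not part of any project named in this article.

```python
import json
import re
from dataclasses import asdict, dataclass

# Assumed layout: "source term : target term", one entry per line.
# Real glossaries vary widely, so this pattern would need adapting.
ENTRY = re.compile(r"^\s*(?P<source>[^:]+?)\s*:\s*(?P<target>.+?)\s*$")

# Illustrative sample only, standing in for text copied out of a PDF glossary.
SAMPLE = """
Glossary, page 3
water : amanzi
school : isikole
thank you : ngiyabonga
"""


@dataclass
class GlossaryEntry:
    source_lang: str
    target_lang: str
    source: str
    target: str


def parse_glossary(text, source_lang, target_lang):
    """Return one GlossaryEntry per line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match is None:
            continue  # blank lines, page headers, anything without a separator
        entries.append(
            GlossaryEntry(source_lang, target_lang, match["source"], match["target"])
        )
    return entries


def to_jsonl(entries):
    """Serialise entries as JSON Lines, keeping non-ASCII characters readable."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in entries)


if __name__ == "__main__":
    print(to_jsonl(parse_glossary(SAMPLE, "en", "zu")))
```

Each output line is a standalone JSON object carrying the language codes and the term pair, a shape that dataset tools can load directly. Lines that do not fit the assumed layout are silently skipped here; a real pipeline would more likely log them for human review, since glossary PDFs rarely follow one layout throughout.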
The investment required is substantial but achievable. Unlike billion-dollar models from tech giants, effective local language systems can be built with strategic resource allocation and strong partnerships among universities, government agencies, and the private sector.
Conclusion
As AI reshapes global economic and social structures, African leaders have a choice: accept systems designed elsewhere that exclude local voices, or invest in developing AI that truly serves all citizens. The technical challenges are solvable, but success requires understanding that this is fundamentally about human dignity and cultural preservation.
The window for action is narrowing. Every day that passes without investment in local language AI makes the eventual solution more expensive and the digital divide wider. The question isn't whether African countries can afford to invest in multilingual AI—it's whether they can afford not to.
Resources:
Data Science for Social Impact (DSFSI), University of Pretoria: https://www.dsfsi.co.za
African Institute for Data Science & AI (AfriDSAI): https://up.ac.za/afridsai
Masakhane Research Foundation: https://www.masakhane.io/
Deep Learning Indaba: https://deeplearningindaba.com/
Next Voices (African speech dataset project): https://huggingface.co/datasets/dsfsi...
PuoBERTa Language Models: https://huggingface.co/spaces/dsfsi/P...
NOODL Open Data License: https://licensingafricandatasets.com
Esethu License Framework (Lelapa AI): https://lelapa.ai/a-global-first-how-...
About Prof. Vukosi Marivate
Prof. Vukosi Marivate is Chair of Data Science and Professor of Computer Science at the University of Pretoria, where he leads the Data Science for Social Impact group. His research focuses on Machine Learning (ML), Artificial Intelligence (AI), and Natural Language Processing (NLP), particularly for African and other low-resource languages. He co-founded Lelapa AI, the Masakhane Research Foundation, and the Deep Learning Indaba. His work spans social challenges in science, energy, public safety, and utilities, aiming to create AI for Africans by Africans.