Na AI Go Save Our Language?
Nigeria has over 500 languages. The machines could rescue the dying ones or harvest them. Who decides is a Nigerian question.
A language does not die when its last speaker dies. It dies earlier, in the long quiet years when the children stop answering in it. The grandmother speaks; the child understands but replies in English, or in Pidgin, or in the bigger language of the bigger market. The grammar survives one more generation as comprehension and then disappears as speech. By the time a visiting linguist identifies and records the last fluent speaker, the language has usually been dead, as a living thing, for thirty years. What the linguist captures is a specimen.
Nigeria has somewhere above 500 languages, depending on how one counts the boundary between a language and a dialect, which is itself a political question more than a linguistic one. Ethnologue lists 520 living indigenous Nigerian languages and twelve already extinct. Most are healthy. Hausa, Yoruba, and Igbo are spoken by tens of millions; Pidgin by tens of millions more. But a long tail of smaller languages, many in the Middle Belt and the old Eastern minority areas, are now in the quiet years. Some have a few thousand speakers. Some have a few hundred. A handful have fewer, and the children have already stopped answering.
There is a question worth asking before the eulogies. Does it matter? And if it matters, can the technology that is reorganising everything else also be turned, before it is too late, toward this?
It matters, and the reason is older than Nigeria. Writing preserves memory; print confers prestige; prestige shapes survival. When Europe's vernaculars began to be printed in the 15th century, the book did more than store them; it told their speakers that the language they thought in was a language one could think in. Latin had been the tongue of serious thought; print made serious thought possible in French, German, and English. A language that exists in books can hold a philosophy, a law code, a national literature. A language that does not exist in books inherits a particular poverty: cut off from the myths and organising stories a culture uses to tell its members who they are and what is owed.
A colonial lie lingers in the neighbourhood and must not be picked up by accident: that Africa had no history because it had no writing. Oral tradition is a legitimate historical method, capable of transmitting genealogy, law, and chronology across centuries with disciplined accuracy. Nor was West Africa uniformly oral. The libraries of Timbuktu held a vast archive of manuscripts on astronomy, jurisprudence, medicine, and poetry while Europe was still emerging from its own scarcity of manuscripts, and the Sokoto Caliphate produced a written literature of real depth. The problem has always been selective institutionalisation: which languages received schools, orthographies, and printing, and which did not.
Many Nigerian languages never made the crossing. Conquest broke transmission chains. Urbanisation scattered the communities in which a smaller language was the medium of daily life. The economy attached opportunity to English and to the three large languages, so that a parent's love for a child became, rationally and tragically, a reason to raise the child away from the smaller tongue.
The Nigerian dimension has just been made more urgent by the state itself. In November 2025 the Federal Government cancelled the National Language Policy and declared English the sole medium of instruction from early childhood through tertiary education. The Minister of Education, Dr Tunji Alausa, cited the lack of teachers and instructional materials in indigenous languages. The 2022 policy his administration reversed had required mother-tongue instruction for the first six years of basic education. The implementation problems are real. The deeper choice the reversal expresses is that, faced with the difficulty of teaching Nigerian children in Nigerian languages, the country chose English. This is the context in which the question about the machines must be asked.
The promise is real. Modern language technology can now do quickly and cheaply what used to require a linguist's career. It can transcribe recordings, build a rough dictionary and a draft grammar from a corpus that once took decades to assemble, and produce, from a modest amount of recorded speech, a model that lets a younger speaker hear a sentence in a language their parents stopped teaching them. For a tongue with a few hundred speakers and no written tradition, the cost of starting has dropped. The work is smaller, faster, and cheaper than it has ever been.
There is already a model for how to do this well, and it is African. Masakhane, which means "we build together" in isiZulu, is a research network founded in 2019. By 2023 it counted over 400 researchers across more than 30 African countries, with published findings on more than 38 African languages. Its defining commitment is the opposite of extraction: the work is participatory, native speakers are co-authors, and the corpora and models are released openly. In November 2025 Masakhane joined Microsoft, the Gates Foundation, and Google.org in launching LINGUA Africa, anchored in the principle that a language belongs to the people who speak it, and that the technology built on it must answer to them.
The other principle is profit, and it is naive to pretend it is not coming for these languages too. Large models improve with more data, and the world's text has largely been consumed already. Low-resource languages are one of the few untapped corpora left. The commercial incentive is real: harvest whatever Nigerian-language text and speech exists, fold it into systems owned elsewhere, and sell the capability back, sometimes to the very communities the data came from, at a price they do not set. A Nigerian language becomes raw material, extracted, refined far away, returned as a product. The language survives in the model and dies in the mouth, which is the most modern way a language has ever been able to die.
The difference between rescue and extraction is governance. Who holds the corpus? Who decides what is built? Who is paid, credited, and consulted? Can the community say no?
If the machines are to do the rescuing rather than the harvesting, the work cannot be left to the market, nor to volunteers alone. It requires the state, which is the part of this argument Nigerians are most tired of hearing and least able to avoid. A serious counter-policy would treat the documentation of endangered Nigerian languages as national infrastructure, because a country's languages are the substrate on which everything else it values is built. It would fund the unglamorous work: the recording, the orthography development, the corpus building, the training of young Nigerian computational linguists, so that the people who model Defaka and Eleme and the smaller Edoid tongues are Nigerians who answer to those communities. It would reward the work with status, so that a brilliant young Nigerian who could go and optimise advertising in Silicon Valley has a reason to stay and rescue a language instead.
It would also, finally, do the thing deferred for fifty years. It would take Nigerian Pidgin seriously.
Pidgin is the largest language in Nigeria that the Nigerian state refuses to fully acknowledge. Tens of millions speak it, across every ethnic line, as the one genuinely national tongue the country produced by itself rather than inherited. It has been creolising for generations; for many Nigerians it is now a first language. It still has no settled orthography and no place in the school system commensurate with its reach. Professor Ben Elugbe, who died in September 2025, spent much of a distinguished career arguing that this was an error the country could afford to correct. With Augusta Omamor he produced the foundational modern study, Nigerian Pidgin: Background and Prospects (1991), identifying it as a language in its own right. Against those who wanted Pidgin spelled as a kind of broken English, he proposed a writing system closer to that of Nigeria's indigenous languages, because Pidgin is a Nigerian language and should look like one on the page. He did not live to see the argument settled. The standardisation he called for would do for Pidgin what print did for the European vernaculars: confer the standing its speakers have earned and the state has withheld.
Both the small dying languages and the large unacknowledged one are forms of the same withholding. The state has decided, by neglect and now by formal reversal, which Nigerian ways of speaking are serious and which are not. The technology now arriving makes the cost of that decision newly visible, because for the first time the rescue is technically cheap, and the only remaining obstacle is whether anyone with power decides it is worth doing.
So, can AI save Nigerian languages from extinction? AI cannot save anything by itself. It is a tool, and tools do what their holders intend. The same system that could give a Defaka child a model of her grandmother's voice could turn that voice into a line item in a corpus owned in another hemisphere. The technology has made the rescue possible. Whether it happens is a political question, and the politics are Nigerian.
What is true, and what would have been impossible to say a decade ago, is that the door is open. For a few thousand recorded sentences and a community willing to be co-authors rather than subjects, a language that was going to become a specimen can instead become something else: archived, teachable, searchable, alive enough to be handed to a child who will answer in it. Second chances for whole languages do not come often in history. The first one, for most of these tongues, was the printing press, and they missed it through no fault of their own.
It would be the bitterest kind of repetition to miss the second one the same way, while the means to take it sat in every phone in the country, waiting for someone with authority to decide that a grandmother's sentences were worth as much as an advertisement.
They are worth more. Someone should say so with a budget.