Meta has announced NLLB-200, an open-source AI model developed under its ‘No Language Left Behind’ project. It is the first single model to translate across 200 languages, including 55 African languages, with state-of-the-art results. Meta is applying the modeling techniques and lessons learned from the project to improve and expand translation on Facebook, Instagram, and Wikipedia.
The model was built with a focus on African languages, with the goal of bringing high-quality machine translation to most of the world’s low-resource languages on a global scale. These languages pose particular difficulties for machine translation.
AI models need enormous amounts of data to learn, but little human-translated training data exists for these languages. For instance, although more than 20 million people speak and write Luganda, examples of written Luganda are hard to find on the internet.
“We worked with professional translators for each of these languages to develop a reliable benchmark which can automatically assess translation quality for many low-resource languages,” said Meta CEO Mark Zuckerberg in a post on his Facebook profile.
“We also work with professional translators to do human evaluation too, meaning people who speak the languages natively evaluate what the AI produced. The reality is that a handful of languages dominate the web, so only a fraction of the world can access content and contribute to the web in their own language. We want to change this by creating more inclusive machine translation systems – ones that unlock access to the web for the more than 4B people around the world that are currently excluded because they do not speak one of the few languages content is available in.”
Language is our culture, identity, and lifeline to the world. However, as high-quality translation tools don’t exist for hundreds of languages, billions of people today can’t access digital content or participate fully in conversations and communities online in their preferred or native languages. This is especially true for hundreds of millions of people who speak the many languages of Africa.
“Africa is a continent with very high linguistic diversity, and language barriers exist day to day. We are pleased to announce that 55 African languages will be included in this machine translation research, making it a major breakthrough for our continent,” said Balkissa Ide Siddo, Public Policy Director for Africa, speaking about the launch of the AI model.
“In the future, imagine visiting your favorite Facebook group, coming across a post in Igbo or Luganda, and being able to understand it in your own language with just a click of a button – that’s where we hope research like this leads us. Highly accurate translations in more languages could also help to spot harmful content and misinformation, protect election integrity, and curb instances of online sexual exploitation and human trafficking.”
Commenting on accessibility and inclusion in building an equitable metaverse, Ide Siddo added, “At Meta, we are working today to ensure that as many people as possible will be able to access the new educational, social, and economic opportunities that the next evolution of the internet will bring to future technology and an everyday living experience tomorrow.”