Masakhane, an African AI story

Elyes Manai
Dec 6, 2019

“Let’s build together” is the English translation of “Masakhane”, a word from Zulu, a South African language. Didn’t know that? Neither did we, and that’s the problem.

When we talk about languages, we always mention English, French, Spanish, German, Japanese… The “popular” ones. The ones that everyone HAS to learn to communicate with the rest of the world, at the cost of one’s own language.

African languages & dialects are a clear example of the languages that get left out: only Africans & a few outsiders who like the culture speak them and try to understand them.

This translates to a lot of content in the major languages and too little in the others. Which in turn translates to an imbalance in available data between languages. Which finally translates to poor online cross-lingual services like translation. If Google Translate, with all the English & Spanish content online, still mistranslates some English-to-Spanish sentences, then imagine the rest.

That’s why a handful of AI practitioners from South Africa banded together and launched the “Masakhane” initiative, a continent-level collaborative project that focuses on Machine Translation for African Languages.

The project consists of everyone translating the same big English text into their native language, then training an NLP (Natural Language Processing) AI to learn how the translation works. The amazing thing is that since we’re all translating from the same source text, we can perform cross-lingual translations (Eng → Tun & Eng → Zulu can become Tun → Zulu).
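To make the cross-lingual idea concrete, here is a minimal sketch of how two translations of the same English text could be lined up into a direct Tun → Zulu dataset. It is only an illustration, not the project's actual pipeline: the function name `pivot_pairs`, the dictionary layout and the placeholder sentences are all assumptions made for this example.

```python
# Hypothetical sketch: pair up two translations through their shared
# English source sentence. Names and data are placeholders, not the
# project's real code or corpus.

def pivot_pairs(eng_to_a, eng_to_b):
    """Return (a, b) sentence pairs that share the same English source."""
    pairs = []
    for eng_sentence, a_sentence in eng_to_a.items():
        b_sentence = eng_to_b.get(eng_sentence)
        # Keep only sentences that both teams have translated so far.
        if b_sentence is not None:
            pairs.append((a_sentence, b_sentence))
    return pairs


# Placeholder translations, keyed by the common English sentence.
eng_to_tun = {
    "Hello.": "<Tunisian for 'Hello.'>",
    "Thank you.": "<Tunisian for 'Thank you.'>",
    "Good night.": "<Tunisian for 'Good night.'>",
}
eng_to_zulu = {
    "Hello.": "<Zulu for 'Hello.'>",
    "Thank you.": "<Zulu for 'Thank you.'>",
}

tun_to_zulu = pivot_pairs(eng_to_tun, eng_to_zulu)
for tun, zulu in tun_to_zulu:
    print(tun, "->", zulu)
```

Only the sentences that both teams have already translated end up in the Tun → Zulu set, so the more of the shared text each team covers, the more language pairs can be built without any extra translation work.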

By the end of the project, we should have a functional AI-based African Language Translator.

Volunteers from all across Africa banded together to work towards a common goal: putting Africa on the AI map!

It quickly became a community, with volunteers now partnering up with volunteer teams to help in the backend. We number 60 representatives alone!

The first 12 representatives in a long list. Our Head of Research is right there on the right!

And it’s starting to gain media attention.

And what about Tunisian Translation?

Representing Tunisia is of course Data Co-Lab! Our research team saw the opportunity in the project since, once completed, it could lead to a plethora of interesting projects.

We have been part of this project for a while now, and are starting to see results. We have a team of 15 people working on the project: 2 Co-labers dealing with the AI & Management and 13 volunteers helping us translate the English text into Tunisian.

At first, we had a score of 0 (out of 1), but with the help of the volunteers, the score is steadily increasing. It wouldn’t have been possible without them!

Can I help?

Of course you can! Be it in AI or in translating, help is always welcome! All our volunteers will be publicly endorsed on our projects page and rewarded.

When can we see the results?

Once we achieve a satisfying score, we’ll start deploying the solution online for you to test, so don’t forget to follow our Facebook page for any upcoming news!

What else?

At Data Co-Lab, we’re working on projects that are useful from day 1 of deployment. We will soon publish our other projects, and you’ll see.

We’re always looking for collaborations & problems to solve, so if you’ve got anything interesting, contact us!

Elyes Manai

Google Developer Expert in Machine Learning & Nvidia AI Instructor