
From the creators of AfriSenti and AfriHate

Build African language datasets, the right way.

An open playbook and annotation platform for grassroots NLP data collection — designed with communities, for communities, across the continent.

AfriPlaybook

A practical guide to dataset creation, written with the communities who use it — from task formulation and label schema design to consent forms, inter-annotator agreement, and sustainability. Every chapter is built around real low-resource language scenarios.

  • Step-by-step guidelines, video demos, and quality checklists
  • Voice, text, speech–text alignment, and translation chapters
  • Templates for consent, licensing, and governance toolkits
  • Translated into 5 African languages with community review

AfriAnnotate

An open, mobile-first Progressive Web App for grassroots data collection, built for the realities of African contexts: patchy connectivity, multiple scripts, and community-led annotation workflows.

  • Offline-first capture with background synchronization
  • Speech, text, ranking, and multimodal annotation support
  • Inter-annotator agreement (Fleiss' κ, Krippendorff's α) dashboards
  • African-language localization and virtual keyboards
  • Apache 2.0 licensed with a clear contributor agreement
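
For readers unfamiliar with the agreement statistics named above, here is a minimal sketch of Fleiss' κ in plain Python. It is an illustration of the standard formula only, not AfriAnnotate's actual implementation; the function name `fleiss_kappa` and the toy ratings are assumptions made for this example.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    (Hypothetical helper for illustration, not part of any AfriAnnotate API.)
    """
    if not counts:
        raise ValueError("need at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two annotators per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")

    # Observed agreement: share of annotator pairs that agree, averaged over items.
    pairs = n_raters * (n_raters - 1)
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / pairs for row in counts
    ) / n_items

    # Chance agreement: sum of squared overall category proportions.
    n_categories = len(counts[0])
    total = n_items * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(n_categories)
    )

    if p_e == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (made up): 4 items, 3 annotators, labels (negative, neutral, positive).
toy_counts = [
    [3, 0, 0],
    [0, 3, 0],
    [1, 2, 0],
    [0, 1, 2],
]
print(f"Fleiss' kappa: {fleiss_kappa(toy_counts):.3f}")
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. Krippendorff's α follows a similar observed-versus-expected logic but also handles missing ratings, which this sketch does not.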

AfriFinder

Find verified annotators and African NLP experts by language, domain, and region — vouched for by the communities who speak the language. A marketplace for annotation jobs and a directory for researchers and linguists, in one place.

  • Post annotation tasks and hire native-speaker annotators
  • Search experts by language, NLP domain, and region
  • Language Lead verification per language community
  • Mobile money and local payout options for annotators
  • Open profiles for collaboration and project invitations

From the Community

The Playbook is exactly the practical, reproducible guide that African NLP has needed — a real reference, not a brochure.

NLP researcher, African NLP community

Pairing the Playbook with the Tool turns annotation theory into reproducible practice — that combination is what makes it useful in the field.

Dataset builder, African NLP community

Documenting low-resource language work has long been ad-hoc — a shared playbook gives our teams a common vocabulary and saves a lot of guesswork.

Computational linguist, African NLP community

Open guidance like this lowers the barrier for builders across the continent to ship language-first AI products responsibly.

ML engineer, African NLP community

Open infrastructure for African languages is finally catching up with the rest of the field. This is a major milestone for the community.

Language technologist, African NLP community

A community-built standard for low-resource annotation. Long overdue and well executed — the kind of resource teams will reach for daily.

Educator & researcher, African NLP community

For multilingual annotation across Bantu languages, this is the resource I wish we had had years ago — clear, applicable, and honest about trade-offs.

PhD researcher, African NLP community

The combination of methodological rigor and African-context grounding makes this stand out from generic NLP guides — a long-overdue reference.

Open-source contributor, African NLP community

Thanks to our Contributors

The Playbook is built by a growing community of researchers, students, and language experts. If you've contributed code, content, or review — thank you.

SUPPORTED BY

  • Masakhane African Languages Hub
  • Bayero University, Kano
  • Bahir Dar University
  • HausaNLP
  • EthioNLP