Community

Grassroots Data Collection Training

Capacity building for community annotators across African language groups

Date: TBD
Venue: Hybrid (In-person + Online), TBD
Duration: 3 hours

About the workshop

A training session designed for grassroots annotators, linguists, and community leaders. Participants learn how to set up annotation projects, manage contributors, apply quality-control measures, and connect their data pipeline to the AfricaNLP Playbook governance guidelines.

Agenda

  1. Data governance and community ownership principles (20 min)
  2. Setting up and managing an annotation project (40 min)
  3. Hands-on annotation across language groups (40 min)
  4. Quality control and validation workflows (30 min)
  5. Discussion: sustainability and long-term engagement (30 min)

Objectives

  • Build local capacity for running annotation projects.
  • Teach data governance and community ownership principles.
  • Connect grassroots annotators with the broader Masakhane community.
  • Identify priority languages and tasks for upcoming data collection campaigns.

Expected outcomes

  • Participants who are able to run annotation projects independently.
  • A network of trained community annotators across African language groups.
  • A list of priority languages and tasks for upcoming campaigns.

Who should attend

  • Grassroots annotators and community leaders
  • Linguists working with under-resourced languages
  • NGOs and civic tech groups interested in language data

Organizers

Masakhane community team