Speech

This section covers collecting and labelling audio for African languages. The tasks below share the same recording, consent and quality groundwork; they differ in what is produced from the audio.

Shared across speech data

Speaker recruitment, consent & voice rights
Recording setup & environment (device, microphone, background noise)
Audio formats, sample rates & file standards
Audio quality control (SNR, clipping, silence, channel checks)
Transcription conventions (orthography, code-switching, disfluencies)
Metadata (speaker demographics, device, environment, locale)
Licensing & ethical handling of voices
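
As a rough illustration of the audio quality control item above, the sketch below computes three of the listed checks (clipping, silence and SNR) on samples already decoded to floats in the range [-1.0, 1.0]. The function names and the default thresholds (0.999 for clipping, 0.01 for silence) are assumptions chosen for the example, not values prescribed by this guide, and the SNR estimate assumes a noise-only segment is available for each recording.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of samples normalised to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def clipping_ratio(samples: Sequence[float], threshold: float = 0.999) -> float:
    """Fraction of samples at or above the (assumed) clipping threshold."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if abs(s) >= threshold) / len(samples)


def silence_ratio(samples: Sequence[float], threshold: float = 0.01) -> float:
    """Fraction of samples whose magnitude is below the (assumed) silence threshold."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if abs(s) < threshold) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Estimate SNR in dB from a speech segment and a noise-only segment."""
    noise_rms = rms(noise)
    speech_rms = rms(speech)
    if noise_rms == 0.0:
        return math.inf   # no measurable noise floor
    if speech_rms == 0.0:
        return -math.inf  # no measurable speech energy
    return 20.0 * math.log10(speech_rms / noise_rms)
```

In practice a project would pick its own thresholds per device and environment, and would flag a recording for review rather than reject it automatically.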

Tasks

ASR (Automatic Speech Recognition): transcription, multilingual ASR, code-switching
TTS (Text-to-Speech): single-speaker, multi-speaker, expressive TTS
Speech-to-Speech Translation (STS): direct speech translation across languages
Audio Understanding: audio classification, sound event detection
Speech emotion recognition
Speaker diarization
