# 6. Community engagement

#### <mark style="color:blue;">Chapter 6 overview:</mark>

{% hint style="info" %}
This chapter is for project managers and community engagement officers. It’s also for anyone who is trying to get communities involved in collecting voice data.

We will cover how to:

* find the right contributors
* onboard contributors
* keep them involved over time
* motivate users with ethical methods
* use feedback to keep people involved

Note: This section is practical. You don’t need any technical skills.
{% endhint %}

Community engagement is the process of getting speakers of marginalized languages involved in your project. They contribute data that will later be used to build [language technology](#user-content-fn-1)[^1]. The data will also be used for research, for training, and to assess existing language models and AI.

For [TWB Voice](#user-content-fn-2)[^2], this process was the key to successful [voice data collection](#user-content-fn-3)[^3] and validation. In this section, we share what we learned and the approaches we tested in the TWB Voice project. We offer practical help with finding, onboarding and supporting [community members](#user-content-fn-4)[^4], and with keeping them involved so we can collect as much [voice data](#user-content-fn-5)[^5] as possible.

<br>

[^1]: **Language Technology (LT):** Technologies that focus on human language, including both spoken and written language. They can process, understand, and generate language. Examples are the tools on your phone or computer that understand and generate words, like translation apps or voice assistants. They allow us to communicate and interact with our devices with language. When you record a message and the device transforms it into text, or your phone suggests the next word in a message, that’s because of language technology. It makes digital tools more accessible, interactive and sometimes more efficient.

[^2]: **TWB Voice:** A platform for collecting voice data. It was developed by CLEAR Global, who also own it. By [signing up to the TWB Community](https://translatorswithoutborders.org/join-the-twb-community/), users can make voice recordings for active data collection projects in TWB Voice. The main goal of TWB Voice is to help develop voice technology for speakers of marginalized languages, for example by creating the voice datasets that are needed to build language models for text-to-speech (TTS) and automatic speech recognition (ASR).

[^3]: **Voice data collection:** Gathering recordings of speech with their transcriptions in a systematic and ethical way. It also involves collecting demographic data (age, gender, accent). For Automatic Speech Recognition, the data should include a range of speakers. The voice data is used in research and for training or developing voice language models.

[^4]: **Community members:** People at the center of a project who contribute by recording and validating data. In this context, community refers to their shared interest in contributing to data collection and the languages they speak. Community members are well positioned to provide feedback on linguistic aspects and offer input on the project's direction and needs.

[^5]: **Voice data:** Audio recordings of human speech. These recordings capture the acoustic features of spoken language, such as pronunciation, speaking patterns, and rhythm.

