
3. Setting up a project to collect voice data
Chapter 3 overview:
Why set up a project to collect voice data?
Before you start a project to collect voice data, define your goals so that the data you collect will be useful. This is especially important when resources are limited, because you cannot afford to collect data that is not useful or relevant.
Here is a typical list of questions to ask:
What are the main goals of your project?
Who will it help?
What problem do you want to solve?
How will the data be used?
How will your project support and empower the language community involved?
The answers will help you to design your project. For example:
If you plan to use the datasets to build tools like speech recognition, the data must meet technical needs, such as language variant and quality.
If you plan to use the datasets to build a speech solution for a specific use case, the data should match the context. For example, if you want to build a voice assistant for farmers, you may need to collect data in rural dialects, on topics relating to farming, and in real-world outdoor settings.