Estonia chooses crowdsourcing as a way to preserve local language

To preserve its linguistic heritage and support the development of speech technologies, the Estonian government is working to create the largest database of the spoken language, with the participation of volunteers from different community groups.

Recent trends may suggest that technology is putting cultural heritage at risk. In reality, technological progress and the preservation of cultural heritage are not parallel lines; they inevitably meet at some point. The strategic use of technology can not only preserve underrepresented cultures but also help promote them.

Since speech recognition technology appeared in smart devices, it has saved a great deal of time and effort and has been widely used to deliver services in the public and private sectors. But building and running such systems effectively requires an enormous set of training data to develop the algorithms, and for spoken language that data means hours upon hours of recorded speech. How can it be obtained? This is the question facing organizations that want to create digital solutions to improve their customers' experience. These technologies perform better when trained on users' own voices, but data protection and privacy considerations constrain that goal, especially when the implementer is a software developer trying to build a generalizable model. Many popular voice assistant apps have also shown cases of gender and racial bias.

In other cases, the problem stems from declining use of native languages, as in Estonia. In the services, information technology and higher education sectors, reliance on foreign languages and the growing presence of an international workforce have contributed to the decline of the mother tongue. This prompted the government to launch the "Estonian Language Strategy 2021-2035" to maintain the language's position amid the rapid growth of the digital society.

Estonia is a pioneer in digitization, and its Ministry of Economic Affairs and Communications collaborated with the Information System Authority to launch the "Donate Your Speech" project to crowdsource recordings of the local language.

In this campaign, the state addresses all adults who speak its language, whether as their mother tongue or as an acquired language. It invites them to literally donate their words in order to build an extensive database and make it available to government, private and research institutions wishing to develop speech-based services.

Conceptually, voice crowdsourcing means collecting a large volume of recordings from diverse populations or of different speech patterns. Patterns here refers to languages, dialects, or even speech impairments that may be common to certain social groups. The resulting technology can also be used to record meetings, convert interviews into transcripts, and create automatic subtitles for media.

The campaign builds on Mozilla's crowdsourcing tool. Through it, the project seeks to establish an open database of 4,000 hours of high-quality spoken speech, along with translated text and sign language datasets. Open data was chosen to eliminate the need to create a separate dataset for each individual project.

To collect this data, the Ministry is preparing a wide advertising campaign, to be run across traditional and social media, to raise awareness of the importance of language technologies and of preserving the local language. The technical team has designed a dedicated website that participants can access from any device with audio input, such as a personal computer, tablet or smartphone, and speak about any topic of their choice.

Earlier this year, the government launched an app called Bürokratt, an AI-powered program that allows people to use voice assistants to access public services. The public broadcaster also developed a smart system called "Hans" that replaces stenographers: it converts the content of programs broadcast live on television into concise written text followed by tens of thousands of people with hearing difficulties. It also records parliamentary debates as audio files and converts them into written text for editors to review before publication on Parliament's official website.

But crowdsourcing projects usually face several challenges. The first is data quality and accuracy: recording and transcribing audio can introduce technical problems that affect the clarity of speech. The second challenge is data privacy, especially since the recordings will be available on an open portal. For this reason, once the recordings are collected, any identifying information that could point to their owners will be deleted, and donors will still be able to delete their recordings whenever they want.

The biggest challenge is data bias. Some groups in society will inevitably participate less, including minorities, the elderly and people of determination (people with disabilities). To arrive at a database that represents all Estonians, additional efforts must be made to reach out to different population groups and address each with the most suitable awareness message.
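
One simple way to monitor this kind of bias is to compare each group's share of the donated hours with its share of the population. The following is a minimal sketch of that idea, not the project's actual method: the function name `representation_gaps`, the age groups, the 0.8 tolerance and all figures are hypothetical and chosen only for illustration.

```python
from typing import Dict


def representation_gaps(donated_hours: Dict[str, float],
                        population_share: Dict[str, float],
                        tolerance: float = 0.8) -> Dict[str, float]:
    """Flag groups whose share of donated hours is below
    `tolerance` times their share of the population.

    Returns a mapping of group -> (data share / population share),
    rounded to two decimals, for the flagged groups only.
    """
    total = sum(donated_hours.values())
    if total <= 0:
        raise ValueError("no donated hours recorded")

    flagged = {}
    for group, pop_share in population_share.items():
        if pop_share <= 0:
            continue  # skip groups with no defined population share
        data_share = donated_hours.get(group, 0.0) / total
        ratio = data_share / pop_share
        if ratio < tolerance:
            flagged[group] = round(ratio, 2)
    return flagged


# Hypothetical figures, for illustration only (not from the campaign).
hours = {"18-39": 500.0, "40-64": 400.0, "65+": 100.0}
shares = {"18-39": 0.35, "40-64": 0.40, "65+": 0.25}

print(representation_gaps(hours, shares))
```

With these made-up numbers, only the oldest group is flagged, because it contributes 10% of the hours while making up 25% of the assumed population. A check like this could help decide where outreach efforts should be directed.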

Voice crowdsourcing contributes to more diverse data collection and, therefore, to the development of smarter algorithms. The campaign will also help embed language technologies in the information systems used in the public and private sectors and improve access to services.

Speech recognition software is useful in facilitating the work of security, criminal justice, judicial, health, research and media agencies, where written records and detailed reporting are essential.

In the long run, these efforts aim to make voice recognition a positive experience for everyone, regardless of language, gender, age or affiliation.

References:

https://www.hm.ee/sites/default/files/htm_eesti_keele_arengukava_2020_a4_web_en.pdf

https://e-estonia.com/estonian-parliament-uses-speech-recognition-technology-to-create-verbatim-records/

https://annetakonet.ee/projekti-kirjeldus/

https://govinsider.asia/inclusive-gov/estonia-crowdsources-speech-data-for-the-preservation-of-the-estonian-language/

https://thenextweb.com/news/how-mozilla-is-crowdsourcing-speech-to-diversify-voice-recognition
