Google plans to develop Artificial Intelligence that supports the world’s most widely spoken languages. To do so, it wants to mine data in new languages not only from text on the internet but also from images, videos, and speech.
To compete as a tech giant on the internet’s next battleground, Google is turning to language coverage. On Wednesday, the company announced that it is developing an Artificial Intelligence model covering the world’s 1,000 most spoken languages.
Data is critical to Artificial Intelligence advancements, and Google and its Big Tech rivals want to tap into it to help make products perform better and reach the broadest possible audience. “Imagine a new internet user in Africa speaking Wolof… using their phone to ask where the nearest pharmacy is,” said Google researcher Johan Schalkwyk. Such situations are “too common,” Schalkwyk told reporters, adding that languages are “not available to everyone in the world.”

Google Plans To Develop Artificial Intelligence With The Most Widely Spoken Languages
According to Schalkwyk, there are over 7,000 languages in the world, yet Google provides translations for only about 130 of them. The search engine behemoth intends to broaden this coverage significantly by mining data in new languages not only from text on the internet but also from videos, images, and speech.
The group is also looking for audio clips in languages that may not have much written material. As the project progresses, which is expected to take several years, Google intends to incorporate its findings into its products, including YouTube and Google Translate.
As a first step toward this goal, Google presented an Artificial Intelligence model trained on over 400 languages, which The Verge describes as “the largest language coverage seen in language models to date.” Language and Artificial Intelligence have arguably always been at the heart of Google’s products, but recent advances in machine learning, such as the development of powerful, feature-rich “large language models” (LLMs), have placed a new emphasis on these areas.

Google has already begun to incorporate these language models into products such as Google Search while defending the systems’ functionality. Language models have several flaws, including a tendency to reproduce harmful societal biases like racism and xenophobia, as well as an inability to parse language with human sensitivity.
Previous research has demonstrated the effectiveness of training a single model on many languages, and the scale of Google’s proposed model may offer significant advantages over previous work. Such large-scale projects have become the norm for technology companies seeking to dominate Artificial Intelligence research. A similar project is Facebook parent company Meta’s current attempt to create a “universal language translator.”
Final Thoughts
Access to data remains a challenge when training a model across so many languages. Google says it will fund data collection for low-resource languages, including audio recordings and written texts, to support work on the 1,000-language model. Stay tuned for further updates from Google as the project develops. Thanks for reading!