Google Translate adds support for 110 languages, including Punjabi, Marwadi, and more

Google has announced that its Translate app now supports 110 new languages, adding to the 133 previously available.

This update is part of Google’s 1,000 Languages Initiative, aimed at building AI models to support the world’s 1,000 most spoken languages.

PaLM 2 Large Language Model

Google is using its PaLM 2 large language model to integrate these 110 new languages into Google Translate.

The new languages range from major world languages to those spoken by small Indigenous communities, and together they represent more than 614 million speakers. About a quarter of the new languages come from Africa, marking Google’s largest expansion of African languages to date. These include:

  • Fon
  • Kikongo
  • Luo
  • Ga
  • Swati
  • Venda
  • Wolof

Newly Supported Languages

  • Afar: Spoken in Djibouti, Eritrea, and Ethiopia; added with significant volunteer contributions.
  • Cantonese: A highly requested language that often overlaps with Mandarin in writing.
  • Manx: The Celtic language of the Isle of Man, revived after near extinction.
  • NKo: A standardized form of the West African Manding languages, with its own unique alphabet.
  • Punjabi (Shahmukhi): The variety of Punjabi written in Perso-Arabic script, and the most spoken language in Pakistan.
  • Tamazight (Amazigh): A Berber language of North Africa, written in Latin and Tifinagh scripts.
  • Tok Pisin: An English-based creole and the lingua franca of Papua New Guinea.

Google Language Selection

Google considers regional varieties and spelling standards when adding a language, and it prioritizes the most commonly used variety, including for languages that have no single standard form.

For instance, Romani text in Translate blends Southern Vlax Romani with elements from Northern Vlax and Balkan Romani dialects.

Efficiency with PaLM 2

Google also said that its PaLM 2 model helps Translate learn closely related languages more efficiently, such as Awadhi and Marwadi (both close to Hindi) and the French-based Seychellois and Mauritian Creoles.

Google plans to support more language varieties and spelling conventions over time as technology advances and partnerships with linguists and native speakers continue.
