Google is updating its Gboard keyboard with AI-powered offline dictation on its Pixel phones. This means no more network latency or patchy connections: the new recogniser is always available, even when you are offline.
The model works at the character level: as you speak, it outputs words character by character, just as if someone were typing out what you say in real time. Since there is no need to send data over the Internet, Gboard’s voice typing should now be faster and more reliable.
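To make the character-by-character behaviour concrete, here is a minimal Python sketch. It is not Google's recogniser: the `frames` list is a hand-written stand-in for per-frame model output, and the empty string used as a "no new character" marker is an assumption made for illustration.

```python
from typing import Iterable, Iterator

BLANK = ""  # hypothetical marker: this audio frame produced no new character


def stream_transcript(frame_outputs: Iterable[str]) -> Iterator[str]:
    """Yield the growing transcript each time a new character is emitted."""
    text = ""
    for ch in frame_outputs:
        if ch == BLANK:
            continue  # nothing to show for this frame
        text += ch
        yield text


# Stand-in for what a character-level model might emit, frame by frame.
frames = ["h", "", "i", "", " ", "y", "o", "u"]
partials = list(stream_transcript(frames))

for partial in partials:
    print(partial)
```

Each printed line is one character longer than the last, which is the effect the user sees on screen while speaking.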
Previously, the uncompressed models that Gboard used for speech recognition took up about 2GB. When the user tapped the microphone icon, the speech was recorded and sent to Google’s servers to be converted into text, and that text was then sent back.
To move recognition onto the device, the company trained a smaller model using recurrent neural network transducer (RNN-T) technology. It runs on-device with the same accuracy as the server-based models, yet takes up only around 450MB of storage.
Using model quantization, Google was able to reduce the size of the model further, to a package that takes up only about 80MB. Quantization also increases the speed of transcription.
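As a rough illustration of what quantization does, the sketch below stores a handful of floating-point weights as signed 8-bit integers plus a single scale factor. The article does not say which scheme or bit width Google used, so the symmetric 8-bit approach, the function names and the sample weights are all assumptions.

```python
from typing import List, Tuple


def quantize(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the 8-bit values."""
    return [q * scale for q in quantized]


weights = [0.50, -1.27, 0.03, 0.99]  # made-up example weights
q, scale = quantize(weights)
restored = dequantize(q, scale)

float_bytes = len(weights) * 4  # 32-bit floats use 4 bytes each
int8_bytes = len(q) * 1         # 8-bit integers use 1 byte each
print(q, float_bytes, int8_bytes)
```

Going from 32-bit floats to 8-bit integers gives a fourfold saving on the weights by itself. The drop from 450MB to 80MB reported here is closer to 5.6 times, so the shipped package presumably relies on more than this one step.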
At present, the latest Gboard update is only available in American English and on Pixel phones. Google’s AI team may expand the update to include more languages and more devices in the future.