OpenAI adds voice and image capabilities to ChatGPT


OpenAI has introduced voice and image capabilities in ChatGPT, offering a more intuitive interface that lets you hold a voice conversation or show ChatGPT what you are talking about.

Voice Conversations with ChatGPT

ChatGPT now supports voice conversations. You can have a back-and-forth conversation with your assistant, ask for a bedtime story, or settle a debate.

To start, go to Settings → New Features in the mobile app and opt in to voice conversations. Then tap the headphone icon and choose one of five voices.

The feature is powered by a new text-to-speech model that generates lifelike audio from text, together with Whisper, OpenAI’s open-source speech recognition system, which transcribes your spoken words into text.

OpenAI worked with professional voice actors to create the voices and says it is actively working to mitigate risks such as impersonation and fraud.

Discussing Images with ChatGPT

ChatGPT can now discuss images you share with it. You can upload photos, have it analyze graphs, or use the drawing tool in the mobile app to highlight part of an image.

To start, tap the photo button to capture or choose an image; on iOS and Android, tap the plus button first. You can include multiple images in one conversation and use the drawing tool to point your assistant at the relevant part.

Image understanding is powered by multimodal GPT-3.5 and GPT-4 models, which apply their language reasoning skills to a wide range of images, including photos, screenshots, and documents.

OpenAI acknowledges that image input introduces challenges such as misinterpretation, and says it has tested the feature extensively to support responsible use.

Model Limitations

OpenAI acknowledges that although users may rely on ChatGPT for specialized topics, the model has limitations and should not be used in high-risk situations without proper verification.

The model performs best on English text and may struggle with other languages, especially those written in non-Roman scripts. Users working in those languages are therefore advised to use it cautiously.

Availability

OpenAI is rolling out the voice and image features to ChatGPT Plus and Enterprise users over the next two weeks. Voice will be available on iOS and Android as an opt-in setting, and image input will be available on all platforms.

OpenAI also plans to extend these capabilities to other groups of users, including developers, in the near future.