
ChatGPT Now Speaks, Listens, and Understands: All You Need to Know

by Serge Baloyan, September 25th, 2023


OpenAI is introducing groundbreaking features to ChatGPT that enable the AI to see, hear, and speak. These multimodal capabilities and user-interaction enhancements are set to roll out over the next two weeks, a significant leap beyond purely text-based interactions. Here’s what you need to know about this update:

1. Voice Interaction:

For the first time, ChatGPT will respond verbally to user queries, turning it into a more interactive and engaging conversational partner. This feature will be available on iOS and Android, and users can opt in to hold back-and-forth voice conversations with the AI.


This places ChatGPT in direct competition with renowned voice assistants like Siri and Alexa.

2. Enhanced Multimodal Interaction:

Users can now show images to ChatGPT and have live conversations about them, allowing for a more intuitive and enriched user experience. This feature is a significant step in providing more context during interactions and is available on all platforms.
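In the ChatGPT apps this works through the built-in camera and photo picker, but the same multimodal capability is also exposed to developers through OpenAI's API. Below is a minimal sketch, assuming the official openai Python package (v1+) and a vision-capable model; the model name, image URL, and prompt are placeholders for illustration, not details from the announcement.

```python
# Minimal sketch: asking a question about an image via the OpenAI API.
# Model name, image URL, and prompt are placeholders/assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model; use whichever is available to you
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What landmark is shown in this photo?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/landmark.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```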

3. Personalized Artificial Personas:

Five neutral artificial personas, including Juniper, Breeze, and Ember, will answer user queries aloud. Eventually, OpenAI plans to let users create their own personalized voice, broadening the range of user experiences.

4. Advanced Whisper Engine:

Voice conversations are powered by OpenAI's Whisper speech-recognition model, which transcribes what the user says, paired with a new text-to-speech engine that generates the spoken replies. The AI voices have received positive initial reviews for their human-like quality, though some may find their style intrusive.
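For developers, the same Whisper model that transcribes speech in the app is available through OpenAI's audio transcription endpoint. A minimal sketch follows, assuming the official openai Python package (v1+); the audio file name is a placeholder.

```python
# Minimal sketch of speech-to-text with OpenAI's hosted Whisper model.
# The audio file name is a placeholder/assumption.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("spoken_question.m4a", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",  # OpenAI's hosted Whisper model
        file=audio_file,
    )

print(transcript.text)  # the transcribed text of the spoken question
```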

5. Subscription-Based Access:

The new voice and image capabilities will be exclusive to ChatGPT Plus subscribers, priced at $20 per month, focusing on providing advanced features to dedicated users.

6. Addressing Ethical Concerns:

OpenAI is confident it has resolved most bugs and is now focusing on ethical issues, including potential voice fraud, discrimination against uncommon accents, and the inadvertent attribution of social and political baggage to the AI's voice. The company also assures that ChatGPT's ability to identify (de-anonymize) individuals from photos has been blocked.

7. New Use Cases:

These features provide more versatile applications, allowing users to take pictures of landmarks or their fridge contents and have real-time conversations about them, thus helping in various scenarios like travel, cooking, or learning.


Want to see the new Voice feature put to the test and learn how to get started with Voice in ChatGPT? I will be testing and reviewing it in my newsletter, ‘AI Hunters’, where you can find new tools and use cases for the most groundbreaking AI products. Subscribe; it’s absolutely free!

Summary:

OpenAI's ChatGPT is poised to become more interactive and user-friendly with the introduction of these new features, aligning it more closely with the needs and preferences of users and expanding its range of applications.


The combination of voice interaction, image recognition, personalized AI personas, and ethical considerations underscores OpenAI's commitment to providing innovative and responsible AI solutions.


With the deployment of these enhancements, ChatGPT is not just another chatbot but is becoming a multifaceted assistant, bringing an intuitive and enriching experience to its users.