TrustRadius: an HG Insights company

Azure AI Speech

Score 8.6 out of 10

19 Reviews and Ratings

What is Azure AI Speech?

The Azure AI Speech service provides a range of speech recognition and generation capabilities, including speech transcription, text-to-speech, speech translation, and speaker recognition.

Azure AI Speech, a great fellow traveler in your speech-to-text adventure!

Use Cases and Deployment Scope

We use Azure AI Speech to capture the user's voice from an Angular frontend app, then tokenize and rank the words to give the user a fact checker during an open discussion with another speaker.
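The tokenize-and-rank step described in this review could look roughly like the sketch below. It assumes the transcript has already been returned as plain text, and it ranks words by simple frequency after dropping a small stop-word list. The function name `rank_terms`, the stop-word list, and the frequency-based ranking are all illustrative assumptions, not the reviewer's actual implementation.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real application would use a fuller one.
STOP_WORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "this", "for", "on", "with", "was", "are",
})


def rank_terms(transcript: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Tokenize a transcript and return the most frequent non-stop words."""
    # Lowercase, then keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

The top-ranked terms could then be passed to whatever fact-checking lookup the application uses.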

Pros

  • Build a conversational language understanding model
  • Translate text with Azure AI Translator service
  • Create a custom text classification solution

Cons

  • Voice quality needs to be improved
  • Word Error Rate is higher than OpenAI's and needs to improve
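
For context on the Word Error Rate comparison, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's output, divided by the number of words in the reference. A minimal sketch, assuming whitespace tokenization and case-insensitive matching; the function name is illustrative and not part of any Azure or OpenAI SDK:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same reference recordings through two services and comparing the resulting scores is one way to make a claim like the one above measurable.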

Return on Investment

  • Quick to get up to speed to build a call center system
  • Onboarded more new customers since we began capturing voice with Azure AI

Alternatives Considered

Amazon Polly, OpenAI API and Google Cloud Speech-to-Text

Other Software Used

Azure AI Search, Azure DevOps Services, Docker

Great Recognition Capability with Azure Cognitive Speech Services and the Technical Team is Very Reliable.

Use Cases and Deployment Scope

The simplicity of the initial implementation of Azure Cognitive Speech Services is a big plus. The features are unusually flexible, and customizing any function is simple. The software comes with powerful tools that offer effective voice recognition and easy management of recordings and other business data through cloud services, and its engagement functions and predictive data analytics are the best.

Pros

  • Supportive data integration functions.
  • Simple adaptation to all functionalities.
  • I really love the speed of data migration with this platform.

Cons

  • Initial training is essential when you are new to this software.
  • Tracking a huge amount of recording history is difficult.
  • Collecting multiple reports and evaluations is a tough operation.

Most Important Features

  • Lead and contacts management functions.
  • Reporting tools perform well.
  • Data connector options are very responsive.
  • Functional engagement tools.

Return on Investment

  • The platform provides solutions for project information and easy management of client contacts.
  • Reliable tools for quick reporting and the predictive data offered are quite relevant.
  • Multiple data migrations are easy to schedule through the platform.

Alternatives Considered

IBM Cloud for VMware Solutions, IBM Watson Speech to Text and SAP Conversational AI

Other Software Used

Webex Meetings, BlueJeans Events, Kentico Xperience

A solid service provided by Microsoft which has some room for minor improvements. Definitely one of the top services in this market and well worth considering.

Use Cases and Deployment Scope

There are two main uses for this product within our organisation so far. First, we use the accurate voice analysis with custom speech models in lectures to ensure our lectures are accessible to students with hearing-related accessibility issues, mostly through live text translation. Second, students are able to use this service and integrate its functionality into their application development during projects within their computing degrees.

Pros

  • It implements accurate voice analysis which can be improved with customised speech models
  • Affordable
  • Doesn't have to be run online; can be run and stored locally

Cons

  • It can be quite difficult to set up
  • Speech recognition is occasionally inaccurate
  • It sometimes struggles with non-native English speakers' accents

Most Important Features

  • Accurate speech detection and transcription
  • Live speech detection functionality
  • Easy deployment

Return on Investment

  • Increased accessibility of our lectures for students
  • Reduced the time required by lecturers to add closed captions to remote lectures during the COVID-19 pandemic

Alternatives Considered

IBM Watson Text to Speech and Azure Cognitive Search (formerly Azure Search)

Other Software Used

IBM Watson Text to Speech, boost.ai

Enterprise grade speech services for the ML generation

Use Cases and Deployment Scope

We use Azure Cognitive Speech Services to add speech-to-text, text-to-speech, and other AI-driven, NLP-related speech services to our customised applications, especially those involving chatbots for different business functions. The idea was to use speech services in mobile apps to make them hands-free and more accessible. The range of languages helped, especially in an Indian context, as only one competitor product could support as many Indian languages, apart from a few European and Middle Eastern ones.

Pros

  • APIs offered are very robust.
  • The number of languages supported is far greater than that of most competitors.
  • Integration with our custom apps was easy.
  • Speech models that we created using neural voices were quite impressive.
  • Translation services worked really well.
  • Built-in machine learning opens it up to many more business use cases in the future.

Cons

  • At times different accents can be an issue, but over time, with more data, this can be further improved, especially with reinforcement learning.
  • Price is on the higher side so ROI is slow to realise.
  • For community development, perhaps some of its source code could be open-sourced for further engagement and development as the overall community is small.

Most Important Features

  • Text to speech.
  • Speech to text.
  • Translation APIs.
  • Customizable keywords.
  • Integration with 3rd party apps.
  • Ease of deployment on the cloud.

Return on Investment

  • Although it takes time, our apps powered by speech services gave us good ROI.
  • Made our products stand out in finance, HR and operations functions.
  • Gave much-needed AI-powered machine learning integration through the NLP offered by Azure.
  • Our chat assistants became more user-friendly, and thus UX improved.

Alternatives Considered

Google Cloud Speech-to-Text, Amazon Transcribe and IBM Watson Speech to Text

Other Software Used

Google Drive, IBM Cognos Analytics with Watson, Automation Anywhere, Microsoft 365 (formerly Office 365), Sophos Intercept X, Jira Software, VMware Blockchain, Broadcom Test Data Manager (formerly CA Test Data Manager)

Good, secure platform for enterprise cognitive requirements.

Use Cases and Deployment Scope

We used it for a POC where we had to convert speech recordings of customers calling our helpline to text. These transcripts were to be used for training and for analysing customer sentiment. Azure Cognitive Speech Services was used to convert speech to text. The scope of the use case was extended to analyze all conversations with customers calling for inquiries and support.

Pros

  • Deployment is easy since it's available on the cloud.
  • It is offered directly as a service, so no expertise in AI or ML is needed by the development team.
  • Data security, since Azure promises that it does not store the customer data used by the service.

Cons

  • Needs more support for Indian regional languages and the ability to interpret Indian dialects.
  • Needs more detailed documentation with more code examples.

Most Important Features

  • Security and privacy of client's data - this is most important.
  • Support for multiple languages available.
  • Support from Azure and its partner ecosystem.

Return on Investment

  • It has improved the productivity of our sales training program by 9%.
  • It has reduced manpower at helpline by a significant amount.

Alternatives Considered

Yellow Messenger