AIWave Speech & Voice
In a multimedia-driven world, the ability to process spoken language efficiently and accurately has become crucial in many scenarios. Demand for fast and precise automatic speech processing has surged across a variety of sectors, such as customer service, media, market research, and content moderation.
Speech and Voice tools are designed to address these growing needs by automating the understanding and processing of audio and video content.
AIWave Speech & Voice is a one-stop shop of powerful tools for designing speech-based applications.
This toolkit includes all the capabilities needed to transform audio and video into text. It also lets you index content and extract information through native integration with Natural Language Understanding and Search services.
Features and Capabilities
Automatic Speech Recognition automatically recognizes voice signals coming from an audio source.
Speech to Text converts the speech signal contained in an audio file into text.
Speech Translation translates the text generated from a source language into one or more target languages.
Batch and Real-time transcription.
Inverse Text Normalization features such as numeral formatting.
Punctuator improves readability by punctuating the text.
Truecaser improves readability by truecasing the text where needed.
Speaker Diarization attributes speech segments to a specific speaker.
Voice Activity Detection with Audio Classification for categories such as music, speech, jingles, silence, and other non-speech signals and noise.
42 languages and 60+ domain-specific language models fine-tuned for verticals, available in narrow, mixed, and broadband.
Multi-Framework selects the most appropriate framework, dispatching the audio stream to either the fastest or the most accurate decoding engine.
Speaker Identification identifies the speaker from a database.
Spoken Language Identification automatically detects the language spoken by the speaker.
Gender Classification detects gender with additional biometric information.
Anonymization of personal data through text alteration and morphing.
Natural Language Understanding for intent recognition and classification.
Additional Tools
Transcription Biasing
Weighting that biases and shapes the transcription according to business needs.
MyListWeb
An ontology tool to enrich transcripts.
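To make the idea of Transcription Biasing more concrete, the sketch below shows one generic way weighting can steer a transcript: candidate transcriptions are rescored with a bonus for each business-relevant phrase they contain. This is only an illustration under stated assumptions, not how AIWave implements biasing. The function name `rescore`, the log-style scores, and the boost values are all hypothetical.

```python
from typing import Dict, List, Tuple


def rescore(
    hypotheses: List[Tuple[str, float]],
    boosts: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Add a bonus to each hypothesis for every boosted phrase it contains.

    hypotheses: (text, score) pairs, where a higher score means more likely.
    boosts: phrase -> bonus weight (hypothetical values chosen by the user).
    Returns the hypotheses sorted from best to worst after biasing.
    """
    rescored = []
    for text, score in hypotheses:
        lowered = text.lower()
        # Sum the weights of all boosted phrases found in this hypothesis.
        bonus = sum(
            weight for phrase, weight in boosts.items()
            if phrase.lower() in lowered
        )
        rescored.append((text, score + bonus))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up example: the decoder slightly prefers the wrong wording,
# and a boost on the domain term "account" flips the ranking.
hypotheses = [
    ("please open a new a count", -4.0),
    ("please open a new account", -4.5),
]
boosts = {"account": 1.0}
best_text, best_score = rescore(hypotheses, boosts)[0]
```

In this toy setup the boosted hypothesis wins because its adjusted score exceeds the unboosted alternative. A production system would typically apply such weights inside the decoder rather than on a final list of candidates.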
Benefits
Speech Translation
Translate transcripts with Machine Translation, from a source language into multiple target languages.
More than Transcripts
Generating accurate transcripts is not the endpoint. The integrated NLU capabilities also make it possible to perform sentiment analysis on speech transcriptions.
Fast and Accurate
The solution provides a universal decoder that chooses the most appropriate framework for the job. This tool offers flexibility and can speed up transcription jobs by up to 5x.
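The choice between a faster and a more accurate decoding engine can be pictured with a small sketch. It assumes each engine advertises a relative speed and an expected accuracy, and that the caller states a preference. The `Engine` class, the `choose_engine` function, the engine names, and all numbers are hypothetical and do not describe the product's actual decoder or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Engine:
    name: str
    relative_speed: float     # higher means faster (illustrative value)
    expected_accuracy: float  # between 0 and 1 (illustrative value)


def choose_engine(engines: List[Engine], prefer: str) -> Engine:
    """Pick the fastest or the most accurate engine, depending on `prefer`."""
    if not engines:
        raise ValueError("no engines available")
    if prefer == "speed":
        return max(engines, key=lambda e: e.relative_speed)
    if prefer == "accuracy":
        return max(engines, key=lambda e: e.expected_accuracy)
    raise ValueError(f"unknown preference: {prefer!r}")


# Made-up engine catalogue for illustration only.
engines = [
    Engine("fast-engine", relative_speed=3.0, expected_accuracy=0.90),
    Engine("accurate-engine", relative_speed=1.0, expected_accuracy=0.95),
]

realtime_choice = choose_engine(engines, prefer="speed")
batch_choice = choose_engine(engines, prefer="accuracy")
```

A real dispatcher would likely weigh more signals, such as language, audio bandwidth, and whether the job is batch or real-time. The sketch only shows the basic trade-off.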
Index and Search
Index and search content using semantic analysis engines enhanced by the latest generative composite AI technologies, enabling text extraction and indexing against conceptual models.
Logical Architecture