Velvet Speech-2B
October 14, 2025
A compact and versatile language model designed for real-time, dynamic interactions, with the ability to process and understand spoken language. It features integrated speech recognition, enabling seamless audio input handling for applications like voice commands, conversational interfaces, and assistive technologies.
The essence of Velvet Speech 2B
The multimodal version of Velvet Speech 2B brings text and voice together in a single system. Born from the scientific and technological heritage of PerVoice, a spin-off of the Bruno Kessler Foundation renowned for its pioneering work in voice technologies, the model builds on decades of research and innovation in speech and language processing.
Languages
Italian and English. The model follows a natively multilingual design that is not centered on English.
Training Dataset
The model was fine-tuned on a combination of permissively licensed open-source datasets and internally collected synthetic datasets tailored to long-context problems.
Parameters
2 billion parameters.
Vocabulary
127K tokens.
Context Window
32K tokens.
Data Freshness
Data cutoff date: June 2025.
Specialization
1M examples.
Safety
50K instructions.
Input and Output
The model accepts both text and audio inputs, generating outputs exclusively in text form.
Capabilities
Speech Recognition
Speech Translation
Real-Time Voice Interaction
Code Switching Management
Speech Emotion Recognition
Question Answering
Machine Translation
Information Extraction
Common Sense Reasoning
Text Classification
Textual Entailment
Function Calling
Spoken Translation
Paraphrasing
Summarization
RAG
Why Velvet Speech 2B
Velvet-2B is designed for efficient fine-tuning on specialized tasks, making it a flexible solution across different use cases. Not all challenges demand the same approach; some scenarios require speed and scalability, while others face constraints related to cost or hardware capacity.
High Volumes
Velvet-2B provides a responsive and agile solution for organizations handling vast amounts of data in (near) real time, balancing speed and performance.
Tighter Scope
Velvet-2B delivers efficient, cost-effective results tailored to the specific needs of smaller organizations with limited computational resources.
Narrow Tasks
Because it can be fine-tuned easily on minimal hardware, Velvet-2B is ideal for highly specific tasks with well-defined objectives.
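To give a rough sense of what "minimal hardware" can mean for a model of this size, the sketch below estimates the memory needed to hold 2 billion parameters at a few common numeric precisions. The parameter count comes from this page. The list of precisions and the helper name `weight_memory_gib` are illustrative assumptions, since this page does not say which formats the model is distributed in. The estimate covers the weights only: activations, the attention cache for a 32K-token context, and any optimizer state used during fine-tuning would add to it.

```python
# Back-of-the-envelope estimate of weight memory for a 2-billion-parameter model.
# Assumption: the precisions below are generic examples, not formats confirmed
# for Velvet Speech 2B.

PARAMS = 2_000_000_000  # parameter count stated on this page

# Bytes used to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(params: int, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return params * bytes_per_param / 1024**3


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gib(PARAMS, size):.2f} GiB")
```

Under these assumptions the weights alone occupy roughly 7.5 GiB at 32-bit precision and under 4 GiB at 16-bit precision, which is the kind of footprint that fits on a single consumer-class accelerator.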