Velvet Speech-2B

October 14, 2025

A compact and versatile language model designed for real-time, dynamic interactions, with the ability to process and understand spoken language. It features integrated speech recognition, enabling seamless audio input handling for applications like voice commands, conversational interfaces, and assistive technologies.

The essence of Velvet Speech 2B

The Velvet Speech 2B multimodal version brings text and voice together in a single system. Born from the scientific and technological heritage of PerVoice, a spin-off of the Bruno Kessler Foundation renowned for its pioneering work in voice technologies, the model builds on decades of research and innovation in speech and language processing.

Languages

Italian and English. The model follows a natively multilingual design that is not centered on English.

Training Dataset

Fine-tuning used a combination of permissively licensed open-source datasets and internally collected synthetic datasets tailored to long-context problems.

Parameters

2 billion parameters.

Vocabulary

127K tokens.

Context Window

32K tokens.
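
As a rough illustration of what a 32K-token window means in practice, the sketch below checks whether a prompt plus the requested generation length fits in the window. It assumes that "32K" means 32 × 1024 = 32,768 tokens and that prompt and generated tokens share the same window; neither detail is stated on this page, and the function names are hypothetical.

```python
# Assumption: "32K" is read as 32 * 1024 = 32,768 tokens.
# The page does not give the exact figure.
CONTEXT_WINDOW = 32 * 1024


def remaining_output_budget(prompt_tokens: int,
                            context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens are left for generation after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never report a negative budget when the prompt alone overflows.
    return max(context_window - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """Check that prompt and generated tokens together fit in the window."""
    return prompt_tokens + max_new_tokens <= context_window


if __name__ == "__main__":
    print(remaining_output_budget(30_000))
    print(fits_in_context(30_000, 4_000))
```

Token counts depend on the model's tokenizer, so the prompt length should be measured with the tokenizer shipped with the model and not estimated from character counts.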

Data Freshness

Data cutoff date: June 2025.

Specialization

1M examples.

Safety

50K instructions.

Input and Output

The model accepts both text and audio inputs, generating outputs exclusively in text form.

Capabilities

Speech Recognition
Speech Translation
Real-Time Voice Interaction
Code Switching Management
Speech Emotion Recognition
Question Answering
Machine Translation
Information Extraction
Common Sense Reasoning
Text Classification
Textual Entailment
Function Calling
Spoken Translation
Paraphrasing
Summarization
RAG

Why Velvet Speech 2B

Velvet-2B is designed for efficient fine-tuning on specialized tasks, making it a flexible solution across different use cases. Not all challenges demand the same approach; some scenarios require speed and scalability, while others face constraints related to cost or hardware capacity.

1. High Volumes

Velvet-2B provides a responsive and agile solution for organizations handling large volumes of data in near real time, balancing speed and performance.

2. Tighter Scope

Velvet-2B delivers efficient, cost-effective results tailored to the specific needs of smaller organizations with limited computational resources.

3. Narrow Tasks

Because it can be fine-tuned easily on minimal hardware, Velvet-2B is well suited to highly specific tasks with well-defined objectives.
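
To give a sense of scale for the hardware claims above, the following back-of-the-envelope sketch estimates the memory needed just to hold the weights of a 2-billion-parameter model at common numeric precisions. It assumes exactly 2 × 10^9 parameters and counts weights only; activations, the KV cache, and optimizer state during fine-tuning add to this figure. The helper name is hypothetical, and the page does not state which precision the model ships in.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

# Assumption: "2 billion" is taken as exactly 2e9 parameters.
NUM_PARAMS = 2_000_000_000


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate weight storage in GiB (weights only, no runtime overhead)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.2f} GiB")
    # float32: 7.45 GiB
    # bfloat16: 3.73 GiB
    # int8: 1.86 GiB
```

These figures are a lower bound on memory use, not a hardware requirement; full fine-tuning typically needs several times the weight footprint.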

Discover Velvet AI Models

Check it out on Hugging Face
