Velvet-2B

A compact, bilingual, high-efficiency language model built for real-time applications, delivering powerful AI even in resource-constrained or edge settings.

The essence of Velvet-2B

Velvet-2B is an instruct model fine-tuned from Velvet-2B-base on a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets, and is designed to handle textual instruction-based tasks.
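As a sketch of how an instruct model like this is typically queried through the Hugging Face `transformers` API (the model id `Almawave/Velvet-2B` and the presence of a chat template are assumptions, not confirmed here):

```python
# Hypothetical usage sketch for a Hugging Face instruct model.
# The model id is an assumption; heavy dependencies (transformers, torch)
# are imported lazily so the function can be defined without them.

def generate_reply(prompt, model_id="Almawave/Velvet-2B", max_new_tokens=128):
    """Download the model on first call and return a greedy-decoded reply."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    # Wrap the user turn in the tokenizer's chat template: instruct models
    # expect the same conversation format they were fine-tuned on.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Example call (downloads the ~2B weights on first use):
# print(generate_reply("Riassumi in una frase che cos'è un modello linguistico."))
```

Since the model is bilingual, the same call works with either an Italian or an English prompt.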

Languages

Italian and English. To ensure high-quality performance in both languages, the dataset was curated to balance linguistic representation and reduce overfitting biases.

Architecture

Auto-regressive language model with a transformer-based causal decoder-only design, built on 28 transformer layers.

Parameters

2 billion parameters.

Specialization

Over 1M human-annotated and synthetic examples for supervised fine-tuning (SFT).

Safety

Over 50K human-generated examples for safety instructions.

Vocabulary

127K tokens.

Data Freshness

The pretraining data has a cutoff between August 2024 and October 2024.

License

Open weight with Apache 2.0 license.

Training Infrastructure

Built from scratch on a dense architecture, it was trained on Italy’s Leonardo supercomputer, hosted by CINECA.

Capabilities

Question Answering
Machine Translation
Information Extraction
Common Sense Reasoning
Text Classification
Textual Entailment
Text Completion
Natural Language Interface
RAG
Summarization
Paraphrasing

Performance Evaluation

An independent evaluation board compared Velvet with other models under 30B parameters built from scratch, using several metrics to assess the model's logical reasoning, problem-solving capabilities, and ability to go beyond statistical correlations.

Italian language
Category       Benchmark                          Velvet-2B
General        MMLU (5-shot)                      39.6
Commonsense    Hellaswag (0-shot)                 54.3
Commonsense    WinoGrande ITA-bench (0-shot)      61.9
Commonsense    PIQA ITA-bench (0-shot)            67.3
Commonsense    SciQ ITA-bench (0-shot, with p.)   86.6
Reasoning      ARC-Challenge (0-shot)             41.7
English language
Category                Benchmark                 Velvet-2B
General                 MMLU (5-shot)             43.4
Instruction Following   IFEval (0-shot)           53.2
Commonsense             Hellaswag (10-shot)       65.0
Commonsense             WinoGrande (0-shot)       60.9
Reasoning               ARC-Challenge (25-shot)   50.6

These benchmarks assess the model's scientific reasoning, its ability to generate plausible, contextually relevant responses grounded in common sense, and its general knowledge across a broad range of subjects, with a focus on accurate and well-informed answers.

Why Velvet-2B

Velvet-2B is designed for efficient fine-tuning on specialized tasks, making it a flexible solution across different use cases. Not all challenges demand the same approach; some scenarios require speed and scalability, while others face constraints related to cost or hardware capacity.

1. High Volumes

Velvet-2B provides a responsive and agile solution for organizations handling vast amounts of data in (near) real time, maintaining an optimal balance between speed and performance.

2. Tighter Scope

Velvet-2B delivers efficient, cost-effective results tailored to the specific needs of smaller organizations with limited computational resources.

3. Narrow Tasks

Because it can be fine-tuned easily on minimal hardware, Velvet-2B is ideal for highly specific tasks with well-defined objectives.
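One common low-hardware route to such fine-tuning is parameter-efficient adaptation with LoRA via the `peft` library; a minimal sketch follows, in which the LoRA rank, the target module names ("q_proj", "v_proj"), and the model id are assumptions to verify against the actual checkpoint:

```python
# Hypothetical LoRA fine-tuning setup via the `peft` library.
# Heavy dependencies are imported lazily; the hyperparameters and the
# target module names are illustrative assumptions, not confirmed values.

def wrap_with_lora(model_id="Almawave/Velvet-2B", rank=8):
    """Load the base model and attach trainable low-rank adapters."""
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model
    model = AutoModelForCausalLM.from_pretrained(model_id)
    config = LoraConfig(
        r=rank,                # low-rank dimension: few trainable parameters
        lora_alpha=2 * rank,   # scaling factor, commonly set to 2x the rank
        target_modules=["q_proj", "v_proj"],  # assumed attention projections
        task_type="CAUSAL_LM",
    )
    # Only the adapter weights train; the 2B-parameter base stays frozen.
    return get_peft_model(model, config)

# model = wrap_with_lora()   # then train with transformers.Trainer or TRL
```

Because only the adapter weights receive gradients, this kind of run fits on a single consumer GPU, which is what makes a 2B model attractive for narrow tasks.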

Discover Velvet AI Models

Check it out on Hugging Face
