Mistral AI has officially launched Mistral Small 3, a latency-optimized, 24-billion-parameter model designed to enhance generative AI capabilities. Released under the Apache 2.0 license, the new model aims to provide a competitive alternative to larger models such as Llama 3.3 70B and Qwen 32B while running more than three times faster on the same hardware.
Key Features and Performance
Mistral Small 3 is engineered to excel in the 80% of generative AI tasks that require robust language understanding and instruction-following capabilities, all while maintaining very low latency. With an impressive 81% accuracy on the MMLU benchmark and a processing speed of 150 tokens per second, Mistral Small 3 stands out as the most efficient model in its category.
The model’s architecture features significantly fewer layers than its competitors, which contributes to its rapid processing times. Mistral AI has released both pre-trained and instruction-tuned checkpoints, providing a solid foundation for developers looking to accelerate their AI projects.
Human Evaluations and Competitive Edge
In a series of evaluations conducted with a third-party vendor, Mistral Small 3 was tested against other models on over 1,000 proprietary coding and generalist prompts. The results showed that Mistral Small 3 performs competitively with open-weight models three times its size, as well as with proprietary models such as GPT-4o mini, across benchmarks covering code, math, and general knowledge.
Use Cases and Applications
Mistral Small 3 is poised to serve a variety of applications across multiple industries. Some of the key use cases identified include:
- Fast-response conversational assistance: Ideal for virtual assistants requiring quick and accurate responses.
- Low-latency function calling: Suitable for automated workflows that demand rapid execution.
- Fine-tuning for subject matter expertise: Can be customized for specific domains such as legal advice, medical diagnostics, and technical support.
- Local inference: Beneficial for organizations handling sensitive information, allowing for private deployment on standard hardware.
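To illustrate the low-latency function-calling use case above, here is a minimal sketch of a tool-calling request payload in the OpenAI-style schema that Mistral's chat API accepts. The `get_weather` tool is a hypothetical example, and the exact field names should be verified against Mistral's current API documentation.

```python
import json

# Hypothetical helper: builds a chat-completions payload that exposes one
# tool to the model. The schema (tools / tool_choice) follows the
# OpenAI-style convention used by Mistral's API; verify against the docs.
def build_function_call_request(user_message: str) -> dict:
    get_weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool name
            "description": "Look up the current weather for a city.",
            "parameters": {  # JSON Schema describing the tool's arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "mistral-small-latest",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [get_weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_function_call_request("What's the weather in Paris?")
print(json.dumps(payload, indent=2))
```

If the model chooses to call the tool, the response contains the function name and JSON arguments for the application to execute, which is where the model's low latency pays off in automated workflows.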
Availability and Collaboration
Mistral Small 3 is now available on la Plateforme as `mistral-small-latest` and `mistral-small-2501`. The model can also be accessed through partnerships with platforms like Hugging Face, Ollama, Kaggle, Together AI, and Fireworks AI. Additional integrations are expected soon on platforms such as NVIDIA NIM, Amazon SageMaker, Groq, Databricks, and Snowflake.
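For readers who want to try the model on la Plateforme, the following sketch builds (but does not send) a chat request for `mistral-small-latest` using only the Python standard library. The endpoint URL and request shape reflect Mistral's chat-completions API as commonly documented; treat them as assumptions to confirm against the official reference.

```python
import json
import os
import urllib.request

# Mistral's chat-completions endpoint (assumed; check the official docs).
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Builds a POST request for Mistral Small 3 without sending it."""
    body = json.dumps({
        "model": "mistral-small-latest",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_request(
    "Summarize the Apache 2.0 license in one sentence.",
    os.environ.get("MISTRAL_API_KEY", "YOUR_API_KEY"),
)
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` returns a JSON body whose `choices[0]["message"]["content"]` field holds the model's reply; a valid API key from la Plateforme is required.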
Looking Ahead
Mistral AI is committed to advancing open-source AI technology, renewing its dedication to the Apache 2.0 license for its general-purpose models. The company plans to continue developing both small and large models with enhanced reasoning capabilities in the coming weeks.
As the open-source community eagerly anticipates the potential of Mistral Small 3, Mistral AI invites developers and enthusiasts to explore the model and contribute to its evolution. With a focus on collaboration and innovation, Mistral Small 3 is set to make a significant impact in the generative AI landscape.