The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.
Nvidia's Llama-3.1 Nemotron Ultra Beats DeepSeek R1 with Fewer Parameters
Nvidia's Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 in benchmarks, showcasing advanced reasoning in a lightweight 253B model.

Originally reported byventurebeat
Nvidia has introduced its latest large language model, the Llama-3.1 Nemotron Ultra, boasting 253 billion parameters and designed for advanced reasoning and AI assistant tasks. Announced on April 7, 2025, this fully open-source model outperforms Meta's DeepSeek R1, despite having less than half its total parameters. The model is now publicly available on Hugging Face, featuring open weights and post-training data.
At the core of the Llama-3.1 Nemotron Ultra model is an architecture optimized for efficient inference, fine-tuned through Neural Architecture Search (NAS). This new design incorporates features like skipped attention layers and compressed feedforward networks, allowing deployment on a single 8x H100 GPU node, while reducing memory usage and computational requirements. The model is compatible with Nvidia's B100 and Hopper microarchitectures and can operate in two modes to handle varying complexity in tasks.
Performance evaluations indicate significant improvements, particularly in reasoning-enabled mode. For example, the model scored 97% on the MATH500 benchmark, up from 80.4% when not enabled for reasoning. Such gains highlight its effectiveness in instruction following and general reasoning tasks, surpassing DeepSeek R1 in numerous areas.
Developers can integrate the model with the Hugging Face Transformers library and customize performance based on specific task needs. With multilingual capabilities, Llama-3.1 Nemotron Ultra supports various applications, including chatbots, code generation, and retrieval-augmented generation.
Released under the Nvidia Open Model License, the model is prepared for commercial use, with guidance on assessing its alignment and safety. Oleksii Kuchaiev from Nvidia expressed excitement about the model's launch, highlighting its innovative design and potential applications in AI development.
#news
ES
Editorial Staff Editor
View all posts
Filter:
No comments yet. Be the first to comment!
Related stories
Anthropic Reverses Course on Fable Safety
#ainews#anthropic#claudefable5#hiddensafeguards#distillation
Anthropic has committed to making its previously covert safeguard, designed to prevent model distillation, as transparent and visible as its other established safety measures. The company has issued a...
2h ago
Deezer's AI now spots fake music across all streaming platforms.
#ainews#deezer#aimusic#musicdetection#streamingplatforms
Recognizing that industry competitors have not adopted its proprietary technology, Deezer is now making its advanced detection capabilities directly accessible to the public. Deezer is launching a new...
6h ago
Opendoor's India Pullout Sparks AI & Outsourcing Rethink
#ainews#opendoor#indiaexit#aiimpact#offshorework
Opendoor, the San Francisco-headquartered online real estate platform, is ceasing its operations in India, less than two years after establishing its presence in the country. This decision has quickly...
10h ago