The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.
Nvidia's Llama-3.1 Nemotron Ultra Beats DeepSeek R1 with Fewer Parameters
Nvidia's Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 in benchmarks, showcasing advanced reasoning in a lightweight 253B model.

Originally reported byventurebeat
Nvidia has introduced its latest large language model, the Llama-3.1 Nemotron Ultra, boasting 253 billion parameters and designed for advanced reasoning and AI assistant tasks. Announced on April 7, 2025, this fully open-source model outperforms Meta's DeepSeek R1, despite having less than half its total parameters. The model is now publicly available on Hugging Face, featuring open weights and post-training data.
At the core of the Llama-3.1 Nemotron Ultra model is an architecture optimized for efficient inference, fine-tuned through Neural Architecture Search (NAS). This new design incorporates features like skipped attention layers and compressed feedforward networks, allowing deployment on a single 8x H100 GPU node, while reducing memory usage and computational requirements. The model is compatible with Nvidia's B100 and Hopper microarchitectures and can operate in two modes to handle varying complexity in tasks.
Performance evaluations indicate significant improvements, particularly in reasoning-enabled mode. For example, the model scored 97% on the MATH500 benchmark, up from 80.4% when not enabled for reasoning. Such gains highlight its effectiveness in instruction following and general reasoning tasks, surpassing DeepSeek R1 in numerous areas.
Developers can integrate the model with the Hugging Face Transformers library and customize performance based on specific task needs. With multilingual capabilities, Llama-3.1 Nemotron Ultra supports various applications, including chatbots, code generation, and retrieval-augmented generation.
Released under the Nvidia Open Model License, the model is prepared for commercial use, with guidance on assessing its alignment and safety. Oleksii Kuchaiev from Nvidia expressed excitement about the model's launch, highlighting its innovative design and potential applications in AI development.
#news
ES
Editorial Staff Editor
View all posts
Filter:
No comments yet. Be the first to comment!
Related stories
ChatGPT Brings Prompt-Powered Presentations to PowerPoint
#ainews#chatgpt#powerpoint#presentations#microsoft
Microsoft PowerPoint has unveiled a new ChatGPT integration, which, much like its predecessors for Excel and Google Sheets, introduces a dedicated sidebar. This feature empowers users to construct or...
9h ago
Trump Delayed AI Executive Order Signing Over 'Disliked Aspects.
#ainews#donaldtrump#executiveorder#airegulation#uschina
According to reports from Politico, former President Donald Trump unexpectedly postponed the signing of an executive order focused on government oversight and access to artificial intelligence on Thur...
10h ago
Microsoft's New AI Chip Challenges Amazon, Google
#ainews#microsoft#maia200#aichips#azure
Microsoft is commencing the deployment of its new Maia 200 chip across its data centers today, marking a significant step in its AI infrastructure development. Today, Microsoft officially unveiled the...
10h ago