NVIDIA has raised the bar with its latest GeForce RTX 5090, posting inference numbers on DeepSeek R1 distill models that significantly outpace AMD's RX 7900 XTX, thanks to its fifth-generation Tensor Cores.
## Making the Most of NVIDIA’s RTX GPUs for DeepSeek’s Reasoning Models
It's quite clear that consumer GPUs are becoming a powerhouse for running complex large language models right on home desktops, and both NVIDIA and AMD are pushing hard to optimize performance for these setups. Not long ago, AMD demonstrated the capabilities of its RDNA 3 flagship GPU with the DeepSeek R1 model. Now NVIDIA has answered, unveiling impressive inference benchmarks for its brand-new RTX Blackwell GPUs. The GeForce RTX 5090, in particular, is stealing the spotlight.
The RTX 5090 doesn't just lead; it commands a substantial advantage over the Radeon RX 7900 XTX. In tests involving DeepSeek R1 Distill Qwen 7B and Distill Llama 8B, the GPU achieved up to 200 tokens per second, nearly double the throughput AMD's RX 7900 XTX managed. This not only underscores NVIDIA's stronghold in AI performance but also points to an exciting future where edge AI becomes a common feature on consumer PCs, driven by the robust "RTX on AI" support.
For those looking to dive into DeepSeek R1 via NVIDIA's RTX GPUs, the process is refreshingly straightforward. NVIDIA has published a blog post explaining the steps, making it about as simple as using any online chatbot. From the announcement:
> “Developers wishing to explore this area can now experiment securely with the massive 671-billion-parameter DeepSeek-R1 model. Available as an NVIDIA NIM microservice preview at build.nvidia.com, it can churn out an impressive 3,872 tokens per second on just a single NVIDIA HGX H200 system.
>
> The anticipated availability of the API as a downloadable NIM microservice is set to be part of the NVIDIA AI Enterprise software package, allowing developers to test and experiment seamlessly. These standard APIs ease deployment, and enterprises can ensure data security by choosing to run the NIM microservice on their preferred accelerated computing setups.”
>
> – NVIDIA
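The NIM preview exposes an OpenAI-compatible chat API. Here is a minimal Python sketch of what a request could look like; the endpoint URL, model identifier, and `NVIDIA_API_KEY` environment variable are assumptions for illustration, so verify the exact values at build.nvidia.com before relying on them.

```python
import json
import os

# Assumed values for the NIM preview's OpenAI-compatible endpoint;
# confirm both at build.nvidia.com before use.
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "deepseek-ai/deepseek-r1"


def build_chat_request(prompt: str, api_key: str) -> tuple[dict, bytes]:
    """Assemble headers and a JSON body for an OpenAI-style chat call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.6,
    }).encode("utf-8")
    return headers, body


if __name__ == "__main__":
    # The request is only sent when an API key is configured.
    key = os.environ.get("NVIDIA_API_KEY")
    headers, body = build_chat_request("Why is the sky blue?", key or "demo")
    if key:
        import urllib.request
        req = urllib.request.Request(NIM_URL, data=body, headers=headers)
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the request format follows the OpenAI chat-completions convention, existing client code can usually be pointed at the NIM endpoint with only the base URL and model name changed.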
With NVIDIA's NIM service, both developers and tech enthusiasts have an easy on-ramp to experimenting with AI models locally. Running inference on your own hardware not only delivers strong performance but also keeps data private, provided your setup meets the requirements.