Expect to be dissatisfied.
Nvidia’s Ada Lovelace architecture introduces a new level of performance at the top of the stack, with the RTX 4090 outperforming the previous generation RTX 3090 Ti by an average of 52% in our rasterization benchmarks and 70% in our ray tracing benchmarks, both at 4K. The 4090 now sits comfortably atop our GPU benchmarks and is one of the best graphics cards on the market, assuming you have deep pockets.
Unfortunately, the step down from the RTX 4090 to the RTX 4080 is rather abrupt, with a 23% decrease in rasterization performance and a 30% decrease in ray tracing performance. The new RTX 4070 Ti is a further 22% slower than the 4080. That leaves the third-string Ada card, with its AD104 GPU, slower than the previous-generation 3090 Ti, despite Nvidia’s claims to the contrary, which are based on benchmarks using DLSS 3’s Frame Generation.
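To see how those percentages stack up, here is a rough sketch that normalizes everything to the RTX 4090. It assumes the 22% gap between the 4080 and 4070 Ti refers to rasterization, and it simply chains the averages quoted above, so treat the output as back-of-the-envelope arithmetic rather than measured results. The variable names are ours.

```python
# Relative 4K rasterization performance, normalized to the RTX 4090 = 1.0.
# All percentages are the averages quoted in the text; chaining them is an
# approximation, not a benchmark result.
rtx_4090 = 1.0
rtx_3090_ti = rtx_4090 / 1.52         # the 4090 is 52% faster than the 3090 Ti
rtx_4080 = rtx_4090 * (1 - 0.23)      # 23% slower than the 4090
rtx_4070_ti = rtx_4080 * (1 - 0.22)   # a further 22% slower than the 4080 (assumed raster)

for name, score in [
    ("RTX 4090", rtx_4090),
    ("RTX 4080", rtx_4080),
    ("RTX 3090 Ti", rtx_3090_ti),
    ("RTX 4070 Ti", rtx_4070_ti),
]:
    print(f"{name:12s} {score:.2f}")
```

Under these assumptions the 4070 Ti lands at roughly 60% of a 4090, a little behind the 3090 Ti at roughly 66%.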
The RTX 4070 Ti has only a 192-bit memory interface, which is perhaps even more alarming. It still has 12GB of GDDR6X memory, and the large L2 cache means that the narrower bus isn’t a deal-breaker, but when we consider future lower-tier RTX 40-series components like the 4060 and 4050, things don’t look so good.
Nvidia recently announced the entire line of RTX 40-series laptop GPUs, ranging from the RTX 4090 mobile, which utilizes the AD103 GPU (basically a mobile 4080), down to the rather puny-sounding RTX 4050. Here are the specifications for the full mobile lineup.
|Graphics Card|RTX 4090 for Laptops|RTX 4080 for Laptops|RTX 4070 for Laptops|RTX 4060 for Laptops|RTX 4050 for Laptops|
|---|---|---|---|---|---|
|Process Technology|TSMC 4N|TSMC 4N|TSMC 4N|TSMC 4N|TSMC 4N|
|Die size (mm^2)|378.6|294.5|?|?|?|
|Ray Tracing “Cores”|76|58|36|24|20|
|Boost Clock (MHz)|1455-2040|1350-2280|1230-2175|1470-2370|1605-2370|
|VRAM Speed (Gbps)|18?|18?|18?|18?|18?|
|VRAM Bus Width (bits)|256|192|128|128|96|
|TFLOPS FP32 (Boost)|28.3-39.7|20.0-33.9|11.3-20.0|9.0-14.6|8.2-12.1|
|TFLOPS FP16 (FP8)|226-318 (453-635)|160-271 (321-542)|91-160 (181-321)|72-116 (145-233)|66-97 (131-194)|
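The FP32 figures in the table can be reproduced from the other rows. The sketch below assumes one ray tracing core per SM and 128 FP32 shaders per SM (which matches the 4070 Ti's 60 SMs and 7680 CUDA cores), with each shader doing two operations per clock via fused multiply-add. The function and constant names are ours, for illustration only.

```python
CORES_PER_SM = 128  # assumption: FP32 shaders per Ada SM (7680 / 60 on the 4070 Ti)

def fp32_tflops(sm_count: int, clock_mhz: float) -> float:
    """Peak FP32 throughput: shaders x 2 ops per clock (FMA) x clock speed."""
    shaders = sm_count * CORES_PER_SM
    return shaders * 2 * clock_mhz * 1e6 / 1e12

# (SM count taken from the RT core row, min boost MHz, max boost MHz)
laptop_gpus = {
    "RTX 4090 for Laptops": (76, 1455, 2040),
    "RTX 4080 for Laptops": (58, 1350, 2280),
    "RTX 4070 for Laptops": (36, 1230, 2175),
    "RTX 4060 for Laptops": (24, 1470, 2370),
    "RTX 4050 for Laptops": (20, 1605, 2370),
}

for name, (sms, low_mhz, high_mhz) in laptop_gpus.items():
    low = fp32_tflops(sms, low_mhz)
    high = fp32_tflops(sms, high_mhz)
    print(f"{name}: {low:.1f}-{high:.1f} TFLOPS")
```

The wide ranges come entirely from the boost clock spread, since laptop makers can configure the same chip for very different power limits.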
The desktop RTX 4070 is likely to utilize the same AD104 as the RTX 4070 Ti, albeit with fewer SMs and shaders. The desktop RTX 4060 Ti may or may not use the AD104 GPU; the only other option would presumably be the AD106 GPU found in the mobile 4070/4060. That presents a problem.
The RTX 3060 Ti from the previous generation featured 8GB of GDDR6 memory on a 256-bit interface. We were dissatisfied with the lack of VRAM, particularly when AMD began shipping the RX 6700 XT (and later the 6750 XT) with 12GB VRAM. Nvidia essentially made a course correction with the RTX 3060 by equipping it with 12GB VRAM, making it a significant improvement over the RTX 2060. Even the RTX 2060 eventually received 12GB models, albeit at prices that made them largely unattractive.
Now we’re looking at the RTX 4060 most likely returning to 8GB, which would be terrible. The number of games that can utilize more than 8GB of VRAM will only increase over the next two years. GDDR6 and GDDR6X memory capacities are capped at 2GB per 32-bit channel, so a 128-bit interface tops out at 8GB, leaving Nvidia with few alternative options.
There is the possibility of implementing “clamshell” mode with two memory chips per channel, one on each side of the PCB, but that would be a mess and we wouldn’t expect it from a mainstream GPU. Clamshell mode would allow the 128-bit interface to support up to 16GB of VRAM, which would be odd given that a higher-tier part such as the 4070 Ti only has 12GB. Still, that sounds superior to an 8GB RTX 4060.
And what about the RTX 4050? Perhaps Nvidia will stick with the 128-bit interface on the AD106 GPU and skip using AD107 on a desktop part, as it did with GA107, which was almost exclusively used for the laptop RTX 3050. However, if it does use AD107 on a desktop card with a 96-bit interface, that card would top out at 6GB of VRAM, though clamshell VRAM would again be an option.
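The capacity limits discussed above follow directly from the bus width. Here is a minimal sketch, assuming the 2GB-per-32-bit-channel cap mentioned earlier and treating clamshell mode as simply doubling the chips per channel. The helper name `max_vram_gb` is our own.

```python
GB_PER_CHIP = 2          # largest GDDR6/GDDR6X chip, per the text
CHANNEL_WIDTH_BITS = 32  # each memory chip sits on a 32-bit channel

def max_vram_gb(bus_width_bits: int, clamshell: bool = False) -> int:
    """Maximum VRAM for a given bus width, optionally with clamshell mode."""
    if bus_width_bits % CHANNEL_WIDTH_BITS != 0:
        raise ValueError("bus width must be a multiple of 32 bits")
    channels = bus_width_bits // CHANNEL_WIDTH_BITS
    chips_per_channel = 2 if clamshell else 1  # clamshell: one chip per PCB side
    return channels * chips_per_channel * GB_PER_CHIP

for width in (192, 128, 96):
    print(f"{width}-bit: {max_vram_gb(width)}GB, "
          f"clamshell {max_vram_gb(width, clamshell=True)}GB")
```

That gives 12GB on 192-bit, 8GB on 128-bit, and 6GB on 96-bit, with clamshell doubling each figure.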
Memory capacities are not the only cause for concern. In the RTX 4070 Ti review, we stated that performance wasn’t terrible but also wasn’t exceptional. It is essentially a cheaper version of the RTX 3090, with half the VRAM and reduced power consumption. The 4070 Ti is equipped with 60 Streaming Multiprocessors (SMs) and 7680 CUDA cores (GPU shaders), which is 25% more than the RTX 3070 Ti’s 48 SMs and 6144 CUDA cores. But AD106 could top out at only 40 SMs, or even 36 SMs, which would put it in similar territory to the RTX 3060 Ti (38 SMs) in terms of core counts, leaving GPU clocks as the only means of performance improvement.
The combination of these two factors, insufficient VRAM and only small increases in GPU shader counts, suggests that performance improvements over the previous Ampere-generation GPUs will likely be modest.
Nvidia will then point to DLSS 3 performance gains, which only apply to a subset of games and do not provide true performance increases, and the situation looks even more dire. A benefit of having a GPU that can run games at 120 fps today is that, as games become more demanding in a few years, it will still be able to run most games at 60 fps. But what happens when those framerates aren’t made up of real frames?
Take a game running at 120 fps thanks to the Frame Generation technology of DLSS 3, with a base performance of 70 fps. As games become more demanding, the base performance will fall below 40 fps and eventually below 30 fps. Even if the monitor is receiving twice as many frame updates per second, Frame Generation with a base framerate below 30 fps will still feel like it’s running below 30 fps.
The same logic applies to higher framerates, so DLSS 3 at 120 fps with a 70 fps base will still feel like 70 fps, even if it appears more fluid to the eye. Most people will be unable to distinguish between input rates of 70 samples per second and 120 samples per second. However, once you fall below 40, even casual players will begin to notice the difference.
In other words, DLSS 3 and Frame Generation are not panaceas. They can help smooth out the visuals and possibly improve the feel of games, but the benefit will not be as noticeable as actual fully rendered frames with new user input taken into account, especially when performance drops below 60 frames per second.
That’s not to say it’s a bad technology; it’s quite clever, and we have no problem with its existence. But Nvidia must stop comparing DLSS 3 scores to non-DLSS 3 results and acting as if they are equivalent. Add 10–20% to the base framerate before Frame Generation, and that’s what a game feels like, not the 60–100% higher fps that benchmarks will indicate.
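That rule of thumb is easy to put into numbers. The sketch below is only an illustration of the heuristic as stated here (a 10–20% bump for feel versus a 60–100% bump on the fps counter); the function names are ours, and none of this is a model of how DLSS 3 actually works.

```python
def felt_fps_range(base_fps: float) -> tuple[float, float]:
    """Rough 'feel' of a game with Frame Generation: base framerate plus 10-20%."""
    return base_fps * 1.10, base_fps * 1.20

def displayed_fps_range(base_fps: float) -> tuple[float, float]:
    """What the fps counter is likely to show: base framerate plus 60-100%."""
    return base_fps * 1.60, base_fps * 2.00

for base in (70, 40, 30):
    felt_low, felt_high = felt_fps_range(base)
    shown_low, shown_high = displayed_fps_range(base)
    print(f"base {base} fps: counter shows {shown_low:.0f}-{shown_high:.0f} fps, "
          f"feels like {felt_low:.0f}-{felt_high:.0f} fps")
```

With a 30 fps base, the counter can read 48 to 60 fps while the game still feels like the mid 30s, which is exactly the gap that benchmark charts hide.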
To return to the topic at hand, future mainstream and budget RTX 40-series GPUs will surpass existing models in terms of raw performance, and they will also support DLSS 3. If the RTX 4060 costs $499 and the RTX 4050 costs $399, they will be minor upgrades compared to the existing cards at those price points. Hopefully, Nvidia will return to prices more in line with the previous generation.