Thank you so much. The help I have received from this forum has really helped me in choosing components for my next computer system, which I anticipate will be running some flavor of Windows 11 and Rhino 7, and then [I suspect in the not too far future] Rhino 8.
It is clear that more than one Nvidia GPU can be installed on the same machine and be recognized.
But, for example, if one were to install two Nvidia RTX A6000 cards on the same computer, would both cards contribute to a Raytracing and/or Rendering computation [each card has 48GB of GPU memory, so there is potentially 96GB of total video memory available to be utilized]?
Scenario # 1: The computer has a single Nvidia RTX A6000 GPU and a model / scene that requires close to or more than 48 GB of GPU memory to load / view in Raytrace mode, and thus one might suspect it will not be "practically manageable", e.g. moving the model / scene results in so much sluggishness / lag that one cannot work with it.
However, on a computer with two Nvidia RTX A6000 cards, that same model / scene, or an even larger one, can be worked with because the GPU memory is now pooled; you have a total of 96 GB of video memory to work with. The workload is distributed across the two GPU cards in a way that optimizes rendering performance / speed. Is this true? Or is the model still confined / constrained to a 48 GB GPU memory space?
Scenario # 2: Given that in a computer with two Nvidia RTX A6000 cards the GPU memory is pooled / shared and the workload optimally distributed between the two cards, will Raytrace Mode rendering speed increase for both static images and Raytrace video production? The model / scene might not be particularly large, but rather, across some fairly broad range of model / scene sizes, the render speed increases, perhaps because the workload is managed more efficiently.
Finally, I read somewhere that a good "rule of thumb" is that your computer should have at least twice as much system RAM as it has GPU memory. Is this advice generally on target? Putting aside cost considerations, is it advantageous to go with DDR5 system RAM?
I'm not up-to-date, but some years ago both cards needed to communicate over SLI to achieve any gain.
But it could be that the current graphics interfaces are able to orchestrate that.
Once you run out of VRAM, resources are allocated in RAM. And if you run out of RAM, your drive is used for this. But really, even if you build in a memory leak on purpose, 48 GB of data is a lot.
I mean, there are always exceptions to common rules. But if you really allocate more than 48 GB of VRAM, then you are doing something wrong, in my opinion…
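To make the spill-over idea concrete, here is a rough sketch (my own illustration, not Rhino's or the driver's actual allocator) of how an over-sized scene would tier across VRAM, system RAM, and the pagefile:

```python
GB = 1024**3

def allocation_tiers(scene_bytes, vram_bytes, ram_bytes):
    """Split a scene-sized allocation across VRAM, then RAM, then disk."""
    in_vram = min(scene_bytes, vram_bytes)
    in_ram = min(scene_bytes - in_vram, ram_bytes)
    on_disk = scene_bytes - in_vram - in_ram
    return in_vram, in_ram, on_disk

# A hypothetical 60 GB scene on a 48 GB A6000 with 128 GB of system RAM:
in_vram, in_ram, on_disk = allocation_tiers(60 * GB, 48 * GB, 128 * GB)
print(in_vram / GB, in_ram / GB, on_disk / GB)  # 48.0 12.0 0.0
```

Each tier is roughly an order of magnitude slower than the one above it, which is why the spill-over is felt so strongly.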
Some thoughts on dual graphics cards, SLI and sharing memory.
As far as I understand it, for raytracing you don't want/need SLI. SLI comes from the gaming world and does not offer benefits in raytracing.
Even if using multiple GPUs for raytracing, the scene has to fit into the memory of each card. Therefore you would have 48GB, or slightly less if you are also using the GPUs to drive your monitor(s).
Here is some good input I found from a staff member of Otoy, makers of Octane:
actually multiple cards together will add up the cores resulting in a linear speedup, unfortunately it is not same for the vram - memory is not compounded in the same manner in gpu renderers. All the textures and hdri among other stuff used for an entire scene has to fit into a functional swap space - that is each GPU must have a copy of all scene elements it needs to process - and therefore this swap space will be limited to the size of that card with the least amount of vram present in your machine. So if you use two 4gb 980 for rendering, the rendering is faster but the memory size used for the rendering is effectively only 4gb.
Also, SLI is mostly used for game applications to be able to work with separate video cards simultaneously to process very high frame rates between the cards and the CPU. Not necessarily for detection of gpus. Octane is rendering with the GPU (not the CPU) and Octane does not need SLI to detect the gpus installed in the machine. We don't recommend SLI for rendering, in fact, Octane will run much better without it.
(source: OTOY Forums • View topic - Two GPU's...only one uses Vram)
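The quoted rule can be sketched in a few lines: with the scene duplicated per card, usable VRAM is that of the smallest card, while CUDA cores (and so, roughly, speed) add up. The example uses the 4 GB GTX 980 from the quote (2048 CUDA cores):

```python
def multi_gpu_capacity(cards):
    """cards: list of (vram_gb, cuda_cores) per installed GPU.

    Each GPU must hold a full copy of the scene, so usable VRAM is
    the minimum across cards; core counts are simply summed.
    """
    usable_vram = min(vram for vram, _ in cards)
    total_cores = sum(cores for _, cores in cards)
    return usable_vram, total_cores

# Two 4 GB GTX 980s, as in the Otoy example above:
vram, cores = multi_gpu_capacity([(4, 2048), (4, 2048)])
print(vram, cores)  # 4 4096
```

So doubling up identical cards roughly doubles throughput but leaves the scene-size ceiling exactly where it was.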
Quadros can "pool" their memory together. Whether your renderer supports doing that, and whether that's advisable for CUDA rendering, is another matter.
Well, you need plenty of RAM because programs normally can't access the VRAM directly (direct access is a newer feature that I think technically first appeared on the PS5); it all passes through RAM first.
What I found when I had 4 x 1080 Tis was that you need enough VIRTUAL memory to be able to "swap out" all that VRAM.
Maybe another note: in the end there are a lot of parameters to consider. And probably nobody really knows. Even benchmarks are not always done correctly, or do not represent real-world scenarios.
It's not just the hardware; as others also mentioned, it's the software, the drivers, the model and hundreds of minor settings.
In the end, you can tweak all these things and figure out if it's really beneficial to have 2 high-end cards. The last major setup I had was one system with 2x P6000 and one system with 2x Titan X to do VR visualization. 2 cards were beneficial, because a VR headset is basically a two-monitor system. While I was pushing the limits in one software, for another software the same setup was complete overkill. And this is because VRAM is just a buffer and the GPU is just doing stupid matrix multiplications. What and how much you compute and allocate is really a matter of the software and the data provided. As a user, you have no chance of tweaking the software, but you have a large influence on how to optimize a model/scene.
E.g. you can bake global illumination into textures, and in a static model this can lead to the same (or even better) visual quality as having real-time ray-tracing enabled. You can tweak the render mesh, you can remove invisible geometry, and so on. It makes a difference whether a gently curved surface consists of 30 or 30,000 faces. Hardly any difference in visual quality, but when it comes to memory allocation, this adds up quickly.
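A back-of-envelope sketch of that last point (the bytes-per-vertex figure below is an assumption for illustration; real render meshes vary by vertex layout):

```python
def mesh_bytes(faces, verts_per_face=3, bytes_per_vertex=32):
    """Rough render-mesh footprint: position + normal + UV per vertex
    is commonly around 32 bytes; triangles have 3 vertices."""
    return faces * verts_per_face * bytes_per_vertex

coarse = mesh_bytes(30)      # a lightly tessellated surface
dense = mesh_bytes(30_000)   # the same surface, over-tessellated
print(dense // coarse)  # 1000
```

One surface is negligible either way; a few thousand over-tessellated surfaces is where the gigabytes come from.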
Of course, you can waste money and resources to buy the best system you can get. But in the end it's questionable whether you really push the limits at all, and if you do, then the issue is likely not the hardware. Usually you are way below or above the performance threshold, so a 20% difference in overall performance has no impact at all.
I agree in many ways with the sentiments you state in both of your posts. Thank you.
Indeed you are spot on in:
(1) Modeling for rendering so as to use resources efficiently: I need to improve with this and am trying to do so. Thus, I am taking the "Rendering with Rhino 7" course. But until I learn to do better I need to implement a force-vs-grace workflow, as indicated, out of necessity and to get the model done.
(2) Indeed there are a lot of parameters to consider. And yes, who knows what will be. It is a "Complex System" problem, a problem-solving and predictive approach that I am a fan of.
(3) I am going to try to address some of your other points when I respond to others help on this thread.
I was not as precise as I should have been, and this is where "…exceptions to common rules…" may occur.
(1) Frequently I may need to visualize 3, 4, or 5 models at the same time that often have many similar properties, but where, e.g., 2 out of the 5 models have quite significantly different properties. And then move them and/or view them in, say, the 4 standard views in Raytrace mode.
(2) I want the render to be as fast as possible
(3) I want to be able to make Raytrace mode render videos.
I believe I had seen this post before but apparently it did not "sink in". Now it has. Thank you. Since fast render speed, especially in Raytrace mode, is a high-priority goal, having the two cards seems to contribute significantly to achieving it. I would imagine video renders would speed up too.
for linking two of their GPUs together would not work. So yes, it seems that currently - and as you state - model size is constrained by the GPU memory of the card; you cannot pool the memory from multiple cards and have it all be available to use.
I do see that your reference is from 2015, but in 2022 I had the impression (though I could easily be wrong) that games do use Raytracing, and I imagine a fast frame rate is advantageous.
So it is puzzling to me, because it would seem that being able to pool all of your memory to do Raytrace rendering as fast as possible [both still image and video - a fast frame rate] and work in a practical way with large models would be something to aspire toward.
Thank you. By Quadros you also mean the latest Nvidia RTX A6000 etc. cards, correct? [It seems Nvidia has changed their naming system, so I just wanted to check.] And it certainly does seem like the memory could be pooled from several cards using Nvidia NVLink, and I guess in the case where the memory can be pooled, this allows for a larger model?
Assuming that the renderer does support the pooled memory, what I do not understand is why using CUDA may or may not be advisable. I believe you may be bringing to my attention a keen distinction I had no clue about before.
Thank you for the tip about the need for enough Virtual memory.
I'm just saying that it's probably ultimately fastest for each card to have its own memory rather than sharing.
Of course all this discussion seems a bit trivial: what are you actually currently using or intending to use for rendering? Why do you think 48GB per GPU may not be enough? Is this for ILM? Almost NO ONE has a 48GB GPU, let alone is using it for raytracing instead of machine learning research or splitting it across dozens of virtual machines. I'm still soldiering on with an 11GB 1080 Ti. I used to have 4 of them (sold the other 3 to upgrade the rest of my system while GPU prices were still berserk!) for rendering with iRay, and that was perfectly adequate for anything I cared to throw at it, which did include 4K animations.
I think that, with Cycles or any other realtime rendering, texture memory is something to watch. Unfortunately, I think that by the time you run an A6000 out of memory, Rhino caching shaders/mappings to disk (SSD) will be a nightmare. I had even considered setting up a RAM drive to see if it would help. There is no reason to do stuff like that on disk while system memory is available. I've asked for no-draw shaders that could be applied to unseen surfaces to speed that process.
I've noticed that when doing 4K renders in Cycles, system memory usage tends to go up quite fast. Memory-wise, I am comfortable doing fairly complicated 4K scenes on my 32GB of system RAM; with a 2X render, I would want 64GB.
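That scaling is easy to see with rough arithmetic: a 2X render doubles both dimensions, so it quadruples the pixel count, and the raytracer keeps several float buffers per pixel (the pass count below is a placeholder for illustration, not Cycles' actual number):

```python
def framebuffer_mb(width, height, passes=4, channels=4, bytes_per_channel=4):
    """Rough size of per-pixel float buffers for a render, in MB."""
    return width * height * passes * channels * bytes_per_channel / 1024**2

uhd = framebuffer_mb(3840, 2160)     # a 4K render
double = framebuffer_mb(7680, 4320)  # the "2X render"
print(double / uhd)  # 4.0 -- four times the buffer memory
```

Which is consistent with the 32GB-comfortable / 64GB-for-2X experience above.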
Simple solution: send me the extra A6000. It's okay, I could get by with one : )
For rendering I am currently using Rhino Render and am quite resistant to incorporating other render software into the workflow. I believe that rendering in Rhino 7 has a tremendous amount of excellent capabilities, and I am working to learn how to take advantage of them. I have every confidence that Rhino 8 will continue this path.
If by ILM you mean the company: Industrial Light & Magic then the answer is no. If ILM stands for something else then please kindly tell me.
I am not sure 48GB will be enough on a 3 monitor setup [I currently have 2 Eizo FlexScan S2133 and plan to add a 3rd if I can] whereby it is very important that one can:
With 3-5 models on screen:
Change to Raytrace mode in, e.g., all 4 standard viewports and have them quickly rendered to a "good quality", then perhaps rotate them some and quickly get a good render of the new view. Or perhaps remove some of the models from view and then bring them back into view quickly and rendered.
Do very high quality static Raytrace images and Raytrace video quickly
Raytrace as I model: For example, set the perspective viewport and front viewport to Raytrace so that each obtains enough passes to get a really nice render. I am modeling in shaded or ghosted etc. mode in another viewport and want to quickly see - or get a good idea of - the effect of the change(s) I just made. Or I make a setting change to a material etc. and want to see the effect quickly.
I have been soldiering on with an Nvidia P4000 (8GB) in a laptop with 64GB of system RAM, and I maxed it out a while ago.
I am also, as best I can, taking into consideration what I might need 6 months to a year from now.
And the video walkthrough of the lighthouse - wonderful. I can remember being fascinated by lighthouse lenses as clearly as if I had been at this lighthouse yesterday.
I guess the general concept is what I think of as "data density", although that is not the proper technical term; when each measurement unit of a model has a data density approaching that of osmium, memory use / need goes up quickly.
I do not technically understand your thinking about why, at the point an A6000 runs out of memory, Rhino's caching of shaders / mappings becomes problematic. Could you please explain this a little, or if you have a link to a reference, that would be great.
For maximum raytracing speed, maximizing the CUDA core count for your budget is what matters, not the quantity of RAM. The stupid-priced cards are rarely the best way to get that.
Your monitor setup is NOT demanding at all, but if you want to maximize performance, don't throttle God's Own Compute GPU by making it also handle the drudgery of drawing the OpenGL windows and Windows fluff. Have dedicated "display" and "compute-only" cards.
And I'm not sure why you want a $5000 video card for either the Rhino renderer (which is not as full-featured as add-in products) or 3 tiny monitors. It's like asking if a semi truck is adequate to pick up a few cans of paint at Home Depot.
With multiple GPUs, without NVLink/SLI/CrossFire, memory is not shared between the devices.
I have an RTX A6000 and an RTX A5000; my A6000 has 48GB of RAM, my A5000 has 24GB. I have both selected as my render device, but I can do "only" ~24GB scenes to fit in memory. With multiple-GPU rendering, data is copied to all render devices.
I agree with your comments. I would like to add, though, that the pure CUDA core count does not necessarily relate to final speed. Usually the faster cards also have more cores, though.
I find that a good measure of real-world raytracing performance is Octane's OctaneBench. You can filter there by single-GPU and multi-GPU setups.
I actually made an Excel sheet somewhere comparing the different GPUs and their CUDA core counts to come up with a score of OctaneBench points per CUDA core, along with other factors like price.
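A minimal version of that spreadsheet idea might look like the following; the two entries are placeholder values for illustration only, not real OctaneBench scores or prices:

```python
def rank_gpus(gpus):
    """gpus: dict of name -> (bench_points, cuda_cores, price_usd).
    Returns rows sorted by points per dollar, best value first."""
    rows = [(name, pts / cores, pts / price)
            for name, (pts, cores, price) in gpus.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

sample = {
    "hypothetical_midrange": (600, 10_000, 1500),
    "hypothetical_highend": (1100, 17_000, 5000),
}
for name, per_core, per_dollar in rank_gpus(sample):
    print(f"{name}: {per_core:.4f} pts/core, {per_dollar:.3f} pts/$")
```

Ranked this way, a cheaper card often wins on points per dollar even when the flagship wins on absolute points.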
I am also curious what exactly OP would need 48GB of GPU memory for.
I have really tried to explain the need for the GPU memory in some of the previous posts. However, maybe, as is sometimes the case - similar to Sherlock Holmes taking note of what did not happen - I have failed to state the central challenge I am attempting to mitigate as much as possible.
As I make changes to a model (be it, e.g., material, view, lighting, hiding / showing a model, etc.), I want to get a good Raytraced visual of the change(s) as quickly as I can, because the time that passes before the new visual emerges is the workflow and creative bottleneck. To put it in very human terms, it is the cumulative wait time that fatigues me.
The amount of VRAM has no, zero, nada impact on speed, as long as the scene fits in it. And frankly, throwing all the GPUs you can at the problem will only do so much until the bottleneck becomes something else in your platform, and it's dubious that even the biggest, baddest server hardware money can buy will actually give you a noticeable benefit. What sort of projects are you even working on?
I absolutely get your need and desire. I have the same desire. I love working with Octane as part of that reason - updates are very quick and results look great even after minimal wait time.
Memory will not be your bottleneck unless you are working on scenes that are gigantic, like realistic city size. Any smaller scene will easily fit into the memory of a high-end gaming card (like the RTX 3080 Ti or RTX 3090). Even if you were to use huge textures like those from V-Ray Scans and the like, I doubt you will hit the memory limit.
Therefore I would rather look into other factors that will increase the speed of your rendering. Those are:
Raytracing software
The settings of said software
The number of CUDA cores available
The speed with which your assets like materials and meshes get uploaded to the GPU memory
Numbers 1 and 2 very much depend on what you like, whether it is available for the platform and host software, etc. We like using V-Ray (not the fastest, but it has some cool options) and Octane (one of the fastest, with a solid feature set). Number 2 can have a huge effect depending on the tradeoffs you are willing to make, etc.
Number 3 depends solely on the type of GPU and the number of cards. I think 2 cards is good; after that, the requirements for the PC go up quite heavily.
Number 4 also largely depends on the kind of system you have, but any good gaming system will be good for that.
So in the end I would save my money and invest in a solid gaming system. It will cost much less than a system with expensive A6000 GPUs and so on, which give minimal, if any, speed advantage. Also, if you always want the fastest speeds possible, I would rather update the PC more often. A good PC in 2 years will be faster than even the highest-end one today.
I hope all that makes sense.
If I were to recommend a ready made system, I would go for something like the HP Envy TE02-0950nz.