Rhino Speed Bottleneck

My company is about to choose the next model of computer to purchase for the computational design group, and I have been asked for my input.

Our biggest factor is that we have very large Rhino models (1-4 GB typically, but sometimes larger) composed of multiple linked and embedded blocks. These files run SLOW. We are simply trying to reduce hang time, which brings me to my questions:

What is the bottleneck in Rhino for large files? When I run commands (in Rhino or in Grasshopper), I’ll often get multi-minute freezes while the process is running. I have tried watching Task Manager to see which resources are maxed out, but I don’t see any red flags. RAM sits comfortably at 18-24 GB out of 32 GB. The CPU only has one, sometimes two, cores in use at 20-40%. The GPU doesn’t appear to be doing anything, and the HDD is also mostly silent.

We do plan on upgrading all components, but which components really make a difference? Currently we are rocking the very old Dell Precision 7730 (Xeon E-2176M, 32 GB RAM, Nvidia Quadro P3200). We are currently looking at the Precision 7680 (i9-13950HX, 64 GB RAM, RTX 3500 Ada). We can consider further upgrading the CPU, GPU, or RAM, but would only want to do so if the upgrade provides an immediate performance difference, not just future-proofing.

Any recommendations on machines are appreciated. Insight into determining what our bottleneck is would be ideal.

Hi, this is likely the issue. Rhino is for the most part single-threaded, so you need to compare CPUs by single-core performance on a benchmark website. Other than that, the more geometry you deal with, the more work your GPU has to do to display it. This can be tweaked by reducing the render-mesh density and by choosing a different display mode. It’s not uncommon that companies doing “computational design” pay little attention to the quality of their models. It’s easy to generate lots of heavy geometry, and that also drains performance. Sure, you can upgrade to the best hardware to compensate, but it might be worth reviewing your files for redundancy and inefficient geometry.
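
As a rough illustration of the display-mode tweak, here is a minimal rhinoscriptsyntax sketch (my assumption is that you would run it from Rhino’s Python editor, and the mode name must exist in your display-mode list) that switches every viewport to Wireframe:

```python
# Minimal sketch: lighten the display load by switching every viewport
# to Wireframe, which skips shaded render meshes entirely.
# Run from Rhino's Python editor (EditPythonScript).
import rhinoscriptsyntax as rs

for view in rs.ViewNames():
    # "Wireframe" must match a display mode name in your Rhino options
    rs.ViewDisplayMode(view, "Wireframe")
```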

It’s also a known fact that in Rhino, blocks are slower to display than discrete geometry. So that may be contributing to the slowness.


OK, so single-core performance is important. That’s really helpful. But why would Rhino only use 15-20% of a single core (so on an 8-core CPU, that looks like 3-5% total utilization)? Why does it not max out the single core?

I am well aware of the heavy-geometry problem in the computational design community. I work to keep my models as light and efficient as possible, but when you are dealing with a 50-story building and hundreds of thousands of parts, there’s only so much you can do.


To my recollection, Rhino is still oriented toward low-budget machines and does not take advantage of high-end hardware – hence only single-thread utilization “for the most part” in the 21st century. :sob:

100% / 6 cores = 16.7%
So that’s one core running full speed.

The CPU swaps around which core it uses to distribute heat, and a lot of work can’t be parallelized: if A has to be calculated before B and C can be determined, it must run serially. But some tasks can be handled in parallel, and Rhino is better at that now than before.

Hope that helps a bit.


Holos is correct: the Xeon E-2176M has 6 cores, so it makes sense to see something around 16% at full load. Just to add to Holos’s second point: as he said, it is simply not possible to parallelize every programming problem; in fact, much code can’t be parallelized at all. Just consider preparing a cup of tea. You’ll never get faster tea by dividing the task into six smaller ones. However, if you also want to prepare some juice, you can do that while your water is boiling, without the need for a second chef. This is called concurrency.
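
To make the tea analogy concrete, here is a small plain-Python sketch (the timings are invented): the tea steps form a dependency chain and must run in order, while the juice is independent and can overlap with the boiling wait:

```python
# Sketch of the tea/juice analogy: the tea steps form a dependency
# chain (boil before brew), so they cannot run in parallel, but an
# independent task (juice) can overlap with the boiling wait.
import threading
import time

def boil_water():
    time.sleep(3.0)   # pretend the kettle takes 3 seconds

def brew_tea():
    time.sleep(2.0)   # steeping can only start after the water boils

def prepare_juice():
    time.sleep(2.5)   # independent of the tea entirely

start = time.time()

kettle = threading.Thread(target=boil_water)
kettle.start()

prepare_juice()       # runs while the water boils: concurrency
kettle.join()         # must wait for the water before brewing
brew_tea()            # serial: depends on boiled water

print("done in %.1f s" % (time.time() - start))  # ~5 s, not 7.5 s
```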


Well, the water might boil faster if divided into six smaller quantities all heated at the same time, but then you are faced with the time it takes to recombine the six into one cup. That’s called “overhead”. Plus, you still have to wait for the tea to brew, which is in itself incompressible.


Poorly formed blocks… not blocks in general.

Blocks that are blocks inside of blocks inside of blocks (etc.) are bad.

These kinds of blocks often come from part libraries and online model banks.

The fix is simple: edit the offending block by double-clicking it, then explode all the blocks to reduce it to discrete geometry, then reblock it once to create a proper (for Rhino) block with only one level of “blockiness”. These will load and perform just fine.

I’ve lost count of how many times I’ve seen people complain about performance and turn out to have a block with 250 levels of internal structure, all blocked, then copied 200 times around their model (window and light fixture vendors are notorious for this). Once it’s edited and reduced to one level of block structure, the model flies again.
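
If you would rather script that clean-up than do it block by block, here is a minimal rhinoscriptsyntax sketch of the same idea (the “_flat” block name is a placeholder, and the explode_nested_instances argument assumes a reasonably recent Rhino):

```python
# Sketch: flatten a deeply nested block to one level of "blockiness".
# Explode everything down to discrete geometry, then re-block once.
import rhinoscriptsyntax as rs

block_id = rs.GetObject("Select a nested block instance", rs.filter.instance)
if block_id:
    point = rs.BlockInstanceInsertPoint(block_id)  # keep its location
    name = rs.BlockInstanceName(block_id)
    # explode all nesting levels down to loose geometry
    loose = rs.ExplodeBlockInstance(block_id, explode_nested_instances=True)
    # re-define as a single-level block; the "_flat" suffix is a placeholder
    new_name = rs.AddBlock(loose, point, name + "_flat", delete_input=True)
    rs.InsertBlock(new_name, point)
```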


I can see individual core performance. It sits at 15-20% on one core while the others are at 0%. I do not see the load jumping around. The only other explanation I can think of is Task Manager misreporting individual CPU core usage.

No. “Content creation” in general is just not easy to parallelize. It’s not a damn video game. The reason to have lots of cores is to have 52 Chrome tabs open while working.


This is my laptop doing a bunch of boolean difference tasks:
[screenshot: Task Manager showing per-core CPU usage]

(It’s in Norwegian so it probably doesn’t make much sense though :wink: )
As you can see, different cores peak at 100%, but total use is 13%.


Well, that Xeon is a 45-watt mobile part from 2018; the whole platform is going to be slow.

You’ll probably be disappointed at how much faster the best workstation you can buy today isn’t at a single-thread task, but…yeah it sucks.


According to UserBenchmark, a modern i9 is about twice as fast as your Xeon. That is considered MUCH faster in the PC world, but in real life it is just faster, IMO. I mean, whether something takes 0.5 or 0.25 seconds to compute doesn’t matter, and if it takes 20 minutes instead of 40 minutes to complete, you still have to wait a long time, even though it is “much” faster. To me, when something is 5x faster it starts to be a game changer, 10x for sure; 2x, not so much, and 20% is not noticeable. Unless you are talking about FPS: there is a big difference between 10 fps and 20 fps, but anything above 60 fps you won’t notice, so 120 fps doesn’t matter in modelling. (In gaming it can mean you get a headshot instead of missing the target, and in VR it can mean you won’t get motion sickness.)

Rhino’s biggest bottleneck is tons of objects.
It is much faster at handling one very complex mesh than 10,000 light ones. I hate low frame rates, so when we get complex IFCs from architects, I run them through a script that meshes and optimizes them so they go from barely handleable (2 fps) to super fast (60 fps), and THAT is a big difference. Hope I didn’t overexplain it, but my point is that you might not get the huge jump in speed that you wish for. Maybe something else can be done to optimize even further.
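
Not the actual script, of course, but here is a rough sketch of the technique it describes (selection filtering is simplified, and FastRenderMesh is my choice of settings): mesh each polysurface and append everything into one mesh so the display pipeline draws a single object:

```python
# Rough sketch of the "many objects -> one mesh" optimization:
# mesh each polysurface with fast render-mesh settings and append
# everything into a single mesh. Run from Rhino's Python editor.
import rhinoscriptsyntax as rs
import scriptcontext as sc
import Rhino

ids = rs.GetObjects("Select polysurfaces to consolidate", rs.filter.polysurface)
if ids:
    big_mesh = Rhino.Geometry.Mesh()
    params = Rhino.Geometry.MeshingParameters.FastRenderMesh  # coarse and fast
    for obj_id in ids:
        brep = rs.coercebrep(obj_id)
        meshes = Rhino.Geometry.Mesh.CreateFromBrep(brep, params)
        if meshes:
            for m in meshes:
                big_mesh.Append(m)
    sc.doc.Objects.AddMesh(big_mesh)  # one object instead of thousands
    rs.DeleteObjects(ids)             # replace the heavy originals
    sc.doc.Views.Redraw()
```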


I don’t like blocks at all; I usually just explode them all and purge them. :grin:

It kind of is a video game though, especially over time as we use CAM simulations and robot simulations more and more, not to mention animations with GH plugins and stuff. :blush:

But simulations are usually linear equations… which literally means that at some point it gets squeezed down to a single thread. Multiple cores have been around almost as long as Rhino; the stuff that was easy to exploit was exploited a long time ago.


It is a pain when, with 36 cores, you always see 2.5% CPU usage at full load. :sweat_smile:


We have issues even with one level of nested blocks (blocks inside a block).
They don’t necessarily have to come from part libraries; they can be created directly in Rhino.
Rhino’s nested blocks very nicely replace an “assembly” – a must-have concept for anyone doing more detailed architectural design than a cottage at a lake. To be fair to McNeel, Rhino’s capability to import a mechanical CAD assembly structure as nested blocks is fantastic.
If only display performance were a bit better when there are many of them inside one .3dm file.


I’m not talking about 250 levels, just one or two… nested blocks are a good way to keep track of certain types of part hierarchies, and to say that anything nested is bad removes that usefulness.

Plus, IIRC there was some info that even non-nested blocks have poorer performance than their discrete geometry equivalents - but maybe this is old/bad info…
