Impact of clusters on performance

In an article, I read that: “Clusters cannot be computed parallel”

I assume this means that if the solver encounters a cluster, it will stop, enter the cluster, run the solver inside it, and then resume execution of the main definition with the results from the cluster. However, as far as I understand it, Grasshopper is single-threaded anyhow. From David’s post Let’s talk about Grasshopper 2.0: “Because we knew from the start that Grasshopper 2.0 needed to be multi-threaded, and you cannot ‘just’ parallelise single-threaded code.”

Take the following definition. I assume that the divisions will be computed one after another, independent of whether they are packaged into a cluster or not.


2024-04-12_computation.gh (4.9 KB)

I.e. this should not be slower, at least to my understanding:


2024-04-12_computation_in_cluster.gh (6.0 KB)

Is there any example where the lack of parallel computation with clusters slows down the solver?

With multi-threaded parallel computation it can be difficult to quantify the performance gain or loss. It often takes extra time and memory to set up the multi-threaded work in the first place, but across many parallel calculations that setup cost can sometimes be won back.

Clusters are not that sophisticated. They are simply a grouping mechanism for sections of the definition.

There are ways to run multi-threaded parallel computation in Grasshopper 1 (a small sketch of rolling your own parallel loop in a script component follows after the list):

  1. Some components are multi-threaded: Rhino - Multi-threaded components
  2. The Hops component can run parallel definitions locally or on a larger set of servers: Rhino - The Hops Component
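For illustration, here is a minimal, hedged sketch of a third option: parallelising work yourself inside a single C# script component with Parallel.For. The input names (curves, count) and the helper method are hypothetical and not taken from the attached definitions; the point is only that each curve is divided independently, so the loop can safely run on several threads.

```csharp
// Hedged sketch: dividing a list of curves in parallel inside a C# script component.
// Paste into the custom code section of a C# component; "curves" and "count" are
// assumed input names, not taken from the attached definitions.
using System.Collections.Generic;
using System.Threading.Tasks;
using Rhino.Geometry;

private List<Point3d>[] DivideInParallel(List<Curve> curves, int count)
{
  var results = new List<Point3d>[curves.Count];

  // Each curve is independent of the others, so the divisions can run on separate threads.
  Parallel.For(0, curves.Count, i =>
  {
    var pts = new List<Point3d>();
    double[] ts = curves[i].DivideByCount(count, true); // parameters at the division points
    if (ts != null)
      foreach (double t in ts)
        pts.Add(curves[i].PointAt(t));
    results[i] = pts;
  });

  return results;
}
```

Whether something like this is actually faster than the plain Divide Curve component depends on how many curves there are and how heavy each division is; for a handful of curves the threading overhead usually eats the gain.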

Hopefully that helps.

Thank you for the details!

To come back to my question:

That means clustering components has no impact on performance, except for an O(1) overhead for calling a cluster. Is that correct?

If that’s not correct, I’d like to see an example where clustering components does have a negative impact on performance.

The impact of clusters is negligible. Clusters can help you structure your definitions, so that you or a potential coworker can maintain them more easily. Other than that, you should not care about computational performance unless it’s obvious or required.

Premature optimization is evil!

If you need to optimize a definition, you either find a better combination of GH components, or you bypass Grasshopper as much as possible by writing your own code. It is helpful to have profiling skills, because many people don’t even know where the bottleneck is and begin to optimize the wrong part.
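As a minimal sketch of what such profiling can look like inside a C# script component (the two “steps” below are dummy stand-ins, not real components from any definition), a Stopwatch is enough to find out which part actually costs time:

```csharp
// Hedged sketch: crude profiling with a Stopwatch inside a C# script component.
// The two "steps" are dummy workloads standing in for whatever your definition does.
using System;
using System.Diagnostics;
using Rhino.Geometry;

var sw = Stopwatch.StartNew();

// Step A: divide a circle many times (stand-in for some heavy upstream work).
var circle = new Circle(Plane.WorldXY, 10.0).ToNurbsCurve();
for (int i = 0; i < 1000; i++)
  circle.DivideByCount(100, true);
long msA = sw.ElapsedMilliseconds;

// Step B: some other work (stand-in for a downstream operation).
double sum = 0;
for (int i = 0; i < 1000000; i++)
  sum += Math.Sqrt(i);
long msB = sw.ElapsedMilliseconds - msA;

// Print() writes to the component's "out" output in GH script components.
Print("Step A: " + msA + " ms, Step B: " + msB + " ms");
```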

Whenever you don’t know how to improve performance, try to create a minimal example and post it here. Optimization topics make for interesting threads in this forum, and often you end up with something faster 😜


Perfectly put, I often have to remind myself that.

The bottleneck navigator in Metahopper is a good starting point.

This article is also good.

But it is not all about fast-executing graphs. If it were all about speed, we wouldn’t be using Grasshopper.

Anyway, here is a simple test. It looks like the difference is negligible.


Cluster Basic Test.gh (18.0 KB)


Thanks, but can this be quantified? negligible = O(1)?

I much prefer Hops for that, for reasons of readability and maintainability, but that’s for another topic.

That’s interesting, thanks! I’d prefer an authoritative answer by someone who knows how it works under the hood. Perhaps clusters are just fancy groups that get exploded before the solver does its job.

(my question is not about optimization)

You cannot apply algorithmic complexity to a whole software component and evaluate its performance that way; nobody knows for sure. O(1) essentially means that you execute some fixed amount of work without iterating over the data again and introducing data-dependent overhead, but it doesn’t quantify how big that overhead is. Data is passed by reference, in contrast to script components, where data is copied for safety reasons. (You can also bypass this when you deal with large sets of data.) But even if you re-iterate over the data, this can be negligible unless you deal with millions of data points. CPUs are really fast nowadays!
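To put “millions of data points” into perspective, here is a minimal sketch (assuming a C# script component; the numbers are illustrative and depend on your machine) of what one extra pass over a large list costs:

```csharp
// Hedged sketch: the cost of one extra pass over a million points.
// Illustrative only; actual timings depend on the machine.
using System.Collections.Generic;
using System.Diagnostics;
using Rhino.Geometry;

var pts = new List<Point3d>();
for (int i = 0; i < 1000000; i++)
  pts.Add(new Point3d(i, 0, 0));

var sw = Stopwatch.StartNew();
double sum = 0;
foreach (Point3d p in pts)      // one full re-iteration over the data
  sum += p.X;
sw.Stop();

Print("One pass over 1,000,000 points: " + sw.ElapsedMilliseconds + " ms");
```

On typical hardware that pass takes on the order of milliseconds, which is why an extra iteration usually only starts to matter once the data or the per-item work gets much heavier.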

Even if clusters did introduce significant overhead, it would still be wrong not to use them, because you would be creating pseudo-optimised code, trading readability for a couple of milliseconds.

Other than that, also consider that the fewer components there are on the canvas, the less the window needs to redraw. If you deal with a couple of thousand components, this can also matter performance-wise.

Really, it’s wrong and very one-dimensional to think about performance like this. If you have a concrete problem, then start investigating…