Cluster runtime very slow

Hello all,

I've seen this raised on other forums, sadly to no avail, so I thought I'd try my luck here.

I have noticed that the runtime of a definition differs significantly if you cluster it. I've investigated this for a few days now and have come to the following conclusions (as useless as they may be!):

  1. A definition becomes significantly slower if you cluster it.

  2. The runtime of the cluster is not reflected by the runtime of the components within the cluster itself. Let me explain: I created a definition and clustered it and, as expected, the cluster was very slow. As it stands, the cluster takes 4.7 seconds to compute. However, if I enter the cluster and evaluate the runtime of each component inside it (using the 'Bottleneck Navigator' from the MetaHopper add-on, which lists all the components on the canvas according to their computation time), the cumulative runtime of all the components within the cluster is only 40 ms! This differs significantly from the cluster's 4.7-second runtime.

  3. Finally, let's say you created a definition and clustered it, and that this cluster draws 1000 points and then connects every four points with a polyline, giving you 250 curves (see the rough sketch after this list). Now assume that one of the inputs to the cluster is a slider that selects one of these curves, so that every time you move the slider the cluster outputs a different curve; essentially, you have a 'List Item' inside the cluster that picks a curve from your list of 250 curves and outputs it. If you have ever been in my position, you already know the issue I'm about to raise: when you move the slider feeding the cluster, rather than simply picking the item you want, the cluster recomputes all of the components within it and only then outputs the selected curve. In other words, the cluster always recomputes itself when you make ANY change to any of its inputs, so if the cluster took 4.7 seconds to compute, it takes another 4.7 seconds whenever any of its inputs change.
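For anyone who wants a concrete picture of the kind of cluster I mean, here is a rough GhPython sketch of its contents. The point coordinates and the slider index are placeholders; the real definition uses native components, but the logic is the same:

```python
# Rough GhPython sketch of the cluster contents described above
# (placeholder geometry; the real cluster uses native components).
import Rhino.Geometry as rg

# 1000 points (arbitrary placeholder coordinates)
points = [rg.Point3d(i, (i * 7) % 13, 0) for i in range(1000)]

# connect every four consecutive points with a polyline -> 250 curves
curves = []
for i in range(0, len(points), 4):
    curves.append(rg.Polyline(points[i:i + 4]).ToPolylineCurve())

# 'List Item': a slider-driven index picks one of the 250 curves
index = 42  # stand-in for the slider input
selected = curves[index]
```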

If you have experienced any of the above and have any solutions or insights into how to resolve this, please let me know! Any help would be greatly appreciated.

**Update:**
I've run a few more tests and I think I've narrowed the problem down to issues 1 and 2 raised above. The cluster I originally created had two inputs, each with its own data. If I internalise the data INSIDE the cluster (so the data is now stored inside the cluster itself and I am no longer feeding it in from outside) and delete the input, the cluster's runtime drops from 4.7 seconds to 50 ms, which is much closer to the actual runtime of the components in the cluster. I don't think the content of the input itself matters much: I internalised each input separately, and regardless of which one I internalised, the runtime dropped drastically.

I'm guessing the issue is with the transfer of data from the canvas into the cluster; perhaps the 'Cluster Input' parameter is what's causing the problem?

Best,
Mohammed

PS: I'm running these tests on Rhino 5. I'm still waiting for my Rhino 6 licence to be emailed to me; once it arrives I'll run the same tests in Rhino 6 and report my findings.


I've encountered this issue before, and it's frustrating not being able to selectively expire components in a cluster, especially when it involves continuously running things like Kangaroo solvers.

One easy workaround is to place a Data Dam directly after the cluster inputs that trigger the expensive operations and set it to a low update frequency; that way only your fast-updating components get recomputed.

delayed_cluster.gh (9.4 KB)
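For anyone unfamiliar with what the Data Dam buys you here: conceptually it acts like a throttled gate that only passes new data downstream at a limited rate, so rapid upstream changes don't trigger the expensive recompute every time. A rough Python analogy of that idea (not the actual component):

```python
# Conceptual analogy of a Data Dam (not the actual Grasshopper component):
# a gate that only lets new data through after a minimum interval, so the
# expensive downstream work is skipped while the input changes rapidly.
import time

class ThrottledGate:
    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval  # seconds between updates
        self._last_pass = None
        self._cached = None

    def push(self, data):
        now = time.time()
        if self._last_pass is None or now - self._last_pass >= self.min_interval:
            self._cached = data      # let the new data through
            self._last_pass = now
        return self._cached          # otherwise keep serving the old data
```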

Hello qythium,

I've tried the Data Dam route, but it didn't solve my problem in any way. If you have a quick read of the update to my post above, I believe the issue is with the transfer of data from outside the cluster to inside it. When I put a Data Dam directly after the 'Cluster Input' within the cluster, I still faced the same problem of the cluster recomputing itself whenever any change was made. I believe that when you change an input going into the cluster (a slider, for example), the cluster seems to re-read the data from ALL of its inputs rather than only from the input that was changed.

Are your input parameters themselves very large or composed of many items? I believe there is copying of params going on at the cluster interfaces, which may cause delays for heavy inputs. Maybe you could provide a sample .gh file?

The input parameters are quite large, yes. However, in my tests, when my two data sets were fed into the cluster as two separate inputs, the cluster took 4.7 seconds; when I 'Entwined' the two data sets outside the cluster so that both sets sit in a single tree, fed them to the cluster as one input, and then 'un-entwined' them inside the cluster, the runtime went down to 40 ms. So in terms of data, both cases are identical; the only difference is that in the first (slower) test the sets were inputted separately, while in the second (faster) test they were inputted as one tree.
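In case it helps anyone reproduce the test, here is a rough GhPython sketch of the 'entwine outside, un-entwine inside' workaround; the two helper functions are just stand-ins for the native Entwine and Explode Tree components:

```python
# Rough GhPython sketch of the workaround described above: pack both data
# sets into a single tree before the cluster, split them again inside it.
# (Stand-in for the native Entwine / Explode Tree components.)
import Grasshopper as gh
from Grasshopper.Kernel.Data import GH_Path

def entwine(list_a, list_b):
    """Pack two lists into one tree with branches {0} and {1}."""
    tree = gh.DataTree[object]()
    tree.AddRange(list_a, GH_Path(0))
    tree.AddRange(list_b, GH_Path(1))
    return tree

def unentwine(tree):
    """Inside the cluster: split the two branches back out."""
    return list(tree.Branch(GH_Path(0))), list(tree.Branch(GH_Path(1)))
```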

I've logged this under RH-44500.

I'm not sure the problem is fixable in Grasshopper 1.0; it will require a lot of performance counting first.


I just committed some changes to the way clusters deal with expiring inputs in the current Rhino 7 WIP code. So hopefully, when the next weekly version is released, clusters will work faster when they can re-use large swaths of the previous solution. And even more hopefully, the testing I've done didn't miss any issues with stale data.


RH-44500 is fixed in the latest WIP


Thank you! This will make clusters much more of a usable feature. I had a design last week where I wound up copying chunks of code, as I did not want to go the cluster route. Next week I will use the new version.