Arithmetic operations on very large data trees

Hi,

I am trying to perform simple arithmetic operations on two very large data trees of equal size (~140 million values each). The process is extremely slow (I waited for up to 4 hours) and uses up all my memory (I only have 16 GB).
For demonstration purposes, the two data trees in the attached Grasshopper file are much smaller (~2 million values), but the operations still take several seconds to complete.

How can I speed up the process of working with very large data trees and use less memory?


large_matrix.gh (10.6 MB)

Interestingly, the calculations ran much faster when the data was internalised (see screenshots). Why is that?


Many thanks for your help!

I guess the best approach would be to do as many calculations as you can outside of Grasshopper, and only bring the results back in… It's an extra step, but it could speed things up (although writing to and reading from disk will probably be quite heavy as well).
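For what it's worth, a minimal sketch of the export step, assuming the flattened tree values arrive in a GHPython component as a flat list of numbers (the `values` list and the file name below are just placeholders, not from the original file):

```python
# Minimal export sketch (assumed setup): dump a flat list of numbers to a raw
# binary file of doubles so the heavy arithmetic can happen outside Grasshopper.
# In a GHPython component, `values` would come from an input; here it is a
# small demo list so the script runs on its own.
import array

values = [0.1 * i for i in range(1000)]  # placeholder for the flattened tree
path = "tree_a.bin"                      # placeholder file name

with open(path, "wb") as f:
    array.array("d", values).tofile(f)   # raw 8-byte doubles, machine byte order
```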

But may I ask how one could ever need that many values? :slight_smile:

Check out Impala for fast math operations

Edit: Quite an improvement :grinning_face_with_smiling_eyes:


My guess from the screenshots: calculations involving ground-level wind speeds, e.g. for pedestrian comfort studies, using the output from a CFD solver plugin such as Butterfly or Swift. The two trees might be different climate or building scenarios sampled on the same mesh of points.

Hi @antoinemaes, I am trying to compute UTCI values for thousands of points and for each hour of the year using the Ladybug UTCI Comfort component…


I see!

I’m surely not the one to tell you what you should do… But doing this for every hour seems a bit overkill, especially when LB is already a bit vague on some data (estimates based on years, different types of entities collecting the data, …)

You may save an enormous amount of computation time and still get similar results (probably within 1% or less) by using only a fraction of the hours per day. (I’m not even running it for all the days in each month…)
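For example, a rough sketch of what that subsampling could look like before the heavy calculation (the step of 3 hours is an arbitrary example, not a recommendation from this thread):

```python
# Rough sketch of subsampling the annual hourly data: keep every 3rd hour
# instead of all 8760 before running the expensive comfort calculation.
hours = list(range(8760))   # stand-in for the full annual hourly data
step = 3                    # arbitrary example step
sampled = hours[::step]     # ~2920 hours instead of 8760

print(len(hours), "->", len(sampled))
```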

Just a thought, entirely up to you :slight_smile:

Edit: @ad.simon, maybe post this on the Ladybug forum to see what the best practice is for this.

Thanks @Konrad! I will try this out.
But my problem is really the amount of RAM that is being used (~60 GB when I run the full version of the script on a 128 GB RAM machine). Will Impala also reduce this memory usage?

You really shouldn’t use Grasshopper at this scale (140M values).

The high memory usage is due to GH’s value boxing and conversion, which is difficult to overcome regardless of add-ons.
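As a rough back-of-the-envelope illustration (using Python objects as a stand-in, not GH's actual internals), boxing each number as its own object costs far more than the raw 8 bytes per double:

```python
# Rough illustration of boxing overhead: 140 million raw doubles need about
# 1.1 GB, but wrapping each value in its own object multiplies that several
# times over (Python floats used here as a stand-in for boxed GH values).
import sys

n = 140 * 10**6

raw_bytes = n * 8                     # 8 bytes per double in a flat array
boxed_bytes = n * sys.getsizeof(1.0)  # each Python float is its own object

print("raw array:    %.1f GB" % (raw_bytes / 1e9))
print("boxed floats: %.1f GB" % (boxed_bytes / 1e9))
# A list or tree of boxed values also needs ~8 bytes per reference on top.
```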


I did test memory usage and it seems to be the same:
[screenshot: memory usage comparison]

When processing so many values, they have to be stored somewhere, I guess. You could try writing them to disk and then processing them sequentially, but I wonder whether the overhead would cause more problems than it solves. One thing I noticed is that the GH canvas gets very unresponsive with this much data, regardless of how fast Impala can chew through the calculations. So GH might just be the wrong tool for the job.
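If you do go the write-to-disk route, one way to keep peak RAM low is to process the two files in chunks, for example with numpy memory-mapped arrays. A minimal sketch, where the file names, dtype and chunk size are assumptions rather than anything from the original definition:

```python
# Sketch of sequential, chunked processing of two large binary files of
# doubles, using numpy memory maps so only one chunk is resident in RAM
# at a time. File names, dtype and chunk size are placeholders.
import numpy as np

a = np.memmap("tree_a.bin", dtype=np.float64, mode="r")
b = np.memmap("tree_b.bin", dtype=np.float64, mode="r")
out = np.memmap("result.bin", dtype=np.float64, mode="w+", shape=a.shape)

chunk = 10_000_000  # ~80 MB of doubles per chunk
for start in range(0, a.shape[0], chunk):
    stop = min(start + chunk, a.shape[0])
    out[start:stop] = a[start:stop] + b[start:stop]  # the "simple arithmetic"

out.flush()
```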

Thank you for your insights @Konrad and @gankeyu!
How would you do this outside Grasshopper? What tools would you use instead?
Cheers

Export the data to some file format, then process it either by programming (Python / VS) or in math software (such as MATLAB / Mathematica).
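To make that concrete, a minimal sketch of the outside-Grasshopper step, assuming the two trees were exported as flat binary files of doubles (file names and the chosen operation are placeholders):

```python
# Minimal external-processing sketch: load the two exported value sets with
# numpy and do the elementwise arithmetic there, then write the result back
# to disk so only that file needs to be read into Grasshopper again.
import numpy as np

a = np.fromfile("tree_a.bin", dtype=np.float64)
b = np.fromfile("tree_b.bin", dtype=np.float64)

result = a * b                 # elementwise arithmetic runs in C, not per item

result.tofile("result.bin")    # bring only this file back into Grasshopper
```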
