Arithmetic operations on very large data trees

Hi,

I am trying to perform simple arithmetic operations on two very large data trees of equal size (~140 million values each). The process is extremely slow (I waited for up to 4 hours) and uses up all my memory (I only have 16 GB).
For demonstration purposes, the two data trees in the attached Grasshopper file are much smaller (~2 million values), but the operations still take several seconds to complete.

How can I speed up the process of working with very large data trees and use less memory?


large_matrix.gh (10.6 MB)

Interestingly, the calculations ran much faster when the data was internalised (see screenshots). Why is that?


Many thanks for your help!

I guess the best approach would be to do as many calculations as you can outside of Grasshopper, and only bring the results back in… It's an extra step, but it could speed things up (although writing to and reading from disk will probably be quite heavy as well).
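For what it's worth, a minimal sketch of the export step, assuming the flattened tree values arrive in a GHPython component as a flat list of numbers (the `values` list and the file name below are just placeholders, not from the original file):

```python
# Minimal export sketch (assumed setup): dump a flat list of numbers to a raw
# binary file of doubles so the heavy arithmetic can happen outside Grasshopper.
# In a GHPython component, `values` would come from an input; here it is a
# small demo list so the script runs on its own.
import array

values = [0.1 * i for i in range(1000)]  # placeholder for the flattened tree
path = "tree_a.bin"                      # placeholder file name

with open(path, "wb") as f:
    array.array("d", values).tofile(f)   # raw 8-byte doubles, machine byte order
```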

But may I ask how one could ever need that many values? :slight_smile:

Check out Impala for fast math operations

Edit: Quite an improvement :grinning_face_with_smiling_eyes:


My guess from the screenshots: calculations involving ground-level wind speeds, e.g. for pedestrian comfort studies, using the output from a CFD solver plugin such as Butterfly or Swift. The two trees might be different climate or building scenarios sampled on the same mesh of points.

Hi @antoinemaes, I am trying to compute UTCI values for thousands of points and for each hour of the year using the Ladybug UTCI Comfort component…


I see!

I’m surely not the one to tell you what you should do… But doing this for every hour seems a bit overkill, especially when LB is already a bit vague on some data (estimates based on years, different types of entities collecting the data, …)

You may save an enormous amount of computation time and still get similar results (probably within 1% or less) by using only a fraction of the hours per day. (I’m not even running it for all the days in each month…)
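For example, a rough sketch of what that subsampling could look like before the heavy calculation (the step of 3 hours is an arbitrary example, not a recommendation from this thread):

```python
# Rough sketch of subsampling the annual hourly data: keep every 3rd hour
# instead of all 8760 before running the expensive comfort calculation.
hours = list(range(8760))   # stand-in for the full annual hourly data
step = 3                    # arbitrary example step
sampled = hours[::step]     # ~2920 hours instead of 8760

print(len(hours), "->", len(sampled))
```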

Just a thought, entirely up to you :slight_smile:

Edit: @ad.simon, maybe post this on the Ladybug forum to see what the best practice is for this.

Thanks @Konrad! I will try this out.
But my problem is really the amount of RAM that is being used (~60 GB when I run the full version of the script on a 128 GB RAM machine). Will Impala also reduce this memory usage?

You really shouldn’t use Grasshopper at this scale (140M values).

The high memory usage is due to GH’s value boxing and conversion, which is difficult to overcome regardless of add-ons.
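As a rough back-of-the-envelope illustration (using Python objects as a stand-in, not GH's actual internals), boxing each number as its own object costs far more than the raw 8 bytes per double:

```python
# Rough illustration of boxing overhead: 140 million raw doubles need about
# 1.1 GB, but wrapping each value in its own object multiplies that several
# times over (Python floats used here as a stand-in for boxed GH values).
import sys

n = 140 * 10**6

raw_bytes = n * 8                     # 8 bytes per double in a flat array
boxed_bytes = n * sys.getsizeof(1.0)  # each Python float is its own object

print("raw array:    %.1f GB" % (raw_bytes / 1e9))
print("boxed floats: %.1f GB" % (boxed_bytes / 1e9))
# A list or tree of boxed values also needs ~8 bytes per reference on top.
```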


I did test memory usage and it seems to be the same:
[screenshot: memory usage comparison]

When processing so many values, they have to be stored somewhere, I guess. You could try writing them to disk and then processing them sequentially, but I wonder whether the overhead would cause more problems than it solves. One thing I noticed is that the GH canvas gets very unresponsive with this much data, regardless of how fast Impala can chew through the calculations. So GH might just be the wrong tool for the job.
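If you do go the write-to-disk route, one way to keep peak RAM low is to process the two files in chunks, for example with numpy memory-mapped arrays. A minimal sketch, where the file names, dtype and chunk size are assumptions rather than anything from the original definition:

```python
# Sketch of sequential, chunked processing of two large binary files of
# doubles, using numpy memory maps so only one chunk is resident in RAM
# at a time. File names, dtype and chunk size are placeholders.
import numpy as np

a = np.memmap("tree_a.bin", dtype=np.float64, mode="r")
b = np.memmap("tree_b.bin", dtype=np.float64, mode="r")
out = np.memmap("result.bin", dtype=np.float64, mode="w+", shape=a.shape)

chunk = 10_000_000  # ~80 MB of doubles per chunk
for start in range(0, a.shape[0], chunk):
    stop = min(start + chunk, a.shape[0])
    out[start:stop] = a[start:stop] + b[start:stop]  # the "simple arithmetic"

out.flush()
```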

Thank you for your insights @Konrad and @gankeyu!
How would you do this outside Grasshopper? What tools would you use instead?
Cheers

Export the data to some file format, then process it either by programming (Python / VS) or in math software (such as MATLAB / Mathematica).
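To make that concrete, a minimal sketch of the outside-Grasshopper step, assuming the two trees were exported as flat binary files of doubles (file names and the chosen operation are placeholders):

```python
# Minimal external-processing sketch: load the two exported value sets with
# numpy and do the elementwise arithmetic there, then write the result back
# to disk so only that file needs to be read into Grasshopper again.
import numpy as np

a = np.fromfile("tree_a.bin", dtype=np.float64)
b = np.fromfile("tree_b.bin", dtype=np.float64)

result = a * b                 # elementwise arithmetic runs in C, not per item

result.tofile("result.bin")    # bring only this file back into Grasshopper
```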
