Enormous "Split tree" component running time

I am working on a definition that reads all the blocks in a file, sorts them into categories, and exposes their attributes.
On a large file, the definition took almost 12 minutes to process.
At first, I thought it was the “Volume” component, but that one only took 8 seconds.
The bottleneck navigator found the culprits: “Split tree” components that I use to sort the blocks (a class of objects output by Elefront components).


How can that be?

EDIT: Actually, it has nothing to do with the block class.
I have other “Split tree” components that deal with the insertion point of the blocks, and they take exactly the same amount of time to process.

So what’s up with these components? Is there a threshold in the number of paths that they can deal with before falling into Reggae mode?

I’ve made another test with a smaller file.
Here I have 4948 branches, which is 12.7 times fewer than the 63089 in the large file.
Scaling the time measured on the small file by that factor, a linear “Split tree” should take about 27 seconds to deal with the 63089 branches, but it actually takes 72 seconds…
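Just to spell out the arithmetic (in Python, with the small-file timing inferred from the 27-second prediction rather than measured separately):

```python
# Back-of-the-envelope check using only the figures quoted above.
small_branches = 4948
large_branches = 63089

ratio = large_branches / float(small_branches)      # ~12.75x more branches
predicted_large_time = 27.0                         # s, if scaling were linear
implied_small_time = predicted_large_time / ratio   # ~2.1 s on the small file
observed_large_time = 72.0                          # s, actually measured

print("branch ratio:       %.1fx" % ratio)
print("implied small time: %.1f s" % implied_small_time)
print("linear prediction:  %.0f s" % predicted_large_time)
print("observed:           %.0f s  (%.1fx worse than linear)"
      % (observed_large_time, observed_large_time / predicted_large_time))
```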

@DavidRutten?

Here is what’s gobbling up all my memory:

The Rhino model is 126 MB
The Rhino instance with that particular model takes up 730 MB
…and with my GH definition that parses the block data, it grows to over 10 GB!!!

I replaced the “Split tree” components with “Branch”, and the definition now takes under 3 minutes to execute (compared to almost 14 minutes).
The “Branch” components are 270 times faster!

That is nuts!
The memory usage, though, spiked even higher, but at least it goes back down afterwards.

I’m just guessing, but there’s a lot of work involved in converting a string into a “Tree mask” with all of the complex wildcards and operators that are supported. If you’re not taking advantage of any of these wildcards, and just retrieving a specific branch, I’m not surprised split tree would be much slower than Tree Branch, which doesn’t have to do any of this sophisticated parsing.

That makes a lot of sense, but I would never have thought it would be so dramatic.
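To see why it can be that dramatic at this scale, here is a toy Python sketch (nothing to do with Grasshopper’s real code; the mask syntax and branch count are just made up to mimic the figures in this thread) comparing a per-branch mask match with a direct branch lookup:

```python
import re
import time

# Toy illustration only: 63089 fake branches keyed by a single-index path,
# and a "mask" that names exactly one branch. The mask-based selection has
# to re-interpret the mask string and test every path; the direct lookup
# just indexes one branch and does no string work at all.
tree = {(i,): list(range(5)) for i in range(63089)}
mask = "{1234}"

def mask_matches(mask, path):
    # Escape the mask, expand a '*' wildcard, build the path's text form,
    # then run a regex match -- repeated for every single branch.
    pattern = re.escape(mask).replace(r"\*", r"\d+")
    text = "{" + ";".join(str(i) for i in path) + "}"
    return re.fullmatch(pattern, text) is not None

t0 = time.perf_counter()
masked = [b for p, b in tree.items() if mask_matches(mask, p)]
t1 = time.perf_counter()

direct = tree[(1234,)]        # direct lookup by path
t2 = time.perf_counter()

print("mask matching over all branches: %.3f s" % (t1 - t0))
print("direct branch lookup:            %.6f s" % (t2 - t1))
```

Even with a mask that names a single branch, the string work is repeated for every one of the ~63000 paths, while the direct lookup touches just one.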

By the way, another bottleneck is sometimes… the bottleneck navigator itself.
When memory is already heavily loaded, it makes Rhino+GH completely unresponsive.
Could this be tweaked somehow?
Maybe by listing just the longest-running components?

Also, is it somehow possible to spot the components generating the biggest RAM footprint?
Sorry if this sounds naïve, but I’m dealing with a 10 GB situation on a base model which is “just” 126 MB, and I’d like to understand why.
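I don’t know of a built-in way to do this, but a crude GhPython sketch along these lines (the wiring idea and the `a` output are just how I would try it, nothing official) could at least show how the Rhino process working set grows after a suspect part of the definition recomputes:

```python
# Crude workaround sketch, not a built-in feature: a GhPython component
# that reports the Rhino process working set each time it recomputes.
# Wire its input to the output of a suspect component so it re-runs right
# after that component; note it measures the whole Rhino process, not a
# single component in isolation.
import System

proc = System.Diagnostics.Process.GetCurrentProcess()
a = "Working set: %.0f MB" % (proc.WorkingSet64 / (1024.0 * 1024.0))
```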