Another Multi-Threaded Solid Difference Topic

Hello,

I have a script that relies on a large dataset of boolean difference operations. Please assume there isn’t a way to avoid this (via other geometry creation/manipulation methods)

I’ve been looking for speedier Boolean operators in Grasshopper for a bit and recently stumbled upon a post where @magicteddy found a performance increase over the standard Solid Difference component by simply calling the Rhino Common equivalent inside a scripting component.

While there is a noticeable performance difference between the two components (in favor of the magicteddy/Rhino Common version) I couldn’t help but remember reading through this post and this post in the past and thinking there is an option to leverage the magicteddy version plus enable parallel computing in the case of the A input being a long list of items.

I’ve tried to take a stab at this myself but I’m stuck on understanding how to loop within a multithreading task or rather how to execute a single operation within the looping tasks.


If I have a list with inputs A as a list of 10 objects and a B list with 150 objects each, I’m assuming I commit a thread per each item in A and then operate on the corresponding B sublist within the thread task.

I can’t seem to get it to work though and currently when I enable Parallel in my component I get an “appended” list of all possible A/B boolean differences per A item instead of the culminated result of all A/B differences merged together. I’m unclear of how to properly append the final R/result list to be single objects matching the length of A.

Here’s the code and a test file:

C#:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Drawing;

using Rhino;
using Rhino.Geometry;

using Grasshopper;
using Grasshopper.Kernel;
using Grasshopper.Kernel.Data;
using Grasshopper.Kernel.Types;

public class Script_Instance : GH_ScriptInstance
{
    int processor_count;
    double tol;

    private void RunScript(System.Collections.Generic.IEnumerable<Rhino.Geometry.Brep> A, System.Collections.Generic.IEnumerable<Rhino.Geometry.Brep> B, bool Parallel, out object R)
    {

        //Display Component Message
        processor_count = System.Environment.ProcessorCount - 1;

        tol = Rhino.RhinoDoc.ActiveDoc.ModelAbsoluteTolerance;

        if (Parallel)
        {
            R = ParallelCreateBooleanDifference(A, B);
            //this.Component.Message = "processor_count + " Threads Available"";
            this.Component.Message = "Parallel";
        }
        else
        {
            R = Brep.CreateBooleanDifference(A, B, tol);
            this.Component.Message = " ";
        }
    }

    private object ParallelCreateBooleanDifference(IEnumerable<Rhino.Geometry.Brep> A, IEnumerable<Rhino.Geometry.Brep> B)
    {
        var result = new List<Rhino.Geometry.Brep>();

        // Use Parallel.ForEach to run in parallel
        System.Threading.Tasks.Parallel.ForEach(A, a =>
        {
            foreach (var b in B)
            {
                // Compute the difference for each pair of A and B
                var diff = Brep.CreateBooleanDifference(a, b, tol);
                lock (result)
                {
                    result.AddRange(diff);
                }
            }
        });

        return result;
    }

    public bool Parallel;
}

Test File:

20230907_Solid_Difference_Speed_Test_01b.gh (18.3 KB)

Model Space:

Graph Space:

Speeds:

I’ve read that there is overhead in parallel computation and that it’s not always faster depending on the setup. Currently, I just want to get the logic working to even be able to compare that case in this test file.

Thank you all for your help!

2 Likes

Hi @michaelvollrath ,
The approach in my previous C# from 2018 I spread branches over threads by using DataTrees. I think the same technique can apply here. In the attached definition, I’ve modified it to use the List<Brep> overloads for the firstSet input of Brep.CreateBooleanDifference() instead of a single Brep.
20230907_Solid_Difference_Speed_Test_01b_BW.gh (23.9 KB)

  private void RunScript(DataTree<Brep> S, DataTree<Brep> D, ref object R)
  {
    double tolerance = doc.ModelAbsoluteTolerance;

    var mainBrepsMT = new System.Collections.Concurrent.ConcurrentDictionary<GH_Path,List<Rhino.Geometry.Brep>> ();

    int totalMaxConcurrancy = System.Environment.ProcessorCount - 1;
    this.Component.Message = totalMaxConcurrancy + " threads";

    //start of the parallel engine
    System.Threading.Tasks.Parallel.ForEach(S.Paths, new System.Threading.Tasks.ParallelOptions
      { MaxDegreeOfParallelism = totalMaxConcurrancy},
      pth => {
      mainBrepsMT[pth] = Brep.CreateBooleanDifference(S.Branch(pth), D.Branch(pth), tolerance).ToList();
      });
    //end of the parallel engine

    //convert dictionaries to regular old data trees
    DataTree<Rhino.Geometry.Brep> mainBreps = new DataTree<Rhino.Geometry.Brep>();

    foreach(KeyValuePair<GH_Path,List<Rhino.Geometry.Brep>> p in mainBrepsMT)
    {
      mainBreps.AddRange(p.Value, p.Key);
    }

    R = mainBreps;

  }

The previous approach would return any problematic cutting Breps, however I haven’t tested this one to see if the whole thing fails with them. If it does, I could see allowing for a toggle to choose which method to use.

-Brian

4 Likes

Thank you so much @Brian_Washburn! I’ll test it out and report back!

Finally got to actually test this.

Drum roll please…

The winner is @Brian_Washburn with his 2023 remake of the classic!

But seriously, thank you!

I’m going to test with varying branch depths/structures etc but so far this appears to be the best option and i’m going to attempt to extrapolate the logic to “Solid Intersection” and “Solid Union” as well.

Are there any further optimizations you are aware of that could be implemented into the code?

I really appreciate it!

4 Likes

There may be something of interest in this thread:

1 Like

Brian this was awesome, thanks :index_pointing_at_the_viewer:

Hi @Brian_Washburn,
I just wanted to follow up on this and share the components I was able to create by adapting your code for the other boolean operations.

I also added basic warning handling to mimic the native components.

Attached in the GH are the native versions for reference/comparison:
-Solid Difference
-Solid Intersection
-Solid Union

R7 C# versions:
-Solid Difference (Multi-Threaded)
-Solid Intersection (Multi-Threaded)
-Solid Union (Multi-Threaded)

R8 Beta C# versions:
-Solid Difference (Multi-Threaded)
-Solid Intersection (Multi-Threaded)
-Solid Union (Multi-Threaded)

Thank you again, about to read through your other topic you shared.

Cheers!

Graph Space:

20230923_Multi-Threaded_Boolean_Components_01a.gh (22.1 KB)

6 Likes

Thanks Michael - I sometimes make perforated object to 3D print, so anything that can speed up SDiff interests me. Here are images comparing your version and stock:

1

#2 looks impressive - except there is no output. :sleepy:

The result is supposed to llok like this:

pic

Here’s the GH file with internalized geometry:

comparison.gh (3.4 MB)

I don’t do C coding because I can’t tolerate the syntax, but maybe you can figure out what happened to the output.

2 Likes

Hmm… well the lack of output explains the speed increase! Haha

This did happen to me at one point and I got it working but I forget what caused the missing output.

It may have been a null item in the input list but I’d have to get hands on and look.

The code basically just attempts to run the native SDiff from Rhino with parallel processing but doesn’t have a lot of error handling built in.

Also it’s mostly Brian Washburns brain child/work I just hacked some things together.

That being said I’ll take a look when I get a chance and see if anything stands out with your file

Try grafting GeoA so there is correspondence between the branches of the inputs.
image

-Brian

1 Like

Amazing - that worked.

Grafting never would have occurred to me because GeoA is a single Closed Brep. Obviously there’s something going on that I do not understand. Why would a Grafted list of 1 object be treated differently from the object itself? And is there a place where this is explained?

1 Like

If you peak into the parallel computing bit of code you’ll see that there is logic for evaluating branches and I imagine this is the main reason as the code wants branch structures to match to run them simultaneously.

That’s just my guess. I’m sure Brian can explain why

Thanks - that sounds reasonable. I 'll probably not inspect the code because C syntax makes me loose all patience (I can handle VB just fine), and I am not aware of any of Rhino’s internal links anyway. I just want the darn thing to work, and if that means setting the option to Graft makes that happen I’m perfectly happy to do that. The speed improvement is quite impressive for sure.

Interestingly, in testing the file you shared, having more cores makes an almost negligible difference in this example:

Wow - what CPU has 19 cores?

I just found another example of how this technique greatly improves results:

1

2

That’s for 27 closed Breps being merged into 1. This is going to be a great time saver for me when I make models comprised of multiple closed Breps - something I do quite often.

PS: About a hundred years ago we tried linking 7 small IBM mainframes together. The idea was to split FEM and Fluid Dynamics programs across multiple CPUs. What we found was that doing that actually slowed down overall execution times, apparently because of the additional processing needed to keep everything synchronized. So we split the 7 into groups of 3 & 4 and got much better results.

PS: I’m hoping Mcneel people will incorporate this technique in a future release. Like maybe the next one??

1 Like

GH2 is being rewritten ?entirely? to be multi-threaded. Take your same example file into GH2 and see if it makes a big difference or not.

I deal with a lot of boolean operations too in complex building models/millwork etc. that I work on which is what lead me down the path of trying to speed this all up. I was using Sasquatch plugin for the longest time but was pleasantly surprised that Brian’s solution ended up being faster!

Given that you can actually send data out of GH1 and into GH2 while having both actively running in seperate windows. It might be worth generating all your forms/cutter objects in GH1, passing into GH2, booleaning them there, then passing the data back out into GH1.

I haven’t recently tested GH2 boolean components but when it first came out I used them for 100s/1000s of wooden studs in a framing solver algorithm I was working on and it just chewed through it, it was awesome.

Long story short, check out if there’s an opportunity to pass through GH2 for further speed increase and thankfully GH2 is natively supporting multi-threading.

Yes, you’re right, one of the issues is that sometimes the overhead of “spinning up” multiple cores or timing them takes more computation than it’s worth but in particularly long lists such as 1000s of objects, the overhead upfront will long pay for itself in overall performance.

Processor 12th Gen Intel(R) Core™ i9-12900H, 2500 Mhz, 14 Core(s), 20 Logical Processor(s)

I’m running this CPU in my main build, it 's a 14 core and has 20 logical processors, the script we are using in this forum thread has a code line to use all possible cores - 1, to save a core (hence the 19) for the program in general/other tasks on your computer without locking your CPU at 100% while it solves the booleans (I think… I don’t know much about computer processes really)

1 Like

The saga continues…

Turns out the SUnionMT (as well as standard SUnion) behave in strange ways. At least with this geometry which is comprised of 27 Closed Breps. They either do not produce a valid Brep or a single one. But all 3 options are able to Bake their results that Rhino can export as a valid STL file.

Truth is that’s all that matters to me - but there is clearly something wonky going on. The STL is printing now; I’ll post a pic of the print sometime tomorrow. (It’s a 17 hour print.)

test2.gh (11.1 MB)

What’s the result if you bake the 27 closed breps in Rhino and then Union them with Rhino itself and not GH?

The answer is…Rhino creates something called an open polysurface - whatever that is.

1

The only thing I typically use Rhino for is to export it’s contents as an STL file. I have used it to create geometry from scratch only a very few times, and all of these were for very simple shapes. Before I started doing 3D printing about 6 years ado I reviewed what was then all the 3D design software I could find. I settled on GH because of it’s parametric nature - which i use a lot - and have just stuck with that ever since.

1 Like

I haven’t had a chance to look at this file but I suspect it doesn’t like something about your geometry you are joining.

For example if you have two cubes that are perfectly aligned at the corner edges, it will join them but return an Open Brep or invalid geometry sometimes. I suspect something similar may be happening here with the joining of the faces or perhaps something is not consistently oriented