Computing the containment relation for disjoint closed curves

Suppose we’d like to work with the containment relation of a set K of 2D curves with the following properties:

A. All curves are closed
B. All curves are mutually disjoint

This means we’d like to:

  1. Verify that K has properties A and B (so we can throw an error if it doesn’t)
  2. Create a data structure that lets us take curves p and q in K and answer:
    i) Does p contain q?
    ii) Is p the parent of q? (p contains q, and no other curve in p contains q)
    iii) Is p a top level curve? (p has no parent)

Example 1

Suppose K looks like the diagram on the left. We can answer these questions using its containment graph:

  • p contains q if there is a path up from p to q
  • p is the parent of q if this path is length 1
  • p is top level if it is a root node

Example 2 (Use case)

Here we have a complicated design that’ll be CNC-manufactured. It has multiple panels with interior details, all represented by closed, mutually disjoint curves:

Before manufacturing, we’d like to nest them with @Petras_Vestartas’s OpenNest:

In order to properly align the results, we must group each internal detail with the top-level curve that contains it. We can save a lot of time here if we can answer questions 2.i and 2.iii automatically.

Solutions so far:


Property A fails if the output of the “Closed” operation has any false values:


Property B fails if the “Multiple Curves” operation finds any intersections:


We can use “Point In Curve” to create a containment matrix; call it M.

M has a 1 at column i, row j exactly when curve j contains curve i, and zeros elsewhere:

This algorithm requires O(n^2) point-in-curve tests. For Example 2, this is 462^2 = 213,444 tests. Not great! Another limitation: M does not represent the containment graph as drawn in Example 1, but rather its transitive closure. This means it lets us answer 2.i and 2.iii fairly easily, but it contains a lot of redundant data that makes 2.ii harder to answer.
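As a sketch outside Grasshopper, the M-building step might look like the following in pure Python, with a simple even-odd ray cast standing in for the “Point In Curve” component (the polyline representation and helper names here are mine, not Grasshopper’s):

```python
# Pure-Python stand-in for "Point In Curve": curves are closed polylines
# given as lists of (x, y) vertices, and containment is tested with the
# usual even-odd ray cast.

def point_in_polygon(pt, poly):
    """Even-odd rule: count crossings of a ray shot to the right of pt."""
    x, y = pt
    inside = False
    n = len(poly)
    for k in range(n):
        (x1, y1), (x2, y2) = poly[k], poly[(k + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the ray's horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

def containment_matrix(curves):
    """M[j][i] == 1 iff curve j contains curve i (1 at column i, row j).

    Because the curves are mutually disjoint, testing a single vertex of
    curve i against curve j decides containment of the whole curve."""
    n = len(curves)
    M = [[0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if i != j and point_in_polygon(curves[i][0], curves[j]):
                M[j][i] = 1
    return M

def square(c, r):
    """Axis-aligned square centred at (c, c) with half-width r."""
    return [(c - r, c - r), (c + r, c - r), (c + r, c + r), (c - r, c + r)]

# Three nested squares: 0 contains 1 and 2, and 1 contains 2.
curves = [square(0, 10), square(0, 5), square(0, 2)]
M = containment_matrix(curves)
```

Note the shortcut in the comment: disjointness means one vertex test per pair suffices, which is already much cheaper than a general region-containment check.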

Ideas for accelerating this step include:

  1. Pre-compute bounding boxes and use Point-Rectangle intersection as a cheap pre-test
    • “Point In Curve” may already do this
  2. Split curves (or bounding boxes) into a quad tree, B-tree, or grid
    • This can probably give us O(n log(n)) time
    • Not clear how to do this in a very “Grasshopperish” way
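Idea 1 above is easy to prototype. A hedged sketch (plain Python, hypothetical helper names): curve j can only contain curve i if the bounding box of i lies inside the bounding box of j, and that check is just four comparisons.

```python
# Bounding boxes as a cheap pre-test: reject most of the O(n^2) pairs
# before any point-in-curve test runs.

def bbox(poly):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polyline."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return (min(xs), min(ys), max(xs), max(ys))

def bbox_contains(a, b):
    """True if box a contains box b."""
    return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]

def candidate_pairs(curves):
    """Pairs (j, i) that survive the pre-test; only these still need the
    expensive point-in-curve test."""
    boxes = [bbox(c) for c in curves]
    return [(j, i)
            for j in range(len(curves))
            for i in range(len(curves))
            if i != j and bbox_contains(boxes[j], boxes[i])]
```

This still enumerates all pairs, so it only lowers the constant; the quad-tree/grid idea is what would bring the asymptotics down.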


Suppose that K has n curves and let

N' = M - M^2 - M^3 - \dots - M^{n-1}
N_{ij} = \max(0, N'_{ij})

Then N_{ij} = 1 exactly when curve j is the parent of curve i, and is zero otherwise. This works because M^k_{ij} counts the paths of length k between nodes i and j in the graph represented by M.

In Grasshopper, we can implement this matrix wrangling with some custom Python code.
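Outside Grasshopper, that wrangling might look like this minimal pure-Python sketch (M as a list of lists of 0/1 ints; `mat_mul` is a naive O(n^3) product, so the whole thing has the O(n^4) cost noted below; the helper names are mine):

```python
def mat_mul(A, B):
    """Naive O(n^3) product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def parent_matrix(M):
    """N' = M - M^2 - ... - M^(n-1), then clip negatives at zero.

    Subtracting the higher powers removes every containment that is
    witnessed by an intermediate curve, leaving only direct parents."""
    n = len(M)
    N = [row[:] for row in M]
    P = M
    for _ in range(max(0, n - 2)):      # accumulate powers M^2 .. M^(n-1)
        P = mat_mul(P, M)
        for a in range(n):
            for b in range(n):
                N[a][b] -= P[a][b]
    return [[max(0, v) for v in row] for row in N]
```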

This algorithm requires around O(n^4) multiplications. Here, Example 2 makes Grasshopper crash under an avalanche of approximately 45,558,341,136 multiplications.

Luckily, M is sparse, and this algorithm can be simplified:

  1. Use an adjacency list instead of an adjacency matrix
  2. Perform topological sort instead of matrix multiplication

This should have running time O(n + m), where m is the number of 1s in M.
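One way to realize this, as a sketch (a variant of the topological-sort idea rather than a literal implementation of it, assuming the transitive closure is already available as adjacency lists; function names are mine): because the curves are disjoint, the containers of any curve form a chain, so the parent of curve i is simply its deepest container.

```python
# Sparse shortcut: the container that itself has the most containers is
# the deepest one, hence the direct parent. With adjacency lists this is
# O(n + m) overall.

def parents_from_closure(contains):
    """contains[j] = set of curves that curve j (transitively) contains.
    Returns parent[i] = direct parent of curve i, or None if top level."""
    n = len(contains)
    contained_by = [set() for _ in range(n)]
    for j in range(n):
        for i in contains[j]:
            contained_by[i].add(j)
    depth = [len(s) for s in contained_by]   # number of containers = depth
    parent = [None] * n
    for i in range(n):
        if contained_by[i]:
            parent[i] = max(contained_by[i], key=depth.__getitem__)
    return parent
```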


This is conceptually easy: curve i is top level if and only if column i of M is all zero (no curve contains it). Computing this efficiently goes back to bounding boxes and divide-and-conquer.
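A one-function sketch of that test, following the convention from the matrix definition above (a 1 at column i, row j when curve j contains curve i):

```python
def top_level(M):
    """top_level(M)[i] is True iff no curve contains curve i, i.e.
    column i of the containment matrix M is all zero."""
    n = len(M)
    return [all(M[j][i] == 0 for j in range(n)) for i in range(n)]
```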


  1. Is there any Grasshopper component that already does some or all of this stuff more efficiently than my prototype?

  2. Is there any “grasshopperish” way to implement divide and conquer algorithms like quad trees?

  3. What are some other ways to accelerate these algorithms?


See this:

A better approach replicating ‘Curves in Curves’ - Grasshopper - McNeel Forum

BTW: See this as well (if you plan to use Rectangles/Boxes as a “first” fast filtering approach). (131.0 KB)

Thanks Peter! It looks like this trail has been blazed by fine minds already!

Takeaways from that thread:

  • R-trees can defeat the O(n^2) bound. (Rhino.Geometry.RTree, wiki, paper, Radovan’s code, Qythium’s code)
  • Impala provides components for faster curve-curve intersection and point-curve containment (and it’s open source!)
  • Offsets/Minkowski sums could help cluster the inside curves

A couple of things:

  1. A speedy clustering approach is not a bad thing … but if the core of the matter (the containment test) is slow … you are in a big rabbit hole.

For instance, imagine 300 polylines (NOT curves) in a nested topology schema. The stupid way to test containment is to treat each one as a Curve and then do the RegionContainment test: this (with NO first-pass filter active) yields 600 milliseconds (an eternity).

But … if you use a smarter way to test the containment, then you get 1/30 of the previous time (also with NO first-pass filter active). In this case an RTree doesn’t cut the mustard against a better containment test … because the cost of Rectangle ccx events (just as a prefilter) is not a nothing thing.

Moral: for speed, it is pointless to rev your engine to the max if 1st gear is selected.
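To make that “first gear” point concrete, here is a hedged pure-Python comparison (plain ray casting; the helper names are mine, and RegionContainment is far more general than this): a conservative test checks every vertex of the inner polyline, while for curves known to be mutually disjoint a single vertex settles it.

```python
def point_in_polygon(pt, poly):
    """Even-odd ray cast: is pt inside the closed polyline poly?"""
    x, y = pt
    inside = False
    n = len(poly)
    for k in range(n):
        (x1, y1), (x2, y2) = poly[k], poly[(k + 1) % n]
        if (y1 > y) != (y2 > y):
            if x1 + (y - y1) * (x2 - x1) / (y2 - y1) > x:
                inside = not inside
    return inside

def contains_slow(outer, inner):
    """Tests every vertex of inner: O(len(inner) * len(outer))."""
    return all(point_in_polygon(v, outer) for v in inner)

def contains_fast(outer, inner):
    """Disjointness guarantees one vertex decides it: O(len(outer))."""
    return point_in_polygon(inner[0], outer)
```

Both give the same answer on disjoint input; the fast one simply skips work the geometry makes unnecessary.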

  2. Other than that you need HAC clustering for the general case (polys contain polys that contain polys that … etc etc). Compare the containment trees as flat or nested:

  1. For HAC you need just the flat containment tree (the left one shown above) and a proper recursion. No Minkowski or anything else.

  2. For nested/complex (GroupBy/Where “combos” etc etc) queries on HAC clustering you’ll need the approach exposed in the boxes C# previously attached (the custom class, that is).

Deep math!
Sorry to barge in…
If the number of curves is really high, wouldn’t it be useful to find the external outlines by some sort of visual approach?
I mean, like a screenshot of the view: where the bitmap has no black pixels (curves), it draws lines to split the whole area into subregions.
Then for each subregion you apply your methods.
But this could (should?) greatly decrease calculation times… no?

I mean, even with 1 million nested curves, and even on an 800x600 display, I could instantly tell where and which the groups are.
I’m looking at this pic:

Even on this small thumbnail ^ you can see the groups… because no black pixels (curves) connect one curve to another.

Again, sorry to barge in; I quit school at 19, so my math skills are incomparable to what you guys are discussing…

Edit: I mean, if with 1 million curves it takes 1 hour, then it is faster to do it manually, “by sight”…
So why not teach (program) the PC to do it by itself?
Like, script the mouse cursor (I’m joking, almost) to actually select the groups of curves on screen and group them (Rhino command)… the script would work without even knowing how many curves there are! :laughing:
This would not be the final solution, but maybe integrate it at a “low resolution” to do the first step/subdivision…

Er … hmm … for 2.5K nested polys it takes 666 milliseconds (no tricks) or ~66 (all tricks on), meaning that 1M “could” (hope dies last) take 266,100 milliseconds > 4.3 minutes, which is way more than the sum of the time advantage that Lewis achieved over the 2nd-place man in all of his races in all of his 5 world titles (and counting).

I was referring to Qythium’s comment in the other thread you linked, where he talked about hours.
Also, does it scale linearly?
I thought that, for your kind of algorithms, computation time would scale exponentially with the number of curves…

Well … I’m working on a secret (WIP) equation for elapsed-time estimations. It works when it works, mind.