I was wondering if anyone could help with this problem. I have a set of planar curves (all rectangles but I imagine the same process would work with any non concave shape) where most of them overlap with at least one other curve. Is there a process where you can input these curves and it will return the largest set of curves where non of them overlap.
I’ve made a basic attempt but I imagine the optimal way to do it will involve an iterative process. MaxOverlappingCull.gh (23.9 KB)
That’s a classic Cluster Analysis puzzle (where Clustering criterion has to do with Curve/Curve ccx events OR containment OR some “similar” other within the given collection (List or Tree)).
If there’s nested criterion/containment situations you’ll need HAC Clustering (Google that). H stands for Hierarchical as opposed to Flat Clustering.
The result is a Tree where each branch contains the related (i.e. Clustered) Curves … thus if you are after “single” Curves … you sample all where a given branch has just one item - all that for Flat Clustering. Or you can query the Tree with any other way (like find all Clusters with N items or items > N and < M or items with K ccx events or whatever … etc etc).
Note: If the Curves are a lot you’ll need a variety of ways to slash elapsed time.For instance if the Curves are Polylines and you are after containment Clustering … there’s ways to get 10++ times faster results compared with the classic Rhino Curve/Curve containment Method.
If you are familiar with C# (medium to advanced) notify for a simple (not HAC - that’s strictly internal) demo.
Thanks for the reply Peter, am I right in thinking then that it’s a problem beyond the capabilities of native GH components? I’m decent at Python but have only dabbled in C# I’m afraid.
I’m not sure how once the tree structure is formed would I go about finding the curves that if I cull I’m left with the maximum number of “independent” (ie non-overlapping) curves. Do you think that would be a simple query or need a proper optimization?
Well … I would suggest to wait a bit in case that a Python Samaritan could provide an answer (I guess non HAC).
If not I’ll post a Flat Clustering C# demo with a dual user defined criterion (ccx or containment for a start). But this means that you must have NON nested Curves (where nested means Curve contains Curve that contains Curve … blah, blah). For ccx is simpler (provited that the containment part is not in question). Note: I can’t post high performance Methods (these are strictly internal) … but if you have up to 500 Curves it would be “OK” (kinda).
With regard queries beyond finding singular items … you’ll need a custom Class to deal with more complex stuff (and then doing LINQ and the likes).
It is most unlikely that you can deal with similar (or “similar”) matters without code.
BTW: Given the opportunity … with randomly suffled collections … here’s an indication upon what HAC means (and how to represent the nesting via indices as path dimensions that also explain what contains what - singles are items with no childs):
I slightly shrink the rectangles first to avoid intersections between nearbours, I’m happy for them to be adjacent as long as the "shape"doesn’t overlap.
OK, that’s a very easy ccx Clusering case (and it could work “relatively” fast up to 1K curves (hope dies last)).
Moral: If no Python Samaritan on sight (within this day) … the salvation could be yours via C# (BUT using standard Rhino Methods and not the 20 ms thing shown above).
Update:
Half day past, no Python Samaritan on sight. Here’s a preview with 4 modes related with the next big thing:
LOL mode (for the brave) : get all shapes in contact, forget classic ccx :
For the ccx clusters, is there a way to cull curves until there are no ccx (contact is fine) in that cluster, in a way that maximizes the number of curves in that cluster.
I’ll add a 5th option soon (I’m out of the practice right now for a couple of hours).
I.e.
for each cluster (done with mode 1) try to find the max number of items in contact by removing this (or that) … blah, blah.
Or … for each cluster (done with mode 4) remove items that yield ccx events with items not contained in the cluster.
But the best way for “expanding” the query repertoire by a huge margin is a custom Class (as we do in real-life) : for each Item monitor his "“neighbors” (with regard ccx) and rate the ccx as well (events.Count == 1 means contact, events.Count > 1 means intersection - all that if R6,7,8,9 … continues to do ccx things that way). An IndexPair List where the first item is the index of the neighbor and the other the kind of ccx comes in mind. Then do a custom LINQ thingy and get any imaginable result. But … the ugly thing with such an approach is that you don’t know how to write the LINQ part (life sucks) … meaning that you can’t infinitely customize the C# to suit any future query task.
Absolutely anything is possible … but as I stated above … well … replacing the V1C with some non EntryLevel other (I have more than 100 C#s related with Clustering) that could deal with any imaginable search … means writing the right query (is just a couple of lines) and adding it as another option.
You can search for items that have 5 contacts, 23 “near miss” ones, are blue (LOL), have 66 vertices, 45.56 segments and their area is more than 6M square meters (bigger is better). BTW: If such a query yields null … blame Karma, what else?
The last example I gave is probably the most valuable to me, being able to input a “mess” of ccx curves and culling the minimum amount required to remove all (non-contact) ccx events
This creates 4 things that tell you what does ccx with what (using as always indices - from the item indices in the List): One Tree related with Contacts, one for Intersections (but no Contacts), one for classic Ccx (including Contacts) and … er … hmm … a Matrix (we all live in a Matrix you know - as the Prophetic Brothers stated rather clearly in the trilogy).
We can “expand” the conn Tree (the one that holds the Clustering output) by sampling all the indices - per Path - in a new subPath (path.AppendElement(0)) and all the contents of the HashSet in another subPath (path.AppendElement(1)). This yields a conn Tree with 3 dimensions.
Obviously by counting the beans we can locate the max combo (the N of the items in the cluster branches - where the 3rd dimension is 0 as above)
OK, this is not by the book and is a bit amateurish … but … well … can cut the mustard at least with that rule in mind.
As an exercise you can write a little P thingy that does the above. Why? well … if you have plans to become a pro you’ll discover that nothing in real-life can be achieved without code.
Had suspicions that matrices would be the best way to do it. Hadn’t seen your response, will definitely investigate that option at some point.
For the time being managed to get a slightly clunky python script working on clusters of intersecting curves, using the output of the Multiple Curves intersecting native component. Definitely not optimum but does the job for now.
P is not my game (and if you are after real-life AEC matters … er … shoudn’t be yours as well).
Get V1D that includes some help C#'s related with the add-on rules of yours plus a WIP thingy related with Classes, Matrices, The Matrix, verbalism and other freaky stuff.
Haven’t found time to finish the Matrix thing with some query examples due to bad things that happen in the practice (and other bad things related with a dead battery - again - in the middle of nowhere [or is the alternator/rectifier kaput ?]).