Hi everyone, I've attached a screenshot of a script which currently takes almost 5 hours to remove duplicates. I know that with such a large amount of data it will take a while, but I wondered if anyone had an idea of how to make it faster. Thanks, Bran
Did you try to avoid the duplicate lines from the beginning?
How many lines?
If you really expect someone to help you, please post a Rhino file with the lines or a Grasshopper file with the relevant inputs internalized…
128K points from the image, and 4.8 hours? No thanks. When I have to do this (very rarely), I use Cull Duplicates (points) using the MidPt (Curve Middle) of each line.
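If you'd rather do it in a script component, here's a minimal GHPython sketch of that midpoint idea (the names `lines` and `tol` are just placeholders, and I'm assuming a flat list of Rhino.Geometry.Line inputs). Keep in mind that rounding midpoints into cells can miss near-duplicates that straddle a cell boundary; the Cull Duplicates component is more forgiving there.

```python
# Sketch only: keep the first line seen for each rounded midpoint "cell".
def cull_by_midpoint(lines, tol=0.001):
    seen, out = set(), []
    for ln in lines:
        m = ln.PointAt(0.5)                                # midpoint of the line
        key = (round(m.X / tol), round(m.Y / tol), round(m.Z / tol))
        if key not in seen:                                # first occurrence wins
            seen.add(key)
            out.append(ln)
    return out

a = cull_by_midpoint(lines, 0.004)   # set lookups keep this roughly linear
```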
You haven't given us a definition of what you mean by equality or any examples.
Is a line (segment?) a duplicate if it's close but not quite, or are the duplicates exact down to the last decimal place? If the second (fuzzy duplicates), how do you decide which one to delete?
In any event, Grasshopper probably isn't the best tool for this unless you just write a one-stage script component.
Suggestion:
Come up with a definition of equality. Phrase it as a >, <, equals comparison. If it were a test for point equality, pseudocode:
If pt1.x > pt2.x then return greater
if pt1.x < pt2.x then return lesser
if pt1.y > pt2.y then return greater
if pt1.y < pt2.y then return lesser
if pt1.z > pt2.z then return greater
if pt1.z < pt2.z then return lesser
return equal (because they're exactly the same)
Sort the lines based on that definition, using an existing library (Python, C#, etc.).
Duplicates will now be adjacent in the list, and there may be a series of more than two in a row. Walk through the sorted list and delete the items you don't want, making any related changes along the way (such as merging fuzzy duplicates when one of the pair connects properly to another element at one end and the other connects properly at the opposite end).
If it's just lines or segments, I'll bet you can get the duplicate processing down to seconds on reasonable hardware. I don't know what you need to do in Rhino after you identify the duplicates, but this problem should be a cup of coffee tops, not a workday.
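For what it's worth, a minimal Python sketch of that sort-then-sweep idea might look like this (assuming a list of Rhino.Geometry.Line and that duplicates share both endpoints to within some tolerance; `lines` and `tol` are placeholders):

```python
# Sketch only: build a sort key from the rounded, order-normalized endpoints,
# sort, then walk the list once and keep each key's first occurrence.
def dedupe_lines(lines, tol=1e-6):
    def key(ln):
        a, b = ln.From, ln.To
        pts = sorted([(round(a.X / tol), round(a.Y / tol), round(a.Z / tol)),
                      (round(b.X / tol), round(b.Y / tol), round(b.Z / tol))])
        return pts[0] + pts[1]          # 6-tuple compares x, then y, then z, per endpoint

    out, prev = [], None
    for ln in sorted(lines, key=key):   # O(n log n); duplicates end up adjacent
        k = key(ln)
        if k != prev:
            out.append(ln)
        prev = k
    return out
```

A hash set keyed on the same 6-tuple would skip the sort entirely, but the sorted version makes the "walk through and merge" step above easy to add.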
Difficult to see anything without a definition, but a flattened list at that scale is always going to be slow.
I would probably try to maintain local Voronoi clusters within their own branches and check for duplicates locally.
But as another user pointed out, the best use of time would be to study whether you can avoid creating duplicate lines in the first place. The answer is almost always yes.
Alternatively, rather than "removing duplicate lines" you could do some work with indices and list/cull per Voronoi cluster as needed. So you just choose not to list all possible lines (if a repeatable pattern is observable, that is…).
An example would be if you had 10,000 cubes returned as a brep wireframe: you could list edges 1, 3, and 5 and just ignore the others. It's easier to "populate" lists of data going forward than to subtract/compare looking backward (usually).
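Something along these lines, as a hypothetical GHPython sketch (`boxes` and the edge indices are placeholders for whatever pattern your geometry actually repeats):

```python
# Sketch only: pull a chosen subset of edge indices from each brep instead of
# generating every edge and culling duplicates afterwards.
wanted = (1, 3, 5)                                    # whichever indices your pattern needs
edges = []
for brep in boxes:
    for i in wanted:
        edges.append(brep.Edges[i].DuplicateCurve())  # standalone copy of that edge
a = edges
```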
I would reduce your Voronoi count to a manageable number like 200 for testing, focus on an algorithm or workflow that does not create duplicate lines, and then scale up from there…
P.S. With the midpoints hidden, one short red line (mistakenly deleted) shows up when you bump the "Tolerance" slider from 0.004 to 0.005.
P.P.S. This cone is only one unit in height (the default), so it makes sense that the tolerance is small?
At this point it seems a bit strange that you create Voronoi cells and then continue using proximity links…
Once you have the lines inside and outside your brep, what happens next?