Cull Both Duplicates

Is there any way to erase both duplicates (when two points are duplicates) from a point list?

Just like:
newpointlist = Rhino.Geometry.Point3d.CullDuplicates(points,tol)

but where both duplicate points are deleted, not just one? (E.g. from the points [A, A, B] I would like to get back [B], not [A, B].)

Thanks
ZweiP

I’m sure there is a way. But there isn’t anything in RhinoCommon that will help.

If you want a real hack, you can try this:

(assuming you have an existing point list called pt_list with duplicates)

import rhinoscriptsyntax as rs

rs.EnableRedraw(False)
# lock everything that is currently visible so SelDupAll only sees our points
temp_lock = rs.NormalObjects(True, True)
rs.LockObjects(temp_lock)
rs.AddPoints(pt_list)
# let Rhino find all duplicate point objects (originals included)
rs.Command("_SelDupAll")
rs.DeleteObjects(rs.SelectedObjects())
# the point objects that survive are the unique ones
rs.Command("_SelPt")
unique_pts = rs.SelectedObjects()
culled_pt_list = [rs.PointCoordinates(pt) for pt in unique_pts]
rs.DeleteObjects(unique_pts)
rs.UnlockObjects(temp_lock)
rs.EnableRedraw(True)

There is perhaps a less hacky way I can think of (see the sketch after the list):

  1. Get your point list and sort it with rs.SortPoints()
  2. Start at index 0 and loop, comparing the point at index 0 to the points at index 1, 2, 3…
  3. If a duplicate within tolerance is found, add its index to a list of duplicate indices (including index 0)
  4. When the first non-duplicate is found, break out of the loop and remove the points at the indices found
  5. Continue through the entire list in the same way.
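
A minimal sketch of that approach in Python, assuming rhinoscriptsyntax and a user-supplied tolerance (the helper name cull_all_duplicates is mine, not an existing function). Note that after sorting, points within tolerance of each other are usually, but not always, adjacent:

import rhinoscriptsyntax as rs

def cull_all_duplicates(pts, tol):
    # sort so that duplicates end up next to each other
    pts = rs.SortPoints(pts)
    result = []
    i = 0
    while i < len(pts):
        # walk forward while the following points are within tolerance of pts[i]
        j = i + 1
        while j < len(pts) and rs.Distance(pts[i], pts[j]) < tol:
            j += 1
        # keep the point only if the run has length 1, otherwise skip the whole run
        if j == i + 1:
            result.append(pts[i])
        i = j
    return result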

–Mitch

@Helvetosaur @dale

Thank you for your reply. My algorithm to find the duplicate points already works like your less hacky way.

from operator import itemgetter

def removeDuplicates(points, tol=1e-12):
    # dictionary to map each (rounded) point back to its original index
    pointDict = {}
    ptList = []

    i = 0
    for pt in points:
        pt3d = (round(pt.X, 12), round(pt.Y, 12), round(pt.Z, 12))
        pointDict[pt3d] = i
        i = i + 1
        ptList.append(pt3d)

    # sort so that duplicates end up next to each other
    ptList = sorted(ptList, key=itemgetter(0, 1, 2))

    ptLast = ptList[-1]

    # walk backwards and delete both members of each duplicate pair
    # (note: ptLast is not reset after a deletion, so runs of three or
    # more coincident points are not handled correctly)
    for i in range(len(ptList) - 2, -1, -1):
        if abs(ptList[i][0] - ptLast[0]) < tol and abs(ptList[i][1] - ptLast[1]) < tol and abs(ptList[i][2] - ptLast[2]) < tol:
            del ptList[i]  # delete the current point...
            del ptList[i]  # ...and the duplicate that followed it
        else:
            ptLast = ptList[i]

    # find the ids of the surviving points with the dictionary
    outputList = []
    for pt in ptList:
        ptId = pointDict[pt]
        outputList.append(ptId)

    return outputList

But the problem is that sometimes the points are not perfect duplicates (only equal within tolerance), so the key (round(pt.X,12), round(pt.Y,12), round(pt.Z,12)) is not reliable enough to find the duplicates.
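
As a tiny illustration with made-up one-dimensional values: two coordinates can be closer than the tolerance and still round to different keys, so the dictionary lookup misses them.

a = 0.4999999999994
b = 0.4999999999996
print(abs(a - b) < 1e-12)            # True: within tolerance of each other
print(round(a, 12) == round(b, 12))  # False: they land in different rounding buckets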

I thought there might be a better way in RhinoCommon to select all the duplicates, because Rhino itself can find them so fast. Before the sorting algorithm I messed around with distance calculations between every pair of points, but that took ages (1 million points).

Can I improve the algorithm above? :slight_smile:

Thanks

Looks very similar to the one I wrote in the CullDuplicatePoints question …

I would recommend using the EpsilonEquals method from RhinoCommon.

In C# it would be:

Point3d[] points; // your input points
List<Point3d> unique = new List<Point3d>();

for (int i = 0; i < points.Length; ++i)
{
    Point3d p1 = points[i];
    bool isUnique = true;
    for (int j = i + 1; j < points.Length; ++j)
    {
        Point3d p2 = points[j];
        if (p1.EpsilonEquals(p2, RhinoMath.ZeroTolerance)) // use any tolerance here
        {
            isUnique = false;
            break;
        }
    }
    if (isUnique) unique.Add(p1); // add only after all later points have been checked
}
// now the list contains unique points within the tolerance used
// (only later points are compared, so the last point of each duplicate
// group survives; compare against all j != i to drop every member)
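
For reference, here is the same EpsilonEquals idea sketched in Python, adapted so that every member of a duplicate group is dropped instead of the last one surviving (the helper name cull_all_duplicates is mine; it assumes points is a sequence of Rhino.Geometry.Point3d). Like the C# version it is O(n²), so it will crawl on very large lists:

import Rhino

def cull_all_duplicates(points, tol=Rhino.RhinoMath.ZeroTolerance):
    result = []
    for i, p in enumerate(points):
        # keep a point only if no OTHER point matches it within tolerance
        has_twin = any(i != j and p.EpsilonEquals(q, tol)
                       for j, q in enumerate(points))
        if not has_twin:
            result.append(p)
    return result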

I guess mine is going to be pretty slow as well. For 10K points it only took about a second, but for 100K points it took around 90 seconds. I think I won’t bother with a million…

–Mitch

@menno
Thanks, I will try that. I hope this will be faster and more accurate.

@Miguel
Yeah, your algorithm is great; I just changed the sorting part so that X, Y, and Z are all sorted now.

@Helvetosaur
You are right, for 10K or even 100K points it’s totally fine for me too.

OK, found where one of the major slowdowns was. I thought I would recreate the point list by adding only the unique points, but that is horribly slow, so now I am just deleting the duplicates from the original list; it can do 100K points in about 8 seconds. Running my 1 megapoint exercise now, but that seems to be hanging on the point-sorting part (using rs.SortPoints). Will let you know when (if) it’s done… :stuck_out_tongue:

Oh, and one more thing I am doing: I check for all duplicates at a given point, so if there are more than two it should find them all. Also, as the original point is used for the checking, there is no “tolerance buildup” as you might get when points that are merely within tolerance of each other are checked and eliminated in sequence.
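
A tiny illustration of that buildup with made-up one-dimensional coordinates and tol = 0.001: b is within tolerance of both a and c, but a and c are not duplicates of each other, so eliminating points in sequence could wrongly chain all three together.

tol = 0.001
a, b, c = 0.0000, 0.0008, 0.0016
print(abs(a - b) < tol)  # True
print(abs(b - c) < tol)  # True
print(abs(a - c) < tol)  # False: a and c are not within tolerance of each other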

–Mitch