Issue with PointCloudKNeighbors in Python Grasshopper

Hello everyone,
I’m kind of new to Python scripting in Grasshopper and I’m having an issue with the “PointCloudKNeighbors” component.

What I’m trying to do is generate a point cloud that is divided into clusters of different sizes.
So I populated a geometry with as many points as the sum of the cluster sizes, and I’m trying to run a Python loop that picks a point, finds the n closest points to it (n being the size of the first cluster), then picks point n+1 and repeats with the next cluster size. I’m using PointCloudKNeighbors for this, but as soon as I change the Amount input, the output list becomes unreadable and I cannot use its elements.
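For what it’s worth, the loop described above can be sketched outside Grasshopper with a brute-force nearest-neighbour search (plain Python, no ghpythonlib; `sequential_knn_clusters` is a made-up name, not a Rhino API). One guess about the “unreadable list”: when several points and counts go in, the component output may come back as a data tree with one branch per needle point rather than a flat list, which is worth checking.

```python
import math

def sequential_knn_clusters(points, sizes):
    """Greedily split `points` into clusters of the given sizes:
    take the first unused point as a seed, grab the (size - 1)
    nearest unused points along with it, repeat with the next size."""
    remaining = list(range(len(points)))
    clusters = []
    for size in sizes:
        seed = points[remaining[0]]
        # sort the still-unassigned indices by distance to the seed
        remaining.sort(key=lambda i: math.dist(points[i], seed))
        clusters.append([points[i] for i in remaining[:size]])
        remaining = remaining[size:]
    return clusters

# tiny demo: 10 points on a line, clusters of 3, 3 and 4
pts = [(float(x), 0.0) for x in range(10)]
groups = sequential_knn_clusters(pts, [3, 3, 4])
```

This is O(n² log n) brute force, which is fine for a few hundred points; inside Grasshopper the per-cluster neighbour query would be done by PointCloudKNeighbors instead.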

Does anyone know what the issue is here and how to solve it? (GH attached)

Thanks a lot !! (13.4 KB)

Python is not my game (and I don’t like that thing at all) … but if your goal is some “version/variation” of KMeans, see attached (an entry-level take on classic KMeans: no // (parallel) stuff and the likes).

It’s C# but the logic is(?) clear(?). (142.6 KB)


That’s a neat piece of code! Thanks!!
However, my overall goal is to create clusters of different sizes and then clusters within those clusters, and I can’t find a way to do that with KMeans…

This would be an example:
Populate given geometry with 50 points, 3 clusters of different sizes
cluster 1: 8 points divided in 2 clusters
cluster 2: 12 points divided in 6 clusters
cluster 3: 30 points divided in 3 clusters

So in the Python script I (almost) finished, there would be a step afterwards where I would cluster each branch again, but that’s not a problem.
This is too specific I guess… :sweat_smile:

Subclusters within clusters is elementary (following the classic KMeans approach, i.e. a fixed N of clusters): just get the clustered points within a given cluster and give them a second (or Nth) spin: a case for proper recursion (if the clustering depth is worth the name).
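That “second spin” can be sketched as a small recursive wrapper in plain Python (illustrative names only, not the C# from the attachment; the `halve` stand-in is a dummy clustering step, and any real KMeans call would go in its place):

```python
def cluster_recursively(points, cluster_fn, depth):
    """Apply `cluster_fn` to the points, then give each resulting
    cluster another spin until the requested depth is reached."""
    groups = cluster_fn(points)
    if depth <= 1:
        return groups
    return [cluster_recursively(g, cluster_fn, depth - 1) if len(g) > 1 else g
            for g in groups]

# stand-in clustering step: split a sorted list in half
# (swap a real KMeans call in here)
def halve(points):
    pts = sorted(points)
    mid = len(pts) // 2
    return [pts[:mid], pts[mid:]]

tree = cluster_recursively(list(range(8)), halve, 3)
```

The result is a nested list, which maps naturally onto a Grasshopper data tree (one path per leaf cluster).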

That said: classic hard flat KMeans yields populations depending on topology, meaning that the population per cluster is rather unpredictable (only the N of clusters matters). See the hard flat KMeans Method in the V1 above for the “standard” logic used. On the other hand, if the pList doesn’t have some restrictive meaning you can cheat as well (take x items from a given cluster and leave the rest in peace … etc etc).
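For reference, the “standard” hard flat KMeans logic (Lloyd’s iteration: assign each point to its nearest center, recompute centers, repeat) can be sketched in a few lines of plain Python. This is an assumption about what the C# Method does, not a transcription of it, and it also shows why the per-cluster populations are unpredictable: the assignment step alone decides them.

```python
import math
import random

def kmeans(points, k, iters=25, seed=1):
    """Minimal hard flat KMeans (Lloyd's iteration)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # hard assignment: each point goes to its nearest center
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # move each center to the mean of its cluster (keep it if empty)
        centers = [tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return clusters

# demo: two well-separated blobs end up in two clusters of 3
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
       (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
clusters = kmeans(pts, 2)
```

The 3/3 split here falls out of the geometry, not the code; skew the input densities and the populations skew with them.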

Your goal (i.e. clustering on a predefined N of items per cluster) requires a different take (i.e. a special break condition): stop sampling points (in a given cluster) when the population on hand is > a given number (meaning that your initial collection may not be fully consumed) AND/OR finish the cluster and “release” the not-taken points for the next clusters etc etc. This opens the door for a bastard kind of flat soft clustering, mind.
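A minimal sketch of that break condition, assuming greedy per-seed assignment (the names and the seed/cap interface are made up for illustration): each seed takes at most its cap of the nearest still-unassigned points, a full cluster releases the rest to later seeds, and whatever no seed takes stays unconsumed.

```python
import math

def capped_clusters(points, seeds, caps):
    """Greedy capacity-limited clustering: for each seed in turn,
    take at most `cap` of the nearest still-unassigned points;
    points a full cluster cannot take are released to later seeds."""
    remaining = list(points)
    clusters = []
    for seed, cap in zip(seeds, caps):
        remaining.sort(key=lambda p: math.dist(p, seed))
        clusters.append(remaining[:cap])
        remaining = remaining[cap:]
    return clusters, remaining  # leftovers are the unconsumed part

# demo: 10 points on a line, two seeds at the ends, cap of 4 each
pts = [(float(x),) for x in range(10)]
clusters, leftover = capped_clusters(pts, [(0.0,), (9.0,)], [4, 4])
```

The order of the seeds matters here (earlier seeds get first pick), which is exactly the “soft”/unfair flavour the post warns about.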

1M Q: are you in fact after HAC clustering?

On the other hand, see this (using rnd demo points with local density spots): what could the solution be in such a case, according to your goal? (184.6 KB)

Update: added a sort option and a 3rd demo case that begs for a fuzzy CMeans (Google that one) clustering (194.8 KB)


Thank you so much Peter, this was super helpful! I used the cheating method you mentioned and it’s good enough for now :wink:
Also, your code is awesome. I’m still trying to understand the different clustering logics, so I’ll get back to you with questions.

That’s why I provided the sort (to cluster center) option: you get the most connected nodes, so to speak.

BTW: if you are in the broad AEC market sector, cheating is the Holy Grail of things.