 # Cluster points: kangaroo, galapagos, python, space syntax

I’ve experimented a little and managed to come up with a similar result than your above sketch.

Each cluster begins at a start point (closest point to each black point) and has a certain desired size. At each iteration the search radius is expanded by a step size and thus new closest points are found that are then added to the clusters. If a cluster has reached a desired size, or doesn’t find any new, closest points within a threshold, it stops expanding.

This has the unfortunate byproduct that some points (grey) don’t get allotted to a cluster, because all neighbouring clusters have reached their maximum size.
However, the script allows you to add these points to their closest, neighbouring cluster, if desired! This obviously changes the cluster size, surpassing the desired maximum size from before for the concerned clusters, so it’s optional.

``````import Rhino.Geometry as rg
from ghpythonlib import treehelpers

# Get the diagonal length of the points' bounding box
bbox = rg.BoundingBox(Points)
bbox_dlen = bbox.Diagonal.Length

# Get the start points from the points
start_pts = [Points[i] for i in StartIndices]

# Temporary point cloud keep track of unchecked points
temp_ptcloud = rg.PointCloud(Points)
# Temporary indices keep track of unchecked points inital indices
temp_indices = [i for i in range(len(Points))]

curr_step = StepSize # current step size
count = 0 # iteration counter

# Intialise an empty cluster list for each start point
clusters = [[] for _ in start_pts]
active = [True for _ in start_pts]

while temp_ptcloud.Count > 0 and curr_step < bbox_dlen:

for i in xrange(len(start_pts)):

# Skip inactive clusters
if not active[i]:
continue

# Get the current closest point indices
rc = rg.RTree.PointCloudClosestPoints(temp_ptcloud,
[start_pts[i]],
curr_step)
closest = list(list(rc))
if len(closest) < 1: # no current, closest points
continue

# Get current cluster points (if it isn't empty)
curr_cl_pts = None
if len(clusters[i]) > 0:
curr_cl_pts = [Points[k] for k in clusters[i]]

misses = 0 # counts closest points beyond threshold
for j in xrange(len(closest)):
# Skip index out of range errors
if closest[j] >= len(temp_indices):
continue
# Skip clusters that have reached the desired size
if len(clusters[i]) == ClusterSizes[i]:
active[i] = False
continue
# Get the current, closest point
closest_pt = Points[temp_indices[closest[j]]]
# Check whether the closest point is near the cluster points
if curr_cl_pts != None:
dists = [pt.DistanceTo(closest_pt) for pt in curr_cl_pts]
if min(dists) > Threshold + Threshold * 0.05: # 5% leeway
misses += 1
continue
# Save the closest indices to the current cluster
clusters[i].append(temp_indices[closest[j]])
# Delete the visited points and indices
temp_ptcloud.RemoveAt(closest[j])
temp_indices.pop(closest[j])

# Deactive the current cluster if no valid closest point were found
if misses == len(closest):
active[i] = False

# Grow the search radius and iteration count
curr_step += StepSize
count += 1

# Check whether all points are included in one of the cluster
if AllotRest and sum([len(c) for c in clusters]) < len(Points):
# Loop the remaining point indices in reverse
for i in xrange(len(temp_indices)-1, -1, -1):
rest_pt = Points[temp_indices[i]]

# # Identify the closest cluster
min_dist = float("inf")
closest = None
for j in xrange(len(clusters)):
cl_pts = [Points[k] for k in clusters[j]]
dist = min([pt.DistanceTo(rest_pt) for pt in cl_pts])
if dist < min_dist:
min_dist = dist
closest = j

if closest != None:
# Add the remaining point index to the closest cluster
clusters[closest].append(temp_indices[i])
# And remove it
del temp_indices[i]

# Outputs
Clustered = treehelpers.list_to_tree(clusters)
Unallotted = temp_indices
``````

problem_sortClusters_02.gh (18.9 KB)

5 Likes

Thank you @p1r4t3b0y You’re welcome. It is an interesting topic!
Try moving the black points around. The script should be quick enough to see live updating of the clusters.

2 Likes

Just to exhaust the simple, Voronoi (and dual graph) is quite good at space divisions (of course you cannot dial it in to point count like @p1r4t3b0y). I use mesh rays for the point inclusion test as it is much faster than the Point In Curves components.

VoroCluster.gh (13.1 KB)

@ForestOwl, what I forgot to mention yesterday is that the StepSize input should be set to a value smaller than the minimum distance between closest points in Points. The StepSize is the increment of the search radius, within which to iteratively search for close points to clusters.

Also, the Threshold input - which defines the maximum distance between close points and the cluster they are evaluated to join -, should be something like the average distance between closest points in Points or even a little bigger.

The quality of the clustering really depends on these two values to be dialed-in to model scale. Otherwise, the clusters may seem to have less clear boundaries and intermix more.

By the way, the script also works if you define some interior points as start points. Start points don’t have to be on the periphery.

1 Like