Find shortest connections between two sets of points (identical number)

This naive attempt does not work:

The challenge is: Find exactly one red point for each blue point. And do it so that the connections are as short as possible. But how?

Notes:

  • The background is that I want to create something like a random walk, but without any parameters.

  • A straight forward approach may be to optimize for the shortest total length of all connections.

maybe try Closest point | two lists of points - #10 by PeterFotiadis

there was a very similar post from a few weeks ago, link it down here

the main point is, if for each point you want to find the closest destination point, the order in which you process the starting points matters a lot

for instance, in this stupid basic case, connecting start points 1, 2 with destination points A, B
if you process point 1 first, that point will connect to A generating line 1-A
but if you process point 2 first, it’s point 2 that will connect to A generating line 2-A

so, for any given set of start and destination points with same amount of elements, the order in which the starting points are processed makes the world of difference

Hope dies last. See attached (using Point3dList collection that is more or less “fast”). The 3rd option creates unique pairs.

Point3d_ClosestInTwoLists_EntryLevel_V1.gh (144.9 KB)

The 2nd C# … well … “best” prox path it’s a complex thing (the correct Method [kinda a variation of the MCMF algorithm] is used mostly in Steiner Graphs and the likes). Get the simple classic looped proximity that works … until it doesn’t (from approx the mid of the collection and up).

there’s also a brute force algo, but you probably don’t want that since it checks for every possible pair of points → by permuting on list b → meaning it takes forever to calculate AND for large lists it doesn’t fit in comp memory-> then returning the list that has the smallest sum of distances (may not be unique) (does not check for intersection of lines)

to make this more managable, you could use a set number of random permutations (say max 100 000)


a smarter algorithm would be to iteratively remove the pair that gives the the smallest distance from the two lists → but this might not return the smallest possible sum of distances → however it should be good enough for most cases

Thank you!

I reduced it down to essentials (hope the Lord isn’t essential :grimacing:):


2024-04-30_shortest_with_script_by_Pfotiad0.gh (9.7 KB)

It finds the closest points, but it doesn’t match my special requirement of pairwise connections for all points.

Thanks, was thinking about that too. I have somewhere around ten points, but even then, if I’m not mistaken, that means quite a number of possibilities: 10! = 3628800

Interesting idea.

Genetic algorithm may also work, which would involve crossing over permutations.

To speed up things, one can precompute all possible distances and store them in a list. For ten points, that’s just: 10 × 10 = 100

Thanks, had a look at the two solutions, but neither does exactly pairwise connections.

Yes, this is a good point. A better algorithm is necessary.

3rd option does that as follows: finds the prox pair, then marks both positions/pt indices as used (by removing’m from the Point3dList) and goes on to the next pair. This means that every pair has unique pts/indices … or no pair shares pts/indices with any other.User controls the N of pairs computed.

Obviously I could very easily change that - as an option (“kinda” like the 1/2nd work).

the simplest loop (requires Anemone plugin)

closest_point_loop.gh (16.0 KB)

Thank you, that looks interesting.

Right! But I worry that then the order matters as pointed out in the post by @inno.

In the meantime, I implemented the brute force solution, which may be sufficient for my use-case:


2024-05-01_shortest.gh (11.8 KB)

Code:

# Finds the shortest connection between two lists of points where:
# 
#   * Each point is connected to exactly one point in the other list.
#
#   * "Shortest" is defined as the shortest total length of all connections.
#
# Felix E. Klee <felix.klee@inka.de>

import ghpythonlib.treehelpers as th
import Rhino
import rhinoscriptsyntax as rs

def sumOfDistances(indexes):
    sum = 0
    for i, ds in enumerate(distances):
        sum += ds[indexes[i]]
    return sum

# Heap's algorithm
def permutation(a, size):
    global bestSum, bestIndexes

    if size == 1:
        sum = sumOfDistances(a)
        if (sum < bestSum):
            bestSum = sum
            bestIndexes = a.copy()
        return
 
    for i in range(size):
        permutation(a, size - 1)
        sizeIsOdd = size & 1
        if sizeIsOdd:
            a[0], a[size - 1] = a[size - 1], a[0]
        else:
            a[i], a[size - 1] = a[size - 1], a[i]

def computeDistances():
    distances = []

    for i, point1 in enumerate(points1):
        distancesFromPoint1 = []
        for j, point2 in enumerate(points2):
            distancesFromPoint1.append(point1.DistanceTo(point2))
        distances.append(distancesFromPoint1)

    return distances

def findBestIndexes():
    global bestSum, bestIndexes
    bestIndexes = list(range(len(points1)))
    bestSum = sumOfDistances(bestIndexes)
    permutation(bestIndexes, len(bestIndexes))

def buildConnectionLines():
    lines = []
    for i, bestIndex in enumerate(bestIndexes):
        line = rs.AddLine(points1[i], points2[bestIndex])
        lines.append(line)
    return lines

def validateInput():
    if len(points1) != len(points2):
        raise ValueError("Point list lengths don't match.")

validateInput()
distances = computeDistances()
findBestIndexes()
connection = buildConnectionLines()
3 Likes

In the 3rd option the order has no meaning at all.

Reason?

After findind the closest pair (where obviously there’s just one possible pair around no matter the pts order in the 2 collections) … then the next steps (up to the user defined N of pairs) simply don’t include/compute the pts previously sampled. So for each “step” the algo works as if the previously sampled pts (in pairs) are removed from the 2 collections.

Meaning: order has nothing to do in this game… Anyway I could include a LINQ jitter option (just one line of code) to prove the obvious … but why bother?

Plus: for prox Pts matters ALWAYS use Point3dList otherwise you’ll wait toooo much for answers (if your collections are big). All that prior implementing a thread safe // approach (that may or may not be the faster bunny).

An interesting problem.
From the replies here it looks like the Hungarian algorithm is a way to solve this:

Out of curiosity - in what context are you encountering this problem?

1 Like