GH/ Python Code, For loop Not working "To many values to unpack" error

Having issues with trying to convert this code for finding duplicate lines with tolerance from the C# tutorial into a python script so i can develop it further to take 2 sets of lines and check duplicates in one list again the other list of lines.

This is the tutorial i have used to get this coded but having issues with the for loop conversion, starting a 32 mins in ( Geometry Gems #10: Remove Duplicate Lines) https://www.youtube.com/watch?v=NLp4st0bGUA

The rest of the code works fine if i give it a lineA, lineB input so its just the loop that’s breaking, with the error msg. “To many values to unpack”

import Rhino as rh

def RemoveDuplicates(lines,tol):
    cleanedlines = [lines]
    for i, in lines:
        line = cleanedlines[i]

        for j in reversed (lines):
            other= cleanedlines[j]

            dup = AreDuplicate(line,other,tol)
            if (dup == True):(cleanedlines.RemoveAt(j))
    return cleanlines
    
def AreSimilar(a,b,tol):
   similar = a.DistanceTo(b) < tol
   return similar
   
def AreDuplicate(lineA,lineB,tol):  
    if ((AreSimilar(lineA.From,lineB.From,tol)) and (AreSimilar(lineA.To,lineB.To,tol))or(AreSimilar(lineA.From,lineB.To,tol)) and (AreSimilar(lineA.To,lineB.From,tol))):
        return True
    return False

duplicates = RemoveDuplicates(lines,tol)

240412_Duplicate lines python Example WIP.gh (9.4 KB)

I know i can use, Rhino.Geomtery.GeometryBase.GeometryEquals(x,y) to find the duplicates but as this gives no tolerance option i also wanted to solve it with tolerance.

Thanks Matt

try this

import Rhino as rh

def RemoveDuplicates(lines,tol):
    #build a list of unique line indices
    tokeep =[]
    #empty list to store the lines
    cleanedlines = []
    #use range for 0 to the length of lines list -1 (before last item)
    for i in range(len(lines) -1):
        line = lines[i]
        #use range from the next item to the end of the list 
        #no need to check previous items
        #store duplicate as a variable here 
        #if no duplicates found, append lines[i]
        dup = False
        for j in  range(i+1, len(lines)):
            other= lines[j]
            
            dup_temp = AreDuplicate(line,other,tol)
            if (dup_temp): dup = True

        #if unique, store it's index
        if (dup == False):(tokeep.append(i))

    #the last item is always unique
    tokeep.append(len(lines)-1)
    for i in tokeep:
        cleanedlines.append(lines[i])
    
    return cleanedlines

#not sure abou the logic of these two functions - 
#seems to work but haven't checked 
def AreSimilar(a,b,tol):
   similar = a.DistanceTo(b) < tol
   return similar
   
def AreDuplicate(lineA,lineB,tol):  
    if ((AreSimilar(lineA.From,lineB.From,tol)) and (AreSimilar(lineA.To,lineB.To,tol))or(AreSimilar(lineA.From,lineB.To,tol)) and (AreSimilar(lineA.To,lineB.From,tol))):
        return True
    return False

#this variable name is confusing...
duplicates = RemoveDuplicates(lines,tol)
2 Likes

As an explenation as to why it’s prompting the “To many values to unpack” error.

for i, in lines:
(incorrect) attempts to unpack each item in lines into two variables (i and another variable which you indicate by placing the , ). This fails if lines doesn’t contain enough (this case 2) values to unpack.

for i in lines:
(correct) iterates over lines, providing only the elements themselves without their indices.

for i in range(len(lines)):
(correct) iterates over the indices of lines, poviding you the index(i) of each line within lines.

1 Like

“Using for line in lines loop is sufficient. DistanceToSquared tends to be faster than Distance .”

import Rhino.Geometry as rg

def RemoveDuplicates(lines, tol):
    cleaned_lines = []
    sq_tol = tol*tol
    for line in lines:
        is_duplicate = False
        for cleaned_line in cleaned_lines:
            if AreDuplicate(line, cleaned_line,sq_tol):
                is_duplicate = True
                break
        if not is_duplicate:
            cleaned_lines.append(line)
    return cleaned_lines


def AreDuplicate(lineA, lineB, tol):
    if AreSimilar(lineA.PointAt(0.5), lineB.PointAt(0.5), tol):
        return (AreSimilar(lineA.From, lineB.From, tol) and
                AreSimilar(lineA.To, lineB.To, tol)) or \
               (AreSimilar(lineA.From, lineB.To, tol) and
                AreSimilar(lineA.To, lineB.From, tol))
    return False

def AreSimilar(a, b, tol):
    return a.DistanceToSquared(b) < tol

a = RemoveDuplicates(lines,tol)

1 Like

Sorry for the delayed response to everyone’s help, this is great.

Next question which i have been trying to solve using the working codes given here was the option to have 2 lists of lines rather than just one as input. With the goal of checking one list against the other to find which are duplicates (in both lists) and which are unique in both lists.

The hacky way i have found to do this on a simple level is having one input as list and the other as item so grasshopper does the list management / looping for me, but this will soon break. I have tried with the nested loop. See code below and it runs without errors, but also gives only one index per set of duplicates which seems wrong even If i jitter the list order of one time or cull some items. as using the same list considers everything as duplicates so nothing is different,which makes sense

Note this code is just using the Rhino.Geomtery.GeometryBase.GeometryEquals(x,y) rather than the tolerance controlled checks from before as this was easier to Debug and later i can switch this out for the other definition.

Geo A List Access
Geo B List Access
Output Should be a list of Indices but only getting one.

import Rhino.Geometry as rhg

def RemoveDuplicates(geoA, geoB):
    same = [geoB]
    different_A = []
    different_B = []
    
    for i in range(len(geoA)):
        geo = geoA[i]

        for j in reversed(range(len(geoB))):
            other = geoB[j]

        dup = rhg.GeometryBase.GeometryEquals(geo,other)
        if (dup == True):same.RemoveAt(j)
        else: different_A.append(i), different_B.append(j) 
        return different_A,different_B

unique_A, unique_B = RemoveDuplicates(geoA,geoB)

240412_Duplicate lines python Example WIP.gh (18.2 KB)

Thanks Again.

just by looking at the code it looks like this line should have one less indent

otherwise you are returning different_A and different_B already at the end of the very first loop cycle, and returning something exits the function: you want to return results only when the for loop is ended

Edited :WIP Code below updated with corrected geo outputs but not working Indexes yet

Ok that’s interesting moving the return gives me a long list of indices, even when both list are the same.
I think my issue is the nesting levels where i am trying to get the unique indices of each geo list only.
The way i have been testing this is cull one item from one list or the other and or jittering one list so the order changes and it has to search further for duplicates but this in-turn makes the loop run longer.

import Rhino.Geometry as rhg

def RemoveDuplicates(geoA, geoB):
    index_A =[]
    index_B =[]

    for i in reversed (range(len(geoA))):
        geo = geoA[i]

        for j in reversed(range(len(geoB))):
            other = geoB[j]

            dup = rhg.GeometryBase.GeometryEquals(geo,other)
            if (dup == True): geoA.RemoveAt(i),geoB.RemoveAt(j)
        else:index_A.append(i),index_B.append(j)
    return geoA, geoB , index_A, index_B

unique_A, unique_B, index_A, index_B = RemoveDuplicates(geoA,geoB)

Hi,

I think you should almost never remove elements from a collection which was primarily made for adding elements such as a list. Apart from how unreadable the algorithm gets when you remove a collection within the same loop, you are not doing something effective, except saving memory.
If you ever implemented a linked list in C, you know how ineffective removal is.

Instead, you should prefer to only mask duplicates and then create a dedicated collection for output.
This should also simplify the algorithm.

See the logic

Given a collection of lines:
[ a , b, c, d, e, a, c, a, b, a ] # input

On the 1. run you can create a duplicated mask
[ 0 , 0, 0, 0, 0, 1, 0, 1, 0, 1 ]

On the 2. run you can create a duplicated mask
[ 0 , 0, 0, 0, 0, 1, 0, 1, 1, 1 ] 

meaning for 3. run you only need to visited 4 items left (marked as x)
[ 0 , 0, x, x, x, 0, x, 0 ,0, 0 ]  

... and so on...

The same is true if you compare two collections:
[ a , b, c, d, e, f, g, h, i, j ] # input a
[ a , a, a, b, b, b, b, k, l, m ] # input b

After 1. run, the "duplicate" mask will prevent checks for 1/3 
[ 1 , 1, 1, 0, 0, 0, 0, 0, 0, 0 ]  
After 2. run only 3 items are left:
[ 1 , 1, 1, 1, 1, 1, 1, 0, 0, 0 ]  

1 Like

Regarding removal. If you implement a list from scratch and you expose a RemoveAt(index) method, it means you either need to traverse the list from the beginning or end node just to get to the item, which you then need to remove, by either freeing the item or by disabling the allocated space. On top of that you need to modify the references at the adjacent nodes. Even if a Net/Python list is probably highly optimised, You perform hidden iterative steps . So you gain nothing by putting anything in one big nested loop…

No reason to go about it this way, and it’s error prone. Check my code or that of others above for a cleaner, safer method

Are you referring to my post or to Matts? If you refer to me, in which regards is my proposal not clean or unsafe?

@TomTom Sorry for the confusion, i was replying to op’s WIP code

@TomTom fyi, in the top right of the post, you see to whom the reply is intended (unless it’s immediately above)

Hi All,

Sorry for the confusion, I should have split the topic up into 2 different treads: One about solving the code for a single list which was answered and explained, thanks.

Then made another thread about checking duplicates between 2 lists and only pulling out unique items and their index. And not just reading your replies then trying to chop up the codes given to work for something slightly different.