[Python] List processing algorithm out of ideas

ivelin.peychev · February 14, 2019, 2:52pm

Theory:
I have two lists with equal number of items:

L1 = [0,1,2,3,4,5,6,7,7,8,8,8,9,10,10]
L2 = [12,412,51,523,52,54,65,74,35,22,14,1,3,76,159]

How can I find the duplicate items in L1, get their indices, and use those indices to acquire the corresponding items in L2, then sum the values of those L2 items?
After that, remove the duplicates from L1, or create another list (say L1a) without duplicates. Also create another list (L2a) in which the items at the indices of the duplicates from L1 are replaced with their sum.

Practical case:
I have a number of lines, all of them in the XY plane and parallel to the Y axis.
Ergo, each line has a constant X coordinate, but some lines, just like the values in L1, overlap and have different lengths.

What I need is a sum of the lengths for each unique X coordinate.

Thanks a lot in advance to all who reply.

ivelin.peychev · February 14, 2019, 2:54pm

So far I figured out:

How to get the unique items,
How to pop the unique items in a different list
How to remove duplicates in L1 by converting to set then back to list

What I cannot figure out is how to get two lists, one of X coordinates and one of lengths, that have an equal number of items.
I always get either new duplicates or a different number of items.

I assume the answer is hidden in the collections module or in enumerate(), but I am not familiar with them and I don’t know where to look.

Mahdiyar · February 14, 2019, 3:16pm

I’m not sure I understood exactly what you want, but if I’m not mistaken you can achieve the same result with a much simpler algorithm:

import rhinoscriptsyntax as rs
yCoordinates = []
for line in lines:
    yCoordinates.append(rs.CurveStartPoint(line).Y)
    yCoordinates.append(rs.CurveEndPoint(line).Y)
maxY = max(yCoordinates)
minY = min(yCoordinates)
sumY = abs(minY-maxY)

IVELIN PEYCHEV.gh (13.6 KB)

Dancergraham · February 14, 2019, 3:20pm

Hi, can there be gaps between lines on the same X? Can they overlap but with different lengths? Can they be exact duplicates?

ivelin.peychev · February 14, 2019, 3:21pm

Thanks Mahdiyar,

But I know how to find the items with equal X coordinates and sum them up. What I need is the two lists with an equal number of items.

Dancergraham · February 14, 2019, 3:23pm

How about using zip(l1,l2) to combine your two lists into tuples? Would that help you keep track of paired values?
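
To illustrate the zip() suggestion, here is a minimal sketch using the sample lists from the first post. zip() pairs items by position, so each L1 value stays attached to its L2 value. The name pairs is only illustrative.

```python
L1 = [0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10, 10]
L2 = [12, 412, 51, 523, 52, 54, 65, 74, 35, 22, 14, 1, 3, 76, 159]

# zip() pairs items by position; list() makes the result indexable on Python 3 as well
pairs = list(zip(L1, L2))

# Show the pairs around the duplicated keys 7 and 8
print(pairs[7:12])
```

Pairing alone does not sum anything yet; it only guarantees that a key and its value travel together through any later grouping step.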

ivelin.peychev · February 14, 2019, 3:24pm

Yes, I tried that, but how do I combine the pairs so that zip[0] stays the same and zip[1] is the sum of all items that have an equal zip[0]?

This is the main question.

I tried something with 4-5 levels of nested for loops and if statements and ended up with lots of duplicates.

ivelin.peychev · February 14, 2019, 3:29pm

For the practical case:

Is there a way to find all lines with X == i, Y = 0 and Z = 0 and sum up their lengths?
With RhinoCommon or Grasshopper?

ivelin.peychev · February 14, 2019, 4:06pm

I think I found a solution here:

But I still get one extra phantom item in one of the lists:

import rhinoscriptsyntax as rs
from ghpythonlib.treehelpers import list_to_tree

from collections import Counter
A = []
c = Counter()
zlst = []
for i in range(len(x)):
    zlst.append([str(y[i].X),round(x[i],0)])


for j,k in zlst:
    c.update({j:k})

for l in range(len(zlst)):
    A.append(zlst[l])

a,b = zip(*A)

AA = list(set(a))
BB = list(set(b))

#len(AA) = 221
#len(BB) = 222
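
Reading the snippet above, the length mismatch seems to come from the last two lines: set(a) and set(b) remove duplicates from the keys and from the values independently, so their sizes are unrelated, and the sums collected in the Counter c are never read back. A minimal sketch of reading keys and sums from the same Counter, using the sample lists from the first post instead of the Grasshopper inputs x and y (the names totals, L1a and L2a are illustrative):

```python
from collections import Counter

L1 = [0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10, 10]
L2 = [12, 412, 51, 523, 52, 54, 65, 74, 35, 22, 14, 1, 3, 76, 159]

# Accumulate one running total per key; a missing key starts at 0
totals = Counter()
for key, value in zip(L1, L2):
    totals[key] += value

# Take keys and sums from the same object so both lists stay the same length
L1a = sorted(totals)
L2a = [totals[key] for key in L1a]

print(L1a)
print(L2a)
```

Because both output lists are derived from the same set of keys, they cannot differ in length.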

Dancergraham · February 14, 2019, 4:06pm

I was thinking of this data structure:

from collections import defaultdict
lines = defaultdict(list)
for tup in zip(L1,L2):
    lines[tup[0]].append(tup)
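
A small sketch of what this structure holds and how it could be reduced to sums, assuming the sample lists from the first post. The sums dictionary is an illustrative addition, not part of the suggestion above.

```python
from collections import defaultdict

L1 = [0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10, 10]
L2 = [12, 412, 51, 523, 52, 54, 65, 74, 35, 22, 14, 1, 3, 76, 159]

# Each key maps to the list of (key, value) tuples that share that key
lines = defaultdict(list)
for tup in zip(L1, L2):
    lines[tup[0]].append(tup)

print(lines[8])

# Reduce each group to the sum of its second elements
sums = {}
for xval, tups in lines.items():
    sums[xval] = sum(value for _, value in tups)

print(sums[8])
```

Keeping the whole tuples in each group, rather than only the running sum, leaves room for later steps such as sorting or removing duplicates within a group.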

nathancoatney · February 14, 2019, 4:13pm

Based on the practical case I’d try:

import rhinoscriptsyntax as rs
import scriptcontext as sc
import Rhino as R

line_guids = rs.GetObjects('select lines')

# get line geometry
lines = []
for lg in line_guids:
    lines.append(rs.coerceline(lg))

# dict to hold list of lines keyed on x coordinate
line_lists = {}
# populate the dict
for line in lines:
    if line.FromX not in line_lists:  # should probably test x to some window
        line_lists[line.FromX] = [line]
    else:
        line_lists[line.FromX].append(line)

# loop through the line_lists and do calcs
for x_start, line_list in line_lists.items():
    total = 0  # named total so the built-in sum() is not shadowed
    count = len(line_list)
    for line in line_list:
        total += line.Length
    print('x coord of {} has {} lines with summed length of {}'.format(x_start, count, total))

Output:

x coord of 0.0 has 3 lines with summed length of 20.9950916265
x coord of -6.61721090246 has 1 lines with summed length of 5.94563031763
x coord of -3.77186291026 has 3 lines with summed length of 14.1364263773

Input:

You should probably think about the precision of the x coordinate key, though: floating-point X values that differ only slightly will end up under different keys. Also, it doesn’t use list indices, if that is a requirement.
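
One way to handle the key precision is to round each X value before using it as a dictionary key. This is only a sketch: the helper x_key, the choice of 3 decimals, and the sample segments are all made up for illustration, and plain tuples stand in for the Rhino line objects.

```python
from collections import defaultdict

def x_key(x, decimals=3):
    """Hypothetical helper: snap an X coordinate to a fixed number of decimals."""
    return round(x, decimals)

# Made-up (x, length) pairs; the first two and the last two are "almost" the same X
segments = [(0.0, 5.0), (0.0000004, 2.5), (-3.7719, 4.0), (-3.77186, 1.0)]

totals = defaultdict(float)
for x, length in segments:
    totals[x_key(x)] += length

for key in sorted(totals):
    print('x = {} has total length {}'.format(key, totals[key]))
```

Rounding is simple but not perfect: two values that sit very close together on opposite sides of a rounding boundary still land under different keys, so the number of decimals should be coarser than the expected noise in the coordinates.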

Dancergraham · February 14, 2019, 4:15pm

Or defaultdict(set) to remove duplicates. Then you can use for xval, line in lines.iteritems() to iterate over each X value and its list / set.
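
A minimal sketch of the defaultdict(set) idea, with made-up sample pairs that contain one exact duplicate. It uses items() so that it runs on both Python 2 and Python 3; iteritems() exists only on Python 2.

```python
from collections import defaultdict

# Made-up (x, length) pairs; (7, 74) appears twice
pairs = [(7, 74), (7, 35), (7, 74), (8, 22)]

# A set keeps each distinct value once per key
lines = defaultdict(set)
for xval, length in pairs:
    lines[xval].add(length)

for xval, lengths in sorted(lines.items()):
    print('{}: {}'.format(xval, sorted(lengths)))
```

One caveat with storing only lengths: two different lines on the same X that happen to have the same length would also collapse into one entry. Storing something that identifies the line, for example a (start, end) tuple, would avoid that.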

ivelin.peychev · February 14, 2019, 4:23pm

Yes, that is a requirement, since the values in L1 could be scrambled.

ivelin.peychev · February 14, 2019, 4:25pm

Thank you @nathancoatney, @Dancergraham ,

I need to think through the ideas and how I can apply them.

Plus, that thread on Stack Overflow seems similar to mine. I need to see if I can combine all the ideas.

nathancoatney · February 14, 2019, 4:56pm

What I mean is that it doesn’t correlate indexes, as it works with the geometry. One could do the same with indexes, just the comparing would be different, so you would have:

L1 = [0,1,2,3,4,5,6,7,7,8,8,8,9,10,10]
L2 = [12,412,51,523,52,54,65,74,35,22,14,1,3,76,159]

L1_indicies = {}
for i, l in enumerate(L1):
    if l not in L1_indicies:
        L1_indicies[l] = [i]
    else:
        L1_indicies[l].append(i)

print(L1_indicies)

L1a = []
L2a = []

for value, indices in L1_indicies.items():
    total = 0  # named total so the built-in sum() is not shadowed
    for i in indices:
        total += L2[i]
    L1a.append(value)
    L2a.append(total)
    
print(L1a)
print(L2a)

It looks like that Stack Overflow link has more Pythonic ways to do the same thing.
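
For readers unfamiliar with enumerate(), which the loop above relies on: it yields each item together with its index, which replaces the range(len(...)) idiom. A small sketch with a shortened, made-up list:

```python
L1 = [7, 7, 8]

# enumerate() yields (index, item) pairs
for i, value in enumerate(L1):
    print('index {} holds {}'.format(i, value))

# The same pairs built with the range(len(...)) idiom, for comparison
via_enumerate = list(enumerate(L1))
via_range = [(i, L1[i]) for i in range(len(L1))]
print(via_enumerate == via_range)
```

It is useful whenever a loop needs both the position and the value, as here, where the position in L1 is used to look up the matching item in L2.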

Dancergraham · February 14, 2019, 5:14pm

You’re welcome.
For the next step it may be helpful to sort the lines by y value, e.g.:

for xval,listt in lines.iteritems():
  print(sorted(listt,key = lambda x: x[1]))
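
The key argument tells sorted() which part of each item to compare, and a lambda is just an unnamed one-line function. A sketch showing that a lambda and an ordinary named function give the same result (the sample group and the name second_item are illustrative):

```python
# One group of (x, value) tuples, as stored by the defaultdict(list) approach
group = [(8, 22), (8, 14), (8, 1)]

def second_item(pair):
    """Return the element that sorted() should compare."""
    return pair[1]

by_lambda = sorted(group, key=lambda x: x[1])
by_function = sorted(group, key=second_item)

print(by_lambda)
print(by_lambda == by_function)
```

The named function is longer but can be easier to read when the sorting rule is more involved than picking one element.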

ivelin.peychev · February 14, 2019, 5:14pm

I added a few lines and it’s now useful in my case:


import collections

L1_indicies = {}

for i, l in enumerate(L1):
    if l not in L1_indicies:
        L1_indicies[l] = [i]
    else:
        L1_indicies[l].append(i)

print(L1_indicies)

L1a = []
L2a = []

for value, indices in L1_indicies.items():
    total = 0  # named total so the built-in sum() is not shadowed
    for i in indices:
        total += L2[i]
    L1a.append(value)
    L2a.append(total)
    
print(L1a)
print(L2a)

a = []
b = []

d = dict(zip(L1a,L2a))
od = collections.OrderedDict(sorted(d.items()))

# od is already sorted by key, so the pairs can be appended directly
for key, value in od.iteritems():
    a.append(key)
    b.append(value)

ivelin.peychev · February 14, 2019, 5:16pm

Thanks Graham,

Your examples are a bit advanced. I don’t know in which cases I have to use enumerate, let alone lambda.

Thanks anyways. Some day I’ll understand them better.

nathancoatney · February 14, 2019, 7:12pm

Glad it helped.

It may be slightly more compact to avoid the zipping and the OrderedDict and sort the keys of the dict instead of looping through the items:

for key in sorted(L1_indicies.keys()):
    indices = L1_indicies[key]
    total = 0  # named total so the built-in sum() is not shadowed
    for i in indices:
        total += L2[i]
    L1a.append(key)
    L2a.append(total)

I think that will give you two lists in sorted order, which I think is the same as your a and b lists, and should save quite a bit of overhead if that is important.
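
A self-contained sketch of the sorted-keys variant, run on a made-up scrambled input to check that the output comes back in key order. It uses dict.setdefault() as shorthand for the if/else used earlier in the thread and the built-in sum() for the inner loop; both are substitutions, not the code posted above.

```python
# Made-up scrambled sample: equal keys are not adjacent
L1 = [8, 0, 7, 8, 10, 7, 8]
L2 = [22, 12, 74, 14, 76, 35, 1]

# Map each distinct value in L1 to the list of indices where it occurs
L1_indicies = {}
for i, value in enumerate(L1):
    L1_indicies.setdefault(value, []).append(i)

L1a = []
L2a = []
# Walking the keys in sorted order makes a separate OrderedDict unnecessary
for key in sorted(L1_indicies):
    L1a.append(key)
    L2a.append(sum(L2[i] for i in L1_indicies[key]))

print(L1a)
print(L2a)
```

Because every key is appended to L1a in the same step as its sum is appended to L2a, the two lists always have the same number of items.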

Dancergraham · February 14, 2019, 7:29pm

I’ll dial it back a bit !

Simple is better than complex.

(import this)
