Check each item's repeat times within a list

Jack_Zeng · February 10, 2017, 7:55am

Hi All,

Actually I have 2 questions… I guess they could be solved with some basic Python syntax.
When I have a list like [0,0,1,1,1,1,3,3,3,3,3,3,3,4,4,4,4,4,5,5 … ], how can I eliminate the repeated items to get a list like [0,1,3,4,5 … ]? Is there a way that applies to int, str and even geometries?

Another question is, how can we calculate each item’s repeat count within a list?
Take the same list for example, [0,0,1,1,1,1,3,3,3,3,3,3,3,4,4,4,4,4,5,5 … ] to [2,4,7,5,2 … ].

Thanks for any help,

Jack

nathanletwory · February 10, 2017, 8:11am

Seed a set() with your list(), then convert it back to a list: https://docs.python.org/2/library/stdtypes.html#set

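l = [0,0,3,3,2,2]
l.sort()
s = set(l)   # unique items, in no guaranteed order
ll = list(s)
ll.sort()    # put the unique items in a predictable order

c = [l.count(i) for i in ll]  # occurrences of each unique item

print(ll)  # [0, 2, 3]
print(c)   # [2, 2, 2]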

edit: note that the counting works really only for sorted lists in a useful way, hence the explicit sorts in the snippet.
edit2: this should work for any data type you put in the list.
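
One caveat on "any data type": set() needs hashable items and sort() needs items that can be ordered, so the snippet above will not work as-is for things like lists of lists. The following is only a minimal sketch of an alternative that relies on == alone and keeps first-seen order. The names unique_in_order and count_each are hypothetical helpers, not part of any library, and whether two geometry objects compare equal depends on how their type defines equality, so that part is an assumption to check.

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first-seen order.

    Only uses ==, so it also works for unhashable items such as lists.
    """
    result = []
    for item in items:
        if item not in result:  # linear scan: fine for short lists, slow for long ones
            result.append(item)
    return result


def count_each(items):
    """Return (unique_items, counts), both in first-seen order."""
    unique = unique_in_order(items)
    counts = [items.count(u) for u in unique]
    return unique, counts


unique, counts = count_each([0, 0, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5])
print(unique)  # [0, 1, 3, 4, 5]
print(counts)  # [2, 4, 7, 5, 2]
```

This is quadratic in the list length, so for long lists of hashable items the set() approach is the better fit.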

Jack_Zeng · February 10, 2017, 9:32am

great! Thanks a lot nathanletwory!
Can I ask why we need to sort() twice? I tried leaving out the second sort() and it still seems to work.

Jack

nathanletwory · February 10, 2017, 11:23am

I added the second sort as well because a set is an unordered collection of items. Creating a list from a set does not guarantee that the elements come out sorted. In many cases with simple data like ints it seems to work OK, but that won’t necessarily always be the case. Sorting the list created from the set ensures you have sorted elements.
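
A small sketch of that point, using strings because their order inside a set is less predictable than that of small ints. The variable names are made up for illustration. As far as count() is concerned, the order of the original list does not matter, since it scans the whole list; it is the list built from the set that needs sorting.

```python
words = ["pear", "apple", "fig", "apple", "pear", "kiwi"]

as_set = set(words)             # duplicates removed, iteration order unspecified
unsorted_unique = list(as_set)  # may come out in any order
sorted_unique = sorted(as_set)  # always alphabetical

# counts line up with sorted_unique, whatever order the set happened to use
counts = [words.count(w) for w in sorted_unique]

print(sorted_unique)  # ['apple', 'fig', 'kiwi', 'pear']
print(counts)         # [2, 1, 1, 2]
```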

/Nathan

Helvetosaur · February 10, 2017, 12:27pm

As Nathan did (I was just too slow):

orig_list=[1,2,2,4,5,4,4,5,2,1,2,4,1,4,3,4,3,4,1,2,2,3,5,4,4,4,5,4,1,5,4,6]
unique=list(set(orig_list))
unique.sort()
repeats=[]
print "Unique elements in orig_list:"
print unique
print "\nIn original list:"
for element in unique:
    count=orig_list.count(element)
    repeats.append(count)
    print "{} repeated {} times".format(element,count)

Unique elements in orig_list:
[1, 2, 3, 4, 5, 6]

In original list:
1 repeated 5 times
2 repeated 6 times
3 repeated 3 times
4 repeated 12 times
5 repeated 5 times
6 repeated 1 times

–Mitch

nathanletwory · February 10, 2017, 1:05pm

The repeats list is unnecessary (and so is the append() to it), but otherwise nice and verbose

/Nathan

Helvetosaur · February 10, 2017, 1:09pm

That was just in case someone actually wanted to do something with that data later… Not much overhead involved…

–Mitch

nathanletwory · February 10, 2017, 3:06pm

Fair enough

MarcusStrube · February 10, 2017, 6:31pm

I would do this:

from collections import Counter
from itertools import groupby

u = [1,2,2,4,5,4,4,5,2,1,2,4,1,4,3,4,3,4,1,2,2,3,5,4,4,4,5,4,1,5,4,6]
s = [1,1,1,1,1,2,2,2,2,2,2,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,6]

for element, count in sorted(Counter(u).items()):
    print (element, count)

print ('\n---\n')

for element, group in groupby(s):
    print (element, len(list(group)))

clement · February 10, 2017, 6:45pm

Or this:

from collections import defaultdict

mylist = [6,0,0,1,1,1,1,3,3,3,3,3,3,3,4,4,4,4,4,5,5,2,2,4]
d = defaultdict(list)

for n in mylist: d[n].append(n)

print "Unique items:", d.keys()

for key, values in d.items():
    print "{} repeated {} times".format(key, len(values))

1 question, 100 solutions.

c.

MarcusStrube · February 10, 2017, 7:23pm

Sorry, no, but in Python there should be one – and preferably only one – obvious way to do it.

clement · February 10, 2017, 8:00pm

Oh yeah, just noted the next line:

“Although that way may not be obvious at first unless you’re Dutch”

c.

MarcusStrube · February 10, 2017, 8:16pm

I don’t really get that line, but Guido van Rossum is Dutch.

nathanletwory · February 10, 2017, 8:18pm

I do. I am Dutch, too. ^.^

nathanletwory · February 10, 2017, 8:28pm

Btw, I was doing some timings on the proposed methods, but suffered a RUB (rapid unscheduled boot, spacex style). But I do recall some numbers.

First I created a list of 1.000.000 randomly picked ints between 0 and 13, random.seed(13) (13 for no obvious reason).

Then I timed each proposed solution (without the printing, just creating lists of the counts).

Fastest was the Counter method, at around 0.09 on my machine (start = time.time() … time.time() - start). Slowest was groupby with 2.something. My list comprehension was a bit slower than the list.append() from @Helvetosaur (around 0.32 vs around 0.31). I don’t recall all of them, and I’m too lazy to redo it at this moment. But indeed list comprehensions aren’t the fastest around - but I do like them.
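
For anyone who wants to repeat the comparison, here is a rough sketch of such a timing run. It is not the original script: the function names are hypothetical, "between 0 and 13" is assumed to mean random.randint(0, 13) inclusive, the groupby variant is assumed to sort first (and that sort is included in its time), and the list is shorter than the 1.000.000 items above to keep it quick. Absolute numbers will differ per machine and Python version.

```python
import random
import time
from collections import Counter
from itertools import groupby


def make_data(n, seed=13):
    """Build n pseudo-random ints in [0, 13] (inclusive range is an assumption)."""
    rng = random.Random(seed)
    return [rng.randint(0, 13) for _ in range(n)]


def counts_with_counter(data):
    return [count for _, count in sorted(Counter(data).items())]


def counts_with_groupby(data):
    # groupby needs equal values to be adjacent, so sort first
    return [len(list(group)) for _, group in groupby(sorted(data))]


def counts_with_list_count(data):
    return [data.count(value) for value in sorted(set(data))]


def time_it(func, data):
    """Return (elapsed_seconds, result) for a single call of func(data)."""
    start = time.time()
    result = func(data)
    return time.time() - start, result


data = make_data(100000)  # smaller than the 1.000.000 used in the post
for func in (counts_with_counter, counts_with_groupby, counts_with_list_count):
    elapsed, result = time_it(func, data)
    print("{}: {:.4f} s".format(func.__name__, elapsed))
```

A single time.time() measurement is noisy; the standard timeit module repeats the call and is usually the more reliable choice.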

Willem · February 10, 2017, 11:55pm

My 3 line take:

orig_list=[1,2,2,4,5,4,4,5,2,1,2,4,1,4,3,4,3,4,1,2,2,3,5,4,4,4,5,4,1,5,4,6]

values_counts = [(value,orig_list.count(value)) for value in sorted(list(set(orig_list)))]

for v,c in values_counts : print '{} repeated {}'.format (v,c)

-Willem

(I’m Dutch too BTW)

MarcusStrube · February 11, 2017, 5:53pm

Using groupby to solve the counting problem only works when the list is sorted, as it groups consecutive equal elements. I guess in that case it’s faster than Counter.

To eliminate duplicates, @Jack_Zeng, you would, btw, also use groupby:

from itertools import groupby

l = [0,0,1,1,1,1,3,3,3,3,3,3,3,4,4,4,4,4,5,5,6]
l = [element for element, _ in groupby(l)]

print (l)
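
The same caveat applies to removing duplicates this way: groupby only merges neighbouring equal elements. A short sketch with a made-up unsorted list shows the difference:

```python
from itertools import groupby

unsorted_list = [1, 2, 2, 1, 1, 3]

# groupby only merges neighbours, so 1 shows up as two separate runs
runs = [(value, len(list(group))) for value, group in groupby(unsorted_list)]
print(runs)  # [(1, 1), (2, 2), (1, 2), (3, 1)]

# sorting first puts equal values next to each other
totals = [(value, len(list(group))) for value, group in groupby(sorted(unsorted_list))]
print(totals)  # [(1, 3), (2, 2), (3, 1)]
```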

Okay, I got it, when someone understands he is Dutch. So right now, I’m Dutch, too. That’s cool!

Jack_Zeng · February 13, 2017, 9:30am

Wow, cool!
I didn’t expect there would be so many ways to achieve it.
Amazing!!
I will have a look at groupby as well, seems it is an advanced library I should know~~:grin:
Thanks guys!!

Jack
