Compare list items against each other


(differred) #1

I’m trying to find a way of comparing items within the same list.

Basically, I want to look at each list item and compare it to the previous item in the list. If the item isn’t within a certain range of the previous one (say -50 to +50), I want to replace it with a value that is within that range.

I’m not sure how to even begin with this, is it even possible?

Any suggestion or push in the right direction would be greatly appreciated, and thanks in advance!


(David Rutten) #2

It’s tricky, because you’re iterating over a list while modifying the list. You can absolutely compare each item in a list to the previous one and then even change that value, but the next item will still be compared against the original, unchanged, item rather than the new value. Is that what you want?
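To make the distinction concrete, here is a minimal Python sketch of both behaviours. The function names and the sample data are made up for illustration; the tolerance of 50 comes from the original question.

```python
def clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))


def limit_against_original(values, tolerance):
    """Compare each item against the ORIGINAL previous item."""
    result = list(values[:1])
    for prev, cur in zip(values, values[1:]):
        result.append(clamp(cur, prev - tolerance, prev + tolerance))
    return result


def limit_against_updated(values, tolerance):
    """Compare each item against the previous item AFTER it was replaced."""
    result = list(values[:1])
    for cur in values[1:]:
        prev = result[-1]  # already-limited value, not the original one
        result.append(clamp(cur, prev - tolerance, prev + tolerance))
    return result


data = [100, 120, 400, 390, 110]  # hypothetical input with a jump at 400
print(limit_against_original(data, 50))  # [100, 120, 170, 390, 340]
print(limit_against_updated(data, 50))   # [100, 120, 170, 220, 170]
```

In the first variant the 390 survives, because it is close to the original 400 even though that 400 was replaced by 170.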


#3

This is a way to generate a list like that:

list50
list50.gh (8.0 KB)

list50b


(Pfotiad0) #4

Well, I have a vast variety of “similar” defs, but the bad news is that they are all done via code. Without code you’ll need something like Anemone, since GH is acyclic by design.

ReplaceItemsInTreesWithCondition_V1.gh (113.5 KB)

So see the attached (at least as a fun case; I modified 2 lines of an existing definition to match your comparison criteria): if you provide data it works with that, if not it makes some for you (using random this, random that, etc.) and does the comparisons/replacements on previous-current item pairs in a given List in a given tree branch.


(differred) #5

Yes, that is absolutely what I’m after! Basically I’m trying to reduce outliers in a random list. How might I begin to do this?

Thanks for your help :slight_smile:


(David Rutten) #6

So we want to reduce outliers by limiting each value to be within plus-or-minus 1 of the previous value, i.e. $|v_n - v_{n-1}| \le 1$.

Now imagine a sequence with a clear outlier, for example $\left\{ \frac{6}{10}, \frac{8}{10}, \frac{3}{10}, \frac{16}{10}, \frac{460}{10}, \frac{9}{10}, \frac{1}{10}, \frac{-3}{10} \right\}$. If we run through all the numbers (except for the first one, since there isn’t a previous number to compare it against), we can create a list of changes, which is just the current number minus the previous one:

$$\left\{0, \frac{2}{10}, \frac{-5}{10}, \frac{13}{10}, \frac{444}{10}, \frac{-451}{10}, \frac{-8}{10}, \frac{-4}{10}\right\}$$

We then limit the changes to be within the [-1.0, 1.0] range:

$$\left\{0, \frac{2}{10}, \frac{-5}{10}, \frac{10}{10}, \frac{10}{10}, \frac{-10}{10}, \frac{-8}{10}, \frac{-4}{10}\right\}$$

If we reapply these clipped values to our original numbers, we find that the outlier has just moved one place to the right. The numerators here are the numerators of the original, previous values, plus the clipped changes.

$$\left\{ \frac{6}{10}, \frac{6+2}{10}, \frac{8-5}{10}, \frac{3+10}{10}, \frac{16+10}{10}, \frac{460-10}{10}, \frac{9-8}{10}, \frac{1-4}{10} \right\} = \left\{ \frac{6}{10}, \frac{8}{10}, \frac{3}{10}, \frac{13}{10}, \frac{26}{10}, \frac{450}{10}, \frac{1}{10}, \frac{-3}{10} \right\}$$

Not only is the outlier still there, there are also still values that are more than the limit apart: 13/10 and 26/10.
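The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: it works in integer tenths (so 6 stands for 6/10 and the limit of 1.0 becomes 10) to avoid floating-point noise, and the variable names are made up for illustration.

```python
values = [6, 8, 3, 16, 460, 9, 1, -3]  # the example sequence, in tenths
limit = 10                             # the +/-1.0 limit, in tenths

# Change of each value relative to the original previous value.
# The first value has no predecessor, so its change is set to 0.
changes = [0] + [cur - prev for prev, cur in zip(values, values[1:])]

# Clip every change into [-limit, +limit].
clipped = [max(-limit, min(limit, c)) for c in changes]

# Reapply each clipped change to the ORIGINAL previous value.
reapplied = [values[0]] + [prev + c for prev, c in zip(values, clipped[1:])]

print(changes)    # [0, 2, -5, 13, 444, -451, -8, -4]
print(clipped)    # [0, 2, -5, 10, 10, -10, -8, -4]
print(reapplied)  # [6, 8, 3, 13, 26, 450, 1, -3]
```

The 450 in the result is the outlier, shifted one place to the right.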

One way out of this mess is to use an approach similar to what @Joseph_Oster posted. Instead of using random values to compute partial sums, you can use the clipped changes. This will guarantee that all values are within the same tolerance as their neighbours; however, it may drastically change the actual shape of your distribution. If you have several consecutive upward steps that are each slightly too big, you limit them all, but your sequence still goes up by the maximum allowed amount each time. If there’s then a single huge downward step, it is also clipped to the same range, so you only go down by a little bit. The remainder of your sequence will be much higher than the original.
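A minimal Python sketch of the partial-sum idea, assuming integer tenths as units; the function name and the second test sequence are made up for illustration.

```python
from itertools import accumulate


def rebuild_from_clipped_changes(values, limit):
    """Start at the first value and add up the clipped changes."""
    if not values:
        return []
    clipped = [max(-limit, min(limit, cur - prev))
               for prev, cur in zip(values, values[1:])]
    # Running total: first value, then first value + clipped changes so far.
    return list(accumulate([values[0]] + clipped))


# The example sequence in tenths, with a limit of 10 tenths (1.0).
smoothed = rebuild_from_clipped_changes([6, 8, 3, 16, 460, 9, 1, -3], 10)
print(smoothed)  # [6, 8, 3, 13, 23, 13, 5, 1]

# Hypothetical sequence: three slightly-too-big steps up, one huge step down.
drifted = rebuild_from_clipped_changes([0, 12, 24, 36, 0, 0], 10)
print(drifted)   # [0, 10, 20, 30, 20, 20]
```

In the second case every neighbour is within the limit, but the tail sits at 20 while the original data returned to 0, which is the change of shape described above.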

For removal of outliers in signals the LULU operator (LULU smoothing) is a pretty good solution, but it’ll definitely require custom code to implement.
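As a rough idea of what that custom code might look like, here is a minimal Python sketch of the two basic LULU operators: the lower operator takes the maximum of the minima over all windows of n+1 consecutive values containing the current item, and the upper operator takes the minimum of the maxima. The function names are made up, and padding the ends by repeating the first and last value is an assumption of this sketch, not part of the definition.

```python
def _lulu_pass(values, n, inner, outer):
    """Apply outer(inner(window)) over all windows of n+1 items containing i."""
    if not values:
        return []
    # Assumption: pad both ends by repeating the edge values.
    padded = [values[0]] * n + list(values) + [values[-1]] * n
    result = []
    for i in range(len(values)):
        c = i + n  # position of item i inside the padded list
        windows = (padded[j:j + n + 1] for j in range(c - n, c + 1))
        result.append(outer(inner(w) for w in windows))
    return result


def lulu_lower(values, n=1):
    """L_n: max of window minima, removes upward spikes up to n items wide."""
    return _lulu_pass(values, n, min, max)


def lulu_upper(values, n=1):
    """U_n: min of window maxima, removes downward spikes up to n items wide."""
    return _lulu_pass(values, n, max, min)


signal = [0.6, 0.8, 0.3, 1.6, 46.0, 0.9, 0.1, -0.3]
print(lulu_lower(signal, 1))  # [0.6, 0.6, 0.3, 1.6, 1.6, 0.9, 0.1, -0.3]
print(lulu_upper([1, 1, -40, 1, 1], 1))  # [1, 1, 1, 1, 1]
```

Unlike clipping the changes, the spike at 46.0 is replaced by a neighbouring value and the rest of the sequence is not shifted. Note that the small local peak at 0.8 is flattened as well, since the operator treats every one-item bump as a spike.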


(David Rutten) #7

Not to mention the total mess you’ll be in if the first value happens to be an outlier…