Pandas One-Hot Encoding get_dummies() generates extra "u"

Hello,
Within pythonremote this code generates extra "u"s using pandas `get_dummies()`.

E.g. new becomes u' new'.
I would be thankful if anyone could explain why.


test.gh (14.4 KB)
test02.csv (320 Bytes)

If you look carefully in your screenshot you’ll see that row 5 has new with a trailing space. The code is not creating anything new, it is only showing what is in your data. Fix your data and you’ll have only one entry called new in your panel output.

You can also easily clean up leading and trailing white space with " new".strip().
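A minimal sketch of that cleanup in plain Python. The list below is made-up sample data, not the contents of test02.csv:

```python
# Hypothetical sample values; "new " carries a trailing space.
raw_values = ["new", "excellent", "new ", "good"]

# Without cleanup, "new" and "new " count as two different categories.
categories_before = sorted(set(raw_values))

# strip() removes leading and trailing whitespace from each value.
clean_values = [value.strip() for value in raw_values]
categories_after = sorted(set(clean_values))

print(categories_before)
print(categories_after)
```

On a pandas string column the same cleanup should be possible with `.str.strip()` before calling `get_dummies()`, for example `df["col"].str.strip()`, where `df` and `col` are placeholder names.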

Thank you both @nathanletwory and @diff-arch,
My concern is not about the trailing space, though.
The output data (red frame) shows an extra character (u) at the beginning of each input.
For example, excellent becomes u'excellent'.
I ran the code outside pythonremote and it does not happen.

That extra character only means that you have a Unicode string. The u prefix is how Python 2 displays such a string; it is not part of your data.
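A small sketch of why the prefix is harmless. It assumes nothing beyond plain Python, and the sample values are made up:

```python
value = u"excellent"  # the u"" literal is valid in Python 2 and in Python 3.3+

# The prefix is not part of the text: the string still has 9 characters.
length = len(value)

# Converting a list to text uses the repr() of each element. In Python 2
# that repr is u'excellent'; in Python 3 it is 'excellent'.
as_list_text = str([value])

# Printing the string itself, or joining strings, never shows the prefix.
joined = ", ".join([value, u"new"])

print(length)
print(as_list_text)
print(joined)
```

So the u only appears when a container holding the strings is displayed, which is what the panel is doing.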

Whatever you do, don’t use the out socket for data that you intend to process further. Instead, put the data in a proper output, like `a`.
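Roughly, the difference looks like this inside the script. This is a sketch that assumes the component has an output named `a` and that `out` only collects printed text; `columns` is a made-up stand-in for your result:

```python
# Hypothetical result that should be passed on to other components.
columns = [u"excellent", u"good", u"new"]

# print only writes a text rendering of the list; anything reading that
# text gets a string, not the values themselves.
print(columns)

# Assigning to the output variable hands the actual list items on.
a = columns
```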
