Unicode characters to string?

Hello
In this code, the string is converted to Unicode escape sequences.
How can we convert these escape sequences (output) back to a string in the same code?

string input = "rhino";
string output = "";
foreach (char c in input)
{
    int value = (int) c;
    string hex = value.ToString("X4");
    output += string.Format(@"\u{0}", hex);
}
// B: convert output to string "rhino" ?
A = output;
B = "";

Here’s a couple of methods:

// https://discourse.mcneel.com/t/unicode-characters-to-string/153347

string input = str;
string output = "";
foreach (char c in input)
{
    int value = (int) c;
    string hex = value.ToString("X4");
    output += string.Format(@"\u{0}", hex);
}

// https://stackoverflow.com/questions/1615559/convert-a-unicode-string-to-an-escaped-ascii-string
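// Requires: using System.Globalization; using System.Text.RegularExpressions;
string regExResult = Regex.Replace(
    output,
    @"\\u(?<Value>[a-fA-F0-9]{4})", // hex digits only, so int.Parse cannot throw on input like \uZZZZ
    m => ((char) int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());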

// Regex.Unescape converts \uXXXX too, but it also converts other escapes such as \n and \t
string regExUnescape = Regex.Unescape(output);

A = output;
B = regExResult;
C = regExUnescape;
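
For use outside a Grasshopper script component, the same round trip can be sketched as a small standalone helper. This is only a minimal sketch: the class name `UnicodeEscapes` and its two method names are made up for illustration and are not part of the attached definition. It assumes every escape has exactly four hex digits, which is what `ToString("X4")` produces for a `char`.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper (name chosen for illustration only).
public static class UnicodeEscapes
{
    // Encode every UTF-16 code unit of the input as \uXXXX.
    public static string Escape(string input)
    {
        var sb = new StringBuilder(input.Length * 6);
        foreach (char c in input)
        {
            sb.Append(@"\u").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Decode \uXXXX sequences only; any other backslash is left untouched.
    public static string Unescape(string escaped)
    {
        return Regex.Replace(
            escaped,
            @"\\u([0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}
```

With this, `UnicodeEscapes.Escape("rhino")` gives `\u0072\u0068\u0069\u006E\u006F`, and `UnicodeEscapes.Unescape` turns that back into `rhino`. The difference from `Regex.Unescape` is that the regex version only touches `\uXXXX`, so a literal `\t` or `\n` in the text stays as it is.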


unicode_re.gh (8.1 KB)

-Kevin

Thank you very much @kev.r, I appreciate your help.

Hi @kev.r

I was pleasantly surprised to see that this handles Unicode surrogate pairs correctly. I wasn’t expecting it to, so as a test I fed it an Egyptian hieroglyph: 𓉔 (“\uD80C\uDE54”), which reappeared correctly on the other side. Maybe I need to be less cynical about Microsoft.

Jeremy
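
A likely reason the surrogate pair survives: a C# `string` is a sequence of UTF-16 code units, so `foreach (char c in input)` visits the high and low surrogate separately, each gets its own `\uXXXX` escape, and decoding puts the two code units back side by side. The sketch below reproduces the hieroglyph test under that assumption. The class name `SurrogateCheck` is made up for illustration, and the glyph is built from its code point U+13254 so the source file encoding does not matter.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical test helper (name chosen for illustration only).
public static class SurrogateCheck
{
    // Same escaping as in the thread: one \uXXXX per UTF-16 code unit.
    public static string Escape(string input)
    {
        var sb = new StringBuilder();
        foreach (char c in input)
        {
            sb.Append(@"\u").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Escape, then decode again with Regex.Unescape.
    public static string RoundTrip(string input)
    {
        return Regex.Unescape(Escape(input));
    }
}
```

Here `char.ConvertFromUtf32(0x13254)` returns a string of length 2, `SurrogateCheck.Escape` of it gives `\uD80C\uDE54`, and `SurrogateCheck.RoundTrip` returns the original glyph.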