Author: William Gerbert
I have an application that reads data from a server via Winsock. The data is sent
in Unicode format, and I need to parse out the constituent strings and display them
in a ListView. The strings are sent as C strings, so the data looks like this: array of
chars#0array of chars#0array of chars#0#0. Since each 'array of chars' is actually
an array of widechars, it also contains #0 bytes in the high-order byte of the
characters. I tried StringReplace(Intext, #0, '', [rfReplaceAll]); but it does not
convert the data. Maybe it cannot go past the first #0 in the input string?
Answer:
Yes. What you need to do here is work with PWideChars. It would have helped, of
course, to post a bit more specific information, e.g. what the type of Intext is.
Anyway, all you need is a way to get the address of the first widechar in the data.
Assuming intext is a String that is simply used as a raw byte buffer (even though
it contains widechars), the process would look like this:
procedure SplitServerWidecharList(const intext: string; list: TStrings);
var
  p: PWideChar;
begin
  Assert(Assigned(list));
  list.Clear;
  if intext <> '' then
  begin
    p := PWideChar(@intext[1]); {points to first widechar}
    while p^ <> #0000 do
    begin
      {Convert this widestring to Ansi and store it}
      list.Add(WideCharToString(p));
      {Find end of this widestring}
      while p^ <> #0000 do
        Inc(p);
      {Hop to start of the next one}
      Inc(p);
    end;
  end;
end;
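Since the question mentions a ListView, here is a rough sketch of how the result could be displayed. ShowServerStrings is a made-up name, not something from the original post, and the sketch assumes the units Classes and ComCtrls are in the uses clause and that SplitServerWidecharList from above is in scope. Each string simply becomes the caption of one list item.

```pascal
{ Hypothetical helper, not part of the original answer.
  Requires Classes (TStringList) and ComCtrls (TListView) in the uses clause. }
procedure ShowServerStrings(const data: string; view: TListView);
var
  items: TStringList;
  i: Integer;
begin
  items := TStringList.Create;
  try
    { Split the received buffer into its individual strings }
    SplitServerWidecharList(data, items);
    { Suppress repainting while the list view is refilled }
    view.Items.BeginUpdate;
    try
      view.Items.Clear;
      for i := 0 to items.Count - 1 do
        view.Items.Add.Caption := items[i];
    finally
      view.Items.EndUpdate;
    end;
  finally
    items.Free;
  end;
end;
```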
Can you be sure of the byte order of the received Unicode characters? The code above assumes little-endian byte order. If the data arrives in big-endian byte order, you would have to swap the bytes in every widechar before you could process it as above.
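If the bytes do need swapping, something along these lines might do. This is only a sketch: SwapWideCharBytes is a hypothetical helper, not part of the original answer, and it assumes you pass it a pointer to a writable buffer together with the number of widechars in it (for a byte string that would be its length divided by 2).

```pascal
{ Hypothetical helper: swaps the two bytes of each 16-bit widechar in place.
  p must point to a writable buffer holding at least count widechars. }
procedure SwapWideCharBytes(p: PWideChar; count: Integer);
var
  i: Integer;
  w: Word;
begin
  for i := 1 to count do
  begin
    w := Word(p^);
    { Move the high byte down and the low byte up }
    p^ := WideChar(Word((w shr 8) or ((w and $FF) shl 8)));
    Inc(p);
  end;
end;
```

Swapping the bytes of a #0000 terminator leaves it unchanged, so the terminators survive the swap. Note that the swap modifies the buffer, so it should be done on a writable copy of the data rather than on a const parameter.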