How do I make unicode texts show on menus?  
News Group: embarcadero.public.delphi.ide

I am migrating an old BDS2006 application suite to XE5-XE7.
One reason is to make it possible to display unicode texts in the GUI.

We have used a language switching system where the user can switch
language and then all of the new texts are read from a language file
in the selected language. The language files are read as ini files
using the TInifile class and contain text strings for each component
on each form where the form class name is the section.
For European languages the files are plain ASCII text files, but for
Asian languages they are unicode.

Now I hoped that if I migrate to XE5 and select a language like
Japanese or Chinese the texts from the Unicode language files would
show up in the GUI. But for some reason I only get unintelligible
characters on the menus after switching to say Chinese.
Some other items on the GUI do translate and show up as Chinese
characters, but most do not.

I have tested various unicode schemes but it all looks the same.
My development PC is Windows7 x64 Professional fully updated.

With the pre-unicode applications I was able to display the Chinese
texts on Windows XP (US version) after setting Windows locale language
system to Chinese, but this setting is no longer available on Win7....

What could be the reason for this problem?
Are menu texts handled differently from other components?

Date Posted: 7-Jan-2015, at 12:56 AM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
On Wed, 7 Jan 2015 00:56:07 -0800, Bo Berglund wrote:

Closing this thread with a final report:
I have now a fully working XE5 (Unicode) application where the
language files are all UTF-8 and whatever language is switched to it
shows up on the menus with the correct characters (including Chinese,
Trad and Simpl).
What I did was:
- Convert the application to work on XE5
- Modify the language file handling system to use TMemIniFile
- Supplement the ReadString function with a removal of quotes in code

It now works like a charm!
Thanks to all contributors in this thread!
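
The three steps above could be sketched roughly as follows. This is a minimal illustration and not the poster's actual code: `ReadLangString`, the section name `TMainForm`, the key `mnuFile.Caption`, and the `LangFile` variable are hypothetical names, and the sketch assumes Delphi 2009 or later (XE2+ unit scope names) with UTF-8 language files.

```pascal
uses
  System.SysUtils, System.IniFiles;

// Hypothetical helper: reads a value through any TCustomIniFile descendant
// and strips one pair of matching single or double quotes, similar to what
// the PrivateProfile API does for TIniFile.
function ReadLangString(Ini: TCustomIniFile;
  const Section, Ident, Default: string): string;
begin
  Result := Ini.ReadString(Section, Ident, Default);
  if (Length(Result) >= 2) and CharInSet(Result[1], ['"', '''']) and
     (Result[Length(Result)] = Result[1]) then
    // Keep everything between the quotes, including inner whitespace.
    Result := Copy(Result, 2, Length(Result) - 2)
  else
    Result := Trim(Result);
end;

// Usage sketch (LangFile is an assumed variable holding the file path):
//   Ini := TMemIniFile.Create(LangFile, TEncoding.UTF8);
//   try
//     S := ReadLangString(Ini, 'TMainForm', 'mnuFile.Caption', 'File');
//   finally
//     Ini.Free;
//   end;
```

Unlike AnsiDequotedStr(), this helper does not collapse doubled quote characters inside the value; it only removes the outer pair.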

Date Posted: 22-Jan-2015, at 9:55 AM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
Bo wrote:

> The Windows API PrivateProfile calls always remove these quotes
> on reading via the TIniFile class.

That is documented behavior in the API spec:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms724353.aspx

{quote}
If the string associated with lpKeyName is enclosed in single or double quotation 
marks, the marks are discarded when the GetPrivateProfileString function 
retrieves the string.
{quote}

> To get rid of this I had to revert back to TIniFile

Or, you could simply remove the quotes yourself after TMemIniFile.ReadString() 
exits, before using the string value where needed.  Look at the RTL's AnsiExtractQuotedStr() 
and AnsiDequotedStr() functions.  For example:

{code}
S := FIniFile.ReadString(...);
if StartsText(#39, S) or StartsText(#34, S) then
  S := AnsiDequotedStr(S, S[1])
else
  S := Trim(S);
{code}

--
Remy Lebeau (TeamB)

Date Posted: 8-Jan-2015, at 4:45 PM EST
From: Remy Lebeau (TeamB)
 
Re: How do I make unicode texts show on menus?  
On Thu, 8 Jan 2015 12:07:20 -0800, Bo Berglund wrote:

>On Wed, 7 Jan 2015 16:00:32 -0800, Remy Lebeau (TeamB) wrote:
>
>It really does not matter in this case because the language files are
>not written by the application, only read.
>The developer will use a special debug function to create a master
>language file in English and this is the only time a file is written.
>So it is perfectly OK.

Now I have tested TMemIniFile for *reading* the language strings into
the Delphi 2007 version of the application. (I cannot really go to the
unicode version of this application yet because I am working on
finding all strings that need translation; a smaller application has
been converted to XE5, though).

Anyway, I found a big (killer) problem in the implementation:
Our language strings are usually written on this format in the
language file:
[]
.Caption="some text"

The double quotes are used in all texts so that we can store both
leading and trailing whitespace like this:

.Caption=" some text  "

Now, after switching from TIniFile to TMemIniFile as described the
result is that the display everywhere now contains the quotes!
So all menu items are named like "File", "Edit" etc with the quotes
displayed.

The Windows API PrivateProfile calls always remove these quotes on
reading via the TIniFile class. It also removes leading and trailing
whitespace so we cannot start or end language texts with whitespace
unless we use the quotes.

To get rid of this I had to revert back to TIniFile and this brings
back the problem with unicode compliance in the XE5 version of this
app after migration...

Question:
Has this compatibility problem surfaced before and been solved in
Delphi versions between D2007 and XE5?

Date Posted: 8-Jan-2015, at 2:45 PM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
On Wed, 7 Jan 2015 16:00:32 -0800, Remy Lebeau (TeamB) wrote:

>Bo wrote:
>
>> What happens to comments and such in the ini file on disk when
>> the UpdateFile call is made?
>
>TMemIniFile does not support comments.  It parses comments when loading the 
>file but they are discarded, and there is no way to specify new comments 
>when saving the file.
>
Thanks for the clarification!
It really does not matter in this case because the language files are
not written by the application, only read.
The developer will use a special debug function to create a master
language file in English and this is the only time a file is written.
So it is perfectly OK.

I just wanted to know for other potential uses of TMemIniFile.

Date Posted: 8-Jan-2015, at 12:07 PM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
Bo wrote:

> I looked up TMemIniFile and it seems to inherit from TCustomIniFile.
> So I modified the Field type thus:
> 
> FIniFile: TCustomIniFile;
> 
> Then where it is initialized:

> FIniFile := TMemIniFile.Create(FLanguageManager.CurrentLanguageFile);

Keep in mind that TMemIniFile's constructor has an optional Encoding parameter 
in D2009 and later, eg:

{code}
FIniFile := TMemIniFile.Create(FLanguageManager.CurrentLanguageFile, TEncoding.UTF8);
{code}

> Ran a syntax check and it was all OK.

Yes, by design.  By switching to the TCustomIniFile base class, you can use 
any INI implementation class you want (for instance, there is also a TRegistryIniFile 
class in the System.Win.Registry.pas unit which reads/writes data using the 
Registry instead of an .ini file).

> You mention that the encoding is set for the file somewhere, but
> I cannot see that in the D2007 help.

That feature does not exist in D2007.  It was added in D2009, when Delphi 
switched its String type to Unicode.

> Was this not added until after D2009, maybe?

Yes.

> I would like to make the code work across the unicode border and
> TMemIniFile exists already before so this is probably a good change
> notwithstanding.

In D2007 and earlier, TMemIniFile reads/writes AnsiString values as-is, just 
like TIniFile does.  So, for instance, if the file is encoded as UTF-8, TMemIniFile.ReadString() 
(and TIniFile.ReadString()) will return an UTF-8 encoded AnsiString.

That is not the case in D2009 and later.  Strings are always UTF-16 encoded, 
so the file data has to be encoded/decoded accordingly.

> However, the encoding stuff (once found) probably has to be depending
> on compiler version, I guess?

If you are trying to write code that can be compiled in multiple Delphi versions, 
then yes, you will have to use {$IFDEF} statements to wrap newer functionality 
as needed.  For example (assuming a UTF-8 encoded file):

{code}
{$IF CompilerVersion >= 25.0} // $LEGACYIFEND exists only in XE4 (25.0) and later
  {$LEGACYIFEND ON} // prior to XE4, $IF must use $IFEND instead of $ENDIF
{$IFEND}

FIniFile := TMemIniFile.Create(FLanguageManager.CurrentLanguageFile
  {$IF RTLVersion >= 20.0}
  , TEncoding.UTF8
  {$IFEND}
);
....
{$IF RTLVersion >= 20.0}
s := FIniFile.ReadString(...); // reads UTF-8 from file, auto decodes to UTF-16
{$ELSE}
s := FIniFile.ReadString(...); // reads raw UTF-8 bytes from file
// decode while s still holds UTF-8; use whichever result you need:
w := UTF8Decode(s); // manual decode from UTF-8 to UTF-16 WideString
s := Utf8ToAnsi(s); // manual decode from UTF-8 to local ANSI
{$IFEND}
{code}

--
Remy Lebeau (TeamB)

Date Posted: 7-Jan-2015, at 4:16 PM EST
From: Remy Lebeau (TeamB)
 
Re: How do I make unicode texts show on menus?  
Bo wrote:

> What happens to comments and such in the ini file on disk when
> the UpdateFile call is made?

TMemIniFile does not support comments.  It parses comments when loading the 
file but they are discarded, and there is no way to specify new comments 
when saving the file.

TIniFile, using Microsoft's PrivateProfile API, actually parses an existing 
file when saving, thus preserving existing content including comments.  TMemIniFile 
does not do that.  It creates a new file and writes its current memory content 
to that file.
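
A small console sketch can make the difference visible. It assumes Delphi XE2 or later; the file name `demo_comments.ini`, the section `TMainForm`, and the key names are made up for illustration. It writes a file containing a comment, updates it through TMemIniFile, and prints what is left on disk so the result can be inspected.

```pascal
program IniCommentsDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IniFiles;

const
  FileName = 'demo_comments.ini'; // hypothetical scratch file in the current dir

var
  Lines: TStringList;
  Ini: TMemIniFile;
begin
  // Create an INI file that contains a comment line.
  Lines := TStringList.Create;
  try
    Lines.Add('; translator note');
    Lines.Add('[TMainForm]');
    Lines.Add('mnuFile.Caption=File');
    Lines.SaveToFile(FileName);
  finally
    Lines.Free;
  end;

  // Load it into memory, add a value, and write the memory copy back.
  Ini := TMemIniFile.Create(FileName);
  try
    Ini.WriteString('TMainForm', 'mnuEdit.Caption', 'Edit');
    Ini.UpdateFile;
  finally
    Ini.Free;
  end;

  // Print the file as it now exists on disk.
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    Write(Lines.Text);
  finally
    Lines.Free;
  end;
end.
```

If the behavior is as described above, the comment line should be missing from the printed output.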

> Is all of that stuff removed

Yes

> is it actually read into memory when the TMemInifile is created

It is read, but it is not saved in memory.

> so it can be later written together with the other data?

No.

--
Remy Lebeau (TeamB)

Date Posted: 7-Jan-2015, at 4:00 PM EST
From: Remy Lebeau (TeamB)
 
Re: How do I make unicode texts show on menus?  
On Wed, 7 Jan 2015 15:27:24 -0800, Bo Berglund wrote:

Further about the TMemIniFile.UpdateFile method.

From the help in D2007:
-----------------------
{quote}Call UpdateFile to copy INI file data stored in memory to the
copy of the INI file on disk. UpdateFile overwrites all data in the
disk copy of the INI file with the INI file data stored in
memory.{quote} 

So my question now is:
What happens to comments and such in the ini file on disk when the
UpdateFile call is made?

Is all of that stuff removed or is it actually read into memory when
the TMemInifile is created so it can be later written together with
the other data?

Date Posted: 7-Jan-2015, at 3:37 PM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
On Wed, 7 Jan 2015 14:04:42 -0800, Remy Lebeau (TeamB) wrote:

>  a. use TMemIniFile to load the file using that charset/codepage.

I looked up TMemIniFile and it seems to inherit from TCustomIniFile.
So I modified the Field type thus:

  FIniFile: TCustomIniFile;

Then where it is initialized:

  FIniFile := TMemIniFile.Create(FLanguageManager.CurrentLanguageFile);

Ran a syntax check and it was all OK.

But now my question:
You mention that the encoding is set for the file somewhere, but I
cannot see that in the D2007 help.
Was this not added until after D2009, maybe?

I would like to make the code work across the unicode border and
TMemIniFile exists already before so this is probably a good change
notwithstanding.
However, the encoding stuff (once found) probably has to be depending
on compiler version, I guess?

Date Posted: 7-Jan-2015, at 3:27 PM EST
From: Bo Berglund
 
Re: How do I make unicode texts show on menus?  
Bo wrote:

> I have a little problem here, it seems...

It seems you have files in different encodings.  You have to know the encoding 
of a file in order to read it, especially in a Unicode environment like D2009+.  
So you need to either:

1. normalize your files to a single encoding (prefer UTF-8), and then make 
sure your code reads the files using that encoding (TIniFile does not support 
that, but TMemIniFile does).

2. detect which encoding is being used by a file (which is error prone if 
you detect wrong), then read the file using that encoding.  One way would 
be to put the charset/codepage value in the .ini file itself so you can read 
it first, then either:

  a. use TMemIniFile to load the file using that charset/codepage.

  b. skip any T...IniFile class altogether and just use the Ansi PrivateProfile 
API functions directly, and then manually convert the bytes to Unicode using 
TEncoding or UnicodeFromLocaleChars().

> Some of the files are just plain ASCII files (at least they look like
> that when inspecting them). A few are unicode, but with different
> encodings.

Meaning what?  Some are UTF-8 and some are UTF-16?  Unicode is a character 
set, not an encoding.  UTFs are encodings of Unicode.

> For the unicode tests I have tried to convert the ASCII files to
> unicode using UltraEdit's functions. It might or might not have
> worked, I don't know especially in case of the Chinese files...

Chinese cannot be stored as ASCII.

> I was not aware of TMemIniFile before...
> It seems like it does no longer use Windows API functions for reading
> and writing but treats the file properly by itself, right?

Correct.  TMemIniFile is a manual implementation of the INI format, and so 
it has more flexibility than Microsoft's API.

> If I have to specifically tell TMemIniFile what encoding is in use,
> then how can I know beforehand?

You have to detect it beforehand.  There are three options:

1. pick an encoding and stick with it for everything.  This is not going 
to be compatible with existing files, so you will have to convert them.

2. store the encoding somewhere that you can read it later.  Also not likely 
to be compatible with existing files.

3. if you need to read existing files, you will have to analyze their raw 
bytes to detect the encoding used.  Not 100% reliable for non-UTFs.

> What happens if the encoding on read is set to UTF-8 but the file itself
> is just a plain ASCII text file?

ASCII is a subset of UTF-8, so that is perfectly fine.  However, it is the 
non-ASCII non-UTF encodings you have to worry about.

> Conversion of the file encodings is something I am not very good at.
> What would be the best practice to do such a re-encoding?

Well, first you have to know what the original encoding is.  Without that, 
the rest is useless.  But if you can figure out the original encoding correctly, 
converting it is easy.  Plenty of conversion tools are available.  Or just convert 
the files in code using the TStreamReader and TStreamWriter classes.
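
A minimal sketch of such an in-code conversion, assuming Delphi 2009 or later and assuming the source code page is already known. `ConvertFileToUtf8` is a hypothetical helper name, and the code page 1252 and file names in the usage note are only examples.

```pascal
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: re-encodes a text file from a known code page to UTF-8.
procedure ConvertFileToUtf8(const SrcFile, DstFile: string;
  SrcCodePage: Integer);
var
  SrcEnc: TEncoding;
  Reader: TStreamReader;
  Writer: TStreamWriter;
begin
  // GetEncoding returns a new instance that the caller must free.
  SrcEnc := TEncoding.GetEncoding(SrcCodePage);
  try
    // DetectBOM = False: trust the code page that was passed in.
    Reader := TStreamReader.Create(SrcFile, SrcEnc, False);
    try
      // Append = False: create or overwrite the destination file.
      Writer := TStreamWriter.Create(DstFile, False, TEncoding.UTF8);
      try
        while not Reader.EndOfStream do
          Writer.WriteLine(Reader.ReadLine);
      finally
        Writer.Free;
      end;
    finally
      Reader.Free;
    end;
  finally
    SrcEnc.Free;
  end;
end;

// Usage sketch, for a Western European ANSI file (example names only):
//   ConvertFileToUtf8('swedish.lng', 'swedish_utf8.lng', 1252);
```

Write to a different destination file than the source, since both are open at the same time. Depending on the RTL version, TStreamWriter may put a UTF-8 BOM at the start of the output, so check the result if the consumer cannot handle one.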

> I tried to convert ASCII => UTF-8 on my Swedish language file

Swedish is not ASCII-compatible, either.  I think you are confusing ASCII 
(characters 0-127) with ANSI (characters 0-255, where 128-255 are charset/language 
specific).

> and then looked at the binary of the file. No BOM up front! Should there
> not be a BOM if it is UTF-8???

Usually, no.  In fact, the Unicode standard neither requires nor recommends 
a BOM for UTF-8 encoded files.  Many apps (especially legacy apps) that 
read text files do not handle a UTF-8 BOM.

> If not, then how can one know if it is in UTF-8???

If there is no BOM, you have to analyze the raw bytes.  UTF-8 is very easy 
to detect, as it uses a very distinct bit pattern (by design).
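
As a rough illustration of that bit-pattern check, the sketch below tests whether a byte buffer is structurally well-formed UTF-8. `LooksLikeUtf8` is a hypothetical name, and the check is deliberately simplified: it validates lead and continuation bytes only and does not reject overlong forms or surrogate code points.

```pascal
uses
  System.SysUtils;

// Hypothetical helper: True if every byte sequence in Data follows the
// UTF-8 lead-byte/continuation-byte pattern. Pure ASCII also returns True.
function LooksLikeUtf8(const Data: TBytes): Boolean;
var
  I, Trail: Integer;
  B: Byte;
begin
  Result := False;
  I := 0;
  while I < Length(Data) do
  begin
    B := Data[I];
    if B < $80 then
      Trail := 0                      // 0xxxxxxx: single-byte (ASCII)
    else if (B and $E0) = $C0 then
      Trail := 1                      // 110xxxxx: one continuation byte
    else if (B and $F0) = $E0 then
      Trail := 2                      // 1110xxxx: two continuation bytes
    else if (B and $F8) = $F0 then
      Trail := 3                      // 11110xxx: three continuation bytes
    else
      Exit;                           // stray continuation or invalid lead byte
    Inc(I);
    while Trail > 0 do
    begin
      // Each continuation byte must exist and match 10xxxxxx.
      if (I >= Length(Data)) or ((Data[I] and $C0) <> $80) then
        Exit;
      Inc(I);
      Dec(Trail);
    end;
  end;
  Result := True;
end;
```

The raw bytes could be loaded with TFile.ReadAllBytes() from System.IOUtils before calling the function. A file in a legacy ANSI code page that contains non-ASCII characters will usually fail this check, because its high bytes rarely form valid sequences by chance.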

--
Remy Lebeau (TeamB)

Date Posted: 7-Jan-2015, at 2:04 PM EST
From: Remy Lebeau (TeamB)