Character encoding problem: ARCHICAD exports the list in UTF-8, GDL DATA I/O reads the list as ANSI
2021-06-09 01:03 AM - last edited on 2021-09-14 09:07 AM by Noemi Balogh
I have modified the "Pen and Colors" object (included in the German/Austrian library) so that it can also display the pen description and the pen thickness in a graphic table. As there is no way to get the description and the thickness directly, I export the pen-set configuration as a list (csv/txt) from the Attribute Manager and import the values into the GDL object using DATA I/O. So far everything works as it should...
BUT: in German we have special characters like ä, ö, ü, ß...

ARCHICAD exports the pen-set configuration encoded in UTF-8, but the DATA I/O GDL add-on reads txt files as ANSI (Windows-1252). So the imported values are displayed in the object with "funny" characters instead of Ä, Ü, Ö...
At the moment I convert the character encoding with Notepad++, but this is not usable for "my users" (architects, not developers).

Is there a way to define the character encoding that the DATA I/O add-on should use when interpreting the list (txt/csv)?
Or is it possible to define the encoding of the exported txt file in the Attribute Manager?
...or do you know another way to get the characters displayed correctly in the object?
I hope you can help me solve this issue. Perhaps someone from GRAPHISOFT also reads this post, and in the future ARCHICAD will use the same character encoding for reading/writing or importing/exporting txt/ASCII files...


Thank you for your help,
best regards from Vienna,
Yours, Klaus
Labels: Library (GDL)
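To make the mismatch described above concrete, here is a minimal Python sketch (Python is only a stand-in here, it is not part of ARCHICAD or GDL). It assumes the export really is UTF-8 and the reader really decodes it as Windows-1252, as the post describes; the helper name misread_as_cp1252 is made up for illustration.

```python
# Characters mentioned in the original post.
SAMPLES = ["Ä", "Ö", "Ü", "ä", "ö", "ü", "ß"]


def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode the same bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


if __name__ == "__main__":
    for ch in SAMPLES:
        # Each umlaut is two bytes in UTF-8, so it shows up as two characters.
        print(ch, "->", misread_as_cp1252(ch))
```

Each special character becomes two unrelated characters, which would match the "funny" characters seen in the object.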

2021-06-09 01:26 AM
2021-06-09 02:01 AM
Thank you for your fast reply.
Please let me check whether I have understood you correctly...
1) You read/import the value from the txt list file using TEXT I/O or DATA I/O.
2) In the value of the new parameter/variable you search for "special characters" and replace them with ASCII characters (decimal code below 128), e.g. "Ä" becomes "Ae", using a string replacement operation. (I do not know the right command/syntax for this; perhaps you can help me.)
3) In the object output you use the changed string, which contains only ASCII characters below 128; these map to the same character in (nearly) all character encodings...
...Have I understood your solution correctly?
Thank you for your help,
Yours, Klaus
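The replacement idea in steps 1) to 3) can be sketched as follows. This is Python, not GDL, so it only illustrates the logic and does not answer the question about the GDL syntax. The mapping table and the sample pen name are assumptions for illustration, and the sketch operates on text that has already been decoded correctly.

```python
# Hypothetical mapping from German special characters to ASCII digraphs.
REPLACEMENTS = {
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ä": "ae", "ö": "oe", "ü": "ue",
    "ß": "ss",
}

# str.maketrans accepts a dict of single characters to replacement strings.
TABLE = str.maketrans(REPLACEMENTS)


def to_ascii(text: str) -> str:
    """Replace each mapped special character; leave everything else as-is."""
    return text.translate(TABLE)


if __name__ == "__main__":
    # Made-up pen description, not taken from a real pen set.
    print(to_ascii("Wände außen"))
```

Characters outside the table pass through unchanged, so the result is pure ASCII only if every special character in the list is covered by the mapping.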

2021-06-09 10:12 AM
In your case you already have a txt file, generated by another part of the program (pen set names), and you just want to read it. Probably the only way to make it work is to use search and replace in a text editor such as Notepad and change all umlauts to something else.
I can share my coder-decoder scripts for Hebrew with you. They are not exactly what you need, but they may help you better understand the principle of modifying texts with GDL commands.

2021-06-11 09:45 AM
It is a much newer GDL Add-On so it might support UTF-8.

2021-06-14 03:14 PM
The opened file is read as UTF-8 only if it has a BOM (for backwards compatibility).
Maybe you could give your users a simple bat/command file that adds the BOM.
Software Engineer, Library
Graphisoft SE, Budapest
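As a rough illustration of "adding the BOM", here is a small Python sketch instead of a bat file. It assumes the input file is already valid UTF-8 (as the ARCHICAD export is said to be); the function names and the file name in the usage line are made up. Reports in this thread differ on whether a BOM is enough in every setup, so treat it as something to try, not a guaranteed fix.

```python
import codecs
from pathlib import Path


def with_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM in front, unless it already has one."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def add_bom_to_file(path: str) -> None:
    """Rewrite the file in place so that it starts with the UTF-8 BOM."""
    target = Path(path)
    target.write_bytes(with_bom(target.read_bytes()))
```

Usage would be something like add_bom_to_file("pens.txt"), where pens.txt stands for the exported list. Running it twice is harmless because an existing BOM is detected.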
2021-06-18 02:07 AM


Well, I have done a lot of testing, scripting and googling.
What I found out:
I do not need to change e.g. "ü" to "ue": everything works fine when I use the Notepad++ menu command "Convert to ANSI"; all the special characters are then displayed correctly...
When I convert the file to "UTF-8 BOM", it does not work: I get the "funny chars" in the GDL object...
I have not managed to create a batch/cmd file that converts the txt file, although I did a lot of googling and testing and am not a newbie at scripting cmd/bat files.



...so at the moment my only choices are extra conversion work in Notepad++ or "funny chars" displayed in the object...
Perhaps some of you have a better idea than losing this war against the character-encoding possibilities.


I am really glad for every hint any of you can give me...
Best regards from (really hot) Vienna,

Yours, Klaus
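What the Notepad++ "Convert to ANSI" step does can be approximated in a few lines of Python. This is only a sketch under two assumptions: the export is UTF-8 (with or without BOM), and "ANSI" means Windows-1252 on the machines in question. The function name utf8_to_ansi is made up.

```python
def utf8_to_ansi(data: bytes) -> bytes:
    """Decode UTF-8 bytes (an optional BOM is dropped) and re-encode as Windows-1252.

    Characters that do not exist in Windows-1252 are replaced by '?'.
    """
    text = data.decode("utf-8-sig")
    return text.encode("cp1252", errors="replace")


if __name__ == "__main__":
    original = "Tür".encode("utf-8")   # 4 bytes in UTF-8
    converted = utf8_to_ansi(original)  # 3 bytes in Windows-1252
    print(len(original), len(converted))
```

The umlauts listed in the first post all exist in Windows-1252, so for German pen names the conversion should be lossless; characters outside that code page would be lost.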
2021-06-18 11:42 AM
For me, "UTF-8 with BOM" works just fine.
It is important that the text file is saved with the correct encoding. In my case I write to an existing, well-formatted file.
VS Code: create a new file and click "UTF-8" in the lower right corner. A dropdown then appears; choose "Save with Encoding" and select "UTF-8 with BOM". Then refer to this file in Archicad or load it into the embedded library.
HTH

2021-06-18 12:53 PM
catch17 wrote:First result on Google.
well, I hve done a lot of testing, scripting and googeling😉 in the meantime...
You can't do it with a BAT file. Use PowerShell instead.
This snippet does indeed set the BOM when converting from cp1252 to UTF-8 (in Windows PowerShell 5.1; PowerShell 7 needs -Encoding utf8BOM to write a BOM):
    Get-Content .\test.txt | Set-Content -Encoding utf8 test-utf8.txt
With a loop you can even convert a whole folder.
Greetings from hot Berlin to Vienna, Klaus
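The "loop over a whole folder" idea could look roughly like this in Python, for anyone who prefers it to PowerShell. It is a sketch, not a tested deployment script: the function name convert_folder, the *.txt pattern and the default source encoding are assumptions. For files that ARCHICAD exported as UTF-8 without BOM, pass source_encoding="utf-8-sig" instead of the cp1252 default.

```python
from pathlib import Path


def convert_folder(folder, pattern="*.txt", source_encoding="cp1252"):
    """Rewrite every matching file in place as UTF-8 with BOM.

    Returns the names of the converted files in sorted order.
    """
    converted = []
    for path in sorted(Path(folder).glob(pattern)):
        # Work on raw bytes so line endings are left untouched.
        text = path.read_bytes().decode(source_encoding)
        path.write_bytes(text.encode("utf-8-sig"))
        converted.append(path.name)
    return converted
```

With the cp1252 default, running it twice on the same folder would garble the files, because the second run would misread the UTF-8 output as cp1252. With source_encoding="utf-8-sig" a second run leaves the files unchanged.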