Archicad C++ API
About Archicad add-on development using the C++ API.

How to convert wstring to UniString on macOS

Anonymous
Not applicable
I need to use Unicode strings in my add-on, but I am running into a problem.
For example, given the following code:
const GS::UniString a = L"xxx";
or
DGTreeViewSetItemText(dialogID, itemID, CS, L"xxx");
This compiles without problems on Windows,
but on macOS it fails with the error: no viable conversion from 'const wchar_t[4]' to 'GS::UniString'
Thanks for your help!
Ralph Wessel
Mentor
This is an issue of text encoding. wchar_t is a data type rather than an encoding, and the result may be implementation-specific (depending on the compiler and platform). Try one of the following:

1) If this is text that is meant to be user-readable, it would be better to store it in a string resource rather than hard-coding it. Then the encoding can be dealt with by ARCHICAD and your add-on will be far easier to localise.

2) Use a prefix to specify the encoding, e.g. for utf-8:
GS::UniString a(u8"xxx", CC_UTF8);
Ralph Wessel BArch
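
To illustrate the point about wchar_t being a data type rather than an encoding, here is a minimal standard C++ sketch (no Archicad headers involved). Utf8Bytes and WideUnitSize are hypothetical helper names introduced only for this illustration. The sketch shows that a u8 literal always yields the same UTF-8 byte sequence, while the size of a wchar_t code unit depends on the compiler and platform.

```cpp
#include <cstddef>
#include <string>

// Returns the raw bytes of a u8 (or narrow) string literal as a std::string.
// From C++20 on, u8 literals have type const char8_t[], so the cast is needed;
// in earlier standards the literal is already const char[] and the cast is a no-op.
template <typename Char, std::size_t N>
std::string Utf8Bytes(const Char (&literal)[N])
{
    static_assert(sizeof(Char) == 1, "expects a u8 or narrow string literal");
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}

// Size in bytes of one wchar_t code unit on the current platform
// (typically 2 with MSVC on Windows and 4 with Clang/GCC on macOS and Linux).
constexpr std::size_t WideUnitSize()
{
    return sizeof(wchar_t);
}
```

Utf8Bytes(u8"\u00E9") gives the two bytes 0xC3 0xA9 on every platform, whereas L"\u00E9" occupies WideUnitSize() bytes per code unit, which is why a wide literal cannot be treated as one portable encoding.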
Anonymous
Not applicable
Ralph is generally correct. UniString is implemented differently on Mac: it stores characters in UTF-8 encoding on Mac (e.g. 1 byte per character). On Windows it uses wchar_t, which is 2 bytes per character. Thus const GS::UniString a = L"xxx"; will compile on Windows but will fail on Mac. Graphisoft changed the implementation of UniString somewhere between AC15 and AC21. In older versions it used wchar_t.

You may use u8"my string" literals on both platforms, or you may convert your L"my string" constants into the appropriate format for each platform.

I'd implement it like this:

#ifndef SL

#include <stdexcept>
#include <string>
#include "ConvertUTF.h"	// assumed header name: Unicode, Inc. reference converter (UTF8/UTF16/UTF32, ConvertUTF16toUTF8, ConvertUTF32toUTF8)

// Converts a wide string to UTF-8, treating wchar_t as UTF-16 or UTF-32 depending on its size.
// Declared inline so the header can be included from several source files.
inline std::string ToUtf8(const std::wstring& widestring)
{
	size_t widesize = widestring.length();

	if (sizeof(wchar_t) == 2)
	{
		// A UTF-16 code unit needs at most 3 UTF-8 bytes; +1 for the terminator.
		size_t utf8size = 3 * widesize + 1;
		char* utf8stringnative = new char[utf8size];
		const UTF16* sourcestart = reinterpret_cast<const UTF16*>(widestring.c_str());
		const UTF16* sourceend = sourcestart + widesize;
		UTF8* targetstart = reinterpret_cast<UTF8*>(utf8stringnative);
		UTF8* targetend = targetstart + utf8size;
		ConversionResult res = ConvertUTF16toUTF8(&sourcestart, sourceend, &targetstart, targetend, strictConversion);
		if (res != conversionOK)
		{
			delete [] utf8stringnative;
			// std::exception has no string constructor outside MSVC, so use std::runtime_error.
			throw std::runtime_error("UTF-16 to UTF-8 conversion failed");
		}
		*targetstart = 0;
		std::string resultstring(utf8stringnative);
		delete [] utf8stringnative;
		return resultstring;
	}
	else if (sizeof(wchar_t) == 4)
	{
		// A UTF-32 code unit needs at most 4 UTF-8 bytes; +1 for the terminator.
		size_t utf8size = 4 * widesize + 1;
		char* utf8stringnative = new char[utf8size];
		const UTF32* sourcestart = reinterpret_cast<const UTF32*>(widestring.c_str());
		const UTF32* sourceend = sourcestart + widesize;
		UTF8* targetstart = reinterpret_cast<UTF8*>(utf8stringnative);
		UTF8* targetend = targetstart + utf8size;
		ConversionResult res = ConvertUTF32toUTF8(&sourcestart, sourceend, &targetstart, targetend, strictConversion);
		if (res != conversionOK)
		{
			delete [] utf8stringnative;
			throw std::runtime_error("UTF-32 to UTF-8 conversion failed");
		}
		*targetstart = 0;
		std::string resultstring(utf8stringnative);
		delete [] utf8stringnative;
		return resultstring;
	}
	else
	{
		throw std::runtime_error("Unsupported wchar_t size");
	}
}

#ifdef _WIN32
#define SL(str) str
#else
#define SL(str) ToUtf8(str).c_str()
#endif
#endif

const GS::UniString a = SL(L"xxx");  // This will compile well now.
You can move the code into a separate header file and include it wherever you want to initialize a UniString from a string literal. Just leave out the last line (the one with the comment).

Like this:

#include "sl.h"

const GS::UniString a = SL(L"xxx");  // This will compile well now.
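
For readers who want to see what such a conversion involves without depending on the ConvertUTF sources, below is a minimal dependency-free sketch in standard C++. WideToUtf8 and AppendUtf8 are hypothetical names introduced for this illustration and are not part of the Archicad API. As in the code above, it assumes wchar_t holds UTF-16 when it is 2 bytes wide and UTF-32 otherwise, which is an assumption about the platform rather than a guarantee of the language.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Appends one Unicode code point to `out`, encoded as UTF-8.
static void AppendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::runtime_error("invalid code point");
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a wide string to UTF-8.
// Assumption: wchar_t is UTF-16 when 2 bytes wide, UTF-32 otherwise.
std::string WideToUtf8(const std::wstring& wide)
{
    std::string out;
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the following low surrogate.
            if (i + 1 >= wide.size())
                throw std::runtime_error("truncated surrogate pair");
            std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::runtime_error("invalid surrogate pair");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        }
        AppendUtf8(out, cp);
    }
    return out;
}
```

The result is a UTF-8 byte string, so it would still need to be passed to UniString together with an explicit UTF-8 charset, such as the CC_UTF8 argument shown earlier in this thread.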
Ralph Wessel
Mentor
chebum wrote:
UniString is implemented differently on Mac: it stores characters in UTF-8 encoding on Mac (e.g. 1 byte per character). On Windows it uses wchar_t, which is 2 bytes per character. Thus const GS::UniString a = L"xxx"; will compile on Windows but will fail on Mac. Graphisoft changed the implementation of UniString somewhere between AC15 and AC21. In older versions it used wchar_t.
A few observations:

1) All text encoding is UTF-8 from AC21. From the documentation: "All C strings (char *, char []) are now UTF-8 encoded."

2) Text encoding is not the same as the number of bytes consumed by a character. Many different encodings will use the same number of bytes, but are not equivalent. Constructing a UniString from wchar_t only works on Windows because it's commonly assumed to be UTF-16. It's a bad idea if you intend to write cross-platform code.

3) UTF-8 doesn't have a fixed number of bytes per character. The only thing that can be said in relation to 8-bit encodings is that old string methods assuming fixed 8-bit characters with null termination won't crash on UTF-8 text.
chebum wrote:

#include "sl.h"

const GS::UniString a = SL(L"xxx");  // This will compile well now.
I'd strongly recommend using the mechanisms provided by GS. Writing your own methods for solved problems just wastes time and opens you up to more bugs and future migration problems.
Ralph Wessel BArch

Ralph, your expression gives me an "Excess elements in scalar initializer" error.

Given that I am not familiar with ACAPI expressions, can you tell me what I am doing wrong?
Thank you

Nic