2014-10-21 07:44 AM - last edited on 2023-08-01 02:05 PM by Doreena Deng
typedef unsigned short uchar_t; // 2-byte unicode character (UTF16)
Why?????? This is so confusing.
2014-10-21 11:51 PM
ReignBough wrote:
When I read a uchar_t data type, I thought it was an unsigned char type. But when I looked it up in Definitions.hpp, I found out that it is:
typedef unsigned short uchar_t; // 2-byte unicode character (UTF16)
Why?????? This is so confusing.

I'm not sure why this is confusing, but I assume you've not worked with any text encoding other than ASCII? Early text encodings of this type used only 1 byte (8 bits) per character, so the types char or unsigned char equated to a single character. However, this isn't suitable for many languages that have far more than 256 characters.
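To make the "more than 256 characters" point concrete, here is a minimal sketch. It reuses the typedef quoted above; the helper names fits_in_unsigned_char, truncate_to_unsigned_char and check_examples are made up for illustration and are not part of the API. It assumes a platform where unsigned char is 8 bits wide.

```cpp
#include <cassert>
#include <climits>

typedef unsigned short uchar_t; // the typedef quoted in this thread

// The C++ standard guarantees unsigned short holds at least 16 bits.
static_assert(sizeof(uchar_t) * CHAR_BIT >= 16,
              "uchar_t must be able to hold a 16-bit code unit");

// Hypothetical helper: true if the code unit would survive being stored
// in a single unsigned char.
inline bool fits_in_unsigned_char(uchar_t unit) {
    return unit <= UCHAR_MAX;
}

// Hypothetical helper: what is left after squeezing a code unit into one
// unsigned char (only the low bits are kept).
inline unsigned char truncate_to_unsigned_char(uchar_t unit) {
    return static_cast<unsigned char>(unit);
}

inline void check_examples() {
    assert(fits_in_unsigned_char(0x0041));   // 'A' (U+0041) fits in one byte
    assert(!fits_in_unsigned_char(0x03A9));  // Greek capital omega (U+03A9) does not
    // Assumption: CHAR_BIT == 8, so only the low byte 0xA9 remains.
    assert(truncate_to_unsigned_char(0x03A9) == 0xA9);
}
```

Calling check_examples() from your own code runs the three checks. The last one shows why a 1-byte type silently loses information for anything outside the first 256 code points.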
2014-11-11 09:31 AM
2014-11-14 03:10 PM
ReignBough wrote:
Well, when we are coding and we want to specify that a character is 16 bits, we use/define wchar / WCHAR (wide char, exactly 16 bits, based on wchar_t) or char16 / CHAR16 (at least 16 bits). We use uchar / UCHAR for unsigned char (and char / CHAR for signed char) for exactly 8 bits and char8 / CHAR8 for at least 8 bits.

None of these definitions are standards-based, so it's really a matter of semantics. You could also read uchar_t as "unicode character type".
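For comparison, standard C++ (C++11 and later) does have types with defined widths. The sketch below is not how the API defines anything. It only shows what the language itself guarantees, plus a made-up helper, to_uchar_units, for copying a standard UTF-16 string into the uchar_t typedef from this thread. One caveat on the naming scheme above: the width of wchar_t is implementation-defined (16 bits on Windows, commonly 32 bits on Linux and macOS), so the sketch deliberately makes no assertion about it.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <string>
#include <vector>

typedef unsigned short uchar_t; // the typedef quoted in this thread

// char16_t is a distinct type that is at least 16 bits wide.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t is at least 16 bits");

// std::uint16_t is exactly 16 bits wide wherever it exists.
static_assert(sizeof(std::uint16_t) * CHAR_BIT == 16, "uint16_t is exactly 16 bits");

// Hypothetical helper: copy the UTF-16 code units of a std::u16string into
// uchar_t units. No re-encoding happens; each unit is copied as-is.
inline std::vector<uchar_t> to_uchar_units(const std::u16string& text) {
    std::vector<uchar_t> units;
    units.reserve(text.size());
    for (char16_t unit : text) {
        units.push_back(static_cast<uchar_t>(unit));
    }
    return units;
}

inline void check_conversion() {
    // "A" followed by Greek capital omega: two UTF-16 code units.
    const std::vector<uchar_t> units = to_uchar_units(u"A\u03A9");
    assert(units.size() == 2);
    assert(units[0] == 0x0041);
    assert(units[1] == 0x03A9);
}
```

So whatever a project calls its 16-bit character type, the name is only a convention. The underlying width is what matters when passing strings to the API.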