Category Archive: Unicode

Jun 02 2011

Display Unicode Characters on the Windows Console

Even in today’s mostly-Unicode world on Windows, the console (i.e. cmd.exe) still defaults to using OEM code pages (i.e. multibyte characters). To set the console to Unicode mode, use the following code:

    #include <fcntl.h>
    #include <io.h>
    #include <stdio.h>

    int main(void)
    {
        /* Switch stdout to UTF-16 text mode so wide-character output reaches the console intact. */
        _setmode(_fileno(stdout), _O_U16TEXT);
        wprintf(L"\x043a\x043e\x0448\x043a\x0430 \x65e5\x672c\x56fd\n");
        return 0;
    }

This information came from two great articles …


Jun 02 2011

Always prefix a Unicode plain text file with a byte order mark

This comes from the MSDN page entitled “Using Byte Order Marks”.

    Byte order mark    Description
    EF BB BF           UTF-8
    FF FE              UTF-16, little endian
    FE FF              UTF-16, big endian
    FF FE 00 00        UTF-32, little endian
    00 00 FE FF        UTF-32, big endian
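To show how a reader of such a file might use the table, here is a minimal sketch in C that matches the first bytes of a buffer against the five byte order marks listed above. The function name detect_bom, its signature, and the label strings it returns are hypothetical and chosen for illustration; they do not come from the MSDN page.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the MSDN page): returns a label for the
   byte order mark at the start of buf, or "none" if none of the marks in
   the table is present. If bom_len is non-NULL, it receives the length of
   the mark in bytes (0 when there is none). */
const char *detect_bom(const unsigned char *buf, size_t len, size_t *bom_len)
{
    static const struct {
        const char *name;        /* illustrative label, an assumption */
        size_t size;             /* number of bytes in the mark */
        unsigned char bytes[4];
    } boms[] = {
        /* UTF-32 LE is tested before UTF-16 LE because FF FE 00 00
           begins with the UTF-16 LE mark FF FE. */
        { "UTF-32LE", 4, { 0xFF, 0xFE, 0x00, 0x00 } },
        { "UTF-32BE", 4, { 0x00, 0x00, 0xFE, 0xFF } },
        { "UTF-8",    3, { 0xEF, 0xBB, 0xBF, 0x00 } },
        { "UTF-16LE", 2, { 0xFF, 0xFE, 0x00, 0x00 } },
        { "UTF-16BE", 2, { 0xFE, 0xFF, 0x00, 0x00 } },
    };
    size_t i;

    for (i = 0; i < sizeof boms / sizeof boms[0]; i++) {
        if (len >= boms[i].size &&
            memcmp(buf, boms[i].bytes, boms[i].size) == 0) {
            if (bom_len != NULL)
                *bom_len = boms[i].size;
            return boms[i].name;
        }
    }

    if (bom_len != NULL)
        *bom_len = 0;
    return "none";
}
```

A caller would read the first four bytes of the file, pass them to detect_bom, and skip bom_len bytes before decoding. The check order matters: a UTF-16 little endian file whose first character is U+0000 produces the same four bytes as the UTF-32 little endian mark, so this sketch reports such a file as UTF-32.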