Category: Unicode

Display Unicode Characters on the Windows Console

Even in today’s mostly-Unicode world on Windows, the console (i.e. cmd.exe) still defaults to OEM code pages (i.e. multibyte rather than wide characters). To set the console to Unicode mode, use the following code:
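The snippet itself is not reproduced in this excerpt. What follows is a minimal sketch, assuming the technique is the CRT call `_setmode` with the `_O_U16TEXT` flag that the article title cited below refers to. The function name `enable_unicode_console` is hypothetical, and the non-Windows branch exists only so the sketch compiles elsewhere.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode, _fileno
#endif

// Hypothetical helper: switches stdout to UTF-16 text mode on Windows.
// Returns true if the mode was changed, false on failure or on other platforms.
bool enable_unicode_console() {
#ifdef _WIN32
    // _setmode returns the previous mode on success and -1 on failure.
    return _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    // Assumption: nothing to do outside Windows; this sketch does not touch stdout there.
    return false;
#endif
}
```

After the call succeeds, write to stdout with the wide-character functions only, for example `std::wprintf(L"\x03b1\x03b2\x03b3\n")`; mixing in narrow `printf` calls on a stream in this mode is an error in the Microsoft CRT. Whether the glyphs actually appear also depends on the console font.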

This information came from two great articles by Michael Kaplan: “Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT?” Anyone who …


Always prefix a Unicode plain text file with a byte order mark

This comes from the MSDN page entitled “Using Byte Order Marks”.

Byte order mark    Description
EF BB BF           UTF-8
FF FE              UTF-16, little endian
FE FF              UTF-16, big endian
FF FE 00 00        UTF-32, little endian
00 00 FE FF        UTF-32, big endian
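To show how a reader of such a file might use these marks, here is a minimal sketch that maps the leading bytes of a buffer to the descriptions listed above. The function name `detect_bom` and its return strings are hypothetical choices for illustration and are not taken from the MSDN page.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical helper: returns the encoding implied by a leading byte order
// mark, or an empty string if the buffer does not start with a known mark.
std::string detect_bom(const unsigned char* data, std::size_t size) {
    // True when the buffer begins with exactly the given byte sequence.
    auto starts_with = [&](std::initializer_list<unsigned char> bom) {
        if (size < bom.size()) return false;
        std::size_t i = 0;
        for (unsigned char b : bom) {
            if (data[i++] != b) return false;
        }
        return true;
    };

    // UTF-32 little endian is tested before UTF-16 little endian because
    // both marks begin with FF FE.
    if (starts_with({0xFF, 0xFE, 0x00, 0x00})) return "UTF-32, little endian";
    if (starts_with({0x00, 0x00, 0xFE, 0xFF})) return "UTF-32, big endian";
    if (starts_with({0xEF, 0xBB, 0xBF})) return "UTF-8";
    if (starts_with({0xFF, 0xFE})) return "UTF-16, little endian";
    if (starts_with({0xFE, 0xFF})) return "UTF-16, big endian";
    return "";
}
```

The order of the checks is a design choice: a UTF-16 little-endian file whose first character is U+0000 would also begin with FF FE 00 00, so this sketch assumes such files do not occur and prefers the longer mark.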