@LB--
Last active July 17, 2018 00:33
LB's Windows Unicode Helper Functions - convert between UTF-8 and UTF-16/Unicode on Windows
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org>
#include <iterator>   //std::cbegin, std::cend
#include <limits>     //std::numeric_limits
#include <stdexcept>
#include <string>
#include <string_view>
#include <Windows.h>
namespace LB::winunicode
{
/**
    Given a UTF-8 narrow string, returns a UTF-16 wide string for use with Windows API functions.
    If you use `true` for the template parameter, no exception is thrown when an error occurs.
    Instead, each character in the source string is mapped directly to a character in the result string,
    which is incorrect behavior, but will likely be better than nothing.
*/
template<bool ignore_errors = false>
auto wide_from_narrow(std::string_view const &s)
-> std::wstring
{
    if(s.empty())
    {
        //empty string requires no calls to Windows API
        return {};
    }
    if constexpr(!ignore_errors)
    {
        if(s.size() > static_cast<std::string_view::size_type>((std::numeric_limits<int>::max)()))
        {
            //Windows takes the length as an int, so a larger size cannot be passed without truncation
            throw std::runtime_error{"String is too large to pass to Windows API"};
        }
    }
    //ask Windows how much space we need to allocate in the buffer
    auto const len = MultiByteToWideChar(CP_UTF8, 0, s.data(), static_cast<int>(s.size()), NULL, 0);
    if(len <= 0) //this indicates an error condition
    {
        //the same error handling is repeated after the second call below
        if constexpr(ignore_errors)
        {
            return {std::cbegin(s), std::cend(s)};
        }
        else if constexpr(true)
        {
            //TODO: call GetLastError and find out what the error was
            throw std::runtime_error{"Cannot widen utf8 string: "+std::string{s}};
        }
    }
    //allocate an appropriately sized buffer in an exception-safe container
    std::wstring buf (static_cast<std::wstring::size_type>(len), L'\0');
    //ask Windows to actually perform the conversion now
    auto const result = MultiByteToWideChar(CP_UTF8, 0, s.data(), static_cast<int>(s.size()), &buf.front(), len);
    if(result <= 0) //this indicates an error condition
    {
        //same error handling as after the first call above
        if constexpr(ignore_errors)
        {
            return {std::cbegin(s), std::cend(s)};
        }
        else if constexpr(true)
        {
            throw std::runtime_error{"Cannot widen utf8 string: "+std::string{s}};
        }
    }
    //return the correctly converted string
    return buf;
}
/**
    Given a UTF-16 wide string received from a Windows API function, returns a UTF-8 narrow string.
    If you use `true` for the template parameter, no exception is thrown when an error occurs.
    Instead, each character in the source string is mapped directly to a character in the result string,
    which is incorrect behavior, but will likely be better than nothing.
*/
template<bool ignore_errors = false>
auto narrow_from_wide(std::wstring_view const &s)
-> std::string
{
    if(s.empty())
    {
        //empty string requires no calls to Windows API
        return {};
    }
    if constexpr(!ignore_errors)
    {
        if(s.size() > static_cast<std::wstring_view::size_type>((std::numeric_limits<int>::max)()))
        {
            //Windows takes the length as an int, so a larger size cannot be passed without truncation
            throw std::runtime_error{"String is too large to pass to Windows API"};
        }
    }
    //ask Windows how much space we need to allocate in the buffer
    auto const len = WideCharToMultiByte(CP_UTF8, 0, s.data(), static_cast<int>(s.size()), NULL, 0, NULL, NULL);
    if(len <= 0) //this indicates an error condition
    {
        //the same error handling is repeated after the second call below
        if constexpr(ignore_errors)
        {
            std::string ret (s.size(), '\0');
            for(std::size_t i = 0; i < ret.size(); ++i)
            {
                ret[i] = static_cast<std::string::value_type>(s[i]);
            }
            return ret;
        }
        else if constexpr(true)
        {
            //TODO: call GetLastError and find out what the error was
            throw std::runtime_error{"Couldn't narrow wide string"};
        }
    }
    //allocate an appropriately sized buffer in an exception-safe container
    std::string buf (static_cast<std::string::size_type>(len), '\0');
    //ask Windows to actually perform the conversion now
    auto const result = WideCharToMultiByte(CP_UTF8, 0, s.data(), static_cast<int>(s.size()), &buf.front(), len, NULL, NULL);
    if(result <= 0) //this indicates an error condition
    {
        //same error handling as after the first call above
        if constexpr(ignore_errors)
        {
            std::string ret (s.size(), '\0');
            for(std::size_t i = 0; i < ret.size(); ++i)
            {
                ret[i] = static_cast<std::string::value_type>(s[i]);
            }
            return ret;
        }
        else if constexpr(true)
        {
            throw std::runtime_error{"Couldn't narrow wide string"};
        }
    }
    //return the correctly converted string
    return buf;
}
}
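
The doc comments above describe the `ignore_errors` fallback as "incorrect behavior". As a rough illustration of why, the portable sketch below reproduces only the fallback paths, without any Windows API calls. The names `fallback_widen` and `fallback_narrow` are hypothetical and introduced here for illustration; they are not part of the header.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the header above): mirrors the fallback in
// wide_from_narrow<true>, copying each UTF-8 code unit into its own wide element.
inline std::wstring fallback_widen(std::string_view s)
{
    return std::wstring(s.begin(), s.end());
}

// Hypothetical helper (not part of the header above): mirrors the fallback in
// narrow_from_wide<true>, converting each wide element to a single char.
inline std::string fallback_narrow(std::wstring_view s)
{
    std::string ret(s.size(), '\0');
    for(std::size_t i = 0; i < ret.size(); ++i)
    {
        // Values that do not fit in a char lose their upper bits here
        // (assumption: the compiler converts modulo 2^8, as common compilers do).
        ret[i] = static_cast<char>(s[i]);
    }
    return ret;
}

// "\xC3\xA9" is the two-byte UTF-8 encoding of U+00E9. A correct conversion
// yields one UTF-16 code unit; the fallback yields two unrelated wide elements.
inline std::size_t widened_length_of_e_acute()
{
    return fallback_widen("\xC3\xA9").size();
}
```

For ASCII-only input both fallbacks happen to give the right answer. For anything else the result is not valid UTF-16 or UTF-8, which is why the default `ignore_errors = false` is the safer choice unless some output is strictly preferable to an exception.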