C++ Guide: Understanding the wstring Family Member – Wide Character String Type

1. What is <span>wstring</span>?

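<span>std::wstring</span> is the wide-character string type of the C++ standard library: a <span>std::basic_string</span> whose elements are <span>wchar_t</span>. <span>std::string</span> stores <span>char</span> elements and can also carry Unicode text (for example as UTF-8), whereas each element of a <span>std::wstring</span> holds one UTF-16 or UTF-32 code unit, depending on the platform. This makes it a common choice for multilingual text such as Chinese, Japanese, and Korean, especially with APIs that expect wide characters.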
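As a small check of the definition above, the following sketch confirms at compile time that <span>std::wstring</span> is <span>std::basic_string</span> instantiated with <span>wchar_t</span>. The helper <span>countWideChar</span> is a made-up name used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// std::wstring is a typedef for std::basic_string<wchar_t>.
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is std::basic_string<wchar_t>");
static_assert(std::is_same<std::wstring::value_type, wchar_t>::value,
              "each element of a std::wstring is a wchar_t");

// Hypothetical helper: counts how many elements of text equal ch.
std::size_t countWideChar(const std::wstring& text, wchar_t ch) {
    std::size_t count = 0;
    for (wchar_t c : text) {
        if (c == ch) {
            ++count;
        }
    }
    return count;
}
```

Because the elements are plain <span>wchar_t</span> values, a range-based loop over a <span>wstring</span> works the same way as over a <span>string</span>.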

2. Underlying Data Type of <span>wstring</span>

📌 <span>std::string</span> vs. <span>std::wstring</span>

| String Type | Stored Character Type | Common Encoding |
| --- | --- | --- |
| <span>std::string</span> | <span>char</span> (single-byte) | ASCII / UTF-8 |
| <span>std::wstring</span> | <span>wchar_t</span> (wide character) | UTF-16 / UTF-32 |

Note:

  • <span>std::wstring</span> stores its characters as <span>wchar_t</span>, whose size depends on the platform:
    • Windows: <span>wchar_t</span> is 2 bytes (UTF-16), so a character outside the Basic Multilingual Plane occupies two elements (a surrogate pair)
    • Linux/macOS: <span>wchar_t</span> is typically 4 bytes (UTF-32)
  • <span>wstring</span> is suitable for programs that work with wide character sets (WCS), for example through wide-character APIs. Multi-byte character sets (MBCS) are stored in narrow <span>char</span> strings instead.
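The platform difference in the notes above can be observed directly. The sketch below uses made-up helper names (<span>wideCharBytes</span>, <span>likelyWideEncoding</span>, <span>emojiUnits</span>) and assumes the usual mapping of 2 bytes to UTF-16 and 4 bytes to UTF-32; other platforms may behave differently.

```cpp
#include <cstddef>
#include <string>

// Number of bytes in one wchar_t on the current platform (typically 2 or 4).
constexpr std::size_t wideCharBytes() {
    return sizeof(wchar_t);
}

// Assumption: 2-byte wchar_t means UTF-16 and 4-byte wchar_t means UTF-32.
const char* likelyWideEncoding() {
    return sizeof(wchar_t) == 2 ? "UTF-16"
         : sizeof(wchar_t) == 4 ? "UTF-32"
                                : "unknown";
}

// U+1F600 lies outside the Basic Multilingual Plane, so length() reports
// 2 elements (a surrogate pair) with UTF-16 and 1 element with UTF-32.
std::size_t emojiUnits() {
    const std::wstring emoji = L"\U0001F600";
    return emoji.length();
}
```

The point of <span>emojiUnits</span> is that <span>length()</span> counts <span>wchar_t</span> elements, not user-visible characters, so the same literal can have different lengths on Windows and Linux.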

3. Basic Usage of <span>wstring</span>

📌 1️⃣ Creating <span>wstring</span>

#include <iostream>
#include <string>

int main() {
    std::wstring ws1 = L"你好,世界!"; // L prefix indicates wide string
    std::wstring ws2 = L"Hello, Wide World!";
    
    std::wcout << L"Wide character string: " << ws1 << std::endl;
    std::wcout << L"English string: " << ws2 << std::endl;

    return 0;
}

💡 Note:

  • Use the <span>L</span> prefix to indicate a wide character string, for example <span>L"你好"</span>.
  • Use <span>std::wcout</span> to output a <span>wstring</span>; <span>std::cout</span> has no <span>operator<<</span> for <span>wstring</span> and cannot print it directly.
  • To display non-ASCII characters correctly, the locale usually has to be set first (see section 5 below).

📌 2️⃣ Common Operations on <span>wstring</span>

#include <iostream>
#include <string>

int main() {
    std::wstring ws = L"宽字符字符串";

    // Length
    std::wcout << L"String length: " << ws.length() << std::endl;

    // Concatenate strings
    ws += L" - 追加内容"; // the appended text contains "追加", which is searched for below
    std::wcout << L"After concatenation: " << ws << std::endl;

    // Access character
    std::wcout << L"First character: " << ws[0] << std::endl;

    // Find substring
    size_t pos = ws.find(L"追加");
    if (pos != std::wstring::npos) {
        std::wcout << L"Found '追加', position: " << pos << std::endl;
    }

    return 0;
}

🛠 Common Functions:

| Function | Purpose |
| --- | --- |
| <span>length()</span> | Get string length (number of <span>wchar_t</span> elements) |
| <span>append()</span> or <span>+=</span> | Concatenate strings |
| <span>find(L"substring")</span> | Find substring |
| <span>substr(start, length)</span> | Get substring |
| <span>compare()</span> | Compare strings |
| <span>empty()</span> | Check if empty |
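The functions in the table that the example above does not exercise (<span>substr()</span>, <span>compare()</span>, <span>empty()</span>, <span>append()</span>) can be sketched as follows. The helper names are hypothetical and chosen only for this illustration.

```cpp
#include <string>

// Hypothetical helper: returns the part of text before the first occurrence
// of sep, or the whole text when sep does not occur.
std::wstring beforeSeparator(const std::wstring& text, const std::wstring& sep) {
    const std::wstring::size_type pos = text.find(sep);
    if (pos == std::wstring::npos) {
        return text;
    }
    return text.substr(0, pos); // substr(start, length)
}

// Hypothetical helper: joins two parts with sep, skipping empty parts.
std::wstring joinNonEmpty(const std::wstring& a, const std::wstring& b,
                          const std::wstring& sep) {
    if (a.empty()) {
        return b;
    }
    if (b.empty()) {
        return a;
    }
    std::wstring result = a;
    result.append(sep).append(b); // append() returns *this, so calls chain
    return result;
}

// compare() returns 0 when both strings hold the same sequence of elements.
bool sameText(const std::wstring& a, const std::wstring& b) {
    return a.compare(b) == 0;
}
```

All positions and lengths here are measured in <span>wchar_t</span> elements, exactly as with <span>length()</span>.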

4. Converting between <span>string</span> and <span>wstring</span>

📌 1️⃣ <span>wstring</span> → <span>string</span> (narrow character)

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>

std::string wstringToString(const std::wstring& wstr) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wstr);
}

int main() {
    std::wstring ws = L"你好,世界!";
    std::string s = wstringToString(ws);
    std::cout << "Converted string: " << s << std::endl;
    return 0;
}

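📝 <span>std::wstring_convert</span> is an encoding conversion tool introduced in C++11. Combined with <span>std::codecvt_utf8<wchar_t></span>, it converts a <span>wstring</span> to a UTF-8 encoded <span>string</span>. Both have been deprecated since C++17, so compilers may warn about them. Where <span>wchar_t</span> is 2 bytes (Windows), <span>std::codecvt_utf8<wchar_t></span> only covers UCS-2; <span>std::codecvt_utf8_utf16<wchar_t></span> is the facet that handles surrogate pairs.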
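Since <span>std::wstring_convert</span> has been deprecated since C++17, it can help to see what the conversion does underneath. The following is a minimal hand-written sketch, not a replacement for a tested library: the names <span>appendUtf8</span> and <span>wideToUtf8</span> are made up, it assumes <span>wchar_t</span> holds UTF-16 (2 bytes) or UTF-32 (4 bytes), and it does not validate its input (lone surrogates or out-of-range values are encoded as they are).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code point to out, encoded as 1 to 4 UTF-8 bytes.
void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a wide string to UTF-8 without <codecvt>.
std::string wideToUtf8(const std::wstring& wstr) {
    std::string out;
    for (std::size_t i = 0; i < wstr.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(wstr[i]);
        // With 2-byte wchar_t (UTF-16), merge a high/low surrogate pair
        // into the single code point it represents.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < wstr.size()) {
            const std::uint32_t low = static_cast<std::uint32_t>(wstr[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i; // the low surrogate has been consumed
            }
        }
        appendUtf8(out, cp);
    }
    return out;
}
```

For example, <span>wideToUtf8(L"\u4F60\u597D")</span> yields the six bytes <span>E4 BD A0 E5 A5 BD</span>, the UTF-8 encoding of 你好.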

📌 2️⃣ <span>string</span> → <span>wstring</span> (wide character)

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>

std::wstring stringToWstring(const std::string& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(str);
}

int main() {
    std::string s = "Hello, 世界!"; // assumes the source file and the narrow execution character set are UTF-8
    std::wstring ws = stringToWstring(s);
    std::wcout << L"Converted wstring: " << ws << std::endl;
    return 0;
}
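The two converters can be combined into a round-trip check, and the failure case is worth handling: <span>from_bytes()</span> throws <span>std::range_error</span> when the input is not valid UTF-8 (unless an error string was passed to the converter's constructor). The sketch below repeats the article's two functions so that it is self-contained; <span>roundTrips</span> and <span>tryStringToWstring</span> are hypothetical names. It relies on the same deprecated <span><codecvt></span> facilities, so compilers may emit warnings.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Same conversions as in the examples above.
std::string wstringToString(const std::wstring& wstr) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wstr);
}

std::wstring stringToWstring(const std::string& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(str);
}

// Hypothetical helper: true when text survives wstring -> UTF-8 -> wstring.
bool roundTrips(const std::wstring& text) {
    return stringToWstring(wstringToString(text)) == text;
}

// Hypothetical helper: converts without letting the exception escape.
// Returns false and leaves out untouched when str is not valid UTF-8.
bool tryStringToWstring(const std::string& str, std::wstring& out) {
    try {
        out = stringToWstring(str);
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}
```

Returning a <span>bool</span> instead of throwing is only one possible design; it suits code paths where malformed input is expected, such as text read from files or the network.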

5. Resolving the Issue of <span>std::wcout</span> Not Displaying Chinese Characters Correctly

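In the Windows cmd console or a Linux terminal, <span>std::wcout</span> may not display a <span>wstring</span> correctly out of the box: a program starts in the "C" locale, which cannot convert non-ASCII wide characters for output. Solutions: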

📌 1️⃣ Windows (UTF-16)

#include <clocale>
#include <iostream>
#include <string>

int main() {
    std::setlocale(LC_ALL, ""); // Use the environment's locale instead of the default "C" locale

    std::wstring ws = L"你好,世界!";
    std::wcout << L"Correctly displaying wide character: " << ws << std::endl;

    return 0;
}

💡 <span>setlocale(LC_ALL, "")</span> switches from the default "C" locale to the locale of the user's environment, which allows <span>std::wcout</span> to display the characters that this locale's encoding can represent.
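<span>setlocale</span> can fail, in which case it returns a null pointer and leaves the locale unchanged, so it is worth checking the result before relying on <span>std::wcout</span>. A minimal sketch, with the hypothetical helper names <span>activeLocaleName</span> and <span>useEnvironmentLocale</span>:

```cpp
#include <clocale>
#include <string>

// Returns the name of the currently active C locale.
// Passing nullptr only queries the locale; it does not change it.
std::string activeLocaleName() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Tries to switch to the locale configured in the user's environment.
// Returns false when the request cannot be honored; the previous locale
// then stays in effect.
bool useEnvironmentLocale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

A program could call <span>useEnvironmentLocale()</span> at the start of <span>main</span> and log <span>activeLocaleName()</span> when wide output still looks wrong, since the resulting name shows which encoding the runtime is actually using.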

📌 2️⃣ Linux/macOS (UTF-32)

Linux terminals typically use UTF-8, so one option is to convert the <span>wstring</span> to UTF-8 with <span>wstring_convert</span> and print it through <span>std::cout</span>:

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>

int main() {
    std::wstring ws = L"你好,世界!";

    // Convert wstring → string
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    std::string utf8_str = converter.to_bytes(ws);

    std::cout << "UTF-8 display: " << utf8_str << std::endl;
    return 0;
}

6. Summary

| Operation | Method |
| --- | --- |
| Create <span>wstring</span> | <span>std::wstring ws = L"你好";</span> |
| Output <span>wstring</span> | <span>std::wcout << ws;</span> (requires <span>setlocale(LC_ALL, "")</span>) |
| Concatenate strings | <span>ws += L"追加";</span> |
| Find substring | <span>ws.find(L"substring")</span> |
| <span>wstring</span> to <span>string</span> | <span>std::wstring_convert<std::codecvt_utf8<wchar_t>></span> with <span>to_bytes()</span> |
| <span>string</span> to <span>wstring</span> | <span>std::wstring_convert<std::codecvt_utf8<wchar_t>></span> with <span>from_bytes()</span> |

🚀 <span>wstring</span> is suitable for multilingual support and international applications, but be aware of the character encoding of the platform!
