Understanding the C++ Standard Library Type: string

1. <span>string</span> Basics: Preparation and Core Concepts

Before using <span>string</span>, two prerequisites must be clear: header inclusion and the namespace. Getting both right avoids the most basic compilation errors.

1.1 Essential Preparations: Header Files and Namespaces

  • Include Header File: The <span>string</span> type is defined in the <span><string></span> header (note: not <span><cstring></span>, which is the header for C-style character array functions such as <span>strlen</span> and <span>strcpy</span>). It must be explicitly included:

#include <string>  // Must include, otherwise cannot use string type

  • Namespace: <span>string</span> belongs to the <span>std</span> namespace, so it must be named in one of the following two ways:

  1. Global declaration: <span>using namespace std;</span> (recommended for simple scenarios, such as example code);
  2. Local qualification: <span>std::string</span> (recommended for large projects to avoid naming conflicts).

Example (correct usage preparation):

#include <iostream>
#include <string>     // Include the string header
using namespace std;  // Simplify access to the std namespace

int main() {
    string s;  // Correct: a string variable can now be defined
    return 0;
}

1.2 Core Concept: What Is <span>string</span>?

<span>string</span> is an alias for <span>basic_string<char></span>, a specialization of the class template <span>basic_string</span>. It encapsulates:

  • Underlying storage: dynamically allocated character array (no manual memory management required, <span>string</span> will automatically resize / release);
  • Built-in operations: provides dozens of member functions (such as <span>size()</span> to get length, <span>append()</span> to concatenate strings) and overloaded operators (such as <span>+</span> for concatenation, <span>==</span> for comparison), directly supporting common string operations.

Simply put: <span>string</span> is a “character array with smart management”: as easy to use as a basic type, but more powerful.

2. Creating a <span>string</span>

<span>string</span> supports several initialization forms; choose the one that fits the situation (an empty string, a literal, another <span>string</span>, and so on). The forms below are listed roughly from most to least common, followed by an example:

| Initialization Method | Syntax Example | Core Explanation |
|---|---|---|
| Default Initialization | <span>string s;</span> | Creates an empty string (length 0, no valid characters in the underlying storage); content can be added later through assignment or input. |
| Copy Initialization (<span>=</span>) | <span>string s = "hello";</span> | Initializes from a string literal (<span>"hello"</span>) or another <span>string</span>, essentially “copying the value”. |
| Direct Initialization (<span>()</span>) | <span>string s("hello");</span> | Same effect as copy initialization; suitable for initializing from a literal or a <span>string</span>. |
| List Initialization (<span>{}</span>) | <span>string s{"hello"};</span> | New in C++11, equivalent to direct initialization; recommended (uniform style, avoids narrowing conversions). |
| Repeated Character Initialization | <span>string s(5, 'a');</span> | Creates a string containing 5 <span>'a'</span> characters (result is <span>"aaaaa"</span>). |
| Initialize with <span>string</span> Fragment | <span>string s1("hello world"); string s2(s1, 6);</span> | Initializes <span>s2</span> with all characters of <span>s1</span> starting from index 6 (result is <span>"world"</span>; indexing starts at 0). |

Example: Effects of Various Initialization Methods

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s1;             // Default initialization: s1 is an empty string ("")
    string s2 = "hello";   // Copy initialization: s2 = "hello"
    string s3("world");    // Direct initialization: s3 = "world"
    string s4{ s2 + s3 };  // List initialization: s4 = "helloworld"
    string s5(3, '!');     // Repeated character initialization: s5 = "!!!"
    string s6(s4, 5);      // Fragment initialization: s4 from index 5 is "world", so s6 = "world"

    cout << s1 << "|" << s2 << "|" << s3 << "|" << s4 << "|" << s5 << "|" << s6 << endl;
    // Output: |hello|world|helloworld|!!!|world (s1 is empty, so the first position is blank)
    return 0;
}

Pitfall: A string literal has type <span>const char[N]</span> (an array of characters), not <span>string</span>. Comparing two literals or two C-style strings with <span>==</span> compares their addresses rather than their contents. The overloaded <span>string</span> operators do accept a literal on one side, so <span>s2 == "hello"</span> compares contents as expected, as long as at least one operand is a <span>string</span>.
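
To make the difference concrete, here is a minimal sketch that contrasts a content comparison through <span>string</span> with a content comparison of two C-style strings. The helper names `same_text` and `same_c_text` are hypothetical and exist only for this illustration; `std::strcmp` comes from <span><cstring></span>.

```cpp
#include <cstring>
#include <string>

// Compares a std::string with a C-style string or literal.
// The overloaded == of std::string compares contents.
bool same_text(const std::string& s, const char* literal) {
    return s == literal;
}

// Compares two C-style strings by content.
// Writing a == b here would compare the two addresses instead,
// so std::strcmp is used.
bool same_c_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

A call such as `same_text(s2, "hello")` relies on the same overloaded operator as `s2 == "hello"` in the text above.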

3. Core <span>string</span> Operations

<span>string</span> operations fall into two groups: member functions (called via <span>.</span>, such as <span>s.size()</span>) and overloaded operators (used directly, such as <span>s1 + s2</span>). The following summarizes the core operations, ordered roughly by how often they are used in everyday code.

3.1 Accessing Characters: Two Methods (<span><span>[]</span></span> and <span><span>at()</span></span>)

<span>string</span> characters are stored in “contiguous memory space”, indexed from 0 (same as arrays), supporting two access methods:

(1) Subscript Operator <span>[]</span> (recommended, efficient)

  • Syntax: <span>s[index]</span>, returns the character at index <span>index</span> in the <span>string</span> (readable and writable, i.e., can modify the character).
  • Note: To refer to a character of the string, <span>index</span> must be within the valid range (0 ≤ index < s.size()). <span>[]</span> performs no bounds checking, so an out-of-range index is “undefined behavior” (for example, a program crash or garbled output).

(2) Member Function <span>at()</span> (safe, slightly slower)

  • Syntax: <span>s.at(index)</span> works the same as <span>[]</span> but checks bounds: if <span>index</span> is invalid, it throws an <span>out_of_range</span> exception. Catching it with <span>try-catch</span> (explained in later chapters on exceptions) lets the program handle the error instead of crashing.

Example: Accessing and Modifying Characters

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello";

    // Access characters
    cout << s[0] << endl;     // Output 'h' (index 0)
    cout << s.at(3) << endl;  // Output 'l' (index 3)

    // Modify characters
    s[1] = 'E';               // Change 'e' at index 1 to 'E', s becomes "hEllo"
    s.at(4) = 'O';            // Change 'o' at index 4 to 'O', s becomes "hEllO"
    cout << s << endl;        // Output "hEllO"

    // Out-of-bounds access comparison
    // cout << s[10];         // Undefined behavior: may crash (no bounds checking)
    // s.at(10);              // Throws out_of_range exception (bounds checking)
    return 0;
}

Usage Recommendation: Use <span>[]</span> for everyday code where the index is known to be valid (efficient). If the index comes from user input or another unchecked source, use <span>at()</span> (safe).
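
As a rough sketch of the <span>at()</span> plus <span>try-catch</span> combination, the helper below returns a fallback character when the index is invalid. The name `char_or` and the fallback parameter are assumptions made for this example; `std::out_of_range` is declared in <span><stdexcept></span>.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the character at `index`, or `fallback` if the index is out of range.
char char_or(const std::string& s, std::size_t index, char fallback) {
    try {
        return s.at(index);  // at() checks bounds and throws on an invalid index
    } catch (const std::out_of_range&) {
        return fallback;     // invalid index: hand back the fallback instead
    }
}
```

For instance, `char_or("hello", 10, '?')` yields `'?'` rather than triggering undefined behavior as `s[10]` would.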

3.2 Concatenating Strings: Three Methods (<span>+</span>, <span>+=</span>, and <span>append()</span>)

Concatenation is one of the most common string operations, and <span>string</span> provides several ways to do it:

(1) Overloaded Operator <span>+</span> (concatenates two strings, generates a new object)

  • Syntax: <span>s3 = s1 + s2</span> or <span>s3 = s1 + "literal"</span>. At least one of the two operands must be a <span>string</span>; the other can be a <span>string</span> or a string literal.
  • Characteristics: Generates a new <span>string</span> object (does not modify the original <span>s1</span> or <span>s2</span>), suitable when the original strings need to be preserved.

(2) Overloaded Operator <span>+=</span> (concatenates and modifies the original object, efficient)

  • Syntax: <span>s1 += s2</span> or <span>s1 += "literal"</span> appends <span>s2</span> or the literal to the end of <span>s1</span>, modifying <span>s1</span> directly (no new object is generated).
  • Characteristics: Usually more efficient than <span>+</span> because no temporary result object is created, although <span>s1</span> may still need to grow its storage. Suitable when the original string does not need to be preserved.

(3) Member Function <span>append()</span> (flexible concatenation, supports fragments)

  • Syntax: <span>s1.append(s2, pos, len)</span> appends <span>len</span> characters of <span>s2</span>, starting at index <span>pos</span>, to the end of <span>s1</span>. If <span>len</span> is omitted (allowed since C++14), everything from <span>pos</span> to the end of <span>s2</span> is appended.
  • Characteristics: Supports concatenating part of a string, high flexibility.

Example: Comparison of Three Concatenation Methods

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s1 = "hello";
    string s2 = "world";

    // 1. Concatenate with + (generates a new object)
    string s3 = s1 + " " + s2;  // s3 = "hello world", s1 and s2 unchanged
    cout << s3 << endl;         // Output "hello world"

    // 2. Concatenate with += (modifies the original object)
    s1 += " ";                  // s1 = "hello "
    s1 += s2;                   // s1 = "hello world"
    cout << s1 << endl;         // Output "hello world"

    // 3. Concatenate a fragment with append()
    string s4 = "abcdef";
    s4.append(s1, 6, 5);        // s1 from index 6 is "world", so s4 = "abcdefworld"
    cout << s4 << endl;         // Output "abcdefworld"
    return 0;
}

Pitfall: The <span>+</span> operator requires at least one operand to be of <span>string</span> type. Two string literals cannot be concatenated directly: <span>"hello" + "world"</span> fails to compile because both operands are of type <span>const char[]</span> and neither is a <span>string</span>. Because <span>+</span> is evaluated left to right, <span>"a" + "b" + s</span> fails for the same reason, while <span>s + "a" + "b"</span> works.
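
One common way around this pitfall is to turn the first operand into a <span>string</span> explicitly. The sketch below shows that approach; `join_words` and `add_suffix` are hypothetical helper names introduced only for this example.

```cpp
#include <string>

// Builds "<first> <second>" from two C-style strings or literals.
// Wrapping the first one in std::string makes every following +
// resolve to string concatenation.
std::string join_words(const char* first, const char* second) {
    return std::string(first) + " " + second;
}

// Operator + groups left to right: (prefix + "-") is already a string,
// so appending another literal afterwards is fine.
std::string add_suffix(const std::string& prefix) {
    return prefix + "-" + "end";
    // return "start" + "-" + prefix;  // would not compile: literal + literal comes first
}
```

With these helpers, `join_words("hello", "world")` produces `"hello world"` even though both inputs are literals.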

3.3 Querying Attributes: Length, Empty Check, Capacity

<span>string</span> provides multiple member functions to query its own attributes, the core being “length (<span>size()</span>/<span>length()</span>)” and “empty check (<span>empty()</span>)”:

| Member Function | Syntax Example | Core Functionality |
|---|---|---|
| <span>size()</span> | <span>s.size()</span> | Returns the <span>number of characters</span> in the string (its length). The type is <span>size_t</span>, an unsigned integer. Same functionality as <span>length()</span>. |
| <span>length()</span> | <span>s.length()</span> | Equivalent to <span>size()</span>. Both exist for historical reasons: the name <span>length()</span> follows the C <span>strlen</span> tradition, while <span>size()</span> matches the other standard containers and is the generally recommended spelling. |
| <span>empty()</span> | <span>s.empty()</span> | Checks whether the <span>string</span> is empty (length 0) and returns a <span>bool</span> value (<span>true</span> if empty, <span>false</span> otherwise). States the intent more directly than <span>size() == 0</span>. |
| <span>capacity()</span> | <span>s.capacity()</span> | Returns the <span>maximum number of characters</span> that the currently allocated memory can hold (not counting the terminating null character). Useful for memory optimization (avoiding frequent resizing). |
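
To illustrate how capacity relates to “avoiding frequent resizing”, here is a minimal sketch that reserves memory once before appending in a loop. It relies on the member function `reserve()`, which the table above does not cover; the helper name `repeat` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Builds a string made of `count` copies of `unit`.
std::string repeat(const std::string& unit, std::size_t count) {
    std::string result;
    // Request enough memory up front; afterwards capacity() is at least this large.
    result.reserve(unit.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        result += unit;  // stays within the reserved capacity, so no regrowth is needed
    }
    return result;
}
```

Without the `reserve()` call the result is the same; the string would simply grow its storage on demand as characters are appended.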

Pitfall: <span>size()</span> returns <span>size_t</span>, an unsigned integer type. Take care when mixing it with <span>int</span> (signed): in a comparison the <span>int</span> is converted to unsigned, so a negative value becomes a very large one, and storing the result of <span>size()</span> in an <span>int</span> can overflow if the string length exceeds the maximum <span>int</span> value. Prefer <span>auto</span> or <span>size_t</span> for the result:

// Recommended: store the result of size() in a size_t
size_t len1 = s1.size();
// Or let auto deduce the type
auto len2 = s1.size();
// Not recommended: int may overflow (if the length of s1 exceeds 2^31-1)
int len3 = s1.size();
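
A related consequence of the unsigned return type is that `s.size() - 1` does not become -1 for an empty string; it wraps around to a very large value. The sketch below guards against that and keeps loop indices in the same unsigned type. The helper names `last_char_or` and `count_char` are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <string>

// Returns the last character of s, or `fallback` if s is empty.
char last_char_or(const std::string& s, char fallback) {
    if (s.empty()) {
        return fallback;      // without this check, s.size() - 1 would wrap around
    }
    return s[s.size() - 1];
}

// Counts how often `target` occurs, using an index of the same
// unsigned type that size() returns.
std::size_t count_char(const std::string& s, char target) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == target) {
            ++count;
        }
    }
    return count;
}
```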

3.4 Modifying Strings: Clearing, Replacing, Inserting

In addition to concatenation and single character modification, <span>string</span> also supports batch modification operations, the core being “clearing (<span>clear()</span>)”, “replacing (<span>replace()</span>)”, and “inserting (<span>insert()</span>)”:

(1) Clear String: <span>clear()</span>

  • Syntax: <span>s.clear()</span> makes <span>s</span> an empty string (length 0). Common implementations keep the allocated memory, so <span>capacity()</span> typically remains unchanged.
  • Scenario: When needing to refill string content, avoiding redefining the variable.

(2) Replace String: <span>replace()</span>

  • Syntax: <span>s.replace(pos, len, new_str)</span>, replaces the <span>len</span> characters starting from <span>pos</span> in <span>s</span> with <span>new_str</span> (<span>new_str</span> can be either <span>string</span> or literal).
  • Scenario: For batch modification of part of the string content.

(3) Insert String: <span>insert()</span>

  • Syntax: <span>s.insert(pos, new_str)</span>, inserts <span>new_str</span> at the “index <span>pos</span> position” of <span>s</span> (original <span>pos</span> and subsequent characters shift).
  • Scenario: To add a string at a specified position.

Example: Clearing, Replacing, and Inserting

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello world";

    // 1. Clear string
    string s1 = s;
    s1.clear();
    cout << "s1 cleared: " << s1 << " (length: " << s1.size() << ")" << endl;
    // Output: s1 cleared:  (length: 0)

    // 2. Replace string
    string s2 = s;
    s2.replace(6, 5, "C++");  // Replace the 5 characters starting at index 6 ("world") with "C++"
    cout << "s2 after replacement: " << s2 << endl;  // Output: s2 after replacement: hello C++

    // 3. Insert string
    string s3 = s;
    s3.insert(5, ",");        // Insert "," at index 5 (the space moves right, giving "hello, world")
    cout << "s3 after insertion: " << s3 << endl;    // Output: s3 after insertion: hello, world
    return 0;
}

4. <span>string</span> Input and Output: Working with <span>cin</span>/<span>cout</span>
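
Since <span>replace()</span> handles one position at a time, replacing every occurrence of a substring needs a small loop. The sketch below is one possible way to write it, not a standard library function: the name `replace_all` is hypothetical, and it additionally uses the member function `find()` and the constant `std::string::npos`, neither of which is covered in the text above.

```cpp
#include <string>

// Returns a copy of `text` in which every occurrence of `from` is replaced by `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return text;  // nothing to search for
    }
    std::string::size_type pos = 0;
    // find() returns npos once no further occurrence exists
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue searching after the inserted text
    }
    return text;
}
```

Advancing `pos` past the inserted text keeps the loop from matching inside its own replacement, which matters when `to` contains `from`.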

<span>string</span> supports standard input and output streams (<span>cin</span>/<span>cout</span>), but it is important to note the difference between “normal input (<span>>></span>)” and “whole line input (<span>getline()</span>)” — this is the most common pitfall for beginners.

4.1 Normal Input: <span>cin >> s</span> (split by whitespace)

  • Syntax: <span>cin >> s</span>, reads characters from <span>cin</span>, until encountering whitespace (space, newline, tab), and automatically ignores leading whitespace.
  • Characteristics: Only reads “one word” (not including whitespace), suitable for reading a single string (e.g., username, word).

Example: Reading a Single Word

#include <iostream>
#include <string>
using namespace std;

int main() {
    string name;
    cout << "Please enter your username: ";
    cin >> name;  // If the input is "Zhang San", only "Zhang" is read (" San" stays in the input buffer)
    cout << "Welcome, " << name << "!" << endl;  // Output: Welcome, Zhang!
    return 0;
}

4.2 Whole Line Input: <span>getline(cin, s)</span> (reads a whole line including whitespace)

  • Syntax: <span>getline(cin, s)</span>, reads characters from <span>cin</span>, until encountering a newline character (<span>\n</span>), and discards the newline character (not stored in <span>s</span>).
  • Characteristics: Reads “a whole line of content” (including spaces), suitable for reading strings with spaces (e.g., address, sentences).

Example: Reading a Whole Line

#include <iostream>
#include <string>
using namespace std;

int main() {
    string address;
    cout << "Please enter your address: ";
    getline(cin, address);  // If the input is "Beijing Haidian", the complete line is read (including spaces)
    cout << "Your address is: " << address << endl;  // Output: Your address is: Beijing Haidian
    return 0;
}

5. <span>string</span> Iterators: The “Universal Tool” for Traversing Strings

Although <span>[]</span> and <span>at()</span> can access characters, <span>string</span> iterators are a more general traversal tool: they support traversing from any position, reverse traversal, and are compatible with STL algorithms (such as <span>sort</span>, <span>find</span>).
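
Combining the two input styles needs some care: <span>>></span> stops before the newline, so a <span>getline()</span> that follows immediately would read only the empty rest of that line. The sketch below shows one way to handle this by skipping leftover whitespace with `std::ws` first. The `Person` struct and both function names are hypothetical, and a `std::istringstream` stands in for <span>cin</span> so the input can be fixed text.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Person {
    std::string name;
    std::string address;
};

// Reads a one-word name, then a full-line address, from any input stream.
Person read_person(std::istream& in) {
    Person p;
    in >> p.name;                 // reads one word; the newline stays in the stream
    in >> std::ws;                // skip the leftover newline and other leading whitespace
    std::getline(in, p.address);  // now reads the whole next line, spaces included
    return p;
}

// Convenience wrapper for trying read_person on a fixed piece of text.
Person read_person_from(const std::string& text) {
    std::istringstream in(text);
    return read_person(in);
}
```

Calling `read_person(std::cin)` applies the same logic to keyboard input. Note that if the first line holds more than one word, the remaining words are what <span>getline()</span> picks up.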
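
As a brief illustration of that compatibility, the sketch below passes <span>string</span> iterators to three algorithms from <span><algorithm></span>. The wrapper function names are hypothetical; `std::sort`, `std::reverse`, and `std::find` are the standard algorithms.

```cpp
#include <algorithm>
#include <string>

// Returns the characters of s in ascending order.
std::string sorted_copy(std::string s) {
    std::sort(s.begin(), s.end());
    return s;
}

// Returns s with its characters in reverse order.
std::string reversed_copy(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}

// Reports whether character c occurs in s.
bool contains_char(const std::string& s, char c) {
    // find returns the end iterator when nothing matches
    return std::find(s.begin(), s.end(), c) != s.end();
}
```

Each function takes the half-open range from `begin()` to `end()`, the same range the loops in the following examples walk through by hand.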

5.1 Basic Usage of Iterators

Example 1: Traverse and Modify String with Normal Iterator

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello";
    // s.begin() points to the first character; s.end() points to the position
    // after the last character (past-the-end iterator)
    for (string::iterator it = s.begin(); it != s.end(); ++it) {
        // Convert lowercase to uppercase (ASCII: 'a'-'z' is 97-122, 'A'-'Z' is 65-90, difference is 32)
        *it -= 32;
    }
    cout << s << endl;  // Output "HELLO"
    return 0;
}

Example 2: Reverse Traversal with Reverse Iterator

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello";
    // s.rbegin() points to the last character; s.rend() points to the position before the first character
    for (string::reverse_iterator it = s.rbegin(); it != s.rend(); ++it) {
        cout << *it;  // Outputs 'o', 'l', 'l', 'e', 'h' in order
    }
    cout << endl;     // Output "olleh"
    return 0;
}

5.2 Simplified Iterators: <span>auto</span> for Type Deduction

<span>string::iterator</span> and similar type names are quite long. C++11’s <span>auto</span> can deduce the iterator type automatically, simplifying the code:

// Use auto to simplify the normal iterator
for (auto it = s.begin(); it != s.end(); ++it) {
    *it -= 32;
}
// Use auto to simplify the reverse iterator
for (auto it = s.rbegin(); it != s.rend(); ++it) {
    cout << *it;
}
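
The `-= 32` conversion used in these examples is only correct when every character is a lowercase ASCII letter. A more defensive variant, sketched below, uses `std::toupper` from <span><cctype></span> together with a C++11 range-based <span>for</span> loop, which walks the same begin-to-end range without naming the iterator. The function name `to_upper_copy` is hypothetical.

```cpp
#include <cctype>
#include <string>

// Returns a copy of s with lowercase letters converted to uppercase.
// Characters that are not lowercase letters are left unchanged.
std::string to_upper_copy(std::string s) {
    for (char& c : s) {
        // Convert to unsigned char first: passing a negative char value
        // to std::toupper is undefined behavior.
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}
```

Unlike the subtraction, this leaves digits, spaces, punctuation, and letters that are already uppercase untouched.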
