Strings are a fundamental data type in mainstream programming languages, and the same is true in CMake.
To make string operations convenient, CMake provides the string() command, whose subcommands can be grouped by functionality as follows:
1. Search and Replace
2. String Manipulation
3. Comparison
4. Hashing
5. Generation
6. JSON Handling
The official documentation provides detailed explanations:
https://cmake.org/cmake/help/latest/command/string.html
Here, we will introduce them according to their functionality and provide some usage scenarios.
Search and Replace
The general syntax for search and replace is as follows:
string(FIND <string> <substring> <out-var> [...])
string(REPLACE <match-string> <replace-string> <out-var> <input>...)
Actual usage scenarios:
# FIND returns the index of the first occurrence of the substring, or -1 if it is not found
set(str "Hello,This Is a string,This is a test")
string(FIND "${str}" "This" pos)
message("pos=${pos}") # pos=6 (indices start at 0)
string(FIND "${str}" "this" pos)
message("pos=${pos}") # pos=-1 (the search is case-sensitive)
# REPLACE substitutes every occurrence of the match string
string(REPLACE "a" "A" out "a,ab,abc,abcd")
message("out=${out}") # out=A,Ab,Abc,Abcd
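FIND also accepts an optional REVERSE flag that searches for the last occurrence instead of the first. The following is a minimal sketch that reuses the sample string from above; the variable names first_pos and last_pos are chosen only for illustration.

```cmake
cmake_minimum_required(VERSION 3.10)

set(str "Hello,This Is a string,This is a test")

# Without REVERSE: index of the first occurrence
string(FIND "${str}" "This" first_pos)

# With REVERSE: index of the last occurrence
string(FIND "${str}" "This" last_pos REVERSE)

message("first=${first_pos}, last=${last_pos}") # first=6, last=23
```

The snippet can be saved to a file and run in script mode with cmake -P.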
CMake also provides regex-based search and replace, which allows for even more possibilities.
# Extract the first match
string(REGEX MATCH <match-regex> <out-var> <input>...)
# Extract all matches
string(REGEX MATCHALL <match-regex> <out-var> <input>...)
# Batch replace text
string(REGEX REPLACE <match-regex> <replace-expr> <out-var> <input>...)
Here are some examples:
# Match a version number with a regex (format: major.minor.patch)
# A literal dot must be written as \\. inside a quoted CMake argument
set(CONTENT "Version: 2.5.1\nDate: 2024-07-20")
string(REGEX MATCH "Version: ([0-9]+\\.[0-9]+\\.[0-9]+)" MATCHED_STR "${CONTENT}")
if(MATCHED_STR)
  message("Full match:${MATCHED_STR}") # Full match:Version: 2.5.1
else()
  message("Version number not found")
endif()
# Match all error codes starting with ERR
# The results are stored as a list separated by ;
set(LOG "ERR404: Not Found\nERR500: Server Error\nWARN100: Low Disk")
string(REGEX MATCHALL "ERR[0-9]+" ERROR_CODES "${LOG}")
message("Error code list:${ERROR_CODES}") # Error code list:ERR404;ERR500
# Batch replace text
set(CODE "#include <iostream.h>\n#include <stdio.h>")
# Strip the .h suffix from every matched header name. Note that this pattern also
# rewrites C headers such as stdio.h; narrow the regex if those must be kept.
# The back-reference must be written as \\1 inside a quoted CMake argument.
string(REGEX REPLACE "([a-z]+)\\.h>" "\\1>" NEW_CODE "${CODE}")
message("New code:\n${NEW_CODE}")
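The parenthesized groups of a regex are also available individually after a match: string(REGEX MATCH) fills the CMAKE_MATCH_<n> variables, with group 0 holding the whole match. The sketch below splits a version number into its parts; the variable names full_match and CONTENT are illustrative only.

```cmake
cmake_minimum_required(VERSION 3.10)

set(CONTENT "Version: 2.5.1\nDate: 2024-07-20")

# Three capture groups: major, minor, patch
string(REGEX MATCH "Version: ([0-9]+)\\.([0-9]+)\\.([0-9]+)" full_match "${CONTENT}")

if(full_match)
  message("Whole match: ${CMAKE_MATCH_0}") # Whole match: Version: 2.5.1
  message("Major: ${CMAKE_MATCH_1}")       # Major: 2
  message("Minor: ${CMAKE_MATCH_2}")       # Minor: 5
  message("Patch: ${CMAKE_MATCH_3}")       # Patch: 1
endif()
```

The CMAKE_MATCH_<n> variables are overwritten by the next regex operation, so copy them into your own variables if they are needed later.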
String Manipulation
The syntax overview is as follows:
# Append and Prepend
string(APPEND <string-var> [<input>...])
string(PREPEND <string-var> [<input>...])
# String concatenation
string(CONCAT <out-var> [<input>...])
string(JOIN <glue> <out-var> [<input>...])
# Case conversion
string(TOLOWER <string> <out-var>)
string(TOUPPER <string> <out-var>)
# Trimming and Length
string(LENGTH <string> <out-var>)
string(SUBSTRING <string> <begin> <length> <out-var>)
string(STRIP <string> <out-var>)
# Remove generator expressions
string(GENEX_STRIP <string> <out-var>)
# Repeat string multiple times
string(REPEAT <string> <count> <out-var>)
Examples for better understanding:
cmake_minimum_required(VERSION 3.15) # string(REPEAT) requires CMake 3.15 or newer
project(cmake_string)
# Append content to a variable
set(MSG "Hello")
string(APPEND MSG " world! " "This Test!") # Modifies the variable in place
message("${MSG}") # Hello world! This Test!
# Prepend content to a variable
set(PATH "/usr/bin")
string(PREPEND PATH "/local") # Modifies the variable in place
message("PATH:${PATH}") # PATH:/local/usr/bin
# Directly concatenate multiple inputs
string(CONCAT FULL_NAME "John" " " "Doe")
message("Name:${FULL_NAME}") # Output: John Doe
# Join with a separator
string(JOIN "/" INCLUDE_DIR "include" "src" "tests") # Automatically handle separators
message("INCLUDE_DIR:${INCLUDE_DIR}") # Output: include/src/tests
# Case conversion
string(TOLOWER "Hello World" lower_str)
message("${lower_str}") # hello world
string(TOUPPER "important" upper_str)
message("${upper_str}") # IMPORTANT
# Calculate byte length (not character count)
string(LENGTH "你好,World" len) # In UTF-8, "你好" occupies 6 bytes
message("Length: ${len}") # Length: 12
# Substring extraction (by byte position)
string(SUBSTRING "abcdef" 2 3 substr) # Start at byte index 2, take 3 bytes
message("Substring: ${substr}") # Output: Substring: cde
# Trim leading and trailing spaces
string(STRIP " Trim this " stripped)
message("Stripped: '${stripped}'") # Output: 'Trim this'
# Repeat string multiple times
string(REPEAT "-" 10 separator) # Generate separator line
message("${separator}") # Output: ----------
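These subcommands are often combined. As a small illustration, the sketch below prints a heading with an underline of matching width by feeding the result of LENGTH into REPEAT; the title text and variable names are made up for the example.

```cmake
cmake_minimum_required(VERSION 3.15)

set(title "Build Summary")

# LENGTH counts bytes; for pure ASCII text this equals the character count
string(LENGTH "${title}" title_len)

# Build an underline that is exactly as wide as the title
string(REPEAT "=" ${title_len} underline)
string(TOUPPER "${title}" title_upper)

message("${title_upper}") # BUILD SUMMARY
message("${underline}")   # =============
```

Because LENGTH works on bytes, the underline would come out too long for titles that contain multi-byte UTF-8 characters.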
Comparison and Hashing
The syntax overview is as follows:
string(COMPARE <op> <string1> <string2> <out-var>)
string(<HASH> <out-var> <input>)
string(COMPARE) compares two strings and stores a boolean result in the output variable. The comparison operators are:
- EQUAL: The strings are equal
- NOTEQUAL: The strings are not equal
- LESS: <string1> is lexicographically less than <string2>
- GREATER: <string1> is lexicographically greater than <string2>
- LESS_EQUAL: <string1> is lexicographically less than or equal to <string2> (CMake ≥ 3.7)
- GREATER_EQUAL: <string1> is lexicographically greater than or equal to <string2> (CMake ≥ 3.7)
string(<HASH>) computes a hash of the input string, where <HASH> is the algorithm name: MD5, SHA1, SHA224, SHA256, SHA384, SHA512, SHA3_224, SHA3_256, SHA3_384, or SHA3_512.
# Check whether the current version is >= the required version.
# Caution: string(COMPARE) compares lexicographically, so "10.0" would sort before "2.8".
# For real version checks, prefer if(<a> VERSION_GREATER_EQUAL <b>).
set(REQUIRED_VERSION "2.8")
set(CURRENT_VERSION "3.1")
string(COMPARE GREATER_EQUAL "${CURRENT_VERSION}" "${REQUIRED_VERSION}" IS_COMPATIBLE)
if(IS_COMPATIBLE)
  message("Version compatible")
else()
  message(FATAL_ERROR "At least version ${REQUIRED_VERSION} is required")
endif()
# Compute the SHA256 hash of a string
set(TEXT "Hello, CMake!")
string(SHA256 HASH_VALUE "${TEXT}")
message("SHA256 hash value: ${HASH_VALUE}")
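Because string(COMPARE) works character by character, it can give misleading answers for version numbers. The following sketch contrasts it with the VERSION_GREATER_EQUAL operator of if(), which compares component by component; the version values are chosen to trigger the difference.

```cmake
cmake_minimum_required(VERSION 3.7)

set(REQUIRED_VERSION "2.8")
set(CURRENT_VERSION "10.0")

# Lexicographic comparison: "1" sorts before "2", so "10.0" counts as smaller than "2.8"
string(COMPARE GREATER_EQUAL "${CURRENT_VERSION}" "${REQUIRED_VERSION}" lexical_ok)
if(lexical_ok)
  message("string(COMPARE): compatible")
else()
  message("string(COMPARE): NOT compatible (misleading)")
endif()

# Component-wise version comparison: 10 >= 2, so the check passes
if(CURRENT_VERSION VERSION_GREATER_EQUAL REQUIRED_VERSION)
  message("VERSION_GREATER_EQUAL: compatible")
else()
  message("VERSION_GREATER_EQUAL: NOT compatible")
endif()
```

With these values the first check takes the "NOT compatible" branch and the second takes the "compatible" branch.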
Generation
The syntax overview is as follows:
# Convert number to ASCII character
string(ASCII <number>... <out-var>)
# Convert string to hexadecimal
string(HEX <string> <out-var>)
# Dynamically configure string
string(CONFIGURE <string> <output_variable> [@ONLY] [ESCAPE_QUOTES])
# Generate valid C identifier
string(MAKE_C_IDENTIFIER <string> <out-var>)
# Generate random string
string(RANDOM [LENGTH <length>] [ALPHABET <alphabet>] [RANDOM_SEED <seed>] <output_variable>)
# Generate timestamp
string(TIMESTAMP <out-var> [<format string>] [UTC])
# Generate UUID
string(UUID <output_variable> NAMESPACE <namespace> NAME <name> TYPE <MD5|SHA1> [UPPER])
Classic usage scenarios are as follows:
Command | Typical Scenario |
ASCII | Generate control characters (e.g., the newline character \n = 10), protocol header byte sequences |
HEX | Debugging binary data (e.g., checksum file content), network protocol encoding |
CONFIGURE | Dynamically generate code/configuration files (no additional file templates needed) |
MAKE_C_IDENTIFIER | Normalize user input to variable names, generate enumeration constants |
RANDOM | Temporary file/directory names, test data, one-time passwords |
TIMESTAMP | Build log timestamps, version numbers containing dates (e.g., v1.2.3-20240720) |
UUID | Unique resource identifiers in distributed systems, avoid naming conflicts |
Simple examples are as follows:
# Convert ASCII codes to characters (each code is a separate argument)
string(ASCII 65 66 67 letters) # 65=A, 66=B, 67=C
message("Letters:${letters}") # Output: Letters:ABC
# Convert a string to its hexadecimal representation
string(HEX "Hello" hex_str)
message("HEX: ${hex_str}") # Output: HEX: 48656c6c6f
# Input string contains variable placeholders
# (@USER_NAME@ is replaced by CONFIGURE; ${AGE} is already expanded by CMake before the call)
set(USER_NAME "Alice")
set(AGE 30)
string(CONFIGURE "User: @USER_NAME@, Age: ${AGE}" configured_str)
message("${configured_str}") # Output: User: Alice, Age: 30
# Convert arbitrary text into a valid C identifier
string(MAKE_C_IDENTIFIER "123-Data File" c_var)
message("C Identifier: ${c_var}") # Output: C Identifier: _123_Data_File
# Generate random strings
string(RANDOM LENGTH 8 ALPHABET "abcdefghijklmnopqrstuvwxyz0123456789" tmp_dir)
message("Temporary directory:${tmp_dir}") # An 8-character random string of lowercase letters and digits
string(RANDOM LENGTH 6 ALPHABET "0123456789" RANDOM_SEED 42 random_code)
message("Fixed random code:${random_code}") # The fixed seed makes the result repeatable
# Generate a timestamp
string(TIMESTAMP BUILD_TIME "%Y-%m-%dT%H:%M:%S" UTC)
message("UTC time: ${BUILD_TIME}")
# Generate a version 5 UUID (SHA1) in the DNS namespace
string(UUID LIB_UUID
NAMESPACE "6ba7b810-9dad-11d1-80b4-00c04fd430c8" # DNS namespace UUID
NAME "MyLibrary"
TYPE SHA1
UPPER
)
message("UUID: ${LIB_UUID}")
Several commands deserve a closer look.
The CONFIGURE subcommand replaces variable placeholders in a string, similar to configure_file(), but it operates directly on strings rather than files. The differences are:
Feature | string(CONFIGURE) | configure_file() |
Input Type | Directly operates on strings | Requires template files (e.g., .h.in) |
Output Target | Variable | Generates files |
Flexibility | Suitable for simple dynamic content | Suitable for complex templates (e.g., multi-line code generation) |
Performance | Lightweight | File IO may affect performance |
The two optional parameters mean:
- @ONLY: Replace only @VAR@ references and leave ${VAR} references untouched
- ESCAPE_QUOTES: Escape any quotes in the substituted values with a backslash (C-style); by default they are inserted unchanged
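The effect of both options is easiest to see side by side. The sketch below is an illustration with made-up variable names; it uses bracket arguments ([[...]]) for the input so that CMake itself does not expand ${...} before CONFIGURE sees the text.

```cmake
cmake_minimum_required(VERSION 3.10)

set(USER_NAME "Alice")
set(GREETING "say \"hi\"") # value: say "hi"

# Default: both @VAR@ and ${VAR} are replaced
string(CONFIGURE [[@USER_NAME@ / ${USER_NAME}]] both)
# @ONLY: only @VAR@ is replaced, ${VAR} is left as literal text
string(CONFIGURE [[@USER_NAME@ / ${USER_NAME}]] at_only @ONLY)
message("${both}")    # Alice / Alice
message("${at_only}") # Alice / ${USER_NAME}

# ESCAPE_QUOTES: quotes inside the substituted value get a backslash
string(CONFIGURE [[msg = "@GREETING@"]] plain)
string(CONFIGURE [[msg = "@GREETING@"]] escaped ESCAPE_QUOTES)
message("${plain}")   # msg = "say "hi""
message("${escaped}") # msg = "say \"hi\""
```

ESCAPE_QUOTES matters mostly when the result is pasted into generated C or C++ source as a string literal.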
The MAKE_C_IDENTIFIER subcommand converts each non-alphanumeric character (anything that is not a letter or digit) in <string> into an underscore and stores the result in <output_variable>. In addition, if the first character of the input is a digit, an underscore is prepended to the result.
In the UUID subcommand, the NAMESPACE parameter must be a valid UUID string (in RFC 4122 format). The common predefined namespace UUIDs, and the rule for custom namespaces, are:
Namespace Type | UUID Value | Typical Use |
DNS | 6ba7b810-9dad-11d1-80b4-00c04fd430c8 | Generate UUID based on domain name (e.g., example.com) |
URL | 6ba7b811-9dad-11d1-80b4-00c04fd430c8 | Generate UUID based on URL (e.g., https://...) |
OID | 6ba7b812-9dad-11d1-80b4-00c04fd430c8 | Generate UUID based on object identifier (ISO OID) |
X.500 DN | 6ba7b814-9dad-11d1-80b4-00c04fd430c8 | Generate UUID based on X.500 directory name |
Custom Namespace | Any UUID that complies with RFC 4122 | For internal uniqueness guarantee within projects or organizations |
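One possible way to obtain a custom namespace is to derive it from a predefined one and then use the result as the NAMESPACE for further UUIDs. The sketch below assumes a placeholder domain (example.com) and hypothetical component names (core, gui); it is an illustration, not a prescribed scheme.

```cmake
cmake_minimum_required(VERSION 3.10)

# Predefined DNS namespace UUID from RFC 4122
set(DNS_NAMESPACE "6ba7b810-9dad-11d1-80b4-00c04fd430c8")

# Step 1: derive a project-wide namespace from a domain name (placeholder)
string(UUID PROJECT_NAMESPACE
  NAMESPACE "${DNS_NAMESPACE}" NAME "example.com" TYPE SHA1)

# Step 2: derive per-component IDs inside that custom namespace
string(UUID CORE_ID NAMESPACE "${PROJECT_NAMESPACE}" NAME "core" TYPE SHA1)
string(UUID GUI_ID NAMESPACE "${PROJECT_NAMESPACE}" NAME "gui" TYPE SHA1)

message("Project namespace: ${PROJECT_NAMESPACE}")
message("core: ${CORE_ID}")
message("gui:  ${GUI_ID}")
```

Name-based UUIDs are deterministic, so the same namespace and name produce the same UUID on every run.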
JSON
The bracket syntax can be used to handle strings containing special characters or nested brackets, avoiding the hassle of escaping, in the following form:
[=[String Content]=]
The opening delimiter is [=[ and the closing delimiter is ]=]. The number of = signs must be the same on both sides (zero or more). This makes it easy to define JSON:
set(tjson
[=[
{
"webs":{
"web":[
{
"name":"cmake",
"url":"cmake.org.cn"
},{
"name":"ffmpeg",
"url":"ffmpeg.club"
}
,{
"name":"tt",
"url":"tt.club"
}
]
}
}
]=])
CMake ≥ 3.19 supports JSON operations, providing query, modify, and compare operations, with relatively complex syntax as follows:
Query Operations
1. GET: Get the value of a JSON element
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
GET <json-string> <member|index> [<member|index> ...])
Function: Get the value of a JSON element by path (member names and/or array indices). The result is converted according to the JSON data type:
- Object/Array → Returned as a JSON string
- Boolean → ON / OFF
- Number/String → String
- null → Empty string
Example:
set(JSON_DATA [[
{
"project": {
"name": "MyApp",
"version": [1, 2, 3],
"debug": true
}
}
]])
# Get nested values
# project->name
string(JSON NAME GET "${JSON_DATA}" project name) # NAME = "MyApp"
# project->version[1]
string(JSON VERSION GET "${JSON_DATA}" project version 1) # VERSION = "2"
# project->debug
string(JSON DEBUG GET "${JSON_DATA}" project debug) # DEBUG = "ON"
2. TYPE: Get the type of a JSON element
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
TYPE <json-string> <member|index> [<member|index> ...])
Function: Returns the type of the specified element as one of the strings NULL, NUMBER, STRING, BOOLEAN, ARRAY, or OBJECT.
Example:
string(JSON TYPE_NAME TYPE "${JSON_DATA}" project version)
message("Type: ${TYPE_NAME}") # Output: ARRAY
3. LENGTH: Get the length of an array/object
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
LENGTH <json-string> [<member|index> ...])
Function: Returns the number of elements in an array or object.
Example:
string(JSON VER_LEN LENGTH "${JSON_DATA}" project version)
message("Version Length: ${VER_LEN}") # Output: 3
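LENGTH is most useful as the upper bound of a loop over an array. The following sketch walks the version array with GET and joins the elements into a dotted version string; the variable names are illustrative.

```cmake
cmake_minimum_required(VERSION 3.19)

set(JSON_DATA [[
{
  "project": {
    "name": "MyApp",
    "version": [1, 2, 3]
  }
}
]])

string(JSON ver_len LENGTH "${JSON_DATA}" project version)

set(parts "")
if(ver_len GREATER 0)
  # foreach(RANGE n) iterates 0..n inclusive, so stop at length - 1
  math(EXPR last "${ver_len} - 1")
  foreach(i RANGE ${last})
    string(JSON part GET "${JSON_DATA}" project version ${i})
    list(APPEND parts "${part}")
  endforeach()
endif()

list(JOIN parts "." version_string)
message("Version: ${version_string}") # Version: 1.2.3
```

The guard on ver_len avoids calling foreach(RANGE) with a negative bound when the array is empty.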
4. MEMBER: Get the name of an object member
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
MEMBER <json-string>
[<member|index> ...] <index>)
Function: Returns the name of the member at the specified index position in the object (index starts from 0).
Example:
set(JSON_DATA [[
{
"project": {
"name": "MyApp",
"version": [1, 2, 3],
"debug": true
}
}
]])
string(JSON MEMBER_NAME MEMBER "${JSON_DATA}" project 0)
message("First key: ${MEMBER_NAME}")
The code above asks for the member name at index 0. Based on the order in which the JSON was written, one would expect name, but the actual result is debug.
The likely explanation is that CMake's JSON parser does not preserve the insertion order of object members: they appear to be stored sorted alphabetically by key, so index 0 corresponds to debug instead of name.
Therefore, the practical advice is:
- Avoid relying on JSON object order: treat the order of keys as unspecified and access members directly by key name.
- Prefer GET over MEMBER: string(JSON ... GET) retrieves the target value directly, without dealing with indices.
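MEMBER is still the right tool when the keys are not known in advance. A sketch of that pattern, assuming nothing about the order in which the keys come back: LENGTH gives the number of members, MEMBER turns each index into a key, and GET fetches the value by that key.

```cmake
cmake_minimum_required(VERSION 3.19)

set(JSON_DATA [[
{
  "project": {
    "name": "MyApp",
    "debug": true
  }
}
]])

string(JSON count LENGTH "${JSON_DATA}" project)

if(count GREATER 0)
  math(EXPR last "${count} - 1")
  foreach(i RANGE ${last})
    # Index -> key name, then key name -> value
    string(JSON key MEMBER "${JSON_DATA}" project ${i})
    string(JSON value GET "${JSON_DATA}" project "${key}")
    message("${key} = ${value}")
  endforeach()
endif()
```

The loop prints every key/value pair of the project object without depending on a particular member order.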
Modification Operations
5. SET: Set or add elements
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
SET <json-string>
<member|index> [<member|index> ...] <value>)
Function: Modify or add the element at the specified path and return the modified JSON string. If the last path element does not exist and its parent is an object, a new member is created. The <value> must itself be valid JSON.
Example:
set(JSON_DATA [[
{
"project": {
"name": "MyApp",
"version": [1, 2, 3],
"debug": true
}
}
]])
# Modify the version array to [1, 5, 3]
# project->version[1] = 5 (the index and the new value are separate arguments)
string(JSON NEW_JSON SET "${JSON_DATA}" project version 1 5)
message("New JSON:\n${NEW_JSON}")
# Insert a test object under project, with the value {"hello":"world"}
string(JSON NEW_JSON SET "${JSON_DATA}" project test [=[{"hello":"world"}]=])
message("NEW_JSON:\n${NEW_JSON}")
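Two details of SET are easy to trip over: a string value must carry its own JSON quotes, and an array can be extended by using an index that is greater than or equal to its current length. A minimal sketch with illustrative data:

```cmake
cmake_minimum_required(VERSION 3.19)

set(JSON_DATA [[
{
  "name": "MyApp",
  "version": [1, 2, 3]
}
]])

# The value must be valid JSON, so a string needs embedded quotes
string(JSON out SET "${JSON_DATA}" name "\"NewApp\"")

# Index 3 equals the array length, so the value 4 is appended
string(JSON out SET "${out}" version 3 4)

string(JSON new_name GET "${out}" name)
string(JSON new_len LENGTH "${out}" version)
message("name=${new_name}, version length=${new_len}") # name=NewApp, version length=4
```

Passing NewApp without the embedded quotes would not be valid JSON and would be reported as an error.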
6. REMOVE: Delete elements
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
REMOVE <json-string> <member|index> [<member|index> ...])
Function: Delete the specified path element, returning the modified JSON string.
Example:
string(JSON NEW_JSON REMOVE "${JSON_DATA}" project debug)
message("After remove:\n${NEW_JSON}")
Comparison Operations
7. EQUAL: Compare two JSON values for equality
Syntax:
string(JSON <out-var> [ERROR_VARIABLE <error-var>]
EQUAL <json-string1> <json-string2>)
Function: If the two JSON documents have the same structure and values, <out-var> is set to a true value, otherwise to a false value.
Example:
set(JSON1 [[{"a": 1}]])
set(JSON2 [[{"a": 1}]])
string(JSON IS_EQUAL EQUAL "${JSON1}" "${JSON2}")
message("Equal: ${IS_EQUAL}") # Prints a true value (ON)
Error Handling
- ERROR_VARIABLE: If this parameter is specified, the error message is stored in <error-var> instead of triggering a fatal error, and <out-var> is set to <member|index>-[<member|index>...]-NOTFOUND (or just NOTFOUND when no path is involved). If no error occurs, <error-var> is set to NOTFOUND.
- Error format: The value of <error-var> is a description of the error, for example a message saying that the member invalid_key does not exist.
Example:
string(JSON VALUE ERROR_VARIABLE ERR GET "${JSON_DATA}" invalid_key)
if(ERR) # NOTFOUND (no error) evaluates to false
  message(WARNING "Error: ${ERR}") # Prints the error description
endif()
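ERROR_VARIABLE makes it straightforward to fall back to a default when a key is missing. The helper below is a hypothetical function (json_get_or_default is not part of CMake) sketched for illustration; the sample JSON and key names are made up.

```cmake
cmake_minimum_required(VERSION 3.19)

# Hypothetical helper: look up a path in a JSON string, or return a default.
# Usage: json_get_or_default(<out-var> <json> <default> <member|index>...)
function(json_get_or_default out_var json default_value)
  string(JSON value ERROR_VARIABLE err GET "${json}" ${ARGN})
  if(err) # err is NOTFOUND (false) when the lookup succeeded
    set(value "${default_value}")
  endif()
  set(${out_var} "${value}" PARENT_SCOPE)
endfunction()

set(JSON_DATA [[
{
  "project": {
    "name": "MyApp"
  }
}
]])

json_get_or_default(NAME "${JSON_DATA}" "unknown" project name)
json_get_or_default(LICENSE "${JSON_DATA}" "unknown" project license)
message("name=${NAME}, license=${LICENSE}") # name=MyApp, license=unknown
```

Without ERROR_VARIABLE, the second lookup would stop the configure step with a fatal error.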