According to MISRA 2008, concatenation of wide and narrow string literals leads to undefined behavior. Up to and including C90, this behavior was implicitly undefined [ISO/IEC 9899:1990]. However, C99 defined this behavior [ISO/IEC 9899:1999], and C11 further explains in subclause 6.4.5, paragraph 5 [ISO/IEC 9899:2011]:
In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and identically-prefixed string literal tokens are concatenated into a single multibyte character sequence. If any of the tokens has an encoding prefix, the resulting multibyte character sequence is treated as having the same prefix; otherwise, it is treated as a character string literal. Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the treatment of the resulting multibyte character sequence are implementation-defined.
Nonetheless, all string literals in a concatenation should have the same type, so that the code relies neither on implementation-defined behavior nor, when compiled on a platform that supports only C90, on undefined behavior.
This noncompliant code example concatenates wide and narrow string literals. Although the behavior is undefined in C90, the programmer probably intended to create a wide string literal.
wchar_t *msg = L"This message is very long, so I want to divide it " "into two parts.";
If the concatenated string needs to be a wide string literal, each element in the concatenation must be a wide string literal, as in this compliant solution:
wchar_t *msg = L"This message is very long, so I want to divide it " L"into two parts.";
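A mixed concatenation often arises when one piece comes from a macro that expands to a narrow literal. The following is a minimal sketch of one way to give every piece the same prefix in that case. The names APP_NAME, WIDEN, WIDEN_PASTE, banner, expected, and banner_matches_expected are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical macro that expands to a narrow string literal. */
#define APP_NAME "demo"

/* Two-level expansion: WIDEN(APP_NAME) first expands APP_NAME to "demo",
   then WIDEN_PASTE pastes the L prefix onto it, giving L"demo". */
#define WIDEN_PASTE(s) L##s
#define WIDEN(s) WIDEN_PASTE(s)

/* Every piece carries the L prefix, so the concatenation does not depend on
   how an implementation treats mixed narrow and wide pieces. */
static const wchar_t banner[] = L"Starting " WIDEN(APP_NAME) L" now";

/* The same text as one wide literal, used only for comparison. */
static const wchar_t expected[] = L"Starting demo now";

/* Returns 1 if the concatenated literal equals the single-literal form. */
int banner_matches_expected(void)
{
    size_t n = sizeof banner / sizeof banner[0];
    return sizeof banner == sizeof expected
        && wmemcmp(banner, expected, n) == 0;
}
```

The two-level macro is needed because the operand of ## is not macro-expanded first. Applying WIDEN_PASTE directly to APP_NAME would paste L onto the identifier APP_NAME rather than onto the literal it expands to.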
If wide string literals are unnecessary, it is better to use narrow string literals, as in this compliant solution:
char *msg = "This message is very long, so I want to divide it " "into two parts.";
The concatenation of wide and narrow string literals could lead to undefined behavior.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR10-C | Low | Probable | Medium | P4 | L3 |
Tool | Version | Checker | Description |
---|---|---|---|
Astrée | | encoding-mismatch | Fully checked |
Axivion Bauhaus Suite | | CertC-STR10 | |
ECLAIR | | CC2.STR10 | Fully implemented |
Helix QAC | | C0874 | |
LDRA tool suite | | 450 S | Fully implemented |
Parasoft C/C++test | | CERT_C-STR10-a | Narrow and wide string literals shall not be concatenated |
PC-lint Plus | | 707 | Fully supported |
SonarQube C/C++ Plugin | | NarrowAndWideStringConcat | |
RuleChecker | | encoding-mismatch | Fully checked |
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
MISRA C++:2008 | Rule 2-13-5 |
[ISO/IEC 9899:2011] | Section 6.4.5, "String Literals" |