Using pointer arithmetic such that the result does not point into or just past the end of the same object, using such pointers in arithmetic expressions, or dereferencing pointers that do not point to a valid object in memory results in potentially exploitable undefined behavior and must be avoided.

Likewise, using an array subscript such that the resulting reference does not refer to an element in the array also results in potentially exploitable undefined behavior and must be avoided.

The C99 standard [[ISO/IEC 9899:1999]] identifies four distinct situations in which undefined behavior (UB) may arise as a result of invalid pointer operations:

UB    Description

43    Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object.

44    Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated.

46    An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]).

59    An attempt is made to access, or generate a pointer to just past, a flexible array member of a structure when the referenced object provides no elements for that array.
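
To make undefined behavior 46 concrete: given int a[4][5], the expression a[1][7] would arithmetically land on the storage of a[2][2], yet the second subscript exceeds the bound of the inner array. The following is a minimal sketch, not part of the standard or of this rule, that checks each subscript against its own dimension. The helper names in_range and element_at are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };

static int a[ROWS][COLS];

/* Hypothetical helper: each subscript is checked against its own bound,
   not against the total size of the array. */
static bool in_range(size_t row, size_t col) {
  return row < ROWS && col < COLS;
}

/* Hypothetical helper: returns a pointer to a[row][col], or NULL if
   either subscript is out of range. */
static int *element_at(size_t row, size_t col) {
  return in_range(row, col) ? &a[row][col] : NULL;
}
```

With these helpers, element_at(1, 7) is rejected even though 1 * COLS + 7 is smaller than ROWS * COLS.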

Noncompliant Code Example (Forming Out Of Bounds Pointer)

In the following noncompliant code example, the function f() attempts to validate the index before using it as an offset into the statically allocated table of integers. However, the function fails to reject negative index values. When index is less than zero, the addition expression in the return statement results in undefined behavior 43. On some implementations, the addition alone may trigger a hardware trap. On other implementations, the undefined behavior may manifest only when the result of the addition is used or dereferenced.

#include <stddef.h>   /* NULL */

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int* f(int index) {
  if (index < TABLESIZE)
    return table + index;

  return NULL;
}

Compliant Solution

One compliant solution is to detect and reject any value of index that would cause the pointer arithmetic expression to form an invalid pointer.

#include <stddef.h>   /* NULL */

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int* f(int index) {
  if (0 <= index && index < TABLESIZE)
    return table + index;

  return NULL;
}

Another, slightly simpler and potentially more efficient, compliant solution is to declare index with an unsigned type. This removes the need to check for negative values while still rejecting out-of-bounds positive values of index.

#include <stddef.h>   /* size_t, NULL */

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int* f(size_t index) {
  if (index < TABLESIZE)
    return table + index;

  return NULL;
}
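
The unsigned version works because converting a negative int to size_t yields a very large value (the conversion is performed modulo SIZE_MAX + 1), which then fails the index < TABLESIZE test. The sketch below repeats f() from above and adds a caller that holds a signed index. The name lookup_from_int is hypothetical and exists only for illustration.

```c
#include <stddef.h>

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(size_t index) {
  if (index < TABLESIZE)
    return table + index;

  return NULL;
}

/* Hypothetical caller holding a signed index, for illustration only. */
int *lookup_from_int(int index) {
  /* A negative value converts to a value near SIZE_MAX, so f() rejects it
     with the same comparison that rejects large positive values. */
  return f((size_t)index);
}
```

Under these assumptions, lookup_from_int(-1) and lookup_from_int(100) both return NULL, while indices 0 through 99 yield pointers into table.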

Noncompliant Code Example (Dereferencing Out Of Bounds Pointer)

The noncompliant code example below shows the flawed logic in the Windows Distributed Component Object Model (DCOM) Remote Procedure Call (RPC) interface that was exploited by the W32.Blaster.Worm. The error is that the while loop in the GetMachineName() function (used to extract the host name from a longer string) is not sufficiently bounded. When the character array pointed to by pwszTemp does not contain a backslash character among its first MAX_COMPUTERNAME_LENGTH_FQDN + 1 elements, the loop runs past the end of wszMachineName and dereferences the past-the-end pointer, resulting in exploitable undefined behavior 44. In this case, the actual exploit allowed the attacker to inject executable code into a running program. Economic damage from the Blaster worm has been estimated to be at least $525 million [[Pethia 03]].

error_status_t _RemoteActivation(
      /* ... */, WCHAR *pwszObjectName, ... ) {
   *phr = GetServerPath(
              pwszObjectName, &pwszObjectName);
    /* ... */
}

HRESULT GetServerPath(
  WCHAR *pwszPath, WCHAR **pwszServerPath ){
  WCHAR *pwszFinalPath = pwszPath;
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1];
  hr = GetMachineName(pwszPath, wszMachineName);
  *pwszServerPath = pwszFinalPath;
}

HRESULT GetMachineName(
  WCHAR *pwszPath,
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1])
{
  pwszServerName = wszMachineName;
  LPWSTR pwszTemp = pwszPath + 2;
  while ( *pwszTemp != L'\\' )
    *pwszServerName++ = *pwszTemp++;
  /* ... */
}

Compliant Solution

In this compliant solution, the while loop in the GetMachineName() function is bounded so that the loop terminates when a backslash character is found, when the null termination character (L'\0') is encountered, or when the end of the buffer is reached. This code does not result in a buffer overflow, even if no L'\\' character is found in the string pointed to by pwszPath.

HRESULT GetMachineName(
  wchar_t *pwszPath,
  wchar_t wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1])
{
  wchar_t *pwszServerName = wszMachineName;
  wchar_t *pwszTemp = pwszPath + 2;
  wchar_t *end_addr
    = pwszServerName + MAX_COMPUTERNAME_LENGTH_FQDN;
  while ( (*pwszTemp != L'\\')
     &&  ((*pwszTemp != L'\0'))
     && (pwszServerName < end_addr) )
  {
    *pwszServerName++ = *pwszTemp++;
  }

  /* ... */
}

This compliant solution is for illustrative purposes and is not necessarily the solution implemented by Microsoft. This particular "solution" may not be correct, because there is no guarantee that a L'\\' is found.
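
One way to address the missing-backslash case is to report to the caller whether the delimiter was actually found. The following is a sketch only, written in standard C rather than against the Windows API, and it is not Microsoft's code. The function name copy_until_backslash and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: copies characters from src into dst until a
   backslash is seen. Returns true only if the backslash was found before
   src ended and before dst ran out of room. dst is always null-terminated
   when dst_len > 0. src must point to a null-terminated wide string. */
bool copy_until_backslash(const wchar_t *src, wchar_t *dst, size_t dst_len) {
  size_t i = 0;

  if (dst_len == 0)
    return false;

  /* Reserve one element of dst for the terminating null character. */
  while (src[i] != L'\\' && src[i] != L'\0' && i < dst_len - 1) {
    dst[i] = src[i];
    ++i;
  }
  dst[i] = L'\0';

  return src[i] == L'\\';
}
```

A caller can then treat a false return as a malformed or overlong name instead of continuing with a truncated one.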

Noncompliant Code Example (Apparently Accessible Out Of Range Index)

The noncompliant example below declares matrix to consist of 7 rows and 5 columns in row-major order. The function init_matrix() then attempts to iterate over all 35 elements and initialize each to the value given by the function argument x. However, the loop bounds are transposed: the outer loop counts up to COLS and the inner loop counts up to ROWS, while the subscripts are still applied in row-major order. When the value of j reaches COLS during the first iteration of the outer loop, the function attempts to access element matrix[0][5]. Since the type of matrix is int[7][5], the j subscript is out of range and the access has undefined behavior 46.

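#include <stddef.h>   /* size_t */

/* Enumeration constants are integer constant expressions in C, as
   required for the bounds of an array declared at file scope. */
enum { COLS = 5, ROWS = 7 };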

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  for (size_t i = 0; i != COLS; ++i)
    for (size_t j = 0; j != ROWS; ++j)
      matrix[i][j] = x;
}

Compliant Solution

The compliant solution below takes care to avoid using out-of-range indices by initializing matrix elements in the same row-major order as multidimensional objects are declared in C.

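#include <stddef.h>   /* size_t */

/* Enumeration constants are integer constant expressions in C, as
   required for the bounds of an array declared at file scope. */
enum { COLS = 5, ROWS = 7 };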

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  for (size_t i = 0; i != ROWS; ++i)
    for (size_t j = 0; j != COLS; ++j)
      matrix[i][j] = x;
}
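
As a possible further safeguard, each loop bound can be computed from the array expression that the corresponding subscript is applied to, so that the bound and the dimension cannot drift apart. This is a sketch of that idea under the same 7 by 5 declaration, not a requirement of this rule.

```c
#include <stddef.h>

enum { ROWS = 7, COLS = 5 };

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  /* The bound for i is the number of rows in matrix; the bound for j is
     the number of elements in the row matrix[i]. sizeof does not evaluate
     its operand here, so matrix[i] is never actually accessed by it. */
  for (size_t i = 0; i != sizeof matrix / sizeof matrix[0]; ++i)
    for (size_t j = 0; j != sizeof matrix[i] / sizeof matrix[i][0]; ++j)
      matrix[i][j] = x;
}
```

This technique applies only to true arrays. It does not work once the array has decayed to a pointer, for example when it is passed as a function argument.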

Noncompliant Code Example (Pointer Past Flexible Array Member)

In the following noncompliant example, the function find() attempts to iterate over the elements of the flexible array member buf, starting with the second element. However, since the function g() does not allocate any storage for the member, the expression first++ in find() attempts to form a pointer just past the end of buf when there are no elements. This attempt results in undefined behavior 59.

struct S {
  size_t len;
  char   buf[];   /* flexible array member */
};

char* find(const struct S *s, int c) {
  char *first = s->buf;
  char *last  = s->buf + s->len;

  while (first++ != last)   /* undefined behavior here */
    if (*first == (unsigned char)c)
      return first;

  return NULL;
}

void g() {
  struct S *s = (struct S*)malloc(sizeof (struct S));
  s->len = 0;
  /* ... */
  char *where = find(s, '.');
  /* ... */
}

Compliant Solution

The compliant solution increments the pointer only after confirming that it has not already reached the end of the array.

struct S {
  size_t len;
  char   buf[];   /* flexible array member */
};

char* find(const struct S *s, int c) {
  char *first = s->buf;
  char *last  = s->buf + s->len;

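  /* Increment only while elements remain, and stop before dereferencing
     the past-the-end pointer. */
  while (first != last && ++first != last)
    if ((unsigned char)*first == (unsigned char)c)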
      return first;

  return NULL;
}

void g() {
  struct S *s = (struct S*)malloc(sizeof (struct S));
  s->len = 0;
  /* ... */
  char *where = find(s, '.');
  /* ... */
}
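
The examples above allocate only sizeof (struct S) bytes, which provides no elements for buf. When elements are needed, the allocation has to include them explicitly. The following sketch shows one way to do that. The helper name make_s and its parameters are hypothetical and are not taken from this rule.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct S {
  size_t len;
  char   buf[];   /* flexible array member */
};

/* Hypothetical helper: allocates a struct S with room for n bytes in buf
   and copies n bytes from src. Returns NULL if the size computation would
   overflow or if the allocation fails. The caller frees the result. */
struct S *make_s(const char *src, size_t n) {
  if (n > SIZE_MAX - sizeof(struct S))
    return NULL;

  struct S *s = malloc(sizeof(struct S) + n);
  if (s == NULL)
    return NULL;

  s->len = n;
  if (n != 0)               /* touch buf only when it has elements */
    memcpy(s->buf, src, n);

  return s;
}
```

Keeping len equal to the number of elements actually allocated lets functions such as find() rely on s->buf + s->len being a valid one-past-the-end pointer whenever len is nonzero.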

Automated Detection

The Coverity Prevent Version 5.0 ARRAY_VS_SINGLETON checker can detect accesses to memory past the end of a memory buffer or array. The NEGATIVE_RETURNS checker can detect when a loop bound may become negative. The OVERRUN_STATIC and OVERRUN_DYNAMIC checkers can detect out-of-bounds reads and writes to arrays allocated statically or dynamically.

Compass/ROSE could be configured to catch violations of this rule. The way to catch the noncompliant code example (NCE) is to first hunt for code that follows this pattern:

for (LPWSTR pwszTemp = pwszPath + 2; *pwszTemp != L'\\'; pwszTemp++)

In particular, the iteration variable is a pointer, it gets incremented, and the loop condition does not set an upper bound on the pointer.

Once this case is handled, we can handle cases like the real NCE, which is effectively the same semantics, just different syntax.

Klocwork can detect violations of this rule with the ABV.ITERATOR and SV.TAINTED.LOOP_BOUND checkers. See the Klocwork Cross Reference.

Related Vulnerabilities

CVE-2008-1517 results from a violation of this rule. Before Mac OS X version 10.5.7, the xnu kernel accessed an array at an unverified, user-supplied index, allowing an attacker to execute arbitrary code by passing an index greater than the length of the array and thereby accessing memory outside the array [[xorl 2009]].

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Other Languages

TO DO.

References

[[ISO/IEC 9899:1999]] Section 6.7.5.2, "Array declarators"
[[ISO/IEC PDTR 24772]] "XYX Boundary Beginning Violation," "XYY Wrap-around Error," and "XYZ Unchecked Array Indexing"
[[CWE]] CWE-119: Failure to Constrain Operations within the Bounds of a Memory Buffer
[[CWE]] CWE-129: Unchecked Array Indexing
[[Finlay 03]]
[[Microsoft 03]]
[[Pethia 03]]
[[Seacord 05a]] Chapter 1, "Running with Scissors"
[[Viega 05]] Section 5.2.13, "Unchecked array indexing"
[[xorl 2009]] "CVE-2008-1517: Apple Mac OS X (XNU) Missing Array Index Validation"

