FIO02-J. Do not assume that read() has filled all the elements of an array

The contracts of the read methods for the InputStream and Reader families are complicated. According to the Java API [API 2006] for the class InputStream, the read(byte[], int, int) method provides the following behavior:

The default implementation of this method blocks until the requested amount of input data len has been read, end of file is detected, or an exception is thrown. Subclasses are encouraged to provide a more efficient implementation of this method.

However, the read(byte[]) method states that it:

Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. The number of bytes read is, at most, equal to the length of b.

Note that the read() methods may return as soon as some input data is available. Consequently, read() can stop reading before the array is filled, because not enough data may be available at the time of the call to fill it.

Ignoring the result returned by the read() methods is a violation of rule EXP00-J. Do not ignore values returned by methods. Security issues can arise even when return values are considered, because the default behavior of the read() methods lacks any guarantee that the entire buffer array will be filled. Therefore, when using read() to fill an array, the program must check the return value of read(), and handle the case where the array is only partially filled. In such cases, the program may try to fill the rest of the array, or work only with the subset of the array that was filled, or throw an exception.
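
One of the options named above, working only with the subset of the array that was filled, can be sketched as follows. This is a minimal illustration, not part of the original rule; the class name PartialReads and the method readAvailable are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class used only for illustration.
final class PartialReads {
  // Performs a single read() and returns exactly the bytes that were stored,
  // so callers never see the unfilled tail of the buffer.
  static byte[] readAvailable(InputStream in, int max) throws IOException {
    byte[] buffer = new byte[max];
    int n = in.read(buffer);       // n may be anywhere from -1 to max
    if (n == -1) {
      return new byte[0];          // end of stream: nothing was stored
    }
    return Arrays.copyOf(buffer, n); // keep only the filled prefix
  }
}
```

The returned array's length tells the caller how much data actually arrived, which avoids treating stale or zero-valued elements as input.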

This rule applies only to the read() methods that take an array argument. To read a single byte, use the form of InputStream.read() that takes no arguments and returns an int indicating the byte read. To read a single character, use the form of Reader.read() that takes no arguments and returns an int indicating the next character read.
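
As a small sketch of the no-argument form, the following loop reads one byte at a time and relies on the int return value to tell end of stream (-1) apart from a data byte (0 to 255). The class name SingleByteReads and the method countBytes are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class used only for illustration.
final class SingleByteReads {
  // Counts the bytes in a stream using the no-argument read().
  static int countBytes(InputStream in) throws IOException {
    int count = 0;
    int b;
    // Keep the result as an int: a data byte of 0xFF is returned as 255,
    // which is distinct from the end-of-stream value -1.
    while ((b = in.read()) != -1) {
      count++;
    }
    return count;
  }
}
```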

Noncompliant Code Example (`read()`)

This noncompliant code example attempts to read 1024 bytes encoded in UTF-8 from an InputStream and to return them as a String. It explicitly specifies the encoding used to build the string, in compliance with IDS17-J. Use compatible encodings on both sides of file or network IO.

public static String readBytes(InputStream in) throws IOException {
  byte[] data = new byte[1024];
  if (in.read(data) == -1) {
    throw new EOFException();
  }
  return new String(data, "UTF-8");
}

The programmer's misunderstanding of the general contract of the read() methods can result in failure to read the intended data in full. A single call to read() may store fewer than 1024 bytes in the array even though additional data remain available from the InputStream. The unfilled remainder of the array is then decoded as though it were input.
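
The short read can be reproduced with a small test harness. The sketch below is an assumption-laden illustration: ChunkedInputStream is a hypothetical wrapper, not a JDK class, that hands over at most a fixed number of bytes per call, roughly imitating a socket or pipe that delivers data as it arrives.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demonstration class; none of these names come from the rule.
final class ShortReadDemo {
  // Delivers at most `chunk` bytes per read call, even when more are available.
  static final class ChunkedInputStream extends FilterInputStream {
    private final int chunk;

    ChunkedInputStream(InputStream in, int chunk) {
      super(in);
      this.chunk = chunk;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      return super.read(b, off, Math.min(len, chunk));
    }
  }

  // Returns the result of one read(byte[]) call on a 1024-byte buffer.
  static int firstReadSize(int available, int chunk) throws IOException {
    InputStream in =
        new ChunkedInputStream(new ByteArrayInputStream(new byte[available]), chunk);
    byte[] data = new byte[1024];
    return in.read(data); // FilterInputStream routes this to read(b, 0, b.length)
  }
}
```

With 2048 bytes available and a chunk size of 100, firstReadSize returns 100, so 924 elements of the buffer are left unfilled after a call that did not signal end of stream.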

Compliant Solution (Multiple calls to `read()`)

This compliant solution reads all the desired bytes into its buffer, accounting for the total number of bytes read and adjusting the remaining bytes' offset, thus ensuring that the required data are read in full. It also avoids splitting multibyte encoded characters across buffers by deferring construction of the result string until the data have been read in full. See IDS13-J. Do not assume every character in a string is the same size for more information.

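public static String readBytes(InputStream in) throws IOException {
  int offset = 0;     // Total number of bytes stored in data so far
  int bytesRead = 0;  // Result of the most recent read() call
  byte[] data = new byte[1024];
  // Keep reading until the buffer is full or end of stream is reached
  while (offset < data.length
      && (bytesRead = in.read(data, offset, data.length - offset)) != -1) {
    offset += bytesRead;
  }
  // Decode only the bytes that were actually read
  String str = new String(data, 0, offset, "UTF-8");
  return str;
}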

Compliant Solution (`readFully()`)

The one-argument and three-argument readFully() methods of the DataInputStream class guarantee that they read all of the requested data or throw an exception. These methods throw EOFException if they detect the end of input before the required number of bytes has been read; they throw IOException if some other input/output error occurs.

public static String readBytes(FileInputStream fis) throws IOException {
  byte[] data = new byte[1024];
  DataInputStream dis = new DataInputStream(fis);
  dis.readFully(data);
  String str = new String(data, "UTF-8");
  return str;
}
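
The failure mode of readFully() on short input can be checked with a small sketch. The class name ReadFullyDemo and the method throwsEof are hypothetical, and a ByteArrayInputStream stands in for a real file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demonstration class used only for illustration.
final class ReadFullyDemo {
  // Reports whether readFully() hits end of input before `wanted` bytes arrive.
  static boolean throwsEof(byte[] input, int wanted) throws IOException {
    DataInputStream dis = new DataInputStream(new ByteArrayInputStream(input));
    try {
      dis.readFully(new byte[wanted]); // fills the array completely or throws
      return false;
    } catch (EOFException e) {
      return true; // the stream ended before the array was filled
    }
  }
}
```

Callers that expect input shorter than the buffer should therefore either size the buffer to the expected length or catch EOFException, rather than assuming a partially filled array is returned.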

Risk Assessment

Failure to comply with this rule can result in the wrong number of bytes being read or character sequences being interpreted incorrectly.

Rule	Severity	Likelihood	Remediation Cost	Priority	Level
FIO02-J	low	unlikely	medium	P2	L3

Automated Detection

TODO

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

[MITRE 2009]	CWE ID 135, "Incorrect Calculation of Multi-Byte String Length" (http://cwe.mitre.org/data/definitions/135.html)

Bibliography

[API 2006]	Class `InputStream`, `DataInputStream`
[Chess 2007]	8.1 Handling Errors with Return Codes
[Harold 1999]	Chapter 7: Data Streams, Reading Byte Arrays
[Phillips 2005]
