CERT

A Uniform Resource Locator (URL) specifies both the location of a resource and a method to access it. A Uniform Resource Identifier (URI) is a string of characters used to identify a resource; it is a more general concept than a URL.

According to the Java API documentation for class URL [API 2006]:

Two hosts are considered equivalent if both host names can be resolved into the same IP addresses; else if either host name can't be resolved, the host names must be equal without regard to case; or both host names equal to null.

The concept of virtual hosting allows a web server to host multiple websites on the same computer, sometimes sharing the same IP address. Unfortunately, when the URL class was designed, this technique was unanticipated. Consequently, when two completely different URLs resolve to the same IP address, the URL class will consider them to be equal.

Noncompliant Code Example

Consider an application that allows an organization's employees to access an external mail service via http://mailwebsite.com. The application is designed to deny access to other websites by behaving as a makeshift firewall. However, a crafty or malicious user can nevertheless access an illegal website http://illegalwebsite.com that is hosted on the same computer as the legitimate website and consequently shares the same IP address. Even worse, an attacker can register multiple websites (for phishing purposes) until one is registered on the same computer, consequently defeating the firewall.
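The scenario above can be sketched as follows. This is a minimal illustration, not the original sample: the class name UrlFilter and the helper isAllowed() are hypothetical. The check relies on URL.equals(), so for two different host names its result depends on how those names resolve at run time.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical filter, for illustration only. It relies on URL.equals(),
// which may resolve both host names and compare the resulting IP addresses.
class UrlFilter {
  private static final String ALLOWED = "http://mailwebsite.com";

  // Returns true if URL.equals() considers the request equal to ALLOWED.
  // Malformed input is denied. For a different host name with the same
  // scheme, port, and path, the outcome depends on name resolution.
  static boolean isAllowed(String request) {
    try {
      URL allowed = new URL(ALLOWED);
      return allowed.equals(new URL(request)); // Noncompliant comparison
    } catch (MalformedURLException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    if (args.length != 1 || !isAllowed(args[0])) {
      throw new SecurityException("Access Denied");
    }
    // Else proceed
  }
}
```

If http://illegalwebsite.com resolves to the same IP address as http://mailwebsite.com, this check would admit it.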

Compliant Solution

The URI class was introduced in Java version 1.4. According to the Java API documentation for class URI [API 2006]:

A URI may be either absolute or relative. A URI string is parsed according to the generic syntax without regard to the scheme, if any, that it specifies. No lookup of the host, if any, is performed, and no scheme-dependent stream handler is constructed.

This compliant solution uses a URI object instead of a URL. The filter appropriately blocks the website when presented with a string other than http://mailwebsite.com because the comparison fails.
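A minimal sketch of such a filter is given below. The class name UriFilter and the helper isAllowed() are hypothetical and chosen for illustration; the comparison itself uses only URI.equals(), which performs no host lookup.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical filter, for illustration only. URI.equals() compares the
// parsed components of the two URIs and never resolves host names.
class UriFilter {
  private static final URI ALLOWED = URI.create("http://mailwebsite.com");

  // Returns true only if the request parses to a URI equal to ALLOWED.
  // Input that is not a syntactically valid URI is denied.
  static boolean isAllowed(String request) {
    try {
      return ALLOWED.equals(new URI(request));
    } catch (URISyntaxException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    if (args.length != 1 || !isAllowed(args[0])) {
      throw new SecurityException("Access Denied");
    }
    // Else proceed
  }
}
```

URI.equals() compares the scheme and host without regard to case but compares the path exactly, so in this sketch http://mailwebsite.com/ (with a trailing slash) is not equal to http://mailwebsite.com. A real filter would need to decide how to treat such variants.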

The URI class also performs normalization (removing extraneous path segments such as '..') and relativization of paths [API 2006], [Darwin 2004]. Because a URI object lacks methods for opening the resource it identifies, programs must construct a URL from the URI (for example, by calling its toURL() method) when opening the resource is required.
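The following sketch illustrates normalization, relativization, and conversion to a URL. The class name UriToUrl, its helper methods, and the example paths are hypothetical; only URI.normalize(), URI.relativize(), and URI.toURL() come from the Java API.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helpers, for illustration only.
class UriToUrl {
  // Removes "." and ".." path segments where possible.
  static URI normalized(String uri) {
    return URI.create(uri).normalize();
  }

  // Expresses target relative to base when base is a prefix of target.
  static URI relativeTo(String base, String target) {
    return URI.create(base).relativize(URI.create(target));
  }

  // Converts an absolute URI to a URL. No connection is opened here;
  // the caller opens the resource later through the returned URL.
  static URL toUrl(URI uri) {
    try {
      return uri.toURL();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Cannot convert to URL: " + uri, e);
    }
  }

  public static void main(String[] args) {
    URI clean = normalized("http://mailwebsite.com/inbox/../folders/./sent");
    System.out.println(clean);
    System.out.println(relativeTo("http://mailwebsite.com/folders/", clean.toString()));
    System.out.println(toUrl(clean).getHost());
  }
}
```

Comparisons and filtering are done on the URI; the URL is created only at the point where the resource must actually be opened.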

Risk Assessment

Using the equals() or hashCode() methods of a URL object may produce unexpected results because both methods can depend on host name resolution.
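Hash-based collections call hashCode() and equals() on their elements, so the same concern applies to a set or map keyed by URL objects. The sketch below stores URI objects instead; the class name AllowedSites and its methods are hypothetical and serve only as an illustration.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical allow list, for illustration only. Elements are URI objects,
// so hashing and equality involve no host name resolution.
class AllowedSites {
  private final Set<URI> allowed = new HashSet<>();

  // URI.create() throws IllegalArgumentException on invalid syntax.
  void allow(String site) {
    allowed.add(URI.create(site));
  }

  boolean contains(String site) {
    return allowed.contains(URI.create(site));
  }

  int size() {
    return allowed.size();
  }

  public static void main(String[] args) {
    AllowedSites sites = new AllowedSites();
    sites.allow("http://mailwebsite.com");
    System.out.println(sites.contains("http://illegalwebsite.com"));
  }
}
```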

Guideline   Severity   Likelihood   Remediation Cost   Priority   Level
IDS15-J     low        probable     medium             P4         L3

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this guideline on the CERT website.

Bibliography

[API 2006] Class URL and Class URI
[Darwin 2004] 18.8 URI, URL, or URN?
[Harold 1997] Chapter 3: Classes, Strings, and Arrays, The Object Class (equality)
[Techtalk 2007] "More Joy of Sets"



1 Comment

  1. [Harold 1997] writes:

    "These two URLs point to the same page on the Web. However the first URL goes over a 100 Megabit per second (Mbps) FDDI connection and the second over a 10 Mbps Ethernet connection. So these URLs are probably best considered to be unequal. In fact, Java considers [both these] URLs to be unequal."

    Does anyone know how this is handled nowadays?