Java's regular expression facilities are wide ranging and powerful. When untrusted input is incorporated into a regular expression, that input can modify the original expression so that the resulting pattern matches far more widely than intended, possibly exposing far too much information.

The primary means of preventing this vulnerability is to sanitize any regular expression string that comes from untrusted input. Additionally, the programmer should look for ways to avoid building regular expressions from untrusted input altogether, or should provide the user with only a very limited subset of regular expression functionality.
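One way to avoid interpreting untrusted input as a regular expression at all is to quote it, so that every character matches only itself. The following is a minimal sketch, not part of the original rule, using the standard Pattern.quote() method; the class name QuotedSearchDemo and the method matchLiteralPrefix are hypothetical names chosen for illustration. Note that this gives the user no regular expression functionality, only literal prefix matching.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedSearchDemo {
    // Returns the search-string field of a log line if that field starts with
    // the untrusted text, or null if the line does not match.
    // (Hypothetical helper, for illustration only.)
    static String matchLiteralPrefix(String untrusted, String line) {
        // Pattern.quote wraps the input so that metacharacters lose their meaning.
        String regex = "^(" + Pattern.quote(untrusted) + ".*),[0-9]+?,[0-9]+?$";
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "Charles,1267775881,1175563058";
        System.out.println(matchLiteralPrefix("C", line)); // Charles
        // The injection attempt is now just an odd literal that matches nothing.
        System.out.println(matchLiteralPrefix("?:)(^C,[0-9]+?,[0-9]+?$)|(?:", line)); // null
    }
}
```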

Constructs and properties of Java regular expressions to watch out for include:

  • match flags used in non-capturing groups (these override matching options that may or may not have been passed into the compile() method)
  • greediness
  • grouping
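The effect of the first two items can be seen in a short sketch. This example is not from the original page; the class RegexConstructsDemo and its helper methods are hypothetical. It shows an embedded flag overriding the option passed to compile(), and the difference between a greedy and a reluctant quantifier on one line of a CSV log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexConstructsDemo {
    // Full-match test using the flags the application chose at compile time.
    static boolean fullMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // Returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Embedded flags: the pattern text itself turns case-insensitivity on or off,
        // regardless of what was passed to compile().
        System.out.println(fullMatch("bono", 0, "BONO"));                              // false
        System.out.println(fullMatch("(?i:bono)", 0, "BONO"));                         // true
        System.out.println(fullMatch("(?-i:bono)", Pattern.CASE_INSENSITIVE, "BONO")); // false

        // Greediness: a greedy .* runs to the last comma, a reluctant .*? stops at the first.
        String line = "Alice,1267773881,2147651708";
        System.out.println(firstGroup("^(.*),", line));  // Alice,1267773881
        System.out.println(firstGroup("^(.*?),", line)); // Alice
    }
}
```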

Since Java regular expressions are similar to Perl's, it is a good idea to apply lessons learned from Perl regular expressions.

Noncompliant Code Example

This class does not sanitize the incoming regular expression and, as a result, exposes too much information to the user.

The program searches a log of previous searches for entries that match a regular expression, in order to present search suggestions to the user.

A non-malicious use would be to enter "C" to match Charles and Cecilia. A malicious use would be to enter "?:)(^C,[0-9]+?,[0-9]+?$)|(?:", which grabs the times and IPs of the matching searches.

The injected parentheses close the application's capturing group early and open a new one, which defeats the grouping protection, and the OR operator (|) allows injection of any arbitrary regex. The resulting pattern is ^(?:)(^C,[0-9]+?,[0-9]+?$)|(?:.*),[0-9]+?,[0-9]+?$, so group 1 now captures the entire line rather than only the search string. With a keyword such as Bono in place of C, this input reveals all times and IPs at which that keyword was searched.

/* Say this logfile contains:
 * CSV style: search string, time (unix), ip (integer)
 *
 * Alice,1267773881,2147651708
 * Bono,1267774881,2147651708
 * Charles,1267775881,1175563058
 * Cecilia,1267773222,291232332
 *
 * and the CSVLog class has a readLine() method which retrieves a single line from the CSVLog and returns null when at EOF
 */
private CSVLog logfile;
 
// an application repeatedly calls this function that searches through
// the search log for search suggestions for autocompletion
public Set<String> suggestSearches(String search)
{
   Set<String> searches = new HashSet<String>();
    
   // construct regex from user's string
   // the regex matches valid lines, and the capturing group is meant to limit
   // the returned text to the search string field
   String regex = "^(" + search + ".*),[0-9]+?,[0-9]+?$";
   Pattern p = Pattern.compile(regex);
   String s;
   while ((s = logfile.readLine()) != null) { //gets a single line from the logfile
       Matcher m = p.matcher(s);
       if (m.find()) {
           String found = m.group(1);
           searches.add(found);
       }
   }
        
   return searches;
}
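
To see the injection in action, the same regex construction can be run against the four sample log lines. This is a sketch only: the class InjectionDemo is hypothetical, an in-memory list stands in for the CSVLog class, and the keyword Bono is used in the malicious input so that a line in the sample data matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InjectionDemo {
    // In-memory stand-in for the log file (an assumption of this sketch).
    static final List<String> LOG = Arrays.asList(
        "Alice,1267773881,2147651708",
        "Bono,1267774881,2147651708",
        "Charles,1267775881,1175563058",
        "Cecilia,1267773222,291232332");

    // Same unsanitized regex construction as the noncompliant method.
    static List<String> suggest(String search) {
        Pattern p = Pattern.compile("^(" + search + ".*),[0-9]+?,[0-9]+?$");
        List<String> found = new ArrayList<>();
        for (String line : LOG) {
            Matcher m = p.matcher(line);
            // group(1) is null when a line matched only through the second
            // alternative of the injected pattern; those are skipped here.
            if (m.find() && m.group(1) != null) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(suggest("C"));
        // [Charles, Cecilia]
        System.out.println(suggest("?:)(^Bono,[0-9]+?,[0-9]+?$)|(?:"));
        // [Bono,1267774881,2147651708]
    }
}
```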

Compliant Solution

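Solutions include parsing the CSV into a class prior to matching, or whitelisting only certain characters (such as letters and digits). Blacklisting might be difficult due to the variability of the regex language. This compliant solution takes the whitelisting approach: it filters the search string so that only letters and digits remain before the regex is constructed.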

/* Say this logfile contains:
 * CSV style: search string, time (unix), ip (integer)
 *
 * Alice,1267773881,2147651708
 * Bono,1267774881,2147651708
 * Charles,1267775881,1175563058
 * Cecilia,1267773222,291232332
 *
 * and the CSVLog class has a readLine() method which retrieves a single line from the CSVLog and returns null when at EOF
 */
private CSVLog logfile;
 
// an application repeatedly calls this function that searches through the search log 
// for search suggestions for autocompletion
public Set<String> suggestSearches(String search)
{
   Set<String> searches = new HashSet<String>();

   // filter search: keep only letters and digits (whitelist)
   StringBuilder sb = new StringBuilder(search.length());
   for (int i = 0; i < search.length(); ++i) {
       char ch = search.charAt(i);
       if (Character.isLetterOrDigit(ch))
           sb.append(ch);
   }
   search = sb.toString();
    
   // construct regex from the filtered string
   // the regex matches valid lines, and the capturing group limits
   // the returned text to the search string field
   String regex = "^(" + search + ".*),[0-9]+?,[0-9]+?$";
   Pattern p = Pattern.compile(regex);
   String s;
   while ((s = logfile.readLine()) != null) { //gets a single line from the logfile
       Matcher m = p.matcher(s);
       if (m.find()) {
           String found = m.group(1);
           searches.add(found);
       }
   }
        
   return searches;
}
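
The other approach mentioned above, parsing each CSV line into a class before matching, could be sketched as follows. This is an illustration under stated assumptions rather than the rule's own solution: the classes ParsedLogSearch and Entry are hypothetical, an in-memory list stands in for CSVLog, and search strings are assumed to contain no commas. Because the user's pattern is applied only to the search-string field, it never sees the time or IP columns.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ParsedLogSearch {
    // Hypothetical value class for one log line: search string, time, ip.
    static final class Entry {
        final String search;
        final long time;
        final long ip;

        Entry(String search, long time, long ip) {
            this.search = search;
            this.time = time;
            this.ip = ip;
        }

        // Returns null for malformed lines (assumes no commas in search strings).
        static Entry parse(String line) {
            String[] parts = line.split(",", -1);
            if (parts.length != 3) {
                return null;
            }
            try {
                return new Entry(parts[0], Long.parseLong(parts[1]), Long.parseLong(parts[2]));
            } catch (NumberFormatException e) {
                return null;
            }
        }
    }

    // Applies the user's pattern to the search field only, anchored at its start.
    static Set<String> suggest(String userRegex, List<String> lines) {
        Set<String> searches = new LinkedHashSet<>();
        Pattern p;
        try {
            p = Pattern.compile(userRegex);
        } catch (PatternSyntaxException e) {
            return searches; // reject invalid patterns with no suggestions
        }
        for (String line : lines) {
            Entry e = Entry.parse(line);
            if (e != null && p.matcher(e.search).lookingAt()) {
                searches.add(e.search);
            }
        }
        return searches;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "Alice,1267773881,2147651708",
            "Bono,1267774881,2147651708",
            "Charles,1267775881,1175563058",
            "Cecilia,1267773222,291232332");
        System.out.println(suggest("C", log)); // [Charles, Cecilia]
    }
}
```

With this design, a pattern such as Bono,[0-9]+ returns nothing, because the field being matched is just Bono; at worst a permissive pattern returns search strings, never times or IPs.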

Risk Assessment

Rule    | Severity | Likelihood | Remediation Cost | Priority | Level
IDS18-J | medium   | probable   | high             | P8       | L2

References

CWE ID 625 Permissive Regular Expressions

CVE-2005-1949 Arbitrary command execution in ePing plugin for e107 portal due to an overly permissive regular expression parsing an IP
