Perl's capture variables ($1, $2, etc.) are set to the text matched by the corresponding capture groups after a regular expression (regex) match succeeds. If a regex fails to match, the capture variables are not reset: they keep whatever values they held before, which may be undefined or may be left over from an earlier successful match. The perlre manpage [Wall 2011] contains this note:

NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.

Consequently, after a failed match a capture variable may hold a stale value from an earlier match. Its value is also overwritten by any subsequent successful match. Always ensure that a regex was successful before reading its capture variables.
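
One way to follow this advice without reading $1 at all is to evaluate the match in list context, where a successful match returns the captured substrings and a failed match returns an empty list. The following is a minimal sketch of that idiom; the helper name first_capture is hypothetical and is not part of the original rule.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture group of $regex applied to
# $text, or undef when the match fails. It never reads $1.
sub first_capture {
    my ($text, $regex) = @_;
    # In list context, a successful match returns the list of captures;
    # a failed match returns an empty list, leaving $value undefined.
    my ($value) = $text =~ $regex;
    return $value;
}

my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

my $cd   = first_capture($data, qr/Attached scsi CD-ROM (.*)/);
my $time = first_capture($data, qr/\[(\d*)\]/);   # this regex fails to match

print "cd is ",   (defined $cd   ? $cd   : "(no match)"), "\n";
print "time is ", (defined $time ? $time : "(no match)"), "\n";
```

Because the result is copied into a lexical variable in the same statement as the match, a later match cannot overwrite it, and a failed match shows up as an undefined value rather than a stale one.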

Noncompliant Code Example

This noncompliant code example demonstrates the hazards of relying on capture variables without testing the success of a regex.

my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

my $cd;
my $time;

$data =~ /Attached scsi CD-ROM (.*)/;
$cd = $1;
print "cd is $cd\n";

$data =~ /\[(\d*)\].*/;  # this regex will fail
$time = $1;
print "time is $time\n";

This code produces the following output:

cd is sr0
time is sr0

The second regex fails because \d* cannot match the spaces or the decimal point inside the brackets. Because failed matches do not reset the capture variables, $1 still holds sr0 from the first match, and that stale value is assigned to $time.

Compliant Solution

In this compliant solution, both regular expressions are checked for success before the capture variables are accessed.

my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

my $cd;
my $time;

if ($data =~ /Attached scsi CD-ROM (.*)/) {
  $cd = $1;
  print "cd is $cd\n";
}

if ($data =~ /\[(\d*)\].*/) {  # this regex will fail
  $time = $1;
  print "time is $time\n";
}

This code produces the following output:

cd is sr0

This output might not be what the developer expected, but it clearly reveals that the latter regex failed to find a match.
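
If the developer's intent was to capture the kernel timestamp, the pattern itself also needs adjusting. The sketch below assumes that intent, which the original example does not state, and still tests the match before reading $1.

```perl
use strict;
use warnings;

my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

# Assumed intent: capture the timestamp between the brackets.
# \s* skips the padding spaces; \d+\.\d+ matches digits, a literal dot,
# and more digits.
my $time;
if ($data =~ /\[\s*(\d+\.\d+)\]/) {
    $time = $1;    # safe: the match has just succeeded
    print "time is $time\n";
}
else {
    print "no timestamp found\n";
}
```

With the sample line above, the match succeeds and $time is set from the match that was just tested.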

Risk Assessment

Recommendation   Severity   Likelihood   Remediation Cost   Priority   Level
STR30-PL         Medium     Probable     Medium             P8         L2

Automated Detection

Tool           Diagnostic
Perl::Critic   RegularExpressions::ProhibitCaptureWithoutTest

Bibliography

 


2 Comments

  1. I would go further and say that you should not use the capture variables for anything other than assignment to your own variable. There are many gotchas. For example, calling a subroutine as foo($1) will have bizarre effects if the implementation of foo() does a regexp match before unpacking its argument list.
    1. Agreed. Reworded intro & title.