Skip to content

Avoid unintentional matching of subwords in notifications#820

Merged
mherger merged 3 commits intoLMS-Community:public/8.4from
earlchew:issue-622
Nov 29, 2022
Merged

Avoid unintentional matching of subwords in notifications#820
mherger merged 3 commits intoLMS-Community:public/8.4from
earlchew:issue-622

Conversation

@earlchew
Copy link
Contributor

@earlchew earlchew commented Oct 2, 2022

Use \b to bracket words when creating regular expressions to avoid words matching subwords (eg "play" matching "playlist" and "display").

Fixes #622.

Use \b to bracket words when creating regular expressions
to avoid words matching subwords (eg "play" matching "playlist"
and "display").

Fixes LMS-Community#622.

my @namelist;
push @namelist, '\b' . $_ . '\b' for (@$names);
$regexp .= '(?:' . join('|', @namelist) . ')';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

Could this be

$regexp .= '(?:' . join('|', map { "\b$_\b" } @$names) . ')';

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelherger you're becoming worse than me 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh... that's @earlchew's suggestion, not mine 😀.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even the use of map?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map is great. If that's worse than you, I can live with it 😄.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing against map or your one-liner. It was making me smile because you rightfully often pointed out some of Perl code where I condense things way too much and I found this one was not bad 😄

@mherger mherger force-pushed the public/8.3 branch 2 times, most recently from 6a78514 to bc6b853 Compare October 26, 2022 04:56
Use map { } to simplify construction of the regular expression
for each subword.

Fixes LMS-Community#622.
@earlchew
Copy link
Contributor Author

Here are the test outcomes after refactoring a little more aggressively, and using the map function as suggested:

$ PERLBREW_PERL=5.16.3 perl --version

This is perl 5, version 16, subversion 3 (v5.16.3-1-local*) built for x86_64-linux-thread-multi
(with 1 registered patch, see perl -V for more detail)
...

$ PERLBREW_PERL=5.16.3 perl -w /tmp/test.pl

Test Nothing
(?^:)
(?^:)
(?^:)
(?^:)

Test Empty
Use of uninitialized value in concatenation (.) or string at /tmp/test.pl line 9.
(?^:)
(?^:(?:))
(?^:)
(?^:)

Test One List of One
(?^:123)
(?^:(?:\b123\b))
(?^:(?:\b123\b))
(?^:(?:\b123\b))

Test One List of Two
(?^:(?:123|456))
(?^:(?:\b123\b|\b456\b))
(?^:(?:\b123\b|\b456\b))
(?^:(?:\b123\b|\b456\b))

Test Two Lists of Two
(?^:(?:123|456),(?:789|012))
(?^:(?:\b123\b|\b456\b),(?:\b789\b|\b012\b))
(?^:(?:\b123\b|\b456\b),(?:\b789\b|\b012\b))
(?^:(?:\b123\b|\b456\b),(?:\b789\b|\b012\b))

The test exercises a couple of key test cases:

sub __requestRE_0 {
        my $possibleNames = shift || return qr /./;
        my $regexp = '';

        my $i = 0;

        for my $names (@$possibleNames) {
                $regexp .= ',' if $i++;
                $regexp .= (scalar @$names > 1) ? '(?:' . join('|', @$names) . ')' : $names->[0];
        }

        return qr /$regexp/;
}

sub __requestRE_1 {
        my $possibleNames = shift || return qr /./;
        my $regexp = '';

        my $i = 0;

        for my $names (@$possibleNames) {
                $regexp .= ',' if $i++;

                # Bracket each name using word boundaries to avoid
                # "play" matching "play", "playlist", and "display".

                my @namelist;
                push @namelist, '\b' . $_ . '\b' for (@$names);
                $regexp .= '(?:' . join('|', @namelist) . ')';
        }

        return qr /$regexp/;
}

sub __requestRE_2 {
	my $possibleNames = shift || return qr /./;
	my @regexp = ();

	for my $names (@$possibleNames) {

		# Bracket each name using word boundaries to avoid
		# "play" matching "play", "playlist", and "display".

		push @regexp,
			'(?:' . join('|', map { "\\b$_\\b" } @$names) . ')'
			if scalar @$names;
	}

	return qr /${\(join(',', @regexp))}/;
}

print "\nTest Nothing\n";
@names = ( );
print __requestRE_0(\@names) . "\n";
print __requestRE_1(\@names) . "\n";
print __requestRE_2(\@names) . "\n";

print "\nTest Empty\n";
@names = ( [] );
print __requestRE_0(\@names) . "\n";
print __requestRE_1(\@names) . "\n";
print __requestRE_2(\@names) . "\n";

print "\nTest One List of One\n";
@names = ( ["123"] );
print __requestRE_0(\@names) . "\n";
print __requestRE_1(\@names) . "\n";
print __requestRE_2(\@names) . "\n";

print "\nTest One List of Two\n";
@names = ( ["123", "456"] );
print __requestRE_0(\@names) . "\n";
print __requestRE_1(\@names) . "\n";
print __requestRE_2(\@names) . "\n";

print "\nTest Two Lists of Two\n";
@names = ( ["123", "456"], [789, "012"] );
print __requestRE_0(\@names) . "\n";
print __requestRE_1(\@names) . "\n";
print __requestRE_2(\@names) . "\n";

Copy link
Member

@michaelherger michaelherger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your effort! You could actually add the test script you created as a real test to the t/ subfolder 😉. Would be the first real unit test in LMS' long history!

Comment on lines 2408 to 2410
push @regexp,
'(?:' . join('|', map { "\\b$_\\b" } @$names) . ')'
if scalar @$names;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an if becomes multi-line I prefer to have it wrapping the conditional statement. Could you please change this?

Is the double backslash in "\\b" required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR "\b" is interpreted as a backspace (BS) whereas "\\b" eq '\b' and "\\b" eq '\\b'.

$pattern1 = '\\bbear';
$pattern2 = '\bbear';
$pattern3 = "\\bbear";
$pattern4 = "\bbear";

die unless
    $pattern1 eq $pattern2 &&
    $pattern1 eq $pattern3 &&
    $pattern1 ne $pattern4;;

print ' ' . ord($_) . ' ' . $_ for split(//, $pattern1);
print "\n";

print ' ' . ord($_) . ' ' . $_ for split(//, $pattern4);
print "\n";

$re1 = qr /$pattern1/;
$re4 = qr /$pattern4/;

$candidate = "black-bear";

die unless $candidate =~ $re1;
die if $candidate =~ $re4;

}

return qr /$regexp/;
return qr /${\(join(',', @regexp))}/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, just for readability, as regexes can be confusing enough themselves: could you store the result of the join in a variable $regexp and return as before?

Change layout of multiline conditional.

Fixes LMS-Community#622.
@mherger mherger changed the base branch from public/8.3 to public/8.4 November 29, 2022 07:38
@mherger mherger merged commit 8e88769 into LMS-Community:public/8.4 Nov 29, 2022
@mherger
Copy link
Contributor

mherger commented Nov 29, 2022

Thanks! I merged to 8.4 for now. Should be easy to backport should there be a good need.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

__requestRE should match words provided to Slim::Control::Request::subscribe

4 participants