Skip to content

s.t.cmd.CommandLineParser has exponential performance #485

Closed
@retronym

Description

@retronym
⚡ for N in 10000 20000 40000 80000; do (yes /path/to/fileeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee.scal
10000
error: source file '/path/to/fileeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee.scala' could not be found
one error found

real	0m2.662s
user	0m6.006s
sys	0m0.323s
20000
error: source file '/path/to/fileeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee.scala' could not be found
one error found

real	0m6.835s
user	0m13.575s
sys	0m0.627s
40000
error: source file '/path/to/fileeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee.scala' could not be found
one error found

real	0m24.827s
user	0m50.809s
sys	0m1.815s
80000
error: source file '/path/to/fileeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee.scala' could not be found
one error found

real	2m4.690s
user	4m5.447s
sys	0m6.882s

I thought I fixed this in scala/scala@0c520b5, but it was only fixed on JDK versions in which String.substring was cheap (sharing the backing char[]).

The code in question is in https://github.com/scala/scala/blob/0022db978beadb8759cc2ede6d8ef5e4a8a1c58c/src/compiler/scala/tools/cmd/CommandLineParser.scala#L75

  • Write a unit test to show the current behaviour of this code
  • Write a JMH benchmark that ranges across different input sizes (like the bash command above) to show the performance problem
  • Rewrite the recursive method commandLine to avoid taking ever-smaller substrings of the input, perhaps by using String.subsequence (which is cheap), or some analagous method keeping track of how much input remains to be parsed.

I believe that most build tools aren't affected by this performance bug, as they tend to programattically provide the list of sources to the compiler, rather than providing them as part of the list of compiler options.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions