Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: 🎸 maxMatches to limit matches and exit early #238

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ This package provides methods for traversing the file system and returning pathn
* [Output control](#output-control)
* [absolute](#absolute)
* [markDirectories](#markdirectories)
* [maxMatches](#maxmatches)
* [objectMode](#objectmode)
* [onlyDirectories](#onlydirectories)
* [onlyFiles](#onlyfiles)
Expand Down Expand Up @@ -419,6 +420,18 @@ fs.sync('*', { onlyFiles: false, markDirectories: false }); // ['index.js', 'con
fs.sync('*', { onlyFiles: false, markDirectories: true }); // ['index.js', 'controllers/']
```

#### maxMatches

* Type: `number`
* Default: `Infinity`

Limits the number of matches.

```js
fs.sync('*', { maxMatches: Infinity }); // ['a.js', 'b.js', ...]
fs.sync('*', { maxMatches: 1 }); // ['a.js']
```

#### objectMode

* Type: `boolean`
Expand Down
1 change: 1 addition & 0 deletions src/providers/provider.ts
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ export default abstract class Provider<T> {
errorFilter: this.errorFilter.getFilter(),
followSymbolicLinks: this._settings.followSymbolicLinks,
fs: this._settings.fs,
maxMatches: this._settings.maxMatches,
stats: this._settings.stats,
throwErrorOnBrokenSymbolicLink: this._settings.throwErrorOnBrokenSymbolicLink,
transform: this.entryTransformer.getTransformer()
Expand Down
45 changes: 45 additions & 0 deletions src/readers/stream.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -137,5 +137,50 @@ describe('Readers → ReaderStream', () => {
done();
});
});

describe('maxMatches', () => {
it('can be used to limit matches', (done) => {
const reader = getReader();
const maxMatches = 2;
const readerOptions = getReaderOptions({
maxMatches,
entryFilter: () => true
});

reader.stat.yields(null, new Stats());

const entries: Entry[] = [];

const stream = reader.static(['1.txt', '2.txt', '3.txt'], readerOptions);

stream.on('data', (entry: Entry) => entries.push(entry));
stream.once('end', () => {
assert.strictEqual(entries.length, maxMatches);
assert.strictEqual(entries[0].name, '1.txt');
assert.strictEqual(entries[1].name, '2.txt');
done();
});
});

it('is ignored if less or equal than 1', (done) => {
const reader = getReader();
const readerOptions = getReaderOptions({
maxMatches: -1,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we need check here Infinity instead of -1, because -1 will be converted to the Infinity in the Settings.

entryFilter: () => true
});

reader.stat.yields(null, new Stats());

let matches = 0;

const stream = reader.static(['1.txt', '2.txt', '3.txt'], readerOptions);

stream.on('data', () => matches++);
stream.once('end', () => {
assert.strictEqual(matches, 3);
done();
});
});
});
});
});
18 changes: 17 additions & 1 deletion src/readers/stream.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,27 @@ export default class ReaderStream extends Reader<NodeJS.ReadableStream> {
const filepaths = patterns.map(this._getFullEntryPath, this);

const stream = new PassThrough({ objectMode: true });
let matches = 0;

stream._write = (index: number, _enc, done) => {
if (options.maxMatches === matches) {
// This is not ideal because we are still passing patterns to write
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help would be appreciated here @mrmlnc - I think the current solution is okayish, however ideally we would backpressure the stream in for (let i = 0; i < filepaths.length; i++) { based on the async stream.write task. Happy to follow another route as well if you have ideas.

// even though we know the stream is already finished. We can't use
// .writableEnded either because finding matches is asynchronous
// The best we could do is to await the write inside the for loop below
// however that would mean that this whole function would become async
done();
return;
}

return this._getEntry(filepaths[index], patterns[index], options)
.then((entry) => {
if (entry !== null && options.entryFilter(entry)) {
stream.push(entry);
matches++;
}

if (index === filepaths.length - 1) {
if (index === filepaths.length - 1 || options.maxMatches === matches) {
stream.end();
}

Expand All @@ -37,6 +49,10 @@ export default class ReaderStream extends Reader<NodeJS.ReadableStream> {
};

for (let i = 0; i < filepaths.length; i++) {
if (stream.writableEnded) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The writableEnded property exist in the Stream only start with Node.js 12.9.0+. The fast-glob package supports Node.js 8+. This works now because writableEnded has undefined value.

break;
}

stream.write(i);
}

Expand Down
33 changes: 33 additions & 0 deletions src/readers/sync.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -113,5 +113,38 @@ describe('Readers → ReaderSync', () => {

assert.strictEqual(actual.length, 0);
});

describe('maxMatches', () => {
it('can be used to limit matches', () => {
const reader = getReader();
const maxMatches = 2;
const readerOptions = getReaderOptions({
maxMatches,
entryFilter: () => true
});

reader.statSync.returns(new Stats());

const actual = reader.static(['1.txt', '2.txt', '3.txt'], readerOptions);

assert.strictEqual(actual.length, maxMatches);
assert.strictEqual(actual[0].name, '1.txt');
assert.strictEqual(actual[1].name, '2.txt');
});

it('is ignored if less or equal than 1', () => {
const reader = getReader();
const readerOptions = getReaderOptions({
maxMatches: -1,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same about -1 and Infinity.

entryFilter: () => true
});

reader.statSync.returns(new Stats());

const actual = reader.static(['1.txt', '2.txt', '3.txt'], readerOptions);

assert.strictEqual(actual.length, 3);
});
});
});
});
4 changes: 4 additions & 0 deletions src/readers/sync.ts
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ export default class ReaderSync extends Reader<Entry[]> {

public static(patterns: Pattern[], options: ReaderOptions): Entry[] {
const entries: Entry[] = [];
let matches = 0;

for (const pattern of patterns) {
const filepath = this._getFullEntryPath(pattern);
Expand All @@ -26,6 +27,9 @@ export default class ReaderSync extends Reader<Entry[]> {
}

entries.push(entry);
if (options.maxMatches === ++matches) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this is an incorrect place to filter by the number of entries found. The main problem is that only static patterns are processed at this place, which, by their nature (read as filepath), will always be in a single instance. There are two types of patterns in the fast-glob package: static and dynamic.

So, the most correct place to filter the entries — filters. You can use the unique filter as example. The logic is simple — you need to stop reading the directory in depth, as soon as the number of found entries exceeds a specified number in the maxMatches option. But even in this case we have a problem — we need to synchronize the number of found entries between filters (deep — directories, entries — directories, symbolic links and files). Right now I don't see a good solution for this problem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. It feels like the option fits the fast-glob package, but do you suggest I use nodelib/fs-walk instead?

Copy link
Owner

@mrmlnc mrmlnc Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your problem correctly (find the first file with some extension), then this package (@nodelib/fs.walk) seems to suit you more (just add filter function with includes). If you need patterns, then, yes, here you need fast-glob.

The filters I mentioned above need to be updated in fast-glob.

break;
}
}

return entries;
Expand Down
10 changes: 10 additions & 0 deletions src/settings.ts
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,13 @@ export type Options = {
* @default false
*/
markDirectories?: boolean;
/**
* Exit after having gathered `maxMatches` matches.
* If given, expects a positive number greater or equal to 1.
*
* @default Infinity
*/
maxMatches?: number;
/**
* Returns objects (instead of strings) describing entries.
*
Expand Down Expand Up @@ -165,6 +172,9 @@ export default class Settings {
public readonly globstar: boolean = this._getValue(this._options.globstar, true);
public readonly ignore: Pattern[] = this._getValue(this._options.ignore, [] as Pattern[]);
public readonly markDirectories: boolean = this._getValue(this._options.markDirectories, false);
// If 0 or negative maxMatches is given, we revert to infinite matches
// eslint-disable-next-line @typescript-eslint/strict-boolean-expressions
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this if I:

  • ignore negative and zero maxMatches
  • introduce a new class method that can bound maxMatches between 1 and Infinity
  • assert maxMatches to be gte 1 in the constructor

let me know what's preferred :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, the first option looks good to me because this logic is already exist for the deep and concurrency options.

public readonly maxMatches: number = Math.max(0, this._getValue(this._options.maxMatches, Infinity)) || Infinity;
public readonly objectMode: boolean = this._getValue(this._options.objectMode, false);
public readonly onlyDirectories: boolean = this._getValue(this._options.onlyDirectories, false);
public readonly onlyFiles: boolean = this._getValue(this._options.onlyFiles, true);
Expand Down
8 changes: 8 additions & 0 deletions src/tests/smoke/max-matches.smoke.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
import * as smoke from './smoke';

smoke.suite('Smoke → MarkDirectories', [
joscha marked this conversation as resolved.
Show resolved Hide resolved
{
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this test does not work as expected. This behaviour related to the comment about wrong place for filtering.

You can add debug: true to the test and the console will contain the following lines:

  Smoke → maxMatches
FIXTURES                                  NODE_GLOB  FAST_GLOB
----------------------------------------  ---------  ---------
fixtures/file.md                          +          +
fixtures/first                            +          +
fixtures/first/file.md                    +          +
fixtures/first/nested                     +          +
fixtures/first/nested/directory           +          +
fixtures/first/nested/directory/file.md   +          +
fixtures/first/nested/file.md             +          +
fixtures/second                           +          +
fixtures/second/file.md                   +          +
fixtures/second/nested                    +          +
fixtures/second/nested/directory          +          +
fixtures/second/nested/directory/file.md  +          +
fixtures/second/nested/file.md            +          +

    √ pattern: 'fixtures/**/*' (sync) (17ms)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to run this one - it seemed to error on my machine

pattern: 'fixtures/**/*',
fgOptions: { maxMatches: 1 }
}
]);
1 change: 1 addition & 0 deletions src/types/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ export type ReaderOptions = fsWalk.Options & {
entryFilter: EntryFilterFunction;
errorFilter: ErrorFilterFunction;
fs: FileSystemAdapter;
maxMatches: number;
stats: boolean;
};

Expand Down