
fs.readFile is slower in v10 #25741

Closed

Description

@zbjornson
  • Version: v10.15.0
  • Platform: Ubuntu 16, Windows 10 (haven't tested macOS)
  • Subsystem: fs

I'm seeing a 7.6-13.5x drop in read throughput between 8.x and 10.x in both the readfile benchmark and our real-world benchmarks that heavily exercise fs.readFile. Based on my troubleshooting, I think it's from #17054.

The readfile benchmark (Ubuntu 16):

| Test | v8.15.0 | v10.15.0 | 8 ÷ 10 |
| --- | --- | --- | --- |
| concurrent=1 len=1024 | 6,661 | 7,066 | 1.06x |
| concurrent=10 len=1024 | 23,100 | 21,079 | 0.91x |
| concurrent=1 len=16777216 | 156.6 | 11.6 | 13.5x |
| concurrent=10 len=16777216 | 584 | 76.6 | 7.6x |

From what I can extract from the comments in #17054, either no degradation or a 3.6-4.8x degradation was expected for the len=16M cases.
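
For context, here's a rough sketch of what a partitioned read looks like compared to the one-shot approach in the benchmark below. This is my own illustration, not the actual lib/fs.js implementation, and the 512 KiB chunk size is an assumption:

const fs = require("fs");

// Illustration only: read `size` bytes from `fd` in fixed-size chunks, issuing
// one fs.read() per chunk so each chunk completes on a later event-loop turn.
function partitionedRead(fd, size, cb) {
	const CHUNK = 512 * 1024; // assumed chunk size, for illustration
	const data = Buffer.allocUnsafe(size);
	let offset = 0;
	(function readNext() {
		fs.read(fd, data, offset, Math.min(CHUNK, size - offset), offset, (err, bytesRead) => {
			if (err) return cb(err);
			offset += bytesRead;
			if (bytesRead === 0 || offset >= size) return cb(null, data);
			readNext();
		});
	})();
}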

As for why I suspect #17054: the benchmark below compares fs.readFile against an approximation of how fs.readFile used to work (a single one-shot read), measuring the time to read the same 16 MB file 50 times.

// npm i async

const fs = require("fs");
const async = require("async");

function chunked(filename, cb) {
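	// fs.readFile() as it works today; per #17054 it now reads the file in multiple chunks.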
	fs.readFile(filename, cb);
}

function oneshot(filename, cb) {
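	// Approximation of the pre-#17054 behavior: open, fstat, then read the whole file in a single fs.read() call.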
	fs.open(filename, "r", 0o666, (err, fd) => {
		if (err) return cb(err);
		const onerr = err => fs.close(fd, () => cb(err));
		fs.fstat(fd, (err, stats) => {
			if (err) return onerr(err);
			const data = Buffer.allocUnsafe(stats.size);
			fs.read(fd, data, 0, stats.size, 0, (err, bytesRead) => {
				if (err) return onerr(err);
				if (bytesRead !== stats.size) {
					const err = new Error("Read fewer bytes than requested");
					return onerr(err);
				}
				fs.close(fd, err => cb(err, data));
			});
		});
	});
}

fs.writeFileSync("./test.dat", Buffer.alloc(16e6, 'x'));

function bm(method, name, cb) {
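	// Time 50 sequential reads of ./test.dat with the given method and log the elapsed ms.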
	const start = Date.now();
	async.timesSeries(50, (n, next) => {
		method("./test.dat", next);
	}, err => {
		if (err) return cb(err);
		const diff = Date.now() - start;
		console.log(name, diff);
		cb();
	});
}

async.series([
	cb => bm(chunked, "fs.readFile()", cb),
	cb => bm(oneshot, "oneshot", cb)
], err => {
	if (err) throw err; // surface benchmark errors instead of silently swallowing them
});

| Node.js | OS | fs.readFile() (ms) | one-shot (ms) |
| --- | --- | --- | --- |
| v10.15.0 | Ubuntu 16 | 7320 | 370 |
| v8.15.0 | Ubuntu 16 | 693 | 378 |
| v10.15.0 | Win 10 | 2972 | 493 |

We've switched to fs.fstat() followed by a single fs.read() as a work-around (shown above; note how much code it takes), but I wouldn't be surprised if this change has negatively impacted other apps and tools.

As for the original justification: I'm not convinced it was an appropriate fix, or a problem that needed fixing. Web servers (where DoS attacks are a concern) should generally serve files with fs.createReadStream(...).pipe(response), ignoring other use cases for now; a sketch follows below. Other kinds of apps, like build tools, compilers, and test frameworks (where DoS attacks are irrelevant), are the ones that more often need to read an entire file as fast as possible. In some senses the fix made the DoS situation worse: it increases overhead (fewer useful CPU cycles), and it partitions reads without bound, so each new request delays all existing requests, which now wait for their own ticks plus some number of ticks from every other request. We built our web app to load-shed while maintaining quality of service for as many clients as possible, so interleaving is not what we want. Happy to discuss that further on- or off-line.
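
For reference, here's a minimal sketch of that streaming approach (hypothetical server and file name, not code from our app):

const fs = require("fs");
const http = require("http");

http.createServer((req, res) => {
	// Stream the file to the response; back-pressure keeps memory bounded per request.
	const stream = fs.createReadStream("./test.dat");
	stream.on("error", () => {
		res.statusCode = 500;
		res.end();
	});
	stream.pipe(res);
}).listen(8080);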

Is anyone else able to verify that this degradation exists and/or was expected?

cc: @davisjam

Metadata

Labels

fs: Issues and PRs related to the fs subsystem / file system.
performance: Issues and PRs related to the performance of Node.js.
