fs.readFile throws asynchronously with encoding on large enough input #2767

Closed
@timoxley

Description

When the 'utf8' encoding is specified and the input file is large enough, fs.readFile throws an asynchronous error.

Example program

var fs = require('fs');
fs.readFile('data.txt', 'utf8', function(err, data) {
  console.log(err, data && data.length)
})

Input: 268 megabytes of data

Success.

> dd if=/dev/zero of=data.txt  bs=1000000  count=268
> node -e "fs = require('fs'); fs.readFile('data.txt', 'utf8', (err, data) => console.log(err, data && data.length))"
null 268000000

Input: 269 megabytes of data

No success.
Worse, the error is not forwarded to the callback; the process throws asynchronously.

> dd if=/dev/zero of=data.txt  bs=1000000  count=269
> node -e "fs = require('fs'); fs.readFile('data.txt', 'utf8', (err, data) => console.log(err, data && data.length))"
buffer.js:359
    throw new Error('toString failed');
    ^

Error: toString failed
    at Buffer.toString (buffer.js:359:11)
    at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:378:21)

The error is not specific to fs.readFile, though: unsurprisingly, the same error is produced if we stringify the buffer ourselves, without passing the 'utf8' parameter. The problem is the uncatchable throw.

> node -e "fs = require('fs'); fs.readFile('data.txt', (err, data) => console.log(err, String(data).length))"
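
For comparison, here is a sketch (not from the original report) showing that the same error is catchable when the conversion happens in user code: read the raw buffer without an encoding, then call toString inside our own try…catch.

var fs = require('fs');
fs.readFile('data.txt', function(err, buf) {
  if (err) return console.error(err);
  try {
    // Converting manually throws the same 'toString failed' error,
    // but here it surfaces in our own stack frame and can be caught.
    var str = buf.toString('utf8');
    console.log(str.length);
  } catch (e) {
    console.error('caught:', e.message);
  }
});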

While the throw inside Buffer.prototype.toString itself perhaps makes sense:

Buffer.prototype.toString = function() {
  if (arguments.length === 0) {
    var result = this.utf8Slice(0, this.length);
  } else {
    var result = slowToString.apply(this, arguments);
  }
  if (result === undefined)
    throw new Error('toString failed');
  return result;
};

(node/lib/buffer.js, lines 352 to 361 in f8152df)

It does seem like poor behaviour for a core API to throw an uncatchable error when the user has explicitly attached an errback.

Would a try…catch around the buffer = buffer.toString(context.encoding); call in fs.readFile be appropriate?

function readFileAfterClose(err) {
  var context = this.context;
  var buffer = null;
  var callback = context.callback;

  if (context.err)
    return callback(context.err);

  if (context.size === 0)
    buffer = Buffer.concat(context.buffers, context.pos);
  else if (context.pos < context.size)
    buffer = context.buffer.slice(0, context.pos);
  else
    buffer = context.buffer;

  // maybe ?
  if (context.encoding) {
    try {
      buffer = buffer.toString(context.encoding); // currently this line is not wrapped in a try/catch
    } catch (e) {
      if (!err) err = e; // which error gets priority?
    }
  }

  callback(err, buffer);
}

For reference, the current code:

node/lib/fs.js, lines 377 to 378 in f8152df

  if (context.encoding)
    buffer = buffer.toString(context.encoding);

fs.readFile is understandably hot code; I haven't benchmarked the impact of a try…catch here.
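
For what it's worth, a rough micro-benchmark sketch (the buffer size, fill value, and iteration count are arbitrary choices of mine, not from the issue) that could be used to get a feel for the overhead of wrapping the conversion in a try…catch:

var buf = new Buffer(64 * 1024 * 1024); // 64 MB, well under the failing size
buf.fill(97); // fill with ASCII 'a' so the utf8 conversion does real work

function bare(b) {
  return b.toString('utf8');
}

function wrapped(b) {
  try {
    return b.toString('utf8');
  } catch (e) {
    return null;
  }
}

function time(label, fn) {
  var start = process.hrtime();
  for (var i = 0; i < 20; i++) fn(buf);
  var diff = process.hrtime(start);
  console.log(label, (diff[0] * 1e3 + diff[1] / 1e6).toFixed(1) + 'ms');
}

time('bare:   ', bare);
time('wrapped:', wrapped);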

Given that there appears to be an upper limit on how much data can be stringified safely, perhaps the maximum length could be checked before attempting the conversion, so that a 'this is too big' Error pops out of the callback instead.
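
A minimal sketch of that pre-check inside readFileAfterClose, assuming a hypothetical ASSUMED_MAX_STRING_LENGTH constant; no public constant exposes the limit here (the real ceiling comes from V8's maximum string length, and newer Node.js versions expose it as buffer.constants.MAX_STRING_LENGTH), and the placeholder value is only chosen to sit between the 268 MB success and the 269 MB failure observed above:

  // Hypothetical placeholder, not an existing core constant.
  var ASSUMED_MAX_STRING_LENGTH = 268435440;

  if (context.encoding) {
    // Note: this compares byte length against a character limit, so it is
    // only a rough, over-eager guard for multi-byte utf8 data.
    if (buffer.length > ASSUMED_MAX_STRING_LENGTH) {
      return callback(new Error('File is too large to return as a single ' +
                                'string; use fs.createReadStream() instead'));
    }
    buffer = buffer.toString(context.encoding);
  }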

At a minimum, I suggest a more helpful error message: "our buffering is good, but have you tried streams?"

Metadata

Assignees: no one assigned
Labels: fs (Issues and PRs related to the fs subsystem / file system)
Milestone: no milestone