Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@ayende
Copy link

@ayende ayende commented Aug 24, 2015

The idea is to avoid 4KB allocation for buffer whenever we need to work with large number of files.
Consider the following code:

foreach (var file in System.IO.Directory.GetFiles(dirToCheck, "*.dat"))
{
    using (var fs = new FileStream(file, FileMode.Open))
    {
             // do something to read from the stream
    }
}

The problem is that each instance of FileStream will allocate an independent buffer. If we are reading 10,000 files, that will result in 40MB(!) being allocated, even if we are very careful about allocations in general.

This PR adds a method, SetExternalBuffer(byte[]) which allows the caller to send an external buffer, which they can manage on their own, to the FileStream.

This gives callers the chance to pool the buffer or share it between multiple FileStream instances (obviously with only one stream using the buffer at at time.

Sample usage:

byte[] buffer = new byte[4096];
foreach (var file in System.IO.Directory.GetFiles(dirToCheck, "*.dat"))
{
    using (var fs = new FileStream(file, FileMode.Open))
    {
             fs.SetExternalBuffer(buffer);
             // do something to read from the stream
    }
}

And now we have no more allocations. The cost of calling this code went from 40MB to 4KB.

… won't be allocating its own buffers.

That allow to use common / pool buffer when using large number of FileStream
@dnfclas
Copy link

dnfclas commented Aug 24, 2015

Hi @ayende, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your example, the method is called SetExternalBuffer

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, slip of the keyboard while writing the comment. I think that the name SetBuffer is better, with the externalBuffer name indicating what is going on.

@dnfclas
Copy link

dnfclas commented Aug 24, 2015

@ayende, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, DNFBOT;

@jkotas
Copy link
Member

jkotas commented Aug 24, 2015

@ayende Thank you for suggesting this improvement. Could you please follow the API review process for this one since it is a public API addition?

From Contributing to .NET Core:

DON'T add API additions without filing an issue and discussing with us first. See API Review Process.

@stephentoub
Copy link
Member

A few thoughts:

  • As @jkotas mentioned, this needs to follow the API review process.
  • Such issues are better submitted over in the https://github.com/dotnet/corefx repo. The FileStream in mscorlib is now only used in .NET Core by mscorlib itself as an implementation detail; any code outside of mscorlib using FileStream uses the version from System.IO.FileSystem.dll, which is in corefx.
  • In general we have issues with a variety of our stream around such internal allocations, and I agree we should think through some mechanism to support it. Passing in the buffer might be the right answer, though if we were to do that I think we'd want to add a constructor to take the buffer instead of bufferSize, as otherwise it could conflict with the bufferSize that's supplied, and the fact that FileStream currently lazily allocates the buffer is an implementation detail... if it didn't lazily allocate it, calling SetExternalBuffer after construction wouldn't really help much.
  • Today there are a variety of situations where FileStream will allocate additional buffers; the PR at Improve FileStream WriteAsync buffering and performance corefx#2929 should address this, and such a SetExternalBuffer method wouldn't work particularly well without that.

@ayende
Copy link
Author

ayende commented Aug 25, 2015

I closed the PR and opened: https://github.com/dotnet/corefx/issues/2961

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants