
memmap reads from directory store #265

Closed

artttt opened this issue Jun 4, 2018 · 15 comments

Comments

@artttt

artttt commented Jun 4, 2018

I've only recently started using zarr, but I'm impressed. Well done.

I want to share an experience and a possible enhancement.
In one of my use cases I use vindex heavily across the whole array. I know this is probably a worst-case scenario for zarr, since it reads many, many chunks to get a small amount of data out of each one.
I was previously using numpy memmap arrays for a similar purpose and that was much faster, so I wondered whether an uncompressed DirectoryStore would read chunks as a memmap. No luck, it still reads full chunks. So I had a go at subclassing DirectoryStore to do this.


import os

import numpy as np
import zarr


class MemMapReadStore(zarr.DirectoryStore):
    """Directory store that memory-maps chunks for reading."""

    def __getitem__(self, key):
        filepath = os.path.join(self.path, key)
        if os.path.isfile(filepath):
            # are there only two kinds of files, .zarray and the chunks?
            if key == '.zarray':
                with open(filepath, 'rb') as f:
                    return f.read()
            else:
                # chunk file: return a read-only memory map instead of bytes
                return np.memmap(filepath, mode='r')
        else:
            raise KeyError(key)
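
Roughly how it gets used, in case that helps; this is only a sketch, the path and indices below are made up, and it assumes a 2-D array created with compressor=None:

store = MemMapReadStore('data/uncompressed.zarr')   # hypothetical path to an uncompressed array
z = zarr.Array(store, read_only=True)
# fancy indexing pulls a handful of elements out of many different chunks
values = z.vindex[[0, 1000, 250000], [5, 17, 42]]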

It's working well for me, but I don't really know the inner workings of zarr, so who knows what I might have broken and which other features it won't play well with. I thought the idea might be a basis for an enhancement, though. Worth sharing at least.

The speed-up depends on access pattern, compression, etc., but for the example I'm testing I'm seeing a 22× speed-up versus a compressed zarr array of the same dimensions and chunking.

It only works for reads, as that was all I needed, and I see that the way you write replaces the whole chunk, so memmap writes are not doable.

@alimanfoo
Member

alimanfoo commented Jun 4, 2018 via email

@artttt
Author

artttt commented Jun 5, 2018

Thanks for the pointer to #40. It looks like the possibility of blosc_getitem-style access into a compressed chunk would be a great feature.

My data is dense in parts and sparse in others, so it has many empty chunks; if I stored the full array it would be too big for my disk. Zarr deals with this nicely. So does a plain numpy memmap on platforms where a sparse file can be bigger than the disk, but on Windows it seems sparse files can't be bigger than the space available. So at a minimum zarr gives me cross-platform compatibility. Also, most of my use cases will benefit from compressed data, so it's nice to have both options in one package, even if I have to decide at creation time which storage is going to be best.

Additionally, I've been trying the LRU cache over the DirectoryStore and seeing some benefits, particularly for the uncompressed zarr (not using the memmapped store). I then made an LRU cache that takes a compressed store and caches the uncompressed data. It gives me about a 3× speed-up when I get cache hits, at the expense of more RAM usage.

Anyway, I feel I'm prematurely optimising as a distraction. I should just get on with using zarr for my work.

import zarr
from zarr.meta import decode_array_metadata, encode_array_metadata
from zarr.codecs import get_codec
from zarr.compat import OrderedDict_move_to_end


class LRUStoreCacheDecoded(zarr.LRUStoreCache):
    """Same as LRUStoreCache, but caches chunks already decoded, for
    faster access at the expense of higher RAM usage.

    NOTE: assumes no filter is used.
    NOTE: not fully tested; will likely break for use beyond reading a zarr.
    """

    def __getitem__(self, key):
        # This is not the right place for this, but it will do for trying the
        # idea out: grab the compressor from the array metadata, then tell the
        # next user in the chain that the data are uncompressed.
        if key == '.zarray':
            value = self._store[key]
            meta = decode_array_metadata(value)
            # set up compressor
            config = meta['compressor']
            if config is None:
                self._compressor = None
            else:
                self._compressor = get_codec(config)
                meta['compressor'] = None
                value = encode_array_metadata(meta)
            return value

        try:
            # first try to obtain the value from the cache
            with self._mutex:
                value = self._values_cache[key]
                # cache hit if no KeyError is raised
                self.hits += 1
                # treat the end as most recently used
                OrderedDict_move_to_end(self._values_cache, key)

        except KeyError:
            # cache miss, retrieve value from the store
            value = self._store[key]
            # added: decode anything read from the store straight away
            if self._compressor:
                value = self._compressor.decode(value)
            with self._mutex:
                self.misses += 1
                # need to check if key is not in the cache, as it may have been
                # cached while we were retrieving the value from the store
                if key not in self._values_cache:
                    self._cache_value(key, value)

        return value
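
Roughly how I've been using it (a sketch only; the path and max_size below are placeholders):

base = zarr.DirectoryStore('data/compressed.zarr')        # hypothetical compressed array on disk
cache = LRUStoreCacheDecoded(base, max_size=2 * 1024**3)  # keep up to ~2 GB of decoded chunks in RAM
z = zarr.Array(cache, read_only=True)
# repeated reads that land in already-cached chunks skip both disk and decompression
values = z.vindex[[0, 1000, 250000], [5, 17, 42]]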

@alimanfoo
Member

alimanfoo commented Jun 5, 2018 via email

@calclavia

calclavia commented Jul 17, 2018

I have a related use case that I believe would benefit from introducing a chunk-based decompressed cache. I use zarr for storing data used to train neural networks. For this use case, you often want to sample random (or almost random) rows from the dataset. If the sampling is mostly localized within a chunk, it would be great if the LRU cache could cache an entire chunk so we can take advantage of spatial locality.

For example, I would like to sample data points [1, 5, 8, 3, 2], and because these all reside in the same compressed chunk (cached by the LRU), only reading the first sample should be slow; the rest should already be cached in memory.
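
Concretely, the access pattern looks something like the sketch below (the path and cache size are made up). The existing LRUStoreCache gets part of the way there, since it caches whole compressed chunks, but each read still pays the decompression cost:

import zarr

base = zarr.DirectoryStore('data/training.zarr')            # hypothetical training dataset
cached = zarr.LRUStoreCache(base, max_size=512 * 1024**2)   # hold up to ~512 MB of chunks in RAM
z = zarr.Array(cached, read_only=True)

# rows 1, 5, 8, 3 and 2 live in the same chunk: the first read pays the storage
# cost, the remaining reads are served from the chunk cache
for i in [1, 5, 8, 3, 2]:
    row = z[i]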

@jakirkham
Member

jakirkham commented Nov 20, 2018

FWIW, I have been thinking about this more lately.

I'm wondering if we shouldn't try to use memory-mapped files in all directory store cases. Admittedly there are some exotic filesystems where we would need to fall back to reading the whole chunk into bytes. That said, in most cases chunks are probably large enough to benefit from memory-mapping.

Not to mention Python's mmap and NumPy's memmap implement the buffer protocol, meaning one can create views onto the data when slicing instead of copying, which is nice for avoiding some intermediate copies. In particular, we currently construct bytes objects, which introduces a copy.
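
As a small illustration (the chunk file name below is just a placeholder), slicing a memory map through a memoryview gives views rather than copies:

import mmap

# f.read() materialises the whole chunk as a separate bytes copy
with open('0.0', 'rb') as f:                     # hypothetical chunk file
    data = f.read()

# memory-mapping exposes the same bytes via the buffer protocol; slicing the
# memoryview creates views without copying the underlying pages
with open('0.0', 'rb') as f:
    buf = memoryview(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ))
header = buf[:16]                                # a view, not a copy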

Admittedly this is less relevant for this issue, but with PR ( zarr-developers/numcodecs#121 ) we should be able to leverage memory-mapped files nicely in our compressors. The result is that we can stream data from disk (optionally through compressors) into NumPy arrays returned to the user. That should be useful for performance and for handling large chunks generally.

@alimanfoo
Member

@jakirkham nice ideas. Would be worth some simple benchmarking to verify using mmap does indeed improve performance. It may sound like it should reduce memory copies, but there could be subtleties that make it not so obvious.

@jakirkham
Member

Good point. What would we consider a fair dataset to use for these benchmarks?

@alimanfoo
Member

alimanfoo commented Nov 20, 2018 via email

@alimanfoo
Member

alimanfoo commented Nov 20, 2018 via email

@meggart
Member

meggart commented Nov 21, 2018

FWIW as long as using mmap doesn't hurt performance, I'd be tempted to use it.

One downside of memory-mapping is that you cannot really control the amount of memory the OS uses for caching (see e.g. numpy/numpy#7732). In a real use case on an HPC system, where there seems to be sufficient memory available, the library might use all of it and the job gets killed by the scheduler for using more memory than reserved. So keeping the possibility to opt out in certain cases might be desirable.

@jakirkham
Member

What if we just added a flag to DirectoryStore to optionally allow memory-mapping that defaults to False? That should be a pretty compact, useful change, which would take a step in this direction.

@alimanfoo
Member

> What if we just added a flag to DirectoryStore to optionally allow memory-mapping that defaults to False? That should be a pretty compact, useful change, which would take a step in this direction.

FWIW I'd be happy with that if the implementation is straightforward. I'd also be happy with adding a separate store class if that's easier/simpler.

@jakirkham
Member

I'm partly suggesting a flag because memory-mapping feels like a user optimization (one that different users may or may not want), as opposed to a fundamentally different way of storing the data (e.g. NestedDirectoryStore). Plus, if it is a flag, several existing stores based on DirectoryStore can provide the same benefit with no real change to them.
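
One possible shape for such a flag, purely as a hypothetical sketch (not necessarily what an actual implementation would look like), is a constructor argument that switches how chunk files are read:

import mmap
import os

import zarr


class MemmapFlagDirectoryStore(zarr.DirectoryStore):
    # hypothetical sketch: a memmap flag chooses between plain reads and memory-mapping
    def __init__(self, path, memmap=False):
        super().__init__(path)
        self.memmap = memmap

    def __getitem__(self, key):
        filepath = os.path.join(self.path, key)
        if not os.path.isfile(filepath):
            raise KeyError(key)
        with open(filepath, 'rb') as f:
            # metadata keys like '.zarray' stay as plain bytes (simplified check)
            if self.memmap and not key.split('/')[-1].startswith('.'):
                return memoryview(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ))
            return f.read()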

@jakirkham
Member

Put together PR ( #377 ), which adds the memmap option so we can further the discussion by looking at an implementation.

@jakirkham
Member

Since PR ( #377 ) was opened, we added PR ( #503 ), which allows users to customize how reading occurs by overriding the staticmethod _fromfile of DirectoryStore. For example:

import mmap
from zarr import DirectoryStore

class MemoryMappedDirectoryStore(DirectoryStore):
    def _fromfile(self, fn):
        # mmap.PROT_READ is POSIX-only; on Windows use access=mmap.ACCESS_READ instead
        with open(fn, "rb") as fh:
            return memoryview(mmap.mmap(fh.fileno(), 0, prot=mmap.PROT_READ))

This store can then be used with Groups and Arrays.
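
For instance (the path below is just a placeholder):

import zarr

store = MemoryMappedDirectoryStore("data/example.zarr")   # hypothetical array on disk
z = zarr.open_array(store=store, mode="r")
part = z[:10]   # chunk reads now go through the memory-mapped _fromfile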

Given a user can do this on their own easily, I've turned this into a doc issue ( #1245 ). Closing this out.
