Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate disk space with posix_fallocate before mmapping. #2325

Merged

Conversation

de-vri-es
Copy link
Contributor

If using mmap to write to a file without allocating disk space ahead of time, the kernel will allocate disk space as new pages are mapped. This can fail, for example when the hard disk is full. If it does fail, this causes the program to be killed by a SIGBUS signal.

See also #637.

The method currently used by pcl is to seek to where the end of the file should be and write a byte. This does update the file size, but it creates a sparse file on most filesystems. There is no disk space allocated for all the seeked-over bytes, so writing to the memory mapped file still causes disk space allocation, which can fail.

This PR adds calls to posix_fallocate [1] before memory mapping the file to ensure that the required disk space is allocated, or the function fails cleanly.

There should be no performance penalty from this change, since the disk allocation has to be done either way. In fact, allocating it all in one go may well be faster.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_fallocate.html

@taketwo taketwo merged commit 3ceaf43 into PointCloudLibrary:master May 30, 2018
@taketwo
Copy link
Member

taketwo commented May 30, 2018

Thanks for contributing!

@de-vri-es
Copy link
Contributor Author

Thanks for the quick review and merge :)

@matt-deboer
Copy link

matt-deboer commented Jun 5, 2018

FYI, This commit breaks my build on OSX 10.13.3 with the following error:

io/include/pcl/io/impl/pcd_io.hpp:709:9: error: no member named 'posix_fallocate' in the global namespace

I'm temporarily working around it by adding this patch into io/include/pcl/io/lzf.h:

#ifdef __APPLE__
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
static int posix_fallocate(int fd, off_t offset, off_t len)
{
    off_t c_test;
    int ret;
    if (!__builtin_saddll_overflow(offset, len, &c_test)) {
        fstore_t store = {F_ALLOCATECONTIG, F_PEOFPOSMODE, 0, offset + len};
        // Try to get a continous chunk of disk space
        fcntl(fd, F_PREALLOCATE, &store);
        if (ret < 0) {
            // OK, perhaps we are too fragmented, allocate non-continuous
            store.fst_flags = F_ALLOCATEALL;
            ret = fcntl(fd, F_PREALLOCATE, &store);
            if (ret < 0) {
                return ret;
            }
        }
        ret = ftruncate(fd, offset + len);
    } else {
        // offset+len would overflow.
        ret = -1;
    }
    return ret;
}
#endif

based on this KDE discussion: https://git.reviewboard.kde.org/r/129267/diff/2

@taketwo
Copy link
Member

taketwo commented Jun 5, 2018

Thanks for letting us know.

@SergioRAgostinho what's your opinion as a Mac user? Should we include this, or rather ifdef the calls (i.e. no preallocation)?

@de-vri-es
Copy link
Contributor Author

Without pre-allocation it would at the very least be necessary to make sure the file size is large enough for the mmap to succeed as was done previously with the seek and write.

Either way, it makes sense to handle the file resizing / disk space allocation in a single function since it is used 4 times. Might as well call it posix_fallocate then.

Alternatively it can be moved to a wrapper called pcl_fallocate or similar with different implementations for OSX and Linux/BSD. We could even drop the offset argument then since we don't need it, and get rid of the intrinsic __builtin_saddll_overflow in the OSX implementation.

@taketwo
Copy link
Member

taketwo commented Jun 5, 2018

I like the wrapper function approach. We already have pcl_open, pcl_close, pcl_lseek to abstract platform specifics (though these are macros). They, however, are defined multiple times in different files. Ideally we would consolidate them (and the new pcl_fallocate) in a single file.

And getting rid of the offset argument also sounds good to me.

@de-vri-es
Copy link
Contributor Author

I can probably find the time soon to do this, but someone else should also feel free if they can beat me to it.

@taketwo
Copy link
Member

taketwo commented Jun 6, 2018

Thanks, would be great.

@de-vri-es
Copy link
Contributor Author

Initial PR in #2341.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants