Memory mapping/nrows #3526
Comments
This part of the error message is reliable: "This is a 32bit process."
Then, either i) wherever you are running sessionInfo() is not the same place you're seeing Here are two examples from sessionInfo() on Windows:
It's the part in brackets at the end that's significant. You need to use the 64bit version of R on your server. Please confirm this solves it. |
Hi Matt, thank you for your message. sessionInfo() indicates Platform: x86_64-w64-mingw32 (64-bit). |
To double check please post the output of You could also try
|
I closed everything and opened it again in another R session. I am now getting .Machine$sizeof.pointer = 8. There is an issue with copy/paste from the server, so I have to post outputs manually. |
So the initial issue related to 32bit was most likely a result of different R processes running in each case, as Matt suggested. For example you might have R setup in |
Yes, I guess there was an issue with different R processes. |
Yes, glad you're using 64bit R ok now. |
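For anyone landing here with the same "This is a 32bit process" message, a minimal sketch of how to check which build the current R process really is. It uses only base R; `r_process_bits` is a hypothetical helper name, not part of data.table. Run it in the same session that produces the fread error, since a different R installation on the same machine may report something else.

```r
# Report the pointer size and architecture of the *current* R process.
# A pointer size of 8 bytes means a 64-bit build; 4 bytes means 32-bit.
# r_process_bits() is a hypothetical helper, named here for illustration only.
r_process_bits <- function() {
  8L * .Machine$sizeof.pointer
}

cat("Pointer size (bytes):", .Machine$sizeof.pointer, "\n")
cat("Process bits:        ", r_process_bits(), "\n")
cat("R.version$arch:      ", R.version$arch, "\n")
# The location of the running R binaries helps spot a second installation.
cat("R bin directory:     ", R.home("bin"), "\n")
```

If the bin directory is not the one you expected, the session is probably being started from a different R installation than the one whose sessionInfo() you looked at.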
Yes, I understand that I will have to select a subset of columns. When I do that with fread, I get the same error. So, my question is, does fread always have to map the entire file to memory? "It's quite unheard of (and considered bad practice) to have a single file so large." --> Totally agree, unfortunately I do not generate this file. I am trying to get the file split before any analysis. |
@parayamelo Start with installing some bash for Windows, it makes life much easier and increases productivity, especially with software like R, and actually most of open source in general. One of those two should be best https://stackoverflow.com/questions/771756/what-is-the-difference-between-cygwin-and-mingw |
Good idea @jangorecki . I will tell the people responsible for the server. I use Linux, so I am more used to bash commands. I am trying to replicate the same error on my Linux machine. |
Yes: virtual memory though and it needs to be a contiguous block. The error message correctly states: "There is probably not enough contiguous virtual memory available." It would be technically possible to memory map in chunks but that would complicate the algorithm considerably. I think our time is better spent elsewhere. I'd expect a 60-80 GB file to memory map ok on your 64GB RAM server, using Windows virtual memory. But 170GB is almost 3x the RAM. There might be some Windows configuration settings you could investigate to increase virtual memory. Then select a small subset of columns, allowing for the final data.table in RAM too of course. Unlikely to work but might be worth a shot. |
Thanks Matt! I will look for alternative solutions. And thanks for taking the time to look into my issue. |
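Since the plan in this thread is to get the file split before analysis and nothing new can be installed on the server, here is a rough base R sketch of one way to do that. It streams the file through a connection in blocks of lines, so it never needs the whole file in memory. `split_file_by_lines`, the `chunk_%05d.csv` naming and the default chunk size are all hypothetical choices for illustration. It assumes one record per line with a single header line; quoted fields containing embedded newlines would be cut incorrectly. It will also be much slower than fread.

```r
# Hypothetical helper: split a delimited text file into pieces of at most
# `lines_per_chunk` data lines, repeating the header line in every piece.
# Assumes one record per line (no embedded newlines inside quoted fields).
split_file_by_lines <- function(path, out_dir, lines_per_chunk = 1e6) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)

  header <- readLines(con, n = 1L)
  out_files <- character(0)
  repeat {
    # Each call continues where the previous one stopped on the open connection.
    chunk <- readLines(con, n = lines_per_chunk)
    if (length(chunk) == 0L) break
    out <- file.path(out_dir, sprintf("chunk_%05d.csv", length(out_files) + 1L))
    writeLines(c(header, chunk), out)
    out_files <- c(out_files, out)
  }
  out_files
}

# Small demonstration on a temporary 10-row file.
demo_in <- tempfile(fileext = ".csv")
writeLines(c("id,value", paste(1:10, 101:110, sep = ",")), demo_in)
pieces <- split_file_by_lines(demo_in, file.path(tempdir(), "chunks_demo"),
                              lines_per_chunk = 4)
print(basename(pieces))
```

Each resulting piece can then be read with fread on its own, provided the piece is small enough to memory map.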
Sorry for hacking the thread. @parayamelo perhaps you want to try disk.frame? http://diskframe.com It will handle large datasets, please let me know if you run into bugs. |
Thank you @xiaodaigh |
Hello,
I am using data.table v1.12.2, and trying to read a file with fread whose size is ~170gb. It is a Windows server machine. It says I have 45.3gb available out of 64gb memory. I get the error
"Opened 168.7GB file ok but could not memory map it. This is a 32bit process. Please upgrade to 64bit."
sessionInfo tells me I am running R v3.5.3 on a 64bit platform. I tried reading fewer rows with the option "nrows = 100000", but I always get the same error, i.e., it is mapping the entire file to memory. It is as if it does not recognize the "nrows" option. I also tried fewer threads (the machine has a maximum of 8), but I get the same result. I also tried reading 1 column and fewer rows, but got the same problem.
Is there a way around this? Due to server policies, I cannot install anything new, so my hands are tied to what is already installed on the server.
Thank you.
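As a possible way to at least look at the first rows with only what base R provides, here is a minimal sketch using read.csv. To my knowledge base R reads sequentially through a connection rather than memory mapping the file, so it should stop after `nrows` rows, though it is far slower than fread. `peek_rows` and its arguments are hypothetical names, and the sketch assumes a comma-separated file with a header line.

```r
# Hypothetical helper: read the first `n` rows of a comma-separated file with
# base R, keeping only the columns named in `keep` (all columns if NULL).
peek_rows <- function(path, n = 100000, keep = NULL) {
  # Read the header plus one data row just to learn the column names.
  col_names <- names(read.csv(path, nrows = 1L, check.names = FALSE,
                              stringsAsFactors = FALSE))
  col_classes <- rep(NA_character_, length(col_names))  # NA = let R guess the type
  if (!is.null(keep)) {
    stopifnot(all(keep %in% col_names))
    col_classes[!(col_names %in% keep)] <- "NULL"        # "NULL" = skip the column
  }
  read.csv(path, nrows = n, colClasses = col_classes,
           check.names = FALSE, stringsAsFactors = FALSE)
}

# Small demonstration on a temporary file.
demo_csv <- tempfile(fileext = ".csv")
writeLines(c("id,name,value", "1,a,10.5", "2,b,20.5", "3,c,30.5"), demo_csv)
first_two <- peek_rows(demo_csv, n = 2, keep = c("id", "value"))
print(first_two)
```

This only helps for inspecting the start of the file or a few columns; it does not make the full 170 GB file fit.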