Issue with reading files through coffea-casa xcache for large-scale analysis #420
Open
Description
A user reported an issue when running their analysis on coffea.casa over a large number of data samples:
OSError: File did not vector_read properly: [ERROR] Operation expired
The reproducer is https://github.com/sihyunjeon/test_coffea-casa
What it does:
- Reads the sample ROOT file names from a JSON file;
- Substitutes "xcache" for the full xrootd link;
- Runs the "Processor", which just takes AK8 jets and dumps their mass.
When it runs on all files listed in the JSON (~500 files), it fails with the error message shown at the very end of the ipynb file (the vector read error).
If you uncomment "# break # FIXME" in In[4], it runs on only 3 files, and the notebook then completes without issues.
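For readers who do not want to open the notebook, the steps above can be sketched roughly as follows. This is a minimal illustration, not the reproducer's actual code: the JSON layout (sample name mapped to a list of URLs), the `root://xcache/` prefix, the example host, and the helper names `to_xcache` and `build_fileset` are all assumptions. The `max_files` argument mimics the "# break" shortcut by capping files per sample; whether the notebook caps per sample or in total is not stated here.

```python
import json

# Assumption: the cache is reachable under this prefix inside coffea-casa.
XCACHE_PREFIX = "root://xcache/"


def to_xcache(url, prefix=XCACHE_PREFIX):
    """Swap the 'root://host[:port]/' part of an XRootD URL for the cache prefix."""
    scheme = "root://"
    if not url.startswith(scheme):
        return url  # not an XRootD URL (e.g. a local path): leave it untouched
    # Drop the host; what remains is the file path (it keeps its leading '/').
    _, _, path = url[len(scheme):].partition("/")
    return prefix + path


def build_fileset(samples, max_files=None):
    """Map each sample name to its xcache URLs, optionally capped per sample."""
    fileset = {}
    for name, urls in samples.items():
        selected = urls if max_files is None else urls[:max_files]
        fileset[name] = [to_xcache(u) for u in selected]
    return fileset


# Hypothetical stand-in for the reproducer's JSON file.
samples = json.loads(
    '{"ttbar": ["root://some.redirector.example//store/mc/a.root",'
    ' "root://some.redirector.example//store/mc/b.root"]}'
)
fileset = build_fileset(samples, max_files=1)
print(fileset)
```

Calling `build_fileset(samples)` without `max_files` corresponds to the full ~500-file run that triggers the error.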
The problem seems to be that xcache has limited capacity (understandably), which causes trouble for a full-scale analysis.
If I run on N different physics processes and they are cached, the (N+1)-th physics process crashes with the vector read problem
(and I think this is mainly due to a connection issue). If I try to make the (N+1)-th physics process work by processing only
that one, then some other physics process that previously worked stops working and hits the vector read problem,
and so on...