Skip to content

ensure atomic writes when saving a file #200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 9, 2019
Merged

Conversation

basnijholt
Copy link
Member

@basnijholt basnijholt commented Jun 20, 2019

Right now, if a program crashes in the middle of saving, you lose all your data. This ensures that the old file is first moved, then the new file is saved, and only then the old file is removed.

This is not a hypothetical scenario but happens every day for some people that I work with ATM.

@basnijholt basnijholt force-pushed the safe_saving branch 2 times, most recently from fc9ba2f to 187e064 Compare June 20, 2019 22:09
@basnijholt basnijholt changed the base branch from master to stable-0.8 June 20, 2019 22:10
@basnijholt basnijholt force-pushed the safe_saving branch 2 times, most recently from 2b06b62 to 82a300b Compare June 20, 2019 22:11
@basnijholt
Copy link
Member Author

basnijholt commented Jun 20, 2019

Maybe adding a "." in front of the fname isn't the best name to choose.

@basnijholt basnijholt changed the title make a backup of the data file before saving WIP: make a backup of the data file before saving Jun 20, 2019
@akhmerov
Copy link
Contributor

This is not a hypothetical scenario but happens every day for some people that I work with ATM.

Maybe that is also worth investigating as a separate issue? Saving usually shouldn't cause a problem if adaptive is stable.

@akhmerov
Copy link
Contributor

The implementation is rather fragile and potentially hitting bad corner cases (overwriting a different file, times being messed up, etc); I think there should be a more systematic way to do an atomic write.

@basnijholt
Copy link
Member Author

Maybe that is also worth investigating as a separate issue? Saving usually shouldn't cause a problem if adaptive is stable.

The reason for the crash is not Adaptive, but merely low priority nodes that can be evicted at any time, that suddenly stop.

@basnijholt
Copy link
Member Author

The implementation is rather fragile and potentially hitting bad corner cases (overwriting a different file, times being messed up, etc); I think there should be a more systematic way to do an atomic write.

Do you have a suggestion?

@akhmerov
Copy link
Contributor

https://github.com/untitaker/python-atomicwrites seems like an option or any other alternatives.

Or we could use all the relevant tricks from it.

@akhmerov
Copy link
Contributor

I also have to bring up another approach: we provide a way to get an in-memory data in adaptive and the user uses atomicwrites on their own. Separation of concerns and all.

@basnijholt basnijholt force-pushed the safe_saving branch 3 times, most recently from 9a45665 to 015c436 Compare June 20, 2019 23:20
@basnijholt
Copy link
Member Author

I also have to bring up another approach: we provide a way to get an in-memory data in adaptive and the user uses atomicwrites on their own. Separation of concerns and all.

There is already _get_data and _set_data that every learner implements. The problem is that you can't use the periodic_saver like this

I do not mind to add the dependency, it's used by 19723 other packages and it is tiny. See the commit I've made. Also, I don't see a reason why people wouldn't want a 'safe' save.

@basnijholt basnijholt force-pushed the safe_saving branch 2 times, most recently from 2115f48 to 0cc2fc7 Compare June 20, 2019 23:25
@basnijholt basnijholt changed the title WIP: make a backup of the data file before saving ensure atomic writes when saving a file Jun 21, 2019
Right now, if a program crashes in the middle of saving, you lose
all your data.

Right now, if a program crashes in the middle of saving, you lose
all your data. This ensures that the old file is first moved, then
the new file is saved, and only then the old file is removed.
@jbweston jbweston merged commit c821c49 into stable-0.8 Jul 9, 2019
@basnijholt basnijholt deleted the safe_saving branch July 29, 2019 10:15
@basnijholt basnijholt mentioned this pull request Aug 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants