An IO error occurs when doing checkpoint (db_paths, cf_paths not supported for checkpoint+backups) #8647
Comments
The root cause is that the source path of an SST file is built as just ‘dbname_/sstname’:
This approach cannot produce the correct path of an SST/blob file when db_paths is specified.
Hi, @ajkr. I'm trying to fix it, please assign it to me. Thank you :)
The same issue may also apply to A few things to note in checkpoint_impl:
I think that introducing a new API may be a good choice. The new API retrieves all the files in the database, just like GetLiveFiles. Unlike GetLiveFiles, the new API returns the files in a map: the key is a directory path, and the value is the list of all files in that path.
I think that all the files should be copied/linked into the checkpoint dir directly, regardless of the checkpoint dir's hierarchy. The first reason is simplicity. Besides, the options file generated by checkpoint cannot currently be rewritten.
In fact, the checkpoint dir must not exist in the current implementation, so we don't need to think about these cases.
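The map-shaped result proposed in this comment could be sketched roughly as follows. This is only an illustration in standard C++: `GroupFilesByDir` is a hypothetical helper, not a RocksDB API, and it assumes the caller already has absolute file paths using '/' as the separator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of RocksDB): groups absolute file paths by
// their parent directory. Key: directory, value: file names in that directory.
std::map<std::string, std::vector<std::string>> GroupFilesByDir(
    const std::vector<std::string>& absolute_files) {
  std::map<std::string, std::vector<std::string>> files_by_dir;
  for (const std::string& file : absolute_files) {
    const std::size_t slash = file.find_last_of('/');
    std::string dir;
    std::string name;
    if (slash == std::string::npos) {
      // No separator: treat the file as living in the current directory.
      dir = ".";
      name = file;
    } else {
      // A file directly under the root keeps "/" as its directory.
      dir = (slash == 0) ? "/" : file.substr(0, slash);
      name = file.substr(slash + 1);
    }
    files_by_dir[dir].push_back(name);
  }
  return files_by_dir;
}
```

A checkpoint routine could then iterate over the map and copy/link every file of every directory into the single checkpoint dir, which matches the flat layout argued for above.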
I feel it's better to have a
My concern is that if we allow overriding those path options, a user may set them to the same directory. We may assume he/she knows the consequence of this behavior. But in checkpoint_impl, files are only copied/linked from a path specified in the original db's options to the same options specified in the checkpoint's options. I believe the Also, a minor issue: checkpointing writes to checkpoint_path + ".tmp" and does not check whether a file/directory already exists at checkpoint_path + ".tmp". We can change the suffix to a random non-existent location rather than a fixed one. These are just my findings from when I tried to fix this issue. Hope they can help you a little.
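The suggestion about the staging directory could look something like the sketch below. It is not how checkpoint_impl works today; `PickStagingDir` and the ".tmp.<number>" naming are assumptions made for illustration, using only std::filesystem (C++17).

```cpp
#include <cassert>
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of RocksDB): creates and returns a staging
// directory named checkpoint_path + ".tmp.<random number>".
// Returns an empty path if no fresh directory could be created.
fs::path PickStagingDir(const fs::path& checkpoint_path, int max_attempts = 16) {
  std::random_device rd;
  std::mt19937_64 gen(rd());
  for (int i = 0; i < max_attempts; ++i) {
    fs::path candidate = checkpoint_path;
    candidate += ".tmp." + std::to_string(gen());
    std::error_code ec;
    // create_directory returns true only when it actually created the
    // directory, so an existing file or directory is never reused.
    if (fs::create_directory(candidate, ec)) {
      return candidate;
    }
  }
  return {};
}
```

Creating the directory as part of the existence check avoids a separate "exists?" test followed by a create, which would leave a window for another process to take the name.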
Hi, autopear. Thanks for your opinion. I agree with you that it's better to have a CheckpointOptions to specify the hierarchy of the checkpoint dir. I suggest creating a new issue that introduces CheckpointOptions after #8648 is addressed.
@autopear @zaorangyang I do not believe there is a reason to try to specify the directory paths via CheckpointOptions as you suggest. First, I believe this can get outrageously complicated to support. Each ColumnFamily can say where its files are for each level, meaning you would have to have a map of Paths to attempt to cover all of the cases.
#8968 would fix this to fail at checkpoint (backup) creation time and declare this as unsupported in the API comments. I think the best path to basic support for db_paths/cf_paths would be to allow files to be in any configured path, with search starting with the path id in the manifest. This way, you would at least have a functioning DB in the checkpoint directory, or if backup is restored to a single directory. But there are many places where we need to inject this search of cf_paths, requiring more than a drop-in replacement for
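The "search starting with the path id in the manifest" idea might be sketched as below. This is a standalone illustration, not RocksDB code: `FindFileInPaths` is a hypothetical helper, and the wrap-around order for the remaining paths is an assumption, since the comment only says where the search starts.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of RocksDB): looks for file_name in the
// configured paths, trying the path id recorded in the manifest first and
// then the remaining paths in wrap-around order (assumed order).
std::optional<fs::path> FindFileInPaths(
    const std::vector<fs::path>& configured_paths,
    std::size_t manifest_path_id, const std::string& file_name) {
  const std::size_t n = configured_paths.size();
  if (n == 0) {
    return std::nullopt;
  }
  for (std::size_t i = 0; i < n; ++i) {
    // An out-of-range path id is folded into range by the modulo.
    const std::size_t idx = (manifest_path_id + i) % n;
    const fs::path candidate = configured_paths[idx] / file_name;
    std::error_code ec;
    if (fs::exists(candidate, ec)) {
      return candidate;
    }
  }
  return std::nullopt;
}
```

With a lookup like this, a checkpoint that flattened all files into one directory would still open, because a file recorded under path id 1 would be found under whichever configured path actually holds it.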
Expected behavior
Checkpoint should execute correctly when db_paths is specified
Actual behavior
It throws an IO error: "IO error: No such file or directory: while link file to checkpont_dir.tmp/000009.sst: main/000009.sst: No such file or directory"
Steps to reproduce the behavior
Take a look at the following example: