Commit 64d6b28

fdmanana authored and kdave committed
btrfs: remove unnecessary check_parent_dirs_for_sync()
Whenever we fsync an inode, if it is a directory, a regular file that was
created in the current transaction or has last_unlink_trans set to the
generation of the current transaction, we check if any of its ancestor
inodes (and the inode itself if it is a directory) can not be logged and
need a fallback to a full transaction commit - if so, we return with a
value of 1 in order to fall back to a transaction commit.

However we often do not need to fall back to a transaction commit because:

1) The ancestor inode is not an immediate parent, and therefore there is
   no explicit request to log it, and logging it is needed neither to
   guarantee the consistency of the inode originally asked to be logged
   (fsynced) nor that of its immediate parent;

2) The ancestor inode was already logged before, in which case any link,
   unlink or rename operation updates the log as needed.

So for these two cases we can avoid an unnecessary transaction commit.
Therefore remove check_parent_dirs_for_sync() and add a check at the top
of btrfs_log_inode() to make us fall back immediately to a transaction
commit when we are logging a directory inode that can not be logged and
needs a full transaction commit. All we need to protect is the case where
after renaming a file someone fsyncs only the old directory, which would
result in losing the renamed file after a log replay.

This patch is part of a patchset comprised of the following patches:

  btrfs: remove unnecessary directory inode item update when deleting dir entry
  btrfs: stop setting nbytes when filling inode item for logging
  btrfs: avoid logging new ancestor inodes when logging new inode
  btrfs: skip logging directories already logged when logging all parents
  btrfs: skip logging inodes already logged when logging new entries
  btrfs: remove unnecessary check_parent_dirs_for_sync()
  btrfs: make concurrent fsyncs wait less when waiting for a transaction commit

Performance results, after applying all patches, are mentioned in the
change log of the last patch.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
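
The case the commit message singles out (rename a file, then fsync only the old directory) can be driven from user space. The following is a minimal sketch, assuming a Linux system; the helper names `join_path`, `fsync_path` and `rename_then_fsync_old_dir` are hypothetical and not part of the patch. It only issues the system calls in the right order. Observing an actual loss would require a crash or power failure followed by a log replay, which this sketch does not simulate.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build "dir/name" into buf; returns 0 on success, -1 if it does not fit. */
static int join_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n = snprintf(buf, len, "%s/%s", dir, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Open a file or directory read-only, fsync it, and close it. */
static int fsync_path(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = fsync(fd);
	close(fd);
	return ret;
}

/*
 * Hypothetical helper: create base/A/foo, persist it, move it to
 * base/B/foo, then fsync only the old directory A.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
int rename_then_fsync_old_dir(const char *base)
{
	char a[PATH_MAX], b[PATH_MAX], src[PATH_MAX], dst[PATH_MAX];
	int fd;

	assert(base != NULL);
	if (join_path(a, sizeof(a), base, "A") ||
	    join_path(b, sizeof(b), base, "B") ||
	    join_path(src, sizeof(src), a, "foo") ||
	    join_path(dst, sizeof(dst), b, "foo"))
		return -1;
	if (mkdir(a, 0755) || mkdir(b, 0755))
		return -1;

	fd = open(src, O_CREAT | O_WRONLY, 0644);
	if (fd < 0)
		return -1;
	close(fd);

	/* Persist A/foo and its directory entry before the rename. */
	if (fsync_path(src) || fsync_path(a))
		return -1;

	if (rename(src, dst))
		return -1;

	/* Only the old directory is fsynced; B and foo are not. */
	return fsync_path(a);
}
```

Going by the commit message, the final fsync of A is the one case that still has to fall back to a full transaction commit on btrfs, so that the file is not lost after a log replay.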
1 parent 0e44cb3 commit 64d6b28

fs/btrfs/tree-log.c

Lines changed: 15 additions & 106 deletions
@@ -5265,6 +5265,21 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
 		mutex_lock(&inode->log_mutex);
 	}
 
+	/*
+	 * This is for cases where logging a directory could result in losing a
+	 * a file after replaying the log. For example, if we move a file from a
+	 * directory A to a directory B, then fsync directory A, we have no way
+	 * to known the file was moved from A to B, so logging just A would
+	 * result in losing the file after a log replay.
+	 */
+	if (S_ISDIR(inode->vfs_inode.i_mode) &&
+	    inode_only == LOG_INODE_ALL &&
+	    inode->last_unlink_trans >= trans->transid) {
+		btrfs_set_log_full_commit(trans);
+		err = 1;
+		goto out_unlock;
+	}
+
 	/*
 	 * a brute force approach to making sure we get the most uptodate
 	 * copies of everything.
@@ -5428,99 +5443,6 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
 	return err;
 }
 
-/*
- * Check if we must fallback to a transaction commit when logging an inode.
- * This must be called after logging the inode and is used only in the context
- * when fsyncing an inode requires the need to log some other inode - in which
- * case we can't lock the i_mutex of each other inode we need to log as that
- * can lead to deadlocks with concurrent fsync against other inodes (as we can
- * log inodes up or down in the hierarchy) or rename operations for example. So
- * we take the log_mutex of the inode after we have logged it and then check for
- * its last_unlink_trans value - this is safe because any task setting
- * last_unlink_trans must take the log_mutex and it must do this before it does
- * the actual unlink operation, so if we do this check before a concurrent task
- * sets last_unlink_trans it means we've logged a consistent version/state of
- * all the inode items, otherwise we are not sure and must do a transaction
- * commit (the concurrent task might have only updated last_unlink_trans before
- * we logged the inode or it might have also done the unlink).
- */
-static bool btrfs_must_commit_transaction(struct btrfs_trans_handle *trans,
-					  struct btrfs_inode *inode)
-{
-	bool ret = false;
-
-	mutex_lock(&inode->log_mutex);
-	if (inode->last_unlink_trans >= trans->transid) {
-		/*
-		 * Make sure any commits to the log are forced to be full
-		 * commits.
-		 */
-		btrfs_set_log_full_commit(trans);
-		ret = true;
-	}
-	mutex_unlock(&inode->log_mutex);
-
-	return ret;
-}
-
-/*
- * follow the dentry parent pointers up the chain and see if any
- * of the directories in it require a full commit before they can
- * be logged. Returns zero if nothing special needs to be done or 1 if
- * a full commit is required.
- */
-static noinline int check_parent_dirs_for_sync(struct btrfs_trans_handle *trans,
-					       struct btrfs_inode *inode,
-					       struct dentry *parent,
-					       struct super_block *sb)
-{
-	int ret = 0;
-	struct dentry *old_parent = NULL;
-
-	/*
-	 * for regular files, if its inode is already on disk, we don't
-	 * have to worry about the parents at all. This is because
-	 * we can use the last_unlink_trans field to record renames
-	 * and other fun in this file.
-	 */
-	if (S_ISREG(inode->vfs_inode.i_mode) &&
-	    inode->generation < trans->transid &&
-	    inode->last_unlink_trans < trans->transid)
-		goto out;
-
-	if (!S_ISDIR(inode->vfs_inode.i_mode)) {
-		if (!parent || d_really_is_negative(parent) || sb != parent->d_sb)
-			goto out;
-		inode = BTRFS_I(d_inode(parent));
-	}
-
-	while (1) {
-		if (btrfs_must_commit_transaction(trans, inode)) {
-			ret = 1;
-			break;
-		}
-
-		if (!parent || d_really_is_negative(parent) || sb != parent->d_sb)
-			break;
-
-		if (IS_ROOT(parent)) {
-			inode = BTRFS_I(d_inode(parent));
-			if (btrfs_must_commit_transaction(trans, inode))
-				ret = 1;
-			break;
-		}
-
-		parent = dget_parent(parent);
-		dput(old_parent);
-		old_parent = parent;
-		inode = BTRFS_I(d_inode(parent));
-
-	}
-	dput(old_parent);
-out:
-	return ret;
-}
-
 /*
  * Check if we need to log an inode. This is used in contexts where while
  * logging an inode we need to log another inode (either that it exists or in
@@ -5686,9 +5608,6 @@ static int log_new_dir_dentries(struct btrfs_trans_handle *trans,
 				log_mode = LOG_INODE_ALL;
 			ret = btrfs_log_inode(trans, root, BTRFS_I(di_inode),
 					      log_mode, ctx);
-			if (!ret &&
-			    btrfs_must_commit_transaction(trans, BTRFS_I(di_inode)))
-				ret = 1;
 			btrfs_add_delayed_iput(di_inode);
 			if (ret)
 				goto next_dir_inode;
@@ -5835,9 +5754,6 @@ static int btrfs_log_all_parents(struct btrfs_trans_handle *trans,
 				ctx->log_new_dentries = false;
 			ret = btrfs_log_inode(trans, root, BTRFS_I(dir_inode),
 					      LOG_INODE_ALL, ctx);
-			if (!ret &&
-			    btrfs_must_commit_transaction(trans, BTRFS_I(dir_inode)))
-				ret = 1;
 			if (!ret && ctx && ctx->log_new_dentries)
 				ret = log_new_dir_dentries(trans, root,
 						   BTRFS_I(dir_inode), ctx);
@@ -6053,12 +5969,9 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct super_block *sb;
 	int ret = 0;
 	bool log_dentries = false;
 
-	sb = inode->vfs_inode.i_sb;
-
 	if (btrfs_test_opt(fs_info, NOTREELOG)) {
 		ret = 1;
 		goto end_no_trans;
@@ -6069,10 +5982,6 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
 		goto end_no_trans;
 	}
 
-	ret = check_parent_dirs_for_sync(trans, inode, parent, sb);
-	if (ret)
-		goto end_no_trans;
-
 	/*
 	 * Skip already logged inodes or inodes corresponding to tmpfiles
 	 * (since logging them is pointless, a link count of 0 means they
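
To make the behavioral difference concrete, here is a simplified user-space model of the two decisions, not kernel code. It assumes that every non-directory is a regular file and that the dentry chain can be represented as a plain parent pointer. All names (`model_inode`, `old_needs_full_commit`, `new_needs_full_commit`, `MODEL_LOG_INODE_*`) are hypothetical. The old function mirrors the ancestor walk of the removed check_parent_dirs_for_sync(); the new one mirrors the condition added at the top of btrfs_log_inode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's logging modes (names are illustrative). */
enum model_log_mode { MODEL_LOG_INODE_ALL, MODEL_LOG_INODE_EXISTS };

/* Only the fields that the two checks look at. */
struct model_inode {
	bool is_dir;
	uint64_t generation;
	uint64_t last_unlink_trans;
	const struct model_inode *parent; /* NULL at the top of the tree */
};

/*
 * Old behavior (removed by the patch): starting at the inode itself if
 * it is a directory, or at its parent otherwise, walk up every ancestor
 * and demand a full commit if any has last_unlink_trans >= transid.
 */
bool old_needs_full_commit(const struct model_inode *inode, uint64_t transid)
{
	assert(inode != NULL);
	if (!inode->is_dir) {
		/* Files already on disk with no recent unlink/rename skip the walk. */
		if (inode->generation < transid &&
		    inode->last_unlink_trans < transid)
			return false;
		inode = inode->parent;
	}
	for (; inode != NULL; inode = inode->parent)
		if (inode->last_unlink_trans >= transid)
			return true;
	return false;
}

/*
 * New behavior (added by the patch): only the inode being logged is
 * examined, and only when it is a directory logged in full.
 */
bool new_needs_full_commit(const struct model_inode *inode,
			   enum model_log_mode mode, uint64_t transid)
{
	assert(inode != NULL);
	return inode->is_dir && mode == MODEL_LOG_INODE_ALL &&
	       inode->last_unlink_trans >= transid;
}
```

In this model, a file created in the current transaction whose grandparent directory had a recent unlink or rename forces a full commit under the old walk, while the new check on the file and on its immediate parent does not. This corresponds to case 1) in the commit message; it says nothing about which inodes the real code goes on to log.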
