Skip to content

HBASE-22842 Tmp directory should not be deleted when master restart used for user scan snapshot feature #485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 16, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@
import org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.replication.ReplicationUtils;
import org.apache.hadoop.hbase.security.access.SnapshotScannerHDFSAclHelper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.FSTableDescriptors;
import org.apache.hadoop.hbase.util.FSUtils;
Expand Down Expand Up @@ -318,20 +319,26 @@ void checkTempDir(final Path tmpdir, final Configuration c, final FileSystem fs)
for (Path tableDir: FSUtils.getTableDirs(fs, tmpdir)) {
HFileArchiver.archiveRegions(c, fs, this.rootdir, tableDir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The archiveRegions will archive all the hfiles and if all successfull... Then it will remains some dirs such as /.tmp/data/ns/table ? If we've deleted a namespace or table , the ns or table dir under .tmp will be also removed (If no, the we will remain many useless dir, that's not good) ?
Another concern is : if hfiles archiving failure happen and enable the snapshot acl sync, will we remain much hfiles under the .tmp dir ? will be storage consuming if so... maybe we need a UT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tmp namespace or table dirs are deleted if namespace or table are dropped. There are UTs in TestSnapshotScannerHDFSAclController to check it.
Add a warn log to record unarchived regions, if this problem happens in practice, we can fix this? And add UT in TestSnapshotScannerHDFSAclController.

FSUtils.getRegionDirs(fs, tableDir));
if (!FSUtils.getRegionDirs(fs, tableDir).isEmpty()) {
LOG.warn("Found regions in tmp dir after archiving table regions, {}", tableDir);
}
}
if (!fs.delete(tmpdir, true)) {
// if acl sync to hdfs is enabled, then skip delete tmp dir because ACLs are set
if (!SnapshotScannerHDFSAclHelper.isAclSyncToHdfsEnabled(c) && !fs.delete(tmpdir, true)) {
throw new IOException("Unable to clean the temp directory: " + tmpdir);
}
}

// Create the temp directory
if (isSecurityEnabled) {
if (!fs.mkdirs(tmpdir, secureRootSubDirPerms)) {
throw new IOException("HBase temp directory '" + tmpdir + "' creation failure.");
}
} else {
if (!fs.mkdirs(tmpdir)) {
throw new IOException("HBase temp directory '" + tmpdir + "' creation failure.");
if (!fs.exists(tmpdir)) {
if (isSecurityEnabled) {
if (!fs.mkdirs(tmpdir, secureRootSubDirPerms)) {
throw new IOException("HBase temp directory '" + tmpdir + "' creation failure.");
}
} else {
if (!fs.mkdirs(tmpdir)) {
throw new IOException("HBase temp directory '" + tmpdir + "' creation failure.");
}
}
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@
import org.apache.hadoop.hbase.testclassification.LargeTests;
import org.apache.hadoop.hbase.testclassification.SecurityTests;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.FSUtils;
import org.apache.hadoop.hbase.util.HFileArchiveUtil;
import org.junit.AfterClass;
import org.junit.BeforeClass;
Expand Down Expand Up @@ -883,9 +884,24 @@ public void testRestartMaster() throws Exception {
User grantUser = User.createUserForTesting(conf, grantUserName, new String[] {});
String namespace = name.getMethodName();
TableName table = TableName.valueOf(namespace, "t1");
TableName table2 = TableName.valueOf(namespace, "t2");
String snapshot = namespace + "t1";
admin.createNamespace(NamespaceDescriptor.create(namespace).build());

// create table2
TestHDFSAclHelper.createTableAndPut(TEST_UTIL, table2);
// make some region files in tmp dir and check if master archive these region correctly
Path tmpTableDir = helper.getPathHelper().getTmpTableDir(table2);
// make a empty region dir, this is an error region
fs.mkdirs(new Path(tmpTableDir, "1"));
// copy regions from data dir, this is a valid region
for (Path regionDir : FSUtils.getRegionDirs(fs,
helper.getPathHelper().getDataTableDir(table2))) {
FSUtils.copyFilesParallel(fs, regionDir, fs,
new Path(tmpTableDir, regionDir.getName() + "abc"), conf, 1);
}
assertEquals(4, fs.listStatus(tmpTableDir).length);

// grant N(R)
SecureTestUtil.grantOnNamespace(TEST_UTIL, grantUserName, namespace, READ);
// restart cluster and tmp directory will not be deleted
Expand All @@ -894,15 +910,16 @@ public void testRestartMaster() throws Exception {
TEST_UTIL.waitUntilNoRegionsInTransition();

Path tmpNsDir = helper.getPathHelper().getTmpNsDir(namespace);
assertFalse(fs.exists(tmpNsDir));
assertTrue(fs.exists(tmpNsDir));
// check all regions in tmp table2 dir are archived
assertEquals(0, fs.listStatus(tmpTableDir).length);

// create table2 and snapshot
// create table1 and snapshot
TestHDFSAclHelper.createTableAndPut(TEST_UTIL, table);
admin = TEST_UTIL.getAdmin();
aclTable = TEST_UTIL.getConnection().getTable(PermissionStorage.ACL_TABLE_NAME);
admin.snapshot(snapshot, table);
// TODO fix it in another patch
TestHDFSAclHelper.canUserScanSnapshot(TEST_UTIL, grantUser, snapshot, -1);
TestHDFSAclHelper.canUserScanSnapshot(TEST_UTIL, grantUser, snapshot, 6);
}

private void checkUserAclEntry(List<Path> paths, String user, boolean requireAccessAcl,
Expand Down