@@ -18,10 +18,8 @@
 package org.apache.hadoop.hbase.master.snapshot;
 
 import java.io.IOException;
-import java.util.HashSet;
 import java.util.List;
-import java.util.Set;
-
+import java.util.stream.Collectors;
 import org.apache.hadoop.hbase.ServerName;
 import org.apache.hadoop.hbase.client.RegionInfo;
 import org.apache.hadoop.hbase.client.RegionReplicaUtil;
@@ -35,7 +33,7 @@
 import org.apache.yetus.audience.InterfaceAudience;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
-import org.apache.hbase.thirdparty.com.google.common.collect.Lists;
+
 import org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription;
 
 /**
@@ -70,18 +68,10 @@ public EnabledTableSnapshotHandler prepare() throws Exception {
    */
   @Override
   protected void snapshotRegions(List<Pair<RegionInfo, ServerName>> regions) throws IOException {
-    Set<String> regionServers = new HashSet<>(regions.size());
-    for (Pair<RegionInfo, ServerName> region : regions) {
-      if (region != null && region.getFirst() != null && region.getSecond() != null) {
-        RegionInfo hri = region.getFirst();
-        if (hri.isOffline() && (hri.isSplit() || hri.isSplitParent())) continue;
-        regionServers.add(region.getSecond().toString());
-      }
-    }
-
     // start the snapshot on the RS
     Procedure proc = coordinator.startProcedure(this.monitor, this.snapshot.getName(),
-      this.snapshot.toByteArray(), Lists.newArrayList(regionServers));
+      this.snapshot.toByteArray(), master.getServerManager().getOnlineServersList()
+        .stream().map(ServerName::toString).collect(Collectors.toList()));
Contributor

Do we really want to specify RSes potentially not having anything to do with the snapshot as an acquiring member?

Contributor Author

The problem is that every RS adds itself to ZooKeeper as acquired/reached, but the Procedure only registers the RSes that host the table's regions as barrier members. So once all barrier members have reached the barrier, the snapshot root node is deleted, while the non-barrier RSes may still try to write under it and throw an exception.
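
To make the race described above concrete, here is a minimal sketch in plain Java. It is a toy model and not HBase or ZooKeeper code: the class `BarrierSketch`, its methods, and the server names `rs1` to `rs3` are all hypothetical, and the "root node" is just a boolean flag. It assumes that every online server reports in, and that the root node is removed as soon as every expected member has reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a barrier under a coordination node (hypothetical, not HBase code). */
public class BarrierSketch {
  // Members the coordinator waits for before it cleans up.
  private final Set<String> expected;
  // Servers that have reported "reached" so far.
  private final Set<String> reached = new HashSet<>();
  // Stands in for the snapshot root node.
  private boolean rootExists = true;

  public BarrierSketch(List<String> expectedMembers) {
    this.expected = new HashSet<>(expectedMembers);
  }

  /** A server reports "reached"; the write fails if the root node is already gone. */
  public void reach(String server) {
    if (!rootExists) {
      throw new IllegalStateException("root node already deleted, write from " + server);
    }
    reached.add(server);
    // Clean up as soon as every expected member has reported.
    if (reached.containsAll(expected)) {
      rootExists = false;
    }
  }

  /** Lets every online server report in order and returns those whose write failed. */
  public static List<String> run(List<String> expectedMembers, List<String> onlineServers) {
    BarrierSketch barrier = new BarrierSketch(expectedMembers);
    List<String> failed = new ArrayList<>();
    for (String server : onlineServers) {
      try {
        barrier.reach(server);
      } catch (IllegalStateException e) {
        failed.add(server);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    List<String> online = Arrays.asList("rs1", "rs2", "rs3");
    // Only the servers hosting the table's regions are expected members.
    System.out.println(run(Arrays.asList("rs1", "rs2"), online)); // prints [rs3]
    // Every online server is an expected member.
    System.out.println(run(online, online)); // prints []
  }
}
```

In this model, `rs3` fails only when it is left out of the expected members, which is the situation this change tries to avoid by passing all online servers. The real ordering of writes is not deterministic, so the sketch shows one possible interleaving rather than a guaranteed failure.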

Contributor Author

> the isSnapshotDone would be called for members only, right?

I don't think so. It just gets the Procedure status.
You can see more details in the issue report.

     if (proc == null) {
       String msg = "Failed to submit distributed procedure for snapshot '"
         + snapshot.getName() + "'";