HBASE-22749: Distributed MOB compactions #623

Closed
wants to merge 16 commits into from

Conversation

VladRodionov
Contributor

Initial patch for HBASE-22749: Distributed MOB compactions

@busbey
Contributor

busbey left a comment

still reviewing, but trying to periodically get feedback published.

@@ -342,6 +325,21 @@
import org.apache.hadoop.hbase.shaded.protobuf.generated.ReplicationProtos.UpdateReplicationPeerConfigRequest;
import org.apache.hadoop.hbase.shaded.protobuf.generated.ReplicationProtos.UpdateReplicationPeerConfigResponse;
import org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription;
import org.apache.hadoop.hbase.snapshot.ClientSnapshotDescriptionUtils;
Contributor

please don't include unrelated formatting changes.

this is also the wrong import order for our checkstyle rules. the o.a.h.h.shaded stuff is purposefully grouped later than other things. for reference see the ImportOrder entry under hbase-checkstyle/src/main/resources/hbase/checkstyle.xml:

    <module name="ImportOrder">
      <property name="groups" value="*,org.apache.hbase.thirdparty,org.apache.hadoop.hbase.shaded"/>
      <property name="option" value="top" />
      <property name="ordered" value="true"/>
      <property name="sortStaticImportsAlphabetically" value="true"/>
    </module>

Depending on what IDE you're using, I believe we have some helper configs in dev-support for getting things in line with what we check.
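
For illustration, a minimal sketch of the grouping that rule enforces; the classes below are placeholders chosen for the example, not imports from this patch:

    // group 1 ("*"): everything else, alphabetized, at the top
    import java.util.List;
    import org.apache.hadoop.hbase.TableName;
    // group 2: org.apache.hbase.thirdparty
    import org.apache.hbase.thirdparty.com.google.common.collect.Lists;
    // group 3: org.apache.hadoop.hbase.shaded, grouped last
    import org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil;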

gg.sortFiles();
}
// Sort generations
Collections.sort(gens);
Contributor

We're assuming here that GEN0 will sort before other generations because it must be for data from before we had the additional information in the mob filename, right? What if folks have been specifying timestamps?

Contributor Author

Not sure why they would have to do this?

Contributor

We should prioritize compacting GEN0 because we can't clean up any mob files for a table until all the pre-mob-ref-tracking hfiles are gone.

Contributor Author

GEN0 files will, by default, have the oldest timestamps, so they will come first. One major assumption in the new compaction is also that nobody touches MOB files except the HBase system. If I understood you correctly, you mean that somebody could touch MOB files and change their modification time? In that case the system will fail, of course, in several places, especially during MOB file cleaning, where we rely on correct timestamps.
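
A minimal sketch of the ordering assumption under discussion, with a hypothetical Generation holder standing in for the patch's actual classes:

    import java.util.Comparator;
    import java.util.List;

    final class GenerationSorter {
      // Hypothetical holder: generation number plus the oldest creation
      // time of any MOB file in that generation.
      static final class Generation {
        final int gen;              // 0 = files from before mob-ref tracking
        final long minCreationTime; // oldest file creation time in the generation
        Generation(int gen, long minCreationTime) {
          this.gen = gen;
          this.minCreationTime = minCreationTime;
        }
      }

      // GEN0 files carry the oldest timestamps by default, so sorting by
      // creation time puts them first; this holds only while nothing
      // outside HBase rewrites MOB files and disturbs their timestamps.
      static void sortOldestFirst(List<Generation> gens) {
        gens.sort(Comparator.comparingLong(g -> g.minCreationTime));
      }
    }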


}

class Generations {
Contributor

I don't see any tests around Generation partitioning or the generational compaction. maybe I missed them?
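
As a hedged illustration of the kind of coverage being asked for, assuming generations sort by their oldest file creation time (all names below are hypothetical, not the patch's actual classes):

    import static org.junit.Assert.assertEquals;

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    import org.junit.Test;

    // Sketch of a generational-ordering test; the Generation holder stands
    // in for whatever the patch's Generations class partitions files into.
    public class TestMobGenerationOrdering {

      private static final class Generation {
        final int gen;
        final long minCreationTime;
        Generation(int gen, long minCreationTime) {
          this.gen = gen;
          this.minCreationTime = minCreationTime;
        }
      }

      @Test
      public void gen0SortsFirst() {
        List<Generation> gens = new ArrayList<>();
        gens.add(new Generation(2, 3000L));
        gens.add(new Generation(0, 1000L)); // GEN0: oldest files by default
        gens.add(new Generation(1, 2000L));
        gens.sort(Comparator.comparingLong(g -> g.minCreationTime));
        assertEquals(0, gens.get(0).gen);   // GEN0 must be compacted first
      }
    }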


@busbey
Contributor

busbey commented Sep 23, 2019

still reviewing, but trying to periodically get feedback published.

I believe I've gotten through everything for the current patch.

@busbey
Contributor

busbey commented Oct 14, 2019

Looks like a revert was pushed instead of an update to the changeset?

@busbey
Contributor

busbey commented Oct 16, 2019

please push the current state of the work to the PR branch. it's much harder to follow the conversation and track the code off of an outdated commit.


}
}
}
LOG.info(" MOB Cleaner found {} files for family={}", toArchive.size() , family);
Contributor

if we get to this point and it's taken us longer than MOB_MINIMUM_FILE_AGE_TO_ARCHIVE_KEY then we need to do something, since we know that there can easily be a disconnect between the view we had of regions while building the set of active mob files and the current one.

  1. We could log an ERROR and abort. Good because we know we won't delete anything we need. Bad because it requires an operator noticing and making a configuration change. The cost of them not noticing is we stop removing unneeded things from HDFS, so the cluster could die by exhausting available space.
  2. We could use MOB_MINIMUM_FILE_AGE_TO_ARCHIVE_KEY as a floor. Say we use an archive threshold of 2x the time to build the list of active mobs or MOB_MINIMUM_FILE_AGE_TO_ARCHIVE_KEY, whichever is greater. Good because we don't risk cluster death if no one notices. Maybe bad because of how large of a window we end up with? Oh! also bad because if it takes us a long time to get through iterating on the mob directory we might still exhaust the min-age window.
  3. Instead of using now when we start iterating over the mob directory, we use now as of before we start iterating over the hfile directories. Good because again no risk of cluster death if no one notices. Also easier to explain/reason about than the variable look-behind of number 2.

what do you think?

Contributor Author

We can record the time needed to build the set of all active MOB files as T and use 2xT as the minimum age to archive.

Contributor Author

It means we do not need MOB_MINIMUM_FILE_AGE_TO_ARCHIVE_KEY anymore.
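
A minimal sketch of that proposal; buildActiveMobFileSet and the surrounding names are assumptions, not the patch's API:

    import java.util.Set;

    import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;

    final class MobArchiveThreshold {
      // Record how long it takes to build the active MOB file set (T),
      // then only archive files created more than 2 x T before the scan
      // started, so files referenced by store files we saw late are safe.
      static long computeMaxCreationTimeToArchive(MobFileScanner scanner) {
        long scanStart = EnvironmentEdgeManager.currentTime();
        Set<String> active = scanner.buildActiveMobFileSet(); // hypothetical helper
        long t = EnvironmentEdgeManager.currentTime() - scanStart;
        return scanStart - 2 * t;
      }

      // Hypothetical collaborator that walks store files and collects
      // referenced MOB file names.
      interface MobFileScanner {
        Set<String> buildActiveMobFileSet();
      }
    }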

Contributor Author
VladRodionov commented Oct 22, 2019

This is the new algorithm:

      long maxCreationTimeToArchive = EnvironmentEdgeManager.currentTime() - 3600000;
      LOG.info("Only MOB files whose creation time is less than {} will be archived",
        maxCreationTimeToArchive);

The idea is to set maxCreationTimeToArchive to now() - 1h, where now() is the time the cleaning chore starts. The 1h gap guarantees that all corresponding store files exist at the time we start cleaning; it is hard to imagine a MOB file and its store file being created more than 1h apart.
Then I just check the MOB file creation time, and if it is less than this threshold, the file can be archived.
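
A hedged sketch of the described check, omitting the active-file-set membership test for brevity; the directory walk and helper names are illustrative:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;

    final class MobCleanerSketch {
      // Archive only MOB files created more than one hour before the chore
      // started; the 1h gap covers any store file still being written when
      // the active-file set was built.
      static List<FileStatus> findArchivable(FileSystem fs, Path mobDir) throws IOException {
        long maxCreationTimeToArchive = EnvironmentEdgeManager.currentTime() - 3600000L;
        List<FileStatus> toArchive = new ArrayList<>();
        for (FileStatus file : fs.listStatus(mobDir)) {
          // HDFS exposes modification time; for immutable MOB files this is
          // effectively the creation time, provided nothing outside HBase
          // touches them.
          if (file.getModificationTime() < maxCreationTimeToArchive) {
            toArchive.add(file);
          }
        }
        return toArchive;
      }
    }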
