HBASE-26273 Force ReadType.STREAM when the user does not explicitly s… #3675
Conversation
LGTM.
[nit] If possible, can you fix the checkstyle and whitespace issues?
+1
/**
 * The {@link ReadType} which should be set on the {@link Scan} to read the HBase Snapshot, default STREAM.
 */
public static final String SNAPSHOT_INPUTFORMAT_SCANNER_READTYPE = "hbase.TableSnapshotinputFormat.scanner.readtype";
TableSnapshotinputFormat => TableSnapshotInputFormat ?
hah! Thanks for noticing.
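For anyone reading along, a minimal sketch of how a job could override the new STREAM default back to PREAD via this property. The corrected key spelling, the plain Configuration wiring, and the accepted enum-name values ("PREAD"/"STREAM") are assumptions here, not taken verbatim from the patch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SnapshotReadTypeConfigSketch {
  public static void main(String[] args) {
    // Hypothetical usage: ask the snapshot InputFormat to keep using PREAD
    // instead of the STREAM default introduced by this change.
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.TableSnapshotInputFormat.scanner.readtype", "PREAD");
    System.out.println(conf.get("hbase.TableSnapshotInputFormat.scanner.readtype"));
  }
}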
I think this is an improvement so give a +1 first.
But just curious, how much could we gain from this change?
By default, we will switch to stream after reading a small amount of data, several hundreds of KBs? If we read several hundreds of MBs of data in a MapReduce job, I do not think it will affect the performance too much?
For a standalone Java program reading a ~5G file in a single JVM (... using the mapreduce snapshot APIs), this change improved run time from 90s to 30s. In a distributed system, it only gave about a 15% improvement (network became the bottleneck -- that's where HBASE-26274 came into play).
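As a side note on the switch point mentioned above: if memory serves, the pread-to-stream threshold is governed by hbase.storescanner.pread.max.bytes, defaulting to roughly four HFile blocks (a few hundred KB). Both the key and the default here are recollections rather than anything taken from this patch; a tiny sketch of raising it for experimentation:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class PreadSwitchThresholdSketch {
  public static void main(String[] args) {
    // Assumed property: bytes read via pread before the scanner falls back
    // to streaming reads. The 1 MB value is arbitrary, for illustration only.
    Configuration conf = HBaseConfiguration.create();
    conf.setLong("hbase.storescanner.pread.max.bytes", 1024L * 1024);
    System.out.println(conf.get("hbase.storescanner.pread.max.bytes"));
  }
}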
You bet. I hadn't looked at that yet. Thanks for calling it out!
…et a ReadType on the Scan for a Snapshot-based Job HBase 2 moved over Scans to use PREAD by default instead of STREAM like HBase 1. In the context of a MapReduce job, we can generally expect that clients using the InputFormat (batch job) would be reading most of the data for a job. Cater to them, but still give users who want PREAD the ability to do so.
Force-pushed from 1d00647 to a808a2d.
a808a2d has the fixes requested. If QA comes back happy, I'll just merge this. No need to bother y'all for a re-review. Thanks for the eyes!
The numbers are impressive. For the standalone Java program, was it an HDFS local read with short-circuit read enabled, or through a local TCP connection?
This should have been a remote TCP connection. I ran these numbers on a multi-node cluster. It's possible that part of the data was hosted by a local DataNode, but, if memory serves, it was largely remote reads. I have the steps I was using to test in a private Git repository. I can post it if you're curious to reproduce what I did.
Merged to master, branch-2, and branch-2.4.
Thanks @joshelser. I was trying to understand how much improvement it brings for regions with locality. If it is OK to share your private Git repo, I'd like to run the test on regions with 100% locality and share the numbers on the JIRA.
@huaxiangsun https://github.com/joshelser/stream-repro this is the rough outline of what I was doing. Pretty straightforward (hbase pe to make data in a table, take a snapshot, and a …
It is a bit of a surprise to me that there could be a 15% impact on performance. I supposed there should be little difference, as we only read a very small amount of data with pread. Mind sharing more details here, such as the HFile block size or anything else? IIRC, the default config is to switch to stream after reading 4 HFile blocks. And I saw you have already provided the test code, let me also take a look. Maybe we should file an issue about the performance issue with pread switching to stream.
I think we also need this on branch-2.3? It has not been EOLed yet. Thanks.
Oh, just noticed that this is for reading snapshots... Let me take a look at the input format implementation.
Yup! Just for reading HFiles directly from the filesystem in a local JVM.
My bad. I'll apply it there too.
Thanks @joshelser, will report back.
…et a ReadType on the Scan for a Snapshot-based Job
HBase 2 moved over Scans to use PREAD by default instead of STREAM like
HBase 1. In the context of a MapReduce job, we can generally expect that
clients using the InputFormat (batch job) would be reading most of the
data for a job. Cater to them, but still give users who want PREAD the
ability to do so.
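A minimal caller-side sketch of the escape hatch described above, assuming the standard HBase 2 client API (Scan.setReadType); the surrounding snapshot job setup is omitted:

import org.apache.hadoop.hbase.client.Scan;

public class ExplicitPreadScanSketch {
  public static void main(String[] args) {
    // Callers who still want pread can say so explicitly on the Scan;
    // the InputFormat only forces STREAM when no ReadType was set explicitly.
    Scan scan = new Scan();
    scan.setReadType(Scan.ReadType.PREAD);
    System.out.println("Explicit ReadType: " + scan.getReadType());
  }
}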