MAPREDUCE-7430 FileSystemCount enumeration changes will cause mapreduce application failure during upgrade #5255
base: trunk
Conversation
@ayushtkn @jojochuang
💔 -1 overall
This message was automatically generated.
makes sense; a little unmarshalling bug.
I wonder if it is possible to write a test? I do not see an easy way to do this at all...
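One possible approach, sketched below (this is not part of the PR; the class name and payload layout are assumptions based on the readFields() wire format visible in the diff that follows, plus a leading scheme count): hand-craft the serialized bytes that a writer with a newer FileSystemCounter enum would produce, with a counter ordinal one past the end of the local enum, and feed them to readFields().

// Hypothetical test sketch: craft the wire format with an unknown counter ordinal.
// DataOutputBuffer/DataInputBuffer and WritableUtils are existing Hadoop classes;
// wiring the payload into a concrete FileSystemCounterGroup instance is elided.
import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.WritableUtils;
import org.apache.hadoop.mapreduce.FileSystemCounter;

public class UnknownCounterWireFormat {
  public static DataInputBuffer buildPayload() throws java.io.IOException {
    DataOutputBuffer out = new DataOutputBuffer();
    WritableUtils.writeVInt(out, 1);               // #scheme
    WritableUtils.writeString(out, "hdfs");        // scheme
    WritableUtils.writeVInt(out, 1);               // #counter
    // an ordinal this JVM's enum does not have, as a newer writer would send
    WritableUtils.writeVInt(out, FileSystemCounter.values().length);
    WritableUtils.writeVLong(out, 42L);            // counter value
    DataInputBuffer in = new DataInputBuffer();
    in.reset(out.getData(), out.getLength());
    // pre-patch, readFields() on this payload throws ArrayIndexOutOfBoundsException;
    // post-patch it should return normally, ignoring the unknown counter
    return in;
  }
}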
@@ -311,8 +311,11 @@ public void readFields(DataInput in) throws IOException {
      String scheme = WritableUtils.readString(in); // scheme
      int numCounters = WritableUtils.readVInt(in); // #counter
      for (int j = 0; j < numCounters; ++j) {
-       findCounter(scheme, enums[WritableUtils.readVInt(in)]) // key
+       int countTypeIndex = WritableUtils.readVInt(in);
+       if(countTypeIndex < enums.length) {
nit: can you add a space between if and (
@steveloughran
Thanks for the review, fixed as you mentioned.
I have tried to write a UT, but it's not easy to construct the scenario.
Since this is just a simple fix, I think it is OK without a UT.
Please help to merge once the pipeline is done,
and could you please help to merge #5236 as well? One more approval is needed.
Thanks a lot
force-pushed the MAPREDUCE-7430 branch from f88eeeb to a58011d
💔 -1 overall
This message was automatically generated.
looks ok to me, but I will see if anyone on the mapreduce list will review it too
We found this issue when doing a rolling upgrade in our production setup.
A new MapReduce counter was introduced by HADOOP-15507 (Add MapReduce counters about EC bytes read).
In the upgrade scenario, if a user with an old-version MapReduce client tries to run a job on a YARN cluster running the new version, the exception below is thrown in the container log:
2022-12-21 21:38:37,037 | INFO | IPC Server handler 28 on 27102 | Commit go/no-go request from attempt_1670928986900_1250_r_000000_0 | TaskAttemptListenerImpl.java:222
2022-12-21 21:38:37,037 | INFO | IPC Server handler 28 on 27102 | Result of canCommit for attempt_1670928986900_1250_r_000000_0:true | TaskImpl.java:592
2022-12-21 21:38:37,037 | WARN | Socket Reader #2 for port 27102 | Unable to read call parameters for client 192.168.4.96 on connection protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE | Server.java:2598
java.lang.ArrayIndexOutOfBoundsException: 5
at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.readFields(FileSystemCounterGroup.java:304)
at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:307)
at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
at org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:162)
at org.apache.hadoop.ipc.RpcWritable$WritableWrapper.readFrom(RpcWritable.java:85)
at org.apache.hadoop.ipc.RpcWritable$Buffer.getValue(RpcWritable.java:187)
at org.apache.hadoop.ipc.RpcWritable$Buffer.newInstance(RpcWritable.java:183)
at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:2594)
at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:2515)
at org.apache.hadoop.ipc.Server$Connection.unwrapPacketAndProcessRpcs(Server.java:2469)
at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1912)
at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2723)
at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:2509)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:2258)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:1395)
at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1251)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1222)
Just ignore the line numbers, which are not completely consistent with trunk; the trace is still easy to follow.
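To make the mechanism concrete, here is a minimal standalone sketch. OldFileSystemCounter is a hypothetical stand-in for the pre-HADOOP-15507 FileSystemCounter, which had five constants; BYTES_READ_EC became the sixth, ordinal 5, matching the index in the exception above.

// Hypothetical demo of the failure mode: a writer that knows six enum
// constants serializes ordinal 5, but this reader's enum stops at ordinal 4.
public class EnumMismatchDemo {
  // stand-in for the old FileSystemCounter (before HADOOP-15507)
  enum OldFileSystemCounter {
    BYTES_READ, BYTES_WRITTEN, READ_OPS, LARGE_READ_OPS, WRITE_OPS
  }

  public static void main(String[] args) {
    OldFileSystemCounter[] enums = OldFileSystemCounter.values(); // length 5
    int wireIndex = 5; // ordinal of BYTES_READ_EC on the newer side
    // same lookup that fails in FileSystemCounterGroup.readFields():
    System.out.println(enums[wireIndex]); // ArrayIndexOutOfBoundsException: 5
  }
}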
So an extra validation is needed in readFields() to avoid reading an array element that is out of range.
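For illustration, the guarded loop might look like the sketch below. This is based on the diff shown earlier, not necessarily the exact merged patch; one point worth noting is that the counter value must still be consumed even when the ordinal is unknown, otherwise the rest of the stream would be misaligned.

// Sketch of the guarded loop in FileSystemCounterGroup#readFields
// (derived from the diff in this PR; not necessarily the exact merged code).
for (int j = 0; j < numCounters; ++j) {
  int countTypeIndex = WritableUtils.readVInt(in); // key
  long value = WritableUtils.readVLong(in);        // always read the value,
                                                   // so the stream stays in sync
  if (countTypeIndex < enums.length) {
    findCounter(scheme, enums[countTypeIndex]).setValue(value);
  }
  // else: counter written by a newer version; silently ignore it
}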
