-
Notifications
You must be signed in to change notification settings - Fork 963
Use immutable metadata in LedgerHandle #1646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
6817348
4f5e62d
50a4840
5931a68
c700787
fc13a8e
2ed547c
97aa23f
3245e76
164efbc
0bc7e07
c102f4c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| /* | ||
| * | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| * | ||
| */ | ||
| package org.apache.bookkeeper.client; | ||
|
|
||
| import static com.google.common.base.Preconditions.checkArgument; | ||
|
|
||
| import java.util.ArrayList; | ||
| import java.util.HashSet; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Set; | ||
|
|
||
| import org.apache.bookkeeper.net.BookieSocketAddress; | ||
|
|
||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| class EnsembleUtils { | ||
| private static final Logger LOG = LoggerFactory.getLogger(EnsembleUtils.class); | ||
|
|
||
| static List<BookieSocketAddress> replaceBookiesInEnsemble(BookieWatcher bookieWatcher, | ||
| LedgerMetadata metadata, | ||
| List<BookieSocketAddress> oldEnsemble, | ||
| Map<Integer, BookieSocketAddress> failedBookies, | ||
| String logContext) | ||
| throws BKException.BKNotEnoughBookiesException { | ||
| List<BookieSocketAddress> newEnsemble = new ArrayList<>(oldEnsemble); | ||
|
|
||
| int ensembleSize = metadata.getEnsembleSize(); | ||
| int writeQ = metadata.getWriteQuorumSize(); | ||
| int ackQ = metadata.getAckQuorumSize(); | ||
| Map<String, byte[]> customMetadata = metadata.getCustomMetadata(); | ||
|
|
||
| Set<BookieSocketAddress> exclude = new HashSet<>(failedBookies.values()); | ||
|
|
||
| int replaced = 0; | ||
| for (Map.Entry<Integer, BookieSocketAddress> entry : failedBookies.entrySet()) { | ||
| int idx = entry.getKey(); | ||
| BookieSocketAddress addr = entry.getValue(); | ||
| if (LOG.isDebugEnabled()) { | ||
| LOG.debug("{} replacing bookie: {} index: {}", logContext, addr, idx); | ||
| } | ||
|
|
||
| if (!newEnsemble.get(idx).equals(addr)) { | ||
| if (LOG.isDebugEnabled()) { | ||
| LOG.debug("{} Not changing failed bookie {} at index {}, already changed to {}", | ||
| logContext, addr, idx, newEnsemble.get(idx)); | ||
| } | ||
| continue; | ||
| } | ||
| try { | ||
| BookieSocketAddress newBookie = bookieWatcher.replaceBookie( | ||
| ensembleSize, writeQ, ackQ, customMetadata, newEnsemble, idx, exclude); | ||
| newEnsemble.set(idx, newBookie); | ||
|
|
||
| replaced++; | ||
| } catch (BKException.BKNotEnoughBookiesException e) { | ||
| // if there is no bookie replaced, we throw not enough bookie exception | ||
| if (replaced <= 0) { | ||
| throw e; | ||
| } else { | ||
| break; | ||
| } | ||
| } | ||
| } | ||
| return newEnsemble; | ||
| } | ||
|
|
||
| static Set<Integer> diffEnsemble(List<BookieSocketAddress> e1, | ||
| List<BookieSocketAddress> e2) { | ||
| checkArgument(e1.size() == e2.size(), "Ensembles must be of same size"); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How is this better than Assert?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. asserts get turned off by default at runtime. |
||
| Set<Integer> diff = new HashSet<>(); | ||
| for (int i = 0; i < e1.size(); i++) { | ||
| if (!e1.get(i).equals(e2.get(i))) { | ||
| diff.add(i); | ||
| } | ||
| } | ||
| return diff; | ||
| } | ||
| } | ||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -102,6 +102,19 @@ LedgerMetadataBuilder withEnsembleSize(int ensembleSize) { | |
| return this; | ||
| } | ||
|
|
||
| LedgerMetadataBuilder withWriteQuorumSize(int writeQuorumSize) { | ||
| checkArgument(ensembleSize >= writeQuorumSize, "Write quorum must be less or equal to ensemble size"); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. check
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added |
||
| checkArgument(writeQuorumSize >= ackQuorumSize, "Write quorum must be greater or equal to ack quorum"); | ||
| this.writeQuorumSize = writeQuorumSize; | ||
| return this; | ||
| } | ||
|
|
||
| LedgerMetadataBuilder withAckQuorumSize(int ackQuorumSize) { | ||
| checkArgument(writeQuorumSize >= ackQuorumSize, "Ack quorum must be less or equal to write quorum"); | ||
| this.ackQuorumSize = ackQuorumSize; | ||
| return this; | ||
| } | ||
|
|
||
| LedgerMetadataBuilder newEnsembleEntry(long firstEntry, List<BookieSocketAddress> ensemble) { | ||
| checkArgument(ensemble.size() == ensembleSize, | ||
| "Size of passed in ensemble must match the ensembleSize of the builder"); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -536,12 +536,9 @@ public void testReadAfterLastAddConfirmed() throws Exception { | |
| } | ||
| } | ||
|
|
||
| try { | ||
| writeLh.close(); | ||
| fail("should not be able to close the first LedgerHandler as a recovery has been performed"); | ||
| } catch (BKException.BKMetadataVersionException expected) { | ||
| } | ||
|
|
||
| // should still be able to close as long as recovery closed the ledger | ||
| // with the same last entryId and length as in the write handle. | ||
| writeLh.close(); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmm this changes the closing behavior. I am not sure how it would impact the applications. so I would suggest keep the original behavior if closing a ledger hit metadata version exception don't attempt. If you want to change the behavior, do a separate PR for it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems like a desirable change to me. Moreover, it protects against the case where the racing update came from the same client due to ZooKeeperClient resending the update.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sijie the previous behaviour is not documented, nor is it well defined. In some cases a metadata version exception allowed a close to succeed, and in others it did not. I would not expect any application is relying on this behaviour, and if they are, they are probably broken in many other ways.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understood it is not documented. but if we are changing any existing behavior, I would suggest either doing it in a separate PR, or if it is difficult to do it in a separate PR, then update the javadoc of this method to make things clear "this method will not throw any exceptions anymore if hitting metadata version exceptions"
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may still throw an exception on metadata version exception. However, it will only throw the exception if the length or last entry id in the conflicting write is different to what the caller of #close believed it to be. I strongly prefer making this change as part of this patch, as to do otherwise would be to insert arbitrary strange behaviour into the new implementation. I'll add a javadoc for this.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added javadoc |
||
| } | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a logging message dropped. it would be great not to drop log messages related to ensemble changes. they exist for a reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the final summary message is pretty handy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will readd. It's not a final summary however, it's the change that we wish to make. It's not final until the zookeeper write completes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but let us not drop any debug/log messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Readded, but in the calling method, as this method doesn't have all the context.