Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/issue/19268 #20

Open
wants to merge 57 commits into
base: master
Choose a base branch
from
Open

Fix/issue/19268 #20

wants to merge 57 commits into from

Conversation

labuladong
Copy link
Owner

Fixes #xyz

Master Issue: #xyz

PIP: #xyz

Motivation

Modifications

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

lhotari and others added 30 commits January 17, 2023 10:36
…ctive broker. (apache#19244)

Co-authored-by: nicklixinyang <nicklixinyang@didiglobal.com>
Signed-off-by: Zixuan Liu <nodeces@gmail.com>
…pache#19197)

PIP: apache#12105 

### Motivation

This is the first of several PRs to implement [PIP 97](apache#12105).

This PR seeks to start to solve the fact that the `AuthenticationState` class currently authenticates `authData` twice instead of just once. This change is important to make before we are able to utilize the async methods introduced in apache#12104.

Historical context: apache#14044 introduced the `AuthenticationProvider#newHttpAuthState`  method. The only use case for this method in the pulsar code base is to let custom providers specify the `AuthenticationDataSource` on http request attributes. The primary problem with that implementation is that the `AuthenticationState` class currently involves authenticating the `authData` passed in to the `newHttpAuthState`. As such, this code is sub-optimal, and creates a confusing flow.

I propose we deprecate the `newHttpAuthState` method and instead start using the `authenticateHttpRequestAsync` and `authenticateHttpRequest` methods to allow custom implementations to configure the desired `AuthenticationDataSource` on the request attributes.

In order to simplify the diff for reviewers, this PR uses the deprecated `AuthenticationProvider#authenticateHttpRequest` method. I plan to follow up and switch to use the `AuthenticationProvider#authenticateHttpRequestAsync` method.

Note that these changes are completely backwards compatible. The only risk is to users that have custom code loaded into the broker that calls the `AuthenticationProvider#authenticateHttpRequest` method.

### Modifications

* Deprecate `AuthenticationService#authenticateHttpRequest(HttpServletRequest request, AuthenticationDataSource authData)`. It is no longer used.
* Deprecate `AuthenticationProvider#newHttpAuthState(HttpServletRequest request)`. It is no longer used outside of the `AuthenticationProvider` interface itself.
* Remove `@InterfaceStability.Unstable` annotation from the `authenticateHttpRequestAsync` method. When I introduced that annotation, I was under the impression that we didn't need it. However, in order to meet the requirements introduced in apache#14044, we need to let custom `AuthenticationProviders` add their own attributes.
* Update the default implementation of `authenticateHttpRequest`. Because the previous implementation was unreachable by all auth providers except for the SASL implementation, this is a backwards compatible change.
* Implement changes for the `AuthenticationProviderToken` so that it no longer relies on `newHttpAuthState`.

### Verifying this change

I added new tests.

### Does this pull request potentially affect one of the following parts:

- [x] The public API

This changes the public API within the broker by marking some methods as `@Deprecated`.

### Documentation

- [x] `doc-not-needed`

We document the `AuthenticationProvider` interface in the code. I added these docs. There is currently no where else to update docs.

### Matching PR in forked repository

PR in forked repository: michaeljmarshall#12

### Additional motivation from PR discussion

My primary motivation for this PR is to implement PIP 97. If we want authentication to be asynchronous, we cannot assume that when the `AuthenticationState` object is initialized, the `authenticationDataSource` and the `authRole` are present because the authentication might not yet be completed. My goal with this PR is to break an unnecessary relationship between `AuthenticationState` and http request authentication that was created by apache#14044.
…che#19278)

### Motivation

The `AuthenticationDataSource` variable was added to the `ProxyConnection` class in apache#1200. It is no longer needed.

Note that we store this variable in the `ServerCnx` because it is used for authorization. Because we do not do authorization in the proxy, we don't need it.

### Modifications

* Remove unused variable

### Verifying this change

This is a trivial change.

### Does this pull request potentially affect one of the following parts:

The only conceivable way this is a breaking change is if a third party implementation implemented their library so that `authState.getAuthDataSource()` has a side effect.

### Documentation

- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->

No docs needed.
…apache#19275)

### Motivation

#### 1. `readPosition` point to a deleted ledger

When `trim ledgers` and `create new cursor` are executed concurrently, it will cause the `readPosition` of the cursor to point to a deleted ledger.

| time | `trim ledgers` | `create new cursor` |
| --- | --- | --- |
| 1 | | set read position and mark deleted position |
| 2 | delete ledger | |
| 3 | | add the cursor to `ManagedLedger.cursors` |

----

#### 2. Backlog wrong caused by `readPosition` wrong
<strong>(Highlight)</strong>Since the read position of the cursor is pointing at a deleted ledger, so deleted messages will never be consumed or acknowledged. Since the backlog in the API `topics stats` response is calculated as this: `managedLedger.entriesAddedCounter - cursor.messagesConsumedCounter`, the result is: Topics stats show `msgBacklog` but there is reality no backlog.
- `managedLedger.entriesAddedCounter`: Pulsar will set it to `0` when creating a new managed ledger, it will increment when adding entries.
- `cursor.messagesConsumedCounter`: Pulsar will set it to `0` when creating a new cursor, it will increment when acknowledging.

For example: 
- write entries to the managed ledger: `{1:0~1:9}...{5:0~5:9}`
  - `managedLedger.entriesAddedCounter` is `50` now
- create a new cursor, and set the read position to `1:0`
  -  `cursor.messagesConsumedCounter` is `0` now
- delete ledgers `1~4`
- consume all messages
  - can only consume the messages {5:0~5:9}, so `cursor.messagesConsumedCounter` is `10` now
- the `backlog` in response of `topics stats` is `50 - 10 = 40`, but there reality no backlog

----

#### 3. Reproduce issue
Sorry, I spent 4 hours trying to write a non-invasive test, but failed. <strong>(Highlight)</strong>You can reproduce by `testBacklogIfCursorCreateConcurrentWithTrimLedger` in the PR apache#19274

https://github.com/apache/pulsar/blob/a2cdc759fc2710e4dd913eb0485d23ebcaa076a4/pulsar-broker/src/test/java/org/apache/bookkeeper/mledger/impl/StatsBackLogTest.java#L163

### Modifications

Avoid the race condition of `cursor.initializeCursorPosition` and `internalTrimLedgers`
lhotari and others added 24 commits January 24, 2023 12:48
…he#19312)

PIP: apache#12105 

### Motivation

When authentication fails in the `ServerCnx`, the state is left in `Start` if the primary `authData` fails authentication and in `Connecting` or `Connected` if the `originalAuthData` authentication fails. To prevent any kind of unexpected behavior, we should go to `Failed` state.

Note that the tests verify the current behavior where a failed `originalAuthData` results first in a `Connected` command from the broker and then an `Error` command. I documented that I think this is sub optimal here apache#19311.

### Modifications

* Update `ServerCnx` state to `Failed` when there is an authentication exception during `handleConnect` and during `handleAuthResponse`.
* Update `handleAuthResponse` reply to `"Unable to authenticate"` instead of the `AuthenticationState` exception.

### Verifying this change

A new test is added. The added test covers the change made in apache#19295 where we updated `ServerCnx` so that we call `AuthState#authenticate` instead of relying on the implementation detail that the initialization calls `authenticate`. That PR should have added a test.

### Does this pull request potentially affect one of the following parts:

This is not a breaking change.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: michaeljmarshall#18
…s incorrect (apache#19301)

Co-authored-by: zhangzhanchang <zhangzhanchang@didiglobal.com>
…dCursor(single subscription check) upon subscription (apache#19343)
@github-actions
Copy link

@labuladong Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@github-actions
Copy link

github-actions bot commented Mar 1, 2023

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Mar 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.