Closed
Description
related to #766.
Checklist
Described in #5769
Tell us about the problem you're trying to solve
- Several of our integrations require authenticating using OAuth. The common way of doing this in Singer is to cheese the system a little bit. Essentially, you find some way to get a refresh token by extracting it from a network call in the browser's developer tools, and then pass it as an argument to the integration. This is not how OAuth is intended to work, but we've followed Singer's cue here and done the same.
- The downside of this approach is that it's really unfriendly to the user:
  - Accessing the refresh token is usually something intended to be done by developers, not by the average user of a service.
  - Even for a developer, it's supposed to be done inside their own application, not through a series of scripts and hacks, which is what the current procedure relies on.
  - Anecdotally, it takes me about an hour on average to go from creating an account to successfully extracting a refresh token for a given service. That is a lot of friction!
If you're not familiar with OAuth (or forget how it works every time you encounter it, like me)...
The flow looks something like this.
- X is my application that wants to access User Y's data in Application Z.
- A developer from X goes to Z and gets some credentials to identify their application (usually a client id and a client secret)
- While User Y is using X, X says it needs access to User Y's data in Z.
- User Y is redirected to Z's OAuth portal (a.k.a. that page where it says "X wants to be able to see your data, is that okay?"). Under the hood, X has passed its client id to Z to identify its application; the client secret is only used later, when X talks to Z directly to obtain tokens.
- Assuming User Y agrees to give access, Z redirects back to X, which ends up with a refresh token (strictly, Z hands X a short-lived authorization code that X exchanges for tokens). This refresh token can then be used to create access tokens. The access tokens are how X is able to access Y's data in Z. Access tokens typically expire after a few hours. Refresh tokens (in ad tech) often don't expire unless they are revoked. If a refresh token does expire, it's usually after many days or months. As long as X has a non-revoked, non-expired refresh token, it will be able to access Y's data.
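For readers who prefer code, here is a minimal sketch of the two requests X builds in the flow above, assuming a standard OAuth 2.0 authorization-code setup. The endpoint `https://z.example/oauth/authorize` and both function names are made up for illustration; every real provider documents its own URLs, scopes, and extra parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for Application Z; real providers publish their own.
AUTHORIZE_URL = "https://z.example/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, scope: str, state: str) -> str:
    """URL that X sends User Y's browser to so Y can approve access in Z."""
    params = {
        "response_type": "code",       # ask Z for an authorization code
        "client_id": client_id,        # identifies X's application to Z
        "redirect_uri": redirect_uri,  # where Z sends the browser afterwards
        "scope": scope,
        "state": state,                # opaque value X checks on the way back
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def build_refresh_request(client_id: str, client_secret: str, refresh_token: str) -> dict:
    """Form fields X POSTs to Z's token endpoint to mint a new access token."""
    return {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,  # only ever sent server-to-server
    }
```

Note that the client secret never appears in the browser-facing URL; it is only used in the direct call between X and Z.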
Describe the solution you’d like
- Airbyte should provide a facility for integrations to do OAuth in Airbyte's UI.
- The flow:
  - The user selects the integration they want to use. They input the credentials (e.g. client id, client secret), or we input our own; not sure yet which makes the most sense.
  - Airbyte uses those credentials to construct the correct request to the integration's OAuth portal. The user is prompted to allow Airbyte access to their data.
  - Once they hit accept, they are redirected back to Airbyte. Behind the scenes, Airbyte stores the refresh token (this is how OAuth is normally supposed to work).
- This is better because now the user doesn't need to worry about refresh tokens at all.
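To make the "redirected back to Airbyte" step concrete, here is a rough sketch of what handling that redirect could look like. Nothing here is existing Airbyte code: `handle_oauth_callback`, the `exchange_code` callable, and the dict used as a token store are all assumptions made for illustration.

```python
from typing import Callable, Dict
from urllib.parse import urlparse, parse_qs


def handle_oauth_callback(
    callback_url: str,
    expected_state: str,
    exchange_code: Callable[[str], Dict[str, str]],
    token_store: Dict[str, str],
    connector_id: str,
) -> None:
    """Validate the redirect from the provider and persist the refresh token.

    `exchange_code` is a hypothetical helper that trades the authorization
    code for tokens by calling the provider's token endpoint.
    """
    query = parse_qs(urlparse(callback_url).query)

    # The provider reports a declined prompt via an `error` parameter.
    if "error" in query:
        raise PermissionError(f"access was not granted: {query['error'][0]}")

    # Reject callbacks that do not belong to the flow we started.
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; ignoring callback")

    if "code" not in query:
        raise ValueError("callback is missing the authorization code")

    tokens = exchange_code(query["code"][0])
    # The user never sees this value; it is stored on their behalf.
    token_store[connector_id] = tokens["refresh_token"]
```

A call might look like `handle_oauth_callback(url, state, exchange_code, store, "my-source")`, after which `store["my-source"]` holds the refresh token for later syncs.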
How
- This isn't that easy to do ...
- Right now all integration related code runs inside the workers (docker containers). This is great for creating hermetic environments to run integration code.
- OAuth relies on the browser to work. So while we may be able to offload some of the work to the worker (e.g. constructing the correct request and extracting the refresh token from the response), ultimately the requests need to be made in the browser, not in docker containers, so that users can properly approve the transaction.
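One possible way to draw that line, sketched under heavy assumptions: the integration exposes two pure, network-free steps that could run inside a worker, while the UI and server own the browser redirect in between. The `OAuthSteps` interface, the `ExampleSteps` class, and the `z.example` URL are hypothetical and not part of any existing Airbyte API.

```python
from abc import ABC, abstractmethod
from typing import Dict
from urllib.parse import urlencode


class OAuthSteps(ABC):
    """Hypothetical per-integration interface; could run inside a worker."""

    @abstractmethod
    def consent_url(self, config: Dict[str, str], redirect_uri: str, state: str) -> str:
        """Build the URL the user's browser must open. No network access needed."""

    @abstractmethod
    def extract_refresh_token(self, token_response: Dict[str, str]) -> str:
        """Pull the refresh token out of the provider's token response."""


class ExampleSteps(OAuthSteps):
    # Made-up provider endpoint, for illustration only.
    AUTHORIZE_URL = "https://z.example/oauth/authorize"

    def consent_url(self, config: Dict[str, str], redirect_uri: str, state: str) -> str:
        params = {
            "response_type": "code",
            "client_id": config["client_id"],
            "redirect_uri": redirect_uri,
            "state": state,
        }
        return f"{self.AUTHORIZE_URL}?{urlencode(params)}"

    def extract_refresh_token(self, token_response: Dict[str, str]) -> str:
        return token_response["refresh_token"]
```

With a split like this, the container only ever turns inputs into outputs; opening the consent URL and receiving the redirect stay in the browser, where the user can actually approve the request.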