Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datasette create-token ability to create tokens with a reduced set of permissions #1855

Closed
5 tasks done
simonw opened this issue Oct 26, 2022 · 19 comments
Closed
5 tasks done

Comments

@simonw
Copy link
Owner

simonw commented Oct 26, 2022

Initial design ideas: #1852 (comment)

Token design concept:

{
    "t": {
        "a": ["ir", "ur", "dr"],
        "d": {
            "fixtures": ["ir", "ur", "dr"]
        },
        "t": {
            "fixtures": {
                "searchable": ["ir"]
            }
        }
    }
}

That JSON would be minified and signed.

Minified version of the above looks like this (101 characters):

{"t":{"a":["ir","ur","dr"],"d":{"fixtures":["ir","ur","dr"]},"t":{"fixtures":{"searchable":["ir"]}}}}

The "t" key shows this is a token that as a default API key.

"a" means "all" - these are permissions that have been granted on all tables and databases.

"d" means "databases" - this is a way to set permissions for all tables in a specific database.

"t" means "tables" - this lets you set permissions at a finely grained table level.

Then the permissions themselves are two character codes which are shortened versions - so:

  • ir = insert-row
  • ur = update-row
  • dr = delete-row

Remaining tasks

  • Add these options to the datasette create-token command
  • Tests for datasette create-token options
  • Documentation for those options at https://docs.datasette.io/en/latest/authentication.html#datasette-create-token
  • A way to handle permissions that don't have known abbreviations (permissions added by plugins). Probably need to solve the plugin permission registration problem as part of that
  • Stop hard-coding names of actions in the permission_allowed_actor_restrictions function
@simonw
Copy link
Owner Author

simonw commented Oct 26, 2022

I'm going to delay working on this until after I have some of the write APIs built to try it against:

@simonw
Copy link
Owner Author

simonw commented Oct 27, 2022

Here's an interesting edge-case to consider: what if a user creates themselves a token for a specific table, then deletes that table, and waits for another user to create a table of the same name... and then uses their previously created token to write to the table that someone else created?

Not sure if this is a threat I need to actively consider, but it's worth thinking a little bit about the implications of such a thing - since there will be APIs that allow users to create tables, and there may be cases where people want to have a concept of users "owning" specific tables.

This is probably something that could be left for plugins to solve, but it still needs to be understood and potentially documented.

There may even be a world in which tracking the timestamp at which a table was created becomes useful - because that could then be baked into API tokens, such that a token created BEFORE the table was created does not grant access to that table.

@simonw
Copy link
Owner Author

simonw commented Oct 27, 2022

My hunch on this is that anyone with that level of complex permissions requirements needs to be using a custom authentication plugin which includes much more concrete token rules, rather than the default signed stateless token implementation that ships with Datasette core.

@simonw
Copy link
Owner Author

simonw commented Nov 3, 2022

Maybe the way to do this is through a new standard mechanism on the actor: a set of additional restrictions, e.g.:

{
  "id": "root",
  "_r": {
    "a": ["ir", "ur", "dr"],
    "d": {
        "fixtures": ["ir", "ur", "dr"]
    },
    "t": {
        "fixtures": {
            "searchable": ["ir"]
        }
    }
}

"a" is "all permissions" - these apply to everything.
"d" permissions only apply to the specified database
"t" permissions only apply to the specified table

The way this works is there's a default permission_allowed(datasette, actor, action, resource) hook which only consults these, and crucially just says NO if those rules do not match.

In this way it would apply as an extra layer of permission rules over the defaults (which for this root instance would all return yes).

@simonw
Copy link
Owner Author

simonw commented Nov 3, 2022

Built a prototype of the above:

diff --git a/datasette/default_permissions.py b/datasette/default_permissions.py
index 32b0c758..f68aa38f 100644
--- a/datasette/default_permissions.py
+++ b/datasette/default_permissions.py
@@ -6,8 +6,8 @@ import json
 import time
 
 
-@hookimpl(tryfirst=True)
-def permission_allowed(datasette, actor, action, resource):
+@hookimpl(tryfirst=True, specname="permission_allowed")
+def permission_allowed_default(datasette, actor, action, resource):
     async def inner():
         if action in (
             "permissions-debug",
@@ -57,6 +57,44 @@ def permission_allowed(datasette, actor, action, resource):
     return inner
 
 
+@hookimpl(specname="permission_allowed")
+def permission_allowed_actor_restrictions(actor, action, resource):
+    if actor is None:
+        return None
+    _r = actor.get("_r")
+    if not _r:
+        # No restrictions, so we have no opinion
+        return None
+    action_initials = "".join([word[0] for word in action.split("-")])
+    # If _r is defined then we use those to further restrict the actor
+    # Crucially, we only use this to say NO (return False) - we never
+    # use it to return YES (True) because that might over-ride other
+    # restrictions placed on this actor
+    all_allowed = _r.get("a")
+    if all_allowed is not None:
+        assert isinstance(all_allowed, list)
+        if action_initials in all_allowed:
+            return None
+    # How about for the current database?
+    if action in ("view-database", "view-database-download", "execute-sql"):
+        database_allowed = _r.get("d", {}).get(resource)
+        if database_allowed is not None:
+            assert isinstance(database_allowed, list)
+            if action_initials in database_allowed:
+                return None
+    # Or the current table? That's any time the resource is (database, table)
+    if not isinstance(resource, str) and len(resource) == 2:
+        database, table = resource
+        table_allowed = _r.get("t", {}).get(database, {}).get(table)
+        # TODO: What should this do for canned queries?
+        if table_allowed is not None:
+            assert isinstance(table_allowed, list)
+            if action_initials in table_allowed:
+                return None
+    # This action is not specifically allowed, so reject it
+    return False
+
+
 @hookimpl
 def actor_from_request(datasette, request):
     prefix = "dstok_"

@simonw
Copy link
Owner Author

simonw commented Nov 3, 2022

That still needs comprehensive tests before I land it.

simonw added a commit that referenced this issue Nov 4, 2022
New mechanism for restricting permissions further for a given actor.

This still needs documentation. It will eventually be used by the mechanism to issue
signed API tokens that are only able to perform a subset of actions.

This also adds tests that exercise the POST /-/permissions tool, refs #1881
@simonw
Copy link
Owner Author

simonw commented Nov 4, 2022

Added the tests.

@simonw
Copy link
Owner Author

simonw commented Nov 14, 2022

The datasette create-token command will need to be able to do this too.

Right now that command looks like this:

% datasette create-token --help
Usage: datasette create-token [OPTIONS] ID

  Create a signed API token for the specified actor ID

Options:
  --secret TEXT                Secret used for signing the API tokens
                               [required]
  -e, --expires-after INTEGER  Token should expire after this many seconds
  --debug                      Show decoded token
  --help                       Show this message and exit.
% datasette create-token root --secret sec --debug -e 445
dstok_eyJhIjoicm9vdCIsInRva2VuIjoiZHN0b2siLCJ0IjoxNjY4NDA2MjEzLCJkIjo0NDV9.Hd6qRli6xRKkOIRQgZkPO5iN1wM

Decoded:

{
  "a": "root",
  "token": "dstok",
  "t": 1668406213,
  "d": 445
}

(The --debug bit adds the decoded token.)

Syntax for adding "insert row" for everything, "update row" for all in the "data" database and "delete row" just for the docs / titles table:

datasette create-token root --secret sec \
  --all insert-row \
  --database data update-row \
  --table docs titles delete-row

The ir / ur / dr options would work too. To add multiple permissions use these options multiple times:

datasette create-token root --secret sec \
  --all insert-row \
  --all delete-row

Short versions: -a and -d and -t.

UPDATE: I have decided to use the term resource in the user-facing elements of this feature instead of table, since that can refer to a SQL view and a canned query as well.

So --resource and -r, not -t.

simonw added a commit that referenced this issue Dec 6, 2022
Also made a few tweaks to how _r works in tokens and actors,
refs #1855 - I needed that mechanism for the tests.
@simonw
Copy link
Owner Author

simonw commented Dec 6, 2022

That commit there 6da17d5 (which will be squash-merged in a PR later on) made it so that _r was correctly copied across from the token to the created actor, and fixed a bug in the code that checks permissions against it for resources.

I needed that mechanism to write a test that exercised different API permissions.

simonw added a commit that referenced this issue Dec 8, 2022
Close #1878

Also made a few tweaks to how _r works in tokens and actors,
refs #1855 - I needed that mechanism for the tests.
@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

The hardest piece here is the UI. I'm going to implement the CLI command first.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

I was going to have the CLI command throw an error if you attempt to use a permission that isn't registered with Datasette, but then I remembered that one of the uses for the CLI tool is to create signed tokens that will work against other Datasette instances (via the --secret option) that might have different plugins installed that register different permission names.

So I might have it output warnings instead.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

I'm going to rename "t" in the magic format to "r" for resource.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

Initial prototype of the create-token command changes:

diff --git a/datasette/default_permissions.py b/datasette/default_permissions.py
index 406dae40..bbe1247e 100644
--- a/datasette/default_permissions.py
+++ b/datasette/default_permissions.py
@@ -278,17 +278,55 @@ def register_commands(cli):
         help="Token should expire after this many seconds",
         type=int,
     )
+    @click.option(
+        "alls",
+        "-a",
+        "--all",
+        type=str,
+        multiple=True,
+        help="Restrict token to this permission",
+    )
+    @click.option(
+        "databases",
+        "-d",
+        "--database",
+        type=(str, str),
+        multiple=True,
+        help="Restrict token to this permission on this database",
+    )
+    @click.option(
+        "resources",
+        "-r",
+        "--resource",
+        type=(str, str, str),
+        multiple=True,
+        help="Restrict token to this permission on this database resource (a table, SQL view or named query)",
+    )
     @click.option(
         "--debug",
         help="Show decoded token",
         is_flag=True,
     )
-    def create_token(id, secret, expires_after, debug):
+    def create_token(id, secret, expires_after, alls, databases, resources, debug):
         "Create a signed API token for the specified actor ID"
         ds = Datasette(secret=secret)
         bits = {"a": id, "token": "dstok", "t": int(time.time())}
         if expires_after:
             bits["d"] = expires_after
+        if alls or databases or resources:
+            bits["_r"] = {}
+            if alls:
+                bits["_r"]["a"] = list(alls)
+            if databases:
+                bits["_r"]["d"] = {}
+                for database, action in databases:
+                    bits["_r"]["d"].setdefault(database, []).append(action)
+            if resources:
+                bits["_r"]["r"] = {}
+                for database, table, action in resources:
+                    bits["_r"]["r"].setdefault(database, {}).setdefault(
+                        table, []
+                    ).append(action)
         token = ds.sign(bits, namespace="token")
         click.echo("dstok_{}".format(token))
         if debug:

Still needs tests, plus I'd like it to use abbreviations if available to keep the token length shorter.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

I just noticed this in the existing code:

# How about for the current database?
if action in ("view-database", "view-database-download", "execute-sql"):
database_allowed = _r.get("d", {}).get(resource)
if database_allowed is not None:
assert isinstance(database_allowed, list)
if action_initials in database_allowed:
return None
# Or the current table? That's any time the resource is (database, table)
if resource is not None and not isinstance(resource, str) and len(resource) == 2:

Hard-coding those action names should not be necessary any more, especially now we have datasette.permissions for looking up metadata about the permissions.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

Help looks like this:

Usage: datasette create-token [OPTIONS] ID

  Create a signed API token for the specified actor ID

  Example:

      datasette create-token root --secret mysecret

  To only allow create-table:

      datasette create-token root --secret mysecret \
          --all create-table

  Or to only allow insert-row against a specific table:

      datasette create-token root --secret myscret \
          --resource mydb mytable insert-row

  Restricted actions can be specified multiple times using multiple --all,
  --database, and --resource options.

  Add --debug to see a decoded version of the token.

Options:
  --secret TEXT                   Secret used for signing the API tokens
                                  [required]
  -e, --expires-after INTEGER     Token should expire after this many seconds
  -a, --all ACTION                Restrict token to this action
  -d, --database DB ACTION        Restrict token to this action on this
                                  database
  -r, --resource DB RESOURCE ACTION
                                  Restrict token to this action on this
                                  database resource (a table, SQL view or
                                  named query)
  --debug                         Show decoded token
  --plugins-dir DIRECTORY         Path to directory containing custom plugins
  --help                          Show this message and exit.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

I'm going to move this code into datasette/cli.py - it's a bit unexpected having it live in default_permissions.py like this (I couldn't find the code when I went looking for it earlier).

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

For the tests for datasette create-token it would be useful if datasette --get had a mechanism for sending an Authorization: Bearer X header.

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

Decided to do the /-/create-token UI in a separate ticket:

@simonw
Copy link
Owner Author

simonw commented Dec 13, 2022

@simonw simonw closed this as completed Dec 13, 2022
@simonw simonw unpinned this issue Dec 13, 2022
@simonw simonw changed the title Ability to create API signed tokens with a reduced set of permissions datasette create-token ability to create tokens with a reduced set of permissions Dec 14, 2022
simonw added a commit that referenced this issue Dec 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant