Skip to content

feat(project-tree): rbac #30933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 9, 2025
Merged

feat(project-tree): rbac #30933

merged 10 commits into from
Apr 9, 2025

Conversation

mariusandra
Copy link
Collaborator

@mariusandra mariusandra commented Apr 8, 2025

Problem

The file system doesn't support role based access controls.

Changes

  • Add AccessControl filtering to file system list
  • Also add it to all other file system endpoints
  • Remove returning results for the "Unfiled" endpoint (skips filtering and we're not using them anyway)

Small caveat

Caveat: I understand currently not all types are implemented for access control, which includes hog functions... However if/when implemented, the hog_function resource types won't work out of the box. The file system stores the type as "hog_function/destination", whereas I assume access controls will just store "hog_function". We might want to limit access per hog function type... or we might want to refactor the file system to add a "subtype" field. I'm not sure, but I don't see anything really blocking here...

How did you test this code?

Just with unit tests until now.


# 3) Annotate the queryset with 'effective_access_level'.
queryset = queryset.annotate(
effective_access_level=Coalesce(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the new access control system we aren't using effective_access_level, any uses of it are legacy. In this case, I see why you did it because it's not the exact resource but a pointer to it. Just wanted to call that out.


# 4) Exclude items that have 'none' (0) unless user is the creator
# i.e. skip items where effective_access_level=0 AND created_by != user
queryset = queryset.exclude(Q(effective_access_level=0) & ~Q(created_by=user))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you're mapping these to integers for easy filtering, but we tried to avoid using numbers because adding new access levels would make this hard to change.

We have the notion of ordered_access_levels, which was meant to solve this.

Could we instead annotate the access control enum and exclude where it's "none" to be more explicit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mariusandra mariusandra requested a review from zlwaterfield April 8, 2025 19:38
@mariusandra mariusandra marked this pull request as ready for review April 8, 2025 21:33
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR integrates RBAC into the file system API by filtering querysets based on user access control and updating outcomes for endpoints like "unfiled".

  • Modified /posthog/api/file_system.py to apply RBAC filtering and prevent users from accessing unauthorized file system objects.
  • Introduced and utilized filter_and_annotate_file_system_queryset in /posthog/rbac/user_access_control.py.
  • Updated tests in /posthog/api/test/test_file_system.py to assert RBAC restrictions and change the "unfiled" endpoint to return only a count.
  • Extended RBAC tests in /posthog/rbac/test/test_user_access_control.py to cover new file system scenarios.

4 file(s) reviewed, 2 comment(s)
Edit PR Review Bot Settings | Greptile

Copy link
Contributor

@zlwaterfield zlwaterfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just comments, no blockers

FileSystem.objects.filter(team=self.team, path__startswith=f"{instance.path}/").delete()
qs = FileSystem.objects.filter(team=self.team, path__startswith=f"{instance.path}/")
if self.user_access_control:
qs = self.user_access_control.filter_and_annotate_file_system_queryset(qs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting - what happens to things in it where they don't have access? This is just deleting the file system not the actual resource, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently deletes are only at the filesystem level. It'll probably remain like this for a bit (as I'll be off for a week now), though the plan is to move it to the actual object level. If you wouldn't have access, those files would just linger on. Someone else with access will then see the folder with whatever is left.

@@ -444,6 +446,74 @@ def filter_queryset_by_access_level(self, queryset: QuerySet, include_all_if_adm

return queryset

def filter_and_annotate_file_system_queryset(self, queryset: QuerySet["FileSystem"]) -> QuerySet["FileSystem"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're creative 😉

This looks good to me

@@ -138,12 +139,18 @@ def safely_get_queryset(self, queryset: QuerySet) -> QuerySet:
if self.action == "list":
queryset = queryset.order_by("path")

if self.user_access_control:
queryset = self.user_access_control.filter_and_annotate_file_system_queryset(queryset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes in the viewset look good but I do wonder if the access control filter should be moved into the model manager so you don't have to think about it. Something to ponder. I just worry if it needs to be used in every location then it could get forgotten.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder the same. I just added a count_by_path endpoint, and had to now merge with master and add the access control check manually.

I did think about making this check automatic, but decided against it because I wasn't 100% sure of the Django API here and of all the places to add it. I was afraid of accidentally missing something, and causing leakage tha way. I guess this is good enough for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good for now but worth relooking at.

Bascially just add a custom manager to the FileSystem model that overwrites the queryset. The tricky part would be making sure the manager has access to the user access control which it might not be possible.

Quick psuedo code:

class FileSystemManager(RootTeamManager):
    def get_queryset(self):
        qs = super().get_queryset()
        qs = self.user_access_control.filter_and_annotate_file_system_queryset(qs)
        return qs

class FileSystem(models.Model):
    objects = FileSystemManager()
    
    ...
    

Copy link
Collaborator Author

@mariusandra mariusandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 🙌

FileSystem.objects.filter(team=self.team, path__startswith=f"{instance.path}/").delete()
qs = FileSystem.objects.filter(team=self.team, path__startswith=f"{instance.path}/")
if self.user_access_control:
qs = self.user_access_control.filter_and_annotate_file_system_queryset(qs)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently deletes are only at the filesystem level. It'll probably remain like this for a bit (as I'll be off for a week now), though the plan is to move it to the actual object level. If you wouldn't have access, those files would just linger on. Someone else with access will then see the folder with whatever is left.

@@ -138,12 +139,18 @@ def safely_get_queryset(self, queryset: QuerySet) -> QuerySet:
if self.action == "list":
queryset = queryset.order_by("path")

if self.user_access_control:
queryset = self.user_access_control.filter_and_annotate_file_system_queryset(queryset)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder the same. I just added a count_by_path endpoint, and had to now merge with master and add the access control check manually.

I did think about making this check automatic, but decided against it because I wasn't 100% sure of the Django API here and of all the places to add it. I was afraid of accidentally missing something, and causing leakage tha way. I guess this is good enough for now?

@mariusandra mariusandra enabled auto-merge (squash) April 9, 2025 14:24
@mariusandra mariusandra merged commit f617252 into master Apr 9, 2025
93 checks passed
@mariusandra mariusandra deleted the project-tree-rbac branch April 9, 2025 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants