Permissions in VectorDB #1550

Closed
vitalyshalumov opened this issue Apr 11, 2024 · 6 comments

@vitalyshalumov

Hi,
I want each user to have their own permissions for the vector DB (VDB).
How can I set permissions for each document in the VDB?

Thank you

@pseudotensor
Collaborator

Hi, at the moment every user has their own db unless it's a shared db. We don't currently have the capability to set permissions on each document within a shared db, so you can use the file system to control personal collections and their access.

E.g. you can keep all docs in one place and use hard or soft links to them: for each user, make a personal db with make_db, pointing it at a user_path that contains only the links that user needs.
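
As a rough sketch of the linking idea, the helper below fills a per-user folder with symlinks to only the documents that user may see. The function name `build_user_path` and the folder names in the usage note are hypothetical, not part of h2oGPT, and it assumes all documents sit flat in one source folder.

```python
import os
from pathlib import Path


def build_user_path(all_docs_dir, user_path, allowed_names):
    """Create user_path holding symlinks to only the allowed documents.

    Hypothetical helper: assumes every document is a plain file directly
    inside all_docs_dir (no subfolders).
    """
    src_root = Path(all_docs_dir).resolve()
    dst_root = Path(user_path)
    dst_root.mkdir(parents=True, exist_ok=True)
    linked = []
    for name in allowed_names:
        src = src_root / name
        if not src.is_file():
            raise FileNotFoundError(f"no such document: {src}")
        dst = dst_root / name
        if dst.is_symlink() or dst.exists():
            dst.unlink()  # replace a stale link or old copy
        os.symlink(src, dst)  # soft link, so the document is not duplicated
        linked.append(dst)
    return linked
```

For example, `build_user_path("all_docs", "user_path_jon", ["report.pdf"])` would leave `user_path_jon/` containing a single link, and that folder could then be passed to make_db as the user_path.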

@vitalyshalumov
Author

vitalyshalumov commented Apr 16, 2024

Can I somehow use the permission options in Qdrant or Weaviate to enable this kind of per-user control?

Or, using the CLI, for a db that contains all documents, expose different documents to each user?

@pseudotensor
Collaborator

Well, what I was suggesting relates to your second question. You can use the file system to avoid duplicates via soft/hard links in Linux, but that's a detail. In general, you can have different folders per user.

E.g. if the user is "jon" then the folders end up looking like:

(h2ogpt) jon@pseudotensor:~/h2ogpt$ ls -alrt users/jon/
total 84
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_yuppy/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_xxx/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_testsum1/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_feefef/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_dudedata/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_dogdata1/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_dogdata/
drwx------   2 jon jon  4096 Apr  8 01:49 db_dir_aaaaa/
drwx------  12 jon jon  4096 Apr  8 02:11 ./
drwx------   3 jon jon  4096 Apr  8 02:12 db_dir_asdfasdf/
drwx------   3 jon jon  4096 Apr  9 08:44 db_dir_MyData/
drwx------ 431 jon jon 36864 Apr 16 11:20 ../
(h2ogpt) jon@pseudotensor:~/h2ogpt$ 

for personal collections.
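
Since access here rests on file-system permissions (the listing shows every db folder as drwx------), a small check like the one below can flag db folders that are more open than that. This is only a sketch: `loosely_protected_dbs` is a hypothetical name, and it assumes POSIX permission bits and the db_dir_* naming seen above.

```python
import stat
from pathlib import Path


def loosely_protected_dbs(user_dir):
    """Return names of db_dir_* folders that group or others can access.

    Hypothetical helper: a folder with mode 0700 (drwx------) passes,
    anything with group/other bits set is reported.
    """
    flagged = []
    for path in sorted(Path(user_dir).glob("db_dir_*")):
        if not path.is_dir():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            flagged.append(path.name)
    return flagged
```

Running `loosely_protected_dbs("users/jon")` on the folder above should return an empty list if every db folder is private to jon.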

To make such dbs one would do:

python src/make_db.py --user_path=user_path_jon --collection_name=JonData --langchain_type=personal --hf_embedding_model=hkunlp/instructor-large --persist_directory=users/jon/db_dir_JonData
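
If you need this for several users, the same command can be generated per user. The sketch below only builds the argument list with the flags shown above; `make_db_command` is a hypothetical helper, and the user_path_<user> and users/<user>/db_dir_<collection> naming is just the pattern from this example.

```python
import shlex


def make_db_command(user, collection_name,
                    embedding_model="hkunlp/instructor-large"):
    """Build the make_db.py argument list for one user's personal db.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "python", "src/make_db.py",
        f"--user_path=user_path_{user}",
        f"--collection_name={collection_name}",
        "--langchain_type=personal",
        f"--hf_embedding_model={embedding_model}",
        f"--persist_directory=users/{user}/db_dir_{collection_name}",
    ]


# Print the command for the example user; pass the list to subprocess.run
# from the h2ogpt checkout if you want to execute it.
print(shlex.join(make_db_command("jon", "JonData")))
```

Keeping the command as a list rather than one string avoids shell-quoting problems if a user or collection name ever contains unusual characters.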

Then you'll have:

(h2ogpt) jon@pseudotensor:~/h2ogpt$ ls -alrt users/jon/db_dir_JonData/
total 264
drwx------ 13 jon jon   4096 Apr 16 12:28 ../
drwx------  2 jon jon   4096 Apr 16 12:28 d7ccacb6-93fe-4380-9340-b7f5edffb655/
-rw-------  1 jon jon 249856 Apr 16 12:28 chroma.sqlite3
-rw-------  1 jon jon     41 Apr 16 12:28 embed_info
drwx------  3 jon jon   4096 Apr 16 12:28 ./
(h2ogpt) jon@pseudotensor:~/h2ogpt$ 

You can add that database to the user's entry in auth.json, if you are using an auth.json-type file, and they will see it when they log in.

Or you can have the user add that collection by name (JonData), i.e. at first the user might see:

image

but if they do:

image

and hit enter, then they will see:

image

and you'll see that the document count (3) is what I'd expect from what was in the original input folder.

@pseudotensor
Collaborator

I think this is sufficient for now.

@vitalyshalumov
Author

Thank you. One, hopefully last, question: can I approach this from a document-selection perspective, i.e. select only specific documents at the launch of Gradio (an option that is available in the UI)?

@pseudotensor
Collaborator

Yes, that is the CLI option --document_choice, which takes a list of document names. Each name has to exactly match one of those that appear in the UI or via the API.
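
Because the names must match exactly, it may help to check a per-user list before launch. The sketch below is not h2oGPT code: `validate_document_choice` is a hypothetical helper, and `available` stands for whatever list of document names you copied from the UI or fetched via the API.

```python
def validate_document_choice(requested, available):
    """Return requested names unchanged if each exactly matches a known name.

    Hypothetical helper: comparison is case-sensitive and whitespace-sensitive,
    mirroring the exact-match requirement.
    """
    known = set(available)
    unknown = [name for name in requested if name not in known]
    if unknown:
        raise ValueError(f"no exact match for: {unknown}")
    return list(requested)
```

For example, `validate_document_choice(["a.pdf"], ["a.pdf", "b.pdf"])` returns `["a.pdf"]`, while a name such as `"A.pdf"` raises a ValueError instead of silently selecting nothing.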
