SciCat profiles #37

Open
jl-wynen opened this issue Jan 5, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@jl-wynen
Collaborator

jl-wynen commented Jan 5, 2023

This is a suggestion for making the setup of clients more user-friendly.

Description

Add 'profiles' (name up for debate) that define an API URL, file transfer, and potentially more, so that we can make a client using:

# builtin profile
client = Client.from_token(profile='ess', token=...)

# from a file
client = Client.from_token(profile='my-profile.toml', token=...)

# programmatically
profile = Profile(url=..., file_transfer=...)
client = Client.from_token(profile=profile, token=...)

and the profile would be along these lines:

url = "https://ess.scicat.eu/api/v3"

[[file_transfer]]
type = "link"

[[file_transfer]]
type = "ssh"
host = "login.dmsc.dk"
remote_base_path = "/ess/data"

This would define a client that talks to the production instance at ESS. For files, it would first attempt to symlink them if we have direct access to the file system (this needs to be implemented separately) and fall back to SSH if that is not possible.

Users, maintainers, or admins at other facilities can then write their own profiles and either integrate them into Scitacean or provide them in a different way.

Further attributes can be added if need be (e.g. how to authenticate with the file server).

This would not replace the current mechanism but would be an alternative.

Benefits

  • simplifies creation of clients if a profile is available
  • users no longer have to know URLs and other details
  • is explicit: it doesn't read the setup from the environment like it would with config files, so there is a reduced risk of writing to the wrong SciCat instance

Drawbacks

  • is explicit: the user still needs to do something; admins on a cluster or VM cannot provide a config file that does everything behind the scenes
  • it may be hard to encode everything in a file (TOML or otherwise)
@jl-wynen jl-wynen added the enhancement New feature or request label Jan 5, 2023
@jl-wynen
Collaborator Author

Concerning file transfers, there are additional options for finding out what to do:

  • Dataset.source_folder_host: Can encode protocol and host URL
  • Can potentially add a new endpoint to SciCat
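For the first option, a minimal sketch of how a client could split such a value into protocol and host. It assumes the field holds either a bare host name or a string of the form `<protocol>://<host>`; that format is an assumption for illustration, and `parse_source_folder_host` is a hypothetical helper.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_source_folder_host(value: str) -> Tuple[Optional[str], str]:
    """Split a source_folder_host value into (protocol, host).

    Assumes the value is either a bare host name or '<protocol>://<host>'.
    The protocol is None if the value does not encode one.
    """
    if "://" in value:
        parts = urlsplit(value)
        return parts.scheme, parts.hostname or ""
    return None, value


print(parse_source_folder_host("ssh://login.dmsc.dk"))
print(parse_source_folder_host("login.dmsc.dk"))
```

The client could then map the protocol to a file transfer and fall back to a default when no protocol is encoded.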

@bpedersen2

I would prefer to add these options to the SciCat backend in the dataset, as the storage can differ depending on the dataset (we plan to store smaller datasets in S3, available via an HTTPS broker, with direct S3 access as an alternative option, and large datasets (>>1TB) in another storage system). So enhancing the way the data access is stored in SciCat is probably the way to go.

@jl-wynen
Collaborator Author

jl-wynen commented Feb 1, 2024

Interesting. How are you going to handle this information? Does the user have to specify it during upload or will it be assigned automatically by the backend? If it's the latter, then the client (Scitacean) still needs to know how to upload the files.
