-
Notifications
You must be signed in to change notification settings - Fork 1
Models actions from v1 to v2 protocol #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/api.jl
Outdated
| tx = Transaction(ctx.region, database, engine, "OPEN"; readonly = false) | ||
| actions = [_make_delete_models_action([model])] | ||
| return _post(ctx, PATH_TRANSACTION; query = query(tx), body = body(tx, actions...), kw...) | ||
| return exec(ctx, database, engine, """def delete:rel:catalog:model["$(model)"] =rel:catalog:model["$(model)"]"""; readonly=false, kw...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a security vulnerability, since the name model may include injection attacks.
It's tricky to get this right. It used to be that this was sufficient: $(repr(model)) instead of "$(model)", but now there will still be issues if the name includes %` since this an interpolation character in Rel.
I think the right answer is that the SDK needs to provide an escape_string_for_rel(str) function.
I filed an issue for this here: https://github.com/RelationalAI/rai-sdk-issues/issues/68
Can you maybe add that as part of this PR? Does the problem make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nathan can you elaborate on how this in itself is a security vuln? i believe the vuln is if an app developer takes input directly from some UI and inserts it here without any sanitization? i agree that from a correctness pov, we should escape, but not sure about the vuln. also, didnt we agree that query args was the better way to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think my comment predates the slack discussion where we agreed on query args as the better way to handle this. I believe i had just forgotten about that when i proposed the above.
i believe the vuln is if an app developer takes input directly from some UI and inserts it here without any sanitization?
yes, that's the vulnerability I had in mind. You're right that it's probably on the developer to sanitize the inputs if they come from users, but it seems like something we could do for them to close the potential attack vector. That's all i meant.
src/api.jl
Outdated
| tx = Transaction(ctx.region, database, engine, "OPEN"; readonly = false) | ||
| actions = [_make_load_model_action(name, model) for (name, model) in models] | ||
| return _post(ctx, PATH_TRANSACTION; query = query(tx), body = body(tx, actions...), kw...) | ||
| queries = ["""def insert:rel:catalog:model["$(name)"] = \"\"\"$(models[name])\"\"\"""" for name in keys(models)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, it's very important to use $(repr(models[name])) instead of """$(models[name])""" to avoid (even accidental) injection attacks.
Otherwise, if the users' model file includes """ anywhere, this will be a syntax error. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(But, like above, you'll want to use https://github.com/RelationalAI/rai-sdk-issues/issues/68 to also address % in the user's model.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep totally makes sense, thank you for pointing this out 🙏
NHDaly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks @NRHelmi! Yeah this looks great. I mainly left comments about making sure we properly escape the strings to prevent (accidental or purposeful) string injection issues.
Otherwise looks great. Thanks!
src/api.jl
Outdated
| """ def insert:rel:catalog:model[$(_escape_string_for_rel(name))] = $(_escape_string_for_rel(models[name])) """ | ||
| for name in keys(models) | ||
| ] | ||
| println(queries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this on purpose, or just for debugging?
If we want to log this, it probably needs to be more structured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a left from debugging 😅
NHDaly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM, minus my request for round-trip tests.
|
Also, after the discussion on slack, i'm not sure if this approach or the inputs is better, to stay consistent with the other repos. But i think since you've got something working, it's probably fine to stay with this. |
|
|
||
| export | ||
| delete_model, | ||
| delete_models, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are making a breaking change like this, we'll have to bump the version number in the Project.toml to be a breaking change.
It would probably be better to leave the old names, and mark them as deprecated, and print a deprecation warning, like we did in #61, for create_database:
Lines 398 to 402 in d2da71d
| if !isempty(engine) | |
| @warn "DEPRECATED: Passing an `engine` is no longer required for creating a" * | |
| " database. This will be removed in a future release. Please update your call" * | |
| " `create_database(ctx, name)`." | |
| end |
In this case, it'd be even easier, you just keep the old method, print the deprecation warning, and forward to the new method.
Then we can delete these when we bump the major version number in a future release. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if you'd rather not do this now, you can just bump the major version number now, and make this a breaking change, but I would check with @torkins and @bradlo about whether we have the appetite for a breaking release here, or if it'd be better to do this via deprecation warning + new function. I'd personally rather go the non-breaking approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically, we are only required to bump the version when we release a version. as for leaving deprecated entries, its really a judgement call .. given the number of breaking changes we are and will be making, there is real overhead to doing this, otoh if its a widely used entry point, then it may be worth it. and if its obscure, it may not. the only hard requirement is that all feature adds and breaking changes are noted in the CHANGEME.md file.
NHDaly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 awesome, thanks. My last comment is about the breaking change. sorry i didn't notice that earlier. The rest of the changes look good, thanks for addressing the comments
| get_model, | ||
| list_models, | ||
| load_model | ||
| load_models |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, related to the above, can you add a line to the CHANGELOG?
src/api.jl
Outdated
| function load_models(ctx::Context, database::AbstractString, engine::AbstractString, models::Dict; kw...) | ||
| queries = [] | ||
| queries_inputs = Dict() | ||
| rand_uint = rand(UInt32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 32 bits and not 64 (natural word length)
src/api.jl
Outdated
|
|
||
| function get_model(ctx::Context, database::AbstractString, engine::AbstractString, name::AbstractString; kw...) | ||
| out_name = "model$(rand(UInt32))" | ||
| query = """ def output:$out_name = rel:catalog:model[$(_escape_string_for_rel(name))] """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the spaces at beginning and end of string?
Use v2 protocol for model actions