Skip to content

Conversation

@NRHelmi
Copy link
Contributor

@NRHelmi NRHelmi commented Sep 19, 2022

Use v2 protocol for model actions

src/api.jl Outdated
tx = Transaction(ctx.region, database, engine, "OPEN"; readonly = false)
actions = [_make_delete_models_action([model])]
return _post(ctx, PATH_TRANSACTION; query = query(tx), body = body(tx, actions...), kw...)
return exec(ctx, database, engine, """def delete:rel:catalog:model["$(model)"] =rel:catalog:model["$(model)"]"""; readonly=false, kw...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a security vulnerability, since the name model may include injection attacks.

It's tricky to get this right. It used to be that this was sufficient: $(repr(model)) instead of "$(model)", but now there will still be issues if the name includes %` since this an interpolation character in Rel.

I think the right answer is that the SDK needs to provide an escape_string_for_rel(str) function.

I filed an issue for this here: https://github.com/RelationalAI/rai-sdk-issues/issues/68

Can you maybe add that as part of this PR? Does the problem make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nathan can you elaborate on how this in itself is a security vuln? i believe the vuln is if an app developer takes input directly from some UI and inserts it here without any sanitization? i agree that from a correctness pov, we should escape, but not sure about the vuln. also, didnt we agree that query args was the better way to handle this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i think my comment predates the slack discussion where we agreed on query args as the better way to handle this. I believe i had just forgotten about that when i proposed the above.

i believe the vuln is if an app developer takes input directly from some UI and inserts it here without any sanitization?

yes, that's the vulnerability I had in mind. You're right that it's probably on the developer to sanitize the inputs if they come from users, but it seems like something we could do for them to close the potential attack vector. That's all i meant.

src/api.jl Outdated
tx = Transaction(ctx.region, database, engine, "OPEN"; readonly = false)
actions = [_make_load_model_action(name, model) for (name, model) in models]
return _post(ctx, PATH_TRANSACTION; query = query(tx), body = body(tx, actions...), kw...)
queries = ["""def insert:rel:catalog:model["$(name)"] = \"\"\"$(models[name])\"\"\"""" for name in keys(models)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, it's very important to use $(repr(models[name])) instead of """$(models[name])""" to avoid (even accidental) injection attacks.

Otherwise, if the users' model file includes """ anywhere, this will be a syntax error. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(But, like above, you'll want to use https://github.com/RelationalAI/rai-sdk-issues/issues/68 to also address % in the user's model.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep totally makes sense, thank you for pointing this out 🙏

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks @NRHelmi! Yeah this looks great. I mainly left comments about making sure we properly escape the strings to prevent (accidental or purposeful) string injection issues.

Otherwise looks great. Thanks!

@NRHelmi NRHelmi requested a review from NHDaly September 26, 2022 13:39
src/api.jl Outdated
""" def insert:rel:catalog:model[$(_escape_string_for_rel(name))] = $(_escape_string_for_rel(models[name])) """
for name in keys(models)
]
println(queries)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this on purpose, or just for debugging?
If we want to log this, it probably needs to be more structured?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a left from debugging 😅

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM, minus my request for round-trip tests.

@NHDaly
Copy link
Member

NHDaly commented Oct 3, 2022

Also, after the discussion on slack, i'm not sure if this approach or the inputs is better, to stay consistent with the other repos. But i think since you've got something working, it's probably fine to stay with this.

NRHelmi and others added 3 commits October 4, 2022 19:37
Co-authored-by: Nathan Daly <nathan.daly@relational.ai>

export
delete_model,
delete_models,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are making a breaking change like this, we'll have to bump the version number in the Project.toml to be a breaking change.

It would probably be better to leave the old names, and mark them as deprecated, and print a deprecation warning, like we did in #61, for create_database:

rai-sdk-julia/src/api.jl

Lines 398 to 402 in d2da71d

if !isempty(engine)
@warn "DEPRECATED: Passing an `engine` is no longer required for creating a" *
" database. This will be removed in a future release. Please update your call" *
" `create_database(ctx, name)`."
end

In this case, it'd be even easier, you just keep the old method, print the deprecation warning, and forward to the new method.

Then we can delete these when we bump the major version number in a future release. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if you'd rather not do this now, you can just bump the major version number now, and make this a breaking change, but I would check with @torkins and @bradlo about whether we have the appetite for a breaking release here, or if it'd be better to do this via deprecation warning + new function. I'd personally rather go the non-breaking approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, we are only required to bump the version when we release a version. as for leaving deprecated entries, its really a judgement call .. given the number of breaking changes we are and will be making, there is real overhead to doing this, otoh if its a widely used entry point, then it may be worth it. and if its obscure, it may not. the only hard requirement is that all feature adds and breaking changes are noted in the CHANGEME.md file.

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 awesome, thanks. My last comment is about the breaking change. sorry i didn't notice that earlier. The rest of the changes look good, thanks for addressing the comments

get_model,
list_models,
load_model
load_models
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, related to the above, can you add a line to the CHANGELOG?

src/api.jl Outdated
function load_models(ctx::Context, database::AbstractString, engine::AbstractString, models::Dict; kw...)
queries = []
queries_inputs = Dict()
rand_uint = rand(UInt32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 32 bits and not 64 (natural word length)

src/api.jl Outdated

function get_model(ctx::Context, database::AbstractString, engine::AbstractString, name::AbstractString; kw...)
out_name = "model$(rand(UInt32))"
query = """ def output:$out_name = rel:catalog:model[$(_escape_string_for_rel(name))] """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the spaces at beginning and end of string?

@NRHelmi NRHelmi merged commit 7d54b71 into main Oct 12, 2022
@NRHelmi NRHelmi deleted the hnr-models-v2 branch October 12, 2022 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants