Description
Status Quo
At the core of the sqlx::query()
family of functions lies a contentious design choice that goes all the way back to the initial prototypes of SQLx:
pub fn query<'q>(query: &'q str) -> Query<'q, ...> { ... }
The query string must be a string slice, usually 'static
but can also be borrowed. This was a deliberate choice, because allowing String
to be passed directly makes it extremely tempting to simply format!()
user input into the query string instead of using bind parameters, which would easily introduce SQL injection vulnerabilities into the application. For example, to co-opt XKCD #327:
async fn insert_student(conn: &mut PgConnection, student_name: &str) -> sqlx::Result<StudentId> {
sqlx::query_scalar(format!("INSERT INTO students(name) VALUES('{}') RETURNING student_id", student_name))
.fetch_one(conn)
.await
}
// imagine this is user input, and wasn't properly sanitized
let student_name = "Robert'); DROP TABLE students; --";
let student_id = insert_student(&mut conn, &student_name).await?;
Then suddenly, you've lost all student data for the year because the query that got executed looked like this:
INSERT INTO students(name) VALUES('Robert'); DROP TABLE students; -- ') RETURNING student_id
(Postgres would reject the attempt to prepare a query string with more than one statement, but MySQL and SQLite wouldn't.)
The idea is that by requiring dereffing to a string slice, it will at least make the code start to smell a bit and hopefully lead to the user re-evaluating their implementation:
sqlx::query_scalar(&format!("INSERT INTO students(name) VALUES('{}') RETURNING student_id", student_name))
The "easier" path is intended to be using bind parameters, which by design do not introduce injection vulnerabilities because the database knows not to interpret bound values as part of the SQL. By comparison, this implementation is completely injection-safe, and is the approach that the API design was intended to push users towards:
async fn insert_student(conn: &mut PgConnection, student_name: &str) -> sqlx::Result<StudentId> {
sqlx::query_scalar("INSERT INTO students(name) VALUES(?) RETURNING student_id")
.bind(student_name)
.fetch_one(conn)
.await
}
However, it does have its downsides.
The intention is not at all clear in the design.
Time and again, users have opened issues or questions on the discussion board or asked on Discord about why the API is designed this way, and it gets very tedious to explain the reasoning (I admittedly have copy-pasted large chunks of this first part from a Q&A so I didn't have to write it again). This isn't a complaint about people asking questions, of course, it's more an expression of annoyance that the question needs to be asked in the first place.
Of course, we could just explain it better in the docs, which we should be doing anyways. But it's still an issue that it's even necessary.
The design doesn't give off very obvious signals when the user is doing something wrong.
All you have to do for the function to accept a String
is to add a &
and let auto-deref do the rest. Yeah, it's a code smell, but it reeks more of poor API design than injection vulnerability. I personally would complain about an API so inflexible that all I could pass is &str
; what if I want to pass String
or Arc<str>
?
And if all a user has to do is add a &
, is there really any opportunity for them to gain insight into why they shouldn't be doing that?
There's legitimate use-cases for passing a constructed String
if the user knows what they're doing.
This design decision is notorious for making it very difficult to implement query builders on top of SQLx, because of the lifetime issues that it introduces: #1428
pub fn sql_insert_with_return_id(....) -> QueryAs<'q, sqlx::Postgres, (i64,), PgArguments> {
let mut sql = String::from(....);
// .. continue to build string
let mut query = sqlx::query_as::<sqlx::Postgres, (i64,)>(&sql);
query // <-- Error here since query_as require sql to have same lifetime as the Query
}
Proposal
I don't want to just allow String
to be passed directly, because I think it's really important for the design of an API to avoid straight-up handing the user a footgun. We can warn about injection vulnerabilities in the docs all we want, but as long as it's the path of least resistance, some people will still end up shooting themselves. And allowing that to happen just isn't the Rust way.
I think we could copy a page from std
's notebook here and introduce something similar to UnwindSafe
: https://doc.rust-lang.org/stable/std/panic/trait.UnwindSafe.html
/// A query string that is safe to execute on a live database.
pub trait InjectionSafe {
fn sql(&self) -> &str;
}
/// Note: *only* implemented for static strings as those are highly unlikely to contain injection vulnerabilities,
/// barring compile-time shenanigans or deliberately leaking string allocations.
impl InjectionSafe for &'static str {
...
}
/// Assert that dynamic string data is safe from SQL injection.
///
/// The user assumes responsibility if constructing query strings based on untrustworthy input.
pub struct AssertInjectionSafe<T>(pub T);
impl<T: AsRef<str>> InjectionSafe for AssertInjectionSafe<T> {
...
}
pub fn query<Q: InjectionSafe>(query: Q) -> Query<Q, ...> {
...
}
Existing usage with &'static str
would remain unchanged:
let foo: i32 = sqlx::query_scalar("SELECT 1").fetch_one(&mut conn).await?;
However, usage of dynamic query strings would now be forced to assert that they do not contain SQL injection vulnerabilities, and in exchange would be freed of lifetime constraints:
// Of course, this is a blatant violation of the API contract if `table_name` cannot be trusted.
// A closed set of names like an enum would be better but that would pad out the example.
// Either way, the user has assumed responsibility here, and is aware that any vulnerabilities are on them.
pub fn build_query(table_name: &str) -> Query<AssertInjectionSafe<String>, ...> {
sqlx::query(AssertInjectionSafe(format!("SELECT * FROM {}", table_name)))
}
Of course, it is still possible to bypass using AssertInjectionSafe
, but it's a more significant code smell (and involves deliberately introducing a memory leak):
pub fn build_query(table_name: &str) -> Query<&'static str, ...> {
let query: Box<str> = format!("SELECT * FROM {}", table_name)).into();
// Memory leak!
sqlx::query(Box::leak(query))
}
And ideally, on the way the compiler would have informed them that their query string needs to implement InjectionSafe
which would have prompted them to check the docs and learn about SQL injection and why what they're trying to do is deliberately annoying.