Description
Suppose someone is using this library to adding a star to a repository, where the repository ID comes from user input. If they don't know about GraphQL variables and they're being careless, they might implement it like this:
async function addStar(repoId) {
const graphql = require('@octokit/graphql').defaults({
headers: {
authorization: `token secret123`
}
});
const results = await graphql(`
mutation {
addStar(input: {clientMutationId: "x", starrableId: "${repoId}"}) {
clientMutationId
}
}
`);
}
This example code is vulnerable to a "GraphQL injection" attack, where someone could modify the structure of the query or perform additional actions by supplying malicious user input. (This works in a similar manner to a SQL injection attack.) For example, a malicious user could provide the following string in this case to maliciously add a topic to a repository:
some-repo-id"}) {
clientMutationId
}
updateTopics(input: {clientMutationId: "y", topicNames:["evil-topic"], repositoryId: "some-other-repo-id
...which would result in the following malicious query getting executed after string concatenation happens:
mutation {
addStar(input: {clientMutationId: "x", starrableId: "some-repo-id"}) {
clientMutationId
}
updateTopics(input: {clientMutationId: "y", topicNames:["evil-topic"], repositoryId: "some-other-repo-id"}) {
clientMutationId
}
}
It's true that there's a way to rewrite this function to be safer (by using GraphQL variables), and similarly SQL engines usually allow for prepared statements. However, in SQL's case, developers still frequently shoot themselves in the foot and add injection vulnerabilities by using string concatenation, either by mistake or due to not knowing better. As a result, the broad consensus is that it's better if SQL libraries prevent string concatenation entirely (e.g. by not accepting strings as arguments). I think it would be a good idea if this library did a similar thing for GraphQL queries.
What would you think about the following alternative API?
// note: tagged template, not a regular function call
// ↓
const results = await graphql`
mutation {
addStar(input: {clientMutationId: "x", starrableId: ${repoId}}) {
clientMutationId
}
}
`;
The library would expose a template tag function which either escapes embedded expressions or replaces them all embedded expressions with GraphQL variables, and adds them to the query appropriately. As far as I can tell, this would make it extremely difficult for someone to shoot themself in the foot and introduce an injection attack, because there would be no easy way to concatenate a string while still calling a template tag function.
If someone needed to use additional arguments (e.g. headers), they could use graphql.defaults
:
const results = await graphql.defaults({ /* ... */ })`
mutation { ... }
`
(With this design, you would probably want to guard against someone accidentally calling graphql(`foo`)
or graphql([`foo`])
instead of graphql`foo`
. This could be accomplished by making sure the first argument to the graphql
function is an array with a raw
property.)
Thanks for considering -- I don't mean to drop into the issue tracker and tell you how you should be designing your library. That said, I think a change like this would prevent a significant number of security bugs, and it would be easier to make earlier before compatibility is too big a concern.