-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Do we care about big comments in our SQL queries? #14192
Comments
I think these get de-dented by the engine before being sent over the wire.
Personally I find these much harder to read and reason about. You need to consider multiple "languages" at the same time (both the Python string syntax + SQL). I think you're missing an option 4, which I would be in favor of: move the comments out of the SQL statement. I find the verbose comments hide the query itself, making it hard to see what is actually happening. I prefer having a long explanation above and then being able to see the query itself clearly. |
Drive-by comment:
This makes it really difficult to actually test the queries as you can no longer easily copy and paste them. (I do it surprisingly frequently) |
It's mentioned in the issue description but has its detractor points listed there as well so I didn't include it as a potential solution for my preference. I see that your preference is the opposite though.
I hate the adjacent string literals too. Just seemed the closest to what we currently have while keeping the comments out of the query itself. We discussed this in the backend team sync this morning and people weren't really worried about it. @DMRobertson added onto the point that extra bytes probably isn't the bottleneck and if it is @richvdh mentioned that we should be using prepared statements anyway. Since this was more a theoretical point and no real smoke to point at solving in the first place, I am going to close this issue. |
As brought up by @erikjohnston in #13816 (comment), it's may not totally ideal to have these big SQL comments in the query we send over the wire to Postgres/SQLite.
I have been in the habit of writing some pretty verbose comments in the SQL queries I have been working on which provide context and answer the "why" around we are doing certain things. I find these comments extremely useful for breaking down a SQL query into something understandable and explaining intentions in case of bugs down the road. For example, the following explains the long-lost context around why we care about looking at non-state edges.
synapse/synapse/storage/databases/main/event_federation.py
Lines 821 to 829 in 616dcc1
These comments could be above the SQL query in comments but I feel like they make a lot more sense mixed in the query itself. Some of the comments would be hard to tie to the query statements if they were just above in a big block of paragraphs above. See this full SQL query from the example above. That's a lot of comments to try to mentally piece back to the query if it were above:
synapse/synapse/storage/databases/main/event_federation.py
Lines 795 to 866 in 616dcc1
Potential solutions
1. Does it even matter?
Do the extra comments even matter at all? Does it really affect performance in any meaningful way, whether it be over the wire, inside the SQL engines, etc.
Surely, the extra indentation spaces from the Python
"""
multi-line strings add just as many bytes if we're worried about that.2. Strip comments out of the query
Can we get away with a quick and greedy find/replace of everything between
/* ... */
? Probably not totally safe but it probably works for our queries 🤷♀️Find/replace for every call doesn't sound like the best. There are ways we could memoize and cache the stripped result though.
We already strip out the newlines and trim the whitespace from each line in
_make_sql_one_line
but that's only used in the logger and doesn't look particularly optimized by eye.3. Use adjacent string literals to mimic the same multiline string
Adjacent string literals are concatenated together when inside parenthesis which mimics the
"""
multiline string pretty well. This gives us the benefit of spreading an SQL query over multiple lines while putting the comments in between each line as before.This solution kinda seems the best if we're worried about it. Just slightly more boilerplate faff to wrangle
"
and including the space for when it's concatenated together.synapse/synapse/storage/databases/main/events_worker.py
Lines 1610 to 1623 in 616dcc1
The text was updated successfully, but these errors were encountered: