Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix tests for change in PostgreSQL 14 behavior change. #14310

Merged
merged 3 commits into from
Oct 27, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/14310.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Allow use of postgres and sqllite full-text search operators in search queries.
5 changes: 2 additions & 3 deletions synapse/storage/databases/main/search.py
Original file line number Diff line number Diff line change
Expand Up @@ -824,9 +824,8 @@ def _tokenize_query(query: str) -> TokenList:
in_phrase = False
parts = deque(query.split('"'))
for i, part in enumerate(parts):
# The contents inside double quotes is treated as a phrase, a trailing
# double quote is not implied.
in_phrase = bool(i % 2) and i != (len(parts) - 1)
# The contents inside double quotes is treated as a phrase.
in_phrase = bool(i % 2)
Comment on lines -827 to +828
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check: this means that a trailing quote is now implied?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. 👍


# Pull out the individual words, discarding any non-word characters.
words = deque(re.findall(r"([\w\-]+)", part, re.UNICODE))
Expand Down
15 changes: 11 additions & 4 deletions tests/storage/test_room_search.py
Original file line number Diff line number Diff line change
Expand Up @@ -239,7 +239,6 @@ class MessageSearchTest(HomeserverTestCase):
("fox -nope", (True, False)),
("fox -brown", (False, True)),
('"fox" quick', True),
('"fox quick', True),
('"quick brown', True),
('" quick "', True),
('" nope"', False),
Expand Down Expand Up @@ -269,6 +268,14 @@ def prepare(
response = self.helper.send(self.room_id, self.PHRASE, tok=self.access_token)
self.assertIn("event_id", response)

# The behaviour of a missing trailing double quote changed in PostgreSQL 14
# from ignoring the initial double quote to treating it as a phrase.
Comment on lines +271 to +272
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check: we've changed our parser to mirror this behaviour?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so SQLite will match the behavior of PostgreSQL 14. PostgreSQL 11 - 13 will still use whatever their internal function call does. (And PostgreSQL 10 fallsback to plainto_tsquery anyway so let's just ignore talking about that.)

main_store = homeserver.get_datastores().main
found = False
if isinstance(main_store.database_engine, PostgresEngine):
found = main_store.database_engine._version < 140000
self.COMMON_CASES.append(('"fox quick', (found, True)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check: I think this means that:

  • the first "chunk" of the phrase (now fox without a leading quote) is found on PG < 14 and SQLite, but not on PG >= 14
  • the second chunk (quick) is found in all cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, what it means is that:

  • fox and quick are both found on PG < 14
  • The phrase "fox quick" is not found on PG >= 14 and SQLite.


def test_tokenize_query(self) -> None:
"""Test the custom logic to tokenize a user's query."""
cases = (
Expand All @@ -280,9 +287,9 @@ def test_tokenize_query(self) -> None:
("fox -brown", ["fox", SearchToken.Not, "brown"]),
("- fox", [SearchToken.Not, "fox"]),
('"fox" quick', [Phrase(["fox"]), SearchToken.And, "quick"]),
# No trailing double quoe.
('"fox quick', ["fox", SearchToken.And, "quick"]),
('"-fox quick', [SearchToken.Not, "fox", SearchToken.And, "quick"]),
# No trailing double quote.
('"fox quick', [Phrase(["fox", "quick"])]),
('"-fox quick', [Phrase(["-fox", "quick"])]),
('" quick "', [Phrase(["quick"])]),
(
'q"uick brow"n',
Expand Down