Skip to content

Conversation

jedelbo
Copy link
Contributor

@jedelbo jedelbo commented Jul 9, 2024

What, How & Why?

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • C-API, if public C++ API changed
  • bindgen/spec.yml, if public C++ API changed

if (m_type == Type::interned_strings) {
return StringID(static_cast<Array*>(m_arr)->get(ndx));
}
auto str = get(ndx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const auto

{
if (m_needles.empty()) {
return m_leaf->find_first(m_string_value, start, end);
return m_leaf->find_first(m_string_value, start, end, m_interned_string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see, you are reusing the ID if we got already one, as it was before. NIT, maybe can we call m_interned_string_id to differentiate from m_string_interner

public:
constexpr static bool case_sensitive_comparison =
is_any_v<TConditionFunction, Greater, GreaterEqual, Less, LessEqual>;
is_any_v<TConditionFunction, NotEqual, Greater, GreaterEqual, Less, LessEqual>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding NotEqual to case-insensitive comparison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it was missing. Aparently we did not have any tests for that. (And it is case sentitive, not insensitive)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to make a failing test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - it works anyway. But in many cases we construct the m_ucase and m_lcase when they are not needed.

const auto id = m_leaf->get_string_id(s);
if (m_string_interner->compare(*m_interned_string, *id) != 0)
return s;
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This continue is a bit tricky here. Essentially we want to skip getting the string by index s at line 1702, since we know that we won't find any match. Maybe a small comment just to be sure we are not going to delete this continue in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have re-arranged the code a bit.

@jedelbo jedelbo merged commit 9653f0f into nc/eq_neq_string_compression_query Jul 10, 2024
@jedelbo jedelbo deleted the je/eq_neq_string_compression_query branch July 10, 2024 09:27
nicola-cab added a commit that referenced this pull request Jul 10, 2024
* find_first optimization for compressed strings

* core test passing

* compression tests for collection of strings

* code review

* Fixes (#7872)

---------

Co-authored-by: Jørgen Edelbo <jorgen.edelbo@mongodb.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 9, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants