Skip to content

Conversation

gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Aug 26, 2025

Description

This PR is the fourth in a series of PRs (#1163 for the third) that is focused on sharing core parsing and search code between clp and clp-s. This will be the final PR in this series for now, though we plan on doing some follow-up work to improve the shared subquery generation code.

The changes of this PR consist of

  1. Defining an interface for variable and log type dictionary readers and writers as well as their entries as concepts in DictionaryConcepts.hpp.
  2. Using those concepts in EncodedVariableInterpreter.hpp and GrepCore.hpp to ensure that callers pass dictionaries and dictionary entries that conform to the expected interface.

The previous PRs in this series mostly already modified clp and clp-s dictionary implementations to comply with the new interface, though we do make one small change to the clp-s dictionary reader templates to correct some Id types.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Validated that tests still pass.

Summary by CodeRabbit

  • Refactor

    • Renamed public dictionary type aliases from entry_t to Entry.
    • Converted public templates to use concept-constrained parameters and rely on nested Entry types, improving compile-time type safety without behavioural changes.
    • Adjusted query and encoding APIs to use the updated type aliases and constraints.
  • Chores

    • Updated build lists to include new concept headers across targets and unit tests to ensure consistent compilation.

@gibber9809 gibber9809 requested review from wraymo and a team as code owners August 26, 2025 19:42
Copy link
Contributor

coderabbitai bot commented Aug 26, 2025

Walkthrough

Adds C++20 concept headers constraining logtype/variable dictionary entries/readers/writers; updates templates in EncodedVariableInterpreter.hpp and GrepCore.hpp to use those concepts and the reader-provided Entry alias; renames public alias entry_t→Entry; adjusts dictionary ID typedefs; and adds new headers to multiple CMakeLists.

Changes

Cohort / File(s) Summary
New concepts
components/core/src/clp/LogTypeDictionaryEntryReq.hpp, components/core/src/clp/LogTypeDictionaryReaderReq.hpp, components/core/src/clp/VariableDictionaryEntryReq.hpp, components/core/src/clp/VariableDictionaryReaderReq.hpp, components/core/src/clp/VariableDictionaryWriterReq.hpp
Adds C++20 concepts that constrain required interfaces and dictionary ID types for logtype/variable dictionary entries, readers, and writers (compile-time requirements only).
Template APIs updated to use concepts
components/core/src/clp/EncodedVariableInterpreter.hpp, components/core/src/clp/GrepCore.hpp
Replaces unconstrained typename template parameters with the new concept-constrained template parameters; derives Entry types from reader nested aliases; updates out-of-class definitions to match. No runtime/control-flow changes.
Public alias renames & ID typedef adjustments
components/core/src/clp/DictionaryReader.hpp, components/core/src/clp/DictionaryWriter.hpp, components/core/src/clp_s/DictionaryReader.hpp, components/core/src/clp_s/DictionaryWriter.hpp
Renames public alias entry_tEntry; in clp_s DictionaryReader adds Defs.h include and switches dictionary ID typedefs to clp-specific typedefs.
CMake: add new headers to targets
components/core/CMakeLists.txt, components/core/src/clp/clg/CMakeLists.txt, components/core/src/clp/clo/CMakeLists.txt, components/core/src/clp/clp/CMakeLists.txt, components/core/src/clp_s/CMakeLists.txt, components/core/src/clp_s/indexer/CMakeLists.txt
Adds the new *Req.hpp headers to various target/source lists (unit tests, clg, clo, clp, clp_s, indexer). Build-system bookkeeping only.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title concisely and accurately summarizes the primary change: refactoring core code to use C++20 concepts to enforce interfaces for logtype and variable dictionary entries, readers, and writers, which aligns with the PR objectives and the file-level changes and avoids noisy details.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
components/core/src/clp/GrepCore.hpp (1)

275-283: Style: Prefer 'false == expr' over 'expr == false'

Project guideline says: prefer false == <expression> rather than !<expression>. Here we have an == false comparison (later in the function body). Swap operand order for consistency.

Apply:

-            == false)
+            == false)  // Follow-up below

And for the comparison at Line 296 in this file:

-        if (EncodedVariableInterpreter::encode_and_search_dictionary(
+        if (false == EncodedVariableInterpreter::encode_and_search_dictionary(
                     query_token.get_value(),
                     var_dict,
                     ignore_case,
                     logtype,
                     sub_query
-            )
-            == false)
+            ))
         {
             // Variable doesn't exist in dictionary
             return false;
         }
components/core/src/clp/EncodedVariableInterpreter.hpp (2)

175-184: Constrain the encoded-vars container to prevent hard-to-diagnose template errors

This function assumes random access, size(), and element type compatible with encoded_variable_t. Constrain the container to a random-access range of encoded_variable_t to catch misuse at compile-time.

Apply:

-    template <
-            LogTypeDictionaryEntryReq LogTypeDictionaryEntryType,
-            VariableDictionaryReaderReq VariableDictionaryReaderType,
-            typename EncodedVariableContainerType>
+    #include <ranges>
+    template <
+            LogTypeDictionaryEntryReq LogTypeDictionaryEntryType,
+            VariableDictionaryReaderReq VariableDictionaryReaderType,
+            std::ranges::random_access_range EncodedVariableContainerType>
+        requires std::same_as<std::ranges::range_value_t<EncodedVariableContainerType>,
+                              encoded_variable_t>
     static bool decode_variables_into_message(

Note: Add the #include <ranges> at the top of this header.


555-562: No-op, but tiny nit: prefer taking pointer as const pointer explicitly in loop

In wildcard_search_dictionary_and_get_encoded_matches, iterating with for (auto entry : var_dict_entries) copies the pointer value. Using const auto* better communicates intent.

Apply:

-    for (auto entry : var_dict_entries) {
+    for (const auto* entry : var_dict_entries) {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9b4c1c1 and e9dbd71.

📒 Files selected for processing (9)
  • components/core/CMakeLists.txt (1 hunks)
  • components/core/src/clp/DictionaryConcepts.hpp (1 hunks)
  • components/core/src/clp/EncodedVariableInterpreter.hpp (15 hunks)
  • components/core/src/clp/GrepCore.hpp (7 hunks)
  • components/core/src/clp/clg/CMakeLists.txt (1 hunks)
  • components/core/src/clp/clo/CMakeLists.txt (1 hunks)
  • components/core/src/clp/clp/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/indexer/CMakeLists.txt (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/DictionaryConcepts.hpp
  • components/core/src/clp/EncodedVariableInterpreter.hpp
  • components/core/src/clp/GrepCore.hpp
🧠 Learnings (1)
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/src/clp/DictionaryConcepts.hpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-15, true)
🔇 Additional comments (13)
components/core/CMakeLists.txt (2)

476-476: Including headers in SOURCE_FILES is fine; addition looks correct

Listing src/clp/DictionaryConcepts.hpp among unit test sources ensures rebuilds when the header changes and is consistent with existing headers in this list. No issues from a build-system standpoint.


438-723: DictionaryConcepts.hpp inclusion verified across all relevant targets

I ran a grep over the CMakeLists in components/ and confirmed that every CMake target which pulls in headers depending on DictionaryConcepts.hpp now lists that header in its sources:

  • components/core/CMakeLists.txt (unitTest sources)
  • components/core/src/clp_s/CMakeLists.txt
  • components/core/src/clp_s/indexer/CMakeLists.txt
  • components/core/src/clp/clg/CMakeLists.txt
  • components/core/src/clp/clo/CMakeLists.txt
  • components/core/src/clp/clp/CMakeLists.txt

Since each of these explicitly references DictionaryConcepts.hpp, the concept header is wired into all targets using the constrained templates. No further changes are required.

components/core/src/clp/DictionaryConcepts.hpp (1)

20-132: Concept surface looks comprehensive and aligned to usage

The LogTypeDictionaryEntryReq requirements match the methods used across the codebase (reserve/parse/add/get_*). Good balance between precision and flexibility.

components/core/src/clp/GrepCore.hpp (5)

46-49: Good adoption of C++20 concepts for public API

Using LogTypeDictionaryReaderReq and VariableDictionaryReaderReq at the interface makes constraints explicit and improves diagnostics for callers. Defaulting the entry type to ReaderType::entry_t is sensible.


113-120: Template constraint on process_var_token is appropriate

Constrained to VariableDictionaryReaderReq matches the function’s needs and documents expectations to callers.


137-142: Default the entry type and additionally require it to satisfy the entry concept

The parameter pack
LogTypeDictionaryEntryReq LogTypeDictionaryEntryType = typename LogTypeDictionaryReaderType::entry_t
is a nice touch: it both defaults and validates the entry_t type from the reader. No changes requested.


336-341: Out-of-class definition mirrors constrained declaration

Template parameter list and constraints match the declaration, preventing ODR or deduction mismatches. Looks good.


230-238: API change is source-breaking for non-conforming external readers

Switching to constrained templates will cause compile errors for external dictionary reader implementations that don’t meet the concepts. This is expected but should be called out as a potential breaking change in the PR checklist.

Please mark the “breaking change” checkbox in the PR and add a one-liner: “Compile-time breaking for custom dictionary readers/writers that don’t satisfy the new concepts.”

components/core/src/clp/EncodedVariableInterpreter.hpp (5)

9-9: Including DictionaryConcepts.hpp here is necessary and correct

The constrained templates in this header depend on the concepts. Include placement is appropriate.


125-134: Constraints on encode_and_add_to_dictionary(string_view, …) are correct

Requiring a logtype entry and a writer via concepts matches usage and will improve diagnostics. No change requested.


151-162: Constraints on encode_and_add_to_dictionary(LogEvent<…>, …) are correct

Matches the above overload; all good.


199-206: Reader constraint is appropriate for encode_and_search_dictionary

The dictionary reader requirements are minimal and sufficient. No change requested.


219-227: Defaulted entry type for wildcard search correctly validated

The entry type defaults to the reader’s entry_t and is validated by VariableDictionaryEntryReq. Looks good.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
components/core/src/clp_s/DictionaryReader.hpp (3)

6-14: Missing include for std::ranges::find_if

This header uses std::ranges::find_if (Line 177) but doesn’t include . Relying on transitive includes is fragile and may break with toolchain or header changes. Please add the include explicitly.

Apply this diff:

 #include <iterator>
+#include <algorithm>
 #include <string>
 #include <string_view>
 #include <unordered_set>
 #include <utility>

206-213: Follow project style: prefer false == expr to negation

Coding guidelines prefer false == over !. Adjust the argument passed to wildcard_match_unsafe accordingly.

Apply this diff:

         if (clp::string_utils::wildcard_match_unsafe(
                     entry.get_value(),
                     wildcard_string,
-                    !ignore_case
+                    false == ignore_case
             ))

188-197: Simplify case-insensitive branch to avoid extra allocations and std::ignore

Two improvements:

  • Replace the output-iterator overload of to_upper_copy with the simpler returning overload.
  • Alternatively (and preferable), use boost::algorithm::iequals to avoid allocating uppercased copies for each entry.

Option A (minimal change):

-    std::string search_string_uppercase;
-    std::ignore = boost::algorithm::to_upper_copy(
-            std::back_inserter(search_string_uppercase),
-            search_string
-    );
+    std::string search_string_uppercase = boost::algorithm::to_upper_copy(search_string);

Option B (preferred for clarity/perf; requires including <boost/algorithm/string/predicate.hpp>):

// Add: #include <boost/algorithm/string/predicate.hpp>
std::vector<EntryType const*> entries;
for (auto const& entry : m_entries) {
    if (boost::algorithm::iequals(entry.get_value(), search_string)) {
        entries.push_back(&entry);
    }
}
♻️ Duplicate comments (5)
components/core/src/clp/DictionaryConcepts.hpp (5)

193-208: Constrain writer.add_entry to take an lvalue reference for the out-parameter

Call sites rely on add_entry updating id_ref. The current concept allows implementations that accept id_ref by value (compiles but breaks functionality). The parameter list already correctly uses variable_dictionary_id_t& id_ref, so the concept is properly constrained.


155-155: Fix missing 'typename' for dependent default template argument

C++ requires typename before dependent types used in default template parameters. Without it, this won't compile.

Apply this diff:

-        typename LogTypeDictionaryEntryType = typename LogTypeDictionaryReaderType::entry_t>
+        typename LogTypeDictionaryEntryType = typename LogTypeDictionaryReaderType::entry_t>

182-186: Invalid requirement: use a nested requirement with 'requires'

Inside a requires-expression body, bare std::same_as<...>; is ill-formed. It must be a nested requirement with the requires keyword.

Apply this diff:

-    requires std::
-            same_as<typename LogTypeDictionaryReaderType::dictionary_id_t, logtype_dictionary_id_t>;
-
-    requires std::
-            same_as<typename LogTypeDictionaryReaderType::entry_t, LogTypeDictionaryEntryType>;
+    requires std::same_as<typename LogTypeDictionaryReaderType::dictionary_id_t,
+                          logtype_dictionary_id_t>;
+
+    requires std::same_as<typename LogTypeDictionaryReaderType::entry_t,
+                          LogTypeDictionaryEntryType>;

217-217: Fix missing 'typename' for dependent default template argument

Same issue as the logtype reader concept; this will fail to compile without typename.

Apply this diff:

-        typename VariableDictionaryEntryType = typename VariableDictionaryReaderType::entry_t>
+        typename VariableDictionaryEntryType = typename VariableDictionaryReaderType::entry_t>

253-258: Invalid requirement: use a nested requirement with 'requires'

Mirror the fix done for the logtype reader concept.

Apply this diff:

-    requires std::same_as<
-            typename VariableDictionaryReaderType::dictionary_id_t,
-            variable_dictionary_id_t>;
-
-    requires std::
-            same_as<typename VariableDictionaryReaderType::entry_t, VariableDictionaryEntryType>;
+    requires std::same_as<typename VariableDictionaryReaderType::dictionary_id_t,
+                          variable_dictionary_id_t>;
+
+    requires std::same_as<typename VariableDictionaryReaderType::entry_t,
+                          VariableDictionaryEntryType>;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e9dbd71 and 7da8133.

📒 Files selected for processing (2)
  • components/core/src/clp/DictionaryConcepts.hpp (1 hunks)
  • components/core/src/clp_s/DictionaryReader.hpp (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/DictionaryReader.hpp
  • components/core/src/clp/DictionaryConcepts.hpp
🧠 Learnings (1)
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/src/clp_s/DictionaryReader.hpp
  • components/core/src/clp/DictionaryConcepts.hpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
🔇 Additional comments (2)
components/core/src/clp_s/DictionaryReader.hpp (1)

15-15: Adopting clp::*_dictionary_id_t via Defs.h — looks good

Including clp/Defs.h here to use the shared ID aliases aligns this header with the new cross-module ID types introduced in the PR.

components/core/src/clp/DictionaryConcepts.hpp (1)

1-262: Review of C++20 concept syntax: definitions are valid

I’ve verified that each concept in DictionaryConcepts.hpp correctly follows the C++20 grammar:

  • Concepts are defined at namespace scope using
    template<…> concept Name = constraint-expression;
    which matches the standard syntax (cppreference.com).
  • The requires(...) { … } blocks use both simple requirements (a + b;) and compound requirements ({ expr } -> std::same_as<T>;), as well as nested requirements (requires Condition;), exactly as prescribed by the standard (en.cppreference.com).
  • All required headers are present (<concepts>, <string_view>, etc.), and there are no stray semicolons or misplaced commas.

No syntax errors were found that would prevent compilation under a conforming C++20 compiler. Please disregard the original concern about “several syntax errors.”

Likely an incorrect or invalid review comment.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall is good to me.

  • A few comments about the docstrings.
  • I left two interface design comments.
  • Waiting for @kirkrodrigues to confirm whether we should split the requirements into dedicated headers.

* of bounds.
*/
{
entry.get_placeholder_info(placeholder_ix, placeholder_ref)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

placeholder_ix -> placeholder_idx
We will stick to idx and drop the use ix for any new code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I started using ix over idx because @kirkrodrigues recently told me that ix is preferred. Are we changing that again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, that might've been a miscommunication. ix is what we have in the codebase cause I used to use that for both iterators and indices, but idx is more conventional.

*/
template <
typename VariableDictionaryReaderType,
typename VariableDictionaryEntryType = typename VariableDictionaryReaderType::entry_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the logtype reader:

  • How about using VariableDictionaryReaderType::entry_t directly?
  • We need to check it matches the variable entry requirement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized for dictionary readers, the signatures are almost the same for logtypes and variables. Do u think it's worth making a general dictionary reader requirement to include the common requirements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit on the fence about this one -- it will probably save a few lines of code but I think it might end up harder to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kk sure~

Comment on lines 138 to 141
LogTypeDictionaryReaderReq LogTypeDictionaryReaderType,
VariableDictionaryReaderReq VariableDictionaryReaderType,
LogTypeDictionaryEntryReq LogTypeDictionaryEntryType
= typename LogTypeDictionaryReaderType::entry_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we update the requirement to not explicitly take the entry type, we can only take the dictionary reader type as a template param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Doing a similar simplification in EncodedVariableInterpreter as well.

gibber9809 and others added 2 commits September 15, 2025 11:06
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (7)
components/core/src/clp/DictionaryConcepts.hpp (7)

154-157: Resolved: added missing 'typename' in dependent default argument.

The default for LogTypeDictionaryEntryType now correctly uses typename. Thanks for fixing this.


194-209: Resolved: enforce lvalue-ref out-parameter for writer.add_entry.

id_ref is now an lvalue reference, preventing implementations that take it by value. This aligns with call‑site expectations.


216-219: Resolved: added missing 'typename' in dependent default argument.

Same fix applied for the variable dictionary reader. Looks good.


28-28: Rename ix -> idx for consistency.

Project convention prefers idx. Update the parameter and doc accordingly.

Apply:

-        size_t placeholder_ix,
+        size_t placeholder_idx,
@@
-     * @param placeholder_ix The index of the placeholder to get the info for.
-     * @param placeholder_ref Returns the type of the placeholder at `placeholder_ix`.
-     * @return The placeholder's position in the logtype, or `SIZE_MAX` if `placeholder_ix` is out
+     * @param placeholder_idx The index of the placeholder to get the info for.
+     * @param placeholder_ref Returns the type of the placeholder at `placeholder_idx`.
+     * @return The placeholder's position in the logtype, or `SIZE_MAX` if `placeholder_idx` is out
@@
-        entry.get_placeholder_info(placeholder_ix, placeholder_ref)
+        entry.get_placeholder_info(placeholder_idx, placeholder_ref)

Also applies to: 118-121, 124-124


157-162: Also require the entry type to satisfy the entry concept.

Add an explicit constraint so the reader’s entry type adheres to LogTypeDictionaryEntryReq.

Apply:

-concept LogTypeDictionaryReaderReq = requires(
+concept LogTypeDictionaryReaderReq = requires(
         LogTypeDictionaryReaderType reader,
         std::string_view logtype,
         bool ignore_case,
         std::unordered_set<LogTypeDictionaryEntryType const*>& entries
 ) {
+    // Ensure the entry type meets the entry concept.
+    requires LogTypeDictionaryEntryReq<LogTypeDictionaryEntryType>;

254-259: Tidy nested requirements formatting (std::same_as).

Unify formatting with the logtype reader concept.

Apply:

-    requires std::same_as<
-            typename VariableDictionaryReaderType::dictionary_id_t,
-            variable_dictionary_id_t>;
-
-    requires std::
-            same_as<typename VariableDictionaryReaderType::entry_t, VariableDictionaryEntryType>;
+    requires std::same_as<typename VariableDictionaryReaderType::dictionary_id_t,
+                          variable_dictionary_id_t>;
+
+    requires std::same_as<typename VariableDictionaryReaderType::entry_t,
+                          VariableDictionaryEntryType>;

219-226: Also require the variable entry type to satisfy its concept.

Mirror the logtype reader: assert VariableDictionaryEntryType satisfies VariableDictionaryEntryReq.

Apply:

-concept VariableDictionaryReaderReq = requires(
+concept VariableDictionaryReaderReq = requires(
         VariableDictionaryReaderType reader,
         variable_dictionary_id_t id,
         std::string_view variable,
         bool ignore_case,
         std::unordered_set<VariableDictionaryEntryType const*>& entries
 ) {
+    // Ensure the entry type meets the entry concept.
+    requires VariableDictionaryEntryReq<VariableDictionaryEntryType>;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7da8133 and d8d81f8.

📒 Files selected for processing (1)
  • components/core/src/clp/DictionaryConcepts.hpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/DictionaryConcepts.hpp
🧠 Learnings (2)
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/src/clp/DictionaryConcepts.hpp
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
PR: y-scope/clp#1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.

Applied to files:

  • components/core/src/clp/DictionaryConcepts.hpp

Comment on lines 161 to 162
std::unordered_set<LogTypeDictionaryEntryType const*>& entries
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider relaxing the concrete container type for entries.

Requiring std::unordered_set<... const*> is unnecessarily restrictive. A lightweight set concept would let callers pass any compatible container (e.g., flat_hash_set, std::pmr::unordered_set).

Example approach:

template <typename Set, typename EntryPtr>
concept EntryPtrSetReq = requires(Set s, EntryPtr p) {
    { s.insert(p) } -> std::same_as<std::pair<typename Set::iterator, bool>>;
    // Optionally require size/clear if needed
};

template <typename LogTypeDictionaryReaderType,
          typename LogTypeDictionaryEntryType = typename LogTypeDictionaryReaderType::entry_t,
          typename EntrySet = std::unordered_set<LogTypeDictionaryEntryType const*>>
concept LogTypeDictionaryReaderReq = requires(
        LogTypeDictionaryReaderType reader,
        std::string_view logtype,
        bool ignore_case,
        EntrySet& entries
) {
    requires EntryPtrSetReq<EntrySet, LogTypeDictionaryEntryType const*>;
    ...
};

If you’d prefer to keep the current signature, at least accept auto& entries and constrain with a nested requires.

Also applies to: 224-225

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/core/src/clp_s/DictionaryReader.hpp (1)

209-213: Style: replace !ignore_case per repo guideline.

Guidelines prefer false == <expr> over !<expr>.

-                    !ignore_case
+                    false == ignore_case
components/core/src/clp/GrepCore.hpp (1)

208-208: Apply coding guideline: replace !expr with false == expr.
Multiple instances in this file still use !. Please update for consistency with the repo guideline.

Apply this diff:

-        if (!query_token.has_greedy_wildcard_in_middle() && query_token.is_ambiguous_token()) {
+        if (false == query_token.has_greedy_wildcard_in_middle() && query_token.is_ambiguous_token()) {
@@
-    if (!query_token.contains_wildcards()) {
+    if (false == query_token.contains_wildcards()) {
@@
-                if (!EncodedVariableInterpreter::wildcard_search_dictionary_and_get_encoded_matches(
+                if (false == EncodedVariableInterpreter::wildcard_search_dictionary_and_get_encoded_matches(
@@
-            if (!query_token.is_var()) {
+            if (false == query_token.is_var()) {
@@
-            if (!query_token.is_var()) {
+            if (false == query_token.is_var()) {
-            } else if (!process_var_token(query_token, var_dict, ignore_case, sub_query, logtype)) {
+            } else if (false == process_var_token(query_token, var_dict, ignore_case, sub_query, logtype)) {

Also applies to: 286-286, 312-312, 379-379, 387-391

♻️ Duplicate comments (4)
components/core/src/clp/DictionaryConcepts.hpp (3)

152-183: Reader concept OK; tidy nested requirement formatting and consider generalising entries container.

  • Formatting: keep std::same_as unbroken.
  • Portability: accepting only std::unordered_set<...> is restrictive; allow any set-like container supporting insert(ptr) via an auxiliary concept.
-        std::unordered_set<typename LogTypeDictionaryReaderType::EntryT const*>& entries
+        auto& entries
 ) {
@@
-    requires std::
-            same_as<typename LogTypeDictionaryReaderType::dictionary_id_t, logtype_dictionary_id_t>;
+    requires std::same_as<typename LogTypeDictionaryReaderType::dictionary_id_t,
+                          logtype_dictionary_id_t>;
+    // Additionally:
+    // requires EntryPtrSetReq<decltype(entries),
+    //                         typename LogTypeDictionaryReaderType::EntryT const*>;

If you prefer to keep the concrete type, at least fix the std::same_as wrapping as above.


19-132: LogTypeDictionaryEntryReq API surface — solid. Minor naming nit.

Consider renaming placeholder_ix to placeholder_idx to match team convention.

-        size_t placeholder_ix,
+        size_t placeholder_idx,
...
-        entry.get_placeholder_info(placeholder_ix, placeholder_ref)
+        entry.get_placeholder_info(placeholder_idx, placeholder_ref)

211-252: VariableDictionaryReaderReq — OK; mirror container generalisation and return-type flexibility (optional).

Consider:

  • Using a generic set-like entries parameter as noted for logtype reader.
  • Optionally accept any range of entry pointers for return type instead of hard std::vector.
-    } -> std::same_as<std::vector<typename VariableDictionaryReaderType::EntryT const*>>;
+    } -> std::ranges::range;

And generalise entries similarly to the previous comment if you adopt that change.

components/core/src/clp/GrepCore.hpp (1)

46-49: Concept‑constrained process_raw_query — LGTM and aligns with prior feedback.
Removes the need for explicit entry type; accepting only reader types is cleaner.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8d81f8 and 0d1e320.

📒 Files selected for processing (8)
  • components/core/CMakeLists.txt (1 hunks)
  • components/core/src/clp/DictionaryConcepts.hpp (1 hunks)
  • components/core/src/clp/DictionaryReader.hpp (1 hunks)
  • components/core/src/clp/DictionaryWriter.hpp (1 hunks)
  • components/core/src/clp/EncodedVariableInterpreter.hpp (15 hunks)
  • components/core/src/clp/GrepCore.hpp (8 hunks)
  • components/core/src/clp_s/DictionaryReader.hpp (3 hunks)
  • components/core/src/clp_s/DictionaryWriter.hpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/DictionaryWriter.hpp
  • components/core/src/clp_s/DictionaryWriter.hpp
  • components/core/src/clp/DictionaryReader.hpp
  • components/core/src/clp_s/DictionaryReader.hpp
  • components/core/src/clp/GrepCore.hpp
  • components/core/src/clp/DictionaryConcepts.hpp
  • components/core/src/clp/EncodedVariableInterpreter.hpp
🧠 Learnings (2)
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/src/clp/DictionaryWriter.hpp
  • components/core/src/clp_s/DictionaryWriter.hpp
  • components/core/src/clp/DictionaryReader.hpp
  • components/core/src/clp_s/DictionaryReader.hpp
  • components/core/src/clp/DictionaryConcepts.hpp
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
PR: y-scope/clp#1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.

Applied to files:

  • components/core/src/clp/DictionaryWriter.hpp
  • components/core/src/clp_s/DictionaryWriter.hpp
  • components/core/src/clp/DictionaryReader.hpp
  • components/core/src/clp_s/DictionaryReader.hpp
  • components/core/src/clp/DictionaryConcepts.hpp
🧬 Code graph analysis (2)
components/core/src/clp/DictionaryConcepts.hpp (5)
components/core/src/clp/EncodedVariableInterpreter.hpp (11)
  • logtype (61-63)
  • logtype (61-61)
  • logtype (69-71)
  • logtype (69-69)
  • logtype (77-79)
  • logtype (77-77)
  • logtype (85-87)
  • logtype (85-85)
  • value (95-98)
  • value (105-108)
  • id (54-54)
components/core/src/clp/GrepCore.hpp (2)
  • value (69-74)
  • value (86-92)
components/core/src/clp_s/DictionaryWriter.hpp (1)
  • value (91-91)
components/core/src/clp/DictionaryReader.hpp (2)
  • id (79-79)
  • id (86-86)
components/core/src/clp_s/DictionaryReader.hpp (2)
  • id (63-63)
  • id (69-69)
components/core/src/clp/EncodedVariableInterpreter.hpp (1)
components/core/src/glt/EncodedVariableInterpreter.hpp (1)
  • var_wildcard_str (176-181)
🔇 Additional comments (27)
components/core/CMakeLists.txt (1)

477-477: Including DictionaryConcepts.hpp in unitTest sources — OK.

Header-only addition; inclusion here ensures concept checks compile under tests.

components/core/src/clp/DictionaryReader.hpp (1)

41-41: EntryT alias rename — OK.

Matches concepts usage and other headers; no behaviour change.

components/core/src/clp_s/DictionaryReader.hpp (3)

15-15: New include of clp/Defs.h — OK.

Needed for clp::variable_dictionary_id_t / clp::logtype_dictionary_id_t.


32-32: EntryT alias rename — OK.

Consistent with clp-side reader/writer.


100-103: Confirm typedef signedness — logtype is signed

variable_dictionary_id_t = uint64_t; logtype_dictionary_id_t = int64_t. Definitions: components/core/src/clp_s/Defs.hpp (lines 15,19); components/core/src/clp/Defs.h (14,18); components/core/src/glt/Defs.h (17,21). Because logtype_dictionary_id_t is signed, ensure callers never pass negative IDs (especially where IDs are compared against container.size(), used as indices/keys, or encoded/cast).

components/core/src/clp/DictionaryConcepts.hpp (1)

189-204: VariableDictionaryWriterReq — good constraint on out-param by lvalue-ref.

This prevents by-value implementations that wouldn’t propagate the ID.

components/core/src/clp_s/DictionaryWriter.hpp (1)

24-24: Approve: entry_t -> EntryT rename verified — no remaining references in components/core/src.
Repository search found zero matches for 'entry_t' and all code uses EntryT.

components/core/src/clp/DictionaryWriter.hpp (1)

39-39: EntryT alias rename — approved.

Repo-wide search found no occurrences of entry_t; readers/writers consistently use EntryT. Ensure any external tools/tests outside this repository are updated.

components/core/src/clp/GrepCore.hpp (7)

17-17: Include for concepts — LGTM.


113-113: process_var_token now constrained by VariableDictionaryReaderReq — LGTM.


136-139: generate_logtypes_and_vars_for_subquery constrained by reader concepts — LGTM.


149-161: Out‑of‑class definition signature matches constrained declaration — LGTM.


272-272: process_var_token out‑of‑class definition matches declaration — LGTM.


333-336: generate_logtypes_and_vars_for_subquery out‑of‑class definition matches — LGTM.


412-414: Correct use of Reader::EntryT pointer set — LGTM.
Matches the new concepts‑driven API.

components/core/src/clp/EncodedVariableInterpreter.hpp (12)

11-11: Include concepts header — LGTM.


127-130: encode_and_add_to_dictionary(message, …) now concept‑constrained — LGTM.


178-186: decode_variables_into_message now concept‑constrained — LGTM.


201-209: encode_and_search_dictionary constrained by reader concept — LGTM.


219-226: wildcard_search_dictionary_and_get_encoded_matches constrained — LGTM.


240-248: encode_var concept‑constrained — LGTM.


262-269: add_dict_var concept‑constrained — LGTM.


272-281: encode_and_add_to_dictionary(message, …) definition matches constraints — LGTM.


371-373: decode_variables_into_message definition matches constraints — LGTM.


445-452: encode_and_search_dictionary definition matches constraints — LGTM.


523-525: encode_var definition matches constraints — LGTM.


544-546: add_dict_var definition matches constraints — LGTM.

Comment on lines +494 to 501
template <VariableDictionaryReaderReq VariableDictionaryReaderType>
bool EncodedVariableInterpreter::wildcard_search_dictionary_and_get_encoded_matches(
std::string_view var_wildcard_str,
VariableDictionaryReaderType const& var_dict,
bool ignore_case,
SubQuery& sub_query
) {
// Find matches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Reserve capacity for result sets to cut rehashing in hot path.

Apply this diff:

-    // Encode matches
-    std::unordered_set<encoded_variable_t> encoded_vars;
-    std::unordered_set<variable_dictionary_id_t> var_dict_ids;
+    // Encode matches
+    std::unordered_set<encoded_variable_t> encoded_vars;
+    std::unordered_set<variable_dictionary_id_t> var_dict_ids;
+    encoded_vars.reserve(var_dict_entries.size());
+    var_dict_ids.reserve(var_dict_entries.size());

Committable suggestion skipped: line range outside the PR's diff.

@gibber9809 gibber9809 changed the title refactor(clp): Use concepts to define interface for shared parsing and search code. refactor(clp): Use concepts to define the interface for dictionaries passed to the shared parsing and search code. Sep 15, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/core/src/clp/GrepCore.hpp (1)

209-211: Adhere to project style: prefer false == expr over !expr.

Replace negations per coding guidelines for **/*.hpp.

Apply this diff:

-        if (!query_token.has_greedy_wildcard_in_middle() && query_token.is_ambiguous_token()) {
+        if (false == query_token.has_greedy_wildcard_in_middle() && query_token.is_ambiguous_token()) {
@@
-    if (!query_token.contains_wildcards()) {
+    if (false == query_token.contains_wildcards()) {
@@
-                if (!EncodedVariableInterpreter::wildcard_search_dictionary_and_get_encoded_matches(
+                if (false == EncodedVariableInterpreter::wildcard_search_dictionary_and_get_encoded_matches(
                         query_token.get_value(),
                         var_dict,
                         ignore_case,
                         sub_query
                     ))
                 {
                     // Variable doesn't exist in dictionary
                     return false;
                 }
@@
-            if (!query_token.is_var()) {
+            if (false == query_token.is_var()) {
                 logtype += '*';
             } else {
                 logtype += '*';
                 EncodedVariableInterpreter::add_dict_var(logtype);
                 logtype += '*';
             }
@@
-            if (!query_token.is_var()) {
+            if (false == query_token.is_var()) {
                 ir::append_constant_to_logtype(query_token.get_value(), escape_handler, logtype);
-            } else if (!process_var_token(query_token, var_dict, ignore_case, sub_query, logtype)) {
+            } else if (false == process_var_token(query_token, var_dict, ignore_case, sub_query, logtype)) {
                 return SubQueryMatchabilityResult::WontMatch;
             }

Also applies to: 286-299, 313-319, 380-386, 388-393

components/core/src/clp/EncodedVariableInterpreter.hpp (1)

468-490: Reserve capacity for ID set when number of entries is known.

Pre‑reserve var_dict_ids to cut rehashing in hot path; you already do it for encoded_vars.

Apply this diff:

         std::unordered_set<encoded_variable_t> encoded_vars;
         std::unordered_set<variable_dictionary_id_t> var_dict_ids;
-        encoded_vars.reserve(entries.size());
+        encoded_vars.reserve(entries.size());
+        var_dict_ids.reserve(entries.size());
♻️ Duplicate comments (4)
components/core/src/clp_s/indexer/CMakeLists.txt (1)

46-46: Factor the clp/ header path to reduce maintenance risk (echoing prior suggestion).

Centralize the ../../clp prefix to avoid future off‑by‑one path breaks and make moves simpler. Change only the four added entries.

Apply this diff:

-        ../../clp/LogTypeDictionaryEntryReq.hpp
+        ${CLP_DICT_CONCEPTS_DIR}/LogTypeDictionaryEntryReq.hpp
@@
-        ../../clp/VariableDictionaryEntryReq.hpp
-        ../../clp/VariableDictionaryReaderReq.hpp
-        ../../clp/VariableDictionaryWriterReq.hpp
+        ${CLP_DICT_CONCEPTS_DIR}/VariableDictionaryEntryReq.hpp
+        ${CLP_DICT_CONCEPTS_DIR}/VariableDictionaryReaderReq.hpp
+        ${CLP_DICT_CONCEPTS_DIR}/VariableDictionaryWriterReq.hpp

Add near the top of this file (before the set(INDEXER_SOURCES ...) block):

set(CLP_DICT_CONCEPTS_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../../clp")

Also applies to: 64-66

components/core/src/clp/EncodedVariableInterpreter.hpp (3)

298-309: Constrain EncodedVariableType at compile time.

Without a constraint, any non‑8B type falls into the “else” branch. Enforce allowed types to prevent misuse.

Apply this diff:

 void EncodedVariableInterpreter::encode_and_add_to_dictionary(
         ir::LogEvent<EncodedVariableType> const& log_event,
         LogTypeDictionaryEntryType& logtype_dict_entry,
         VariableDictionaryWriterType& var_dict,
         std::vector<ir::eight_byte_encoded_variable_t>& encoded_vars,
         std::vector<variable_dictionary_id_t>& var_ids,
         size_t& raw_num_bytes
 ) {
+    static_assert(
+            std::is_same_v<EncodedVariableType, ir::eight_byte_encoded_variable_t>
+            || std::is_same_v<EncodedVariableType, ir::four_byte_encoded_variable_t>,
+            "EncodedVariableType must be ir::eight_byte_encoded_variable_t or ir::four_byte_encoded_variable_t"
+    );

4-21: Missing header for std::is_same_v.

Include <type_traits> explicitly to avoid relying on transitive includes.

Apply this diff:

 #include <unordered_set>
 #include <vector>
+#include <type_traits>

496-523: Reserve capacity for result sets after wildcard search.

Once var_dict_entries is filled, reserve space for encoded_vars and var_dict_ids before inserting.

Apply this diff:

     // Encode matches
     std::unordered_set<encoded_variable_t> encoded_vars;
     std::unordered_set<variable_dictionary_id_t> var_dict_ids;
+    encoded_vars.reserve(var_dict_entries.size());
+    var_dict_ids.reserve(var_dict_entries.size());
     for (auto entry : var_dict_entries) {
         encoded_vars.emplace(encode_var_dict_id(entry->get_id()));
         var_dict_ids.emplace(entry->get_id());
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 219daed and abc5387.

📒 Files selected for processing (13)
  • components/core/CMakeLists.txt (2 hunks)
  • components/core/src/clp/EncodedVariableInterpreter.hpp (15 hunks)
  • components/core/src/clp/GrepCore.hpp (8 hunks)
  • components/core/src/clp/LogTypeDictionaryEntryReq.hpp (1 hunks)
  • components/core/src/clp/LogTypeDictionaryReaderReq.hpp (1 hunks)
  • components/core/src/clp/VariableDictionaryEntryReq.hpp (1 hunks)
  • components/core/src/clp/VariableDictionaryReaderReq.hpp (1 hunks)
  • components/core/src/clp/VariableDictionaryWriterReq.hpp (1 hunks)
  • components/core/src/clp/clg/CMakeLists.txt (2 hunks)
  • components/core/src/clp/clo/CMakeLists.txt (2 hunks)
  • components/core/src/clp/clp/CMakeLists.txt (2 hunks)
  • components/core/src/clp_s/CMakeLists.txt (2 hunks)
  • components/core/src/clp_s/indexer/CMakeLists.txt (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/VariableDictionaryWriterReq.hpp
  • components/core/src/clp/VariableDictionaryEntryReq.hpp
  • components/core/src/clp/VariableDictionaryReaderReq.hpp
  • components/core/src/clp/LogTypeDictionaryReaderReq.hpp
  • components/core/src/clp/LogTypeDictionaryEntryReq.hpp
  • components/core/src/clp/GrepCore.hpp
  • components/core/src/clp/EncodedVariableInterpreter.hpp
🧠 Learnings (8)
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
PR: y-scope/clp#1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.

Applied to files:

  • components/core/src/clp/VariableDictionaryWriterReq.hpp
  • components/core/src/clp/EncodedVariableInterpreter.hpp
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • components/core/src/clp/clp/CMakeLists.txt
  • components/core/src/clp/clo/CMakeLists.txt
📚 Learning: 2025-06-02T18:22:24.060Z
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.

Applied to files:

  • components/core/src/clp/clp/CMakeLists.txt
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/CMakeLists.txt
  • components/core/src/clp_s/CMakeLists.txt
  • components/core/src/clp/LogTypeDictionaryReaderReq.hpp
  • components/core/src/clp/EncodedVariableInterpreter.hpp
📚 Learning: 2024-10-18T02:31:18.595Z
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.

Applied to files:

  • components/core/src/clp/EncodedVariableInterpreter.hpp
📚 Learning: 2025-04-26T02:21:22.021Z
Learnt from: LinZhihao-723
PR: y-scope/clp#856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.

Applied to files:

  • components/core/src/clp/EncodedVariableInterpreter.hpp
📚 Learning: 2024-10-07T21:35:04.362Z
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:735-794
Timestamp: 2024-10-07T21:35:04.362Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `parse_from_ir` method, encountering errors from `kv_log_event_result.error()` aside from `std::errc::no_message_available` and `std::errc::result_out_of_range` is anticipated behavior and does not require additional error handling or logging.

Applied to files:

  • components/core/src/clp/EncodedVariableInterpreter.hpp
📚 Learning: 2024-10-08T15:52:50.753Z
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:756-765
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `parse_from_ir()` function, reaching the end of log events in a given IR is not considered an error case. The errors `std::errc::no_message_available` and `std::errc::result_out_of_range` are expected signals to break the deserialization loop and proceed accordingly.

Applied to files:

  • components/core/src/clp/EncodedVariableInterpreter.hpp
🧬 Code graph analysis (2)
components/core/src/clp/VariableDictionaryReaderReq.hpp (1)
components/core/src/clp/EncodedVariableInterpreter.hpp (1)
  • id (56-56)
components/core/src/clp/EncodedVariableInterpreter.hpp (1)
components/core/src/glt/EncodedVariableInterpreter.hpp (1)
  • var_wildcard_str (176-181)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (10)
components/core/CMakeLists.txt (1)

583-583: Unit-test target sources updated appropriately for concept coverage

Bringing the Req headers into the unit test build ensures compile-time checks are exercised; unitTest already requests cxx_std_20. Looks solid.

Please confirm unit tests include at least one compile path that instantiates templates constrained by each Req concept so failures surface early.

Also applies to: 585-585, 678-678, 680-680, 683-683

components/core/src/clp/VariableDictionaryEntryReq.hpp (1)

13-21: Entry concept is precise and minimal

Using std::same_as with variable_dictionary_id_t enforces the exact ID type; include set is sufficient. No concerns.

Double-check that all CLP and CLP-S variable dictionary entries return variable_dictionary_id_t (not a ref/const-qualified variant) from get_id() to satisfy the strict same_as.

components/core/src/clp/VariableDictionaryWriterReq.hpp (1)

13-28: Writer concept cleanly specifies add_entry contract

Signature and return constraint are clear; string_view is an appropriate choice for value. Good to merge.

Confirm all writer implementations now accept std::string_view for value; older const char* overloads alone would not satisfy this concept.

components/core/src/clp/LogTypeDictionaryReaderReq.hpp (1)

17-48: Reader concept comprehensively constrains the interface

  • Nested Entry requirement and ID type check are appropriate.
  • Return/container types for exact and wildcard queries are explicit.
    Looks correct and matches the intended API shape.

Please verify clp and clp-s reader types expose Entry and dictionary_id_t aliases exactly as required and that method names match (get_entry_matching_value / get_entries_matching_wildcard_string).

components/core/src/clp/VariableDictionaryReaderReq.hpp (2)

37-46: Naming consistency: method returns multiple entries but is named singular.

Please confirm the API name get_entry_matching_value(...) is intentionally singular even though it returns a std::vector<Entry const*>. If not intentional, consider pluralizing for clarity across the codebase.


57-60: LGTM: concept checks are precise and minimal.

The dictionary_id_t alias constraint and entry concept binding look correct.

components/core/src/clp/LogTypeDictionaryEntryReq.hpp (1)

17-130: LGTM: comprehensive entry concept.

Interface coverage (lifecycle, parsing, placeholders, ID) matches usage in the decoding/encoding paths.

components/core/src/clp/GrepCore.hpp (2)

47-50: Good move to concept‑constrained APIs.

Switching to LogTypeDictionaryReaderReq/VariableDictionaryReaderReq tightens contracts and improves error messages at compile time.

Also applies to: 114-115, 137-140, 150-153, 335-337


413-416: LGTM: uses reader’s nested Entry type.

Using typename LogTypeDictionaryReaderType::Entry const* avoids duplicating the entry type in template parameters.

components/core/src/clp/EncodedVariableInterpreter.hpp (1)

129-138: LGTM: concept adoption and nested Entry usage.

The templates are cleanly constrained; removal of the explicit entry type parameter reduces coupling.

Also applies to: 179-188, 203-211, 242-271, 545-561

Comment on lines +73 to +74
../clp/LogTypeDictionaryEntryReq.hpp
../clp/LogTypeDictionaryReaderReq.hpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

clp_s dependency surface correctly extended with Req headers

Headers are wired into clp_s_clp_dependencies, and the target compiles with C++20. No issues spotted.

Optional: define a DICT_REQ_HEADERS list once in this file and reuse it to reduce repetition.

Also applies to: 102-104

🤖 Prompt for AI Agents
In components/core/src/clp_s/CMakeLists.txt around lines 73-74 (and similarly
lines 102-104), the same Req header paths are repeated; define a
DICT_REQ_HEADERS variable near the top of the file containing the shared header
paths (e.g. ../clp/LogTypeDictionaryEntryReq.hpp;
../clp/LogTypeDictionaryReaderReq.hpp) and then replace the repeated literal
header lists with the DICT_REQ_HEADERS variable wherever those headers are added
to clp_s_clp_dependencies (or used in
target_sources/target_include_directories), ensuring the variable is used
consistently for both places mentioned.

@@ -48,7 +48,9 @@ set(
../LogSurgeonReader.hpp
../LogTypeDictionaryEntry.cpp
../LogTypeDictionaryEntry.hpp
../LogTypeDictionaryEntryReq.hpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Good placement of concept wrapper headers

Additions align with the pattern in clp/clo and maintain proximity to base headers. All good.

Consider centralizing the repeated Req header paths into a list variable to avoid drift across targets.

Also applies to: 53-53, 122-122, 124-124, 127-127

🤖 Prompt for AI Agents
In components/core/src/clp/clg/CMakeLists.txt around line 51 (and similarly at
lines 53, 122, 124, 127), multiple targets repeat the same "../*Req.hpp" header
paths; extract the repeated Req header paths into a single CMake list variable
(e.g. set(REQ_HEADERS ../LogTypeDictionaryEntryReq.hpp ../OtherReq.hpp ...)) at
the top of the file and then reference that variable in each target (e.g.
target_sources(... PRIVATE ${REQ_HEADERS})), ensuring you update all listed
locations so they use the centralized list to prevent path drift across targets.

@@ -49,7 +49,9 @@ set(
../LogSurgeonReader.hpp
../LogTypeDictionaryEntry.cpp
../LogTypeDictionaryEntry.hpp
../LogTypeDictionaryEntryReq.hpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Concept headers correctly added; build visibility improved

The added Req headers are placed adjacent to their base headers and the target already builds with C++20. Looks good.

If you want to reduce future churn across clp/clg/clo, consider defining a small CMake list variable (e.g., DICT_REQ_HEADERS) and expanding it here and in the sibling CMake files.

Also applies to: 54-54, 122-122, 124-124, 127-127

@@ -64,6 +64,7 @@ set(
../LogSurgeonReader.hpp
../LogTypeDictionaryEntry.cpp
../LogTypeDictionaryEntry.hpp
../LogTypeDictionaryEntryReq.hpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Non-functional additions are correct and consistent with C++20 concepts

Headers are correctly surfaced for IDEs and compilation units; target has cxx_std_20 enabled. No issues.

Optional: mirror a shared DICT_REQ_HEADERS list to keep these five Req headers in one place across CMake files.

Also applies to: 69-69, 147-147, 149-149, 152-152

🤖 Prompt for AI Agents
In components/core/src/clp/clp/CMakeLists.txt around lines 67, 69, 147, 149 and
152, the five Req header filenames are being duplicated across multiple
CMakeLists entries; create a single list variable (e.g., DICT_REQ_HEADERS) at
the top of the CMake file or in a shared included CMake module that contains the
five Req headers, then replace the repeated header lists at the specified lines
with a reference to that variable (use target_sources or
target_include_directories as appropriate) so all CMake files include the same
canonical list; ensure the new variable is visible to all relevant CMake scopes
(optionally use set(... PARENT_SCOPE) or a common .cmake include) and update the
other CMake files to reference the shared DICT_REQ_HEADERS.

LinZhihao-723
LinZhihao-723 previously approved these changes Sep 17, 2025
Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

refactor(core): Use C++20 concepts to enforce requirements for logtype and variable dictionary entries, readers, and writers.

@LinZhihao-723 LinZhihao-723 dismissed their stale review September 17, 2025 17:29

Sorry for missing this: #1267 (comment)

@gibber9809 gibber9809 changed the title refactor(clp): Use concepts to define the interface for dictionaries passed to the shared parsing and search code. refactor(core): Use C++20 concepts to enforce requirements for logtype and variable dictionary entries, readers, and writers. Sep 17, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
components/core/src/clp/VariableDictionaryReaderReq.hpp (1)

15-17: Stale @tparam removed — looks good.

Matches the single template parameter of the concept.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between abc5387 and 3201f95.

📒 Files selected for processing (1)
  • components/core/src/clp/VariableDictionaryReaderReq.hpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/VariableDictionaryReaderReq.hpp
🧠 Learnings (1)
📓 Common learnings
Learnt from: gibber9809
PR: y-scope/clp#1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
🧬 Code graph analysis (1)
components/core/src/clp/VariableDictionaryReaderReq.hpp (1)
components/core/src/clp/EncodedVariableInterpreter.hpp (1)
  • id (56-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
🔇 Additional comments (2)
components/core/src/clp/VariableDictionaryReaderReq.hpp (2)

56-58: LGTM: strong type guarantee on dictionary_id_t.

Requiring dictionary_id_t to be variable_dictionary_id_t prevents cross‑dict ID mixups at compile time.


19-25: Avoid accidental default-constructible requirement; use const& in requires parameters.

Declaring reader by value makes the concept require default-initializability. Pass it by const& (or & if mutation is intended) to avoid over‑constraining and to enforce const‑correct reads.

-template <typename VariableDictionaryReaderType>
-concept VariableDictionaryReaderReq = requires(
-        VariableDictionaryReaderType reader,
+template <typename VariableDictionaryReaderType>
+concept VariableDictionaryReaderReq = requires(
+        VariableDictionaryReaderType const& reader,
         variable_dictionary_id_t id,
         std::string_view variable,
         bool ignore_case,
         std::unordered_set<typename VariableDictionaryReaderType::Entry const*>& entries
 ) {

Likely an incorrect or invalid review comment.

Comment on lines +29 to +31
* @param id
* @return The value of the dictionary entry with the given Id.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Doc nits: capitalise “ID”; hyphenate “case‑insensitive”.

Pure comment fix.

- * @return The value of the dictionary entry with the given Id.
+ * @return The value of the dictionary entry with the given ID.
- * @param ignore_case Whether the search should be case insensitive.
+ * @param ignore_case Whether the search should be case-insensitive.
- * @param ignore_case Whether the search should be case insensitive.
+ * @param ignore_case Whether the search should be case-insensitive.

Also applies to: 39-39, 49-49

🤖 Prompt for AI Agents
In components/core/src/clp/VariableDictionaryReaderReq.hpp around lines 29-31
(also update lines 39 and 49), the comment text uses lowercase "Id" and "case
insensitive" — update these doc comments to use capitalized "ID" and hyphenated
"case-insensitive" respectively; make the same replacements in the other
indicated comment lines so all docstrings consistently read "ID" and
"case‑insensitive".

Comment on lines +42 to +45
{
reader.get_entry_matching_value(variable, ignore_case)
} -> std::same_as<std::vector<typename VariableDictionaryReaderType::Entry const*>>;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Naming consistency: function returns multiple entries but is named singular.

Consider aligning with the wildcard variant for symmetry, e.g., get_entries_matching_value(...). If the name is entrenched, leave as-is but note the plural return in docs.

🤖 Prompt for AI Agents
In components/core/src/clp/VariableDictionaryReaderReq.hpp around lines 42-45,
the function name get_entry_matching_value is singular while it returns multiple
entries; rename it to get_entries_matching_value to reflect the plural return
type (or, if renaming is not acceptable, update the function comment/docs to
explicitly state it returns a vector of entries). Update all declarations/usages
and any tests or docs to use the new name (or adjusted docs) to maintain naming
consistency with the wildcard variant.

@gibber9809 gibber9809 merged commit 16d95b1 into y-scope:main Sep 17, 2025
29 of 31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants