Skip to content

Conversation

SharafMohamed
Copy link
Contributor

@SharafMohamed SharafMohamed commented Apr 14, 2025

Description

Bump Log Surgeon version so that we can compiling on Ubuntu 24.04.

We do not bump to the newest version because currently CLP build will break upon using fmtlib in Log Surgeon. We will do so after we switch to use explicit instantiation in Log Surgeon.

Validation performed

Existing unit-tests pass.

Summary by CodeRabbit

  • Chores
    • Updated the project to use the latest version of an essential logging library, ensuring enhanced compatibility and stability.
  • Documentation
    • Revised dependency version details and source URLs in the documentation to reflect the recent update.

@SharafMohamed SharafMohamed requested a review from a team as a code owner April 14, 2025 07:28
Copy link
Contributor

coderabbitai bot commented Apr 14, 2025

Walkthrough

This pull request updates the commit identifier for the log-surgeon submodule from 895f46489b1911ab3b3aac3202afd56c96e8cd98 to f801a3f369bb14919d82a928d9a3812006245aa0. Corresponding dependency information in the task file and the developer guide documentation has also been updated to reference the new commit version.

Changes

Files Change Summary
components/core/submodules/log-surgeon, deps-tasks.yml, docs/src/dev-guide/components-core/index.md Updated log-surgeon commit from 895f46489b1911ab3b3aac3202afd56c96e8cd98 to f801a3f369bb14919d82a928d9a3812006245aa0. Adjusted dependency source name, URL, and documentation reference accordingly.

Suggested reviewers

  • kirkrodrigues

Tip

⚡💬 Agentic Chat (Pro Plan, General Availability)
  • We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 64d3306 and 4ac08b0.

📒 Files selected for processing (2)
  • deps-tasks.yml (1 hunks)
  • docs/src/dev-guide/components-core/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • docs/src/dev-guide/components-core/index.md
  • deps-tasks.yml

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (9)
components/core/config/schemas.txt (9)

2-3: Suggestion: Escape Literal Dot in ISO Timestamp Pattern
The regex on line 3

timestamp:\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d{3}

uses an unescaped dot before the millisecond section. In regular expressions, an unescaped dot matches any character. If the intent is to match a literal period (as in the example "2015-01-31T15:50:45.392"), please escape it using \..

Proposed diff:

-timestamp:\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d{3}
+timestamp:\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}

13-14: Suggestion: Escape Dot in Space-Separated Timestamp Pattern
On line 14, the regex

timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}.\d{3}

represents a timestamp where a literal period separates the seconds from the milliseconds. To ensure that only a period is matched (not any character), please escape the dot as \..

Proposed diff:

-timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}.\d{3}
+timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}

20-22: Note: Double Spacing in Timestamp Pattern
The regex on line 22

timestamp:\d{4}\-\d{2}\-\d{2}  \d{2}:\d{2}:\d{2}

features two spaces between the date and time. Please confirm that this double spacing is intentional to match logs (e.g. "Start-Date: 2015-01-31 15:50:45").


27-28: Suggestion: Clarify Whitespace Handling for Hour Field
The pattern on line 28

timestamp:\d{2}\d{2}\d{2} [ 0-9]{2}:\d{2}:\d{2}

uses the character class [ 0-9]{2} to capture the hour portion. This likely aims to allow a leading space for single-digit hours. Consider either adding a comment explaining this design or refactoring to use an optional whitespace token (e.g. \s?) for improved clarity.


31-32: Suggestion: Verify Hour Format in 12-Hour Timestamp
The regex on line 32

timestamp:[A-Z][a-z]{2} \d{2}, \d{4} [ 0-9]{2}:\d{2}:\d{2} [AP]M

utilizes [ 0-9]{2} for the hour field, which may include a leading space for single-digit hours. Please verify that this behavior is consistent with log expectations.


35-37: Observation: Comment Examples with Unbalanced Brackets
The example comments on lines 35–37 (e.g.,

// E.g. E [31/Jan/2015:15:50:45
// E.g. localhost - - [01/Jan/2016:15:50:17
// E.g. 192.168.4.5 - - [01/Jan/2016:15:50:17
```) display an opening bracket without an apparent closing bracket. While these are comments, please ensure they accurately reflect the log formats the regexes are intended to match, or add a clarifying note if the missing bracket is deliberate.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary>

[typographical] ~35-~35: Unpaired symbol: ']' seems to be missing
Context: ...-z]+ \d{2}, \d{4} \d{2}:\d{2} // E.g. E [31/Jan/2015:15:50:45 // E.g. localhost -...

(UNPAIRED_BRACKETS)

---

[typographical] ~36-~36: Unpaired symbol: ']' seems to be missing
Context: ...Jan/2015:15:50:45 // E.g. localhost - - [01/Jan/2016:15:50:17 // E.g. 192.168.4.5...

(UNPAIRED_BRACKETS)

---

[typographical] ~37-~37: Unpaired symbol: ']' seems to be missing
Context: ...n/2016:15:50:17 // E.g. 192.168.4.5 - - [01/Jan/2016:15:50:17 timestamp:\[\d{2}/[...

(UNPAIRED_BRACKETS)

</details>

</details>

---

`43-44`: **Suggestion: Validate Triple “<” Pattern Usage**  
The regex on line 44  

timestamp:<<<\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}

matches timestamps prefixed with three `<` characters. This is an unusual pattern—please verify that the use of three `<` is intentional and that it complies with the expected log format.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary>

[uncategorized] ~43-~43: Pentru limba română se folosește simbolul "«".
Context: ...0-9]{2} \d{2}:\d{2}:\d{2} \d{4} // E.g. <<<2016-11-10 03:02:29:936 timestamp:\<\<\...

(GHILIMELE_DUBLE_INTERIOR_INCEPUT)

</details>

</details>

---

`47-48`: **Suggestion: Escape Literal Dot in Short Timestamp Pattern**  
On line 48, the regex  

timestamp:\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}

includes an unescaped dot before the millisecond section. For precision (to match a literal period as in "01-21 11:56:42.392"), please consider escaping the dot.  
   
*Proposed diff:*  
```diff
-timestamp:\d{2}\-\d{2} \d{2}:\d{2}:\d{2}.\d{3}
+timestamp:\d{2}\-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}

49-50: Suggestion: Escape Literal Dot in Extended Timestamp Pattern
Similarly, on line 50, the regex

timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}.\d{6}

should use an escaped dot (\.) if the period is meant to be literal in the timestamp (e.g., "2016-05-08 11:34:04.083464").

Proposed diff:

-timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}.\d{6}
+timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e7f63c and 0cdc934.

📒 Files selected for processing (2)
  • components/core/config/schemas.txt (1 hunks)
  • components/core/submodules/log-surgeon (1 hunks)
🧰 Additional context used
🪛 LanguageTool
components/core/config/schemas.txt

[typographical] ~6-~6: Unpaired symbol: ']' seems to be missing
Context: ...-\d{2}T\d{2}:\d{2}:\d{2},\d{3} // E.g. [2015-01-31T15:50:45 timestamp:[\d{4}-...

(UNPAIRED_BRACKETS)


[typographical] ~7-~7: Unpaired symbol: ']' seems to be missing
Context: ...// E.g. [2015-01-31T15:50:45 timestamp:[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} //...

(UNPAIRED_BRACKETS)


[typographical] ~35-~35: Unpaired symbol: ']' seems to be missing
Context: ...-z]+ \d{2}, \d{4} \d{2}:\d{2} // E.g. E [31/Jan/2015:15:50:45 // E.g. localhost -...

(UNPAIRED_BRACKETS)


[typographical] ~36-~36: Unpaired symbol: ']' seems to be missing
Context: ...Jan/2015:15:50:45 // E.g. localhost - - [01/Jan/2016:15:50:17 // E.g. 192.168.4.5...

(UNPAIRED_BRACKETS)


[typographical] ~37-~37: Unpaired symbol: ']' seems to be missing
Context: ...n/2016:15:50:17 // E.g. 192.168.4.5 - - [01/Jan/2016:15:50:17 timestamp:[\d{2}/[...

(UNPAIRED_BRACKETS)


[typographical] ~38-~38: Unpaired symbol: ']' seems to be missing
Context: ....5 - - [01/Jan/2016:15:50:17 timestamp:[\d{2}/[A-Z][a-z]{2}/\d{4}:\d{2}:\d{2}:\d...

(UNPAIRED_BRACKETS)


[typographical] ~39-~39: Unpaired symbol: ']' seems to be missing
Context: ...{2}:\d{2}:\d{2} // E.g. 192.168.4.5 - - [01/01/2016:15:50:17 timestamp:[\d{2}/\d...

(UNPAIRED_BRACKETS)


[typographical] ~40-~40: Unpaired symbol: ']' seems to be missing
Context: ...4.5 - - [01/01/2016:15:50:17 timestamp:[\d{2}/\d{2}/\d{4}:\d{2}:\d{2}:\d{2} // E...

(UNPAIRED_BRACKETS)


[uncategorized] ~43-~43: Pentru limba română se folosește simbolul "«".
Context: ...0-9]{2} \d{2}:\d{2}:\d{2} \d{4} // E.g. <<<2016-11-10 03:02:29:936 timestamp:<<...

(GHILIMELE_DUBLE_INTERIOR_INCEPUT)


[duplication] ~52-~52: Posibilă greșeală: ați repetat un cuvânt
Context: ...d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{6} // Delimiters delimiters: \t\r\n!"#$%&'()*,:;<>?@[]^_`{|...

(WORD_REPEAT_RULE)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build (macos-latest)
  • GitHub Check: lint-check (macos-latest)
  • GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (16)
components/core/config/schemas.txt (15)

4-5: Review: Comma-Separated Timestamp Pattern
The regex on line 5

timestamp:\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2},\d{3}

correctly uses a comma as the millisecond separator. No changes are required here.


6-7: Note: Unbalanced Brackets in Example Pattern
The regex on line 7

timestamp:\[\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}

starts with an escaped [ but does not include a closing bracket. This appears to be intentional to match logs that start with [ (as the example on line 6 shows “E.g. [2015-01-31T15:50:45”). Please verify that the absence of a closing bracket is desired rather than a typo.

🧰 Tools
🪛 LanguageTool

[typographical] ~6-~6: Unpaired symbol: ']' seems to be missing
Context: ...-\d{2}T\d{2}:\d{2}:\d{2},\d{3} // E.g. [2015-01-31T15:50:45 timestamp:[\d{4}-...

(UNPAIRED_BRACKETS)


[typographical] ~7-~7: Unpaired symbol: ']' seems to be missing
Context: ...// E.g. [2015-01-31T15:50:45 timestamp:[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} //...

(UNPAIRED_BRACKETS)


8-9: Approval: Bracket-Enclosed Compact Timestamp
The regex on line 9

timestamp:\[\d{4}\d{2}\d{2}\-\d{2}:\d{2}:\d{2}\]

properly matches a timestamp enclosed in literal square brackets.


11-12: Approval: Timestamp with Space and Millisecond (Comma-Separated)
The regex on line 12

timestamp:\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}

accurately captures the intended timestamp format.


15-16: Approval: Bracket-Enclosed Timestamp with Comma-Separated Milliseconds
The regex on line 16

timestamp:\[\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]

is correctly formed and matches the intended format.


17-19: Observation: Expanded Example Comments
Lines 17–19 provide additional example log lines. These comments enhance clarity regarding the variety of timestamp contexts that may be encountered.


23-24: Approval: Slash-Separated Date Timestamp Pattern
The regex on line 24

timestamp:\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}

is clear and aligns with the provided example.


25-26: Approval: Short Year Format Timestamp Pattern
The regex on line 26

timestamp:\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}

matches the intended abbreviated date format.


29-30: Approval: Timestamp with Month Name and Milliseconds
The regex on line 30

timestamp:\d{2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2},\d{3}

adequately captures the format demonstrated in the comment example.


33-34: Approval: Full Month Name Timestamp Pattern
The regex on line 34

timestamp:[A-Z][a-z]+ \d{2}, \d{4} \d{2}:\d{2}

clearly captures timestamps with a full month name.


38-40: Observation: Missing Closing Bracket in Specific Timestamp Patterns
The regexes on lines 38 and 40

timestamp:\[\d{2}/[A-Z][a-z]{2}/\d{4}:\d{2}:\d{2}:\d{2}

and

timestamp:\[\d{2}/\d{2}/\d{4}:\d{2}:\d{2}:\d{2}

do not include a closing bracket, which follows the pattern seen in several other examples. Confirm that this is intentional based on the log format, or consider appending \] if a closing bracket is consistently expected.

🧰 Tools
🪛 LanguageTool

[typographical] ~38-~38: Unpaired symbol: ']' seems to be missing
Context: ....5 - - [01/Jan/2016:15:50:17 timestamp:[\d{2}/[A-Z][a-z]{2}/\d{4}:\d{2}:\d{2}:\d...

(UNPAIRED_BRACKETS)


[typographical] ~39-~39: Unpaired symbol: ']' seems to be missing
Context: ...{2}:\d{2}:\d{2} // E.g. 192.168.4.5 - - [01/01/2016:15:50:17 timestamp:[\d{2}/\d...

(UNPAIRED_BRACKETS)


[typographical] ~40-~40: Unpaired symbol: ']' seems to be missing
Context: ...4.5 - - [01/01/2016:15:50:17 timestamp:[\d{2}/\d{2}/\d{4}:\d{2}:\d{2}:\d{2} // E...

(UNPAIRED_BRACKETS)


41-42: Approval: Error Log Timestamp Pattern
The regex on line 42

timestamp:[A-Z][a-z]{2} [A-Z][a-z]{2} [ 0-9]{2} \d{2}:\d{2}:\d{2} \d{4}

effectively matches timestamps embedded in error logs.


45-46: Approval: Short Month Abbreviation Timestamp Pattern
The regex on line 46

timestamp:[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}

correctly captures the abbreviated month format as intended.


52-53: Approval: Delimiters Update
The delimiters line on 53

delimiters: \t\r\n!"#$%&'\(\)\*,:;<>?@\[\]\^_`\{\|\}~=

has been updated to include the equal sign. This change is consistent with the updated dictionary variable parsing requirements. Please verify that the inclusion of = in the delimiters is intended.

🧰 Tools
🪛 LanguageTool

[duplication] ~52-~52: Posibilă greșeală: ați repetat un cuvânt
Context: ...d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{6} // Delimiters delimiters: \t\r\n!"#$%&'()*,:;<>?@[]^_`{|...

(WORD_REPEAT_RULE)


59-63: Approval: Enhanced Dictionary Variable Regex
The updated dictionary variable pattern on line 61

equals:.*=(?<val>.*[a-zA-Z0-9].*)

incorporates a named capture group (?<val>) for the value, which improves readability and downstream processing. This is a clear improvement over the previous version.

components/core/submodules/log-surgeon (1)

1-1: Subproject Commit Update Confirmed

The submodule commit has been updated to f801a3f369bb14919d82a928d9a3812006245aa0 to support Ubuntu 24.04. This update matches the changes in the configuration schema (including new timestamp regex patterns) mentioned in the PR. Please ensure that the new submodule version is validated on Ubuntu 24.04 to confirm compatibility.

Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also update the URL & hash in deps-tasks.yml?

Copy link
Contributor

@Bill-hbrhbr Bill-hbrhbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor request: can you add <cstdint> to Lexer.tpp and <map> to Token.hpp like in this PR? These are creating some issues on my Ubuntu 20.04 WSL1 build. I don't think adding these break the code so it would be a great help if you can.

Also for bumping the version of a dep you can reference this PR

@SharafMohamed
Copy link
Contributor Author

SharafMohamed commented Apr 14, 2025

Minor request: can you add <cstdint> to Lexer.tpp and <map> to Token.hpp like in this PR? These are creating some issues on my Ubuntu 20.04 WSL1 build. I don't think adding these break the code so it would be a great help if you can.

Also for bumping the version of a dep you can reference this PR

Token.hpp doesn't use <map> does it?

@Bill-hbrhbr
Copy link
Contributor

Minor request: can you add <cstdint> to Lexer.tpp and <map> to Token.hpp like in this PR? These are creating some issues on my Ubuntu 20.04 WSL1 build. I don't think adding these break the code so it would be a great help if you can.
Also for bumping the version of a dep you can reference this PR

Token.hpp doesn't use <map> does it?

@SharafMohamed sorry. scratch my PR request. It compiles now. I must have been using an older version.

kirkrodrigues
kirkrodrigues previously approved these changes Apr 14, 2025
Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

build(deps): Update log-surgeon to the clp-ubuntu-24.04-support branch.

@kirkrodrigues
Copy link
Member

Actually, there are conflicts you'll need to resolve.

@SharafMohamed SharafMohamed changed the title fix: Ubuntu 24.04 Log Surgeon support. build(deps): Update log-surgeon to the clp-ubuntu-24.04-support branch Apr 14, 2025
@SharafMohamed SharafMohamed changed the title build(deps): Update log-surgeon to the clp-ubuntu-24.04-support branch build(deps): Update log-surgeon to the clp-ubuntu-24.04-support branch. Apr 14, 2025
@kirkrodrigues kirkrodrigues merged commit e3d1b24 into y-scope:main Apr 15, 2025
20 checks passed
anlowee pushed a commit to anlowee/clp that referenced this pull request Apr 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants