
Enhancement/4534 remove beats disk queue encryption #42643

Open
kaanyalti wants to merge 12 commits into main from enhancement/4534_remove_beats_disk_queue_encryption

Conversation

@kaanyalti (Contributor) commented Feb 7, 2025

  • Enhancement

Proposed commit message

Removed diskqueue encryption, compression, and the v2 implementation. The main goal of this PR is to remove diskqueue encryption, as it relies on an encryption function that is not implemented for Windows in Microsoft's Go fork, which we use for FIPS compliance. Along with encryption, this PR also removes compression and the v2 implementation of the diskqueue altogether.

Checklist

  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

There shouldn't be any user-impacting changes.

How to test this PR locally

Related issues

@botelastic bot added the needs_team (Indicates that the issue/PR needs a Team:* label) label on Feb 7, 2025
mergify bot commented Feb 7, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @kaanyalti? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fix up this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@kaanyalti force-pushed the enhancement/4534_remove_beats_disk_queue_encryption branch from 580f087 to c1e7aba on February 8, 2025 00:53
@kaanyalti marked this pull request as ready for review on February 8, 2025 02:05
@kaanyalti requested review from a team as code owners on February 8, 2025 02:06
@kaanyalti requested review from faec and leehinman on February 8, 2025 02:06
@kaanyalti added the Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team) label on Feb 8, 2025
@botelastic bot removed the needs_team (Indicates that the issue/PR needs a Team:* label) label on Feb 8, 2025
@elasticmachine (Collaborator) commented

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@leehinman (Contributor) left a comment

I think it would be good to leave the V0 & V1 pic and svg files in the docs directory; those are still current. Also, the on-disk-structures.md file in that directory needs to be updated.

Otherwise LGTM

@kaanyalti force-pushed the enhancement/4534_remove_beats_disk_queue_encryption branch from 0f1ef70 to 200dc1f on February 18, 2025 17:55
- There are currently 3 versions of the disk queue, and the current code
- base is able to write versions 1 & 2, while it is able to read version
- 0, 1, and 2.
+ There are currently 2 versions of the disk queue, and the current code
@kaanyalti (Contributor, Author) commented

@leehinman Could you please confirm that the information here is correct?

@kaanyalti requested a review from leehinman on February 18, 2025 18:34
@leehinman (Contributor) left a comment

LGTM

ENABLE_COMPRESSION // 0x2
ENABLE_PROTOBUF // 0x4
)
const segmentHeaderSize = 8
A contributor commented

I'm not sure about this; it looks like it will break if a preexisting segment file is opened with the new version, because it has no way to know how big the header is. A change that affects the size of the segment header requires incrementing the schema version and respecting the header sizes of previous versions on load, so we should either leave the header size as 12 and relabel the last 4 bytes as reserved space, or increment the schema version to 3 and update (*queueSegment).headerSize().
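For illustration, a minimal Go sketch of the per-version header sizing this comment describes; the constants, field names, and exact sizes here are assumptions for illustration, not the actual beats code:

```go
package diskqueue

// Hypothetical per-version header sizes; the real values may differ.
const (
	segmentHeaderSizeV0 = 0  // assumed: version 0 had no header
	segmentHeaderSizeV1 = 8  // assumed: 4-byte version + 4-byte frame count
	segmentHeaderSizeV2 = 12 // assumed: version + options + frame count
)

// queueSegment is a stand-in for the real type; only the field used here is shown.
type queueSegment struct {
	schemaVersion *uint32
}

// headerSize reports how many bytes to skip before the first frame, based on
// the schema version recorded for this segment file.
func (s *queueSegment) headerSize() uint64 {
	switch {
	case s.schemaVersion == nil || *s.schemaVersion == 0:
		return segmentHeaderSizeV0
	case *s.schemaVersion == 1:
		return segmentHeaderSizeV1
	default:
		return segmentHeaderSizeV2
	}
}
```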

- header, err := readSegmentHeader(file)
+ // We don't need the header contents here, we just want to advance past the
+ // header region, so discard the return value.
+ _, err = readSegmentHeader(file)
A contributor commented

If we don't need these contents, then this read can be skipped entirely; the reader loop already seeks to the beginning of the unprocessed data before reading.
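A hedged sketch of what skipping the header read could look like; openSegmentForRead and resumeOffset are hypothetical names invented here, not the actual diskqueue API:

```go
package diskqueue

import (
	"io"
	"os"
)

// openSegmentForRead illustrates the suggestion above: the reader can seek
// straight to its resume offset, which already lies past the header, so no
// readSegmentHeader call is needed.
func openSegmentForRead(path string, resumeOffset uint64) (*os.File, error) {
	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	// Seek directly to the first unprocessed byte instead of reading and
	// discarding the header.
	if _, err := file.Seek(int64(resumeOffset), io.SeekStart); err != nil {
		file.Close()
		return nil, err
	}
	return file, nil
}
```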

@@ -424,25 +356,34 @@ func readSegmentHeader(in io.Reader) (*segmentHeader, error) {
if err != nil {
return nil, fmt.Errorf("could not read segment version: %w", err)
}

if header.version > currentSegmentVersion {
A contributor commented

This looks like it will refuse to open any segments created with a previous build; we should still accept schema 2 as valid input, since it doesn't inherently require any of the encryption code that's being removed. (This is why I prefer incrementing the schema version rather than reverting to 1: it keeps the special cases restricted to headerSize(), which is explicitly a backwards-compatibility wrapper. But if we do revert the "current" schema to 1, then we will need to special-case this check and document, along with the current version, that version 2 is reserved and should be skipped if the schema ever changes again.)
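A rough sketch of the special-cased version check being suggested, assuming the "current" schema is reverted to 1; the helper name and error text are invented for illustration:

```go
package diskqueue

import "fmt"

const (
	currentSegmentVersion  = 1
	reservedSegmentVersion = 2 // schema written by older builds that supported encryption
)

// validateSegmentVersion rejects unknown future schemas but still accepts the
// reserved schema 2, so segments written by previous builds remain readable.
func validateSegmentVersion(version uint32) error {
	if version > currentSegmentVersion && version != reservedSegmentVersion {
		return fmt.Errorf("unrecognized disk queue segment version %d", version)
	}
	return nil
}
```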

@kaanyalti (Contributor, Author) commented

The main goal of this PR is to remove the encryption code. After discussing how to approach this, I decided to revert the PR that added encryption, which was also the PR that added v2. If the issue here is the reduced header size, then instead of reverting I can just remove encryption. In that case I may still have to fill the options with null bytes for v2.

@kaanyalti (Contributor, Author) commented

I just saw your comment below. I will then leave everything as is, remove encryption, and fill in the options with null bytes.

return err
}
err = binary.Write(out, binary.LittleEndian, frameCount)
return err
A contributor commented

If we do end up sticking with schema 2 instead of incrementing to 3, then we should probably still write out null bytes for the options here.
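A minimal sketch of writing a zeroed options field if the schema 2 layout is kept; the function name and exact field order are assumptions based on this thread, not the actual implementation:

```go
package diskqueue

import (
	"encoding/binary"
	"io"
)

// writeSegmentHeaderV2 keeps the assumed schema 2 on-disk layout while
// dropping encryption/compression: version, a zeroed 4-byte options field,
// then the frame count, preserving the 12-byte header existing readers expect.
func writeSegmentHeaderV2(out io.Writer, frameCount uint32) error {
	if err := binary.Write(out, binary.LittleEndian, uint32(2)); err != nil {
		return err
	}
	// Options are always zero now that the flag bits are gone, but the field
	// is still written so the header length does not change.
	if err := binary.Write(out, binary.LittleEndian, uint32(0)); err != nil {
		return err
	}
	return binary.Write(out, binary.LittleEndian, frameCount)
}
```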

@@ -73,7 +73,8 @@ def get_release_branches() -> typing.List[str]:
try:
release_branches = [branch for branch in resp["branches"]]
except KeyError:
fail_with_error(f'''Didn't find the excepted structure ["branches"] in the response [{resp}] from [{ACTIVE_BRANCHES_URL}]''')
fail_with_error(
f'''Didn't find the excepted structure ["branches"] in the response [{resp}] from [{ACTIVE_BRANCHES_URL}]''')
A contributor commented

nit: this is out of scope for this PR; do you think we could undo it? Changing this, due to the CODEOWNERS file, resulted in requesting a review from an additional team (ingest-eng-prod), which isn't otherwise required for this PR.

@@ -103,7 +104,8 @@ def generate_pipeline(pipelines_to_trigger: typing.List[str], branches: typing.L

target_branches = sorted(list(set(release_branches).difference(exclude_branches)))
if len(target_branches) == 0 or target_branches[0].isspace():
fail_with_error(f"Calculated target branches were empty! You passed EXCLUDE_BRANCHES={exclude_branches} and release branches are {release_branches} the difference of which results in {target_branches}.")
fail_with_error(
f"Calculated target branches were empty! You passed EXCLUDE_BRANCHES={exclude_branches} and release branches are {release_branches} the difference of which results in {target_branches}.")
A contributor commented

Same comment as above.
