@nielsbauman
Contributor

Makes the execution and use of enrich policies project-aware.
Note: this does not make the enrich cache project-aware. That is to be handled in a follow-up PR.

@nielsbauman nielsbauman added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Mar 5, 2025
@nielsbauman nielsbauman requested a review from a team as a code owner March 5, 2025 13:18
@nielsbauman nielsbauman requested a review from a team March 5, 2025 13:18
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team v9.1.0 labels Mar 5, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Member

@ywangd ywangd left a comment

LGTM

I don't fully understand how ingest processors work, but the changes related to passing the project ID make sense to me. I think it might be worthwhile to create a placeholder ticket to remind the teams that own the processors to review whether the new project-id parameter should be leveraged. It is probably also worth a separate ticket of a similar nature for Logstash, since it is a separate product.

* @param searchResponseFetcher The function used to compute the value to be put in the cache, if there is no value in the cache already
* @param listener A listener to be notified of the value in the cache
*/
@FixForMultiProject(description = "The enrich cache will currently leak data between projects. We need to either disable or fix it.")
Member

This seems pretty serious if it happens. I suggest we create a Jira issue to track it for better visibility.

Contributor Author

I already created ES-10936 and put it on the agenda for today's weekly Data Management team meeting :)
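
For readers following this thread, the kind of leak flagged by the `@FixForMultiProject` annotation can be sketched as follows. This is a minimal illustration, not the actual enrich cache code: it assumes a cache whose key omits the project ID, and every class, record, and method name in it (`ProjectAwareCacheSketch`, `UnscopedKey`, `ScopedKey`, `lookupUnscoped`, `lookupScoped`) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class ProjectAwareCacheSketch {
    // Hypothetical key WITHOUT a project ID: two projects that use the same
    // enrich index name and query end up sharing one cache entry.
    record UnscopedKey(String enrichIndex, String query) {}

    // Hypothetical key WITH a project ID: entries are kept separate per project.
    record ScopedKey(String projectId, String enrichIndex, String query) {}

    private final Map<UnscopedKey, String> unscoped = new HashMap<>();
    private final Map<ScopedKey, String> scoped = new HashMap<>();

    // The project ID is accepted but ignored when building the key.
    String lookupUnscoped(String projectId, String enrichIndex, String query, Supplier<String> fetcher) {
        return unscoped.computeIfAbsent(new UnscopedKey(enrichIndex, query), k -> fetcher.get());
    }

    // The project ID is part of the key, so a hit can only come from the same project.
    String lookupScoped(String projectId, String enrichIndex, String query, Supplier<String> fetcher) {
        return scoped.computeIfAbsent(new ScopedKey(projectId, enrichIndex, query), k -> fetcher.get());
    }

    public static void main(String[] args) {
        ProjectAwareCacheSketch cache = new ProjectAwareCacheSketch();

        cache.lookupUnscoped("project-a", "users", "id:1", () -> "data of A");
        // project-b receives project-a's cached value: this is the leak.
        System.out.println(cache.lookupUnscoped("project-b", "users", "id:1", () -> "data of B")); // data of A

        cache.lookupScoped("project-a", "users", "id:1", () -> "data of A");
        // With the project ID in the key, project-b gets its own value.
        System.out.println(cache.lookupScoped("project-b", "users", "id:1", () -> "data of B")); // data of B
    }
}
```

Adding the project ID to the key is only one possible fix. The annotation itself leaves open whether the cache should be fixed or disabled.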

@nielsbauman
Contributor Author

nielsbauman commented Mar 6, 2025

@ywangd, most processors work within the scope of a single document and thus won't need a project ID. There are some exceptions, such as the enrich and pipeline processors, among others. I think that in each of those cases, they'll at some point call a method that uses Metadata#getProject(). As we work through the list of Metadata#getProject() usages, we will eventually run into the ones that are used by processors, which will lead us to update those processors to make use of the project ID.

However, some parts are perhaps more difficult to identify (e.g. the enrich cache I mentioned), so I'll add a ticket to do a final check that none of the processors leak information between projects. I opened ES-11061.
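
As a rough illustration of the distinction drawn in the first paragraph of this comment, the sketch below shows a lookup that takes the project ID explicitly instead of relying on one implicit project. It is not Elasticsearch code: the class `ProjectScopedLookupSketch`, the method `resolveEnrichIndex`, the nested-map stand-in for per-project metadata, and the policy and index names are all assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;

public class ProjectScopedLookupSketch {
    // Hypothetical stand-in for per-project metadata:
    // project ID -> (enrich policy name -> enrich index name).
    private final Map<String, Map<String, String>> policiesByProject;

    ProjectScopedLookupSketch(Map<String, Map<String, String>> policiesByProject) {
        this.policiesByProject = policiesByProject;
    }

    // A processor that reads shared state passes the project ID explicitly,
    // so a policy defined in one project is never resolved for another.
    Optional<String> resolveEnrichIndex(String projectId, String policyName) {
        return Optional.ofNullable(policiesByProject.get(projectId))
            .map(policies -> policies.get(policyName));
    }

    public static void main(String[] args) {
        var lookup = new ProjectScopedLookupSketch(Map.of(
            "project-a", Map.of("users-policy", ".enrich-users-a"),
            "project-b", Map.of("users-policy", ".enrich-users-b")
        ));

        // Same policy name, different projects, different results.
        System.out.println(lookup.resolveEnrichIndex("project-a", "users-policy")); // Optional[.enrich-users-a]
        System.out.println(lookup.resolveEnrichIndex("project-b", "users-policy")); // Optional[.enrich-users-b]
        // An unknown project resolves to nothing rather than falling back to another project.
        System.out.println(lookup.resolveEnrichIndex("project-c", "users-policy")); // Optional.empty
    }
}
```

A processor that only transforms the document in front of it would not need such a parameter at all, which matches the point that most processors are unaffected.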

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Mar 6, 2025
@nielsbauman nielsbauman enabled auto-merge (squash) March 6, 2025 16:45
@nielsbauman nielsbauman disabled auto-merge March 6, 2025 16:45
@nielsbauman nielsbauman enabled auto-merge (squash) March 6, 2025 17:18
@nielsbauman nielsbauman merged commit 20e186a into elastic:main Mar 6, 2025
16 of 17 checks passed
@nielsbauman nielsbauman deleted the mp-enrich branch March 6, 2025 18:20
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Mar 11, 2025
costin pushed a commit to costin/elasticsearch that referenced this pull request Mar 15, 2025
Labels

:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >non-issue serverless-linked Added by automation, don't add manually Team:Data Management Meta label for data/management team v9.1.0

3 participants