Skip to content

Route badly interpolated dynamic index to DLQ #1084

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

andsel
Copy link
Contributor

@andsel andsel commented Sep 7, 2022

Release notes

Modified the resolution of dynamic index name so that in case of interpolation errors the event is sent to the DLQ. This feature could be explicitly disabled configuring properly the dlq_on_failed_indexname_interpolation.

What does this PR do?

Update the the pre-indexing steps, so to that in case of an IndexInterpolationError is raised during the index name resolution, the event is sent to DLQ. It leverarges the refactored #handle_dlq_status from PR #1080 to reuse the enqueuing logic.

Why is it important/What is the impact to the user?

Let the user to handle specifically the events which once indexed would create an ugly formatted name like %{[not_existing_field]}_index.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • run with
.ci/docker-setup.sh && INTEGRATION=true .ci/docker-run.sh
  • adds feature flag to disable this behavior. Track telemetry data to know how many uses this explicit disabling.

How to test this PR locally

  • enable the DLQ, set dead_letter_queue.enable: true in config/logstash.yml.
  • launch an Elasticsearch instance
  • execute a pipeline which wouldn't resolve the index name like:
input {
	generator {
		message => '{"name": "John", "surname": "Doe"}'
		count => 1
		codec => json
	}
}

output {
	elasticsearch {
		index => "%{[not_existing_field]}"
		hosts => "http://localhost:9200"
		user => "elastic"
		password => "changeme"
	}
}
  • then drain the DQL with another pipeline and verify it contains the previous failed event:
input {
  dead_letter_queue {
  	path => "/path_to/data/dead_letter_queue/"
  }
}

output {
  stdout {
    codec => rubydebug
  }

Related issues

Use cases

As a user I don't want that unresolved index names generates unexpected indexes and I would to reprocess such failed events with another DLQ pipeline.

@andsel andsel self-assigned this Sep 7, 2022
@andsel andsel changed the title Feature/dlq bad dynamic index replacements Route badly interpolated dynamic index to DLQ Sep 7, 2022
@andsel andsel force-pushed the feature/dlq_bad_dynamic_index_replacements branch from f8cf7af to 05e68bf Compare September 19, 2022 08:38
@andsel andsel marked this pull request as ready for review September 19, 2022 14:03
@andsel andsel force-pushed the feature/dlq_bad_dynamic_index_replacements branch from 4f4cbbc to 6a66e14 Compare September 19, 2022 14:34
@andsel andsel requested review from yaauie and mashhurs September 19, 2022 15:33
let(:dlq_writer) { double('DLQ writer') }
before { subject.instance_variable_set('@dlq_writer', dlq_writer) }

it "should doesn't create an index name with unresolved placeholders" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
it "should doesn't create an index name with unresolved placeholders" do
it "shouldn't create an index name with unresolved placeholders" do

@@ -1,3 +1,6 @@
## 11.9.0
- Feature: force unresolved dynamic index names to be sent into DLQ. This feature could be explicitly disabled using `dlq_on_failed_indexname_interpolation` setting [#1084](https://github.com/logstash-plugins/logstash-output-elasticsearch/pull/1084)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the explanation, for me dlq_on_unresolved_index_names would be good for me. We interpolate the index name to resolve it. The end goal seems to me resolving the index name.

* Value type is <<boolean,boolean>>
* Default value is `true`.

If enabled, failed index name interpolation events go into dead letter queue.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be rephrased if we align on dlq_on_unresolved_index_names.

@@ -362,11 +365,43 @@ def config_init(params)
# Receive an array of events and immediately attempt to index them (no buffering)
def multi_receive(events)
wait_for_successful_connection if @after_successful_connection_done
retrying_submit map_events(events)
events_mapped = safe_interpolation_map_events(events)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think if we reject index name unresolved events and only bring back successful events?
In my unmerged PR, we had a great discussion with Ry and agreed to work with successful/submittable events here, send to DLQ rejected events wen filtering.

failed_events << event_action_tuple
end
end
MapEventsResult.new(successful_events, failed_events)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MapEventsResult.new(successful_events, failed_events)
unless failed_events.empty?
@logger.error("Can't map some events, needs to be handled by DLQ #{failed_events}")
send_failed_resolutions_to_dlq(failed_events)
end
successful_events


log_level = dig_value(response, 'index', 'error', 'type') == 'invalid_index_name_exception' ? :error : :warn

handle_dlq_status(action, log_level, detailed_message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special thanks for including handle_dlq_response & handle_dlq_status!

Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

I'm not terribly fond of raising exceptions for flow control, but feel that the structure as-proposed here is simpler and much more clear than alternatives.

My only concern is conflicts with the also-in-flight #1080, which similarly pre-filters events whose action cannot be interpolated.

@mashhurs
Copy link
Contributor

Changes LGTM.

I'm not terribly fond of raising exceptions for flow control, but feel that the structure as-proposed here is simpler and much more clear than alternatives.

My only concern is conflicts with the also-in-flight #1080, which similarly pre-filters events whose action cannot be interpolated.

Initially I was thinking that seemed to me easily include my #1080 changes but it takes sometime to make an adjustment (especially unit test cases). Feel free to go first @andsel !

@yaauie yaauie merged commit b401631 into logstash-plugins:main Sep 20, 2022
@yaauie
Copy link
Contributor

yaauie commented Sep 20, 2022

Merged and published:

Published logstash-output-elasticsearch-11.9.0-java.gem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DLQ when sprintf references a missing field
4 participants