How should output-elasticsearch deal with a missing @timestamp? #779

Open
@webmat

Description

Events without @timestamp can be difficult to process in certain situations. According to code that dates back to 2013 (and probably before), its presence should be guaranteed. Logstash has never enforced this by preventing the removal of @timestamp per se, however.

So whenever a problem arises from a missing @timestamp, people may simply see a cryptic error like the following and conclude that there's a bug in Logstash:

```
[2018-02-22T13:49:42,425][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<LogStash::Error: timestamp field is missing>, :backtrace=>[
  "org/logstash/ext/JrubyEventExtLibrary.java:205:in `sprintf'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:169:in `event_action_params'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:44:in `event_action_tuple'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:38:in `multi_receive'",
  "org/jruby/RubyArray.java:2414:in `map'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:38:in `multi_receive'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:13:in `multi_receive'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/output_delegator.rb:49:in `multi_receive'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/pipeline.rb:436:in `output_batch'",
  "org/jruby/RubyHash.java:1342:in `each'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/pipeline.rb:435:in `output_batch'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/pipeline.rb:381:in `worker_loop'",
  "/Users/Gabriel/Documents/ElasticSearch/logstash-5.6.2/logstash-core/lib/logstash/pipeline.rb:342:in `start_workers'"]}
```

One point of view about this is that the event should actually be considered invalid, and that whatever led to this situation is likely to cause data loss.

There are two schools of thought, or approaches we can take, for these events:

  1. We can try to support these events as best as possible
  2. We can take a hard line and not try to support these events, and assume that the customer's configuration has a problem that will lead to data loss. We can adjust the level of "severity" we take with this approach.

In either case, in order to prevent needless confusion and support requests, we should at the very least output a log message that's a bit more verbose than the one above and explains what the problem actually is: the event is missing @timestamp, and the customer is trying to index it into a timestamped index (like logstash-%{+YYYY.MM.dd}).

Are non-timestamped events legitimate?

The current behaviour (unless the above situation happens) is to successfully index the event. I would argue there's a legitimate use case for removing @timestamp in Logstash pipelines. It's counterintuitive, because Logstash was born to process time-based data. But once someone has a Logstash installation, Logstash is a great way to periodically ingest non time-based data into their Elasticsearch cluster.

Examples of non time-based data to reimport regularly:

  • zip codes (or other regional databases)
  • IP geolocation databases
  • other kinds of CSV data dumps

Another supporting argument for the legitimacy of non time-based support in Logstash is the fact that our plugins are gradually getting better at supporting non time-sensitive scenarios:

  • More explicit support in input-file for reading full files, rather than tailing them (CSV imports)
  • We've had requests (#9) and attempts (filter-lrucache) at making filter-elasticsearch able to cache static datasets in memory, to achieve behaviour similar to filter-jdbc_static. This is a sign that people have static datasets in ES.

The last point demonstrating the legitimacy of processing non-timestamped data with Logstash is the very first Logstash configuration one sees during the Elasticsearch training:
Remove noise from Logstash events
Here we're ingesting a bunch of data, but we're not interested in keeping @timestamp (among other fields).
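
A sketch of what that cleanup could look like in a filter block (the field list is illustrative, not the actual training config):

```
filter {
  mutate {
    # Drop fields we don't need when importing a static dataset.
    # The exact list here is illustrative only.
    remove_field => ["@timestamp", "@version", "host", "message"]
  }
}
```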

Approach 1: Support these events as best as possible

The current behaviour is that events with no @timestamp work in most cases. People can do this, and are doing it. There are still some improvements we can make:

  • Output a more explicit message when someone tries to index an event without a @timestamp to a timestamped index.
    • Part of #777 was about putting in place this clearer messaging.
  • Instead of a full Logstash crash (current behaviour), output an ERROR for every event in this situation, and keep Logstash running.
    • #777 implements this behaviour
  • If the customer has DLQ enabled, we could even send the event there. Perhaps their timestamp is just named incorrectly and they may be able to reprocess these events without any other disruption.
    • One could argue that the DLQ may fill up before people notice this & so on, and they may lose data eventually.
    • Idea floated in some form in #718 and tracked as part of a bigger plan in #772
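
To make the first two bullets concrete, here's a rough sketch of a friendlier index-name resolution, using a plain Ruby Hash to stand in for a Logstash event. The function name and error wording are assumptions for illustration, not the plugin's actual code:

```ruby
require "time"

# Sketch: resolve the target index for an event, returning a descriptive
# per-event error instead of crashing the whole pipeline (approach 1).
def resolve_index(event, pattern)
  # Only timestamped patterns (e.g. "logstash-%{+YYYY.MM.dd}") need @timestamp.
  return pattern unless pattern.include?("%{+")

  ts = event["@timestamp"]
  unless ts
    # A clear ERROR the output could log (and optionally route to the DLQ),
    # rather than a FATAL "timestamp field is missing" crash.
    return [:error, "Event has no @timestamp, but index pattern " \
                    "#{pattern.inspect} requires one. Was @timestamp removed " \
                    "by a filter?"]
  end
  # Greatly simplified date substitution, for illustration only.
  pattern.sub(/%\{\+[^}]+\}/, Time.parse(ts).strftime("%Y.%m.%d"))
end
```

With this shape, one bad event produces one explicit error instead of taking down the worker loop.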

Approach 2: Interpret this as a problem with the user's configuration and crash

A lot of the feedback around #777 was not about the PR itself, but went in the direction that "not having a @timestamp on the event" was not officially supported. So this definitely struck a chord with a few people :-)

In any case, nobody denies that we can improve the error message when the timestamp error happens.

Now, do we want to take a harder line to the effect that we do not want to support a missing @timestamp at all? If so, the current behaviour is not strict enough. We may want to take the following additional steps:

  • Pre-emptively refuse any event without @timestamp as soon as we see them (whether or not the timestamp is needed).
  • Eventually add a breaking change to Logstash, preventing the removal of @timestamp.
  • Until this is prevented, should we always crash on any such event? (regardless of whether the event is problematic)
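
The pre-emptive refusal in the first bullet could look something like this sketch (again using a plain Hash to stand in for an event; not actual plugin code):

```ruby
# Sketch of approach 2: reject any event without @timestamp up front,
# whether or not the index pattern actually needs it.
def validate_timestamp!(event)
  unless event.key?("@timestamp")
    raise ArgumentError,
          "Event has no @timestamp; refusing to process it. Check your " \
          'filters for something like `remove_field => ["@timestamp"]`.'
  end
  event
end
```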

1+2=3: Let users decide

Any given user may really want Logstash to behave like 1, or to behave like 2. Presumably, the default could be the friendlier approach 1, and people who take great pains to construct a pipeline that has the least data loss possible (e.g. use only non-lossy transports & plugins, careful use of PQ & DLQ) could optionally decide that such events should indeed cause a hard crash.
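
In pipeline config terms, this could be a per-output setting. Note the option name below is purely hypothetical, not an existing setting of output-elasticsearch:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # Hypothetical option: "warn" logs and keeps running (approach 1),
    # "fail" hard-crashes on any event without @timestamp (approach 2).
    missing_timestamp_action => "warn"
  }
}
```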

Why this discussion?

This issue came out of the discussion in #777, which in turn came from reports such as #739. Requesting feedback on #777 in Slack immediately led to discussions of moving towards approach 2.
