Error while trying to send logs threw elasticsearch #44724

Closed
1 of 2 tasks
benjamindinh-loreal opened this issue Dec 6, 2024 · 10 comments · Fixed by #45263

Comments

@benjamindinh-loreal

benjamindinh-loreal commented Dec 6, 2024

Apache Airflow version

2.10.3

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Hello all, hope you are doing well.

Sending logs to Elasticsearch directly through the Elasticsearch adapter (configured in the Airflow conf) does not work.
--> Today we go through a file share (Azure) mounted as a PV inside K8S, then a Logstash pipeline, but it costs a lot per year ...

I think the adapter is broken: even when testing the connection directly from the webserver, we get an error:

'ESConnection' object has no attribute 'close'

With remote logging enabled, Airflow does not try to send logs to Elasticsearch, and it cannot connect to Elasticsearch when trying to read logs back.

When looking at the logs of a task inside a DAG, we also get this error:

elasticsearch.AuthenticationException: AuthenticationException(401, 'security_exception', 'missing authentication credentials for REST request [/airflow-logs-*/_count]')
airflow-webserver-1 | 172.18.0.1 - - [06/Dec/2024:11:26:46 +0000] "GET /api/v1/dags/debug_airflow_to_elastic/dagRuns/manual__2024-12-06T11:26:37.042642+00:00/taskInstances/print_debug_message/logs/1?full_content=false HTTP/1.1" 500 1588 "http://localhost:8080/dags/debug_airflow_to_elastic/grid?dag_run_id=manual__2024-12-06T11%3A26%3A37.042642%2B00%3A00&task_id=print_debug_message&tab=logs" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"

I tried with an API key, with a username and password, and even with both, but could not get rid of the error.

I think there is a bug here, unless we did something wrong.

Many thanks!

Benjamin

What you think should happen instead?

No response

How to reproduce

Add a connection to Elasticsearch and try to connect to it.

Then enable remote logging in the configuration.

Airflow does not try to send logs and cannot connect to Elasticsearch.

Operating System

Kubernetes and Docker Compose (neither works).

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

[core]
sensitive_var_conn_names = key,login,secret,pass,auth
hide_sensitive_var_conn_fields = True
max_map_length = 16396
expose_config = non-sensitive-only
load_examples = False
test_connection = Enabled

[webserver]
show_trigger_form_if_no_params = True
allow_testing_connections = Enabled

[logging]
remote_logging = True
remote_log_conn_id = elasticsearch_default
logging_level = INFO

[elasticsearch]
host = ************************************
write_stdout = True
json_format = True
index_patterns = airflow-logs-*

[elasticsearch_config]
verify_certs=False

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

benjamindinh-loreal added the area:core, kind:bug, and needs-triage labels Dec 6, 2024

boring-cyborg bot commented Dec 6, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

dosubot bot added the area:logging label Dec 6, 2024
@eladkal
Contributor

eladkal commented Dec 15, 2024

cc @Owen-CH-Leung wdyt?

@Owen-CH-Leung
Contributor

From your error log, it seems that the elasticsearch cluster has a security setup to prevent unauthorised access from your k8s cluster. The AuthenticationException is a clear indication.

I'd advise starting by testing connectivity outside of Airflow to narrow down the root cause. For example, try running a standalone Python script inside the same Kubernetes cluster that hosts your Airflow environment. In that script, use the official elasticsearch-py (https://github.com/elastic/elasticsearch-py) client library to connect to your Elasticsearch cluster and try something like es_client.ping(). Make sure to experiment with SSL-related parameters such as verify_certs and ca_certs until you can reliably connect.

Once you've confirmed that your Python script can successfully interact with Elasticsearch, you can mirror those working settings in your airflow.cfg (e.g., by adjusting the Elasticsearch configuration sections) and restart Airflow.
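
A minimal sketch of such a standalone check is below. It assumes the 8.x elasticsearch-py client, where credentials are passed as api_key or basic_auth; older clients use different argument names. The helper names build_client_kwargs and check_connection and the example host are hypothetical, not part of Airflow or of the client library.

```python
"""Standalone Elasticsearch connectivity check, meant to run outside Airflow."""


def build_client_kwargs(host, api_key=None, username=None, password=None,
                        ca_certs=None, verify_certs=True):
    """Collect keyword arguments for elasticsearch.Elasticsearch (8.x names)."""
    kwargs = {"hosts": [host], "verify_certs": verify_certs}
    if api_key:
        # An API key takes precedence over username/password in this sketch.
        kwargs["api_key"] = api_key
    elif username and password:
        kwargs["basic_auth"] = (username, password)
    if ca_certs:
        kwargs["ca_certs"] = ca_certs
    return kwargs


def check_connection(kwargs):
    """Return True if the cluster answers a ping with these settings."""
    # Imported lazily so build_client_kwargs works without the client installed.
    from elasticsearch import Elasticsearch

    client = Elasticsearch(**kwargs)
    # ping() returns a bool; client.info() raises and gives more detail on failure.
    return client.ping()
```

Calling check_connection(build_client_kwargs("https://es.example:9200", api_key="...")) from a pod in the same cluster should show whether the credentials and TLS settings work before they are copied into airflow.cfg.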

@benjamindinh-loreal
Author

Hello @Owen-CH-Leung,

Thank you for your response.

The cluster is accessible from the outside, and some pipelines already successfully send data to Elasticsearch using elasticsearch-py.

However, the error logs suggest that Airflow is not transmitting the authentication parameters:

missing authentication credentials for REST request

The elasticsearch_default connection has already been created, so I’m wondering if there might be a workaround to ensure Airflow sends the authentication details to Elasticsearch properly?

Thank you

@Owen-CH-Leung
Contributor

You can define the credentials in the elasticsearch_configs section of your airflow.cfg.

https://airflow.apache.org/docs/apache-airflow-providers-elasticsearch/stable/configurations-ref.html#elasticsearch-configs

In the elasticsearch_configs section, you can pass in any parameter that the Elasticsearch client accepts. Example:

[elasticsearch_configs] 
http_compress = True 
ca_certs = /root/ca.pem 
api_key = "SOMEAPIKEY" 
verify_certs = True

All the params you define will be passed into the elasticsearch python library like elasticsearch.Elasticsearch(**kwargs)
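
To illustrate the idea of a config section becoming client keyword arguments, here is a rough sketch using Python's standard configparser. It is not Airflow's own loading code; section_to_kwargs and _coerce are hypothetical helpers, and the API key is a placeholder written without quotes because configparser keeps quotes as part of the value.

```python
import configparser

SAMPLE_CFG = """
[elasticsearch_configs]
http_compress = True
ca_certs = /root/ca.pem
api_key = SOMEAPIKEY
verify_certs = True
"""


def _coerce(value):
    """Turn 'True'/'False' strings into booleans; leave other values as strings."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return value


def section_to_kwargs(cfg_text, section="elasticsearch_configs"):
    """Read one config section and return it as a kwargs dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        # The section name is matched exactly; a different spelling yields nothing.
        return {}
    return {key: _coerce(value) for key, value in parser.items(section)}


kwargs = section_to_kwargs(SAMPLE_CFG)
print(kwargs)
```

The resulting dict is what would be unpacked into elasticsearch.Elasticsearch(**kwargs). Note that in this sketch a section spelled elasticsearch_config (without the trailing s) returns an empty dict, so none of its settings would reach the client.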

@julienlagorsse-loreal

Thanks, that solved the issue, but we are still blocked after that: the data isn't pushed, although we can see logs being pulled. We will create another issue for that. Thank you.

@julienlagorsse-loreal

The doc seems to be misleading: the title is Writing logs to Elasticsearch, but it doesn't write anything to Elasticsearch, it only reads.

@potiuk
Member

potiuk commented Dec 20, 2024

The doc seems to be misleading: the title is Writing logs to Elasticsearch, but it doesn't write anything to Elasticsearch, it only reads.

Actually, not really. It's about both writing and then reading the logs, if you read the first paragraph (this is the first time I've seen these docs). The docs say that you can take the logs from stdout and forward (write) them to Elasticsearch with fluentd, logstash or others.

Airflow can be configured to read task logs from Elasticsearch and optionally write logs to stdout in standard or json format. These logs can later be collected and forwarded to the Elasticsearch cluster using tools like fluentd, logstash or others.

Are you doing it @julienlagorsse-loreal ? Maybe that is the problem that you are not forwarding the stdout logs to elasticsearch?
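
For readers unsure what "forwarding" involves, the sketch below shows roughly what a forwarder such as fluentd or logstash does with JSON log lines read from stdout. It is an illustration only, not a replacement for those tools: to_bulk_actions and ship are hypothetical helpers, the index name airflow-logs-example is made up to match the airflow-logs-* pattern, and the fields in the sample line are illustrative.

```python
import json


def to_bulk_actions(lines, index):
    """Yield one bulk index action per JSON log line; skip everything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text output that is not a JSON log record
        if isinstance(doc, dict):
            yield {"_index": index, "_source": doc}


def ship(lines, client, index="airflow-logs-example"):
    """Send parsed log lines to Elasticsearch with the client's bulk helper."""
    # Imported lazily so to_bulk_actions can be used without the client installed.
    from elasticsearch import helpers

    return helpers.bulk(client, to_bulk_actions(lines, index))


sample = ['{"message": "hello", "levelname": "INFO"}', "not json"]
print(list(to_bulk_actions(sample, "airflow-logs-example")))
```

Without some component playing this role, Airflow with write_stdout = True only prints the logs, and nothing arrives in the index that the webserver later reads from.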

potiuk added the pending-response label and removed the needs-triage label Dec 20, 2024
@julienlagorsse-loreal

julienlagorsse-loreal commented Dec 20, 2024 via email

@potiuk
Member

potiuk commented Dec 20, 2024

We have issues with file shares on k8s; anyway, that is not related to the bug. But the title is clearly misleading: it says write logs to Elastic, not read logs from Elastic or write logs to stdout ... But I agree the doc paragraph is not.

Can you please propose an update to the page? It's as simple as clicking "Suggest a change on this page", which will open a Pull Request where you can propose a change that will remove the confusion.

Can we count on it @julienlagorsse-loreal ?
