Skip to content

Conversation

@kolfild26
Copy link
Contributor

Based on #28435

The purpose is to save to Log table an airflow user who created a dagrun via REST API.
Currently, such event as trigger (from UI) and cli_dag_trigger (from CLI) are being written to metadb, But the api endpoint dagrun request is missing there.

I suggest to add a log entry in dag_run_endpoint.py like this:

IDdttmdag_idtask_ideventexecution_dateowner
1431302023-01-17 17:30:05.498895test_restrest_dag_trigger2023-01-17 17:30:05.482352my_user

'rest_dag_trigger' is discussable, it might be someone could advise a better text. I just named it in the likeness of 'cli_dag_trigger'.
The other columns are obvious.

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jan 17, 2023
dag_hash=get_airflow_app().dag_bag.dags_hash.get(dag_id),
session=session,
)
from airflow.models.log import Log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, there's action_logging decorator you should use, see

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ephraimbuddy thanks, my dev env was a bit outdated so I didn't have this decorator.
Updated and changed the way for db insertion. Please review.

"owner": g.user.username,
"extra": None,
"task_id": None,
"dag_id": f"{dag_run.dag_id}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the action_logging, do we still need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point was to write the log table. To do so we first need to create Log object. That's what we actually do here.
We don't write to dag_run table, thus cannot use DagRun object as itself.
And also, owner is only accessible from flask.g.
So we need to "construct" Log object

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you look at the code for the action_logging?

log = Log(
event=event or f.__name__,
task_instance=None,
owner=user,
extra=str(extra_fields),
task_id=params.get("task_id"),
dag_id=params.get("dag_id"),
)

@provide_session
@action_logging(
event=action_event_from_permission(
prefix="log",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have a global constant for this prefix. Check the usage in other places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same, as above. In other places we are logging the "main" object to its correspondent table. For instance, in connection_endpoint.py we are logging the Connection object to connection table. In variable_endpoint.py we are logging the Variable object to variable table.

Here is the difference, in dag_run_endpoint.py we are logging to log table rather than dag_run table.

Based on that I decided not to have prefix in a global constatnt. I mean if we want to add something more to dag_run table we will need RESOURCE_EVENT_PREFIX = "dag_run" rather than RESOURCE_EVENT_PREFIX = "log".

@ephraimbuddy so, the question, is it ok to use RESOURCE_EVENT_PREFIX or it's better to use smth like RESOURCE_EVENT_LOG_PREFIX ?

#example
RESOURCE_EVENT_LOG_PREFIX = "log" 
...
@action_logging(
    event=action_event_from_permission(
        prefix=RESOURCE_EVENT_LOG_PREFIX,
        permission=permissions.ACTION_CAN_CREATE,
    ),
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same, as above. In other places we are logging the "main" object to its correspondent table. For instance, in connection_endpoint.py we are logging the Connection object to connection table. In variable_endpoint.py we are logging the Variable object to variable table.

This is wrong. The logging is all done in the log table through the action_logging decorator.

@kolfild26
Copy link
Contributor Author

kolfild26 commented Jan 25, 2023

@ephraimbuddy
Thanks for these inputs. I was actually looking at cli_action_loggers.py. It differs in there. That's why.
For both questions, you're right. We only need a decorator to be used. Fixed.

@ephraimbuddy
Copy link
Contributor

@ephraimbuddy Thanks for these inputs. I was actually looking at cli_action_loggers.py. It differs in there. That's why. For both questions, you're right. We only need a decorator to be used. Fixed.

We also need a test

@kolfild26
Copy link
Contributor Author

We also need a test

Ok, will update once it's done.

@kolfild26
Copy link
Contributor Author

Hi, @ephraimbuddy
I added test _check_last_log similarly it's done in test_connection_endpoint.py.
Tests were passed successfully.

@kolfild26
Copy link
Contributor Author

Above there was a conflict with the very fresh commit b94f36b in main. It also changed the same test in the same row of code by coincidence.
I resolved and re-run tests locally. Everything's fine.

@eladkal eladkal added this to the Airflow 2.5.2 milestone Feb 2, 2023
@uranusjr uranusjr changed the title write to DB a user who triggered a dag via API endpoint request Write action log to DB when DAG run is trigged via API Feb 3, 2023
@uranusjr uranusjr merged commit edc2e0b into apache:main Feb 3, 2023
@pierrejeambrun pierrejeambrun added type:misc/internal Changelog: Misc changes that should appear in change log type:bug-fix Changelog: Bug Fixes and removed type:misc/internal Changelog: Misc changes that should appear in change log type:bug-fix Changelog: Bug Fixes labels Feb 27, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 6, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 7, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 8, 2023
@pierrejeambrun pierrejeambrun added type:bug-fix Changelog: Bug Fixes and removed type:misc/internal Changelog: Misc changes that should appear in change log labels Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:API Airflow's REST/HTTP API type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants