
Sometimes PubSubPullSensor doesn't pull messages even if the PubSub subscription has unacked messages  #41838

@arnaubadia


Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

apache-airflow-providers-google==10.17.0

Apache Airflow version

2.7.3

Operating System

macOS Sonoma

Deployment

Google Cloud Composer

Deployment details

No response

What happened

Sometimes PubSubPullSensor doesn't pull messages from a Pub/Sub subscription even when unacked messages are available. It's not clear under what conditions this happens.

What you think should happen instead

The PubSubPullSensor should always pull messages if there are unacked messages in the pubsub subscription.

How to reproduce

Difficult to reproduce exactly, since the behavior is not deterministic. I noticed the problem in a DAG where a PubSubPullSensor pokes every minute, trying to retrieve messages from a Pub/Sub subscription. Sometimes (in a seemingly random pattern) the sensor doesn't pull any messages even though there are unacked messages for that subscription.

Anything else

I believe I found the cause of this problem: the PubSubPullSensor class contains this code:

pulled_messages = hook.pull(
    project_id=self.project_id,
    subscription=self.subscription,
    max_messages=self.max_messages,
    return_immediately=True,
)

This call hard-codes return_immediately to True. Internally, hook.pull calls the pull function of SubscriberClient, which documents the return_immediately argument as follows:

return_immediately (bool):
    Optional. If this field set to true, the system will
    respond immediately even if it there are no messages
    available to return in the ``Pull`` response. Otherwise,
    the system may wait (for a bounded amount of time) until
    at least one message is available, rather than returning
    no messages. Warning: setting this field to ``true`` is
    discouraged because it adversely impacts the performance
    of ``Pull`` operations. We recommend that users do not
    set this field.

I created a version of the PubSubPullSensor that is identical to the original except that it passes return_immediately=False, and the problem described in this issue consistently went away.

The hard-coded return_immediately=True was introduced in this PR.

I believe return_immediately should either be hard-coded to False or go back to being an argument of the class, so that users can set it to False and avoid the problem described in this issue.
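
To make the second option concrete, here is a minimal sketch of a sensor-like class that takes return_immediately as a constructor argument and forwards it to the hook. It deliberately does not import Airflow: RecordingHook and ConfigurablePullSensor are hypothetical stand-ins I made up for illustration, and the real sensor would extend the Airflow sensor base class and receive a context argument in poke. The default of False is my suggestion, not existing provider behavior.

```python
from typing import Any, Dict, List


class RecordingHook:
    """Hypothetical stand-in for the Pub/Sub hook: serves queued messages
    and records the keyword arguments of every pull() call."""

    def __init__(self, messages: List[Any]) -> None:
        self.messages = list(messages)
        self.calls: List[Dict[str, Any]] = []

    def pull(self, *, project_id: str, subscription: str,
             max_messages: int, return_immediately: bool = False) -> List[Any]:
        self.calls.append({
            "project_id": project_id,
            "subscription": subscription,
            "max_messages": max_messages,
            "return_immediately": return_immediately,
        })
        # Hand out at most max_messages and keep the rest queued.
        batch = self.messages[:max_messages]
        self.messages = self.messages[max_messages:]
        return batch


class ConfigurablePullSensor:
    """Hypothetical sensor sketch: return_immediately is configurable
    instead of being hard-coded in the pull call."""

    def __init__(self, *, project_id: str, subscription: str, hook: RecordingHook,
                 max_messages: int = 5, return_immediately: bool = False) -> None:
        self.project_id = project_id
        self.subscription = subscription
        self.hook = hook
        self.max_messages = max_messages
        self.return_immediately = return_immediately
        self.pulled_messages: List[Any] = []

    def poke(self) -> bool:
        # Same call shape as the snippet above, but the flag comes from the instance.
        self.pulled_messages = self.hook.pull(
            project_id=self.project_id,
            subscription=self.subscription,
            max_messages=self.max_messages,
            return_immediately=self.return_immediately,
        )
        # The poke succeeds only when at least one message was pulled.
        return bool(self.pulled_messages)
```

With this shape, existing DAGs that need the current behavior could still pass return_immediately=True explicitly, while new users would get the blocking pull by default.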

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
