No worker instances after upgrade #18189

Closed
@penguineer

Gitea Version

1.15.9

Git Version

2.30.2 (from container)

Operating System

Alpine 3.13 (from container)

How are you running Gitea?

I'm using the proposed docker-compose configuration.
Initial version: 1.12 (.3?)
Upgrade to: 1.15.7

Database

SQLite

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Description

After upgrading Gitea to 1.15.9 everything seemed fine at first, but over time we experienced weird effects:

  • New repositories would always show the "new" page.
  • Pull requests were fine right after being created, but would then either not update or show "missing fork information".
  • Overall the repositories (git interaction) worked, but the UI was not in sync.

A restart of the container would now and then fix one of the issues, but only momentarily. I would see log messages like:
2022/01/04 19:10:48 ...ue/queue_bytefifo.go:224:doPop() [E] level: task-level Failed to unmarshal with error: readObjectStart: expect { or n, but found [, error found in #1 byte of ...|[{"PusherID|..., bigger context ...|[{"PusherID":1,"PusherName":"t","RepoUserName":"t|...
2022/01/04 19:10:48 ...ue/queue_bytefifo.go:224:doPop() [E] level: task-level Failed to unmarshal with error: readObjectStart: expect { or n, but found ", error found in #1 byte of ...|"429"|..., bigger context ...|"429"|...
These, however, disappeared after this one instance.
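
To illustrate what these two messages seem to say: the decoder expected each queued payload to start with a JSON object (`{`) or `null`, but one payload started with an array (`[`) and the other was a bare string (`"429"`). The following is only a minimal Python sketch of that mismatch, not Gitea code; the function names are hypothetical, and the array payload is a shortened stand-in because the logged one is truncated.

```python
import json


def top_level_kind(payload: bytes) -> str:
    """Return the JSON type of the payload's top-level value."""
    value = json.loads(payload)
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def decode_task(payload: bytes) -> dict:
    """Decode a payload, rejecting anything that is not a JSON object."""
    kind = top_level_kind(payload)
    if kind != "object":
        raise ValueError(f"expected a JSON object, found {kind}")
    return json.loads(payload)


# Hypothetical sample payloads modelled on the log excerpts above.
samples = [b'[{"PusherID": 1}]', b'"429"', b'{"PusherID": 1}']
for raw in samples:
    try:
        decode_task(raw)
        print(raw, "-> decoded")
    except ValueError as err:
        print(raw, "->", err)
```

If this reading is right, the failing entries might have been written in a different format than the one the reader expected, which would fit tasks queued before the upgrade; that is a guess, not something I verified.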

None of these problems occur on a fresh instance, but we want to keep our data. (...)
Unfortunately, we could not simply roll back either, because we noticed the problems too late for any of our backups to be viable.

I did the following during debugging:

  • Verify that hooks are called
  • Use the doctor tool to migrate the database (recreate-table) and fix warnings on all checks

In the end I realized that there are no workers (Site Administration → Monitoring). My initial guess was that these workers are instantiated automatically if there are tasks in the queue. Later I checked the queue configurations and clicked "Add Flush Workers", with three findings:

  • I get a 500 page, the log says that there are nil references and some i18n keys cannot be resolved. (Reproducible by instantiating a flush worker with tasks in the queue.)
  • Complaints about non-existing branches, which made sense because we resolved some merges by hand. (Work needed to continue.)
  • Suddenly there were updates in the UI.

As a workaround I have added a group with one worker to each queue. I could not find these worker groups in the database, though.

There is still the problem described in #17204, but it has not happened on new PRs yet.

Is there a problem during upgrade that leads to a situation where no workers exist?

Screenshots

No response
