Description
Gitea Version
1.15.9
Git Version
2.30.2 (from container)
Operating System
Alpine 3.13 (from container)
How are you running Gitea?
I'm using the proposed docker-compose configuration.
Initial version: 1.12 (.3?)
Upgrade to: 1.15.7
Database
SQLite
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Description
After upgrading Gitea to 1.15.9 everything seemed fine at first, but over time we experienced weird effects:
- New repositories would always show the "new" page.
- Pull requests looked fine right after being created, but later would not update or would show "missing fork information".
- Overall the repositories (git interaction) worked, but the UI was not in sync.
A restart of the container would now and then fix one of the issues, but only momentarily. I would see log messages like:
2022/01/04 19:10:48 ...ue/queue_bytefifo.go:224:doPop() [E] level: task-level Failed to unmarshal with error: readObjectStart: expect { or n, but found [, error found in #1 byte of ...|[{"PusherID|..., bigger context ...|[{"PusherID":1,"PusherName":"t","RepoUserName":"t|...
2022/01/04 19:10:48 ...ue/queue_bytefifo.go:224:doPop() [E] level: task-level Failed to unmarshal with error: readObjectStart: expect { or n, but found ", error found in #1 byte of ...|"429"|..., bigger context ...|"429"|...
These messages, however, did not appear again after this one occurrence.
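To illustrate how I read these messages: the decoder seems to expect each queue entry to be a single JSON object (or null), while the stored entries were a JSON array and a bare string. The following is only a minimal sketch of that mismatch in Python, not Gitea's actual Go code. The names decode_task and try_decode are hypothetical, and the first sample entry (the plain object form) is my assumption of what a well-formed entry might look like; only the array and the "429" string come from the log.

```python
import json


def decode_task(raw: bytes) -> dict:
    """Decode one queue entry, accepting only a JSON object or null.

    Hypothetical stand-in for the queue's unmarshal step.
    """
    value = json.loads(raw)
    if value is None:
        return {}
    if not isinstance(value, dict):
        raise TypeError(f"expected a JSON object, found {type(value).__name__}")
    return value


def try_decode(raw: bytes) -> str:
    """Return 'ok' or a short error description for one entry."""
    try:
        decode_task(raw)
        return "ok"
    except TypeError as err:
        return f"failed to unmarshal: {err}"


entries = [
    # Assumed well-formed shape: a single object.
    b'{"PusherID":1,"PusherName":"t","RepoUserName":"t"}',
    # Shapes seen in the log: an array of objects and a bare string.
    b'[{"PusherID":1,"PusherName":"t","RepoUserName":"t"}]',
    b'"429"',
]
results = [try_decode(entry) for entry in entries]
for entry, result in zip(entries, results):
    print(entry.decode(), "->", result)
```

If this reading is right, the queue may have contained entries written in a different format than the one the running version expects, which could fit an upgrade that spans several releases.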
All these problems do not occur on a fresh instance, but we want to keep our data. (...)
Unfortunately, we could not simply roll back either, because we noticed the problems too late for any of our backups to be usable.
I did the following during debugging:
- Verify that hooks are called
- Use the doctor tool to migrate the database (recreate-table) and fix warnings on all checks
In the end I realized that there are no workers (Site Administration → Monitoring). My initial guess was that these workers are instantiated automatically if there are tasks in the queue. Later I checked the queue configurations and clicked "Add Flush Workers", with three findings:
- I get a 500 page, the log says that there are nil references and some i18n keys cannot be resolved. (Reproducible by instantiating a flush worker with tasks in the queue.)
- Complaints about non-existing branches, which made sense because we resolved some merges by hand. (Work needed to continue.)
- Suddenly there were updates in the UI.
As a workaround, I added a worker group with one worker to each queue. I could not find these worker groups in the database, though.
There is still the problem described in #17204, but it has not happened on new PRs yet.
Is there a problem during upgrade that leads to a situation where no workers exist?
Screenshots
No response