Temporary working folders are left behind on Middle Managers after tasks complete #12332

Open
@sergioferragut

Affected Version

Apache Druid 0.22.1

Description

This problem was originally reported here: https://www.druidforum.org/t/temp-folder-size-was-increasing-due-to-that-peons-processing-taking-more-time-how-to-clear-temp-folder-automatically/7139

I was able to reproduce it on a small minikube deployment by running the vanilla wikipedia index_parallel ingestion a few times, each with a different target datasource name. After the jobs completed, the temporary folders for the tasks had not been removed: after 3 runs, the ~/var/tmp folder still contained three empty folders:

~/var/tmp $ ls -l
total 12
drwx------    2 druid    druid         4096 Mar 14 23:39 druid-realtime-persist1040350100896362009
drwx------    2 druid    druid         4096 Mar 14 23:32 druid-realtime-persist668375622911252079
drwx------    2 druid    druid         4096 Mar 14 23:34 druid-realtime-persist944793843865837077
~/var/tmp $ ls -l druid-realtime-persist944793843865837077
total 0
~/var/tmp $ ls -l druid-realtime-persist668375622911252079
total 0
~/var/tmp $ ls -l druid-realtime-persist1040350100896362009
total 0

The original report on Druid Forum spoke of thousands of such folders left behind.
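
Until the underlying cleanup is fixed, the leftovers can at least be counted or cleared from outside Druid. The following is only a minimal sketch of that idea, not something Druid provides: the function names and the one-hour age threshold are hypothetical, and the `druid-realtime-persist` prefix is taken from the listing above. It only considers empty directories and uses `rmdir`, so a directory that a running task has written into is left alone.

```python
import os
import time
from pathlib import Path

# Prefix observed in the listing above; assumed to be stable across tasks.
PREFIX = "druid-realtime-persist"


def find_empty_persist_dirs(tmp_root, min_age_seconds=3600.0, now=None):
    """Return empty PREFIX* directories under tmp_root older than min_age_seconds.

    Hypothetical helper: the age threshold is an assumption meant to avoid
    touching directories that a task has only just created.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(Path(tmp_root).glob(PREFIX + "*")):
        if not entry.is_dir():
            continue
        if any(entry.iterdir()):
            continue  # non-empty: may belong to a running task
        if now - entry.stat().st_mtime >= min_age_seconds:
            stale.append(entry)
    return stale


def remove_dirs(dirs):
    """Remove each directory with rmdir, which refuses to delete contents."""
    removed = 0
    for d in dirs:
        try:
            os.rmdir(d)
            removed += 1
        except OSError:
            pass  # became non-empty or disappeared in the meantime; skip it
    return removed
```

Calling `find_empty_persist_dirs(os.path.expanduser("~/var/tmp"))` alone reports the candidates without deleting anything; passing the result to `remove_dirs` deletes them. The tmp path is the one from the reproduction above and may differ in other deployments.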
