[Bug] When Spark writes data to a Paimon table, data is lost due to task retries #4831
Closed
Labels: bug
Paimon version
0.9
Compute Engine
Spark 3.5.1
Minimal reproduce step
dataset.write().mode(mode).format("paimon").save(path);
In total, 9,314,203 + 6,211,188 = 15,525,391 rows were written, but querying the Paimon table afterwards returns only 15,476,552 rows.
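For context, here is a self-contained sketch of the write path from the report, assuming two Parquet inputs, a local warehouse, append mode, and Paimon's Spark catalog class `org.apache.paimon.spark.SparkCatalog`; all paths, row counts, and configuration values below are placeholders, not taken from the report:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class PaimonWriteRepro {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("paimon-write-repro")
                // Assumed configuration: Paimon's Spark catalog with a local warehouse.
                .config("spark.sql.catalog.paimon", "org.apache.paimon.spark.SparkCatalog")
                .config("spark.sql.catalog.paimon.warehouse", "file:/tmp/paimon-warehouse")
                .getOrCreate();

        // Hypothetical inputs standing in for the two datasets from the report
        // (9,314,203 and 6,211,188 rows respectively).
        Dataset<Row> first = spark.read().parquet("/data/input-1");
        Dataset<Row> second = spark.read().parquet("/data/input-2");

        // The write path from the report; SaveMode.Append is an assumption.
        String path = "file:/tmp/paimon-warehouse/db.db/t";
        first.write().mode(SaveMode.Append).format("paimon").save(path);
        second.write().mode(SaveMode.Append).format("paimon").save(path);

        // Expected 15,525,391 rows; the report observes 15,476,552 when tasks retry.
        long count = spark.read().format("paimon").load(path).count();
        System.out.println("row count = " + count);

        spark.stop();
    }
}
```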
What doesn't meet your expectations?
When I increased the executor memory, no tasks were retried and 15,525,244 rows were written in the end. My guess is that a task retry overwrites the file written by the first attempt, or something else along those lines.
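To make that suspicion concrete, here is a conceptual sketch of how a retry could silently overwrite a first attempt's output. This is not Paimon's actual writer code, only an illustration of the general hazard when a data-file name depends solely on the partition id; `unsafeFileName` and `safeFileName` are hypothetical helpers:

```java
/**
 * Conceptual sketch of the suspected failure mode. If a data-file name
 * depends only on the partition id, every attempt of the same task produces
 * the same name, so a retry can clobber the first attempt's file while the
 * job still commits successfully.
 */
public class RetryOverwriteSketch {

    // Collides across attempts: attempt 0 and attempt 1 write the same file.
    static String unsafeFileName(int partitionId) {
        return String.format("data-%05d.parquet", partitionId);
    }

    // Unique per attempt: the commit protocol can then keep exactly one
    // attempt's output and discard the rest.
    static String safeFileName(int partitionId, int attemptNumber) {
        return String.format("data-%05d-attempt-%d.parquet", partitionId, attemptNumber);
    }

    public static void main(String[] args) {
        System.out.println(unsafeFileName(7));   // data-00007.parquet for every attempt
        System.out.println(safeFileName(7, 0));  // data-00007-attempt-0.parquet
        System.out.println(safeFileName(7, 1));  // data-00007-attempt-1.parquet
    }
}
```

If file names really do collide across attempts, that would also explain the observation above: giving executors more memory avoids the retries, and the row count comes out much closer to the expected total.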
Anything else?
No response
Are you willing to submit a PR?