Closed
Description
Search before asking
- I searched in the issues and found nothing similar.
Paimon version
0.9
Compute Engine
Spark 3.5.1
Minimal reproduce step
- Save data to the Paimon table:
dataset.write().mode(mode).format("paimon").save(path);
- During the write stage (the collect at PaimonSparkWriter.scala:195), some executors are lost and the stage is retried.
- The total row count reported by the two attempts differs from the count returned by the final query:
9314203 + 6211188 = 15525391
But querying the Paimon table returns only 15476552 rows.
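For reference, a minimal sketch (plain Java, using only the numbers quoted above) that quantifies the mismatch between the rows reported by the two write attempts and the rows actually visible in the table; the variable names are illustrative, not from the Paimon codebase:

```java
public class CountCheck {
    public static void main(String[] args) {
        long firstAttempt = 9_314_203L;  // rows reported by the first write attempt
        long retryAttempt = 6_211_188L;  // rows reported by the retried stage
        long totalWritten = firstAttempt + retryAttempt;
        long queried = 15_476_552L;      // rows returned by querying the Paimon table

        // Difference between what the two attempts reported and what is queryable
        System.out.println("written=" + totalWritten
                + " queried=" + queried
                + " missing=" + (totalWritten - queried));
    }
}
```

Running this shows 48839 rows unaccounted for, which is the discrepancy this issue reports.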
What doesn't meet your expectations?
After I increased the executor memory, no tasks were retried and 15,525,244 rows were written in the end. My guess is that retried tasks overwrite the files written by the first attempt, though there may be another cause.
Anything else?
No response
Are you willing to submit a PR?
- I'm willing to submit a PR!