Open
Labels
enhancement, multi-stage (related to the multi-stage query engine)
Description
Currently, the "physical plan explain" added in #11052 creates a DAG-like structure with all the worker IDs printed out, but there are still several improvements we can make:
- sometimes it prints only one of several repeated subtrees and omits the rest; see the example below
- it doesn't include details about the logical node (such as project columns, etc.)
[0]@192.168.1.108:56541 MAIL_RECEIVE(RANDOM_DISTRIBUTED)
├── [1]@192.168.1.108:56595 MAIL_SEND(RANDOM_DISTRIBUTED)->{[0]@192.168.1.108@{56541,56541}|[0]} (Subtree Omitted)
└── [1]@192.168.1.108:56589 MAIL_SEND(RANDOM_DISTRIBUTED)->{[0]@192.168.1.108@{56541,56541}|[0]}
    └── [1]@192.168.1.108:56589 AGGREGATE_FINAL <---- this should really be on 2 servers
        └── [1]@192.168.1.108:56589 MAIL_RECEIVE(HASH_DISTRIBUTED)
            ├── [2]@192.168.1.108:56595 MAIL_SEND(HASH_DISTRIBUTED)->{[1]@192.168.1.108@{56595,56596}|[0],[1]@192.168.1.108@{56589,56590}|[1]} (Subtree Omitted) <---- subtree is omitted b/c they are the same except for the server/worker ID
            └── [2]@192.168.1.108:56589 MAIL_SEND(HASH_DISTRIBUTED)->{[1]@192.168.1.108@{56595,56596}|[0],[1]@192.168.1.108@{56589,56590}|[1]} <---- mailbox send includes a list of receiving mailbox
                └── [2]@192.168.1.108:56589 AGGREGATE_LEAF <---- this should really be on 2 servers
                    └── [2]@192.168.1.108:56589 JOIN <---- this should really be on 2 servers
                        ├── [2]@192.168.1.108:56589 MAIL_RECEIVE(HASH_DISTRIBUTED)
                        │   └── [3]@192.168.1.108:56589 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@192.168.1.108@{56595,56596}|[0],[2]@192.168.1.108@{56589,56590}|[1]}
                        │       └── [3]@192.168.1.108:56589 PROJECT <---- missing project columns
                        │           └── [3]@192.168.1.108:56589 TABLE SCAN (A) null
                        └── [2]@192.168.1.108:56589 MAIL_RECEIVE(HASH_DISTRIBUTED)
                            └── [4]@192.168.1.108:56595 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@192.168.1.108@{56595,56596}|[0],[2]@192.168.1.108@{56589,56590}|[1]}
                                └── [4]@192.168.1.108:56595 PROJECT
                                    └── [4]@192.168.1.108:56595 TABLE SCAN (B) null
I would suggest:
- no node other than mailbox send and mailbox receive should have server/worker info attached; those nodes should show only the stage/fragment ID
- attach logical info (such as project columns) to the nodes as well
- as a side note, do not attach the plan to error messages when execution fails. Printing it in the log should suffice, and it will make the error message much simpler to comprehend
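To make the first two suggestions concrete, here is a rough, language-neutral sketch in Python of how the rendering could behave. It is not Pinot code: `PlanNode`, `label`, `render`, and the column names `col1, col2` are all hypothetical and exist only for illustration. Mailbox nodes keep their server/worker address, every other node shows just the stage ID, and logical details are appended in parentheses.

```python
from dataclasses import dataclass, field
from typing import List

# Only these operators keep server/worker info in the output (per the suggestion above).
MAILBOX_OPS = {"MAIL_SEND", "MAIL_RECEIVE"}


@dataclass
class PlanNode:
    """Hypothetical plan node; not a real Pinot class."""
    stage_id: int
    op: str
    worker: str = ""   # e.g. "192.168.1.108:56589"
    detail: str = ""   # logical info, e.g. distribution type or project columns
    children: List["PlanNode"] = field(default_factory=list)


def label(node: PlanNode) -> str:
    """Build one line of text for a node."""
    prefix = f"[{node.stage_id}]"
    if node.op in MAILBOX_OPS and node.worker:
        prefix += f"@{node.worker}"  # worker info only on mailbox nodes
    text = f"{prefix} {node.op}"
    if node.detail:
        text += f"({node.detail})"   # attach logical info when present
    return text


def render(node: PlanNode, indent: str = "", is_last: bool = True,
           is_root: bool = True) -> List[str]:
    """Render the tree with the same box-drawing style as the current output."""
    if is_root:
        lines = [label(node)]
        child_indent = ""
    else:
        lines = [indent + ("└── " if is_last else "├── ") + label(node)]
        child_indent = indent + ("    " if is_last else "│   ")
    for i, child in enumerate(node.children):
        last = i == len(node.children) - 1
        lines.extend(render(child, child_indent, last, False))
    return lines


# Example input: worker addresses are taken from the plan above,
# the project columns "col1, col2" are made up.
w = "192.168.1.108:56589"
plan = PlanNode(2, "MAIL_RECEIVE", w, "HASH_DISTRIBUTED", [
    PlanNode(3, "MAIL_SEND", w, "HASH_DISTRIBUTED", [
        PlanNode(3, "PROJECT", w, "col1, col2", [
            PlanNode(3, "TABLE SCAN", w, "A"),
        ]),
    ]),
])
lines = render(plan)
print("\n".join(lines))
```

With this input the sketch prints the two mailbox nodes with their `@192.168.1.108:56589` address, while `PROJECT(col1, col2)` and `TABLE SCAN(A)` carry only the `[3]` stage prefix.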