Commit 3a74365

committed
update introduction
1 parent 0744dd6 commit 3a74365

1 file changed

+7
-6
lines changed

1-welcome/introduction.md

Lines changed: 7 additions & 6 deletions
@@ -150,21 +150,22 @@ Much of the improvement in performance in the last two decades has come from arc
150150

151151
### Out of order execution
152152

153-
Out of order, also known as super scalar, execution is a way of extracting so called _Instruction level parallelism from the code the CPU is executing. Modern CPUs effectively do SSA at the hardware level to identify data dependencies between operations, and where possible run independent operations in parallel.
153+
Out of order execution, a feature of most superscalar CPUs, is a way of extracting so-called _instruction level parallelism_ from the code the CPU is executing. Modern CPUs effectively do SSA at the hardware level to identify data dependencies between operations, and where possible run independent instructions in parallel.
154154

155155
However, there is a limit to the amount of parallelism inherent in any piece of code. Out of order execution is also tremendously power hungry. Most modern CPUs have settled on six execution units per core as there is an n squared cost of connecting each execution unit to all others at each stage of the pipeline.
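
As a rough illustration of instruction level parallelism, the sketch below sums a slice two ways. The function names `sumSingle` and `sumFour` are hypothetical, chosen for this example. In `sumSingle` every addition depends on the previous one, forming a single dependency chain. In `sumFour` the four additions in the loop body are independent of each other, which gives an out of order core the opportunity to run them in parallel. Whether this is actually faster depends on the compiler and the CPU, so treat it as something to benchmark rather than a guarantee.

```go
package main

import "fmt"

// sumSingle adds every element into one accumulator, so each addition
// depends on the result of the previous one (a single dependency chain).
func sumSingle(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// sumFour keeps four independent accumulators. The four additions in the
// main loop body do not depend on each other, so the CPU is free to
// execute them in parallel if it has spare execution units.
func sumFour(xs []int) int {
	var a, b, c, d int
	i := 0
	for ; i+4 <= len(xs); i += 4 {
		a += xs[i]
		b += xs[i+1]
		c += xs[i+2]
		d += xs[i+3]
	}
	// Handle the remaining 0 to 3 elements.
	for ; i < len(xs); i++ {
		a += xs[i]
	}
	return a + b + c + d
}

func main() {
	xs := make([]int, 1000)
	for i := range xs {
		xs[i] = i
	}
	// Both functions compute the same sum; only the dependency
	// structure of the additions differs.
	fmt.Println(sumSingle(xs), sumFour(xs))
}
```

Both functions return the same result, so any difference in running time comes from how the additions are scheduled, not from the amount of work done.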
156156

157157

158158
### Speculative execution
159159

160-
One of the problems with out of order execution is branches and memory loads. When a CPU reaches a branch
160+
Save the smallest microcontrollers, all CPUs utilise an _instruction pipeline_ to overlap parts of the instruction fetch/decode/execute/commit cycle.
161161

162-
Super scalar execution, as we're all learning about through Spectre style vulnerabilities chooses
162+
![CPU pipeline](https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Fivestagespipeline.png/800px-Fivestagespipeline.png)
163163

164-
To avoid the stalls inherent with branches and loads
164+
The problem with an instruction pipeline is branch instructions. When a CPU reaches a branch it cannot look beyond the branch for additional instructions to execute. Speculative execution allows the CPU to "guess" which path the branch will take _while the branch instruction is still being processed!_
165165

166-
(super-scalar) -- requires register renaming
167-
speculative execution -- huge power waste
166+
If the CPU predicts the branch correctly then it can keep its pipeline of instructions full. If the CPU fails to predict the correct branch then when it realises the mistake it must roll back any changes that were made to its _architectural state_. As we're all learning through Spectre-style vulnerabilities, sometimes this rollback isn't as seamless as promised.
167+
168+
Speculative execution can be very power hungry when branch prediction rates are low. If the branch is mispredicted, not only must the CPU backtrack to the point of the misprediction, but the energy expended on the incorrect branch is wasted.
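
One way to observe the cost of misprediction is to run the same branchy loop over predictable and unpredictable data. The sketch below is a minimal experiment, not a rigorous benchmark: `countAbove` is a hypothetical helper, and the slice size, seed, and threshold are arbitrary choices. On sorted input the branch goes the same way for long runs and is easy to predict. On shuffled input it is close to a coin flip. Note that the Go compiler may turn this particular branch into a conditional move, in which case the two timings will be similar.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"time"
)

// countAbove counts the elements that are greater than or equal to
// threshold. The if statement is the branch the CPU has to predict.
func countAbove(xs []int, threshold int) int {
	n := 0
	for _, x := range xs {
		if x >= threshold {
			n++
		}
	}
	return n
}

func main() {
	// Fixed seed so the data is the same on every run.
	rng := rand.New(rand.NewSource(1))
	shuffled := make([]int, 1<<20)
	for i := range shuffled {
		shuffled[i] = rng.Intn(256)
	}

	// A sorted copy of the same values: the branch outcome changes
	// only once as the loop walks the slice.
	sorted := append([]int(nil), shuffled...)
	sort.Ints(sorted)

	cases := []struct {
		name string
		data []int
	}{
		{"sorted", sorted},
		{"shuffled", shuffled},
	}
	for _, c := range cases {
		start := time.Now()
		n := countAbove(c.data, 128)
		fmt.Printf("%-8s count=%d elapsed=%v\n", c.name, n, time.Since(start))
	}
}
```

Both cases process the same values and report the same count, so any gap in elapsed time reflects how well the branch was predicted. For numbers worth trusting, the loop would be better wrapped in a `testing.B` benchmark.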
168169

169170
Cliff Click has a [wonderful presentation][10] that argues out of order and speculative execution are most useful for starting cache misses early, thereby reducing observed cache latency.
170171
