x86/alternative: Optimize single-byte NOPs at an arbitrary position
Up until now the assumption was that an alternative patching site would
have some instructions at the beginning and trailing single-byte NOPs
(0x90) padding. Therefore, the patching machinery would go and optimize
those single-byte NOPs into longer ones.
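
To illustrate what "optimize" means here: a run of single-byte NOPs is
rewritten in place using the longest multi-byte NOPs available. Below is a
minimal userspace sketch of that step, not the kernel code itself;
fill_nops() is a hypothetical stand-in for the kernel's add_nops(), and the
table mirrors the Intel-recommended encodings the kernel keeps in its NOP
table:

  #include <string.h>

  #define NOP_MAX 8                     /* mirrors the kernel's ASM_NOP_MAX */

  /* Multi-byte NOP encodings, indexed by length. */
  static const unsigned char nops[NOP_MAX + 1][NOP_MAX] = {
          [1] = { 0x90 },                         /* nop            */
          [2] = { 0x66, 0x90 },                   /* osp nop        */
          [3] = { 0x0f, 0x1f, 0x00 },             /* nopl (%eax)    */
          [4] = { 0x0f, 0x1f, 0x40, 0x00 },       /* nopl 0x0(%eax) */
          [5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
          [6] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
          [7] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
          [8] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
  };

  /* Overwrite @len bytes at @insns with as few NOP insns as possible. */
  static void fill_nops(unsigned char *insns, int len)
  {
          while (len > 0) {
                  int n = len > NOP_MAX ? NOP_MAX : len;

                  memcpy(insns, nops[n], n);
                  insns += n;
                  len -= n;
          }
  }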

However, this assumption is broken on 32-bit when code like
hv_do_hypercall() in hyperv_init() would use the retpoline speculation
killer CALL_NOSPEC. The 32-bit version of that macro would align certain
insns to 16 bytes, leading to the compiler issuing one or more
single-byte NOPs, depending on the holes it needs to fill for alignment.

That would cause the warning in optimize_nops() to fire:

  ------------[ cut here ]------------
  Not a NOP at 0xc27fb598
   WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:211 optimize_nops.isra.13

due to that function verifying whether all of the following bytes really
are single-byte NOPs.
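
A hypothetical 32-bit patch site (illustrative byte values, not the actual
hyperv_init() disassembly) shows how the old code trips:

  e8 xx xx xx xx      call   <thunk>       # CALL_NOSPEC
  90                  nop                  # alignment hole
  89 c8               mov    %ecx,%eax
  90 90 90            nop; nop; nop        # trailing padding

The old loop breaks out at the first 0x90 and then insists that every
remaining byte up to a->instrlen is also 0x90, so the 0x89 right after the
lone NOP fires the WARN_ONCE().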

Therefore, carve out the NOP padding into a separate function and call
it for each NOP range beginning with a single-byte NOP.
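
With that in place, each NOP range is handled on its own; for the same
hypothetical byte stream as above:

  before:  e8 xx xx xx xx  90  89 c8  90 90 90
  after:   e8 xx xx xx xx  90  89 c8  0f 1f 00

A lone NOP is left alone (there is nothing shorter to optimize it into),
while the trailing three-byte run becomes a single NOPL.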

Fixes: 23c1ad5 ("x86/alternatives: Optimize optimize_nops()")
Reported-by: Richard Narron <richard@aaazen.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213301
Link: https://lkml.kernel.org/r/20210601212125.17145-1-bp@alien8.de
suryasaimadhu committed Jun 3, 2021
1 parent 9bfecd0 commit 2b31e8e
Showing 1 changed file with 46 additions and 18 deletions: arch/x86/kernel/alternative.c
@@ -182,42 +182,70 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insn_buff)
 		   n_dspl, (unsigned long)orig_insn + n_dspl + repl_len);
 }
 
+/*
+ * optimize_nops_range() - Optimize a sequence of single byte NOPs (0x90)
+ *
+ * @instr: instruction byte stream
+ * @instrlen: length of the above
+ * @off: offset within @instr where the first NOP has been detected
+ *
+ * Return: number of NOPs found (and replaced).
+ */
+static __always_inline int optimize_nops_range(u8 *instr, u8 instrlen, int off)
+{
+	unsigned long flags;
+	int i = off, nnops;
+
+	while (i < instrlen) {
+		if (instr[i] != 0x90)
+			break;
+
+		i++;
+	}
+
+	nnops = i - off;
+
+	if (nnops <= 1)
+		return nnops;
+
+	local_irq_save(flags);
+	add_nops(instr + off, nnops);
+	local_irq_restore(flags);
+
+	DUMP_BYTES(instr, instrlen, "%px: [%d:%d) optimized NOPs: ", instr, off, i);
+
+	return nnops;
+}
+
 /*
  * "noinline" to cause control flow change and thus invalidate I$ and
  * cause refetch after modification.
  */
 static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
-	unsigned long flags;
 	struct insn insn;
-	int nop, i = 0;
+	int i = 0;
 
 	/*
-	 * Jump over the non-NOP insns, the remaining bytes must be single-byte
-	 * NOPs, optimize them.
+	 * Jump over the non-NOP insns and optimize single-byte NOPs into bigger
+	 * ones.
 	 */
 	for (;;) {
 		if (insn_decode_kernel(&insn, &instr[i]))
 			return;
 
+		/*
+		 * See if this and any potentially following NOPs can be
+		 * optimized.
+		 */
 		if (insn.length == 1 && insn.opcode.bytes[0] == 0x90)
-			break;
-
-		if ((i += insn.length) >= a->instrlen)
-			return;
-	}
+			i += optimize_nops_range(instr, a->instrlen, i);
+		else
+			i += insn.length;
 
-	for (nop = i; i < a->instrlen; i++) {
-		if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i]))
+		if (i >= a->instrlen)
 			return;
-	}
-
-	local_irq_save(flags);
-	add_nops(instr + nop, i - nop);
-	local_irq_restore(flags);
-
-	DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
-		   instr, nop, a->instrlen);
 	}
 }
 
 /*
