Commit 1e36d9c6 authored by Tony Luck, committed by Borislav Petkov

x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()



A long time ago, Linux cleared IA32_MCG_STATUS at the very end of machine
check processing.

Then, some fancy recovery and IST manipulation was added in:

  d4812e16 ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks")

and clearing IA32_MCG_STATUS was pulled earlier in the function.

The next change moved the actual recovery out of do_machine_check() and
just used task_work_add() to schedule it later (before returning to the
user):

  5567d11c ("x86/mce: Send #MC singal from task work")

Most recently the fancy IST footwork was removed as no longer needed:

  b052df3d ("x86/entry: Get rid of ist_begin/end_non_atomic()")

At this point there is no reason remaining to clear IA32_MCG_STATUS early.
It can move back to the very end of the function.
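The resulting control flow, where every exit path funnels through one label that clears the status register last, can be sketched in user space. This is only an illustration: fake_mcg_status, fake_clear_mcg_status(), handle_check(), the SEV_* values and check_paths() are hypothetical stand-ins, not kernel interfaces, and the real handler does far more work before reaching the label.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the IA32_MCG_STATUS MSR (not a kernel API). */
static uint64_t fake_mcg_status;
static int clear_count;

/* Mock of the final status clear; counts how often it runs. */
static void fake_clear_mcg_status(void)
{
	fake_mcg_status = 0;
	clear_count++;
}

/* Hypothetical severity values, for illustration only. */
enum { SEV_NONE = 0, SEV_AR = 1 };

/* Returns 1 if the mock "take action" path ran, 0 on the early exit. */
static int handle_check(int worst, int kill_it)
{
	int acted = 0;

	fake_mcg_status = 0x4;	/* pretend MCIP (bit 2) is set on entry */

	if (worst != SEV_AR && !kill_it)
		goto out;	/* early exit still reaches the clear below */

	acted = 1;		/* placeholder for the recovery decisions */
out:
	fake_clear_mcg_status();	/* last step on every path */
	return acted;
}

/* Exercise both paths and confirm the status is cleared each time. */
static void check_paths(void)
{
	assert(handle_check(SEV_NONE, 0) == 0);
	assert(fake_mcg_status == 0);
	assert(handle_check(SEV_AR, 0) == 1);
	assert(fake_mcg_status == 0);
}
```

Using a single exit label keeps the clear in exactly one place, so a later early return cannot skip it by accident.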

Also move sync_core(). The comments for this function say that it should
only be called when instructions have been changed/re-mapped. Recovery
for an instruction fetch may change the physical address. But that
doesn't happen until the scheduled work runs (which could be on another
CPU).
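The reasoning above, that serialization belongs in the deferred work rather than in the handler, can be illustrated with a small user-space mock. All names here (struct deferred_work, mock_serialize(), mock_handler(), mock_return_to_user()) are hypothetical; the sketch only mimics the ordering and does not use task_work_add() or sync_core() themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct deferred_work {
	void (*func)(struct deferred_work *w);
	int recovery_ok;	/* pretend outcome of the page recovery */
	int serialized;		/* set once the mock serialize step has run */
};

static struct deferred_work *pending;

/* Stands in for the serializing step after a mapping change. */
static void mock_serialize(struct deferred_work *w)
{
	w->serialized = 1;
}

/* Deferred callback: only here may the mapping actually change. */
static void recover_later(struct deferred_work *w)
{
	if (w->recovery_ok)
		mock_serialize(w);
}

/* Mock handler: queues the work and does NOT serialize. */
static void mock_handler(struct deferred_work *w)
{
	w->func = recover_later;
	pending = w;
}

/* Mock of the point just before returning to user mode. */
static void mock_return_to_user(void)
{
	struct deferred_work *w = pending;

	if (!w)
		return;
	pending = NULL;
	w->func(w);
}
```

In this mock, the serialized flag stays clear after mock_handler() and is set only once the queued callback runs, which mirrors why the call was moved next to the code that changes the mapping.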

 [ bp: Massage commit message. ]

Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200824221237.5397-1-tony.luck@intel.com
parent 368d1887
+4 −5
@@ -1190,6 +1190,7 @@ static void kill_me_maybe(struct callback_head *cb)

 	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags)) {
 		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		sync_core();
 		return;
 	}

@@ -1330,12 +1331,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	if (worst > 0)
 		irq_work_queue(&mce_irq_work);

-	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
-
-	sync_core();
-
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-		return;
+		goto out;

 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
@@ -1364,6 +1361,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 				mce_panic("Failed kernel mode recovery", &m, msg);
 		}
 	}
+out:
+	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);