
[PATCH 4.9 68/87] MIPS: Fix race on setting and getting cpu_online_mask

4.9-stable review patch.  If anyone has any objections, please let me know.


From: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@xxxxxxxxx>

commit 6f542ebeaee0ee552a902ce3892220fc22c7ec8e upstream.

While testing cpu hotplug (cpu down and up in loops) on kernel 4.4, it was
observed that the cpu online check in kernel/cpu.c occasionally fails:

        /* Arch-specific enabling code. */
        ret = __cpu_up(cpu, idle);

        if (ret != 0)
                goto out_notify;
        BUG_ON(!cpu_online(cpu));

The reason is a race between start_secondary and _cpu_up. cpu_callin_map is
set before cpu_online_mask. __cpu_up waits for cpu_callin_map but not for
cpu_online_mask, so the secondary processor may have started and set
cpu_callin_map without yet having set the online mask, and the BUG above
is hit.

Upstream differs in this area. The cpu_online check is in bringup_wait_for_ap,
which runs after the cpu has reached AP_ONLINE_IDLE, by which point the
secondary has passed its start function. Nonetheless, the fix makes
start_secondary safe and independent of other locks throughout the code. It
also protects against cpu_online checks being added in between in the future.

Fix this by moving the completion after all flags are set.
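
The ordering rule can be illustrated with a small userspace sketch. This is
not kernel code: struct fake_cpu, fake_complete, fake_wait_for_completion,
fake_start_secondary and fake_cpu_up are hypothetical stand-ins built on
pthreads and C11 atomics, assumed here only to mirror the roles of
cpu_online_mask, cpu_callin_map and the cpu_running completion. The secondary
publishes every flag first and signals last, so the waiter can never observe
"started but not online".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for per-cpu bring-up state. */
struct fake_cpu {
	atomic_bool online;    /* plays the role of the cpu_online_mask bit */
	atomic_bool callin;    /* plays the role of the cpu_callin_map bit  */
	pthread_mutex_t lock;  /* lock + cond + flag emulate a completion   */
	pthread_cond_t cond;
	bool running;
};

/* Emulates complete(): wake whoever waits for the secondary. */
void fake_complete(struct fake_cpu *c)
{
	pthread_mutex_lock(&c->lock);
	c->running = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Emulates wait_for_completion(): block until fake_complete() ran. */
void fake_wait_for_completion(struct fake_cpu *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->running)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Secondary side: set all flags first, signal completion last. */
void *fake_start_secondary(void *arg)
{
	struct fake_cpu *c = arg;

	atomic_store(&c->online, true);
	atomic_store(&c->callin, true);
	fake_complete(c);	/* moved after the flags, as in the fix */
	return NULL;
}

/*
 * Boot side: start the secondary, wait for it, then check the online
 * flag, like the BUG_ON(!cpu_online(cpu)) check quoted above.
 * Returns true if the secondary was seen online after the wait.
 */
bool fake_cpu_up(void)
{
	struct fake_cpu c;
	pthread_t thread;
	bool online;

	atomic_init(&c.online, false);
	atomic_init(&c.callin, false);
	pthread_mutex_init(&c.lock, NULL);
	pthread_cond_init(&c.cond, NULL);
	c.running = false;

	if (pthread_create(&thread, NULL, fake_start_secondary, &c) != 0) {
		pthread_cond_destroy(&c.cond);
		pthread_mutex_destroy(&c.lock);
		return false;
	}

	fake_wait_for_completion(&c);
	online = atomic_load(&c.online);

	pthread_join(thread, NULL);
	pthread_cond_destroy(&c.cond);
	pthread_mutex_destroy(&c.lock);
	return online;
}
```

Calling fake_cpu_up() in a loop, in the spirit of the hotplug test described
above, should always return true with this ordering. If fake_complete() were
called before the atomic_store() calls, the waiter could wake up and read the
online flag while it is still false, which is the window the patch closes.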

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@xxxxxxxxx>
Cc: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxx>
Cc: linux-mips@xxxxxxxxxxxxxx
Patchwork: https://patchwork.linux-mips.org/patch/16925/
Signed-off-by: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

 arch/mips/kernel/smp.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -371,9 +371,6 @@ asmlinkage void start_secondary(void)
 	cpumask_set_cpu(cpu, &cpu_coherent_mask);
-	complete(&cpu_running);
-	synchronise_count_slave(cpu);
 	set_cpu_online(cpu, true);
@@ -381,6 +378,9 @@ asmlinkage void start_secondary(void)
+	complete(&cpu_running);
+	synchronise_count_slave(cpu);
 	 * irq will be enabled in ->smp_finish(), enabling it too early
 	 * is dangerous.