Re: Re：[PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
- Date: Wed, 13 Mar 2019 13:58:02 -0700
- From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: Re：[PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
On Wed, Mar 13, 2019 at 09:17:15PM +0800, 陈华才 wrote:
> Hi, GREG,
> 4.9 need to modify spinlock.h, please wait my patch.
> 发件人:"Greg Kroah-Hartman"<gregkh@xxxxxxxxxxxxxxxxxxx>
> 发送时间:2019年3月13日(星期三) 凌晨1:10
> 主题:[PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
> 4.9-stable review patch. If anyone has any objections, please let me know.
> [ Upstream commit e02e07e3127d8aec1f4bcdfb2fc52a2d99b4859e ]
> On the Loongson-2G/2H/3A/3B there is a hardware flaw that ll/sc and
> lld/scd is very weak ordering. We should add sync instructions "before
> each ll/lld" and "at the branch-target between ll/sc" to workaround.
> Otherwise, this flaw will cause deadlock occasionally (e.g. when doing
> heavy load test with LTP).
> Below is the explaination of CPU designer:
> "For Loongson 3 family, when a memory access instruction (load, store,
> or prefetch)'executing occurs between the execution of LL and SC, the
> success or failure of SC is not predictable. Although programmer would
> not insert memory access instructions between LL and SC, the memory
> instructions before LL in program-order, may dynamically executed
> between the execution of LL/SC, so a memory fence (SYNC) is needed
> before LL/LLD to avoid this situation.
> Since Loongson-3A R2 (3A2000), we have improved our hardware design to
> handle this case. But we later deduce a rarely circumstance that some
> speculatively executed memory instructions due to branch misprediction
> between LL/SC still fall into the above case, so a memory fence (SYNC)
> at branch-target (if its target is not between LL/SC) is needed for
> Loongson 3A1000, 3B1500, 3A2000 and 3A3000.
> Our processor is continually evolving and we aim to to remove all these
> workaround-SYNCs around LL/SC for new-come processor."
> Here is an example:
> Both cpu1 and cpu2 simutaneously run atomic_add by 1 on same atomic var,
> this bug cause both ''un by two cpus (in atomic_add) succeed at same
> time(''eturn 1), and the variable is only *added by 1*, sometimes,
> which is wrong and unacceptable(it should be added by 2).
> Why disable fix-loongson3-llsc in compiler?
> Because compiler fix will cause problems in kernel'__ex_table section.
> This patch fix all the cases in kernel, but:
> +. the fix at the end of futex_atomic_cmpxchg_inatomic is for branch-target
> of 'e'there other cases which smp_mb__before_llsc() and smp_llsc_mb() fix
> the ll and branch-target coincidently such as atomic_sub_if_positive/
> cmpxchg/xchg, just like this one.
> +. Loongson 3 does support CONFIG_EDAC_ATOMIC_SCRUB, so no need to touch
> +. local_ops and cmpxchg_local should not be affected by this bug since
> only the owner can write.
> +. mips_atomic_set for syscall.c is deprecated and rarely used, just let
> it go
> Signed-off-by: Huacai Chen <chenhc@xxxxxxxxxx>
> Signed-off-by: Huang Pei <huangpei@xxxxxxxxxxx>
> - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
> a comment describing why it'there.
> - Make loongson_llsc_mb() a no-op when
> CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
> - Add a comment describing the bug & how loongson_llsc_mb() helps
> in asm/barrier.h.]
> Signed-off-by: Paul Burton <paul.burton@xxxxxxxx>
> Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
> Cc: ambrosehua@xxxxxxxxx
> Cc: Steven J . Hill <Steven.Hill@xxxxxxxxxx>
> Cc: linux-mips@xxxxxxxxxxxxxx
> Cc: Fuxin Zhang <zhangfx@xxxxxxxxxx>
> Cc: Zhangjin Wu <wuzhangjin@xxxxxxxxx>
> Cc: Li Xuefeng <lixuefeng@xxxxxxxxxxx>
> Cc: Xu Chenghua <xuchenghua@xxxxxxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> arch/mips/Kconfig | 15 ++++++++++++++
> arch/mips/include/asm/atomic.h | 6 ++++++
> arch/mips/include/asm/barrier.h | 36 +++++++++++++++++++++++++++++++++
> arch/mips/include/asm/bitops.h | 5 +++++
> arch/mips/include/asm/futex.h | 3 +++
> arch/mips/include/asm/pgtable.h | 2 ++
> arch/mips/loongson64/Platform | 23 +++++++++++++++++++++
> arch/mips/mm/tlbex.c | 10 +++++++++
> 8 files changed, 100 insertions(+)
Ok, I will go drop this from all stable queues now, thanks!