Machine freezes and crashes with the message "soft lockup - CPU#0 stuck for 22s!"




Hello all,

I'm helping a user figure out an issue. In her syslog file I found:
BUG: soft lockup - CPU#0 stuck for 22s!

And the following stack trace:
Stack:
Dec 28 14:34:09 colossus kernel: [946492.108011]  ffffffffa041af86 ffff880429a20728 ffffffffa04a801d 0000000000000001
Dec 28 14:34:09 colossus kernel: [946492.108011]  ffff8802785f2048 ffff8802785f2048 000000007fffffff ffffffff812b5057
Dec 28 14:34:09 colossus kernel: [946492.108011]  0000000000000415 00000000000000d0 ffff8802785f2288 ffffffff812b50d5
Dec 28 14:34:09 colossus kernel: [946492.108011] Call Trace:
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa041af86>] ? ttm_bo_vm_fault+0x4c6/0x560 [ttm]
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa04a801d>] ? radeon_bo_create+0x16d/0x220 [radeon]
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff812b5057>] ? idr_mark_full+0x57/0x60
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff812b50d5>] ? idr_alloc+0x75/0xd0
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa032991a>] ? drm_gem_handle_create_tail+0xba/0x160 [drm]
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa04ba297>] ? radeon_gem_create_ioctl+0xd7/0x160 [radeon]
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa04a6907>] ? radeon_ttm_fault+0x47/0x60 [radeon]
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff8116c04a>] ? __do_fault+0x3a/0xa0
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff811768dc>] ? mmap_region+0x19c/0x650
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff8116f10f>] ? do_shared_fault.isra.55+0x2f/0x1d0
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff811704c5>] ? handle_mm_fault+0x6c5/0x1140
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff811770ca>] ? do_mmap_pgoff+0x33a/0x420
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff810593c7>] ? __do_page_fault+0x177/0x410
Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff81527d28>] ? page_fault+0x28/0x30


Her configuration is as follows:
- Debian Jessie 8.10

- Kernel: Linux colossus 3.16.0-4-amd64 #1 SMP Debian 3.16.51-3 (2017-12-13) x86_64 GNU/Linux
- Kernel package: linux-image-amd64 version "3.16+63"

- Xorg version: 1.16.4
- Package version of xserver-xorg-core: 2:1.16.4-1+deb8u2

- Package version of firmware-amd-graphics: 20161130-3~bpo8+1
- Graphics card, as reported by lspci: "VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series]"

The full output of lspci is available here: https://framabin.org/?2ea8b0daa0892459#YEAgTr5Tt3SrdU7hoySOE4e1iTrKtaGSmqiGNSoRm48=


The issue has become more frequent recently. Because of the stack trace in syslog, I suspect a graphics card issue, but I don't understand how to determine exactly where the problem comes from or how to resolve it. I've searched a lot without finding any information that could help.
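
To make the suspicion about the graphics stack easier to check, here is a minimal Python sketch that tallies which kernel modules appear in a call trace. The function name modules_in_trace and the regular expression are illustrative assumptions, not an existing tool, and SAMPLE is just a handful of frames copied from the trace above. Frames without a trailing "[module]" are counted as core kernel code.

```python
import re
from collections import Counter

# Matches call-trace frames such as
#   [<ffffffffa041af86>] ? ttm_bo_vm_fault+0x4c6/0x560 [ttm]
# The trailing "[module]" is absent for symbols built into the kernel image.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def modules_in_trace(lines):
    """Count call-trace frames per kernel module (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            counts[match.group("module") or "(core kernel)"] += 1
    return counts


# A few frames copied from the syslog excerpt in this message.
SAMPLE = [
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa041af86>] ? ttm_bo_vm_fault+0x4c6/0x560 [ttm]",
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa04a801d>] ? radeon_bo_create+0x16d/0x220 [radeon]",
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff812b5057>] ? idr_mark_full+0x57/0x60",
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa032991a>] ? drm_gem_handle_create_tail+0xba/0x160 [drm]",
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffffa04a6907>] ? radeon_ttm_fault+0x47/0x60 [radeon]",
    "Dec 28 14:34:09 colossus kernel: [946492.108011]  [<ffffffff8116f10f>] ? do_shared_fault.isra.55+0x2f/0x1d0",
]

if __name__ == "__main__":
    for module, count in modules_in_trace(SAMPLE).most_common():
        print(module, count)
```

The same function could be fed the lines of the real syslog file instead of SAMPLE. A tally like this only shows which modules were on the stack when the lockup was reported; it does not by itself prove which one is at fault.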

Any thoughts?

Best regards.
--
Alex ARNAUD
Visual-Impairment Project Manager
Hypra - "Humanizing technology"