-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cpu hard lockup caused by memory #12
Comments
[root@datanode11 /cgroup/cpu/hadoop-yarn]# cat cpu.cfs_period_us Message from syslogd@datanode11 at Dec 11 03:10:35 ... Message from syslogd@datanode11 at Dec 11 03:10:49 ... Message from syslogd@datanode11 at Dec 11 03:10:49 ... |
The dmesg is memory related lockup. You need patched kernel. CPU related lockup will have |
…sion commit 8afb1d2 upstream. Commit 40f70c0 ("serial: sh-sci: add locking to console write function to avoid SMP lockup") copied the strategy to avoid locking problems in conjuncture with the console from the UART8250 driver. Instead using directly spin_{try}lock_irqsave(), local_irq_save() followed by spin_{try}lock() was used. While this is correct on mainline, for -rt it is a problem. spin_{try}lock() will check if it is running in a valid context. Since the local_irq_save() has already been executed, the context has changed and spin_{try}lock() will complain. The reason why spin_{try}lock() complains is that on -rt the spin locks are turned into mutexes and therefore can sleep. Sleeping with interrupts disabled is not valid. BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995 in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0 CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12 Hardware name: Generic RZ/G1 (Flattened Device Tree) Backtrace: [<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c) r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0 [<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94) [<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194) r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000 [<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74) r5:c06f4d60 r4:c06f4d60 [<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118) r5:c06f4d60 r4:c06f4d60 [<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124) r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798 r4:c06b9914 r3:c02576e4 [<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430) r10:c06d30d8 r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798 r4:c06d2798 r3:00000028 [<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0) r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027 r4:00000003 [<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30) r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c r4:c0062fa8 [<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14) [<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84) [<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc) r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523 r4:00000006 [<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178) r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c r4:dfbcf680 [<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248) r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000 r4:df525180 [<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c) r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00 r4:decac000 [<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c) r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00 Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…s_replace() commit bbb8c61 upstream. When a chain is updated, a counter can be attached. if so, the nft_counters_enabled should be increased. test commands: %nft add table ip filter %nft add chain ip filter input { type filter hook input priority 4\; } %iptables-compat -Z input %nft delete chain ip filter input we can see below messages. [ 286.443720] jump label: negative count! [ 286.448278] WARNING: CPU: 0 PID: 1459 at kernel/jump_label.c:197 __static_key_slow_dec_cpuslocked+0x6f/0xf0 [ 286.449144] Modules linked in: nf_tables nfnetlink ip_tables x_tables [ 286.449144] CPU: 0 PID: 1459 Comm: nft Tainted: G W 4.17.0-rc2+ #12 [ 286.449144] RIP: 0010:__static_key_slow_dec_cpuslocked+0x6f/0xf0 [ 286.449144] RSP: 0018:ffff88010e5176f0 EFLAGS: 00010286 [ 286.449144] RAX: 000000000000001b RBX: ffffffffc0179500 RCX: ffffffffb8a82522 [ 286.449144] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88011b7e5eac [ 286.449144] RBP: 0000000000000000 R08: ffffed00236fce5c R09: ffffed00236fce5b [ 286.449144] R10: ffffffffc0179503 R11: ffffed00236fce5c R12: 0000000000000000 [ 286.449144] R13: ffff88011a28e448 R14: ffff88011a28e470 R15: dffffc0000000000 [ 286.449144] FS: 00007f0384328700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000 [ 286.449144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 286.449144] CR2: 00007f038394bf10 CR3: 0000000104a86000 CR4: 00000000001006f0 [ 286.449144] Call Trace: [ 286.449144] static_key_slow_dec+0x6a/0x70 [ 286.449144] nf_tables_chain_destroy+0x19d/0x210 [nf_tables] [ 286.449144] nf_tables_commit+0x1891/0x1c50 [nf_tables] [ 286.449144] nfnetlink_rcv+0x1148/0x13d0 [nfnetlink] [ ... ] Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On 4.14.49-3 kernel (without the memory control patch)
The text was updated successfully, but these errors were encountered: