Problem with Vlan-mon and Intel X710/X722

andlui9
Posts: 37
Joined: 20 Jan 2017, 11:46

Post by andlui9 » 21 Jan 2020, 18:54

I recently had the opportunity to install Accel-PPP on some good servers with good network cards. However, I found that when the VLAN monitoring feature (vlan-mon) is enabled, for either the PPPoE server or the IPoE server, Linux produces the call trace that I will leave below. With vlan-mon turned off, the problem does not happen. I tried updating the network card driver (i40e), and I also tried switching CPU power management from powersave to performance mode, both in the BIOS and in Linux; neither attempt helped. I am using Debian 9 with a 4.9 kernel.

I even added the following options to tell the kernel how to handle CPU frequency and idle states, "intel_pstate=disable idle=poll intel_idle.max_cstate=1", also without success. I also left the scaling governor set to "performance", which did not help either.
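One way to confirm that settings like these actually took effect after a reboot is to read them back from the running system. The sketch below assumes a standard /proc/cmdline and the usual cpufreq sysfs layout; the function names are made up for illustration only.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): succeed if the given parameter
# appears as a whole word in a kernel command line string.
has_param() {
    # $1 = parameter, e.g. idle=poll ; $2 = command line text
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative helper (hypothetical name): print every expected parameter
# that is missing from the command line.
missing_params() {
    # $1 = command line text; remaining arguments = expected parameters
    cmdline=$1
    shift
    for p in "$@"; do
        has_param "$p" "$cmdline" || echo "$p"
    done
}

# Usage on a live system (paths assume a standard procfs/sysfs layout):
#   missing_params "$(cat /proc/cmdline)" intel_pstate=disable idle=poll intel_idle.max_cstate=1
#   sort /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq -c
```

No output from missing_params means all listed parameters are present on the running kernel's command line.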

The call trace is below:
[16611.374728] INFO: rcu_sched self-detected stall on CPU
[16611.374772] 21-...: (5250 ticks this GP) idle=89f/140000000000001/0 softirq=9653/9653 fqs=1733
[16611.374808] (t=5251 jiffies g=717042 c=717041 q=7008)
[16611.374840] Task dump for CPU 21:
[16611.374841] accel-pppd R running task 0 17318 1 0x00000008
[16611.374844] ffffffff9d719a00 ffffffff9caa953b 0000000000000015 ffffffff9d719a00
[16611.374846] ffffffff9cb830ad ffff8987ee5596c0 ffffffff9d64fd80 0000000000000000
[16611.374849] ffffffff9d719a00 00000000ffffffff ffffffff9cae51ca 0000000000000001
[16611.374851] Call Trace:
[16611.374853] <IRQ>
[16611.374859] [<ffffffff9caa953b>] ? sched_show_task+0xcb/0x130
[16611.374863] [<ffffffff9cb830ad>] ? rcu_dump_cpu_stacks+0x92/0xb2
[16611.374866] [<ffffffff9cae51ca>] ? rcu_check_callbacks+0x75a/0x8b0
[16611.374870] [<ffffffff9cafb770>] ? tick_sched_do_timer+0x30/0x30
[16611.374872] [<ffffffff9caebda8>] ? update_process_times+0x28/0x50
[16611.374874] [<ffffffff9cafb170>] ? tick_sched_handle.isra.12+0x20/0x50
[16611.374876] [<ffffffff9cafb7a8>] ? tick_sched_timer+0x38/0x70
[16611.374878] [<ffffffff9caec87e>] ? __hrtimer_run_queues+0xde/0x250
[16611.374880] [<ffffffff9caecf5c>] ? hrtimer_interrupt+0x9c/0x1a0
[16611.374883] [<ffffffff9d021b27>] ? smp_apic_timer_interrupt+0x47/0x60
[16611.374887] [<ffffffff9d02025e>] ? apic_timer_interrupt+0x9e/0xb0
[16611.374887] <EOI>
[16611.374896] [<ffffffffc033c21c>] ? i40e_find_filter+0x2c/0x70 [i40e]
[16611.374900] [<ffffffffc0341b54>] ? i40e_add_filter+0x54/0x140 [i40e]
[16611.374904] [<ffffffffc0343722>] ? i40e_vsi_add_vlan+0xe2/0x2f0 [i40e]
[16611.374908] [<ffffffffc0343963>] ? i40e_vlan_rx_add_vid+0x33/0x50 [i40e]
[16611.374912] [<ffffffffc0404afc>] ? vlan_mon_nl_cmd_add_vlan_mon+0x17c/0x2c0 [vlan_mon]
[16611.374915] [<ffffffff9cf4b9f5>] ? genl_family_rcv_msg+0x1c5/0x360
[16611.374917] [<ffffffff9cefce3e>] ? __kmalloc_reserve.isra.35+0x2e/0x80
[16611.374921] [<ffffffff9cbe9ae6>] ? kmem_cache_alloc_node_trace+0x156/0x5a0
[16611.374923] [<ffffffff9cf4bb90>] ? genl_family_rcv_msg+0x360/0x360
[16611.374925] [<ffffffff9cf4bc12>] ? genl_rcv_msg+0x82/0xc0
[16611.374927] [<ffffffff9cf4b194>] ? netlink_rcv_skb+0xa4/0xc0
[16611.374929] [<ffffffff9cf4b814>] ? genl_rcv+0x24/0x40
[16611.374931] [<ffffffff9cf4ab6a>] ? netlink_unicast+0x18a/0x230
[16611.374933] [<ffffffff9cf4af67>] ? netlink_sendmsg+0x357/0x3b0
[16611.374936] [<ffffffff9cef5996>] ? sock_sendmsg+0x36/0x40
[16611.374938] [<ffffffff9cef6428>] ? ___sys_sendmsg+0x2c8/0x2e0
[16611.374940] [<ffffffff9cf49593>] ? netlink_insert+0x1a3/0x320
[16611.374942] [<ffffffff9cf4979e>] ? netlink_autobind.isra.30+0x8e/0xd0
[16611.374944] [<ffffffff9cc0953a>] ? __check_object_size+0xfa/0x1d8
[16611.374946] [<ffffffff9cef4935>] ? move_addr_to_user+0xb5/0xd0
[16611.374948] [<ffffffff9cef6d31>] ? __sys_sendmsg+0x51/0x90
[16611.374951] [<ffffffff9ca03b7d>] ? do_syscall_64+0x8d/0x100
[16611.374953] [<ffffffff9d01e3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
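When reading a trace like this, it can help to pull out only the frames that belong to loadable modules, since those lines end in a [module] tag. The sketch below is one way to do that with sed; the function name is hypothetical and it expects dmesg-style text on standard input.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): print "module function" for
# every call-trace frame that ends in a [module] tag.
module_frames() {
    sed -n 's|^.*\] ? \([A-Za-z0-9_.]*\)+0x[0-9a-f]*/0x[0-9a-f]* \[\([A-Za-z0-9_]*\)\][[:space:]]*$|\2 \1|p'
}

# Usage on a live system:
#   dmesg | module_frames
```

Applied to the trace above, this lists four i40e frames (i40e_find_filter, i40e_add_filter, i40e_vsi_add_vlan, i40e_vlan_rx_add_vid) followed by vlan_mon_nl_cmd_add_vlan_mon, which suggests the CPU was inside the driver's VLAN filter code, reached from the vlan_mon netlink handler, when the stall was reported.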

lbw
Posts: 13
Joined: 09 Mar 2019, 00:16

Re: Problem with Vlan-mon and Intel X710/X722

Post by lbw » 21 Jan 2020, 23:26

Try using 'ethtool' to turn off any VLAN tag related acceleration and see if that solves your issue.

andlui9
Posts: 37
Joined: 20 Jan 2017, 11:46

Re: Problem with Vlan-mon and Intel X710/X722

Post by andlui9 » 31 Jan 2020, 17:54

I tried that, but without success.

This is the command I used:
ethtool -K enp134s0f2 rxvlan off txvlan off
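After a change like this, the result can be read back with `ethtool -k` (lowercase k). rxvlan and txvlan are the short names for the rx-vlan-offload and tx-vlan-offload features; rx-vlan-filter is a separate feature, and since the trace passes through i40e_vlan_rx_add_vid it may be worth looking at as well, though that is only a guess. A small sketch, with a made-up function name, that keeps just the VLAN lines from that output:

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): keep only the VLAN-related
# feature lines from `ethtool -k <iface>` output read on stdin.
vlan_features() {
    grep -E '^[[:space:]]*(rx|tx)-vlan-' | sed 's/^[[:space:]]*//'
}

# Usage on a live system:
#   ethtool -k enp134s0f2 | vlan_features
```

A line ending in [fixed] marks a feature that the driver does not allow to be changed with `ethtool -K`.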

lbw
Posts: 13
Joined: 09 Mar 2019, 00:16

Re: Problem with Vlan-mon and Intel X710/X722

Post by lbw » 11 Feb 2020, 05:38

I came across this today:

https://serverfault.com/questions/73205 ... untu-14-04

Disable LRO if enabling ip forwarding or bridging

WARNING: The ixgbe driver supports the Large Receive Offload (LRO) feature. This option offers the lowest CPU utilization for receives but is completely incompatible with routing/ip forwarding and bridging. If enabling ip forwarding or bridging is a requirement, it is necessary to disable LRO using compile time options as noted in the LRO section later in this document. The result of not disabling LRO when combined with ip forwarding or bridging can be low throughput or even a kernel panic.

Perhaps it's related?

andlui9
Posts: 37
Joined: 20 Jan 2017, 11:46

Re: Problem with Vlan-mon and Intel X710/X722

Post by andlui9 » 19 Feb 2020, 14:47

In my case it cannot even be changed; it was already disabled.
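A feature that cannot be changed shows up in `ethtool -k` output with a [fixed] suffix, for example `large-receive-offload: off [fixed]`. The sketch below, with a hypothetical function name, prints the state of a single feature from that output so the LRO case can be checked directly.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): print the state of one feature,
# e.g. "off [fixed]", from `ethtool -k <iface>` output read on stdin.
feature_state() {
    # $1 = feature name as printed by ethtool, e.g. large-receive-offload
    awk -F': ' -v name="$1" '
        { gsub(/^[ \t]+/, "", $1) }   # drop indentation on sub-features
        $1 == name { print $2 }
    '
}

# Usage on a live system:
#   ethtool -k enp134s0f2 | feature_state large-receive-offload
```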
