So I had put accel-ppp live with heavy load

PPPoE related questions
hashbang
Posts: 135
Joined: 12 Jul 2015, 10:28

Post by hashbang »

Hi,
I'm running a live test of accel-ppp under heavy load. The server configuration:
2 x Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
16 GB RAM
1 TB HDD
2-port 10G Intel 82599 card
Everything seems fine up to about 2.5 Gbps (around 2500 PPPoE users). Beyond that there are heavy RX packet drops on the WAN Ethernet port. I have tuned every parameter of the Ethernet controller (only flow control is left). Once the drops start, the only things that resolve them are reducing the load to below 50% or rebooting. I suspect it is triggered by a sudden increase in load past roughly 2400 users or 2.5 Gbps, after which there are uncontrollable RX drops on both LAN and WAN. I will post more details.
Screenshot of the server's health monitoring:
[Image: Screenshot from 2020-02-18 14-12-53.png]
Thanks
Attachments:
[Image: Screenshot from 2020-02-18 15-00-39.png]
dimka88
Posts: 866
Joined: 13 Oct 2014, 05:51
Contact:

Post by dimka88 »

Hi, first increase your ring buffers, as in the example below:

Code:

ethtool -G eth0 rx 4096 tx 4096
To check the available values, use `ethtool -g eth0`.
Disable offloads (gro, tso, gso, lro) if they are enabled:

Code:

ethtool -K eth0 gro off gso off tso off lro off
Use up-limiter=police instead of htb in the [shaper] section.
Please also provide the output of `cat /proc/interrupts` and a screenshot of `top` after pressing 1.
Did you see any messages in syslog/journalctl/dmesg from 19:00 to 21:00?

Post by hashbang »

ethtool -G eth0 rx 4096 tx 4096
Already done.
ethtool -K eth0 gro off gso off tso off lro off
Already done.
up-limiter=police instead of htb
Already done.
[Image: Screenshot from 2020-02-18 18-32-38.png]
I couldn't get any more information because the system had to be rebooted due to RX drops on both ports.
Thanks

Post by dimka88 »

Please show the output of `cat /proc/interrupts`.
I see abnormal load on cores `CPU5` and `CPU17`. Which connection type exactly are you using (pptp, l2tp, pppoe, ipoe)?
If this is L2 traffic, you need to enable RPS:

Code:

echo ff > /sys/class/net/ethX/queues/rx-0/rps_cpus
where ethX is each of your NICs.
P.S. I think this server can handle more than 10K connections.
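Note: the command above sets RPS on queue rx-0 only. A multi-queue NIC exposes one rx-N directory per receive queue under /sys/class/net/ethX/queues/, so a minimal sketch for applying the same mask to every RX queue of an interface might look like the following. The function name set_rps and its optional third argument (an alternative sysfs root, useful for a dry run against a scratch directory) are illustrative assumptions, not part of accel-ppp or ethtool.

```shell
#!/bin/sh
# set_rps IFACE MASK [ROOT]
# Write MASK into the rps_cpus file of every RX queue of IFACE.
# ROOT defaults to /sys/class/net; pass another directory to try it safely.
# (set_rps is a hypothetical helper name used only for this example.)
set_rps() {
    iface=$1
    mask=$2
    root=${3:-/sys/class/net}
    for q in "$root/$iface"/queues/rx-*; do
        # Skip the unexpanded glob or queues without an rps_cpus file.
        [ -e "$q/rps_cpus" ] || continue
        printf '%s\n' "$mask" > "$q/rps_cpus"
    done
}

# Example (needs root, interface name is an assumption):
#   set_rps eth0 ff
```

The setting does not survive a reboot, so it would have to be reapplied from whatever startup mechanism the system uses.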
lbw
Posts: 27
Joined: 09 Mar 2019, 00:16

Post by lbw »

How many threads do you have configured? It should be (at least) one per CPU.

Post by hashbang »

dimka88 wrote: 18 Feb 2020, 13:28 Please show the output of `cat /proc/interrupts`.
I see abnormal load on cores `CPU5` and `CPU17`. Which connection type exactly are you using (pptp, l2tp, pppoe, ipoe)?
If this is L2 traffic, you need to enable RPS:

Code:

echo ff > /sys/class/net/ethX/queues/rx-0/rps_cpus
where ethX is each of your NICs.
P.S. I think this server can handle more than 10K connections.
Thank you.
Done as mentioned above; sirq seems balanced now. I will post again after a few days of testing.

Post by hashbang »

lbw wrote: 19 Feb 2020, 07:06 How many threads do you have configured? It should be (at least) one per CPU.
Do you mean accel-ppp threads?

Post by lbw »

Yep, as per the following config:

[core]
...
thread-count=8

You should have one thread per core. I haven't considered whether one thread per hyper-thread would be better.
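Note: to pick a thread-count it can help to know how many physical cores the machine has, as opposed to logical CPUs (hyper-threads). A small sketch, assuming an x86 Linux /proc/cpuinfo that carries `physical id` and `core id` fields (some virtual machines and other architectures omit them, in which case this prints 0). physical_cores is a hypothetical helper name, and the optional file argument exists only so it can be tried on a saved copy.

```shell
#!/bin/sh
# physical_cores [FILE]
# Count distinct (physical id, core id) pairs, i.e. physical cores.
# FILE defaults to /proc/cpuinfo.
physical_cores() {
    awk -F': *' '
        /^physical id/ { pkg = $2 }                 # socket of this CPU entry
        /^core id/     { seen[pkg "/" $2] = 1 }     # one key per physical core
        END { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}"
}

# Compare with the number of logical CPUs:
#   physical_cores
#   nproc
```

If the two numbers differ, hyper-threading is on, and which of them suits thread-count best is something to measure rather than assume.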

Post by hashbang »

lbw wrote: 23 Feb 2020, 02:56 Yep, as per the following config:

[core]
...
thread-count=8

You should have one thread per core. I haven't considered whether one thread per hyper-thread would be better.
Thanks.
It crashed after 12 days of uptime, at 3500 PPPoE sessions and 4 Gbps of bandwidth.
There were drops on RX of the Ethernet interfaces.
CPU usage was near 100% on 8 cores, with the remaining 16 cores idle.
[Image: Screenshot from 2020-02-29 16-30-48.png]
How can I find out which application is using the sirq time?
I have increased the threads from 4 to 12.
Thanks
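Note on the sirq question: softirq time is kernel work done on behalf of interrupts (on a box like this, mostly network RX/TX processing), so it is generally not charged to a particular user application. What can be inspected is which CPUs are handling which softirq type, via /proc/softirqs. A minimal sketch that prints the NET_RX counter per CPU; net_rx_per_cpu is a hypothetical helper name, and the optional file argument is there only so it can be run on a saved copy.

```shell
#!/bin/sh
# net_rx_per_cpu [FILE]
# Print "CPUn count" for the NET_RX row of a /proc/softirqs style file.
# FILE defaults to /proc/softirqs.
net_rx_per_cpu() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) cpu[i] = $i; next }   # header: CPU0 CPU1 ...
        $1 == "NET_RX:" { for (i = 2; i <= NF; i++) print cpu[i - 1], $i }
    ' "${1:-/proc/softirqs}"
}

# Run it twice a few seconds apart and compare: the CPUs whose counters
# grow fastest are the ones doing the receive processing.
#   net_rx_per_cpu
```

The counters are cumulative since boot, so the difference between two samples is what matters, not the absolute values.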

Post by hashbang »

hashbang wrote: 20 Feb 2020, 15:04
dimka88 wrote: 18 Feb 2020, 13:28 [...] If this is L2 traffic, you need to enable RPS:

Code:

echo ff > /sys/class/net/ethX/queues/rx-0/rps_cpus
where ethX is each of your NICs. [...]
Done as mentioned above; sirq seems balanced now. I will post again after a few days of testing.
Is the value ff correct for 24 cores?
Thanks
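Note on the ff question: rps_cpus takes a hexadecimal CPU bitmask, one bit per CPU. ff has 8 bits set, so it selects CPU0 to CPU7 only, which may be related to the load sitting on 8 cores in the earlier report. Covering 24 CPUs takes 24 set bits, i.e. ffffff. A small sketch that builds such a mask for the first N CPUs; rps_mask is a hypothetical helper name, and this version deliberately handles only N from 1 to 32.

```shell
#!/bin/sh
# rps_mask N
# Print the hex bitmask with the lowest N bits set (CPU0 .. CPU N-1).
# Only 1 <= N <= 32 is handled in this sketch.
rps_mask() {
    n=$1
    if [ "$n" -lt 1 ] || [ "$n" -gt 32 ]; then
        echo "rps_mask: N must be between 1 and 32" >&2
        return 1
    fi
    # Leading hex digit for the bits that do not fill a whole nibble.
    case $((n % 4)) in
        1) mask=1 ;;
        2) mask=3 ;;
        3) mask=7 ;;
        *) mask= ;;
    esac
    # One "f" per complete group of 4 CPUs.
    i=$((n / 4))
    while [ "$i" -gt 0 ]; do
        mask="${mask}f"
        i=$((i - 1))
    done
    printf '%s\n' "$mask"
}

# rps_mask 8    prints ff
# rps_mask 24   prints ffffff
```

Whether spreading RPS over all 24 logical CPUs is the best choice is a separate tuning decision; the mask only controls which CPUs are eligible.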