- Debian 8 Jessie Linux pppoe 4.4.23 #1 SMP Fri Oct 7 12:23:46 CEST 2016 x86_64 GNU/Linux
- accel-ppp version 1.11.0 amd64
- freeradius 3.0.12
Wondering if anyone experiences anything similar with their accel-ppp installations. Occasionally (every few weeks) randomly on one of my accel-ppp installations radius communication stops completely working:
The radius server itself is fine since other NASes work without problems. It only happens on my accel-ppp NASes.
1) Accel-ppp stops sending accounting updates to radius ( There's no radius packets being sent out at all from accel-ppp - tested with tcpdump )
2) New connected sessions get stuck forever like this:
Code: Select all
#accel-ppp show sessions
| | 24:a4:3c:62:20:9 | 0.0.0.0 | | pppoe | | start | 00:10:34
3) CoA works, I can change parameters of existing connections.
After enabling verbose logging and log level=5, during connection from mac '24:a4:3c:62:20':
Code: Select all
#/var/log/accel-ppp/auth-fail.log
[2017-01-19 15:47:52]: debug: eth5: lcp_layer_init
[2017-01-19 15:47:52]: debug: eth5: auth_layer_init
[2017-01-19 15:47:52]: debug: eth5: ccp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ipcp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ipv6cp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ppp establishing
[2017-01-19 15:47:52]: debug: eth5: lcp_layer_start
[2017-01-19 15:47:52]: info: eth5: send [LCP ConfReq id=1 <auth CHAP-md5> <mru 1492> <magic 4d74d70d>]
[2017-01-19 15:47:52]: info: eth5: recv [LCP ConfReq id=d <mru 1492> <magic d78417f1>]
[2017-01-19 15:47:52]: info: eth5: send [LCP ConfAck id=d ]
[2017-01-19 15:47:55]: debug: eth5: fsm timeout 9
[2017-01-19 15:47:55]: info: eth5: send [LCP ConfReq id=1 <auth CHAP-md5> <mru 1492> <magic 4d74d70d>]
[2017-01-19 15:47:55]: info: eth5: recv [LCP ConfAck id=1 <auth CHAP-md5> <mru 1492> <magic 4d74d70d>]
[2017-01-19 15:47:55]: debug: eth5: lcp_layer_started
[2017-01-19 15:47:55]: debug: eth5: auth_layer_start
[2017-01-19 15:47:55]: info: eth5: send [CHAP Challenge id=1 <f610cf2387e0cc656bcca2654e27945>]
[2017-01-19 15:47:55]: debug: eth5: recv [LCP EchoReq id=0 <magic d78417f1>]
[2017-01-19 15:47:55]: debug: eth5: send [LCP EchoRep id=0 <magic 4d74d70d>]
[2017-01-19 15:47:55]: info: eth5: recv [CHAP Response id=1 <f4b0fc83d84296dddfe6be9d42be6b>, name="user573"]
[2017-01-19 15:47:55]: debug: eth5: radius(1): queue 0x7f890019ee38
[2017-01-19 15:47:56]: debug: eth5: recv [LCP EchoReq id=1 <magic d78417f1>]
[2017-01-19 15:47:56]: debug: eth5: send [LCP EchoRep id=1 <magic 4d74d70d>]
[2017-01-19 15:47:57]: debug: eth5: recv [LCP EchoReq id=2 <magic d78417f1>]
[2017-01-19 15:47:57]: debug: eth5: send [LCP EchoRep id=2 <magic 4d74d70d>]
[2017-01-19 15:47:58]: debug: eth5: recv [LCP EchoReq id=3 <magic d78417f1>]
[2017-01-19 15:47:58]: debug: eth5: send [LCP EchoRep id=3 <magic 4d74d70d>]
[2017-01-19 15:47:59]: debug: eth5: recv [LCP EchoReq id=4 <magic d78417f1>]
[2017-01-19 15:47:59]: debug: eth5: send [LCP EchoRep id=4 <magic 4d74d70d>]
[2017-01-19 15:48:00]: debug: eth5: recv [LCP EchoReq id=5 <magic d78417f1>]
[2017-01-19 15:48:00]: debug: eth5: send [LCP EchoRep id=5 <magic 4d74d70d>]
[2017-01-19 15:48:01]: debug: eth5: recv [LCP EchoReq id=6 <magic d78417f1>]
[2017-01-19 15:48:01]: debug: eth5: send [LCP EchoRep id=6 <magic 4d74d70d>]
[2017-01-19 15:48:02]: debug: eth5: recv [LCP EchoReq id=7 <magic d78417f1>]
[2017-01-19 15:48:02]: debug: eth5: send [LCP EchoRep id=7 <magic 4d74d70d>]
[2017-01-19 15:48:03]: debug: eth5: recv [LCP EchoReq id=8 <magic d78417f1>]
[2017-01-19 15:48:03]: debug: eth5: send [LCP EchoRep id=8 <magic 4d74d70d>]
#/var/log/accel-ppp/accel-ppp.log
[2017-01-19 15:47:52]: debug: ppp226: send [LCP EchoReq id=4f <magic 08c82c8e>]
[2017-01-19 15:47:52]: debug: ppp1314: recv [LCP EchoRep id=53 <magic c2c65a45>]
[2017-01-19 15:47:52]: debug: ppp664: recv [LCP EchoRep id=a2 <magic d0cbb492>]
[2017-01-19 15:47:52]: info: eth5: recv [PPPoE PADI 24:a4:3c:62:20:95 => ff:ff:ff:ff:ff:ff sid=0000 <Service-Name net> <Host-Uniq 000056d2>]
[2017-01-19 15:47:52]: info: eth5: send [PPPoE PADO 00:1c:21:b1:3c:79 => 24:a4:3c:62:20:95 sid=0000 <AC-Name accel-ppp> <Service-Name net> <AC-Cookie $
[2017-01-19 15:47:52]: debug: ppp661: send [LCP EchoReq id=a2 <magic 5963e94c>]
[2017-01-19 15:47:52]: debug: ppp661: recv [LCP EchoRep id=a2 <magic f2728c1e>]
[2017-01-19 15:47:52]: debug: ppp40: send [LCP EchoReq id=b3 <magic 38479dba>]
[2017-01-19 15:47:52]: debug: ppp260: send [LCP EchoReq id=f1 <magic 38bebedd>]
[2017-01-19 15:47:52]: debug: ppp660: send [LCP EchoReq id=a2 <magic 4de64be7>]
[2017-01-19 15:47:52]: info: eth5: recv [PPPoE PADR 24:a4:3c:62:20:95 => 00:1c:21:b1:3c:79 sid=0000 <Service-Name net> <Host-Uniq 000056d2> <AC-Cookie$
[2017-01-19 15:47:52]: info: eth5: send [PPPoE PADS 00:1c:21:b1:3c:79 => 24:a4:3c:62:20:95 sid=1201 <AC-Name accel-ppp> <Service-Name net> <Host-Uniq $
[2017-01-19 15:47:52]: debug: eth5: lcp_layer_init
[2017-01-19 15:47:52]: debug: eth5: auth_layer_init
[2017-01-19 15:47:52]: debug: eth5: ccp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ipcp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ipv6cp_layer_init
[2017-01-19 15:47:52]: debug: eth5: ppp establishing
[2017-01-19 15:47:52]: debug: eth5: lcp_layer_start
[2017-01-19 15:47:52]: info: eth5: send [LCP ConfReq id=1 <auth CHAP-md5> <mru 1492> <magic 4d74d70d>]
[2017-01-19 15:47:52]: debug: ppp660: recv [LCP EchoRep id=a2 <magic 175cb2c8>]
[2017-01-19 15:47:52]: debug: ppp221: send [LCP EchoReq id=c2 <magic 7ee7e4f2>]
[2017-01-19 15:47:52]: debug: ppp40: recv [LCP EchoRep id=b3 <magic 1cbc1d82>]
[2017-01-19 15:47:52]: debug: ppp260: recv [LCP EchoRep id=f1 <magic 747b8cfb>]
[2017-01-19 15:47:52]: debug: ppp7: recv [LCP EchoRep id=42 <magic 2dbf16cd>]
[2017-01-19 15:47:52]: info: eth5: recv [LCP ConfReq id=d <mru 1492> <magic d78417f1>]
[2017-01-19 15:47:52]: info: eth5: send [LCP ConfAck id=d ]
[2017-01-19 15:47:52]: debug: ppp232: send [LCP EchoReq id=95 <magic 04abb9e1>]
[2017-01-19 15:47:52]: debug: ppp866: send [LCP EchoReq id=f0 <magic 5bafd71f>]
It is probably the problem. Radius requests are queued, and never dequeued - causing no requests to go out at all.
accel-ppp show stat
Code: Select all
accel-ppp# show stat
uptime: 34.15:50:25
cpu: 1%
mem(rss/virt): 44208/511724 kB
core:
mempool_allocated: 6913660
mempool_available: 962036
thread_count: 4
thread_active: 1
context_count: 422
context_sleeping: 0
context_pending: 0
md_handler_count: 3486
md_handler_pending: 0
timer_count: 1170
timer_pending: 0
sessions:
starting: 0
active: 411
finishing: 0
pppoe:
starting: 0
active: 411
delayed PADO: 0
recv PADI: 468476
drop PADI: 0
sent PADO: 347042
recv PADR(dup): 40068(98)
sent PADS: 37035
filtered: 0
radius(1, 10.1.1.2):
state: active
fail count: 0
request count: 50
queue length: 751
auth sent: 55539
auth lost(total/5m/1m): 27924/0/0
auth avg query time(5m/1m): 0/0 ms
acct sent: 29663
acct lost(total/5m/1m): 744/0/0
acct avg query time(5m/1m): 0/0 ms
interim sent: 3825115
interim lost(total/5m/1m): 3106/0/0
interim avg query time(5m/1m): 0/0 ms
This looks suspicious:
https://sourceforge.net/p/accel-ppp/cod ... req.c#l355
and this:
https://sourceforge.net/p/accel-ppp/cod ... cct.c#l386
and this: https://sourceforge.net/p/accel-ppp/cod ... erv.c#l376
Code: Select all
if (req->try++ == conf_max_try) {
...
Code: Select all
if (req->try == conf_max_try) {
...
Code: Select all
if (... >= conf_max_try)
accel-ppp.conf used:
Code: Select all
[modules]
log_file
#log_syslog
pppoe
auth_chap_md5
radius
shaper
#connlimit
[core]
log-error=/var/log/accel-ppp/core.log
thread-count=4
[common]
#single-session=deny
[ppp]
verbose=1
min-mtu=1280
max-mtu=1492
#mtu=1492
#mru=1492
ipv4=require
ipv6=deny
ipv6-intf-id=0:0:0:1
ipv6-peer-intf-id=0:0:0:2
ipv6-accept-peer-intf-id=1
lcp-echo-interval=12
lcp-echo-timeout=60
unit-cache=1
mppe=deny
ccp=0
[pppoe]
verbose=1
called-sid=mac
interface=eth5
interface=eth5.2011
pado-delay=0,50:500,60:550,70:600,80:650,90:700,100:750,110:800,120:850,130:900,140:950,150:1000,160:1050,170:1100,180:1150,190:1200,200:1250,210:1300,220:1350,230:1400,240:1450,250:1500,260:1550,270:1600,280:1650,290:1700,300:1750,310:1800,320:1850,330:1900,340:1950,350:2000,360:2050,370:2100,380:2150,390:2200,400:2250,410:2300,420:2350,430:2400,440:2450,450:2500,460:2550,470:2600,480:2650,490:2700,500:2750,510:2800,520:2850,530:2900,540:2950,550:3000
mppe=deny
[dns]
dns1=8.8.4.4
dns2=8.8.8.8
[radius]
nas-identifier=accel-ppp
nas-ip-address=10.1.1.2
gw-ip-address=10.1.1.1
server=10.1.1.2,4LTPv4BJ8gfdgffg324243oFDosdf,auth-port=1812,acct-port=1813,req-limit=50,fail-timeout=0,max-fail=10,weight=1
dae-server=10.1.1.2:3799,hFdxbwfnLDFFDSfs234oiOQdefdLS
acct-interim-interval=600
acct-timeout=0
verbose=0
[log]
log-file=/var/log/accel-ppp/accel-ppp.log
log-emerg=/var/log/accel-ppp/emerg.log
log-fail-file=/var/log/accel-ppp/auth-fail.log
#log-debug=/dev/stdout
#syslog=accel-pppd,daemon
#log-tcp=127.0.0.1:3000
copy=1
#color=1
#per-user-dir=per_user
#per-session-dir=per_session
#per-session=1
level=3
[shaper]
time-range=1,11:00-23:00
time-range=2,23:00-11:00
attr=Filter-Id
up-limiter=police
down-limiter=tbf
verbose=1
#down-limiter=htb
#leaf-qdisc=fq_codel 10240 1024 5 100 1514 ecn
[cli]
verbose=1
telnet=10.1.1.2:5010
tcp=127.0.0.1:2001
password=qPpLgfdFSfdsfds32423122Fgfd
history-file=/dev/null
[snmp]
master=0
agent-name=accel-ppp
[connlimit]
limit=10/min
burst=3
timeout=60