L2TP internal queue is never drained resulting in no new sessions allowed

L2TP related questions
Post Reply
lbw
Posts: 27
Joined: 09 Mar 2019, 00:16

L2TP internal queue is never drained resulting in no new sessions allowed

Post by lbw »

Hi

There appears to be a bug in the L2TP code where the queue is never drained of old packets. I am seeing this:

[2020-01-02 11:39:13]: debug: l2tp tunnel 48946-60025 (1.2.3.4:1701): 1 message processed from reception queue
[2020-01-02 11:39:13]: debug: l2tp tunnel 48946-60025 (1.2.3.4:1701): 0 message sent from send queue
[2020-01-02 11:39:13]: info: l2tp tunnel 48946-60025 (1.2.34:1701): no message sent while processing the send queue (677763 outstanding messages): peer's receive window is full (32000 messages)

What's interesting is that existing sessions seem to be able to pass packets okay, but no new sessions can be created in this state:

[2020-01-02 11:39:13]: info: l2tp session 57883-32220, 65237-59016: deleting session
[2020-01-02 11:39:13]: info: l2tp session 57883-32220, 65237-59016: session destroyed
[2020-01-02 11:39:13]: info: l2tp session 48946-60025, 59120-24016: session establishment timeout, disconnecting session
[2020-01-02 11:39:13]: info: l2tp session 48946-60025, 59120-24016: sending CDN (res: 10, err: 0)

I'm trying to find the issue in the code now, but if anyone else wants to look at the same time, that'd be appreciated.

Thanks
lbw
Posts: 27
Joined: 09 Mar 2019, 00:16

Re: L2TP internal queue is never drained resulting in no new sessions allowed

Post by lbw »

Looking further at this, because it keeps happening, I've gathered the following logs (with some extra debugging entires added by me):

[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): retransmission #1
[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): retransmit (timeout) [L2TP tid=17769 sid=65292 Ns=30596 Nr=21442 <Message-Type Incoming-Call-Reply> <Assigned-Session-ID -17061>]
[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): send [L2TP tid=17769 sid=65292 Ns=30596 Nr=35335 <Message-Type Incoming-Call-Reply> <Assigned-Session-ID -17061>]
[2020-05-27 10:25:01]: warn: l2tp: unknown avp received (type=46, M=0)
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 1 message added to reception queue
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 1 message acked by peer
[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): recv [L2TP tid=20306 sid=0 Ns=35335 Nr=30597 <Message-Type Call-Disconnect-Notify> <Result-Code> <Assigned-Session-ID 5449>]
[2020-05-27 10:25:01]: warn: l2tp tunnel 20306-17769 (1.2.3.233:1701): discarding CDN with no Session ID: disconnecting sessions using Assigned Session ID is currently not supported
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 1 message processed from reception queue
[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): send [L2TP tid=17769 sid=56139 Ns=31620 Nr=35336 <Message-Type Incoming-Call-Reply> <Assigned-Session-ID -24990>]
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): l2tp_tunnel_push_sendqueue at line 640 dec send_queue_len = 265816
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 1 message sent from send queue
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): 1 message added to reception queue
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): 0 message acked by peer
[2020-05-27 10:25:01]: info: l2tp tunnel 47704-8558 (1.2.3.237:1701): recv [L2TP tid=47704 sid=0 Ns=36678 Nr=33885 <Message-Type Incoming-Call-Request> <Call-Serial-Number 1234> <Assigned-Session-ID 23977> <Bearer-Type 0> <Calling-Number 1234> <Called-Number 1234>]
[2020-05-27 10:25:01]: info: l2tp tunnel 47704-8558 (1.2.3.237:1701): handling ICRQ
[2020-05-27 10:25:01]: info: l2tp session 47704-8558, 25026-23977: sending ICRP
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): l2tp_tunnel_send at line 692 inc send_queue_len = 232376
[2020-05-27 10:25:01]: info: l2tp tunnel 47704-8558 (1.2.3.237:1701): new session 25026-23977 created following reception of ICRQ
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): 1 message processed from reception queue
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): 0 message sent from send queue
[2020-05-27 10:25:01]: info: l2tp tunnel 47704-8558 (1.2.3.237:1701): no message sent while processing the send queue (232376 outstanding messages): peer's receive window is full (32000 messages)
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): sending ZLB
[2020-05-27 10:25:01]: debug: l2tp tunnel 47704-8558 (1.2.3.237:1701): send [L2TP tid=8558 sid=0 Ns=349 Nr=36679]
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 65096-37182: session establishment timeout, disconnecting session
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 65096-37182: sending CDN (res: 10, err: 0)
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): l2tp_tunnel_send at line 692 inc send_queue_len = 265817
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 0 message sent from send queue
[2020-05-27 10:25:01]: info: l2tp tunnel 20306-17769 (1.2.3.233:1701): no message sent while processing the send queue (265817 outstanding messages): peer's receive window is full (1024 messages)
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 65096-37182: deleting session
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 65096-37182: session destroyed
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 22573-6851: session establishment timeout, disconnecting session
[2020-05-27 10:25:01]: info: l2tp session 20306-17769, 22573-6851: sending CDN (res: 10, err: 0)
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): l2tp_tunnel_send at line 692 inc send_queue_len = 265818
[2020-05-27 10:25:01]: debug: l2tp tunnel 20306-17769 (1.2.3.233:1701): 0 message sent from send queue

This renders the entire accel-ppp instance broken (it starts to drop sessions slowly) and doesn't accept new sessions.

The value of conn->retransmit never exceeds 2 so it is not greater than conn->max_retransmit and never causes the tunnel to be torn down. However, the window remains full for hours:

$ fgrep -c 'window is full' accel-ppp.log
783581

My assumption is that a ZLB is not received for the messages in the queue and so the window remains full when those messages should be discarded and the tunnel torn down.

Looking at the code, I can't yet explain why the retransmit never exceeds 2.

Thoughts?
Post Reply