Good day!
I raised this topic on the nag forum earlier. We suspected the server. It is now running on new hardware, and it still crashes the same way.
kernel:[ 8239.012639] ------------[ cut here ]------------
Message from syslogd@bras04 at Oct 25 05:40:14 ...
kernel:[ 8239.012738] invalid opcode: 0000 [#1] SMP
Message from syslogd@bras04 at Oct 25 05:40:14 ...
kernel:[ 8239.014550] Stack:
Message from syslogd@bras04 at Oct 25 05:40:14 ...
kernel:[ 8239.014882] Call Trace:
Message from syslogd@bras04 at Oct 25 05:40:14 ...
kernel:[ 8239.016025] Code: 39 eb 75 1e 48 8b 5d 20 48 3b 1c 24 c7 45 60 01 00 00 00 75 0d 41 8b 04 24 48 29 45 30 e9 9a 00 00 00 41 8b 46 18 39 43 20 72 2d <0f> 0b 8b 7b 24 41 8b 46 0c ff c2 41 8b 34 24 89 53 20 0f af c7
Logs are attached. At first there were "cannot bind address" errors; that was because I had set nas-ip-address incorrectly. I then fixed it and restarted accel-ppp.
After that, this error appeared:
[2014-10-25 05:37:04]: error: ipoe: netlink error: No buffer space available
Followed immediately by:
[2014-10-25 05:37:04]: info: terminate, sig = 15
That is probably what killed the system.
The second log contains many entries like these:
[2014-10-25 05:37:05]: error: vlan44: ipoe: nl_create: error talking to kernel
[2014-10-25 05:37:05]: error: vlan44: ipoe: missing IPOE_ATTR_IFINDEX attribute
[2014-10-25 05:37:05]: error: vlan44: ipoe: failed to create interface
[2014-10-25 05:37:05]: info: vlan44: ipoe: session finished
Kernel 3.2.35
accel-ppp ffa229d08ddc026433645dee724276a0a850708c
I have not had a chance to capture debug output yet.
accel-pppd crash. (ffa229d08ddc026433645dee724276a0a8507)
- Attachments
- logs.tar.bz2 (5.85 KiB)
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
emp wrote: After that, this error appeared:
[2014-10-25 05:37:04]: error: ipoe: netlink error: No buffer space available
Is it running out of memory?
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
The box has 16 GB of memory. It looks more like some network buffers are running short.
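One way to look at the socket buffer limits mentioned here is to read the kernel's net.core settings. The sketch below is only an illustration: it assumes the standard Linux files under /proc/sys/net/core, and whether these limits have anything to do with this particular "No buffer space available" error is an assumption, not something the thread confirms. The helper name read_net_core_limits is made up for this example.

```python
import os

def read_net_core_limits(names=("rmem_default", "rmem_max", "wmem_default", "wmem_max"),
                         root="/proc/sys/net/core"):
    """Return a dict of socket buffer limits in bytes; None where a value cannot be read."""
    limits = {}
    for name in names:
        path = os.path.join(root, name)
        try:
            with open(path) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Missing file or unexpected content: record it as unknown.
            limits[name] = None
    return limits
```

Calling read_net_core_limits() on the BRAS and comparing the values before and after a change would at least show whether the defaults are still in place.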
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
Judging by the log, you are getting more than 450 requests per second that end in failure. I have managed to produce a similar memory leak and crash myself using dhcdrop. The memory usage needs to be analyzed. I think the server cannot allocate memory for the buffer because something has consumed all of it.
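A rate like the one estimated here can be checked by counting error lines per second in the accel-ppp log. This is a minimal sketch that assumes the line format seen in the excerpts in this thread ("[YYYY-MM-DD HH:MM:SS]: level: ..."); the function names are made up for the example.

```python
import re
from collections import Counter

# Timestamp and severity prefix, e.g.
# "[2014-10-25 05:37:05]: error: vlan44: ipoe: failed to create interface"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]: (\w+):")

def errors_per_second(lines):
    """Map each one-second timestamp to the number of 'error' lines logged in it."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "error":
            counts[m.group(1)] += 1
    return counts

def busiest_seconds(lines, top=5):
    """Return the `top` timestamps with the most error lines, busiest first."""
    return errors_per_second(lines).most_common(top)
```

Passing an open log file to busiest_seconds() gives the worst seconds directly, which is usually enough to tell a short burst from a sustained flood.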
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
I updated to 421dac7884ab3b7253ba942aa05983e47289a1a5.
I noticed that accel-ppp now restarts itself. Core dump: http://dev.zra.ru/files/dump-2014-10-27-01.tar.bz2
[2014-10-28 07:25:41]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:27:08]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:27:08]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:27:26]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:28:55]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:28:55]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:29:16]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:30:45]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:30:45]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:31:06]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:32:35]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:32:35]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:32:57]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:34:33]: msg: accel-ppp version 421dac7884ab3b7253ba942aa05983e47289a1a5
[2014-10-28 07:34:33]: error: ipoe: netlink error: No buffer space available
[2014-10-28 07:34:56]: error: ipoe: netlink error: No buffer space available
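The restart cadence in this log can be measured rather than eyeballed. The sketch below assumes that each "msg: accel-ppp version" line marks a fresh start of the daemon, which is an inference from the excerpt and not something the log states; restart_intervals is a made-up helper name.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def restart_intervals(lines, marker="msg: accel-ppp version"):
    """Return the seconds between consecutive lines containing `marker`."""
    starts = []
    for line in lines:
        if line.startswith("[") and "]" in line and marker in line:
            stamp = line[1:line.index("]")]  # text between the square brackets
            starts.append(datetime.strptime(stamp, STAMP_FORMAT))
    return [int((b - a).total_seconds()) for a, b in zip(starts, starts[1:])]
```

For the six startup lines above this gives gaps of 87, 107, 110, 110 and 118 seconds, so the daemon is coming back up roughly every two minutes.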
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
Open the core on your side and run thread apply all bt full
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
For some reason it will not open for me, or I am doing something wrong.
# gdb core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
"/tmp/core": not in executable format: File format not recognized
(gdb) thread apply all bt full
(gdb)
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
gdb path/to/the/binary path/to/the/core
gdb is pretty much hinting at it =)
"/tmp/core": not in executable format: File format not recognized
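The same backtrace can be collected without an interactive session. The sketch below builds a gdb command line from the binary and the core, using gdb's -batch and -ex options; the two function names are made up for this example, and the paths passed in are whatever applies on your system.

```python
import subprocess

def gdb_backtrace_command(binary, core):
    """Build a non-interactive gdb invocation that prints all thread backtraces."""
    return [
        "gdb",
        "-batch",                            # run the given commands, then exit
        "-ex", "thread apply all bt full",   # the command requested in this thread
        binary,                              # executable that produced the core
        core,                                # the core file itself
    ]

def dump_backtrace(binary, core):
    """Run gdb and return its combined stdout/stderr as text."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout
```

For example, dump_backtrace("/usr/sbin/accel-pppd", "/tmp/core") returns the backtrace as one string that can be saved to a file and attached to a post.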
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
# gdb /usr/sbin/accel-pppd /tmp/core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/accel-pppd...done.
[New LWP 80687]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/accel-pppd -d --dump /tmp -p /var/run/accel-pppd.pid -c /etc/accel-pp'.
Program terminated with signal 6, Aborted.
#0 0x00007faa9d62d165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) thread apply all bt full
Thread 1 (Thread 0x7faa9b1a0700 (LWP 80687)):
#0 0x00007faa9d62d165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007faa9d6303e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2 0x000000000042b226 in sigsegv (num=11) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/main.c:203
cmd = "gdb -x cmd-1414463830 /usr/sbin/accel-pppd 67686 > dump-1414463830", '\000' <repeats 4029 times>
fname = "cmd-1414463830", '\000' <repeats 113 times>
exec_file = "/usr/sbin/accel-pppd", '\000' <repeats 4075 times>
lim = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
pid = 0
status = 0
#3 <signal handler called>
No symbol table info available.
#4 0x0000000000426fee in iproute_get (dst=1285983823, gw=0x0) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/libnetlink/iputils.c:430
req = {n = {nlmsg_len = 0, nlmsg_type = 8, nlmsg_flags = 4, nlmsg_seq = 1500, nlmsg_pid = 1769480}, r = {rtm_family = 0 '\000', rtm_dst_len = 0 '\000', rtm_src_len = 0 '\000',
rtm_tos = 0 '\000', rtm_table = 9 '\t', rtm_protocol = 0 '\000', rtm_scope = 6 '\006', rtm_type = 0 '\000', rtm_flags = 1886351214},
buf = "\000\000\000\000$\000\016", '\000' <repeats 29 times>, "\004\210\377\377\n\000\001\000\000\000\000\000\000\000\000\000\n\000\002\000\000\000\000\000\000\000\000\000`\000\a", '\000' <repeats 93 times>"\274, \000\027", '\000' <repeats 185 times>, "0\002\032\000l\000\002\000h\000\001\000\001", '\000' <repeats 11 times>, "\001\000\000\000\001\000\000\000\001\000\000\000\001\000\000\000\000\000\000\000\001", '\000' <repeats 67 times>"\300, \001\n\000\b\000\001\000\000\000\000\000\024\000\005\000\377\377\000\000h-\325\000\060\222\000\000\350\003\000\000x\000\002\000\000\000\000\000@\000\000\000\334\005\000\000\001\000\000\000\001\000\000\000\001\000\000\000\001\000\000\000\003\000\000\000\240\017\000\000\350\003\000\000\000\000\000\000\200:\t\000\200Q\001\000\003\000\000\000"...}
r = 0x0
tb = {0x0, 0x0, 0x0, 0x7faa9b19ecdc, 0x0, 0x0, 0x7faa9b19ed98, 0x7faa9b19ec2c, 0x0, 0x0, 0x10000003e8, 0xffffc164544f0267, 0x171100010000, 0x1010, 0x656f70690003000a,
0xd000800000032, 0x10000500000064}
len = 0
res = 0
#5 0x00007faa9cfcb7c6 in __ipoe_session_start (ses=0xf58b38) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/ctrl/ipoe/ipoe.c:727
No locals.
#6 0x00007faa9cfcaef1 in auth_result (ses=0xf58b38, r=0) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/ctrl/ipoe/ipoe.c:544
username = 0xf59848 "79.x.x.76"
#7 0x00007faa9cdb8de2 in rad_auth_finalize (rpd=0xf59d88, r=0) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/radius/auth.c:150
No locals.
#8 0x00007faa9cdb9058 in rad_auth_recv (req=0xf5d9f8) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/radius/auth.c:196
pack = 0x1217348
dt = 23799
#9 0x00007faa9cdb6083 in rad_req_read (h=0xf5da08) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/radius/req.c:421
req = 0xf5d9f8
pack = 0x1217348
#10 0x00007faa9e849023 in ctx_thread (ctx=0xf58dd8) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/triton/triton.c:217
h = 0xf5f368
t = 0xf58bf0
call = 0x7faa9415c328
tt = 140370780653636
events = 1
---Type <return> to continue, or q <return> to quit---
#11 0x00007faa9e848dbf in triton_thread (thread=0xf4aa08) at /home/emp/accel-ppp/accel-ppp-code/accel-pppd/triton/triton.c:159
set = {__val = {516, 0 <repeats 15 times>}}
sig = 10
need_free = 0
#12 0x00007faa9e426b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#13 0x00007faa9d6d77bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#14 0x0000000000000000 in ?? ()
No symbol table info available.
Re: accel-pppd crash. (ffa229d08ddc026433645dee724276a0a85
Thanks, fixed.
As for error: No buffer space available, check ip link show | grep ipoe | wc -l at the moment it happens.
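To catch that count at the moment the error fires, it helps to have it scripted. The sketch below mirrors the pipeline ip link show | grep ipoe | wc -l: it counts output lines containing "ipoe". The function names are made up for this example, and, like the grep, it matches any line containing that substring.

```python
import subprocess

def count_ipoe_links(ip_link_output):
    """Count lines mentioning 'ipoe', the same thing `grep ipoe | wc -l` reports."""
    return sum(1 for line in ip_link_output.splitlines() if "ipoe" in line)

def current_ipoe_links():
    """Run `ip link show` and count the ipoe lines in its output."""
    out = subprocess.run(
        ["ip", "link", "show"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return count_ipoe_links(out)
```

Logging current_ipoe_links() every few seconds next to the accel-ppp log would show whether the interface count climbs before each "No buffer space available" entry.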