OOM / memory leak? #6

Open
cpalmer9 opened this issue Sep 9, 2015 · 7 comments

@cpalmer9

cpalmer9 commented Sep 9, 2015

Hi,
We've been running ptm built from here for some time and we see that the memory size for this process continually grows. Eventually Linux oom-killer intervenes. The ptmd process is about 3GB at this stage. It may take several days for it to reach this.
Restarting the process brings it back to < 1MB.

Yesterday I ran:
valgrind --leak-check=yes /usr/sbin/ptmd -l INFO
...for about 16 hours, then eventually hit ^C.
The end of the output was this:

==47300== 2,749,568 (207,616 direct, 2,541,952 indirect) bytes in 6,488 blocks are definitely lost in loss record 138 of 141
==47300== at 0x4C2820A: malloc (vg_replace_malloc.c:296)
==47300== by 0x414D39: csv_encode (in /usr/sbin/ptmd)
==47300== by 0x403660: ptm_status_lldp (in /usr/sbin/ptmd)
==47300== by 0x40A5A2: ptm_conf_get_port_status (in /usr/sbin/ptmd)
==47300== by 0x40A7BA: ptm_conf_ctl_cmd_get_status (in /usr/sbin/ptmd)
==47300== by 0x40A21B: ptm_conf_process_client_query (in /usr/sbin/ptmd)
==47300== by 0x407514: ptm_event_ctl (in /usr/sbin/ptmd)
==47300== by 0x4078DF: ptm_module_handle_event_cb (in /usr/sbin/ptmd)
==47300== by 0x4075C6: ptm_process_ctl (in /usr/sbin/ptmd)
==47300== by 0x408525: main (in /usr/sbin/ptmd)
==47300==
==47300== 4,404,190 (202,976 direct, 4,201,214 indirect) bytes in 6,343 blocks are definitely lost in loss record 141 of 141
==47300== at 0x4C2820A: malloc (vg_replace_malloc.c:296)
==47300== by 0x414905: csv_concat_record (in /usr/sbin/ptmd)
==47300== by 0x40A522: ptm_conf_get_port_status (in /usr/sbin/ptmd)
==47300== by 0x40A7BA: ptm_conf_ctl_cmd_get_status (in /usr/sbin/ptmd)
==47300== by 0x40A21B: ptm_conf_process_client_query (in /usr/sbin/ptmd)
==47300== by 0x407514: ptm_event_ctl (in /usr/sbin/ptmd)
==47300== by 0x4078DF: ptm_module_handle_event_cb (in /usr/sbin/ptmd)
==47300== by 0x4075C6: ptm_process_ctl (in /usr/sbin/ptmd)
==47300== by 0x408525: main (in /usr/sbin/ptmd)
==47300==
==47300== LEAK SUMMARY:
==47300== definitely lost: 1,181,783 bytes in 36,934 blocks
==47300== indirectly lost: 14,435,998 bytes in 451,125 blocks
==47300== possibly lost: 3,408 bytes in 78 blocks
==47300== still reachable: 60,530 bytes in 364 blocks
==47300== suppressed: 0 bytes in 0 blocks
==47300== Reachable blocks (those to which a pointer was found) are not shown.
==47300== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==47300==
==47300== For counts of detected and suppressed errors, rerun with: -v
==47300== Use --track-origins=yes to see where uninitialised values come from
==47300== ERROR SUMMARY: 14471 errors from 32 contexts (suppressed: 7 from 5)

@kanrajag
Contributor

Hi Christopher
let me take a look and get back to you.


kanrajag pushed a commit that referenced this issue Sep 10, 2015
free up the field struct when freeing up a record

Issue #6
@kanrajag
Contributor

The field structs were not being freed when a record was freed.
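To illustrate the kind of leak described here, a minimal sketch follows. The `field_t` and `record_t` types and both function names are hypothetical and only stand in for whatever csv.c actually defines; the point is just that the free routine has to walk the field list before releasing the record, otherwise every field shows up in valgrind as lost.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not the real definitions in csv.c. */
typedef struct field {
    char *data;            /* heap-allocated field contents */
    struct field *next;    /* next field in the same record */
} field_t;

typedef struct record {
    field_t *fields;       /* head of the field list */
} record_t;

/* Prepend a copy of `text` as a new field.
 * Returns 0 on success, -1 on allocation failure. */
int record_add_field(record_t *rec, const char *text)
{
    field_t *fld = malloc(sizeof(*fld));
    if (fld == NULL)
        return -1;

    size_t len = strlen(text) + 1;
    fld->data = malloc(len);
    if (fld->data == NULL) {
        free(fld);
        return -1;
    }
    memcpy(fld->data, text, len);

    fld->next = rec->fields;
    rec->fields = fld;
    return 0;
}

/* Free every field (and its data) before freeing the record itself.
 * Returns the number of field structs released so the behaviour is
 * easy to check in a test. */
size_t record_free(record_t *rec)
{
    size_t freed = 0;

    if (rec == NULL)
        return 0;

    field_t *fld = rec->fields;
    while (fld != NULL) {
        field_t *next = fld->next;  /* save before freeing the node */
        free(fld->data);
        free(fld);
        fld = next;
        freed++;
    }
    free(rec);
    return freed;
}
```

Freeing only `rec` in `record_free` would leave each `field_t` and its `data` unreachable, which is the pattern valgrind reports as "definitely lost" plus "indirectly lost".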

@cpalmer9 - let me know if this fixes your issue. We can then close it out.

thanks!

@cpalmer9
Author

Hi, I'm not able to build an RPM anymore. Here's what I see:

Making all in src
make[2]: Entering directory `/home/cpalmer/rpm-root/BUILD/ptm-master/src'
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_lldp.o -MD -MP -MF .deps/ptm_lldp.Tpo -c -o ptm_lldp.o ptm_lldp.c
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_ctl.o -MD -MP -MF .deps/ptm_ctl.Tpo -c -o ptm_ctl.o ptm_ctl.c
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_event.o -MD -MP -MF .deps/ptm_event.Tpo -c -o ptm_event.o ptm_event.c
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warnings being treated as errors
ptm_event.c: In function 'main':
ptm_event.c:562: error: ignoring return value of 'fscanf', declared with attribute warn_unused_result
ptm_event.c:581: error: ignoring return value of 'daemon', declared with attribute warn_unused_result
ptm_event.c:584: error: ignoring return value of 'ftruncate', declared with attribute warn_unused_result
make[2]: *** [ptm_event.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f .deps/ptm_ctl.Tpo .deps/ptm_ctl.Po
cc1: warnings being treated as errors
In file included from /usr/include/stdio.h:932,
                 from ptm_event.h:6,
                 from ptm_lldp.c:17:
In function 'snprintf',
    inlined from 'ptm_init_lldp' at ptm_lldp.c:799:
/usr/include/bits/stdio2.h:65: error: call to __builtin___snprintf_chk will always overflow destination buffer
make[2]: *** [ptm_lldp.o] Error 1

Is this something you can address?
Building on Oracle Linux Server release 6.5
3.8.13-26.2.2.el6uek.x86_64
rpm-build-4.8.0-37.el6.x86_64
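
For reference, the three warn_unused_result errors above go away once the return values are checked. Below is a minimal sketch of that pattern; `read_pid` and `clear_file` are hypothetical helpers, not functions from ptm_event.c, and the same approach applies to daemon(), which returns -1 on failure.

```c
#define _POSIX_C_SOURCE 200809L  /* for ftruncate() under strict -std modes */
#include <stdio.h>
#include <unistd.h>

/* Read a PID from an open stream.
 * Returns 0 on success, -1 if no integer could be parsed. */
int read_pid(FILE *fp, int *pid)
{
    if (fscanf(fp, "%d", pid) != 1)
        return -1;
    return 0;
}

/* Truncate the file behind `fd` to zero length.
 * Returns 0 on success, -1 on error (errno is set by ftruncate). */
int clear_file(int fd)
{
    if (ftruncate(fd, 0) != 0)
        return -1;
    return 0;
}
```

Checking the results, rather than casting them to void, also keeps the build clean on toolchains where glibc marks these functions with warn_unused_result and -Werror is in effect.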

Additionally (and this is a lesser issue, but I've seen this for some time), "./configure" would show:

checking for agfstout in -lcgraph... yes
checking for lldpctl_atom_dec_ref in -llldpctl... yes
checking for lldpctl_process_conn_buffer in -llldpctl... yes
./configure: line 6616: 0: command not found
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile

Would changing "configure.ac" to the following help?
AC_CHECK_LIB(lldpctl, lldpctl_process_conn_buffer, [], AC_MSG_ERROR([Need LLDPD > 0.7.10]))

@kanrajag
Contributor

When did this start failing? I can compile cleanly on a Debian box. Let me see what I can do.

@cpalmer9
Author

AFAICT the code has not changed, but in our build, gcc stack protection is now enabled automatically with the -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector flags. From code inspection it looks like it is copying a hostname into a buffer meant for an address string?
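
A small sketch of what I mean, with made-up names (`struct peer`, `ADDR_LEN` and `peer_set_addr` are hypothetical, not taken from ptm_lldp.c): with _FORTIFY_SOURCE=2 the compiler knows the real size of the destination, so passing snprintf a length larger than that buffer is rejected at build time. Bounding the write by sizeof the destination avoids it.

```c
#include <stdio.h>

#define ADDR_LEN 16   /* hypothetical size of an address-string buffer */

struct peer {
    char addr[ADDR_LEN];
};

/* Copy `src` into p->addr, bounding the write by the destination's own
 * size rather than by the size of some other (larger) buffer.
 * Returns 0 if the string fit, -1 if it was truncated or formatting failed. */
int peer_set_addr(struct peer *p, const char *src)
{
    int n = snprintf(p->addr, sizeof(p->addr), "%s", src);

    if (n < 0 || (size_t)n >= sizeof(p->addr))
        return -1;
    return 0;
}
```

snprintf always NUL-terminates when the size is non-zero, so a long hostname ends up truncated in `addr` instead of overflowing it, and the caller can tell from the return value.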

@kanrajag
Contributor

Thanks. I just fixed this. Let me know if your compilation is successful.

@cpalmer9
Author

Hi, I needed to make a change to my build SPEC file for it to compile in that environment.

export CFLAGS=-Wno-attributes

...without that flag, I see:

csv.c: In function 'csv_decode':
csv.c:386: warning: 'save1' may be used uninitialized in this function
mv -f .deps/hash.Tpo .deps/hash.Po
gcc -DHAVE_CONFIG_H -I. -I..     -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT itimer.o -MD -MP -MF .deps/itimer.Tpo -c -o itimer.o itimer.c
mv -f .deps/csv.Tpo .deps/csv.Po
mv -f .deps/itimer.Tpo .deps/itimer.Po
mv -f .deps/log.Tpo .deps/log.Po
rm -f libptmdep.a
ar cru libptmdep.a csv.o hash.o hashtable.o log.o itimer.o 
ranlib libptmdep.a
make[2]: Leaving directory `/home/cpalmer/rpm-root/BUILD/ptm-master/lib'
Making all in src
make[2]: Entering directory `/home/cpalmer/rpm-root/BUILD/ptm-master/src'
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_lldp.o -MD -MP -MF .deps/ptm_lldp.Tpo -c -o ptm_lldp.o ptm_lldp.c
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_ctl.o -MD -MP -MF .deps/ptm_ctl.Tpo -c -o ptm_ctl.o ptm_ctl.c
gcc -DHAVE_CONFIG_H -I. -I..  -I../lib -Igraph -Ihash  -Wno-enum-compare -std=gnu99 -Wall -Werror -g -O0 -DPTMD_VERSION=\"TMD_VERSION\" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT ptm_event.o -MD -MP -MF .deps/ptm_event.Tpo -c -o ptm_event.o ptm_event.c
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warning: command line option "-Wno-enum-compare" is valid for C++/ObjC++ but not for C
cc1: warnings being treated as errors
ptm_event.c: In function 'main':
ptm_event.c:562: error: ignoring return value of 'fscanf', declared with attribute warn_unused_result
ptm_event.c:581: error: ignoring return value of 'daemon', declared with attribute warn_unused_result
ptm_event.c:584: error: ignoring return value of 'ftruncate', declared with attribute warn_unused_result
make[2]: *** [ptm_event.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f .deps/ptm_ctl.Tpo .deps/ptm_ctl.Po
mv -f .deps/ptm_lldp.Tpo .deps/ptm_lldp.Po
make[2]: Leaving directory `/home/cpalmer/rpm-root/BUILD/ptm-master/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/cpalmer/rpm-root/BUILD/ptm-master'
make: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.Vg77jv (%build)
$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

Also, can you run autoreconf to regenerate the configure file? (I still see ./configure: line 6616: 0: command not found.)
