Bugzilla – Bug 130955
Memory leak (possibly in orinoco kernel driver)
Last modified: 2006-03-01 09:47:15 UTC
I'm running SUSE 10.0 on my laptop (Toshiba Satellite Pro 6100) with all of the latest patches, using it as a web/ssh server over the integrated 802.11b wireless card. I've come across an issue that forces me to reboot the machine about every 10 days: memory is gradually consumed until all free memory is exhausted and everything has to run from swap, which makes the machine unacceptably slow. I tried to track down the offending process with top, but the memory didn't seem to be associated with any process. I then used slabtop and determined that the 'size-64' allocations were growing out of control. The number of allocations increases with network traffic, so I believe the bug is in the orinoco_cs or orinoco driver for my network card.

Here's an illustration of the bug:

andy@vandemar:~> uptime
 10:41pm  up 8:02,  1 user,  load average: 0.00, 0.00, 0.00

From slabtop:

  OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
117059  117058  99%    0.06K   1919       61      7676K size-64

To generate some network load, from another machine on the network (left it running for a couple of minutes):

andy@neverwhere:~$ sudo ping -f 10.0.1.4

andy@vandemar:~> uptime
 10:47pm  up 8:08,  1 user,  load average: 0.00, 0.00, 0.00

From slabtop:

  OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
118584  118583  99%    0.06K   1944       61      7776K size-64

Notice how many extra 'size-64' objects have been allocated. As far as I can tell, these are never freed and aren't associated with any process. As you can imagine, with heavy network traffic over a couple of days, available memory is quickly consumed as these 'size-64' objects accumulate, and there is no way to recover it except by a reboot.
Here are the kernel modules currently loaded:

vandemar:~ # lsmod
Module                  Size  Used by
ipt_pkttype             1664  1
ipt_LOG                 6912  9
ipt_limit               2304  9
cpufreq_ondemand        6044  0
cpufreq_userspace       4444  0
cpufreq_powersave       1792  0
speedstep_ich           5004  0
speedstep_lib           4228  1 speedstep_ich
freq_table              4612  1 speedstep_ich
toshiba_acpi            5908  0
button                  7056  0
battery                10244  0
ac                      5252  0
af_packet              21384  2
edd                     9824  0
snd_pcm_oss            59168  0
snd_mixer_oss          18944  1 snd_pcm_oss
snd_seq                51984  0
snd_seq_device          8588  1 snd_seq
orinoco_cs             13448  1
orinoco                38548  1 orinoco_cs
hermes                  7296  2 orinoco_cs,orinoco
pcmcia                 37176  1 orinoco_cs
firmware_class          9856  1 pcmcia
ip6t_REJECT             5504  3
ipt_REJECT              5632  3
ipt_state               1920  12
e100                   35456  0
mii                     5504  1 e100
yenta_socket           23820  5
rsrc_nonstatic         12800  1 yenta_socket
pcmcia_core            39952  3 pcmcia,yenta_socket,rsrc_nonstatic
generic                 4484  0 [permanent]
snd_intel8x0           33408  0
iptable_mangle          2688  0
snd_ac97_codec         90876  1 snd_intel8x0
snd_ac97_bus            2432  1 snd_ac97_codec
iptable_nat            22228  0
snd_pcm                93064  3 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer              24452  2 snd_seq,snd_pcm
snd                    60420  8 snd_pcm_oss,snd_mixer_oss,snd_seq,snd_seq_device,snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer
soundcore               9184  1 snd
snd_page_alloc         10632  2 snd_intel8x0,snd_pcm
iptable_filter          2816  1
intel_agp              22044  1
agpgart                33096  1 intel_agp
uhci_hcd               32016  0
ip6table_mangle         2304  0
pci_hotplug            26164  0
usbcore               112640  2 uhci_hcd
ip_conntrack           42168  2 ipt_state,iptable_nat
ip_tables              19456  8 ipt_pkttype,ipt_LOG,ipt_limit,ipt_REJECT,ipt_state,iptable_mangle,iptable_nat,iptable_filter
ip6table_filter         2688  1
ip6_tables             18176  3 ip6t_REJECT,ip6table_mangle,ip6table_filter
ipv6                  242752  26 ip6t_REJECT
parport_pc             38980  0
lp                     11460  0
parport                33864  2 parport_pc,lp
dm_mod                 54972  0
ext3                  130440  1
jbd                    59940  1 ext3
ide_cd                 39684  0
cdrom                  36896  1 ide_cd
fan                     4996  0
thermal                14472  0
processor              24252  1 thermal
piix                    9988  0 [permanent]
ide_disk               17152  3
ide_core              122380  4 generic,ide_cd,piix,ide_disk

If there's any other information you need, let me know.
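The before/after check above can be scripted instead of eyeballing slabtop. Here's a minimal sketch (the `slab_count` helper is hypothetical, not part of any tool mentioned above) that pulls the active-object count for one cache out of a slabinfo dump; it assumes the 2.6-style /proc/slabinfo layout, where column 1 is the cache name and column 2 is <active_objs>. Note that reading the live /proc/slabinfo typically requires root, so the demonstration below runs against a captured sample instead:

```shell
#!/bin/sh
# slab_count: print the <active_objs> count for one slab cache.
# Hypothetical helper; assumes the 2.6-style slabinfo layout
# (column 1 = cache name, column 2 = <active_objs>).
slab_count() {
    # $1 = cache name, $2 = slabinfo file (normally /proc/slabinfo, root-readable)
    awk -v c="$1" '$1 == c { print $2 }' "$2"
}

# Demonstrate on a captured sample rather than the live /proc/slabinfo:
cat > /tmp/slabinfo.sample <<'EOF'
slabinfo - version: 2.0
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
size-64           117058 117059     64   61    1
size-32             5000   5100     32  113    1
EOF

slab_count size-64 /tmp/slabinfo.sample   # prints 117058
```

On the live system one would run `slab_count size-64 /proc/slabinfo` (as root) before and after the ping flood and compare the two numbers; a steadily growing delta under load is the leak described above.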
Just to show a more normal distribution of 'size-64' objects, and to highlight that the problem is serious, here's the slabtop output from my desktop system, which has been up for a couple of days now (also with very heavy network traffic, as well as constant use, but with a completely different 802.11g network card that uses ndiswrapper):

  OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  2501    2500  99%    0.06K     41       61       164K size-64

Notice that the number of 'size-64' objects is completely reasonable on this machine.
I just compiled and installed the vanilla kernel.org 2.6.14 kernel, using the config from /boot/config-2.6.13-15-default, and the problem has been completely resolved. I've been testing the machine over the past day, with the usual heavy network traffic, and the number of 'size-64' objects has remained steady.

andy@vandemar:~> uptime
 5:14pm  up 14:32,  2 users,  load average: 0.00, 0.00, 0.00

From slabtop:

  OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  1121    1060  94%    0.06K     19       59        76K size-64

Compare this to before, on the default 2.6.13-15 kernel: there are over 100 times more 'size-64' objects with the previous kernel! Also, with the 2.6.14 kernel the number of 'size-64' objects remains steady, unlike under 2.6.13-15, where they multiplied uncontrollably. There is definitely a bug in the 2.6.13-15 kernel.
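For anyone wanting to reproduce the test, building a vanilla kernel against the distro config went roughly like this. This is a sketch from memory, not an exact transcript; the kernel.org path and the SUSE config filename are from my setup, and the install step must be run as root:

```shell
# Rough steps to build vanilla 2.6.14 using the SUSE 2.6.13-15 config
# (versions and paths are from my machine; adjust as needed)
cd /usr/src
wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.14.tar.bz2
tar xjf linux-2.6.14.tar.bz2
cd linux-2.6.14
cp /boot/config-2.6.13-15-default .config
make oldconfig                  # answer prompts for options new in 2.6.14
make                            # build the kernel and modules
make modules_install install    # as root: install modules, kernel, and bootloader entry
```

After rebooting into the new kernel, the same ping-flood test above no longer grows the 'size-64' cache.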
Could you test the latest CVS kernel ("kernel of the day", or something like that)? Karsten checked in a fix for an orinoco memleak into the sl10.0 kernel on 2005-10-13.
No response for several months; closing.