Bug 130955 - Memory leak (possibly in orinoco kernel driver)
Status: RESOLVED FIXED
Alias: None
Product: SUSE LINUX 10.0
Classification: openSUSE
Component: Kernel
Version: Final
Hardware: i586
OS: SuSE Linux 10.0
Priority: P3 - Medium
Severity: Critical
Target Milestone: ---
Assignee: Andrea Arcangeli
QA Contact: E-mail List
 
Reported: 2005-10-27 04:04 UTC by Andrew Collins
Modified: 2006-03-01 09:47 UTC (History)
Found By: Customer


Description Andrew Collins 2005-10-27 04:04:37 UTC
I'm running SUSE Linux 10.0 on my laptop (Toshiba Satellite Pro 6100) with all of the latest patches, and I use it as a web/ssh server over the integrated 802.11b wireless card.  I've come across an issue that forces me to reboot the machine about every 10 days: memory is gradually consumed until no free memory is left, and everything has to run out of swap (which makes things unacceptably slow).  I tried to track down the offending process using top, but the memory didn't seem to be associated with any process.  I then used slabtop and determined that the 'size-64' allocations were out of control.  The number of allocations increases with network traffic, so I believe the bug is in the orinoco_cs or orinoco driver for my network card.
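The counts that slabtop displays come from /proc/slabinfo, so the growth of a single cache can also be tracked from a script. Below is a minimal sketch, assuming the slabinfo 2.x line layout (name, active_objs, num_objs, objsize, ...). The function name `slab_counts` is hypothetical, and in the sample line only the object counts come from this report; the tunables and sharedavail columns are placeholders.

```python
from typing import Optional, Tuple

# Sample line in slabinfo 2.x layout. The object and slab counts are taken
# from this report; the tunables and sharedavail values are placeholders.
SAMPLE = """\
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables ...
size-64  117058 117059  64  61  1 : tunables 120 60 0 : slabdata 1919 1919 0
"""

def slab_counts(slabinfo_text: str,
                cache: str = "size-64") -> Optional[Tuple[int, int, int]]:
    """Return (active_objs, num_objs, objsize) for `cache`, or None if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines begin with "slabinfo" or "#", so they never match
        # a real cache name in the first column.
        if len(fields) >= 4 and fields[0] == cache:
            return int(fields[1]), int(fields[2]), int(fields[3])
    return None

print(slab_counts(SAMPLE))  # (117058, 117059, 64)
```

On a live system the text would come from reading /proc/slabinfo (typically as root); sampling it periodically and logging `num_objs` would show whether the cache keeps growing.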

Here's an illustration of the bug:

andy@vandemar:~> uptime
10:41pm  up   8:02,  1 user,  load average: 0.00, 0.00, 0.00

From slabtop:
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
117059 117058  99%    0.06K   1919       61      7676K size-64

To generate some network load, I ran a flood ping from another machine on the network and left it running for a couple of minutes:
andy@neverwhere:~$ sudo ping -f 10.0.1.4

andy@vandemar:~> uptime
10:47pm  up   8:08,  1 user,  load average: 0.00, 0.00, 0.00

From slabtop:
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
118584 118583  99%    0.06K   1944       61      7776K size-64

Notice how many extra 'size-64' objects have been allocated: 1,525 more in about six minutes.  As far as I can tell, these are never freed, and they aren't associated with any process.  With heavy network traffic over a couple of days, available memory is quickly consumed as these 'size-64' objects are created but not released, and there is no way to recover it except by rebooting.
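The growth between the two slabtop samples above can be worked out directly. This is a rough sketch, assuming each 'size-64' object occupies 64 bytes (slabtop rounds this to 0.06K) and taking the elapsed time from the two uptime readings (8:02 and 8:08); the helper name `parse_slabtop_row` is hypothetical.

```python
def parse_slabtop_row(row: str) -> dict:
    """Split one slabtop data row into the fields used below."""
    objs, active, _use, _obj_size, slabs, _per_slab, cache_size, name = row.split()
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "slabs": int(slabs),
        "cache_kib": int(cache_size.rstrip("K")),
    }

# The two rows quoted in the report, before and after the flood ping.
before = parse_slabtop_row("117059 117058  99%    0.06K   1919       61      7676K size-64")
after = parse_slabtop_row("118584 118583  99%    0.06K   1944       61      7776K size-64")

OBJ_BYTES = 64   # assumption: nominal object size of the size-64 cache
ELAPSED_MIN = 6  # uptime went from 8:02 to 8:08

leaked_objs = after["objs"] - before["objs"]
print(leaked_objs)                               # 1525 objects
print(leaked_objs * OBJ_BYTES)                   # 97600 bytes of objects
print(after["cache_kib"] - before["cache_kib"])  # 100 (KiB of cache growth)
print(round(leaked_objs / ELAPSED_MIN))          # 254 objects per minute
```

Because the flood ping ran for only part of the six-minute window, the per-minute figure is an average over the window and understates the rate while the ping was running.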

Here are the kernel modules currently loaded:

vandemar:~ # lsmod
Module                  Size  Used by
ipt_pkttype             1664  1
ipt_LOG                 6912  9
ipt_limit               2304  9
cpufreq_ondemand        6044  0
cpufreq_userspace       4444  0
cpufreq_powersave       1792  0
speedstep_ich           5004  0
speedstep_lib           4228  1 speedstep_ich
freq_table              4612  1 speedstep_ich
toshiba_acpi            5908  0
button                  7056  0
battery                10244  0
ac                      5252  0
af_packet              21384  2
edd                     9824  0
snd_pcm_oss            59168  0
snd_mixer_oss          18944  1 snd_pcm_oss
snd_seq                51984  0
snd_seq_device          8588  1 snd_seq
orinoco_cs             13448  1
orinoco                38548  1 orinoco_cs
hermes                  7296  2 orinoco_cs,orinoco
pcmcia                 37176  1 orinoco_cs
firmware_class          9856  1 pcmcia
ip6t_REJECT             5504  3
ipt_REJECT              5632  3
ipt_state               1920  12
e100                   35456  0
mii                     5504  1 e100
yenta_socket           23820  5
rsrc_nonstatic         12800  1 yenta_socket
pcmcia_core            39952  3 pcmcia,yenta_socket,rsrc_nonstatic
generic                 4484  0 [permanent]
snd_intel8x0           33408  0
iptable_mangle          2688  0
snd_ac97_codec         90876  1 snd_intel8x0
snd_ac97_bus            2432  1 snd_ac97_codec
iptable_nat            22228  0
snd_pcm                93064  3 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer              24452  2 snd_seq,snd_pcm
snd                    60420  8 snd_pcm_oss,snd_mixer_oss,snd_seq,snd_seq_device,snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer
soundcore               9184  1 snd
snd_page_alloc         10632  2 snd_intel8x0,snd_pcm
iptable_filter          2816  1
intel_agp              22044  1
agpgart                33096  1 intel_agp
uhci_hcd               32016  0
ip6table_mangle         2304  0
pci_hotplug            26164  0
usbcore               112640  2 uhci_hcd
ip_conntrack           42168  2 ipt_state,iptable_nat
ip_tables              19456  8 ipt_pkttype,ipt_LOG,ipt_limit,ipt_REJECT,ipt_state,iptable_mangle,iptable_nat,iptable_filter
ip6table_filter         2688  1
ip6_tables             18176  3 ip6t_REJECT,ip6table_mangle,ip6table_filter
ipv6                  242752  26 ip6t_REJECT
parport_pc             38980  0
lp                     11460  0
parport                33864  2 parport_pc,lp
dm_mod                 54972  0
ext3                  130440  1
jbd                    59940  1 ext3
ide_cd                 39684  0
cdrom                  36896  1 ide_cd
fan                     4996  0
thermal                14472  0
processor              24252  1 thermal
piix                    9988  0 [permanent]
ide_disk               17152  3
ide_core              122380  4 generic,ide_cd,piix,ide_disk

If there's any other information you need, let me know.
Comment 1 Andrew Collins 2005-10-27 04:13:43 UTC
Just to show a more normal distribution of 'size-64' objects, and to highlight that the problem is serious, here's the output from slabtop from my desktop system that's been up for a couple of days now (also with very heavy network traffic, as well as constant use, but with a completely different 802.11g network card that uses ndiswrapper):

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  2501   2500  99%    0.06K     41       61       164K size-64

Notice that the number of 'size-64' objects is completely reasonable on this machine.
Comment 2 Andrew Collins 2005-10-29 22:21:21 UTC
I just compiled and installed the vanilla kernel.org 2.6.14 kernel, using the config from /boot/config-2.6.13-15-default, and the problem has been completely resolved.  I've been testing the machine over the past day, with the usual heavy network traffic, and the number of 'size-64' objects has remained steady.

andy@vandemar:~> uptime
5:14pm  up  14:32,  2 users,  load average: 0.00, 0.00, 0.00

From slabtop:
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  1121   1060  94%    0.06K     19       59        76K size-64

Compare this to before, on the default 2.6.13-15 kernel: there were over 100 times as many 'size-64' objects with the previous kernel!  Also, with the 2.6.14 kernel, the number of 'size-64' objects remains steady, unlike the 2.6.13-15 kernel, where they multiplied uncontrollably.  This points to a bug in the 2.6.13-15 kernel version.
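The "over 100 times" figure can be checked against the object counts quoted in this report. This is a small sketch using only those two numbers; the variable names are illustrative.

```python
# Total size-64 objects reported by slabtop on each kernel.
objs_2_6_13 = 117059  # 2.6.13-15-default, uptime 8:02
objs_2_6_14 = 1121    # vanilla 2.6.14, uptime 14:32

ratio = objs_2_6_13 / objs_2_6_14
print(f"{ratio:.1f}x")  # 104.4x
```

The 2.6.14 sample was taken after a longer uptime (14:32 versus 8:02), so the gap is not explained by the older kernel simply having run longer.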
 
Comment 3 Andrea Arcangeli 2005-11-16 10:27:36 UTC
Could you test the latest CVS (kernel of the day or something like that)?

Karsten checked a fix for an orinoco memleak into the sl10.0 kernel on 2005-10-13.
Comment 4 Olaf Kirch 2006-03-01 09:47:15 UTC
No response for several months; closing.