Bug 113595

Summary: utdelete logging
Product: [openSUSE] SUSE LINUX 10.0 Reporter: Robert Boehm <boehm.robert1>
Component: OtherAssignee: Andreas Kleen <ak>
Status: RESOLVED INVALID QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: acpi
Version: Beta 3   
Target Milestone: ---   
Hardware: x86   
OS: All   
Whiteboard:
Found By: Beta-Customer Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: acpidmp hex data
Battery State
One second of logging output

Description Robert Boehm 2005-08-28 00:35:11 UTC
Starting with Beta 2 and continuing with beta 3, after the computer has run for
a few hours, the system and warning logs log this (snippet of about 1.5 seconds):

Aug 23 14:32:17 bobslaptop kernel: utdelete-0404 [14] ut_update_ref_count : ****
Warning **** Large Reference Count (1597) in object ddfd7328
Aug 23 14:32:17 bobslaptop kernel:
Aug 23 14:32:17 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count : ****
Warning **** Large Reference Count (1596) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count : ****
Warning **** Large Reference Count (1597) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count : ****
Warning **** Large Reference Count (1598) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count : ****
Warning **** Large Reference Count (1597) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count : ****
Warning **** Large Reference Count (1598) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count : ****
Warning **** Large Reference Count (1599) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count : ****
Warning **** Large Reference Count (1598) in object ddfd7328
Aug 23 14:32:18 bobslaptop kernel:
Aug 23 14:32:18 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count : ****
Warning **** Large Reference Count (1599) in object ddfd7328

....continuing ad infinitum until a re-boot, where it will calm down until a
couple of hours pass.  I think this is ACPI related, but not sure.  Log files
get big as you can see about 10 to 15 lines every SECOND.

I cannot find the reason for this....
Comment 1 Robert Boehm 2005-08-28 12:18:30 UTC
I just reviewed my logs from yesterday, and there are at least 5000 or maybe
10000 entries of the utdelete "large Reference Count"...I see where there is a
"sanity check" in the checking script...which has been there for many kernel
versions...

But I don't know why on MY machine that it has become an issue when it never has
been before on any previous versions of SUSE through 9.3.
Comment 2 Robert Boehm 2005-08-28 19:45:13 UTC
Last comment on this, but further observations:  the logging continues at a
rapid pace...ALWAYS pertaining to object ddfd7328.  I don't know what that
object is or how to trace it.  The log file is filling up at the rate of about
2MB per hour which is manageable.  Logrotate seems to keep things under control.

I modified postfix to allow logdigest to mail me the daily message, which
postfix had been dumping becuase it was beyond the default limit size of about
10mb (TEXT FILE!!!).  I changed it to 100MB temporarily until this is cleared up.

Am I the only one who has this problem?  I wish I had another machine to beta
test with now, but I don't and can't say if it would happen on my other
machines...just the laptop (ASUS A2500H with a Celeron) and 512MB RAM.

This is perplexing for two reasons.  I don't like the idea of the thing being so
busy logging so much in the background (now about 50 lines per second) and I
don't know ANYTHING about the utdelete function except for reading part of the
code that makes it work (and not understanding a lot of it).

Thanks...
Comment 3 Hendrik Vogelsang 2005-08-29 09:15:06 UTC
ak any idea?
Comment 4 Andreas Kleen 2005-08-29 09:30:09 UTC
I already disabled CONFIG_ACPI_DEBUG, but that will just remove the  
message, but not make the underlying message go away.  
  
It's probably a reference counting bug somewhere. Robert, can you 
perhaps add acpidmp output from your machine? 
 
Intel folks, FYI. 
 
 
Comment 5 Robert Boehm 2005-08-29 10:55:29 UTC
Yes...I can get that for you...just to clarify...acpidmp sends a bunch of hex
data...is that what you are looking for?  My machine is now installing a bunch
of packages from a repository (it got stuck on a dependency while I was
sleeping)..but I'll do the commmand and redirect it to a file that Ill send to
you shortly...thanks.

Yes....removing the message is one thing...but the problem is another...but I am
having trouble figuring out what is unique about that particular machine.  I'll
get the file within a half-hour or so..

Bob
Comment 6 Andreas Kleen 2005-08-29 11:03:43 UTC
Yes, the hex data is what we want. 
 
Comment 7 Robert Boehm 2005-08-29 11:19:46 UTC
Created attachment 47964 [details]
acpidmp hex data

Here is the data....I can see a lot of output...I hope that it means something
for you...I'm looking at it now but I'm a little slow in the morning!
Comment 8 Robert Boehm 2005-08-29 17:49:17 UTC
Hey guys...I just got my logdigest report....postfix is working it now since I
modified the message size limit...but dang near choked anyway:

If you want a bit MORE of the log, fine, but it's all the same with ths numbers
just increasing slightly on the "count"...but are you ready for this:

For a 24 hour period the messages and warning log files had....ready?

376,043 lines!!!!!!!

Now...THAT's a big log file for a home laptop!!!

Anyway, if you guys can tell me how to turn the LOGGING of these events off,
even if we don't get to the root of the problem yet, that would be nice...I just
don't know where to go for that.  But of course, if we could find the root cause
and stop this, that would be better.  Is this a kernel issue talking with ACPI?
 Kernel 2.6.13 was released today and if it's a kernel bug...well...

Thanks again....I"m in and out but I'll keep watch on the e-mail...
Comment 9 Andreas Kleen 2005-08-29 18:04:59 UTC
Booting with acpi_dbg_level=0 should disable it.
Comment 10 Robert Boehm 2005-08-29 18:16:06 UTC
Thanks...I'll do that and see what happens on the logging anyway.

Bob
Comment 11 Alexey Starikovskiy 2005-08-30 13:43:26 UTC
What kernel/ACPICA version you are using? could you try
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050815-2.6.13.diff.bz2
?

alex
Comment 12 Robert Boehm 2005-08-30 15:20:28 UTC
I'm sorry, but I am not sure how to get the ACPICA information fromt he kernel.

This is the current kernel:  2.6.13-rc6-git13-4-default #1 Mon Aug 22 18:38:22

Which will soon be updated to the stable 2.6.13 release.  This is the stock
default kernel provided with SUSE 10.0 Beta 3, so you might call Novell
and get the info from them.

How do I get the current ACPICA information from my kernel build?
I'm not planning on compiling a kernel on this machine since it would
take forever....and I'm not confident in doing it.

Thanks
Comment 13 Andreas Kleen 2005-08-30 16:30:43 UTC
Sorry Alexey, I don't think it's reasonable to ask the SUSE betatesters
to recompile kernels with random patches.

Did you fix any reference count issues that might explain that message 
in the patchkit? 
Comment 14 Alexey Starikovskiy 2005-08-30 16:37:35 UTC
yes, there are some changes in that area in June.
Comment 15 Robert Boehm 2005-08-30 16:44:27 UTC
Sorry about not wanting to recompile.  No problems with previous kernels for me.
 Current bios on the ASUS A2500H (A2H) is 0213 (I think the latest) and it has a
sis chipset, if that is relenant.

I guess my question to you guys is if you are getting any other reports of this
or can duplicate it with any other machines.  Seems like I'm the only one.  I
have to now take some heavy boxes to the attic for the wife!
Comment 16 Robert Boehm 2005-08-30 17:49:32 UTC
The very last of the 367,000 entries for the 24 hour period before I shut down
the debugging...I don't know if it is relevant or helps or not...to see how the
numbers changed from before, but here are about 10 lines anyway.  Then I'll
leave you guys alone...after reading that driver patch, it's out of my league!

Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (590B
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (590A
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (590B
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (590C
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (590B
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (590A
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (590B
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (590C
) in object ddfd8328
Aug 29 13:30:10 bobslaptop kernel:
Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (590B
) in object ddfd8328

Thanks for all of the help, folks.

I plan on a fresh install of beta 4 in a couple of days....new kernel...we'll
see what happens.
Comment 17 Andreas Kleen 2005-08-30 22:03:32 UTC
Can you test if the kernel in 

ftp://ftp.suse.com/pub/people/ak/acpi-text/kernel-default-2.6.13-3.i586.rpm

still emits the messages? It includes the ACPI patchkit

(might take a few minutes until the kernel appears on the server)


Comment 18 Robert Boehm 2005-08-30 22:11:31 UTC
Yes...I'm downloading it now.  It will take a couple of hours after I boot the
new kernel for the logging to start, so be patient.  Also, I'll leave the
computer on all night with the new kernel...I'm leaving for a trip early in the
morning...I'll send a message then...

Bob
Comment 19 Robert Boehm 2005-08-30 23:08:59 UTC
Pardon me for it taking a while...I'm downloading the sources too and the ftp is
kinda slow.  I should have the kernel booted within the hour...then we'll see
what happens...
Comment 20 Robert Boehm 2005-08-31 00:14:34 UTC
Test Kernel booted properly at 0012 UTC...I will report on logging before
heading to bed and after I get up...
Comment 21 Robert Boehm 2005-08-31 01:50:48 UTC
Bad news guys.....utdelete warnings started again about an hour and a half after
reboot....you'll notice that the object number is different...but the symptoms
are exactly the same.  This is with the 2.6.13 acpt test kernel.  Here is a log
sample:

Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AE) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (4B0) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AE) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (4B0) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AE) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (4B0) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AE) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (4B0) in object ddfda7e0
Aug 30 20:44:04 bobslaptop kernel:
Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (4AF) in object ddfda7e0

I will not be around for most of the day on Wednesday.  I don't know how much I
can help now anyway.  If I have this trouble on the beta 4 install, I'll let you
know...and if it is in the final next month, I'll just stop logging it by
passing a permenant boot-option in the boot-loader.  If the system runs fine,
that's all that really matters.

I am still curious if this is duplicated anywhere, however....let me know if you
guys have seen it.

Bob
Comment 22 Andreas Kleen 2005-08-31 01:57:43 UTC
Thanks for testing. Alexey, this means the -mm* ACPI still has the 
reference count bug.

Hmm, I should have slipped a WARN_ON(1) in there...

beta4 won't show the logging because ACPI debugging is turned off, but
the underlying bug will be still there, just undetected.
Comment 23 Andreas Kleen 2005-08-31 03:45:12 UTC
Can you please try again with the kernel in ftp.suse.com:/pub/people/ak/acpi-test2
and report what it outputs? (might become noisy) 

Thanks,
Comment 24 Robert Boehm 2005-08-31 08:35:50 UTC
Yes...I'll grab it.  If I can get it downloaded and installed before going to
work, I'll boot into it and after work (after 6pm Chicago time) send you what it
is saying on the output.

Once more...have you guys seen this on any other machines?  Am I the only one
who sees this?  Just wondering...
Comment 25 Andreas Kleen 2005-08-31 08:38:30 UTC
You seem to be the only one so far.

Comment 26 Robert Boehm 2005-08-31 08:44:29 UTC
Hmmm...up kind of late, aren't we?  I'm up early.  I'm downloading it now...this
time I'll skip the sources for now (I'll get it later) and just install the
kernel to get it running.  Dowload will be done soon.  But as I said, I won't be
able to report until I'm back from Virginia tonight...I'll have it crunch and
log all day while I'm gone.  Thanks...
Comment 27 Robert Boehm 2005-08-31 09:23:21 UTC
New acpt test 2 kernel installed and booted at 0915 UTC.  I'll get the info to
you tonight from the logs..

I'm curious.  I have, in previous versions, used only final versions on this
computer.  I wonder if this problem has been here all along, but with acpi
debugging off, just never saw it?  If so, I know that it must not be detrimental
as I have had this unit on for days at a time without problems.  Also it seems
as if all acpi features work without problem...

OK....off to work...talk to you folks tonight!
Comment 28 Robert Boehm 2005-09-01 00:05:29 UTC
Sorry about the time delay.  I am now back from out of town and getting a cat of
the VERY large log file with the acpi information.  I will be leaving town again
tomorrow but will take this machine with me, however probably won't have the
normal facilities that I do here.  OK...it's about done...here is a snippet of
the output from just minutes ago (running for about 15 hours with the new acpi
test 2 kernel:

Aug 31 19:02:34 bobslaptop kernel:
Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at
drivers/acpi/utilities/utdelete.c:380
Aug 31 19:02:34 bobslaptop kernel:  [<c023a594>]
acpi_ut_update_object_reference+0xce/0x1a0
Aug 31 19:02:34 bobslaptop kernel:  [<c023a77e>] acpi_ut_remove_reference+0x8f/0xa9
Aug 31 19:02:34 bobslaptop kernel:  [<c02191ec>]
acpi_ds_clear_implicit_return+0x40/0x4f
Aug 31 19:02:34 bobslaptop kernel:  [<c02173b2>]
acpi_ds_exec_end_control_op+0x152/0x2b4
Aug 31 19:02:34 bobslaptop kernel:  [<c0217b1f>] acpi_ds_exec_end_op+0x1e8/0x5b5
Aug 31 19:02:34 bobslaptop kernel:  [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c
Aug 31 19:02:34 bobslaptop kernel:  [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca
Aug 31 19:02:34 bobslaptop kernel:  [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1
Aug 31 19:02:34 bobslaptop kernel:  [<c0231424>] acpi_ps_execute_method+0x108/0x197
Aug 31 19:02:34 bobslaptop kernel:  [<c022ce7e>]
acpi_ns_execute_control_method+0x108/0x12e
Aug 31 19:02:34 bobslaptop kernel:  [<c022cd3d>]
acpi_ns_evaluate_by_handle+0xef/0x128
Aug 31 19:02:34 bobslaptop kernel:  [<c022ca97>]
acpi_ns_evaluate_relative+0x167/0x1d1
Aug 31 19:02:34 bobslaptop kernel:  [<c022beed>] acpi_evaluate_object+0x141/0x257
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475292>]
acpi_battery_get_status+0xa2/0x1f0 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475975>]
acpi_battery_read_state+0x68/0x1d3 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c01757fa>] seq_read+0x6a/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c01590db>] vfs_read+0x8b/0x170
Aug 31 19:02:34 bobslaptop kernel:  [<c015947c>] sys_read+0x3c/0x70
Aug 31 19:02:34 bobslaptop kernel:  [<c0102d1b>] sysenter_past_esp+0x54/0x79
Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count   :
**** Warning **** Large Reference Count (2A2E) in object ddfda7e0
Aug 31 19:02:34 bobslaptop kernel:
Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at
drivers/acpi/utilities/utdelete.c:380
Aug 31 19:02:34 bobslaptop kernel:  [<c023a594>]
acpi_ut_update_object_reference+0xce/0x1a0
Aug 31 19:02:34 bobslaptop kernel:  [<c023a6d5>] acpi_ut_add_reference+0x6f/0x89
Aug 31 19:02:34 bobslaptop kernel:  [<c0228d8a>]
acpi_ex_resolve_node_to_value+0x326/0x3f4
Aug 31 19:02:34 bobslaptop kernel:  [<c022140c>] acpi_ex_resolve_to_value+0xf0/0x15f
Aug 31 19:02:34 bobslaptop kernel:  [<c0224bd8>]
acpi_ex_resolve_operands+0x4f6/0x956
Aug 31 19:02:34 bobslaptop kernel:  [<c0217a2d>] acpi_ds_exec_end_op+0xf6/0x5b5
Aug 31 19:02:34 bobslaptop kernel:  [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c
Aug 31 19:02:34 bobslaptop kernel:  [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca
Aug 31 19:02:34 bobslaptop kernel:  [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1
Aug 31 19:02:34 bobslaptop kernel:  [<c0231424>] acpi_ps_execute_method+0x108/0x197
Aug 31 19:02:34 bobslaptop kernel:  [<c022ce7e>]
acpi_ns_execute_control_method+0x108/0x12e
Aug 31 19:02:34 bobslaptop kernel:  [<c022cd3d>]
acpi_ns_evaluate_by_handle+0xef/0x128
Aug 31 19:02:34 bobslaptop kernel:  [<c022ca97>]
acpi_ns_evaluate_relative+0x167/0x1d1
Aug 31 19:02:34 bobslaptop kernel:  [<c022beed>] acpi_evaluate_object+0x141/0x257
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475292>]
acpi_battery_get_status+0xa2/0x1f0 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475975>]
acpi_battery_read_state+0x68/0x1d3 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c01757fa>] seq_read+0x6a/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c01590db>] vfs_read+0x8b/0x170
Aug 31 19:02:34 bobslaptop kernel:  [<c015947c>] sys_read+0x3c/0x70
Aug 31 19:02:34 bobslaptop kernel:  [<c0102d1b>] sysenter_past_esp+0x54/0x79
Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count   :
**** Warning **** Large Reference Count (2A2F) in object ddfda7e0
Aug 31 19:02:34 bobslaptop kernel:
Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at
drivers/acpi/utilities/utdelete.c:380
Aug 31 19:02:34 bobslaptop kernel:  [<c023a594>]
acpi_ut_update_object_reference+0xce/0x1a0
Aug 31 19:02:34 bobslaptop kernel:  [<c023a6d5>] acpi_ut_add_reference+0x6f/0x89
Aug 31 19:02:34 bobslaptop kernel:  [<c021926c>]
acpi_ds_do_implicit_return+0x71/0x7c
Aug 31 19:02:34 bobslaptop kernel:  [<c02192f3>] acpi_ds_is_result_used+0x7c/0x253
Aug 31 19:02:34 bobslaptop kernel:  [<c021954d>]
acpi_ds_delete_result_if_not_used+0x83/0xb8
Aug 31 19:02:34 bobslaptop kernel:  [<c0217eb3>] acpi_ds_exec_end_op+0x57c/0x5b5
Aug 31 19:02:34 bobslaptop kernel:  [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c
Aug 31 19:02:34 bobslaptop kernel:  [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca
Aug 31 19:02:34 bobslaptop kernel:  [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1
Aug 31 19:02:34 bobslaptop kernel:  [<c0231424>] acpi_ps_execute_method+0x108/0x197
Aug 31 19:02:34 bobslaptop kernel:  [<c022ce7e>]
acpi_ns_execute_control_method+0x108/0x12e
Aug 31 19:02:34 bobslaptop kernel:  [<c022cd3d>]
acpi_ns_evaluate_by_handle+0xef/0x128
Aug 31 19:02:34 bobslaptop kernel:  [<c022ca97>]
acpi_ns_evaluate_relative+0x167/0x1d1
Aug 31 19:02:34 bobslaptop kernel:  [<c022beed>] acpi_evaluate_object+0x141/0x257
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475292>]
acpi_battery_get_status+0xa2/0x1f0 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475975>]
acpi_battery_read_state+0x68/0x1d3 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c01757fa>] seq_read+0x6a/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c01590db>] vfs_read+0x8b/0x170
Aug 31 19:02:34 bobslaptop kernel:  [<c015947c>] sys_read+0x3c/0x70
Aug 31 19:02:34 bobslaptop kernel:  [<c0102d1b>] sysenter_past_esp+0x54/0x79
Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count   :
**** Warning **** Large Reference Count (2A30) in object ddfda7e0
Aug 31 19:02:34 bobslaptop kernel:
Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at
drivers/acpi/utilities/utdelete.c:380
Aug 31 19:02:34 bobslaptop kernel:  [<c023a594>]
acpi_ut_update_object_reference+0xce/0x1a0
Aug 31 19:02:34 bobslaptop kernel:  [<c023a77e>] acpi_ut_remove_reference+0x8f/0xa9
Aug 31 19:02:34 bobslaptop kernel:  [<c0219565>]
acpi_ds_delete_result_if_not_used+0x9b/0xb8
Aug 31 19:02:34 bobslaptop kernel:  [<c0217eb3>] acpi_ds_exec_end_op+0x57c/0x5b5
Aug 31 19:02:34 bobslaptop kernel:  [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c
Aug 31 19:02:34 bobslaptop kernel:  [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca
Aug 31 19:02:34 bobslaptop kernel:  [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1
Aug 31 19:02:34 bobslaptop kernel:  [<c0231424>] acpi_ps_execute_method+0x108/0x197
Aug 31 19:02:34 bobslaptop kernel:  [<c022ce7e>]
acpi_ns_execute_control_method+0x108/0x12e
Aug 31 19:02:34 bobslaptop kernel:  [<c022cd3d>]
acpi_ns_evaluate_by_handle+0xef/0x128
Aug 31 19:02:34 bobslaptop kernel:  [<c022ca97>]
acpi_ns_evaluate_relative+0x167/0x1d1
Aug 31 19:02:34 bobslaptop kernel:  [<c022beed>] acpi_evaluate_object+0x141/0x257
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475292>]
acpi_battery_get_status+0xa2/0x1f0 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<df475975>]
acpi_battery_read_state+0x68/0x1d3 [battery]
Aug 31 19:02:34 bobslaptop kernel:  [<c01757fa>] seq_read+0x6a/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c0175790>] seq_read+0x0/0x2e0
Aug 31 19:02:34 bobslaptop kernel:  [<c01590db>] vfs_read+0x8b/0x170
Aug 31 19:02:34 bobslaptop kernel:  [<c015947c>] sys_read+0x3c/0x70
Aug 31 19:02:34 bobslaptop kernel:  [<c0102d1b>] sysenter_past_esp+0x54/0x79
Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count   :
**** Warning **** Large Reference Count (2A2F) in object ddfda7e0

You can see the last line of each group is the original reference count
output...the first line is intersting in the "badness" check....Hopefully you
guys can figure this out.
Comment 29 Len Brown 2005-09-08 18:18:09 UTC
(In reply to comment #12)
> I'm sorry, but I am not sure how to get the ACPICA information

dmesg -s64000 | grep Subsystem
or
cat /proc/acpi/info
or
grep ACPI_CA_VERSION $KERNEL_SOURCE/include/acconfig.h

Note that this version number is updated for ACPICA (core interpreter)
updates, and that the Latest ACPI patch on kernel.org referenced
above may change without incrementing this number -- but this version
number only increments, so it does give a good idea what is running.
Comment 30 Len Brown 2005-09-08 18:57:38 UTC
I don't suppose I could ask you to test a kernel.org kernel?
There have been some updates in this area and it would be good
to find out if this is fixed upstream and if we need to backport
something to SuSE, or if this is still broken in the latest upstream.

In particular, ACPICA 20050902 contains a change in this area.
You can get it tonight's kernel snapshot from kernel.org;
or if you have a 2.6.13 (final) tree, you can apply the ACPI patch to it:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/
acpi-20050902-2.6.13.diff.gz
Comment 31 Robert Boehm 2005-09-08 19:44:22 UTC
OK...here is the situation.  Since the last time we talked, I did a clean
install of beta 4 which has the 2.6.13 final tree by default.  acpi debugging is
off by default so I'm not getting any output now, of course.  To turn acpi
debugging at boot time, should I set acpi_dbg_level=1 or some other integer?

Also, I downloaded the patch...can I apply the patch without recompiling the
kernel?  My question is important because now that I have this thing running
well, I'll need it on a trip coming up soon and don't want to mess it up too
much...so if no recompile is necessary, what is the procedure for applying the
patch?  Sorry for my ignorance...I'll go look it up while I wait for answers on
this.  I'll be happy to apply the patch and then turn debugging on again..and
let you see the output...

Thanks...

Bob
Comment 32 Robert Boehm 2005-09-08 21:05:12 UTC
Current ACPICA version:  20050408 (with the stock beta 4 kernel...I do think
that we tested this patch when I put in the kernel recommended before on beta 3.
 The output was the same...and there was some test output also..that you can see
above.

I'm going to turn on acpi debuggine now with this kernel and see what happens...
Comment 33 Robert Boehm 2005-09-08 21:36:49 UTC
Scratch that...I'm changing gears and don't know how much help I'll be...I
decided to install the upgraded packages from RC-1 into my beta 4 to see what
happens...included is a newly spun kernel as well...

I'm getting busy on other things now also...and don't know now much time I'll
have to keep on this thing....

The computer is running great though....so I'm not going to worry about it too
much...seems like my machine was the only one spitting that stuff out
anyway...and if I don't know about it...who cares?  I know...bad attitude......
Comment 34 Robert Boehm 2005-09-08 23:33:07 UTC
OK...update....the latest tree 2.6.13 kernels with the patch seem to have solved
the problem.  I have acpi debugging turned back on...the computer has run for
about 2.5 hours now and there is no more utdelete logging...it seems as if this
problem has been solved.  If you don't hear from me in the next few
hours...everything is normal.  I think that this bug is fixed....
Comment 35 Andreas Kleen 2005-09-09 01:15:27 UTC
The debugging is just turned off, so the problem 
is hidden.

Intel folks - does the backtrace in comment #28 trigger any bells? 
I just added a WARN_ON(1) to the error message.
Comment 36 Robert Boehm 2005-09-09 01:26:18 UTC
To clarify...I know that the debugging is turned off by default, but I put a
boot option acpi_dbg_level=1 which should have turned it on, right?  Or is that
option disabled.  With that option, I have no output now...where before I
did...this kernel is the one that came with beta 4.
Comment 37 Alexey Starikovskiy 2005-09-09 08:56:00 UTC
(In reply to comment #35)
> Intel folks - does the backtrace in comment #28 trigger any bells? 
> I just added a WARN_ON(1) to the error message.
Yes, from this backtrace it's obvious that problem is inside _BST (battery
status) method execution, but I cannot repeat that in userspace simulator. May
be we need full trace from ACPICA in order to debug this further.
This could be done with putting 0xffffffff into /proc/acpi/acpi_dbg_level on
kernel with debug output enabled. It is safer to set it to 0 again after reading
of /proc/acpi/battery/XXXX/state.
Comment 38 Andreas Kleen 2005-09-09 10:06:12 UTC
Robert can you please downgrade to the older test kernel i posted (which had
full debug enabled) and generate this information and attach it to the bug? Thanks.
Comment 39 Robert Boehm 2005-09-09 10:18:29 UTC
Yes I can...do I have to set any parameters for debugging?  I'm assuming that
acpi debugging is turned ON by default...
Comment 40 Andreas Kleen 2005-09-09 10:29:13 UTC
Do what Alexey recommended in comment #37.
Comment 41 Robert Boehm 2005-09-09 10:42:56 UTC
Yes, so I put 0xffffffff as the debug level at boot-time option, right?  I'll
boot the test 2 kernel with that option...get the /proc/acpi/battery/XXXX/state
info, AND the logging output for you.
Comment 42 Andreas Kleen 2005-09-09 10:53:43 UTC
Please do


echo 0xffffffff > /proc/acpi/acpi_dbg_level
for i in /proc/acpi/battery/*/state ; do cat $i ; done
echo 0 > /proc/acpi/acpi_dbg_level

and attach the output that generates.
Comment 43 Robert Boehm 2005-09-09 11:04:09 UTC
Got it...I'm booting into thet test 2 kernel now...had to grab it again...I'll
put it in a safe place this time.  I'll get the output for you...however it
might take an hour or so for the funny stuff to start...

By the way, on the NEW kernel that I was running, I looked at the
/proc/acpi/battery/*/state (and all looked normal, but what the heck do I
know?)...we'll get you the info soon.  

Bob
Comment 44 Robert Boehm 2005-09-09 11:33:33 UTC
OK...thing is logging like crazy now...however I can't get the output to
redirect to a file...so I'll grab the ouput directly and make a file for
you...also I'll put a snippet of the log file too...wait just a sec.

Also, in my computer it's /proc/acpi/debug_level and not acpi_dbg_level for what
that is worth...  Info coming shortly...
Comment 45 Robert Boehm 2005-09-09 11:41:07 UTC
THis time I wrote the script and executed it...ouput on next e-mail
Comment 46 Andreas Kleen 2005-09-09 11:42:41 UTC
It should be all in /var/log/messages
Comment 47 Robert Boehm 2005-09-09 11:50:59 UTC
Created attachment 49367 [details]
Battery State
Comment 48 Robert Boehm 2005-09-09 11:51:40 UTC
Created attachment 49368 [details]
One second of logging output
Comment 49 Robert Boehm 2005-09-09 12:10:22 UTC
Can I reboot now with either debugging turned off or set back to the defaul,
which I think is "1" on this kernel?

Also, if you guys have enough information, let me know when I can continue to
change my other software and the kernel again to the one with RC-1...I'll hang
onto this one in a safe place..
Comment 50 Andreas Kleen 2005-09-09 12:11:54 UTC
Sure you can reboot again.
Comment 51 Alexey Starikovskiy 2005-09-09 12:25:27 UTC
Robert,

I need part of the log with the error (Large Reference Count) present. The one
you just posted does not contain this string, making it useless. Could you
please try to find such a fragment in your logs?
Comment 52 Robert Boehm 2005-09-09 12:38:38 UTC
DAMN...I should have seen that...sorry.  I guess it wasn't running long enough
for that to start...it usually takes an hour or so....so, I had to re-boot with
the debugging parameter back on...it might take a bit of time...I'll get you a
snip WITH the reference count warning when it starts...

Again...sorry...and I feel so stupid!  This stuff is WAY over my head so forgive
me if I sound lost half the time......
Comment 53 Robert Boehm 2005-09-09 13:43:07 UTC
Uptime almost an hour now...still no bad reference count logging....should start
soon.  I do notice that the utdelete function seems normal for now, which I
think you can see on the snippet that I gave you...to compare with the section I
WILL give you once the thing sees a "high" reference count...assuming that it
will...I anticipate it starting shortly...this thing is logging so much that my
hard drive is going to blow up soon....(!)
Comment 54 Andreas Kleen 2005-09-09 13:49:01 UTC
If you add fsync(no) to /var/log/messages in /etc/syslog-nf/syslog-nf.conf and
reload it it will be smoother.
Comment 55 Robert Boehm 2005-09-09 14:02:53 UTC
Oh...that's OK....I'll just turn the logging off after I get you the
info..should be out soon...Hope you are having fun....
Comment 56 Alexey Starikovskiy 2005-09-09 14:09:28 UTC
You wait for error with debug_level=1 and not 0xffffffff, right? You can switch
to debug_level=0xffffffff after you see the error to save your disk and time :)
Comment 57 Robert Boehm 2005-09-09 14:14:24 UTC
Oh....no...I have debug set to 0xffffffff....so NOW you are having fun laughing
at me, right?

How do I change that on the fly?
Comment 58 Alexey Starikovskiy 2005-09-09 14:31:14 UTC
This set of commands will do it: 
1. Set debug_level to max
echo 0xffffffff > /proc/acpi/debug_level
2. Set debug_level to output only warnings/errors
echo 0x1F > /proc/acpi/debug_level
So now you should set debug_level to 0x1F and wait... 

Comment 59 Robert Boehm 2005-09-09 14:35:37 UTC
Well...OK...but if I just leave things as they are, I'll still get the error,
I'm assuming....however, I am becoming a little perplexed that by NOW, we should
be getting the bad reference count...but....I'm not.  Everything seems
normal...every time the utdelete function does the count...it sees things as
normal...does it's delete routine and exits....so I'm wondering what is
different now....other than the underlying system is from beta 4 instead of beta
3 (not counting the kernel)....I'll wait a little longer...but if it doesn't do
it in another hour or so, I'm not sure that it will!

Geez, guys....it seems like I'm wasting a lot of your time on this esoteric
thing....
Comment 60 Andreas Kleen 2005-09-09 14:39:09 UTC
The logging slows it down a lot so it could take much longer now.
Comment 61 Robert Boehm 2005-09-09 14:42:43 UTC
Well...OK...but I'm getting what appears as real-time data...the text file is
filling up right up to current clock time...unless that is
irrelevant...considering that the system is running as before.  I guess I don't
understand that, but I'll continue to wait....I'm staining some wood now....if
anything changes I'll get it to you el pronto...I"m ready for this stuff to end
soon.....
Comment 62 Robert Boehm 2005-09-09 15:30:41 UTC
OK...no change.  Just so you know...the reference count varies from "0" to
"3"...most of the time it is "Decremented" and at some times "Incremented"..but
deleted..but don't see the count rising at all....it always seems to stay in
that range.  Prior, when we were getting the "big count" it was in the 100's and
rising...so now I'm not sure that the problem is existing...and I don't have an
explanation for that.

Do you want me to continue waiting?  Something tells me that the problem is not
existant with this configuration.  I changed nothing else except that this is
beta 4 now (but downgraded to your test 2 kernel), instead of beta 3.  I did not
change any BIOS settings.

I am wondering if we had a fluke or if there might be something that has changed
that is causing the system to work normally now.

Let me know what to do...

Bob
Comment 63 Robert Boehm 2005-09-09 15:34:42 UTC
Shoud I send you a tail -500 or so of the current messages log?
Comment 64 Alexey Starikovskiy 2005-09-09 15:46:30 UTC
If you see mentioned error once again, create trace with the method Andy
described to you in #42.
If the problem does not show up, then just enjoy :)
Comment 65 Robert Boehm 2005-09-09 15:54:25 UTC
Thanks....I am really sorry if I caused a bunch of useless activity over
there....I am not sure what's up with this...but with all of those millions of
lines of code, weird things happen, I'm sure.

Thanks...
Comment 66 Robert Boehm 2005-09-09 18:53:47 UTC
Guys....Last e-mail....just so you know.  I cannot duplicate this bug anymore. 
I have tried and tried and for some reason...it's just GONE.  I have no
explaination for this...so, I'm going to donsider this closed.  Apparently the
driver saw something it didn't like at the time, but is happy now.  I'm happy
too, as the system is running wonderfully...and Alexey, even though I showed
signs of fatigue and frustration...I am enjoying this.  SUSE 10.0 will be a
dynamite release.  For me...ALL of the issues I have had are now resolved.

Would MS ever let people have so much fun?  I don't think so.

Thanks for all of the hard work, guys...it's paying off..
Comment 67 Andreas Kleen 2005-10-06 19:21:21 UTC
Closing for now.