|
Bugzilla – Full Text Bug Listing |
| Summary: | utdelete logging | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Robert Boehm <boehm.robert1> |
| Component: | Other | Assignee: | Andreas Kleen <ak> |
| Status: | RESOLVED INVALID | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | acpi |
| Version: | Beta 3 | ||
| Target Milestone: | --- | ||
| Hardware: | x86 | ||
| OS: | All | ||
| Whiteboard: | |||
| Found By: | Beta-Customer | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
acpidmp hex data
Battery State One second of logging output |
||
|
Description
Robert Boehm
2005-08-28 00:35:11 UTC
I just reviewed my logs from yesterday, and there are at least 5000 or maybe 10000 entries of the utdelete "large Reference Count"...I see where there is a "sanity check" in the checking script...which has been there for many kernel versions... But I don't know why on MY machine that it has become an issue when it never has been before on any previous versions of SUSE through 9.3. Last comment on this, but further observations: the logging continues at a rapid pace...ALWAYS pertaining to object ddfd7328. I don't know what that object is or how to trace it. The log file is filling up at the rate of about 2MB per hour which is manageable. Logrotate seems to keep things under control. I modified postfix to allow logdigest to mail me the daily message, which postfix had been dumping becuase it was beyond the default limit size of about 10mb (TEXT FILE!!!). I changed it to 100MB temporarily until this is cleared up. Am I the only one who has this problem? I wish I had another machine to beta test with now, but I don't and can't say if it would happen on my other machines...just the laptop (ASUS A2500H with a Celeron) and 512MB RAM. This is perplexing for two reasons. I don't like the idea of the thing being so busy logging so much in the background (now about 50 lines per second) and I don't know ANYTHING about the utdelete function except for reading part of the code that makes it work (and not understanding a lot of it). Thanks... ak any idea? I already disabled CONFIG_ACPI_DEBUG, but that will just remove the message, but not make the underlying message go away. It's probably a reference counting bug somewhere. Robert, can you perhaps add acpidmp output from your machine? Intel folks, FYI. Yes...I can get that for you...just to clarify...acpidmp sends a bunch of hex data...is that what you are looking for? My machine is now installing a bunch of packages from a repository (it got stuck on a dependency while I was sleeping)..but I'll do the commmand and redirect it to a file that Ill send to you shortly...thanks. Yes....removing the message is one thing...but the problem is another...but I am having trouble figuring out what is unique about that particular machine. I'll get the file within a half-hour or so.. Bob Yes, the hex data is what we want. Created attachment 47964 [details]
acpidmp hex data
Here is the data....I can see a lot of output...I hope that it means something
for you...I'm looking at it now but I'm a little slow in the morning!
Hey guys...I just got my logdigest report....postfix is working it now since I modified the message size limit...but dang near choked anyway: If you want a bit MORE of the log, fine, but it's all the same with ths numbers just increasing slightly on the "count"...but are you ready for this: For a 24 hour period the messages and warning log files had....ready? 376,043 lines!!!!!!! Now...THAT's a big log file for a home laptop!!! Anyway, if you guys can tell me how to turn the LOGGING of these events off, even if we don't get to the root of the problem yet, that would be nice...I just don't know where to go for that. But of course, if we could find the root cause and stop this, that would be better. Is this a kernel issue talking with ACPI? Kernel 2.6.13 was released today and if it's a kernel bug...well... Thanks again....I"m in and out but I'll keep watch on the e-mail... Booting with acpi_dbg_level=0 should disable it. Thanks...I'll do that and see what happens on the logging anyway. Bob What kernel/ACPICA version you are using? could you try http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050815-2.6.13.diff.bz2 ? alex I'm sorry, but I am not sure how to get the ACPICA information fromt he kernel. This is the current kernel: 2.6.13-rc6-git13-4-default #1 Mon Aug 22 18:38:22 Which will soon be updated to the stable 2.6.13 release. This is the stock default kernel provided with SUSE 10.0 Beta 3, so you might call Novell and get the info from them. How do I get the current ACPICA information from my kernel build? I'm not planning on compiling a kernel on this machine since it would take forever....and I'm not confident in doing it. Thanks Sorry Alexey, I don't think it's reasonable to ask the SUSE betatesters to recompile kernels with random patches. Did you fix any reference count issues that might explain that message in the patchkit? yes, there are some changes in that area in June. Sorry about not wanting to recompile. No problems with previous kernels for me. Current bios on the ASUS A2500H (A2H) is 0213 (I think the latest) and it has a sis chipset, if that is relenant. I guess my question to you guys is if you are getting any other reports of this or can duplicate it with any other machines. Seems like I'm the only one. I have to now take some heavy boxes to the attic for the wife! The very last of the 367,000 entries for the 24 hour period before I shut down the debugging...I don't know if it is relevant or helps or not...to see how the numbers changed from before, but here are about 10 lines anyway. Then I'll leave you guys alone...after reading that driver patch, it's out of my league! Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count : **** Warning **** Large Reference Count (590B ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [18] ut_update_ref_count : **** Warning **** Large Reference Count (590A ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count : **** Warning **** Large Reference Count (590B ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count : **** Warning **** Large Reference Count (590C ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count : **** Warning **** Large Reference Count (590B ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [18] ut_update_ref_count : **** Warning **** Large Reference Count (590A ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [17] ut_update_ref_count : **** Warning **** Large Reference Count (590B ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [16] ut_update_ref_count : **** Warning **** Large Reference Count (590C ) in object ddfd8328 Aug 29 13:30:10 bobslaptop kernel: Aug 29 13:30:10 bobslaptop kernel: utdelete-0404 [15] ut_update_ref_count : **** Warning **** Large Reference Count (590B ) in object ddfd8328 Thanks for all of the help, folks. I plan on a fresh install of beta 4 in a couple of days....new kernel...we'll see what happens. Can you test if the kernel in ftp://ftp.suse.com/pub/people/ak/acpi-text/kernel-default-2.6.13-3.i586.rpm still emits the messages? It includes the ACPI patchkit (might take a few minutes until the kernel appears on the server) Yes...I'm downloading it now. It will take a couple of hours after I boot the new kernel for the logging to start, so be patient. Also, I'll leave the computer on all night with the new kernel...I'm leaving for a trip early in the morning...I'll send a message then... Bob Pardon me for it taking a while...I'm downloading the sources too and the ftp is kinda slow. I should have the kernel booted within the hour...then we'll see what happens... Test Kernel booted properly at 0012 UTC...I will report on logging before heading to bed and after I get up... Bad news guys.....utdelete warnings started again about an hour and a half after reboot....you'll notice that the object number is different...but the symptoms are exactly the same. This is with the 2.6.13 acpt test kernel. Here is a log sample: Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count : **** Warning **** Large Reference Count (4AE) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count : **** Warning **** Large Reference Count (4B0) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count : **** Warning **** Large Reference Count (4AE) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count : **** Warning **** Large Reference Count (4B0) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count : **** Warning **** Large Reference Count (4AE) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count : **** Warning **** Large Reference Count (4B0) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count : **** Warning **** Large Reference Count (4AE) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count : **** Warning **** Large Reference Count (4B0) in object ddfda7e0 Aug 30 20:44:04 bobslaptop kernel: Aug 30 20:44:04 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count : **** Warning **** Large Reference Count (4AF) in object ddfda7e0 I will not be around for most of the day on Wednesday. I don't know how much I can help now anyway. If I have this trouble on the beta 4 install, I'll let you know...and if it is in the final next month, I'll just stop logging it by passing a permenant boot-option in the boot-loader. If the system runs fine, that's all that really matters. I am still curious if this is duplicated anywhere, however....let me know if you guys have seen it. Bob Thanks for testing. Alexey, this means the -mm* ACPI still has the reference count bug. Hmm, I should have slipped a WARN_ON(1) in there... beta4 won't show the logging because ACPI debugging is turned off, but the underlying bug will be still there, just undetected. Can you please try again with the kernel in ftp.suse.com:/pub/people/ak/acpi-test2 and report what it outputs? (might become noisy) Thanks, Yes...I'll grab it. If I can get it downloaded and installed before going to work, I'll boot into it and after work (after 6pm Chicago time) send you what it is saying on the output. Once more...have you guys seen this on any other machines? Am I the only one who sees this? Just wondering... You seem to be the only one so far. Hmmm...up kind of late, aren't we? I'm up early. I'm downloading it now...this time I'll skip the sources for now (I'll get it later) and just install the kernel to get it running. Dowload will be done soon. But as I said, I won't be able to report until I'm back from Virginia tonight...I'll have it crunch and log all day while I'm gone. Thanks... New acpt test 2 kernel installed and booted at 0915 UTC. I'll get the info to you tonight from the logs.. I'm curious. I have, in previous versions, used only final versions on this computer. I wonder if this problem has been here all along, but with acpi debugging off, just never saw it? If so, I know that it must not be detrimental as I have had this unit on for days at a time without problems. Also it seems as if all acpi features work without problem... OK....off to work...talk to you folks tonight! Sorry about the time delay. I am now back from out of town and getting a cat of the VERY large log file with the acpi information. I will be leaving town again tomorrow but will take this machine with me, however probably won't have the normal facilities that I do here. OK...it's about done...here is a snippet of the output from just minutes ago (running for about 15 hours with the new acpi test 2 kernel: Aug 31 19:02:34 bobslaptop kernel: Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at drivers/acpi/utilities/utdelete.c:380 Aug 31 19:02:34 bobslaptop kernel: [<c023a594>] acpi_ut_update_object_reference+0xce/0x1a0 Aug 31 19:02:34 bobslaptop kernel: [<c023a77e>] acpi_ut_remove_reference+0x8f/0xa9 Aug 31 19:02:34 bobslaptop kernel: [<c02191ec>] acpi_ds_clear_implicit_return+0x40/0x4f Aug 31 19:02:34 bobslaptop kernel: [<c02173b2>] acpi_ds_exec_end_control_op+0x152/0x2b4 Aug 31 19:02:34 bobslaptop kernel: [<c0217b1f>] acpi_ds_exec_end_op+0x1e8/0x5b5 Aug 31 19:02:34 bobslaptop kernel: [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c Aug 31 19:02:34 bobslaptop kernel: [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca Aug 31 19:02:34 bobslaptop kernel: [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1 Aug 31 19:02:34 bobslaptop kernel: [<c0231424>] acpi_ps_execute_method+0x108/0x197 Aug 31 19:02:34 bobslaptop kernel: [<c022ce7e>] acpi_ns_execute_control_method+0x108/0x12e Aug 31 19:02:34 bobslaptop kernel: [<c022cd3d>] acpi_ns_evaluate_by_handle+0xef/0x128 Aug 31 19:02:34 bobslaptop kernel: [<c022ca97>] acpi_ns_evaluate_relative+0x167/0x1d1 Aug 31 19:02:34 bobslaptop kernel: [<c022beed>] acpi_evaluate_object+0x141/0x257 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475292>] acpi_battery_get_status+0xa2/0x1f0 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475975>] acpi_battery_read_state+0x68/0x1d3 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c01757fa>] seq_read+0x6a/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c01590db>] vfs_read+0x8b/0x170 Aug 31 19:02:34 bobslaptop kernel: [<c015947c>] sys_read+0x3c/0x70 Aug 31 19:02:34 bobslaptop kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [18] ut_update_ref_count : **** Warning **** Large Reference Count (2A2E) in object ddfda7e0 Aug 31 19:02:34 bobslaptop kernel: Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at drivers/acpi/utilities/utdelete.c:380 Aug 31 19:02:34 bobslaptop kernel: [<c023a594>] acpi_ut_update_object_reference+0xce/0x1a0 Aug 31 19:02:34 bobslaptop kernel: [<c023a6d5>] acpi_ut_add_reference+0x6f/0x89 Aug 31 19:02:34 bobslaptop kernel: [<c0228d8a>] acpi_ex_resolve_node_to_value+0x326/0x3f4 Aug 31 19:02:34 bobslaptop kernel: [<c022140c>] acpi_ex_resolve_to_value+0xf0/0x15f Aug 31 19:02:34 bobslaptop kernel: [<c0224bd8>] acpi_ex_resolve_operands+0x4f6/0x956 Aug 31 19:02:34 bobslaptop kernel: [<c0217a2d>] acpi_ds_exec_end_op+0xf6/0x5b5 Aug 31 19:02:34 bobslaptop kernel: [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c Aug 31 19:02:34 bobslaptop kernel: [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca Aug 31 19:02:34 bobslaptop kernel: [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1 Aug 31 19:02:34 bobslaptop kernel: [<c0231424>] acpi_ps_execute_method+0x108/0x197 Aug 31 19:02:34 bobslaptop kernel: [<c022ce7e>] acpi_ns_execute_control_method+0x108/0x12e Aug 31 19:02:34 bobslaptop kernel: [<c022cd3d>] acpi_ns_evaluate_by_handle+0xef/0x128 Aug 31 19:02:34 bobslaptop kernel: [<c022ca97>] acpi_ns_evaluate_relative+0x167/0x1d1 Aug 31 19:02:34 bobslaptop kernel: [<c022beed>] acpi_evaluate_object+0x141/0x257 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475292>] acpi_battery_get_status+0xa2/0x1f0 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475975>] acpi_battery_read_state+0x68/0x1d3 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c01757fa>] seq_read+0x6a/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c01590db>] vfs_read+0x8b/0x170 Aug 31 19:02:34 bobslaptop kernel: [<c015947c>] sys_read+0x3c/0x70 Aug 31 19:02:34 bobslaptop kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [17] ut_update_ref_count : **** Warning **** Large Reference Count (2A2F) in object ddfda7e0 Aug 31 19:02:34 bobslaptop kernel: Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at drivers/acpi/utilities/utdelete.c:380 Aug 31 19:02:34 bobslaptop kernel: [<c023a594>] acpi_ut_update_object_reference+0xce/0x1a0 Aug 31 19:02:34 bobslaptop kernel: [<c023a6d5>] acpi_ut_add_reference+0x6f/0x89 Aug 31 19:02:34 bobslaptop kernel: [<c021926c>] acpi_ds_do_implicit_return+0x71/0x7c Aug 31 19:02:34 bobslaptop kernel: [<c02192f3>] acpi_ds_is_result_used+0x7c/0x253 Aug 31 19:02:34 bobslaptop kernel: [<c021954d>] acpi_ds_delete_result_if_not_used+0x83/0xb8 Aug 31 19:02:34 bobslaptop kernel: [<c0217eb3>] acpi_ds_exec_end_op+0x57c/0x5b5 Aug 31 19:02:34 bobslaptop kernel: [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c Aug 31 19:02:34 bobslaptop kernel: [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca Aug 31 19:02:34 bobslaptop kernel: [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1 Aug 31 19:02:34 bobslaptop kernel: [<c0231424>] acpi_ps_execute_method+0x108/0x197 Aug 31 19:02:34 bobslaptop kernel: [<c022ce7e>] acpi_ns_execute_control_method+0x108/0x12e Aug 31 19:02:34 bobslaptop kernel: [<c022cd3d>] acpi_ns_evaluate_by_handle+0xef/0x128 Aug 31 19:02:34 bobslaptop kernel: [<c022ca97>] acpi_ns_evaluate_relative+0x167/0x1d1 Aug 31 19:02:34 bobslaptop kernel: [<c022beed>] acpi_evaluate_object+0x141/0x257 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475292>] acpi_battery_get_status+0xa2/0x1f0 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475975>] acpi_battery_read_state+0x68/0x1d3 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c01757fa>] seq_read+0x6a/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c01590db>] vfs_read+0x8b/0x170 Aug 31 19:02:34 bobslaptop kernel: [<c015947c>] sys_read+0x3c/0x70 Aug 31 19:02:34 bobslaptop kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [16] ut_update_ref_count : **** Warning **** Large Reference Count (2A30) in object ddfda7e0 Aug 31 19:02:34 bobslaptop kernel: Aug 31 19:02:34 bobslaptop kernel: Badness in acpi_ut_update_ref_count at drivers/acpi/utilities/utdelete.c:380 Aug 31 19:02:34 bobslaptop kernel: [<c023a594>] acpi_ut_update_object_reference+0xce/0x1a0 Aug 31 19:02:34 bobslaptop kernel: [<c023a77e>] acpi_ut_remove_reference+0x8f/0xa9 Aug 31 19:02:34 bobslaptop kernel: [<c0219565>] acpi_ds_delete_result_if_not_used+0x9b/0xb8 Aug 31 19:02:34 bobslaptop kernel: [<c0217eb3>] acpi_ds_exec_end_op+0x57c/0x5b5 Aug 31 19:02:34 bobslaptop kernel: [<c02309de>] acpi_ps_parse_loop+0x73a/0xb2c Aug 31 19:02:34 bobslaptop kernel: [<c02300a0>] acpi_ps_parse_aml+0xc6/0x2ca Aug 31 19:02:34 bobslaptop kernel: [<c0231587>] acpi_ps_execute_pass+0xa4/0xd1 Aug 31 19:02:34 bobslaptop kernel: [<c0231424>] acpi_ps_execute_method+0x108/0x197 Aug 31 19:02:34 bobslaptop kernel: [<c022ce7e>] acpi_ns_execute_control_method+0x108/0x12e Aug 31 19:02:34 bobslaptop kernel: [<c022cd3d>] acpi_ns_evaluate_by_handle+0xef/0x128 Aug 31 19:02:34 bobslaptop kernel: [<c022ca97>] acpi_ns_evaluate_relative+0x167/0x1d1 Aug 31 19:02:34 bobslaptop kernel: [<c022beed>] acpi_evaluate_object+0x141/0x257 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475292>] acpi_battery_get_status+0xa2/0x1f0 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<df475975>] acpi_battery_read_state+0x68/0x1d3 [battery] Aug 31 19:02:34 bobslaptop kernel: [<c01757fa>] seq_read+0x6a/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c0175790>] seq_read+0x0/0x2e0 Aug 31 19:02:34 bobslaptop kernel: [<c01590db>] vfs_read+0x8b/0x170 Aug 31 19:02:34 bobslaptop kernel: [<c015947c>] sys_read+0x3c/0x70 Aug 31 19:02:34 bobslaptop kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Aug 31 19:02:34 bobslaptop kernel: utdelete-0379 [15] ut_update_ref_count : **** Warning **** Large Reference Count (2A2F) in object ddfda7e0 You can see the last line of each group is the original reference count output...the first line is intersting in the "badness" check....Hopefully you guys can figure this out. (In reply to comment #12) > I'm sorry, but I am not sure how to get the ACPICA information dmesg -s64000 | grep Subsystem or cat /proc/acpi/info or grep ACPI_CA_VERSION $KERNEL_SOURCE/include/acconfig.h Note that this version number is updated for ACPICA (core interpreter) updates, and that the Latest ACPI patch on kernel.org referenced above may change without incrementing this number -- but this version number only increments, so it does give a good idea what is running. I don't suppose I could ask you to test a kernel.org kernel? There have been some updates in this area and it would be good to find out if this is fixed upstream and if we need to backport something to SuSE, or if this is still broken in the latest upstream. In particular, ACPICA 20050902 contains a change in this area. You can get it tonight's kernel snapshot from kernel.org; or if you have a 2.6.13 (final) tree, you can apply the ACPI patch to it: http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/ acpi-20050902-2.6.13.diff.gz OK...here is the situation. Since the last time we talked, I did a clean install of beta 4 which has the 2.6.13 final tree by default. acpi debugging is off by default so I'm not getting any output now, of course. To turn acpi debugging at boot time, should I set acpi_dbg_level=1 or some other integer? Also, I downloaded the patch...can I apply the patch without recompiling the kernel? My question is important because now that I have this thing running well, I'll need it on a trip coming up soon and don't want to mess it up too much...so if no recompile is necessary, what is the procedure for applying the patch? Sorry for my ignorance...I'll go look it up while I wait for answers on this. I'll be happy to apply the patch and then turn debugging on again..and let you see the output... Thanks... Bob Current ACPICA version: 20050408 (with the stock beta 4 kernel...I do think that we tested this patch when I put in the kernel recommended before on beta 3. The output was the same...and there was some test output also..that you can see above. I'm going to turn on acpi debuggine now with this kernel and see what happens... Scratch that...I'm changing gears and don't know how much help I'll be...I decided to install the upgraded packages from RC-1 into my beta 4 to see what happens...included is a newly spun kernel as well... I'm getting busy on other things now also...and don't know now much time I'll have to keep on this thing.... The computer is running great though....so I'm not going to worry about it too much...seems like my machine was the only one spitting that stuff out anyway...and if I don't know about it...who cares? I know...bad attitude...... OK...update....the latest tree 2.6.13 kernels with the patch seem to have solved the problem. I have acpi debugging turned back on...the computer has run for about 2.5 hours now and there is no more utdelete logging...it seems as if this problem has been solved. If you don't hear from me in the next few hours...everything is normal. I think that this bug is fixed.... The debugging is just turned off, so the problem is hidden. Intel folks - does the backtrace in comment #28 trigger any bells? I just added a WARN_ON(1) to the error message. To clarify...I know that the debugging is turned off by default, but I put a boot option acpi_dbg_level=1 which should have turned it on, right? Or is that option disabled. With that option, I have no output now...where before I did...this kernel is the one that came with beta 4. (In reply to comment #35) > Intel folks - does the backtrace in comment #28 trigger any bells? > I just added a WARN_ON(1) to the error message. Yes, from this backtrace it's obvious that problem is inside _BST (battery status) method execution, but I cannot repeat that in userspace simulator. May be we need full trace from ACPICA in order to debug this further. This could be done with putting 0xffffffff into /proc/acpi/acpi_dbg_level on kernel with debug output enabled. It is safer to set it to 0 again after reading of /proc/acpi/battery/XXXX/state. Robert can you please downgrade to the older test kernel i posted (which had full debug enabled) and generate this information and attach it to the bug? Thanks. Yes I can...do I have to set any parameters for debugging? I'm assuming that acpi debugging is turned ON by default... Do what Alexey recommended in comment #37. Yes, so I put 0xffffffff as the debug level at boot-time option, right? I'll boot the test 2 kernel with that option...get the /proc/acpi/battery/XXXX/state info, AND the logging output for you. Please do echo 0xffffffff > /proc/acpi/acpi_dbg_level for i in /proc/acpi/battery/*/state ; do cat $i ; done echo 0 > /proc/acpi/acpi_dbg_level and attach the output that generates. Got it...I'm booting into thet test 2 kernel now...had to grab it again...I'll put it in a safe place this time. I'll get the output for you...however it might take an hour or so for the funny stuff to start... By the way, on the NEW kernel that I was running, I looked at the /proc/acpi/battery/*/state (and all looked normal, but what the heck do I know?)...we'll get you the info soon. Bob OK...thing is logging like crazy now...however I can't get the output to redirect to a file...so I'll grab the ouput directly and make a file for you...also I'll put a snippet of the log file too...wait just a sec. Also, in my computer it's /proc/acpi/debug_level and not acpi_dbg_level for what that is worth... Info coming shortly... THis time I wrote the script and executed it...ouput on next e-mail It should be all in /var/log/messages Created attachment 49367 [details]
Battery State
Created attachment 49368 [details]
One second of logging output
Can I reboot now with either debugging turned off or set back to the defaul, which I think is "1" on this kernel? Also, if you guys have enough information, let me know when I can continue to change my other software and the kernel again to the one with RC-1...I'll hang onto this one in a safe place.. Sure you can reboot again. Robert, I need part of the log with the error (Large Reference Count) present. The one you just posted does not contain this string, making it useless. Could you please try to find such a fragment in your logs? DAMN...I should have seen that...sorry. I guess it wasn't running long enough for that to start...it usually takes an hour or so....so, I had to re-boot with the debugging parameter back on...it might take a bit of time...I'll get you a snip WITH the reference count warning when it starts... Again...sorry...and I feel so stupid! This stuff is WAY over my head so forgive me if I sound lost half the time...... Uptime almost an hour now...still no bad reference count logging....should start soon. I do notice that the utdelete function seems normal for now, which I think you can see on the snippet that I gave you...to compare with the section I WILL give you once the thing sees a "high" reference count...assuming that it will...I anticipate it starting shortly...this thing is logging so much that my hard drive is going to blow up soon....(!) If you add fsync(no) to /var/log/messages in /etc/syslog-nf/syslog-nf.conf and reload it it will be smoother. Oh...that's OK....I'll just turn the logging off after I get you the info..should be out soon...Hope you are having fun.... You wait for error with debug_level=1 and not 0xffffffff, right? You can switch to debug_level=0xffffffff after you see the error to save your disk and time :) Oh....no...I have debug set to 0xffffffff....so NOW you are having fun laughing at me, right? How do I change that on the fly? This set of commands will do it: 1. Set debug_level to max echo 0xffffffff > /proc/acpi/debug_level 2. Set debug_level to output only warnings/errors echo 0x1F > /proc/acpi/debug_level So now you should set debug_level to 0x1F and wait... Well...OK...but if I just leave things as they are, I'll still get the error, I'm assuming....however, I am becoming a little perplexed that by NOW, we should be getting the bad reference count...but....I'm not. Everything seems normal...every time the utdelete function does the count...it sees things as normal...does it's delete routine and exits....so I'm wondering what is different now....other than the underlying system is from beta 4 instead of beta 3 (not counting the kernel)....I'll wait a little longer...but if it doesn't do it in another hour or so, I'm not sure that it will! Geez, guys....it seems like I'm wasting a lot of your time on this esoteric thing.... The logging slows it down a lot so it could take much longer now. Well...OK...but I'm getting what appears as real-time data...the text file is filling up right up to current clock time...unless that is irrelevant...considering that the system is running as before. I guess I don't understand that, but I'll continue to wait....I'm staining some wood now....if anything changes I'll get it to you el pronto...I"m ready for this stuff to end soon..... OK...no change. Just so you know...the reference count varies from "0" to "3"...most of the time it is "Decremented" and at some times "Incremented"..but deleted..but don't see the count rising at all....it always seems to stay in that range. Prior, when we were getting the "big count" it was in the 100's and rising...so now I'm not sure that the problem is existing...and I don't have an explanation for that. Do you want me to continue waiting? Something tells me that the problem is not existant with this configuration. I changed nothing else except that this is beta 4 now (but downgraded to your test 2 kernel), instead of beta 3. I did not change any BIOS settings. I am wondering if we had a fluke or if there might be something that has changed that is causing the system to work normally now. Let me know what to do... Bob Shoud I send you a tail -500 or so of the current messages log? If you see mentioned error once again, create trace with the method Andy described to you in #42. If the problem does not show up, then just enjoy :) Thanks....I am really sorry if I caused a bunch of useless activity over there....I am not sure what's up with this...but with all of those millions of lines of code, weird things happen, I'm sure. Thanks... Guys....Last e-mail....just so you know. I cannot duplicate this bug anymore. I have tried and tried and for some reason...it's just GONE. I have no explaination for this...so, I'm going to donsider this closed. Apparently the driver saw something it didn't like at the time, but is happy now. I'm happy too, as the system is running wonderfully...and Alexey, even though I showed signs of fatigue and frustration...I am enjoying this. SUSE 10.0 will be a dynamite release. For me...ALL of the issues I have had are now resolved. Would MS ever let people have so much fun? I don't think so. Thanks for all of the hard work, guys...it's paying off.. Closing for now. |