Bugzilla – Full Text Bug Listing
| Summary: | dns resolution problems - possible getaddrinfo() bugs | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.1 | Reporter: | Andy Harrison <aharrison> |
| Component: | Other | Assignee: | E-mail List <bnc-team-screening> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | luca.gugelmann |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.1 | ||
| Whiteboard: | |||
| Found By: | Community User | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | The test program discussed in the comment above. | ||
Description
Andy Harrison
2008-12-29 15:04:07 UTC
Lending further credence to the idea that this is an IPv6-related bug, I went through my /etc/ssh/ssh_config and ~/.ssh/config files and made sure that every AddressFamily keyword had the argument "inet" instead of "any". The ssh command now resolves names successfully 100% of the time. Other commands, such as whois, continue to fail.

I have the same problem, and it does indeed seem to be getaddrinfo()-related. Specifically, DNS resolution fails when ai_family is set to AF_UNSPEC in the hints structure passed as the third argument to getaddrinfo(). AF_INET works as intended, so my understanding is that AF_UNSPEC should at least return the IPv4 address instead of failing. Attached is a small test program, which I hope shows the problem. The output on my side:

> ./dnstest novell.com
AF_INET: 130.57.5.70
AF_INET6: getaddrinfo: Name or service not known
AF_UNSPEC: getaddrinfo: Name or service not known

Compare with localhost (which does not go through a DNS server):

> ./dnstest localhost
AF_INET: 127.0.0.1 127.0.0.1
AF_INET6: ::1
AF_UNSPEC: 127.0.0.1 ::1

I've been through much of the same troubleshooting as above, with no success. IPv6 is disabled on my system (to at least have most of the GUI internet apps work).

Created attachment 262821
The test program discussed in the comment above.
gcc dnstest.c -o dnstest
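The attachment itself is not reproduced in this listing. As a rough illustration only, a lookup helper along the following lines would print output in the shape quoted above. The name `print_lookup` and its parameters are hypothetical and are not taken from the attached dnstest.c.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not the attached program): resolve `host` for one
 * address family and print every address returned on a single line.
 * Returns the getaddrinfo() status, 0 on success. */
int print_lookup(const char *label, const char *host, int family, int flags)
{
    struct addrinfo hints, *res, *p;
    char buf[INET6_ADDRSTRLEN];
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;        /* AF_INET, AF_INET6 or AF_UNSPEC */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */
    hints.ai_flags = flags;          /* caller-supplied, e.g. 0 */

    printf("%s:", label);
    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        printf(" getaddrinfo: %s\n", gai_strerror(rc));
        return rc;
    }

    for (p = res; p != NULL; p = p->ai_next) {
        const void *addr;

        if (p->ai_family == AF_INET)
            addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
        else if (p->ai_family == AF_INET6)
            addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
        else
            continue; /* ignore families we cannot format */

        if (inet_ntop(p->ai_family, addr, buf, sizeof buf) != NULL)
            printf(" %s", buf);
    }
    printf("\n");

    freeaddrinfo(res);
    return 0;
}
```

Calling it from `main` three times per host, with AF_INET, AF_INET6 and AF_UNSPEC and `flags` set to 0, would mirror the three output lines shown in the comment above.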
Further testing showed that once every few dozen queries the AF_UNSPEC case returns a correct answer. I tried to reproduce it and, looking at the wireshark logs, stumbled upon the following behavior:

- On an AF_INET query, a request for the A record goes out and the correct answer comes in. Everything OK.
- On an AF_INET6 query, a request for the AAAA record goes out and "not implemented" is the router's answer (as expected). (This is repeated 4 times.)
- On an AF_UNSPEC query, a request for the A record goes out, then a request for the AAAA record goes out, then the answer to the AAAA query comes in (not implemented), and finally the answer to the A query. Note that my router answers the queries in reverse order. In this case getaddrinfo fails.

Once in a while the order in which the answers come in is correct (I'm on a wireless network, so I assume the first packet is sometimes delayed). When the order of the answers is consistent with the order of the queries (that is, the answer to A first, AAAA later), getaddrinfo returns the correct IP.

I'm no longer at my parent's house and the problem has disappeared. Switching to a different router apparently fixes the problem without requiring any configuration changes. This definitely points towards getaddrinfo choking on the answers from some broken(?) DNS servers.

I tried the dnstest program attached by Luca and got the same failures. I tried installing the factory repository at http://download.opensuse.org/repositories/Base:/build/standard/ to see whether those updates would help (in case they included glibc updates), but no joy. So, as a workaround, I installed a recursion-only instance of named locally and pointed my resolv.conf to 127.0.0.1. Works well enough. If I can help with further troubleshooting of the actual problem, I'd be happy to assist.

The problem here is that getaddrinfo() still tries to resolve IPv6 AAAAs if IPv6 is disabled on your system - does ./dnstest localhost show AF_INET6 results if IPv6 is turned off?
Can you paste your ip addr show output? lsmod | grep ipv6? Either IPv6 disabling is not working properly or there is a bug in getaddrinfo() IPv6 auto-detection.

Oh, I have just noticed - your getaddrinfo() call in ./dnstest has no AI_ADDRCONFIG in the ai_flags field - could you set it there instead of zero and try again?

To clarify, we will skip AAAA queries only if the AI_ADDRCONFIG flag is used and no IPv6 interfaces are available. Not all applications use AI_ADDRCONFIG, but what is confusing is that your firefox still does not work with IPv6 disabled, since it definitely should use AI_ADDRCONFIG.

Apologies, I shouldn't have included firefox in this bug. I was too liberal in my cutting and pasting of previous communications. I'm not sure what I did to get firefox working correctly; even though it was showing these symptoms immediately after the initial OS installation, firefox was one of the first apps to start working smoothly for me when I started troubleshooting the problem.

I have IPv6 disabled. Here's the proof:
# lsmod | grep -i ipv6
#
# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:03:ba:f0:ce:50 brd ff:ff:ff:ff:ff:ff
inet 192.168.3.104/20 brd 192.168.15.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:50:04:d2:73:7d brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:50:04:62:0a:00 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:03:ba:f0:ce:51 brd ff:ff:ff:ff:ff:ff
inet 172.24.1.55/23 brd 172.24.1.255 scope global eth3
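As a programmatic counterpart to reading the `ip addr show` output by eye, the sketch below walks the interface list with getifaddrs() and reports whether any non-loopback IPv6 address is configured. It is only an approximation of the condition AI_ADDRCONFIG depends on, not glibc's actual detection code, and the function name `has_configured_ipv6` is made up for this example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <ifaddrs.h>

/* Hypothetical check: returns 1 if some interface carries a non-loopback
 * IPv6 address, 0 if none does, and -1 if the interface list cannot be
 * read. This only approximates what AI_ADDRCONFIG looks at. */
int has_configured_ipv6(void)
{
    struct ifaddrs *ifap, *ifa;
    int found = 0;

    if (getifaddrs(&ifap) != 0)
        return -1;

    for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
        const struct sockaddr_in6 *sin6;

        /* Interfaces without an address have ifa_addr == NULL. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET6)
            continue;

        sin6 = (const struct sockaddr_in6 *)ifa->ifa_addr;
        if (!IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr)) {
            found = 1; /* ::1 alone does not count as configured */
            break;
        }
    }

    freeifaddrs(ifap);
    return found;
}
```

On a system like the one listed above, where every interface shows only inet entries, a check of this kind would be expected to return 0.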
As for the dnstest program, I'm barely a read-only C programmer, so hopefully I did this correctly. I changed hints.ai_flags to...
hints.ai_flags |= AI_ADDRCONFIG;
...and recompiled. AF_UNSPEC results are successful 100% of the time now.
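For applications that cannot wait for a fixed glibc, the two workarounds mentioned in this report (setting AI_ADDRCONFIG, and forcing IPv4 as with ssh's "AddressFamily inet") could be combined roughly as follows. This is a sketch under assumptions, not the fix from bug 441947; `resolve_with_fallback` is a hypothetical name.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical application-level workaround: try AF_UNSPEC with
 * AI_ADDRCONFIG first, then retry with AF_INET only if that fails.
 * On success returns 0 and the caller must freeaddrinfo(*res). */
int resolve_with_fallback(const char *host, const char *service,
                          struct addrinfo **res)
{
    struct addrinfo hints;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_ADDRCONFIG; /* the one-line change described above */

    rc = getaddrinfo(host, service, &hints, res);
    if (rc == 0)
        return 0;

    /* Fallback: IPv4 only, so no AAAA query is involved at all. */
    hints.ai_family = AF_INET;
    hints.ai_flags = 0;
    return getaddrinfo(host, service, &hints, res);
}
```

The retry costs a second lookup on failure, which seems an acceptable trade-off for hosts sitting behind a router that mishandles AAAA queries.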
I grabbed the glibc src rpm you attached to bug 441947 and I'm compiling it now.
Setting AI_ADDRCONFIG produces correct results with AF_UNSPEC queries here too. Further, I tested the glibc from bug 441947 (of which this bug can now probably be considered a duplicate) and the problem has disappeared regardless of whether AI_ADDRCONFIG is set or not.

Confirmed, glibc-2.9-5 from bug 441947 fixed the problem for me.

*** This bug has been marked as a duplicate of bug 441947 ***