Bugzilla – Bug 618050
getaddrinfo() breaks when resolv.conf points at buggy nameservers
Last modified: 2011-02-04 01:10:45 UTC
+++ This bug was initially created as a clone of Bug #441947 +++

I am observing the 5-second timeout on every DNS request that was reported earlier for older OpenSUSE versions (see e.g. bug #441947). The bug was supposed to be fixed, but it seems to have reappeared with 11.3.

    # time dig +short www.google.de
    www.google.com.
    www.l.google.com.
    74.125.77.99
    74.125.77.104
    74.125.77.147

    real 0m0.452s
    user 0m0.000s
    sys 0m0.008s

    # time curl www.google.de >/dev/null
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  8520    0  8520    0     0   1602      0 --:--:--  0:00:05 --:--:-- 41359

    real 0m5.324s
    user 0m0.008s
    sys 0m0.000s

Here is an strace of curl showing the 5s timeout.

    22:18:06.830849 socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
    22:18:06.830883 connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
    22:18:06.832544 socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
    22:18:06.832591 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 28) = 0
    22:18:06.832656 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
    22:18:06.832704 send(3, "\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
    22:18:06.832783 poll([{fd=3, events=POLLIN|POLLOUT}], 1, 5000) = 1 ([{fd=3, revents=POLLOUT}])
    22:18:06.832823 send(3, "\247\304\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\34\0\1", 31, MSG_NOSIGNAL) = 31
    22:18:06.832884 poll([{fd=3, events=POLLIN}], 1, 4999) = 1 ([{fd=3, revents=POLLIN}])
    22:18:06.883920 recvfrom(3, "\351a\201\200\0\1\0\5\0\0\0\0\3www\6google\2de\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [16]) = 127
    22:18:06.884451 poll([{fd=3, events=POLLIN}], 1, 4948) = 0 (Timeout)
    22:18:11.834354 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
    22:18:11.834480 send(3, "\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
    22:18:11.834639 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
    22:18:11.937264 recvfrom(3, "\351a\201\200\0\1\0\5\0\0\0\0\3www\6google\2de\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [16]) = 127
    22:18:11.937510 poll([{fd=3, events=POLLOUT}], 1, 4896) = 1 ([{fd=3, revents=POLLOUT}])

The DNS server is an old Zyxel router. I have IPv6 disabled and nscd disabled. I see the same timeouts with other network programs; in particular, there is no difference between firefox and e.g. curl and wget. This makes the surfing experience with 11.3 very cumbersome.
Created attachment 372259 [details] supportconfig
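The two distinct send() payloads in the strace can be decoded to see what the resolver asked for. The following is a minimal sketch, assuming the payloads are plain single-question DNS queries without name compression; `parse_query` and the `QTYPES` table are illustrative names, not part of glibc or any tool mentioned in this report.

```python
import struct

# Only the two record types that matter here (values from the DNS spec).
QTYPES = {1: "A", 28: "AAAA"}

def parse_query(packet: bytes):
    """Return (id, name, qtype) for a single-question DNS query packet."""
    # Header is 12 bytes: id, flags, qdcount, ancount, nscount, arcount.
    ident, _flags, _qdcount = struct.unpack("!HHH", packet[:6])
    pos = 12
    labels = []
    # The question name is a sequence of length-prefixed labels ending in 0.
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    pos += 1  # skip the terminating zero byte
    qtype, _qclass = struct.unpack("!HH", packet[pos:pos + 4])
    return ident, ".".join(labels), QTYPES.get(qtype, str(qtype))

# Payloads copied from the two send() calls at 22:18:06 in the strace.
first = b"\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1"
second = b"\247\304\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\34\0\1"

for payload in (first, second):
    print(parse_query(payload))
```

Decoded this way, the first payload is an A query and the second an AAAA query for the same name, sent back to back on the same socket. Only a reply carrying the first query's ID shows up in the recvfrom() calls of the trace.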
tcpdump output for a curl run:

    22:30:50.767774 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
    22:30:50.767844 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
    22:30:50.789994 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
    22:30:55.771636 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
    22:30:55.794732 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
    22:30:55.794816 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
    22:30:55.818130 IP 192.168.1.1.domain > 192.168.1.11.43963: 48617 2/1/0 CNAME www.google.com., CNAME www.l.google.com. (129)
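As a quick check that the retry in this capture lines up with the 5000 ms poll timeout seen in the strace, the gap between the first A query and its retransmission can be computed from the tcpdump timestamps. This is only a sketch; `gap_seconds` is a hypothetical helper and it assumes both timestamps fall on the same day.

```python
from datetime import datetime

def gap_seconds(earlier: str, later: str) -> float:
    """Seconds between two tcpdump-style HH:MM:SS.ffffff timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()

# Timestamps of the first A query and its retry, copied from the capture.
first_query = "22:30:50.767774"
retry_query = "22:30:55.771636"

print(f"retry after {gap_seconds(first_query, retry_query):.3f} s")
```

The AAAA query (ID 48617) only gets a reply after the retry, when it is sent on its own following the A reply rather than immediately after the A query.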
Thanks for the report, but why are you adding the other poor 18 people to Cc?

We had temporarily resolved the bug by disabling the parallel resolving method, but for newer OpenSUSE releases we switched back to parallel resolving for the speed improvement. nscd should have alleviated the problem (the need for single-request mode would have to be discovered only once); unfortunately, unscd has trouble sharing the resolver state between subprocesses, and the fix turned out to be a bit complex. Perhaps I should have prioritized it more.

The simplest workaround is to add 'options single-request' to /etc/resolv.conf. Can you please verify that it fixes the problem for you?

Coolo, do you think it is worth mentioning it again in the release notes?
(In reply to comment #3)
> Thanks for the report, but why are you adding the other poor 18 people to Cc?

I just cloned the bug, assuming the cc list members were still interested. Apologies to those who aren't any more.

> The simplest workaround is to add 'options single-request' to /etc/resolv.conf.
> Can you please verify that it fixes the problem for you?
>
> Coolo, do you think it is worth mentioning it again in the release notes?

Please find a way to document it in a place that is easy to find. I (not having used OpenSUSE with my router at home for some time) found it pretty hard to find information (for example, a search for DNS at wiki.opensuse.org doesn't reveal useful results). 'options single-request' definitely escaped my attention.

Btw, can that option be used together with NetworkManager? If yes, how?
export RES_OPTIONS="single-request" in the profile solves all problems, independent of NetworkManager. Thanks for the hint.
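To verify the workaround without curl, one could time getaddrinfo() directly, once in a shell where RES_OPTIONS is exported and once where it is not. The following is a rough sketch: `time_lookup` is a hypothetical helper, the `resolver` parameter exists only so the function can be exercised without a network, and port 80 is an arbitrary choice.

```python
import socket
import time

def time_lookup(host, family=socket.AF_UNSPEC, resolver=socket.getaddrinfo):
    """Return (elapsed_seconds, number_of_results) for one lookup.

    AF_UNSPEC asks for both IPv4 and IPv6 addresses, which is the case
    where the timeout in this report shows up; AF_INET asks for IPv4 only.
    """
    start = time.monotonic()
    results = resolver(host, 80, family, socket.SOCK_STREAM)
    return time.monotonic() - start, len(results)

# Example use (needs network access, so it is left commented out):
#   elapsed, count = time_lookup("www.google.de")
#   print(f"{count} results in {elapsed:.3f} s")
```

On an affected setup, the AF_UNSPEC lookup would be expected to take about five seconds without the option and well under a second with it, matching the curl and dig timings in the original report.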
http://en.opensuse.org/Speed_up_slow_DNS_lookups please review.
Thank you guys. -=terry=-
. *** This bug has been marked as a duplicate of bug 594447 ***
sorry for the typo :) *** This bug has been marked as a duplicate of bug 549447 ***