Bug 618050 - getaddrinfo() breaks when resolv.conf points at buggy nameservers
Status: RESOLVED DUPLICATE of bug 549447
Alias: None
Product: openSUSE 11.3
Classification: openSUSE
Component: Basesystem
Version: RC 1
Hardware: Other
OS: openSUSE 11.3
Priority: P5 - None
Severity: Major
Target Milestone: ---
Assignee: Petr Baudis
QA Contact: E-mail List
Depends on: 441947
Reported: 2010-06-28 20:23 UTC by Martin Wilck
Modified: 2011-02-04 01:10 UTC
CC List: 15 users

Attachments
supportconfig (2.66 MB, application/x-bzip-compressed-tar)
2010-06-28 20:30 UTC, Martin Wilck

Description Martin Wilck 2010-06-28 20:23:12 UTC
+++ This bug was initially created as a clone of Bug #441947 +++

I am seeing the 5-second timeout on every DNS request that was reported earlier for older openSUSE versions (see e.g. bug #441947). The bug was supposed to be fixed, but it seems to have reappeared in 11.3.

# time dig +short www.google.de
www.google.com.
www.l.google.com.
74.125.77.99
74.125.77.104
74.125.77.147

real    0m0.452s
user    0m0.000s
sys     0m0.008s


# time curl www.google.de >/dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8520    0  8520    0     0   1602      0 --:--:--  0:00:05 --:--:-- 41359

real    0m5.324s
user    0m0.008s
sys     0m0.000s

Here is an strace of curl showing the 5s timeout.

22:18:06.830849 socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
22:18:06.830883 connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
22:18:06.832544 socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
22:18:06.832591 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 28) = 0
22:18:06.832656 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
22:18:06.832704 send(3, "\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
22:18:06.832783 poll([{fd=3, events=POLLIN|POLLOUT}], 1, 5000) = 1 ([{fd=3, revents=POLLOUT}])
22:18:06.832823 send(3, "\247\304\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\34\0\1", 31, MSG_NOSIGNAL) = 31
22:18:06.832884 poll([{fd=3, events=POLLIN}], 1, 4999) = 1 ([{fd=3, revents=POLLIN}])
22:18:06.883920 recvfrom(3, "\351a\201\200\0\1\0\5\0\0\0\0\3www\6google\2de\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [16]) = 127
22:18:06.884451 poll([{fd=3, events=POLLIN}], 1, 4948) = 0 (Timeout)
22:18:11.834354 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
22:18:11.834480 send(3, "\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1", 31, MSG_NOSIGNAL) = 31
22:18:11.834639 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
22:18:11.937264 recvfrom(3, "\351a\201\200\0\1\0\5\0\0\0\0\3www\6google\2de\0\0\1\0\1\300"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [16]) = 127
22:18:11.937510 poll([{fd=3, events=POLLOUT}], 1, 4896) = 1 ([{fd=3, revents=POLLOUT}])

The DNS server is an old Zyxel router.
I have IPv6 disabled and nscd disabled.
I see the same timeouts with other network programs; in particular, Firefox behaves no differently from curl or wget.

This makes the surfing experience with 11.3 very cumbersome.
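
For anyone reading the strace above, the two send() payloads can be decoded to confirm that they are an A and an AAAA query for the same name, sent back to back on the same socket. The following is a minimal Python sketch; the helper names strace_to_bytes and decode_query are hypothetical, and the parser assumes strace's default octal escaping (no -x option).

```python
import re
import struct

# Payloads copied from the two send() calls in the strace above.
FIRST = r"\351a\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\1\0\1"
SECOND = r"\247\304\1\0\0\1\0\0\0\0\0\0\3www\6google\2de\0\0\34\0\1"

QTYPES = {1: "A", 28: "AAAA"}  # only the record types seen in this report
SIMPLE_ESCAPES = {"n": 10, "t": 9, "r": 13, "v": 11, "f": 12, "\\": 92, '"': 34}
OCTAL = re.compile(r"[0-7]{1,3}")


def strace_to_bytes(text):
    """Turn a C-style escaped string, as printed by strace, into bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
            continue
        m = OCTAL.match(text, i + 1)  # octal escape: 1 to 3 digits
        if m:
            out.append(int(m.group(), 8))
            i = m.end()
        else:  # single-character escape such as \n
            out.append(SIMPLE_ESCAPES[text[i + 1]])
            i += 2
    return bytes(out)


def decode_query(payload):
    """Return (query id, name, record type) of a one-question DNS query."""
    (ident,) = struct.unpack("!H", payload[:2])
    labels = []
    pos = 12  # the fixed DNS header is 12 bytes long
    while payload[pos] != 0:
        length = payload[pos]
        labels.append(payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, _qclass = struct.unpack("!HH", payload[pos + 1:pos + 5])
    return ident, ".".join(labels), QTYPES.get(qtype, str(qtype))


for raw in (FIRST, SECOND):
    print(decode_query(strace_to_bytes(raw)))
```

Both payloads decode to 31 bytes, matching the return value of send() in the trace. The only recvfrom() before the timeout carries the ID of the first (A) query.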
Comment 1 Martin Wilck 2010-06-28 20:30:22 UTC
Created attachment 372259
supportconfig
Comment 2 Martin Wilck 2010-06-28 20:32:15 UTC
tcpdump output for a curl run:

22:30:50.767774 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
22:30:50.767844 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
22:30:50.789994 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
22:30:55.771636 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
22:30:55.794732 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
22:30:55.794816 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
22:30:55.818130 IP 192.168.1.1.domain > 192.168.1.11.43963: 48617 2/1/0 CNAME www.google.com., CNAME www.l.google.com. (129)
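
The delay per query can be read off this trace mechanically. The sketch below is a rough Python helper (first_answer_delay is a hypothetical name) that assumes the default tcpdump text format shown above and a capture that does not cross midnight. It reports, for each query ID, the time between the first query and the first answer.

```python
import re
from datetime import datetime

# The tcpdump lines from this comment, copied verbatim.
TRACE = """\
22:30:50.767774 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
22:30:50.767844 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
22:30:50.789994 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
22:30:55.771636 IP 192.168.1.11.43963 > 192.168.1.1.domain: 37639+ A? www.google.de. (31)
22:30:55.794732 IP 192.168.1.1.domain > 192.168.1.11.43963: 37639 5/0/0 CNAME www.google.com., CNAME www.l.google.com., A 74.125.77.104, A 74.125.77.147, A 74.125.77.99 (127)
22:30:55.794816 IP 192.168.1.11.43963 > 192.168.1.1.domain: 48617+ AAAA? www.google.de. (31)
22:30:55.818130 IP 192.168.1.1.domain > 192.168.1.11.43963: 48617 2/1/0 CNAME www.google.com., CNAME www.l.google.com. (129)
"""

# timestamp, source > destination, DNS query id (optional "+" flag), rest
LINE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) IP \S+ > \S+: (\d+)\+? (.*)$")
# Answers start with record counts such as "5/0/0"; queries do not.
ANSWER = re.compile(r"^\d+/\d+/\d+ ")


def first_answer_delay(lines):
    """Map each DNS query id to seconds from first query to first answer."""
    first_query = {}
    delay = {}
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        qid = int(m.group(2))
        if ANSWER.match(m.group(3)):
            if qid in first_query and qid not in delay:
                delay[qid] = (stamp - first_query[qid]).total_seconds()
        else:
            first_query.setdefault(qid, stamp)
    return delay


for qid, seconds in first_answer_delay(TRACE.splitlines()).items():
    print(qid, f"{seconds:.6f} s")
```

On this trace the A query (37639) is answered after about 0.02 s, while the AAAA query (48617) gets its first answer only after about 5.05 s, once it has been resent on its own.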
Comment 3 Petr Baudis 2010-06-28 21:20:52 UTC
Thanks for the report, but why are you adding the other poor 18 people to Cc?

We had temporarily resolved the bug by disabling the parallel resolving method, but for newer openSUSE releases we switched back to parallel resolving for the speed improvement. nscd should have alleviated the problem (it would have to find out only once that single-request mode has to be used); unfortunately, unscd has trouble sharing the resolver state between subprocesses, and the fix turned out to be a bit complex. Perhaps I should have prioritized it more.
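
The trade-off can be put into a toy timing model. The Python sketch below is not the glibc algorithm; it only mirrors the packet sequence visible in the tcpdump output of comment 2 (both queries sent at once, one answer missing, a 5 s wait, then A and AAAA retried one after the other). The function name lookup_time and its parameters are made up for illustration.

```python
def lookup_time(rtt, parallel=True, server_answers_both=True, timeout=5.0):
    """Rough wall-clock seconds for one combined A + AAAA lookup.

    rtt is one query/answer round trip to the nameserver; timeout is the
    5000 ms poll() timeout seen in the strace of the description.
    """
    if not parallel:
        # single-request style: A first, then AAAA
        return 2 * rtt
    if server_answers_both:
        # both answers come back after a single round trip
        return rtt
    # one answer never arrives: wait out the timeout, then retry sequentially
    return timeout + 2 * rtt


for label, seconds in [
    ("parallel, well-behaved server", lookup_time(0.023)),
    ("sequential (single-request)", lookup_time(0.023, parallel=False)),
    ("parallel, buggy server", lookup_time(0.023, server_answers_both=False)),
]:
    print(f"{label}: {seconds:.3f} s")
```

With a round trip of roughly 23 ms, as in the trace, sequential lookups cost only a few tens of milliseconds more than parallel ones, whereas the failed parallel attempt costs the full timeout on every lookup.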

The simplest workaround is to add 'options single-request' to /etc/resolv.conf. Can you please verify that it fixes the problem for you?
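
As a rough illustration of that edit, here is a small Python sketch that adds the option to the text of a resolv.conf file without duplicating it. The function name ensure_single_request is hypothetical, and the sketch assumes at most one relevant 'options' line. It only transforms a string; writing the result back to /etc/resolv.conf needs root, and a tool that regenerates the file may overwrite the change.

```python
def ensure_single_request(resolv_conf_text):
    """Return resolv.conf text that contains 'options single-request'."""
    lines = resolv_conf_text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] == "options":
            # Extend the first existing options line if the option is missing.
            if "single-request" not in fields[1:]:
                lines[i] = line.rstrip() + " single-request"
            break
    else:
        # No options line at all: add one at the end.
        lines.append("options single-request")
    return "\n".join(lines) + "\n"


print(ensure_single_request("nameserver 192.168.1.1\n"), end="")
```

The comparison is done on whole tokens, so an existing 'single-request-reopen' entry is not mistaken for 'single-request'.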

Coolo, do you think it is worth mentioning it again in the release notes?
Comment 4 Martin Wilck 2010-06-29 08:31:06 UTC
(In reply to comment #3)
> Thanks for the report, but why are you adding the other poor 18 people to Cc?

I just cloned the bug, assuming the cc list members were still interested. Apologies to those who aren't any more.
 
> The simplest workaround is to add 'options single-request' to /etc/resolv.conf.
> Can you please verify that it fixes the problem for you?
> 
> Coolo, do you think it is worth mentioning it again in the release notes?

Please find a way to document it in a place that is easy to find. Not having used openSUSE with my router at home for some time, I found it pretty hard to find information (for example, a search for DNS at wiki.opensuse.org returns no useful results). 'options single-request' definitely escaped my attention.

Btw, can that option be used together with NetworkManager? If yes, how?
Comment 5 Martin Wilck 2010-06-29 21:19:17 UTC
export RES_OPTIONS="single-request" 

in the profile solves all problems, independent of NetworkManager.
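
The same variable can also be set for a single program instead of the whole login session, which is handy for testing. A minimal Python sketch, assuming the glibc resolver reads RES_OPTIONS as a space-separated list of options; the helper names env_with_res_option and run_with_single_request are hypothetical.

```python
import os
import subprocess


def env_with_res_option(option, base_env=None):
    """Return a copy of the environment with option added to RES_OPTIONS."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("RES_OPTIONS", "").split()
    if option not in current:
        current.append(option)
    env["RES_OPTIONS"] = " ".join(current)
    return env


def run_with_single_request(argv):
    """Run a command with single-request enabled for that process only."""
    return subprocess.run(argv, env=env_with_res_option("single-request"), check=False)


# Example call (needs network access, so it is left commented out):
# run_with_single_request(["curl", "-s", "-o", "/dev/null", "www.google.de"])
```

Options already present in RES_OPTIONS are kept, and the caller's own environment is left untouched.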

Thanks for the hint.
Comment 6 Martin Wilck 2010-06-29 22:45:31 UTC
http://en.opensuse.org/Speed_up_slow_DNS_lookups

please review.
Comment 7 Teruel de Campo MD 2010-06-30 11:04:45 UTC
Thank you guys.
-=terry=-
Comment 8 Petr Baudis 2011-02-04 01:09:34 UTC
.

*** This bug has been marked as a duplicate of bug 594447 ***
Comment 9 Petr Baudis 2011-02-04 01:10:45 UTC
sorry for the typo :)

*** This bug has been marked as a duplicate of bug 549447 ***