Bugzilla – Full Text Bug Listing

| Summary: | NFS with kerberos identification isn't working | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.4 | Reporter: | Martin Caj <mcaj> |
| Component: | Basesystem | Assignee: | Neil Brown <nfbrown> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P2 - High | CC: | cbenson, dipeit, forgotten_b5BnQSUi71, lchiquitto, mawa, mcaj, meissner, mlmuit, msvec, mvidner, nfbrown, okir, puzel, R.Smits, update |
| Version: | Final | ||
| Target Milestone: | Final | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.3 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | Upstream patch; output of rpc.gssd -f -vvvv > /tmp/11.3.log 2>&1; rpc.gssd.11.4 -f -vvvv > /tmp/11.4.log 2>&1; rpc.gssd binary for testing (x86-64); Output of the test binary bug-614293_rpc.gssd; Alternate rpc.gssd binary for testing; Output of the test binary bug-614293_rpc2.gssd; New rpc.gssd which attempts to auto-correct | ||

Is it possible to get a trace of what rpc.svcgssd is doing at this time? I don't use IM, and being in a totally different time zone it probably wouldn't help anyway. Thanks.

Hi Neil, I added a log file as an attachment. I'm not sure if it helps you. But I have an idea: I will provide access to my testing machine. I will send you the details in a separate email. You can play with these machines as you need. Martin

Martin, Neil: I can confirm this. I first thought that it was related to the new krb5-client 1.8.1, but when I installed the 11.3 krb5-client rpm on an 11.2 system it worked just fine. Then I installed krb5-client 1.7.1 from 11.2 on 11.3 and it did not work. So it is pretty safe to say that this issue is not related to krb5-client.

The build dependencies for krb5-client are:
BuildRequires: bison libcom_err-devel ncurses-devel
BuildRequires: keyutils keyutils-devel
BuildRequires: libopenssl-devel openldap2-devel

11.3 with krb5-client 1.7.1:
rpc.gssd -fvvvvvvvvvv
beginning poll
handling krb5 upcall
Full hostname for 'PHSAUDIT1.XXXXX.ORG' is 'phsaudit1.xxxxx.org'
Full hostname for 'cpprojweb.xxxxx.org' is 'cpprojweb.xxxxx.org'
Key table entry not found while getting keytab entry for 'root/cpprojweb.xxxxx.org@XXXXX.ORG'
Success getting keytab entry for 'nfs/cpprojweb.xxxxx.org@XXXXX.ORG'
WARNING: Key table entry not found while getting initial ticket for principal 'nfs/cpprojweb.xxxxx.org@XXXXX.ORG' using keytab 'WRFILE:/etc/krb5.keytab'
ERROR: No credentials found for connection to server PHSAUDIT1.XXXXX.ORG
doing error downcall

Maybe this is related to nfs-client-1.2.1 (11.3) vs nfs-client-1.1.3 (11.2)?
destroying client clnt3c

I have the same problem with mounting an opensolaris (b134) and a netapp filesystem:
#########################################################################
+ mount -t nfs4 -o sec=krb5p netapp1.xxx:/vol/home /mnt
mount.nfs4: access denied by server while mounting netapp1.xxx:/vol/home
+ mount -t nfs4 -o sec=krb5p sunfs4.xxx:/vol/home /mnt
mount.nfs4: access denied by server while mounting sunfs4.xxx:/vol/home
#########################################################################
If I only replace rpc.gssd with the version from 11.2 x86_64, the mounts succeed. Yes, I already use "allow_weak_crypto=true". The binaries used are:
# md5sum rpc.gssd.*
a8508d91fb7d05c043e4d7e1aaeae58a rpc.gssd.11.2
f2b870026fa7bcd6745a92080e4e5d88 rpc.gssd.11.3

The bug report appears similar to the bug fixed upstream (tirpc: allow large ticket sizes with RPCSEC_GSS). To narrow down the problem, I have built nfs-utils with --disable-tirpc and made the test package available here: http://www.suse.com/~sjayaraman/test-pkgs/nfs-krb/ (syncing might take a few hours). Could you please try this new package and see whether the problem is reproducible? Also, please report the size of the cred cache (krb5cc_xxx) file found in the client's /tmp dir.

No problem with your test package!
# ls -l /tmp/krb5*
-rw------- 1 mawa users 3069 Jul 29 16:13 /tmp/krb5cc_1003
-rw------- 1 root root 3778 Jul 29 16:08 /tmp/krb5cc_machine_PUBLIC.ADS.UNI-FREIBURG.DE

Hi, I installed your packages, including nfs-client-debuginfo, on my testing machine:
mount $myserver:/home /mnt/nfs -t nfs -o sec=krb5i,ro
mount.nfs: access denied by server while mounting $myserver
rpcsec_gss: gss_init_sec_context: (major) Unspecified GSS failure. Minor code may provide more information - (minor) No supported encryption types (config file error?)
The size of the cache file is 577 B, and I have a valid ticket from the KDC server (kinit and klist are working properly). I will send you my idea of how to continue via email.
Martin
PS: as Martin Walter wrote here, I think the problem is in rpc.gssd.

Thanks for providing access to the setup. It turns out to be a configuration issue with the newer krb5-1.8 version (which disables DES by default - http://www.mit.edu/~kerberos/krb5-1.8/). Adding "allow_weak_crypto = true" to the client's /etc/krb5.conf file makes the problem go away. I'm able to mount using krb5i without any issue. Please check whether the same works with other setups you might have.

Created attachment 379434 [details]
Upstream patch
There are two separate issues in this bugzilla.
(1) kerberos mount failing due to larger krb ticket sizes (> 1024) which is likely in case of Windows AD
(2) kerberos mount failing due to DES being disabled by default in the recent version of krb5 (krb5 >= 1.8)
The attached patch fixes the issue (1). The issue (2) could be fixed by a configuration change - adding 'allow_weak_crypto=1' to krb5.conf.
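
Issue (2) comes down to a single krb5.conf setting, so it can be checked before attempting a mount. The following is a minimal sketch and not part of the original report: `has_weak_crypto` is a hypothetical helper name, and it only recognises the two spellings used in this thread (`true` and `1`), ignoring commented-out lines.

```shell
# Hypothetical helper: succeed if the given krb5.conf-style file
# enables weak (DES) crypto. Defaults to /etc/krb5.conf.
has_weak_crypto() {
    conf=${1:-/etc/krb5.conf}
    # Accept "allow_weak_crypto = true" or "allow_weak_crypto=1",
    # with optional whitespace; lines starting with '#' do not match.
    grep -Eiq '^[[:space:]]*allow_weak_crypto[[:space:]]*=[[:space:]]*(true|1)[[:space:]]*$' "$conf"
}
```

For example, `has_weak_crypto /etc/krb5.conf && echo "weak crypto enabled"` on the NFS client. The check does not verify which section the line appears in.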
Olaf: pdb.suse.de suggests that you are the maintainer of libtirpc. Could you please review and take the attached patch in Comment#12?

I have an i586 so I could not use Suresh's x86_64 packages from comment 7. So I built the packages in OBS: https://build.opensuse.org/package/show?package=nfs-utils&project=home%3Amvidner%3Abranches%3AopenSUSE%3A11.3%3AUpdate%3ATest
- configure with --disable-tirpc to work around kerberos mounts failing with large ticket sizes (sjayaraman, bnc#614293)
I updated nfs-client.rpm from http://download.opensuse.org/repositories/home:/mvidner:/branches:/openSUSE:/11.3:/Update:/Test/standard and changed the config as in comment 11, and it works for me now.

Looks like SLED11 SP1 is also affected by the same problem. There is a report on nfsv4@linux-nfs.org. Since the list has been deprecated for a few months now there are no archives, so I'm pasting the email thread here:

Subject: Re: krb5 authentication error with nfs client 1.2.x
From: "J. Bruce Fields" <bfields@fieldses.org>
Date: Tue, 31 Aug 2010 14:30:23 -0400
To: Richard Smits <R.Smits@tudelft.nl>
CC: "nfsv4@linux-nfs.org" <nfsv4@linux-nfs.org>

On Tue, Aug 31, 2010 at 04:49:19PM +0200, Richard Smits wrote:
> > Hello,
> >
> > We are working on a problem here what is getting bigger. I will explain.
> >
> > Our clients are using SLED 11. If they upgrade to sp1, they get a newer nfs client.
> >
> > Client before update : nfs-client-1.1.3-18.17
> > Client after update : nfs-client-1.2.1-2.6.6
> >
> > We are using krb5 authentication with an active directory. The nfs mount we are trying to make is on a netapp nashead.
> >
> > The scenario is as follows. The client works as expected. When you ONLY upgrade the nfsclient package, we get an error :

Have you filed a SLED bug?
Right off hand it looks like 599511589ca7ddb3b2eac8d3aa5b0b38be7a7691 in upstream libtirpc.
--b.

> > mount /mnt/nfs/
> > mount.nfs4: access denied by server while mounting srvxxx:/vol/vol1/target
> >
> > I have enabled logging on the rpcgssd :
> >
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: Full hostname for 'srvxxx.domain.net' is 'srvxxx.domain.net'
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: Full hostname for 'server.domain.net' is 'server.domain.net'
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: Key table entry not found while getting keytab entry for 'root/server.domain.net@DOMAIN.NET'
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: Success getting keytab entry for 'nfs/server.domain.net@DOMAIN.NET'
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: Successfully obtained machine credentials for principal 'nfs/server.domain.net@DOMAIN.NET' stored in ccache 'FILE:/tmp/krb5cc_machine_DOMAIN.NET'
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: INFO: Credentials in CC 'FILE:/tmp/krb5cc_machine_DOMAIN.NET' are good until 1283300229
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: using FILE:/tmp/krb5cc_machine_DOMAIN.NET as credentials cache for machine creds
> > Aug 31 16:17:09 vmlinux12 rpc.gssd[14072]: using environment variable to select krb5 ccache FILE:/tmp/krb5cc_machine_DOMAIN.NET
<snipped..>

Comment to NFS-Bug: After a failed mount, the NFS client damages the ICAuthority (user administration; you get a suitable alert!) and the contact manager 'Evolution'! I tried it with two computers: an x86-64 and an i386.

(In reply to comment #13)
> Olaf: pdb.suse.de suggests that you are the maintainer of libtirpc. Could you
> please review and take the attached patch in Comment#12?
Suresh, Olaf. This patch works well. I also needed to install librpcsecgss to make it work. When will we see an official patch? Until then we will have to really watch our patching process to avoid a working rpc.gssd being overwritten with a bogus one, which would lock everyone out.
Thanks
dipe

I looked at the patch, and I'm not fully convinced it always does the right thing.
For instance, the changed code reads like this:
bool_t
xdr_rpc_gss_init_args(XDR *xdrs, gss_buffer_desc *p)
{
bool_t xdr_stat;
u_int maxlen = (u_int)(p->length + RPC_SLACK_SPACE);
xdr_stat = xdr_rpc_gss_buf(xdrs, p, maxlen);
This looks reasonable when you want to encode the gss_init args, but on
decoding, p->length will be 0, so you limit the maximum size to RPC_SLACK_SPACE.
I'm checking this with upstream
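
The concern about decoding can be made concrete with a little arithmetic. This is only a sketch under stated assumptions: `RPC_SLACK_SPACE` is taken to be 1024 (the thread never states its value, although issue (1) mentions ticket sizes above 1024), the function names are hypothetical, and it assumes the decoder rejects any buffer longer than `maxlen`.

```shell
# Assumption: RPC_SLACK_SPACE is taken to be 1024 here; the thread does not state its value.
RPC_SLACK_SPACE=1024

# Mirrors "maxlen = p->length + RPC_SLACK_SPACE" from the patched code.
gss_init_maxlen() {
    echo $(( $1 + RPC_SLACK_SPACE ))
}

# Succeeds when a token of $2 bytes is within the limit derived from p->length = $1.
token_fits() {
    [ "$2" -le "$(gss_init_maxlen "$1")" ]
}
```

When encoding, `p->length` is the token size, so `token_fits 3000 3000` succeeds. When decoding, `p->length` starts at 0, so `token_fits 0 3000` fails: the limit collapses to the slack space alone.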
Sorry, but nfs-client-1.2.1-8.3.1.x86_64.rpm has not solved the problem.

*** Bug 657813 has been marked as a duplicate of this bug. ***

Just tested with:
- 11.4rc1 -> access denied by server
- 11.4rc1 with nfs-client-1.2.3-47.1.i586 from factory -> access denied ..
- 11.4rc1 with librpcsecgss3-0.18-9.3.i586 and rpc.gssd from 11.2 -> success
don't want to stay with 11.2 ! :(

This is apparently still broken in 11.4.

In 11.4 the rpc.gssd from 11.3 works! ;-)
$ uname -a; md5sum /usr/sbin/rpc.gssd*
Linux xxx 2.6.37.1-1.2-desktop #1 SMP PREEMPT 2011-02-21 10:34:10 +0100 x86_64 x86_64 x86_64 GNU/Linux
f2b870026fa7bcd6745a92080e4e5d88 /usr/sbin/rpc.gssd
f2b870026fa7bcd6745a92080e4e5d88 /usr/sbin/rpc.gssd.113
3682571ef2d3d3c9c5e6774708c86270 /usr/sbin/rpc.gssd.114

With allow_weak_crypto=1 it works in 11.4.

We already have allow_weak_crypto=1. No problem with rpc.gssd from 11.3. With 11.4:
mount.nfs4: access denied by server while mounting ...
The ADS is windows-server 2003 sp2.

In SLED11 the nfs-client-1.2.1-2.10.1 is not working with our AD and krb5 nfs mounts. We must use nfs-client-1.2.1-8.1 or our gss mounts are not working. Any news on this issue?

I have the same problem with 11.4 as Martin Walter on our evaluation system. The workaround with rpc.gssd from 11.3 works here too! But the last nfs update overwrites the rpc.gssd.... When can 11.4 be used in production installations? Is there any chance to get this to work out of the box? We are using an 11.0 nfs-server with krb5-nfs4.

The patch mentioned in comments 16,18,19 is already in 11.4, so this must be a new problem. I've looked through the changes to rpc.gssd between 11.3 and 11.4 and there is nothing that would obviously cause any breakage. It would help a lot if I could get some comparative tracing output, i.e.

Run the 11.3 gssd as
  # kill the old rpc.gssd
  rpc.gssd -f -vvvv > /tmp/11.3.log
  # perform test which succeeds
  # interrupt rpc.gssd
Then the same with the 11.4 gssd
  rpc.gssd -f -vvvv > /tmp/11.4.log
  # perform same test which should now fail
  # interrupt rpc.gssd
Then attach both of the .log files to this bugzilla and I'll have a look. Thanks.

Created attachment 442554 [details]
output of rpc.gssd -f -vvvv > /tmp/11.3.log 2>&1
This is the output for the 11.3 rpc.gssd binary:
rpc.gssd -f -vvvv > /tmp/11.3.log 2>&1
when issuing a mount -t nfs4 / umount operation.
The mount works without error.
Created attachment 442555 [details]
rpc.gssd.11.4 -f -vvvv > /tmp/11.4.log 2>&1
This is the output of the 11.4 binary rpc.gssd
rpc.gssd.11.4 -f -vvvv > /tmp/11.4.log 2>&1
when issuing a mount -t nfs4 command
The command results in the error message:
mount.nfs4: access denied by server while mounting server:/home
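
The comparative tracing used for the two logs above can be wrapped in a small helper so that both binaries are run the same way. This is a sketch only: `capture_gssd_trace` is a hypothetical name, and it relies solely on the `-f -vvvv` invocation and the `2>&1` redirection already shown in this thread.

```shell
# Hypothetical helper: run a gssd binary in the foreground with verbose
# logging and capture both stdout and stderr in one log file.
# Usage: capture_gssd_trace BINARY LOGFILE
capture_gssd_trace() {
    gssd_bin=$1
    log=$2
    # Blocks until the daemon exits or is interrupted (Ctrl-C).
    "$gssd_bin" -f -vvvv > "$log" 2>&1
}
```

A typical session would stop the running rpc.gssd, call `capture_gssd_trace rpc.gssd /tmp/11.3.log` in one terminal, perform the mount test in another, then interrupt the helper and repeat with the other binary and a different log name.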
Thanks for the logs. It seems to be failing in authgss_create_default. The only difference I can see in the way this is being called is that the list of permitted crypto schemes has been enlarged.
In 11.3 it is hard coded as 1,3,2
In 11.4 it gets the info from the kernel and uses 18,17,16,23,3,1,2
This second list is hardcoded into the kernel.
I'll attach a rpc.gssd binary based on 11.4 code but using the old hard coded list. Please check and report how it goes. If it works, then we need to understand why trying to negotiate other encryption schemes causes a problem. If it doesn't ... I'll have to dig harder.

Created attachment 442802 [details]
rpc.gssd binary for testing (x86-64)
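
The two encryption-type lists in the analysis above are given as bare numbers. As a reading aid, the sketch below translates them into the names krb5.conf uses; the mapping is not stated anywhere in this thread and follows the standard Kerberos encryption type numbering, and both function names are hypothetical.

```shell
# Hypothetical helper: map a Kerberos enctype number to its krb5.conf name.
enctype_name() {
    case "$1" in
        1)  echo des-cbc-crc ;;
        2)  echo des-cbc-md4 ;;
        3)  echo des-cbc-md5 ;;
        16) echo des3-cbc-sha1 ;;
        17) echo aes128-cts-hmac-sha1-96 ;;
        18) echo aes256-cts-hmac-sha1-96 ;;
        23) echo arcfour-hmac ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Print one name per line for a comma-separated list such as "18,17,16,23,3,1,2".
describe_enctypes() {
    old_ifs=$IFS
    IFS=,
    for n in $1; do
        enctype_name "$n"
    done
    IFS=$old_ifs
}
```

Running `describe_enctypes 1,3,2` lists only single-DES types, while `describe_enctypes 18,17,16,23,3,1,2` puts the AES types first, which is the difference a server limited to DES would see.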
Thanks for the patch. But unfortunately the test binary has the same result as the original 11.4 binary. The result is:
mount.nfs4: access denied by server while mounting server:/home
The log output is in the next message.

Created attachment 442876 [details]
Output of the test binary bug-614293_rpc.gssd
The output was created with:
/tmp/bug-614293_rpc.gssd -f -vvvv > /tmp/bug-614293.log 2>&1
I forgot to remove the status NEEDINFO

Thanks for testing. I'll have to look deeper. I might end up creating a binary with lots more debugging and getting you to test that. Probably won't happen until next week though.

Created attachment 443572 [details]
Alternate rpc.gssd binary for testing
I discovered an error in how I built the previous rpc.gssd, so it wasn't really any different from the 11.4 one.
This one is fixed.
I've looked further and I cannot see anything else that could possibly cause the failure where it happens so if this isn't it - I'm lost.
So: please test the same way and report.
Thanks.
Created attachment 443687 [details]
Output of the test binary bug-614293_rpc2.gssd
Thanks for the binary.
The new binary works as expected.
The share mounts flawlessly and the first r/w tests are OK.
Excellent - thanks. Now we need to figure out why that change makes a difference. I'll ask upstream developers for ideas but there are some details that might help.
First: please confirm what sort of server you are using. Comment 26 says: The ADS is windows-server 2003 sp2. Comment 27 seems to suggest ADS too. If you have a windows server providing NFS and/or authentication too, that might be a common thread.
Also it might help to get a 'tcpdump' of traffic during the mount (or mount attempt) to compare, i.e. on the NFS client run
  tcpdump -s 0 -w /tmp/filename host NAME-OF-SERVER &
  # then start rpc.gssd
  # then try to mount the filesystem
  killall tcpdump
and attach /tmp/filename. If you could try that for both the non-working and the working rpc.gssd and attach both files, that would be great.

The server is running OpenSuse 11.0 with kernel 2.36.4 vanilla and nfsv4 shares. The authentication system is the enclosed krb5 (krb5-1.6.3-50.11) with ldap. The 11.4 client has set allow_weak_crypto = true to enable krb5 authentication of nfsv4. If you need further information on the systems, please ask me. I will see when I can provide the tcpdump logs.

Thanks.. I suspect this may be a known bug (not that I personally knew about it..) According to http://wiki.linux-nfs.org/wiki/index.php/Enduser_doc_kerberos you need the "allow_weak_crypto = true" on the server, but you say it is on the client. Are you able to change it on the server and see if that makes a difference? I may get rpc.gssd to retry with the 'old' approach when the 'new' approach doesn't work, but first I'd like to be sure I understand the problem.

I can try "allow_weak_crypto = true" on the server too. But I have needed this option beginning with OpenSuse 11.3 on the client. 11.1 uses weak crypto by default. And afaik the problem is that the kernel part does not support strong encryption until now, but newer gss and krb5 implementations default to strong encryption. I can try to change this on the server, but not at the moment during working hours. In the evening I can try this, but I suspect krb5-1.6 does not support this option. But we will see!

1. "allow_weak_crypto = true" on the server does not solve the crypto problem... It has no influence at all.
2. Can I send you the tcpdumps via email, as the data should not become public?

(In reply to comment #44)
> 1. "allow_weak_crypto = true" on the server does not solve crypto the
> problem...
> It has no influence at all.
Thanks.
>
> 2. Can I send you the tcpdumps via email? As the data should not become public?
Sure - nfbrown@novell.com - but you can see that I expect. I'll keep them confidential.

Created attachment 444812 [details]
New rpc.gssd which attempts to auto-correct
Thanks for the tcpdump traces. They largely show what I would expect.
I've modified rpc.gssd to handle failure by retrying with a reduced set of allowed encryption types. Hopefully this will work correctly on all servers...
Please:
1/ Remove the "allow_weak_crypto = true" from the server - it seems to be
a problem. I realise you will need to wait for a quiet time to do that.
2/ try to mount a file system with the attached rpc.gssd running. If you could
collect a tcpdump trace while that happens and mail it to me that would be
great. I don't expect to find any surprises in it, but it would be nice to
be certain of that.
If you confirm that it works and there are no surprises, I will submit this patch to Factory and try to get an update for 11.4 scheduled in due course.
Just for completeness, the change I made is below.
Thanks.
Index: nfs-utils-1.2.3/utils/gssd/gssd_proc.c
===================================================================
--- nfs-utils-1.2.3.orig/utils/gssd/gssd_proc.c 2010-09-28 22:24:16.000000000 +1000
+++ nfs-utils-1.2.3/utils/gssd/gssd_proc.c 2011-08-09 11:23:49.316191138 +1000
@@ -917,6 +917,23 @@ int create_auth_rpc_client(struct clnt_i
 	printerr(2, "creating context with server %s\n", clp->servicename);
 	auth = authgss_create_default(rpc_clnt, clp->servicename, &sec);
+#ifdef HAVE_SET_ALLOWABLE_ENCTYPES
+	if (!auth && authtype == AUTHTYPE_KRB5 && krb5_enctypes) {
+		u_int min_stat;
+		/* The extended list of enctypes can confuse old servers */
+		gss_release_cred(&min_stat, &sec.cred);
+		free(krb5_enctypes);
+		krb5_enctypes = NULL;
+		num_krb5_enctypes = 0;
+		printerr(2, "retry auth using default encryption types\n");
+		if (limit_krb5_enctypes(&sec) == 0)
+			auth = authgss_create_default(rpc_clnt,
+						clp->servicename, &sec);
+		else
+			printerr(1, "WARNING: Failed while limiting krb5 "
+				    "encryption types to default list\n");
+	}
+#endif
 	if (!auth) {
 		/* Our caller should print appropriate message */
 		printerr(2, "WARNING: Failed to create %s context for "
The log from the new binary is in your mailbox. Maybe there is a possibility to reduce the 30s penalty for old servers?

Thanks for the log... It looks a bit odd though - I cannot make out why it is doing what it appears to be doing. So I've spent a while setting up kerberos and exploring gssd and I have found another possible approach. Could you please try adding
  default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
to the [libdefaults] section of /etc/krb5.conf, and trying again with the standard openSUSE-11.4 rpc.gssd? If that doesn't get you there, add
  permitted_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
  default_tgs_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
and see if that helps.
It appears that the problem is really with the server - the libraries understand the newer encryption types but the kernel doesn't. So a client that uses a newer encryption type results in confusion. The above addition to krb5.conf tells kerberos to only use the older encryption types and so allows it to work with an older server. I don't actually have an 11.1 machine I can play with as a server so I haven't tried out exactly the combination you have, but making those changes does seem to change the things that rpc.gssd says to better match what the old rpc.gssd said.

As you supposed, adding
  default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
works with the standard rpc.gssd of 11.4. Does this also influence the security of logins? Since which kernel version is des3 supported for nfs4? Thanks for the suggestion and your help!

I suspect it might affect the strength of the crypto used for logins too. However your security is only as strong as the weakest link, and as your NFS server does not support anything stronger it will be your weakest link. DES is theoretically more vulnerable than more recent encodings. How much this actually increases your exposure is very hard to say. The safest approach is to upgrade the server so you can drop the limitation. AES crypto was added in 2.6.35, but server support requires either 2.6.39, or possibly an earlier kernel with nfs-utils-1.2.5 (which hasn't been released yet).

As the original bug mentioned in this bugzilla was fixed, I'll resolve this as FIXED. The subsequent bug is really a server bug for which we have a workaround (default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1) so I won't pursue that any more. Thanks for your help in getting to the bottom of this.

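The krb5.conf workaround from the closing comments can be generated rather than typed by hand. The sketch below is an illustration only: `print_legacy_enctypes` is a hypothetical helper, it prints the fragment to stdout rather than editing /etc/krb5.conf, and the enctype list and option names are exactly those quoted in this thread.

```shell
# Hypothetical helper: print the [libdefaults] lines that restrict
# Kerberos to the older encryption types, for an old NFS server.
# With the argument "full" it also prints the two follow-up settings.
print_legacy_enctypes() {
    legacy="des-cbc-md5 des-cbc-crc des3-cbc-sha1"
    echo "[libdefaults]"
    echo "    default_tkt_enctypes = $legacy"
    if [ "${1:-}" = "full" ]; then
        echo "    permitted_enctypes = $legacy"
        echo "    default_tgs_enctypes = $legacy"
    fi
}
```

Start with `print_legacy_enctypes` and merge the single line into the existing [libdefaults] section; `print_legacy_enctypes full` gives the additional lines suggested for the case where the first one is not enough. As noted in the thread, this may also weaken the crypto used for logins.
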
Created attachment 369189 [details]
strace logs

User-Agent: Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.9.1.9) Gecko/20100317 SUSE/3.5.9-0.1.1 Firefox/3.5.9

Hi,
the NFS client with kerberos identification (-o sec=krb5i or -o sec=krb5p) isn't working on OpenSuse 11.3 Milestone 7. A normal nfs mount without a kerberos ticket (-o default) is working. In the KDC server log file, I can see the request and answer for the nfs ticket as for any other workstation (no error/warning messages). I tested it on i686 and x86-64 CPUs with the same result.
BTW: the NFS client in YaST2 is broken too, but that bug was already reported.
I'm from Suse (Prague). If you need help with testing or debugging the problem I can help you - just ping me on Novell IM.
Martin

Reproducible: Always

Steps to Reproduce:
1. Install OpenSuse 11.3 M7, set up the ldap and kdc client (check that login on the machine is working), disable the firewall
2. don't use YaST for the NFS client - it is broken!
3. enable gss in /etc/sysconfig/nfs, download your /etc/krb5.keytab from the kdc server, add the nfs mount to /etc/fstab, e.g.: nfsserver:/home /nfs nfs sec=krb5i,intr,rw
4. reboot the machines
5. run the nfs mount, e.g.: "mount nfsserver:/home /nfs -t nfs -o sec=krb5i,intr,rw"

Actual Results:
error messages: "mount.nfs: access denied by server while mounting nfs.suse.cz:/home"

Expected Results:
mount.nfs should successfully mount /nfs.

I added an strace log file. I hope it helps you debug and fix the problem.