
[Samba] winbind causing huge timeouts/delays since 4.8




Hello!

I want to share some findings with the community about huge timeouts/delays we have seen since upgrading to Samba 4.8 at the end of last year, along with a patch that fixes them in our setup. It would be great if someone from the Samba dev team could take a look and, if acceptable, apply the patch to the common code base. The issue may also affect the current stable release and release candidates. The patch expects the patch from BUG 13503 "getpwnam resolves local system accounts to AD" to be applied already.

Within the company I'm working for, we frequently see system hangs/slowness lasting a couple of seconds on servers that use winbind passwd/group resolution via nsswitch.conf. This started when we updated our OS from CentOS 7.5 to CentOS 7.6, which includes a Samba update from 4.7 to 4.8.

We could track it down to winbind being asked for an unknown local user account, i.e. an account that is not in the local passwd and whose name doesn't contain a domain part like SOMEDOMAIN\account or account@SOMEDOMAIN. The expected behavior is an immediate return with an error like "no such user" or "unknown user", but instead a call like "id unknown" takes 60+ seconds. Increasing "winbind max domain connections" reduced this to 10+ seconds, and setting "winbind use default domain" to yes brought back the expected immediate response. A log of the different setups can be found at the bottom.
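For anyone who wants to measure the lookup without going through "id", here is a minimal C sketch that times a single getpwnam() call, which goes through the same nsswitch.conf passwd sources. The helper names time_user_lookup and report_user_lookup are made up for this illustration, and the timer uses the wall clock (C11 timespec_get), which is accurate enough for delays of several seconds.

```c
#include <pwd.h>
#include <stdio.h>
#include <time.h>

/*
 * Look up `name` through the C library, and therefore through whatever
 * nsswitch.conf lists for passwd (e.g. "files winbind").
 * Sets *found to 1 if an entry came back, 0 otherwise.
 * Returns the elapsed wall-clock time in seconds.
 */
double time_user_lookup(const char *name, int *found)
{
    struct timespec start, end;

    timespec_get(&start, TIME_UTC);
    struct passwd *pw = getpwnam(name);
    timespec_get(&end, TIME_UTC);

    *found = (pw != NULL);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Print one line per lookup, similar in spirit to "time id <name>". */
void report_user_lookup(const char *name)
{
    int found = 0;
    double secs = time_user_lookup(name, &found);

    printf("%s: %s (%.3f s)\n", name,
           found ? "found" : "no such user", secs);
}
```

Calling report_user_lookup("unknown") from a small main() on an affected server would be expected to show roughly the same delay as "time id unknown", while on an unaffected or patched server it should return almost immediately.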

As none of these config changes makes sense to me as a requirement, and setting "winbind use default domain" to yes isn't usable on some of our servers, I dug deeper using wbinfo to talk to winbind more directly and so keep other services from affecting the tests.
The finding was pretty clear:
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
        winbind use default domain = No
failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
Could not get info for user unknown

real    1m2.522s
user    0m0.005s
sys     0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
        winbind use default domain = Yes
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user unknown

real    0m0.015s
user    0m0.005s
sys     0m0.005s

Doing some code research, I could track it down to a logic change affecting the return value of the function parse_domain_user in source3/winbindd/winbindd_util.c.
Calling the function under these conditions:
- no domain (i.e. empty)
- user without a domain part (i.e. not DOM\user or user@DOM)
- "winbind use default domain" set to No/false (which is the default)
causes different return values:
- up to version 4.7: false
- since version 4.8: '\0', i.e. an empty string
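The difference can be illustrated with a small stand-alone C model. This is a simplified sketch and not Samba's code: the names parse_domain_user_model, lookup_decision and NAME_LEN are hypothetical, the separator is hard-coded to a backslash, and it assumes that "returns an empty string" means the parse is reported as successful while the domain output stays empty. The flag reject_bare_name switches between the old (up to 4.7, or patched) and the 4.8 behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 256

/* Copy at most n bytes of src into a fixed-size buffer, NUL-terminated. */
static void copy_name(char dst[NAME_LEN], const char *src, size_t n)
{
    if (n >= NAME_LEN) {
        n = NAME_LEN - 1;
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/*
 * Hypothetical model of the parsing step (NOT the real parse_domain_user).
 * reject_bare_name == true : behaviour up to 4.7 / with the patch applied
 * reject_bare_name == false: behaviour since 4.8 as described above
 */
bool parse_domain_user_model(const char *domuser, bool use_default_domain,
                             const char *workgroup, bool reject_bare_name,
                             char domain[NAME_LEN], char user[NAME_LEN])
{
    const char *sep;

    domain[0] = '\0';
    user[0] = '\0';
    if (domuser[0] == '\0') {
        return false;
    }
    if ((sep = strchr(domuser, '\\')) != NULL) {        /* DOM\user */
        copy_name(domain, domuser, (size_t)(sep - domuser));
        copy_name(user, sep + 1, strlen(sep + 1));
        return true;
    }
    if ((sep = strchr(domuser, '@')) != NULL) {         /* user@DOM */
        copy_name(user, domuser, (size_t)(sep - domuser));
        copy_name(domain, sep + 1, strlen(sep + 1));
        return true;
    }
    copy_name(user, domuser, strlen(domuser));
    if (use_default_domain) {                           /* fill in default */
        copy_name(domain, workgroup, strlen(workgroup));
        return true;
    }
    /* Bare name, no default domain: this is the case that changed. */
    return !reject_bare_name;
}

/* What a caller in this model can conclude from the parse result. */
const char *lookup_decision(bool parsed, const char *domain)
{
    if (!parsed) {
        return "fail fast: no such user";
    }
    if (domain[0] == '\0') {
        return "continue with empty domain";
    }
    return "ask the named domain";
}
```

In this model, a bare name like "unknown" with reject_bare_name set to true ends in "fail fast", which corresponds to the immediate "no such user" answer. With reject_bare_name set to false the request carries on with an empty domain, which is the case where we observe the 60+ second delay.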

Applying the attached patch, which re-introduces the return value of false instead of '\0', fixed the described issues. With our patched version of Samba we could revert to our former config, leaving "winbind use default domain" and "winbind max domain connections" at their default values.

Hopefully this helps others. I would appreciate it if the patch gets into the common code base of Samba, so it can reach the usual update channels of the distributions out there. For CentOS I have already reported a bug (15795) for further processing.

Best regards

Alex

#######

Here is a log of a walk through the different config settings on one of our servers. It is reproducible on our other servers using winbind and samba-4.8:

[root@centos7dev64 ~]# rpm -q samba-4*
samba-4.8.3-4.el7.x86_64
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
        winbind max domain connections = 1
        winbind use default domain = No
id: unknown: no such user

real    1m8.630s
user    0m0.000s
sys     0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
        winbind max domain connections = 10
        winbind use default domain = No

id: unknown: no such user

real    0m10.914s
user    0m0.000s
sys     0m0.005s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
        winbind max domain connections = 10
        winbind use default domain = Yes
id: unknown: no such user

real    0m0.020s
user    0m0.002s
sys     0m0.003s

#######

Attached patch:

diff -Naur samba-4.8.9/source3/winbindd/winbindd_util.c samba-4.8.9-fix_winbind_empty_domain/source3/winbindd/winbindd_util.c
--- samba-4.8.9/source3/winbindd/winbindd_util.c	2018-12-13 04:08:40.000000000 -0500
+++ samba-4.8.9-fix_winbind_empty_domain/source3/winbindd/winbindd_util.c	2019-02-21 06:30:52.358040157 -0500
@@ -1604,6 +1604,7 @@
 			fstrcpy(namespace, domain);
 		} else {
 			fstrcpy(namespace, lp_netbios_name());
+			return false;
 		}
 	}
 