Re: [Samba] winbind causing huge timeouts/delays since 4.8




On 23.02.19 at 15:48, Rowland Penny via samba wrote:
If you have, as you do, 'files sss winbind' on the passwd and
group lines in nsswitch.conf, it means this:
First /etc/passwd or /etc/group is searched and if the user or
group is found, this info is returned.
Next sssd will be asked, 'do you know this user or group ?' if
found, the info is returned.
Finally winbind will be asked, 'do you know this user or
group ?' if found, the info is returned.

Let's take a user called 'fred'; this user is in AD. The first
search will return nothing, so sssd is asked; this 'asks' AD and
returns the user's info. Finally, wait, that's it, we have the
info, there is no need to ask winbind for anything.
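
The lookup order described above can be sketched as a small simulation. This is only an illustration of "the first source that knows the name wins"; the dictionaries, their entries and the `lookup` function are hypothetical stand-ins, not real NSS modules, and the default nsswitch.conf actions are assumed.

```python
# Hypothetical stand-ins for the three sources on the passwd line.
# Real NSS modules are shared libraries called by glibc, not dicts.
FILES = {"root": "uid=0"}
SSS = {"fred": "uid=20001"}
WINBIND = {"OPS\\alice": "uid=11105"}

SOURCES = [("files", FILES), ("sss", SSS), ("winbind", WINBIND)]


def lookup(name, sources=SOURCES):
    """Return (source, entry, asked); the first source that knows the name wins."""
    asked = []
    for label, database in sources:
        asked.append(label)
        if name in database:
            return label, database[name], asked
    # Nobody knew the name: every source in the stack has been asked.
    return None, None, asked


if __name__ == "__main__":
    for user in ("root", "fred", "OPS\\alice", "foo"):
        print(user, lookup(user))
```

Note that a name nobody knows, such as 'foo', is the one case where every source, including winbind, is always asked.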

That is incorrect. Alexander stated:

No, we use at most 3 auth providers (1. and 2. on all Unix
servers):
1. unix (local passwd)
   for static OS/service accounts across all our environments
2. sssd (with unix ldap servers as provider)
   for Unix-experienced users and application-related service accounts
3. samba/winbind
   for Windows users/services needing access to a group of Unix
   servers

And:

They don't - as stated above, we use sssd for querying/caching
entries from our ldap directory server and not from Windows
Domain Controllers
- that is also possible, but it causes more trouble and doesn't
provide what Samba's smb/winbind does.

He clearly writes (in multiple emails) that sssd is configured to
use his unix ldap servers and not AD.

Maybe three sources of user databases is not regular, but I fail
to see why this should be a problem (provided that usernames,
uidNumbers and such are unique across the databases).

And there is the problem, if 'fred' is in /etc/passwd, that user
will be used, but what if you meant fred in ldap or AD ?

We are aware of this possible clash, and it is handled during user account creation.

There is absolutely no point in having 4 databases (yes there are
4, Unix, sssd, winbind and the ldap lines in smb.conf), they
could all be combined in AD.

No, that won't work, as our Windows team doesn't accept schema changes for Unix in AD.


The main problem is that the OP wants Samba changing to cope with
his mess, it might be a valid change, but the reason for the
change is invalid.
The initial reason we hit this after upgrading from samba-4.7 to 4.8 is a script that frequently checks the system for changes; its final "chown root.wheel FILE" freezes the system for approximately a minute (similar to "wbinfo -i foo"). The winbind and sssd logs showed that both were asked about a user "root.wheel". Why a notation that usually (under Linux) indicates user.group is looked up as an account with a dot in its name is another question, but that is more glibc related. Removing winbind from nss fixed the freeze, but isn't an option. It comes down to this: asking winbind for an unknown user without a domain takes a long time before it returns WBC_ERR_DOMAIN_NOT_FOUND.
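
One possible explanation for the "root.wheel" lookups, stated here as an assumption and not verified against the chown source, is that the tool first tries the whole string as a user name (account names may legally contain dots) and only falls back to splitting at the dot when that lookup fails. A minimal sketch of such a parser follows; `parse_owner_spec` and the `user_exists` callback are hypothetical names, not part of chown or glibc.

```python
def parse_owner_spec(spec, user_exists):
    """Resolve 'user:group' or legacy 'user.group'.

    Returns (user, group, lookups), where lookups lists every name that
    was passed to user_exists(), i.e. every trip through the NSS stack.
    """
    lookups = []

    def known(name):
        lookups.append(name)
        return user_exists(name)

    if ":" in spec:
        user, _, group = spec.partition(":")
        if known(user):
            return user, group or None, lookups
    elif known(spec):
        # The whole string, dot included, is a valid account name.
        return spec, None, lookups
    elif "." in spec:
        # Fall back to the legacy user.group reading.
        user, _, group = spec.partition(".")
        if known(user):
            return user, group or None, lookups
    raise LookupError(f"invalid user: {spec!r}")


if __name__ == "__main__":
    print(parse_owner_spec("root.wheel", lambda name: name == "root"))
    print(parse_owner_spec("root:wheel", lambda name: name == "root"))
```

Under this assumption, "chown root:wheel FILE" would never ask the stack about a user named "root.wheel", which may be a workaround worth testing in the script.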


Well, I think the problem is you _assume_ users are in multiple
databases and we just don't know that. I think there is a good
chance Alexander knows perfectly well what he is doing and users are
unique across databases.

Nevertheless, at some point nss is clearly querying winbind, which
means nss did not find the user in either /etc/passwd or via sssd.
In the case that winbind _is_ queried, Alexander is experiencing,
like he wrote, 'frequently system hangs/slowness for a couple of
seconds' and he observed that winbind is causing this behaviour.

So maybe we should set our focus on winbind instead of the multiple
database stuff and figure out why it behaves like this since the
upgrade from 4.7 to 4.8. I would say we should start with fixing
the winbind stuff in smb.conf. Right?

-Remy


P.S. I am following this thread since I also noticed occasional
'hangs' when the system is querying winbind. This is Samba 4.8.7 on
FreeBSD 11.2.





I am quite prepared to help in getting winbind working correctly,
but this will require the OP changing their smb.conf considerably
and removing sssd. We do not support sssd, it is not a Samba
product (for want of a better name). Samba on a Unix domain member
is designed around 3 binaries, smbd, nmbd and winbind, the latter
can do just about anything sssd can do, so why use sssd ? Now you
say that I am making assumptions; well, how about this one: probably
somewhere in the mix there will be Windows domain members, and the
users in ldap are unlikely to be known to them.

I consider sssd as 'just another' user database, like /etc/passwd
(which Samba apparently does support) and I personally cannot see any
difference there, but I respect your opinion.

Where is it documented winbind should be the only service which
should be used with nss? If it is not documented, maybe it should be.

I am not saying that sssd shouldn't be used, just Samba does not
support it. If you want to use sssd, then do so, just don't expect to
get help with using it, we don't produce it, so don't know it.
What I will say is this, there is no need to use both on a Unix domain
member, they both do the same thing.

I'm with you and don't expect you to support sssd.
On the other hand, winbind shouldn't need to be the only module in the nss setup; I have never heard that only a certain number of modules can be put into the stack. Before nss_sss we used nss_ldap alongside nss_winbind without issues. The only interaction I could imagine is one of the libs in the stack calling the stack from the beginning and waiting for a certain answer, ending up in a dead end. Following the function calls of the parse_domain_user function, it seems to me Samba takes care of this with the flag LOOKUP_NAME_NO_NSS in the code. That is only an assumption, as programming is not my daily business.



The proof of the pudding is of course Alexander removing sssd from
nsswitch.conf and showing us that the problem still exists or, better
yet, has disappeared.

That is what I am trying to get at, if it is a Samba problem, then it
will still be there after sssd is removed and the smb.conf is fixed.



I have seen quite a few Samba setups that are like this one, bending
Samba to do something it isn't designed to do; you then get
complaints that it is slow, hangs, etc. Probably fixing the setup
would stop all these problems.

The OP says that it is sssd that is doing the ldap lookups, yet he
has these in smb.conf:

ldap connection timeout = 10
ldap timeout = 30

Yep, these lines should be removed.

Glad you agree ;-)

These were left over from testing config changes to find a solution to the problem, and removing them so they revert to their defaults didn't have any real effect:
[root@centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep "ldap.*timeout"
        ldap connection timeout = 10
        ldap timeout = 30
[root@centos7dev64 ~]# time wbinfo -i foo
failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
Could not get info for user foo

real    1m1.700s
user    0m0.049s
sys     0m0.010s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf
[root@centos7dev64 ~]# systemctl restart smb winbind sssd ; sss_cache -E ; net cache flush
[root@centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep "ldap.*timeout"
        ldap connection timeout = 2
        ldap timeout = 15
[root@centos7dev64 ~]# time wbinfo -i foo
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user foo

real    0m59.304s
user    0m0.051s
sys     0m0.013s

It is also curious that wbinfo returns different errors for the same call:
1. failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
2. failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND

The first somehow tells me winbind got stuck or did not respond in time.
The second is closer to the expected response, as foo does not include a domain and "winbind use default domain" is set to its default of no. What is not expected is the time it takes to reach this finding.
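
For anyone who wants to compare timings with and without winbind on the passwd line, the measurement can also be taken through the NSS stack itself. The sketch below uses Python's standard pwd module, which calls the C library's getpwnam() and therefore walks the whole passwd line of nsswitch.conf, whereas "wbinfo -i" asks winbind directly. The `time_lookup` helper and the sample names are illustrative only.

```python
import pwd
import time


def time_lookup(name):
    """Return (found, seconds) for one getpwnam() call through NSS."""
    start = time.monotonic()
    try:
        pwd.getpwnam(name)
        found = True
    except KeyError:
        # Raised when no source in the stack knows the name.
        found = False
    return found, time.monotonic() - start


if __name__ == "__main__":
    # "foo" and "root.wheel" are the unknown names mentioned in this thread.
    for name in ("root", "foo", "root.wheel"):
        found, seconds = time_lookup(name)
        print(f"{name!r}: found={found} in {seconds:.3f}s")
```

Running it once with 'files sss winbind' and once with 'files sss' on the passwd line should show whether the delay for unknown names comes from winbind alone.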



He also has these:

idmap config * : rangesize = 1000000
idmap config * : range = 1000000-19999999
idmap config * : backend = autorid

The '*' domain is meant for the Well Known SIDs and anything outside
the Samba domain. I would have expected something like this:

idmap config * : backend = tdb
idmap config * : range = 3000-7999
idmap config OPS : backend = rid
idmap config OPS : range = 10000-999999

That should also be fixed.
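
As a sanity check for a layout like the one suggested above, the ranges must not overlap, and with the rid backend the Unix ID follows from the RID as ID = RID - BASE_RID + LOW_RANGE_ID (per the idmap_rid manual page, with base_rid defaulting to 0). A minimal sketch, with hypothetical helper names and an example RID that is not taken from this thread:

```python
# Ranges copied from the suggested smb.conf lines above.
RANGES = {"*": (3000, 7999), "OPS": (10000, 999999)}


def overlaps(first, second):
    """True if two inclusive (low, high) ranges share at least one ID."""
    return first[0] <= second[1] and second[0] <= first[1]


def rid_to_unix_id(rid, low, high, base_rid=0):
    """Map a RID into a range the way the idmap_rid man page describes."""
    unix_id = rid - base_rid + low
    if not low <= unix_id <= high:
        raise ValueError(f"RID {rid} does not fit into {low}-{high}")
    return unix_id


if __name__ == "__main__":
    print("ranges overlap:", overlaps(RANGES["*"], RANGES["OPS"]))
    # RID 1105 is an arbitrary example value.
    print("unix id:", rid_to_unix_id(1105, *RANGES["OPS"]))
```

With more than one domain, each domain would need its own non-overlapping range entry, which is the bookkeeping that autorid otherwise does automatically.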


We use this because we have a multi-domain setup on the Windows side, and it is a setup suggested on wiki.samba.org:
https://wiki.samba.org/index.php/Idmap_config_autorid

I'll try to reconfigure idmap as you suggested, taking care of all the trees in our forest, and will report back on whether that changes the situation.

If he wants to use Samba in a supported way, then I am more than
willing to help.

Thanks. Now let's hope Alexander is willing to jump some hoops.

I am not holding my breath ;-)

Rowland


Thanks for responding and keep breathing ;)

Alex
