Re: [Samba] authentication performance with 4.7.6 -> 4.7.8 upgrade (was: Re: gencache.tdb size and cache flush)




On Tue, 2018-09-04 at 14:15 +1200, Andrew Bartlett via samba wrote:
> On Wed, 2018-08-29 at 15:36 +0200, Peter Eriksson via samba wrote:
> > For what it’s worth, you are not alone in seeing similar problems with Samba and gencache.
> > 
> > Our site has some 110K users (a university with staff and students, including former ones) and currently around 2000 active SMB clients connecting to 5 different Samba servers (around 400-500 clients per server). When we previously just let things “run”, gencache.tdb would grow forever and authentication performance would start to deteriorate after a little while (logins would take more than 10 seconds). So we now delete it (and locks/locking.tdb, which also tends to grow forever) and restart our Samba processes every morning at 7 am, which gives us much more stable performance.
> > 
> > - Servers with 256GB of RAM, 10Gbps ethernet interfaces and around 110TB of disk per server.
> > - FreeBSD 11.2-p2
> > - Samba 4.7.6 with some local patches to allow (much) bigger socket listening queues in order to handle the case of many clients connecting at the same time.
> > 
> > (We are trying to upgrade to a more recent Samba, but 4.7.8 and 4.7.9 gave us horrible authentication performance every 10th hour, when the servers basically refused client logins for about 2 hours, so we had to back down to 4.7.6 again.)
> 
> I realise testing in production is difficult, but is there any chance
> you can pin down where between 4.7.6 and 4.7.8 it broke?  There are not
> that many changes between, and while some appear authentication related
> nothing stands out. 
> 
> Also, do you run Samba as an AD DC, or are these file servers in a
> windows domain?

BTW, the main caching changes made in that set of versions are:

commit 0f2e2711e92a433abdc9436ecaf3ba9d773902c8
Author: Volker Lendecke <vl@xxxxxxxxx>
Date:   Tue Aug 8 14:24:27 2017 +0200

    winbindd: Name<->SID cache is not sequence number based anymore
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13369
    
    Signed-off-by: Volker Lendecke <vl@xxxxxxxxx>
    Reviewed-by: Ralph Boehme <slow@xxxxxxxxx>

commit a92c5dc7800a32c4dc58051c111a43b4749d0854
Author: Volker Lendecke <vl@xxxxxxxxx>
Date:   Sun Aug 6 18:13:10 2017 +0200

    winbindd: Move name<->sid cache to gencache
    
    The mapping from name to sid and vice versa has nothing to
    do with a specific domain. It is publically available. Thus put
    it into gencache without referring to the domain this was
    retrieved from
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13369
    
    Signed-off-by: Volker Lendecke <vl@xxxxxxxxx>
    Reviewed-by: Ralph Boehme <slow@xxxxxxxxx>

Perhaps reverting these gives you something to try in order to pin this down.
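
A minimal sketch of what reverting the two commits above for a test build could look like, written as a small Python helper that only builds and prints the git commands. The tag name "samba-4.7.8", the branch name and the helper names are assumptions for illustration, and the hashes have to be reachable from the tree you build; the reverts may also need manual conflict resolution.

```python
import subprocess

# The two commits quoted above, newest first, so that the later change
# is undone before the one it builds on.
CACHE_COMMITS = [
    "0f2e2711e92a433abdc9436ecaf3ba9d773902c8",  # Name<->SID cache is not sequence number based anymore
    "a92c5dc7800a32c4dc58051c111a43b4749d0854",  # Move name<->sid cache to gencache
]


def revert_commands(base_ref, branch, commits):
    """Build git invocations: branch off base_ref, then revert each commit in order."""
    cmds = [["git", "checkout", "-b", branch, base_ref]]
    cmds += [["git", "revert", "--no-edit", commit] for commit in commits]
    return cmds


def run(cmds, repo_dir):
    """Run the commands inside a Samba git checkout, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Assumed names: adjust the tag and branch to your own checkout.
    for cmd in revert_commands("samba-4.7.8", "test-revert-namesid-cache", CACHE_COMMITS):
        print(" ".join(cmd))
```

Run as a script it only prints the commands; calling run() with the path to a checkout would execute them. Building from the resulting branch and comparing login behaviour against plain 4.7.8 would show whether these commits are involved.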

Andrew Bartlett

-- 
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT   https://catalyst.net.nz/services/samba





-- 
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba