[Samba] performance problem on bridgehead DC




Hi everybody!

I am encountering a performance problem on my bridgehead DC.
I have 19 DCs (Debian Stretch / Samba 4.6.7 from the Tranquil.it repo), and they all synchronize with a main bridgehead DC.

The problem first appeared while the bridgehead DC was still on Debian Jessie, after the kernel was updated with Debian's Meltdown/Spectre patches (3.16.51-3+deb8u1).
I added the "nopti" option to the GRUB boot parameters, which resolved the problem.
Last night I upgraded this bridgehead DC from Jessie to Stretch (the Samba 4.6.7 package from Tranquil.it is the same for Jessie and Stretch). I kept the "nopti" option, but the problem is back.

Two processes are eating a lot of CPU, samba-tool drs showrepl takes minutes, the system is slow, and the load average is constantly at 2:

dc000:~# htop
  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
28923 root       20   0  641M 42604 19064 S 39.6  2.1  1h26:53 /usr/sbin/samba
28930 root       20   0  641M 36340 13136 R 54.4  1.8 45:00.58 /usr/sbin/samba

dc000:~# samba-tool processes | egrep "(28923|28930)"
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
rpc_server             28923
dreplsrv               28930

dc000:~# strace -p 28923 -f
strace: Process 28923 attached
strace: [ Process PID=28923 runs in x32 mode. ]
strace: [ Process PID=28923 runs in 64 bit mode. ]
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=332, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=368, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=5073496, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=368, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=368, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=5073496, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=368, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=376, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=5125448, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=376, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=376, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=5125448, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=376, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=380, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=1545572, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=380, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=380, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=1545572, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=2265976, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=380, l_len=1}) = 0
fcntl(15, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=380, l_len=1}) = 0
.......
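
To put numbers on the locking pattern in the trace above, here is a minimal sketch that counts fcntl byte-range lock calls per lock type. The function name summarize_fcntl_locks is hypothetical (it is not part of Samba or strace), and it assumes the strace line format shown above.

```shell
# Hypothetical helper: count fcntl byte-range lock calls per l_type
# from strace output read on stdin. Assumes lines shaped like the trace above.
summarize_fcntl_locks() {
    awk '
        /fcntl\(/ && match($0, /l_type=F_[A-Z]+/) {
            # Skip the 7-character "l_type=" prefix to keep only F_RDLCK, F_UNLCK, ...
            count[substr($0, RSTART + 7, RLENGTH - 7)]++
        }
        END {
            for (t in count) print t, count[t]
        }
    ' | sort
}
```

One possible way to use it is to capture a fixed window to a file first, for example with "timeout 10 strace -f -p 28923 -e trace=fcntl -o /tmp/fcntl.trace", and then run "summarize_fcntl_locks < /tmp/fcntl.trace". The 10-second window and the output path are arbitrary choices.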

dc000:~# egrep "(28923|28930)" /proc/locks
16: POSIX  ADVISORY  READ  28923 ca:01:132971 168 EOF
17: POSIX  ADVISORY  WRITE 28923 ca:01:132971 8 8
18: POSIX  ADVISORY  READ  28923 ca:01:132978 168 EOF
19: POSIX  ADVISORY  WRITE 28923 ca:01:132978 8 8
20: POSIX  ADVISORY  READ  28923 ca:01:132976 168 EOF
21: POSIX  ADVISORY  WRITE 28923 ca:01:132976 8 8
22: POSIX  ADVISORY  READ  28923 ca:01:132970 168 EOF
23: POSIX  ADVISORY  WRITE 28923 ca:01:132970 8 8
24: POSIX  ADVISORY  READ  28923 ca:01:132968 168 EOF
25: POSIX  ADVISORY  WRITE 28923 ca:01:132968 8 8
26: POSIX  ADVISORY  READ  28923 ca:01:132940 168 EOF
27: POSIX  ADVISORY  WRITE 28923 ca:01:132940 8 8
28: POSIX  ADVISORY  READ  28923 ca:01:132930 168 EOF
29: POSIX  ADVISORY  WRITE 28923 ca:01:132930 8 8
31: POSIX  ADVISORY  WRITE 28923 00:12:387271 0 EOF
54: POSIX  ADVISORY  WRITE 28930 00:12:389224 0 EOF
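
The sixth column of /proc/locks is MAJOR:MINOR:INODE, so the locked files can be identified by inode. The following is a rough sketch that extracts the inodes for one PID. The function name locked_inodes is hypothetical, and it assumes the eight-column layout shown above (lines for blocked waiters, which carry an extra "->" field, are not handled).

```shell
# Hypothetical helper: for one PID, print "<mode> <inode> <start> <end>"
# for every lock listed in /proc/locks-style input on stdin.
# Assumes the eight-column layout shown above; "->" waiter lines are ignored.
locked_inodes() {
    awk -v pid="$1" '
        $5 == pid {
            n = split($6, dev, ":")   # field 6 is MAJOR:MINOR:INODE
            print $4, dev[n], $7, $8
        }
    '
}
```

It can be run as "locked_inodes 28923 < /proc/locks", and the inode column can then be compared against the output of "ls -li /var/lib/samba/private/sam.ldb.d/" to see which database files the locks belong to. Inode numbers are only unique within one filesystem, so the device part still matters when several filesystems are involved.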

dc000:~# ls -la /proc/28923/fd/15
lrwx------ 1 root root 64 mars  16 09:44 /proc/28923/fd/15 -> /var/lib/samba/private/sam.ldb.d/DC=DOMAINDNSZONES,DC=PR,DC=EDUCATIONETFORMATION,DC=FR.ldb

I also see many accesses on:
dc000:~# ls -la /proc/28930/fd/14
lrwx------ 1 root root 64 mars  16 09:44 /proc/28930/fd/14 -> /var/lib/samba/private/sam.ldb.d/DC=PR,DC=EDUCATIONETFORMATION,DC=FR.ldb
dc000:~# ls -la /proc/28930/fd/15
lrwx------ 1 root root 64 mars  16 09:44 /proc/28930/fd/15 -> /var/lib/samba/private/sam.ldb.d/DC=DOMAINDNSZONES,DC=PR,DC=EDUCATIONETFORMATION,DC=FR.ldb
dc000:~# ls -la /proc/28930/fd/16
lrwx------ 1 root root 64 mars  16 09:44 /proc/28930/fd/16 -> /var/lib/samba/private/sam.ldb.d/DC=FORESTDNSZONES,DC=PR,DC=EDUCATIONETFORMATION,DC=FR.ldb

dbcheck is fine on all DCs (10,000 objects).
All DCs are synced from the bridgehead.
The bridgehead DC runs with 8 virtual CPUs and 2 GB of RAM.


Any ideas are welcome :)
Thanks