Re: [Samba] Replication Failure Issue
- Date: Tue, 27 Mar 2018 10:00:06 +1100
- From: David Minard via samba <samba@xxxxxxxxxxxxxxx>
- Subject: Re: [Samba] Replication Failure Issue
On 26/03/18 23:13, lingpanda101 wrote:
On 3/25/2018 8:54 PM, David Minard wrote:
Before you try anything further I would suggest you make a good backup
of your current DC not exhibiting any replication issues.
On 24/03/18 01:35, lingpanda101 wrote:
On 3/22/2018 8:06 PM, David Minard wrote:
I myself would hold off updating until you correct the DCs with the
issues. Anything in the Samba logs or yum history stand out? You can
try and force replication 'samba-tool drs replicate --full-sync' from
FirstDC to SecondDC.
Will reply to all messages so far in this one to keep it all
On 21/03/18 22:52, lingpanda101 wrote:
On 3/21/2018 7:32 AM, David Minard via samba wrote:
The thing is, that I did not upgrade the version of Samba - that
is the next step, so the ports used would not have changed. I only
updated the OS.
On 21/03/2018, at 10:04 PM, Carlos Alberto Panozzo Cunha
I had the same problem after updating Samba.
I allowed the new ports in the firewall.
On Wed, Mar 21, 2018, 00:15 David Minard via samba
I have 4 DCs on Centos 7.1. Everything was working really well for
years, including replication.
Then I decided that the OS needed updating. Did the yum update on one
of the DCs, rebooted. That server is now running Centos 7.4. Samba
seemed to start okay.
However, samba-tool drs showrepl gives this error on all 3 of the other
DCs, and shows success on the updated DC.
Default-First-Site-Name\SAMBA4-10 via RPC
DSA object GUID:
Last attempt @ Wed Mar 21 12:58:13 2018 AEDT
failed, result 58
10623 consecutive failure(s).
Last success @ Thu Mar 8 14:34:14 2018 AEDT
Any thoughts on why this DC is now not replicating, and any
thoughts on how to remedy this?
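With four DCs, reading the showrepl output on each one by eye gets tedious. As a rough sketch (not something from the original thread), the Python below pulls the failing links out of text shaped like the excerpt above. The function name `failing_partners` and the regular expressions are assumptions based only on the lines quoted here; other Samba versions may format the output differently.

```python
import re

# Sample text copied from the showrepl excerpt quoted in this message.
SAMPLE = """\
Default-First-Site-Name\\SAMBA4-10 via RPC
DSA object GUID:
Last attempt @ Wed Mar 21 12:58:13 2018 AEDT
failed, result 58
10623 consecutive failure(s).
Last success @ Thu Mar 8 14:34:14 2018 AEDT
"""


def failing_partners(showrepl_text):
    """Return (partner, result_code, consecutive_failures) per failed link."""
    failures = []
    partner = None
    result = None
    for raw in showrepl_text.splitlines():
        line = raw.strip()
        # A new replication partner block starts with "<site>\<DC> via RPC".
        m = re.match(r"(\S+) via RPC$", line)
        if m:
            partner, result = m.group(1), None
            continue
        # The failure code may sit on its own line or after "Last attempt".
        m = re.search(r"failed, result (\d+)", line)
        if m:
            result = int(m.group(1))
            continue
        # Only record the count when this partner actually reported a failure.
        m = re.match(r"(\d+) consecutive failure\(s\)", line)
        if m and partner is not None and result is not None:
            failures.append((partner, result, int(m.group(1))))
    return failures
```

To use it against a live DC, the text could come from something like `subprocess.run(["samba-tool", "drs", "showrepl"], capture_output=True, text=True).stdout`, run on each DC in turn.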
You most likely will need to turn up the samba log level to get
additional information but you can start with running 'yum history
list all' and post results. This might help identify the changes
that were made to the OS. Are you using bind or the internal DNS?
I will turn up the logs and test it out.
I use Bind-9.9.4-51 (before update 9.9.4-18)
yum history shows 348 packages that got updated... Bind being one.
Will sift through them.
My firewall is very loose. All ports are open for the subnets on
which the samba servers need to talk. eg:
-A INPUT -s 172.20.0.0/16 -p tcp -m state --state NEW -m tcp -j ACCEPT
-A INPUT -s 172.20.0.0/16 -p udp -m state --state NEW -m udp -j ACCEPT
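With rules this permissive the firewall is an unlikely culprit, but a quick reachability check from one DC to another can rule it out. The following is a minimal sketch, not a definitive test: the port list is an assumption recalled from the Samba documentation and should be checked against it, and the names `AD_DC_TCP_PORTS` and `check_tcp_ports` are made up for illustration. It tests TCP only, and it does not cover the dynamic RPC port range, which depends on the Samba version.

```python
import socket

# Assumed TCP ports for an AD DC; verify against the Samba documentation.
AD_DC_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    139: "NetBIOS session",
    389: "LDAP",
    445: "SMB",
    464: "Kerberos kpasswd",
    636: "LDAPS",
    3268: "Global Catalog",
    3269: "Global Catalog SSL",
}


def check_tcp_ports(host, ports, timeout=2.0):
    """Return {port: bool}; True means a TCP connection could be opened."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling `check_tcp_ports("<other DC hostname>", AD_DC_TCP_PORTS)` from each DC gives a per-port view. A port that is closed from one direction only would point at the firewall rather than at Samba.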
When I first set this up with 4.0.0-a2 (or whatever it was right at
the beginning), I was not able to work out what ports exactly were
needed, hence the loose rules. Now I see they are documented clearly
on the Samba site, I will tighten them up, but not until the issue
My samba is compiled from source. I am currently running 4.3.2. It's
been running flawlessly so no urgency to update, until the huge
security hole was announced the other week. Now I've got to get it
done, but want the ailing server going right first - or should I
just do the updates and then worry about the ailing server?
# Global parameters
workgroup = SCEM_AD
realm = samba4.scem.westernsydney.edu.au
netbios name = SAMBA4-10
server role = active directory domain controller
server services = s3fs, rpc, nbt, wrepl, ldap, cldap, kdc,
drepl, winbindd, ntp_signd, kcc, dnsupdate
# log level = 1 auth:2
# logs split per machine
log file = /var/log/samba/log.%m
# max 50KB per log file, then rotate
max log size = 0
read only = No
path = /usr/local/samba/var/locks/sysvol
read only = No
It is the out of the box config from the original provision.
The first thing I tried, was the forced replication on NC that was
# samba-tool drs replicate Broken-DC Working-DC
Replicate from Working-DC to Broken-DC was successful.
Then doing the showrepl on all DCs, everything seemed fine.
I held off sending this message for a couple of hours, and things are
now showing up as broken again. I now have two DCs with the same
issue, because I accidentally got the direction of the sync wrong. I
went source destination, rather than destination source. I should read
the help a bit better!
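Since `samba-tool drs replicate` takes the destination DC first and the source DC second, one way to avoid swapping them is to build the command from named arguments. This is only a sketch under assumptions: `realm_to_base_dn` and `build_replicate_cmd` are hypothetical helper names, and the naming context shown is just the domain partition derived from the realm in the smb.conf above; the other partitions would need their own runs.

```python
def realm_to_base_dn(realm):
    """Turn a DNS realm such as 'a.b.c' into 'DC=a,DC=b,DC=c'."""
    return ",".join("DC=" + part for part in realm.split("."))


def build_replicate_cmd(*, destination, source, naming_context, full_sync=False):
    """Build argv for: samba-tool drs replicate <destDC> <sourceDC> <NC>.

    Keyword-only arguments make the direction explicit at the call site.
    """
    cmd = ["samba-tool", "drs", "replicate", destination, source, naming_context]
    if full_sync:
        cmd.append("--full-sync")
    return cmd


# Example: pull from the working DC into the broken one.
example_cmd = build_replicate_cmd(
    destination="Broken-DC",
    source="Working-DC",
    naming_context=realm_to_base_dn("samba4.scem.westernsydney.edu.au"),
    full_sync=True,
)
```

The resulting list can be passed to `subprocess.run(example_cmd, check=True)` on a DC, or simply printed and reviewed before running it by hand.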
Anyway, this shows that manual replication seems successful, and that
it might not be a firewall thing, as the second DC that now has the
issue has not been updated in any way, shape, or form.
Now the strangest thing is that the two broken-DCs report that
everything is fine between them when I showrepl. From the working DCs,
the two broken DCs show up as failing.
Right oh. Will get onto that.
Have you tried correcting the force replication with a known good DC?
Yes. The replication says it is successful. When I showrepl from the
good DC, the issue shows up again.
If I do a showrepl on one of the bad DCs, it shows all DCs to be okay.
You can try to further troubleshoot the issues and attempt to resolve,
but the easiest thing IMO would be to join new DCs to the domain.
Remove the other two DCs from the domain and never join them again.
I will look into that. We have one site with no DC, so this would be a
good opportunity to introduce one. If that one is okay, then I can drop
the broken ones and set up new ones as you suggest.