
Re: [Samba] Recurrent DNS issues after DC loss

On 05.06.2018 20:39, lingpanda101 wrote:
On 6/5/2018 2:11 PM, Ole Traupe via samba wrote:
Hi list,

I have a domain in production on two sites (subnets, via "Sites and Services"), originally with two DCs. One went down due to an HDD error (old hardware). Now, occasionally, clients can't access or find the file server (a domain member). This does not happen on all clients at the same time, however, so I am fairly sure the cause is not the file server itself but a DNS problem.

I couldn't find anything diagnostic in the logs. The default log level was not informative, I think, while the output at log level 10 was more than I could properly analyze.

Can someone recommend a log level? Should I look on the DC or on the file server?

Do I have to remove the offline DC completely from DNS and Sites and Services for this mess to stop?

I appreciate any advice.



If you haven't already removed the dead DC from your network, you should do that first.


Your clients' DNS settings may still be pointing to the offline DC, causing lookup delays. Also, did you have your DCs pointing to themselves for DNS, or to each other?

Thank you for your help!

I had trouble with fail-safe tests regarding DC redundancy a while ago. Some time after discussing it here on the list I finally got it working (had something to do with IPv6). So I can say I have tested the absence of a DC, and it did not lead to any trouble (except for a very short moment due to DNS caching, supposedly). Now it does, which is weird.

When the drive errors on the now broken DC manifested, the domain acted weirdly. When I took that DC completely offline, everything went back to normal. Now issues are showing up again. So much for the background.

The current situation is very much like in the fail-safe tests, with two exceptions: the remaining DC (FSMO role holder) is the primary DNS server on all Windows machines, and I updated the resolv.conf on that DC to only point to itself. This DC and several Windows clients got restarted after that, but issues persist.
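To double-check which name servers a machine's resolv.conf really lists, and whether the dead DC is still among them, something like the following could be used. This is only a minimal sketch in Python (standard library only); the addresses 192.0.2.10 and 192.0.2.11 are hypothetical placeholders, not values from this setup, and the helper names are made up for illustration.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses found in resolv.conf-style text, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        # Drop comments introduced by '#' or ';' and surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def stale_nameservers(resolv_conf_text, dead_ips):
    """Return the listed nameservers that belong to servers known to be offline."""
    dead = set(dead_ips)
    return [ip for ip in nameservers(resolv_conf_text) if ip in dead]


# Hypothetical example content: 192.0.2.10 stands for the surviving DC,
# 192.0.2.11 for the dead one.
SAMPLE = """\
# written by hand
search example.com
nameserver 192.0.2.11
nameserver 192.0.2.10
"""

if __name__ == "__main__":
    # On a real machine, read the file instead of using SAMPLE:
    #   text = open("/etc/resolv.conf").read()
    print(nameservers(SAMPLE))
    print(stale_nameservers(SAMPLE, ["192.0.2.11"]))
```

If the second call returns anything, that machine still has the offline DC configured as a resolver, and since the resolver tries name servers in the order listed, a dead first entry would plausibly explain intermittent lookup delays.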

Actually, the DCs (resolv.conf) were pointing to each other initially, and I think that was at least one root of the evil. I think this advice in the Samba wiki is actually rather bad (and unnecessary with Samba, as has been pointed out before?).

Regarding demoting the dead DC: my Samba version is rather old (4.2.5). The problem is that I chose the uid/gid scopes unwisely, and I read in some patch notes that I can't update anymore because newer versions of Samba require those scopes to be set in a very specific way. So perhaps demoting via the newly available method is not an option here.

What I can think of is:
- removing the dead DC from the clients' DNS config, of course
- removing it from AD DNS
- removing it from AD Sites and Services
- and removing it from AD Users and Computers
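
For the AD DNS part of that list, the leftovers are mostly SRV records that still name the dead DC as a target. The sketch below (Python, standard library only) just builds the usual AD locator SRV names for a domain and its sites and filters lookup results for a given host. It does no DNS queries itself; the lookup results would have to be collected separately, for example with `host -t SRV <name>`. The domain `example.com`, the site `SiteB` and the host names are hypothetical, and the list of record names is not claimed to be complete.

```python
def dc_srv_names(domain, sites=()):
    """Build common AD locator SRV record names for a domain and optional sites."""
    base = [
        "_ldap._tcp.{d}",
        "_kerberos._tcp.{d}",
        "_kerberos._udp.{d}",
        "_kpasswd._tcp.{d}",
        "_ldap._tcp.dc._msdcs.{d}",
        "_kerberos._tcp.dc._msdcs.{d}",
    ]
    per_site = [
        "_ldap._tcp.{s}._sites.{d}",
        "_kerberos._tcp.{s}._sites.{d}",
        "_ldap._tcp.{s}._sites.dc._msdcs.{d}",
        "_kerberos._tcp.{s}._sites.dc._msdcs.{d}",
    ]
    names = [t.format(d=domain) for t in base]
    for site in sites:
        names += [t.format(s=site, d=domain) for t in per_site]
    return names


def stale_records(srv_targets, dead_host):
    """Return the SRV names whose target list still contains dead_host.

    srv_targets maps an SRV record name to the list of target host names
    that a lookup returned for it (supplied by the caller).
    """
    dead = dead_host.rstrip(".").lower()
    return sorted(
        name
        for name, targets in srv_targets.items()
        if any(t.rstrip(".").lower() == dead for t in targets)
    )


if __name__ == "__main__":
    for name in dc_srv_names("example.com", sites=["SiteB"]):
        print(name)
    # Hypothetical lookup results, as they might be collected by hand:
    results = {
        "_ldap._tcp.dc._msdcs.example.com": ["dc1.example.com.", "dc2.example.com."],
        "_kerberos._tcp.example.com": ["dc1.example.com."],
    }
    print(stale_records(results, "dc2.example.com"))
```

Any name reported by `stale_records` would be a candidate for manual cleanup in the DNS zone.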

What else does the Samba script for demoting a DC do? Can I do that manually, too? I repeat: it was not the FSMO role holder.

Thanks again for any advice!
