
Re: TTL expired in transit to qemu virtual machine.

Hi Mimiko,

On Fri, Mar 17, 2017 at 9:58 PM, Mimiko <vbvbrj@xxxxxxxxx> wrote:
Hello.

I've set up qemu/kvm and installed several virtual machines. Access and ping to most VMs are OK, but one has a persistent problem with not receiving packets correctly. First, this is the environment:

>uname -a
Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.84-1 x86_64 GNU/Linux

That's a really old kernel; I don't run anything virtual these days on anything older than a 3.13.x kernel.
 

>libvirtd --version
libvirtd (libvirt) 0.9.12.3

>cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto eth1
iface eth1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eth0 eth1
        bond-mode balance-alb
        bond-miimon 100
        bond-downdelay 200
        bond-updelay 200

auto br0
iface br0 inet static
        address 10.10.10.10
        netmask 255.255.0.0
        vlan-raw-device bond0
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0
        bridge_ageing 0
        bridge_maxwait 0

Hmmm, this doesn't make much sense to me, more specifically this part:

        vlan-raw-device bond0
        bridge_ports bond0

What exactly is the purpose of the VLAN? Usually, and that is how I do it, you would split the VLANs coming from the switch trunk port over the bond and attach them to separate bridges, say:

# VLAN51
auto br-vlan51
iface br-vlan51 inet manual
    vlan-raw-device bond0 
    bridge_ports bond0.51
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0

# VLAN52
auto br-vlan52
iface br-vlan52 inet manual
    vlan-raw-device bond0 
    bridge_ports bond0.52
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0

If the intention was to pass the tagged traffic through to the VMs, then the vlan-raw-device part is not needed at all.
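For that trunk-passthrough case, a minimal sketch of the bridge stanza might look like the following (br-trunk is an illustrative name, not from the original config): the bond is bridged as-is, tagged frames included, and each VM does its own VLAN tagging.

```
# Pass the whole trunk, tagged frames included, straight to the VMs;
# no vlan-raw-device line, no bond0.X sub-interface on the host.
auto br-trunk
iface br-trunk inet manual
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
```

The host itself then needs no VLAN awareness on this bridge at all.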


Virtual machines connect to the LAN using the bridge:
>virt-install .... --network=bridge:br0,model=virtio ....

All VMs have their networking configured like that. There is also an iptables entry to allow access to the VMs:

>iptables -L FORWARD -nv
Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 X    X     ACCEPT     all  --  br0    br0     0.0.0.0/0            0.0.0.0/0

Most VMs do not have networking problems, but sometimes they can't be reached. For now, only one virtual machine has this problem.
From a Windows machine, pinging the virtual machine:

>ping 10.10.10.3

Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.
Reply from 10.10.10.10: TTL expired in transit.

>tracert -d 10.10.10.3

Tracing route to 10.10.10.3 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  10.10.10.10
  2    <1 ms    <1 ms    <1 ms  10.10.10.10
  3    <1 ms    <1 ms     *     10.10.10.10
  4     1 ms    <1 ms    <1 ms  10.10.10.10
  5    <1 ms    <1 ms     *     10.10.10.10

So the packet loops between the interfaces of the server hosting the VMs.

Yep, a typical routing loop.
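For anyone wondering why a loop shows up as "TTL expired in transit": every hop that forwards an IP packet decrements its TTL, and the hop that decrements it to zero drops the packet and sends an ICMP Time Exceeded back to the sender. A toy sketch of that mechanism (simulate_ping and the 200-hop path are illustrative, not from the thread):

```python
# Hypothetical sketch of TTL behaviour in a forwarding loop.
# Windows pings start with TTL 128; in a loop the same host keeps
# forwarding the packet until TTL hits 0 and it answers with
# ICMP Time Exceeded -- which Windows prints as "TTL expired in transit".

def simulate_ping(initial_ttl, path):
    """Walk the packet along `path`; return the hop that reports
    Time Exceeded, or None if the destination is reached first."""
    ttl = initial_ttl
    for hop in path:
        ttl -= 1
        if ttl == 0:
            return hop          # this hop sends ICMP Time Exceeded
    return None                 # packet survived the whole path

# The host bounces the packet between its own interfaces indefinitely:
looping_path = ["10.10.10.10"] * 200   # longer than any sane TTL

print(simulate_ping(128, looping_path))  # → 10.10.10.10
```

That is exactly why every "Reply from" line in the ping output names the host, 10.10.10.10, rather than the VM.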
  

The VMs are various Linux flavours plus one Windows. This problem may occur on any of them.

I've observed that for this particular problematic VM, the host's ARP table maps the VM's IP to the host's own MAC address, not to the MAC configured for the virtual machine.

That's strange indeed, except if br0 is used by something else, like a libvirt network that sets up the interface for proxy ARP. What's the output of:

# brctl showmacs br0
# ip route show
# arp -n

on the host, and:

# ip link show
# ip route show
# arp -n

on the problematic VM and on one of the good VMs?

To find the loop, I would start by pinging between a good VM and the bad one (in both directions, in turn) and checking the traffic on the host interface:

# tcpdump -ennqt -i br0 \( arp or icmp \)

and on the corresponding network devices on both VMs too.
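To mechanise the ARP check described above, a small sketch that scans `arp -n` style output for entries pointing back at the host's own MAC (HOST_MAC and the sample output are illustrative, not taken from the thread):

```python
# Hypothetical helper: flag ARP entries that resolve to the host's own MAC,
# the symptom Mimiko observed for the problematic VM's IP.

HOST_MAC = "52:54:00:aa:bb:cc"   # assumed MAC of the host's br0

sample_arp_output = """\
Address        HWtype  HWaddress           Flags Mask  Iface
10.10.10.3     ether   52:54:00:aa:bb:cc   C           br0
10.10.10.4     ether   52:54:00:11:22:33   C           br0
"""

def self_mac_entries(arp_output, host_mac):
    """Return the IPs whose ARP entry points at the host's own MAC."""
    hits = []
    for line in arp_output.splitlines()[1:]:      # skip the header line
        fields = line.split()
        if len(fields) >= 3 and fields[2].lower() == host_mac.lower():
            hits.append(fields[0])
    return hits

print(self_mac_entries(sample_arp_output, HOST_MAC))  # → ['10.10.10.3']
```

Any IP this flags (other than the host's own addresses) is a candidate for the proxy-ARP / loop problem.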
 

What could be the problem?


Any sysctl settings you might have changed on the host?
 
--
Mimiko desu.