Live migration of container running BIND fails

Discussion in 'General Questions' started by SteveITS, Feb 12, 2016.

  1. SteveITS

    SteveITS Mega Poster

    Messages:
    218
    I tried live migrating a PPA DNS service node container today via PVA and got:

    Operation with the Container "ns10.teamits.net" is finished with errors: Can not migrate: exec vzmsrc failed [13312] :
    locking 7127125
    Shared disk detected /vz/private/7127125/root.hdd
    Connection to destination node (hn3.teamits.net) is successfully established
    Moving/copying CT#7127125 -> CT#7127125, [], [] ...
    Checking bindmounts
    Check cluster ID
    Source and target CT private resides on the same shared partition
    Checking license restrictions
    Check of requires kernel modules
    Checking CPT image version for online migration
    Checking capabilities for online migration
    Checking technologies
    Checking templates for CT
    copy CT private /vz/private/7127125
    vzctl : Can not suspend Container: Invalid argument
    vzctl : Error: unsupported deleted submount: (deleted)/var/named/run-root/etc/named.rfc1912.zones
    vzctl : Failed to checkpoint the Container
    /usr/sbin/vzctl exited with code 16
    can not suspend CT#7127125 : /usr/sbin/vzctl exited with code 16
    Can't move/copy CT#7127125 -> CT#7127125, [], [] : can not suspend CT#7127125 : /usr/sbin/vzctl exited with code 16
    Check target CT name: ns10.teamits.net
    Checking RATE parameters in config
    CT is shared and both nodes are in HA cluster.
    Checking ploop format 2
    OfflineManagement CT#7127125 ... done
    Suspending CT#7127125 ...
    OfflineManagement CT#7127125 ... done .

    Since we have redundant name servers, I just shut it down, migrated it successfully, and started it again, but I thought this was strange.

    The container is running CentOS 6.7 with all updates via yum. The storage is in our Virtuozzo cluster.
     
  2. Pavel

    Pavel A.I. Auto-Responder Odin Team

    Messages:
    432
    Hello Steve,

    From what I can see, the migration fails at the checkpoint stage because there are bind mounts pointing to deleted directories.
    This happens because of the "bind" package update process: it deletes directories without unmounting them. To be fair, it does try to unmount them, but it expects the "(deleted)" mark to be appended to the path, whereas vzkernel prepends it, which breaks the update scriptlets in the bind RPM.
    The issue with appended/prepended "(deleted)" marks has already been reported to our development team and is being tracked under ID #PSBM-25896.
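
    To illustrate the mismatch, here is a minimal sketch that removes the mark in either position. It assumes the prepended form looks exactly like the path in the vzctl error above ("(deleted)" directly in front of the path) and that the appended form is "\040(deleted)", as the space is escaped in /proc/mounts. The helper name strip_deleted_mark is hypothetical and not part of bind or Virtuozzo.

    ```shell
    # Hypothetical helper: normalise a mount point taken from /proc/mounts
    # by removing a "(deleted)" mark, whether prepended or appended.
    strip_deleted_mark() {
        printf '%s\n' "$1" | sed -e 's~^(deleted)~~' -e 's~\\040(deleted)$~~'
    }

    # Appended form (what the bind scriptlets are said to expect):
    strip_deleted_mark '/var/named/run-root/etc/named.rfc1912.zones\040(deleted)'
    # Prepended form (as shown in the vzctl error):
    strip_deleted_mark '(deleted)/var/named/run-root/etc/named.rfc1912.zones'
    ```

    Both calls should print the same clean path, which is the step the scriptlets miss when they only look for the appended form.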

    Until the issue is permanently fixed in the code, you can use this one-liner to unmount all deleted bind mounts (run it inside the container):
    Code:
    # grep "(deleted)" /proc/mounts | awk '{print $2}' | sed 's~\\040(deleted)~~g' | while read -r mp; do umount "$mp"; done
    # service named restart
    
    Once it has been executed, the migration should succeed.
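
    To confirm the cleanup worked before retrying the migration, something like the following could be run inside the container. It is only a sketch: list_deleted_mounts is a hypothetical helper name, and it assumes the "(deleted)" mark shows up in the second (mount point) field of /proc/mounts.

    ```shell
    # Hypothetical helper: print the mount-point field of every entry that
    # carries a "(deleted)" mark. The optional argument lets the function
    # read a saved copy instead of the live /proc/mounts.
    list_deleted_mounts() {
        awk 'index($2, "(deleted)") { print $2 }' "${1:-/proc/mounts}"
    }

    # No output from the helper means no marked mounts were found.
    if [ -z "$(list_deleted_mounts)" ]; then
        echo "no deleted mounts found"
    else
        echo "deleted mounts still present:"
        list_deleted_mounts
    fi
    ```

    If any entries are still listed, the checkpoint would presumably fail again with the same "unsupported deleted submount" error.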
     
