ZFS: from array to array

Posted on Thu 05 April 2018 in Administration • [5 min read]

I own a few MicroServer Gen8 machines built by HP.

Each of them is a beautiful and capable machine. It has a compact and modular design with easy access to the basic components. Its hardware configuration is also nice and sufficient for a small office. It is definitely a better choice than a NAS, as it offers more flexibility in terms of software and operating system.

In one of the offices that uses a MicroServer, the disk array needed to grow.

The server itself has a built-in RAID controller, but in reality it is a software array which works only with the operating system built in Redmond. On the other hand, it's much cheaper than a server with a real hardware controller.

Frankly, a hardware controller can cost the very same amount of money as the whole MicroServer, so it's really smart to use a ZFS mirror instead.

I prefer Linux, so my choice is Proxmox, a Debian-based virtualisation environment with KVM and LXC (Linux Containers). I defined an array in the so-called "hardware" RAID, but only for booting purposes. Proxmox ignores this array, and I configured a mirror in the ZFS filesystem instead.

What I needed was to replace the disks with bigger ones, which is very easy in ZFS.

In brief:

I need to put new disks into the empty slots (there are 4 HDD bays, which is another advantage of the MicroServer), create a new mirror, copy all the data from the old mirror to the new one and make the new mirror bootable.

The original mirror is rpool and consists of two drives: /dev/sda and /dev/sdb.

I can list it with:

zpool status -v
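On a healthy mirror the output looks roughly like this (an illustrative sketch trimmed to the interesting part; Proxmox puts the pool on the second partition of each drive, hence sda2 and sdb2):

```
  pool: rpool
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
```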

The new drives are /dev/sdc and /dev/sdd.

With the gdisk tool I created partitions on the new drives based on the old layout:

Number  Start (sector)  End (sector)        Size  Code  Name
     1            2048          4061  1007.0 KiB  EF02  BIOS boot partition
     2            4096    5860500366     2.7 TiB  BF01  Solaris /usr & Mac ZFS
     9      5860501504    5860533134    15.4 MiB  BF07  Solaris Reserved 1

As you can see, the first partition is bootable and the second partition holds all the data.
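If you don't want to retype the partition table by hand, sgdisk (from the same gdisk package) can clone it; this is a sketch assuming, as above, that /dev/sda and /dev/sdb are the old drives and /dev/sdc and /dev/sdd the new ones:

```shell
# clone the GPT layout of each old drive onto a new one, then give the
# copy fresh GUIDs so both disks can coexist in the same system
sgdisk --replicate=/dev/sdc /dev/sda
sgdisk --randomize-guids /dev/sdc
sgdisk --replicate=/dev/sdd /dev/sdb
sgdisk --randomize-guids /dev/sdd
```

Note the slightly surprising argument order: the disk given to --replicate is the destination, and the positional argument is the source.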

The new pool is created with a shell script which I built based on the output of the following command:

zpool history rpool

cat ~/create_rpool2.sh

#!/bin/sh
# create the new pool based on the old pool's history
zpool create -f -o ashift=12 -o cachefile=none rpool2 mirror /dev/sdc2 /dev/sdd2
zfs create rpool2/ROOT
zfs create rpool2/ROOT/pve-1
zfs set atime=off rpool2
zfs set compression=lz4 rpool2
zfs create -V 4194304K -b 4K rpool2/swap
zfs set com.sun:auto-snapshot=false rpool2/swap
zfs set sync=always rpool2/swap
# the next two lines are replayed verbatim from the old pool's history
zfs set sync=disabled rpool2
zfs set sync=standard rpool2

Take care with the name of the new pool, which is rpool2 in my configuration.

What I will do next is take a snapshot of rpool. A snapshot is a frozen state of the filesystem. The -r option takes a recursive snapshot of all filesystems defined in the pool.

zfs snapshot -r rpool@moving

Let's check snapshots:

zfs list -t snapshot

Now I send the data from the snapshot to the new pool rpool2:

zfs send -R rpool@moving | zfs receive -F rpool2

If you want to see how fast the data is being transferred, you can pipe the stream through Pipe Viewer (pv). Install it with apt-get install pv and run the following command:

zfs send -R rpool@moving | pv | zfs receive -F rpool2

Check it again:

zfs list -t snapshot

For the next step I prepared a shell script that stops as many services as possible before taking another snapshot. This way I tried to avoid bigger changes in the filesystem. Then I will send the new snapshot as an increment, which will be very fast compared to the first transfer of data.

Here is the script:

$ cat ~/stop-services.sh
#!/bin/sh
# stop services before sending snapshot
systemctl stop watchdog-mux
systemctl stop systemd-timesyncd
systemctl stop spiceproxy
systemctl stop rrdcached
systemctl stop rpcbind
systemctl stop pvestatd
systemctl stop pveproxy
systemctl stop pvefw-logger
systemctl stop pvedaemon
systemctl stop pve-ha-lrm
systemctl stop pve-ha-crm
systemctl stop pve-firewall
systemctl stop pve-cluster
systemctl stop postfix
systemctl stop nfs-common
systemctl stop lxcfs
systemctl stop dbus
systemctl stop cron
systemctl stop cgmanager
systemctl stop open-iscsi
systemctl stop atd
systemctl stop ksmtuned
systemctl stop rsyslog
systemctl list-units --type=service --state=running

I ran the script:

./stop-services.sh

and took another snapshot:

zfs snapshot -r rpool@moving2

Then I sent the new snapshot incrementally:

zfs send -Ri rpool@moving rpool@moving2 | zfs receive -F rpool2

I set a new mountpoint for the root filesystem and marked it as the boot filesystem:

zfs set mountpoint=/ rpool2/ROOT/pve-1
zpool set bootfs=rpool2/ROOT/pve-1 rpool2

I added a new entry to the file /etc/grub.d/40_custom (the content of this file is taken into consideration when I run the update-grub script):

menuentry 'Proxmox NEW' --class proxmox --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-32125e6ecced17a2' {
    load_video
    insmod gzio
    insmod part_gpt
    insmod zfs
    set root='hd1,gpt2'
    if [ x$feature_platform_search_hint = xy ]; then
        search --no-floppy --fs-uuid --set=root --hint-bios=hd1,gpt2 --hint-efi=hd1,gpt2 --hint-baremetal=ahci0,gpt2 32125e6ecced17a2
    else
        search --no-floppy --fs-uuid --set=root 32125e6ecced17a2
    fi
    echo 'Loading Linux 4.2.2-1-pve ...'
    linux /ROOT/pve-1@/boot/vmlinuz-4.2.2-1-pve root=ZFS=rpool2/ROOT/pve-1 ro boot=zfs quiet
    echo 'Loading initial ramdisk ...'
    initrd /ROOT/pve-1@/boot/initrd.img-4.2.2-1-pve
}

Note: I changed hd0 to hd1 (until I remove the old disks).

Note: I changed pool name from rpool to the new one: rpool2.

Next, in /etc/default/grub changed the following line:

GRUB_DEFAULT=0

This number tells Grub to boot the first entry defined in /boot/grub/grub.cfg.

I changed it to:

GRUB_DEFAULT=6

as my new entry will soon be at position 6 (counting from 0) in the new /boot/grub/grub.cfg.
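Counting the entries by hand is error-prone; a small helper function (my own naming, not part of any tool) can print each menuentry of grub.cfg together with the zero-based index that GRUB_DEFAULT expects:

```shell
# print each Grub menu entry with its zero-based index,
# i.e. the number GRUB_DEFAULT expects
list_grub_entries() {
    awk -F"'" '/^menuentry /{print n++ ": " $2}' "$1"
}
# usage: list_grub_entries /boot/grub/grub.cfg
```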

I wrote the new Grub config to disk:

update-grub

I crossed my fingers and rebooted.

I checked whether rpool2 was mounted as root:

df -h
mount
gdisk -l /dev/sdx

etc.

Everything was OK, so I installed Grub on the first drive of rpool2, in my case /dev/sdc:

grub-install /dev/sdc
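Since rpool2 is a mirror, it may also be worth putting the bootloader on the second new drive, so the machine can still boot if /dev/sdc ever dies (this assumes /dev/sdd is the other half of the mirror, as above):

```shell
# install Grub on both halves of the new mirror
grub-install /dev/sdc
grub-install /dev/sdd
```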

Reboot again.

The system still boots from the old drive but mounts the new drive as root.

What I did next was revert almost all the changes to the Grub configuration, remembering that the new ZFS pool is now rpool2, not rpool. So I removed the entry from /etc/grub.d/40_custom.

Next, in /etc/default/grub I changed the following line back from:

GRUB_DEFAULT=6

to:

GRUB_DEFAULT=0

And I ran again:

update-grub

I listed the newly created /boot/grub/grub.cfg to check that all rpool entries had been changed to rpool2.

Last reboot and I'm done.

I removed old drives from HDD bays.

P.S.

If boot messages show:

Job dev-zvol-rpool-swap.device/start timed out.

you need to change the swap entry in /etc/fstab from:

/dev/zvol/rpool/swap none swap sw 0 0

to:

/dev/zvol/rpool2/swap none swap sw 0 0

Other problems could look like this:

cannot import 'rpool': one or more devices is currently unavailable
zfs-import-cache.service: main process exited, code=exited, status=1/FAILURE
Failed to start Import ZFS pools by cache file.
Unit zfs-import-cache.service entered failed state.

Generate a new cache with:

zpool set cachefile=/etc/zfs/zpool.cache rpool2

Reboot and check with:

journalctl -b

You shouldn't see any problems.