FreeBSD Pure ZFS and GELI migration to UEFI

For the past several years, I have been using an ASUS X53E laptop as my primary machine. It was a great machine: it had an ath(4) wireless network card and supported up to 8GB RAM, which mattered because I wanted to use a full zfs(8) setup. In short, it had everything that I considered a requirement.

Unfortunately, the back of the LCD panel cracked recently, making the screen increasingly difficult to work with, because the crack was in a location that made full-size terminals impossible to use.

As a result of the screen crack and the age of the X53E, I decided it was finally time to upgrade my main system and get a new laptop. I ended up purchasing an ASUS X550LA; however, see the disclaimers at the end.

Most modern systems use EFI in place of a traditional BIOS. This non-blog entry covers how I converted my '/ on zfs(8)' system to an EFI-bootable system, without losing any data. (I should note, however, that the way my laptop is configured is very non-traditional, so I do not expect that most people will be as lucky as I was in such a conversion.)

The Previous Laptop Configuration

When I say that my laptop configuration is non-traditional, I cannot express this enough. I use zfs(8) exclusively. Additionally, my hard drives are encrypted, using geli(8) as the backend geom(8) provider.

Since I do not use a separate device (such as a removable USB flash drive) to provide the kernel to the system, the /boot partition needs to remain unencrypted, since that is where the kernel lives. In order to encrypt the entire root filesystem, but still allow the system to boot from the attached drives, /boot was created as an entirely separate zfs(8) dataset.

Several years ago, when doing the initial installation on the first laptop this hard drive lived in, I created a separate zpool(8) on its own GPT partition for just this case. The partitions looked like this:

  # gpart show ada0
  =>       34  468862061  ada0  GPT  (224G)
           34          6        - free -  (3.0K)
           40       1024     1  freebsd-boot  (512K)
         1064        984        - free -  (492K)
         2048   20971520     2  freebsd-zfs  (10G)
     20973568   20971520     3  freebsd-swap  (10G)
     41945088  377487360     4  freebsd-zfs  (180G)
    419432448   49429647        - free -  (24G)
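
As a rough sanity check on the sizes gpart(8) reports, the sector counts can be converted by hand. This is only a sketch: it assumes 512-byte sectors (which is consistent with the 224G total shown above), and the function name sectors_to_mib is made up for illustration.

```shell
#!/bin/sh
# Convert a sector count to MiB.
# Assumption: 512-byte sectors; `diskinfo -v ada0` would confirm this on real hardware.
sectors_to_mib() {
    echo $(( $1 * 512 / 1048576 ))
}

sectors_to_mib 20971520     # partition 2, the /boot pool: 10240 MiB (10G)
sectors_to_mib 377487360    # partition 4, the GELI-backed root pool: 184320 MiB (180G)
```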
  

10GB may seem overkill for a /boot partition, but I have learned by now that future-proofing is a requirement, not just good practice. Meaning, what if (even by accident) the FreeBSD kernel exceeds 10GB? I would have to repartition my drives, reinstall, and restore everything from backup. And that does not sound like a "fun" weekend activity to me.

So, do it once, and do it right.

My system had two zpool(8) pools: zboot0, which is where the /boot bits lived, and zroot0, which is where the rest of the data under / lived.

When the system boots, /boot/loader is readable by the BIOS (because /boot is not encrypted), the kernel is loaded, the system boots (during which, I am prompted for the passphrase to decrypt /dev/ada0p4 and /dev/ada1p4 — my geli(8) providers), and my system is up and running.

So that /boot remained available to the running system (for example, during upgrades that need to update files in the /boot directory), zboot0 was mounted at /realboot, and /boot pointed into it. Mounting zboot0 there avoided clobbering the / filesystem when it is mounted.

The resulting hierarchy looked something like this:

  /
  /bin
  /boot -> /realboot/boot
  /usr
  [...]
  

Now, I have a fully-encrypted (except /boot) system, and I am happy.

That is, of course, until I get the new laptop to replace the one with the cracked screen...

Why is any of this relevant, especially when it is regarding the old laptop? Well, the way I set up the original partitions allowed me to do very evil things to convert the drive in a way that would allow EFI-booting.

Thanks to the FreeBSD Foundation, much work has gone into supporting UEFI for modern hardware.

The problem for me, or so I thought, was that I did not want to partition the drives, reinstall, and restore from backup. It is too time-consuming, especially for what should be a simple hard drive swap in a new laptop.

Fortunately, having a completely separate /boot from the start saved me from reinstalling (or from doing zfs send and zfs recv from the mirrored drive).

Converting the First Drive

(Please note, the commands listed below may have been a bit different in reality, since I forgot to save the command history before rebooting.)

I intentionally only installed one of the two drives in the new laptop to be absolutely certain that I could do a local restore from the second drive if anything bad (or stupid on my part) happened, because the parts of the drive I was going to be messing with were in the center of the drive, not at the start or at the end.

First, I downloaded the latest memstick.img snapshot, which already supports UEFI. So, at least if I can get that image to boot, then the conversion should go fine. In theory.

I booted the memstick.img, and at the installer prompt, selected the Live CD menu, which starts a shell from the memory stick.

Then I created a few directories that I could use to mount the parts of the filesystem I needed.

  # mkdir -p /tmp/mnt/zroot0 /tmp/mnt/zboot0
  

Then I imported the zboot0 pool to create a backup, before changing the layout of the disk.

  # kldload zfs
  # zpool import -f -o altroot=/tmp/mnt/zboot0 zboot0
  # mount -u -o rw /
  # mkdir -p /root/boot-backup
  # cd /tmp/mnt
  # tar -czf /root/boot-backup/boot0.tgz zboot0
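
Before deleting anything, the archive can be listed to confirm it holds the expected files. The following is a minimal sketch, not the exact check that was run: the helper name archive_has is hypothetical, and the member paths in the usage comments are assumptions about what the /boot backup should contain.

```shell
#!/bin/sh
# archive_has ARCHIVE PATTERN
# Succeeds if any member path of the gzip-compressed tarball matches PATTERN.
archive_has() {
    tar -tzf "$1" | grep -q -- "$2"
}

# Example use (the paths inside the archive are assumptions):
#   archive_has /root/boot-backup/boot0.tgz 'boot/kernel/kernel$' || echo "kernel missing"
#   archive_has /root/boot-backup/boot0.tgz 'boot/loader.conf$'   || echo "loader.conf missing"
```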
  

After confirming the backup contained the files I needed, I destroyed the GPT partition for the zboot0 pool.

  # zpool export zboot0
  # gpart delete -i 2 ada0
  

The specifications for EFI suggest a minimum size of 800k for the boot partition, but in the spirit of futureproofing, I created a 2G EFI partition to be safe. I also had to specify the starting block, since the EFI partition needs to be at the start of the disk. I used the output of gpart list ada0 to figure out what the next free block would be.

  # gpart add -t efi -a 1m -b 10485761 -s 2G -i 5 ada0
  

The partition that was created actually starts at 10487808, because I am aligning the partitions on 1m boundaries.
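
The rounding can be reproduced with plain shell arithmetic. This sketch assumes 512-byte sectors, so a 1m boundary is every 2048 sectors; align_up is an illustrative helper, not part of gpart(8).

```shell
#!/bin/sh
# Round sector number $1 up to the next multiple of $2 sectors.
# Assumption: 512-byte sectors, so 1 MiB = 2048 sectors.
align_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

efi_start=$(align_up 10485761 2048)
echo "$efi_start"                   # 10487808, the start gpart actually used

# A 2G partition is 2 * 1024 * 1024 * 1024 / 512 = 4194304 sectors,
# so the first block after the EFI partition is:
echo $(( efi_start + 4194304 ))     # 14682112
```

The second number is the start block passed to gpart add for the partition that follows the EFI partition.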

Then I created the partition that would serve as /boot for the loader, loader.conf, kernel, and so on.

  # gpart add -t freebsd-ufs -a 1m -b 14682112 -s 2.5G -i 6 ada0
  

Now that the UFS partition was created, the filesystem needed to be created, and the boot1.efifat file written to the EFI partition. But first, the zroot0 pool needed to be imported, so the /boot directory could be extracted from the backup.

  # kldload geom_eli
  # geli attach ada0p4
  # zpool import -f -o altroot=/tmp/mnt/zroot0 zroot0
  

Then I created the new filesystem, and mounted it in the root filesystem.

  # newfs /dev/ada0p6
  # mount /dev/ada0p6 /tmp/mnt/zroot0/realboot
  # cd /tmp/mnt/zroot0/realboot
  # tar -xzf /root/boot-backup/boot0.tgz
  

Finally, boot1.efifat needed to be written to the EFI partition.

  # dd if=/tmp/mnt/zroot0/realboot/boot/boot1.efifat of=/dev/ada0p5
  

At this point, everything seemed sane enough to be able to boot test the laptop, so I unmounted all filesystems that were imported, and did a test boot.

  # umount /dev/ada0p6
  # zpool export zroot0
  # reboot
  

I was actually quite surprised that things "just worked", and the laptop rebooted from the internal hard drive.

Adding the Second Drive (a.k.a., The Point of No Return)

Now that the laptop booted fine from the first drive, it was time to add the second drive to the laptop. I booted from the memstick.img again, because at this point the filesystem was effectively in a "split-brain" state, and I did not want the system to accidentally boot into the wrong zroot0 dataset.

Once again, I selected Live CD from the installer prompt, and attached the second drive. Since geom_eli.ko is not automatically loaded with the memstick.img, I did not need to worry about the system automatically importing the zroot0 ZFS dataset, which I did not want to happen.

Then I destroyed the GPT partitions on the second drive. At this point, I could be certain that /dev/ada0 was the drive I had already converted and /dev/ada1 was the newly-attached drive, because gpart show ada1 lacked the EFI partition.

Rather than specifying each of the partition index numbers, I forcefully destroyed the GPT scheme on the disk.

  # gpart destroy -F ada1
  

Then I dumped the scheme from the first drive onto the second.

  # gpart backup ada0 | gpart restore -l ada1
  

Then I verified the partition layout on both drives matched before going any further.

  # gpart show
  

Both partition layouts matched, so I rebooted the laptop from the internal drives again.
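
Rather than comparing the two tables by eye, the check could be scripted. This is a sketch under the assumption that `gpart backup` produces identical text for identical layouts; same_layout and the /tmp file names are hypothetical.

```shell
#!/bin/sh
# same_layout FILE_A FILE_B
# Succeeds if two saved partition-table dumps are byte-for-byte identical.
same_layout() {
    cmp -s "$1" "$2"
}

# Hypothetical usage from the live environment:
#   gpart backup ada0 > /tmp/ada0.gpt
#   gpart backup ada1 > /tmp/ada1.gpt
#   same_layout /tmp/ada0.gpt /tmp/ada1.gpt && echo "layouts match"
```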

Once the system was running from the internal drives, it was time to make the second drive "live".

In particular, the things that still needed to be done at this point were:

  • Write boot1.efifat to /dev/ada1p5
  • Mirror /dev/ada0p6 and /dev/ada1p6
  • Create a GELI provider for the root filesystem, and add it to the zroot0 dataset

The boot1.efifat file was written to the EFI partition on the second drive, just as it was for the first.

  # dd if=/boot/boot1.efifat of=/dev/ada1p5
  

Then I created the GELI provider for the zroot0 mirror.

  # geli init -b -a HMAC/SHA256 -e AES-CBC -l 256 -s 4096 ada1p4
  # geli attach ada1p4
  # dd if=/dev/random of=/dev/ada1p4.eli bs=1m
  

While dd(1) wrote random data to the GELI provider, I set up the mirror for the /realboot partition, so that /boot would be available on both drives. I used GEOM_MIRROR for this.

  # gmirror label -v -b round-robin gmboot ada0p6
  # gmirror insert gmboot ada1p6
  

While the mirror synchronized, I updated /etc/fstab to mount /dev/mirror/gmboot in the /realboot directory automatically. The /etc/fstab entry looks like this:

  /dev/mirror/gmboot	/realboot	ufs	rw,noatime	1	1
  

Once dd(1) finished writing random data to the GELI provider, I could add /dev/ada1p4.eli to the zroot0 dataset, and let it resilver.

  # zpool attach zroot0 ada0p4.eli ada1p4.eli
  

Then I used zpool status to verify that the pool had the new GELI provider. (This example output was taken after the mirror had already finished resilvering.)

  # zpool status zroot0
    pool: zroot0
   state: ONLINE
    scan: resilvered 78.7G in 1h5m with 0 errors on Sun Aug  3 15:24:57 2014
  config:

	  NAME            STATE     READ WRITE CKSUM
	  zroot0          ONLINE       0     0     0
	    mirror-0      ONLINE       0     0     0
	      ada0p4.eli  ONLINE       0     0     0
	      ada1p4.eli  ONLINE       0     0     0

  errors: No known data errors
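
The same check can be done mechanically before the final reboot. This is a minimal sketch that reads `zpool status` output on stdin and relies only on the "state:" line and the "resilver in progress" text; pool_healthy is a made-up name, and it does not inspect the individual providers.

```shell
#!/bin/sh
# pool_healthy: read `zpool status POOL` output on stdin.
# Succeeds only if a "state:" line was seen, it says ONLINE,
# and no resilver is reported as still in progress.
pool_healthy() {
    awk '
        $1 == "state:"                   { seen = 1 }
        $1 == "state:" && $2 != "ONLINE" { bad = 1 }
        /resilver in progress/           { bad = 1 }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Usage: zpool status zroot0 | pool_healthy && echo "mirror is ready"
```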
  

Once the resilver finished, I rebooted the laptop one last time, selecting the second drive as the boot drive for a quick test, and the system booted fine.

The End

I realize not everyone will have such a (seemingly) smooth conversion to a UEFI-capable system without needing to repartition the entire drive. But, as it turned out for me, having used a "/ on ZFS" setup with an encrypted filesystem from the start made this conversion quite painless.

Disclaimers

I purchased an ASUS laptop because my previous laptop was made by ASUS. I will not, however, ever buy another ASUS, after several inquiries to their customer service went unanswered.

This is the same reason I will no longer purchase Dell laptops.

Next time, I will get a Lenovo, assuming it has an ath(4) wireless network card.