Tuesday, January 13, 2009

Using ZFS with Mac OS X 10.5

A few days ago I got a new MacBook Pro. While waiting for it to be delivered, I started thinking about how I wanted to lay out the OS installation. For a long time I had wanted to try the ZFS file system on a Mac, and this looked like a wonderful opportunity. Getting rid of HFS+, which had been causing me lots of problems (especially its case-insensitive incarnation), sounded like a dream come true.

If you've never heard of ZFS before, check out this good 5-minute screencast of some of the important features.

A brief Google search revealed that several people are using and developing ZFS for Mac. There is a Mac ZFS porting project at http://zfs.macosforge.org and I found a lot of good info at AlBlue's blog.

Some noteworthy info:
  • The current ZFS port (build 119) is based on ZFS code that shipped with Solaris build 72
  • It's currently not possible to boot Mac OS X from a ZFS filesystem
  • Finder integration is not perfect yet - Finder lists a ZFS pool as an unmountable drive under devices
  • There are several reports of kernel panics, most of which appeared in connection with cheap external USB disks (I haven't experienced any)
  • There are a bunch of minor issues, which I'm sure will eventually go away.
None of the above was a show stopper for me, so I went ahead with the installation. My plan was simple: repartition the internal hard drive into a small bootable partition and a large ZFS partition, which would hold my home directory and other filesystems.

Install ZFS

Even though Mac OS X 10.5 ships with ZFS support, that support is read-only. To be able to write to ZFS, the full ZFS implementation must be installed.

The installation is very simple and can be done by following these instructions: http://zfs.macosforge.org/trac/wiki/downloads. Alternatively, AlBlue created a fancy installer for the lazy ones out there.

Repartition Disk

Once ZFS was installed and the OS rebooted, I could repartition the internal disk. If you are using an external hard drive, you'll most likely need to use the zpool command instead.

First let's check what the disk looks like:
$ diskutil list
/dev/disk0
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *298.1 Gi   disk0
1:                        EFI                         200.0 Mi   disk0s1
2:                  Apple_HFS boot                    297.8 Gi   disk0s2
Good, the internal disk was identified as /dev/disk0 and it currently contains an EFI (boot) slice and a ~300G data slice. Let's repartition the disk so that it contains two data partitions.
$ sudo diskutil resizeVolume disk0s2 40G ZFS tank 257G
Password:
Started resizing on disk disk0s2 boot
Verifying
Resizing Volume
Adjusting Partitions
Formatting new partitions
Formatting disk0s3 as ZFS File System with name tank
[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished resizing on disk disk0
/dev/disk0
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *298.1 Gi   disk0
1:                        EFI                         200.0 Mi   disk0s1
2:                  Apple_HFS boot                    39.9 Gi    disk0s2
3:                        ZFS tank                    252.0 Gi   disk0s3


Great, the disk was repartitioned: the existing data partition, which I call boot, was shrunk to 40GB and the freed space was used to create a ZFS pool called tank. Btw, all the data on the boot partition was preserved.
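
If you want to pick the ZFS slice out of such a listing in a script, a small awk filter is enough. This is only a sketch: the function name zfs_slices is made up, it assumes the column layout shown above, and it is fed a copy of that output here instead of a live diskutil call.

```shell
# Print the identifier (last column) of every slice whose TYPE column is ZFS.
# Reads `diskutil list`-style text on stdin.
zfs_slices() {
  awk '$2 == "ZFS" { print $NF }'
}

# Sample input copied from the listing above; on a real system you would
# pipe `diskutil list` into zfs_slices instead.
zfs_slices <<'EOF'
/dev/disk0
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *298.1 Gi   disk0
1:                        EFI                         200.0 Mi   disk0s1
2:                  Apple_HFS boot                    39.9 Gi    disk0s2
3:                        ZFS tank                    252.0 Gi   disk0s3
EOF
```

For the listing above this prints disk0s3, the slice that backs the tank pool.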

Let's check my new pool:
$ zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
tank                    256G    360K    256G     0%  ONLINE     -
$ zpool status
pool: tank
state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
 still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
 pool will no longer be accessible on older software versions.
scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 tank        ONLINE       0     0     0
   disk0s3   ONLINE       0     0     0

errors: No known data errors
The warning above just means that a newer ZFS on-disk format is available but is not used by the current pool. As far as I could find, there are no benefits to upgrading to the new format on Mac, and if I did, I would lose compatibility with Macs that have only the read-only ZFS support.
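
Before deciding either way, you can look at the versions involved without changing anything. A minimal sketch, assuming the zpool binary is on the PATH; the wrapper function check_pool_versions is my own name, and both zpool invocations are read-only as long as no pool name or -a is given.

```shell
# Read-only checks; nothing here modifies a pool.
check_pool_versions() {
  if command -v zpool >/dev/null 2>&1; then
    zpool upgrade -v   # list the on-disk versions this ZFS build supports
    zpool upgrade      # list pools still using an older format (no changes made)
  else
    echo "zpool not found; install the ZFS package first"
  fi
}

check_pool_versions
```

Only `zpool upgrade tank` (or `zpool upgrade -a`) actually rewrites the pool format, and that step cannot be undone.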

Create Filesystems

So now that the new pool exists, I can create a shiny new filesystem using a single command:
$ sudo zfs create tank/me3x
$ zfs list
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank        388K   252G   270K  /Volumes/tank
tank/me3x    19K   252G    19K  /Volumes/tank/me3x
To configure this new filesystem as my home directory, I created a temporary admin account, logged in under this account and mounted the ZFS fs as /Users/me3x:
$ sudo mv /Users/me3x /Users/me3x.hfs
$ sudo zfs set mountpoint=/Users/me3x tank/me3x
$ sudo cp -Rp /Users/me3x.hfs/ /Users/me3x
That's it. My Mac account now resides on a ZFS file system. Now I can finally enjoy all the benefits of using ZFS on my OpenSolaris box in my office as well as on my Mac. Bye bye HFS, I won't miss you! 
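
Before deleting the old HFS+ copy, it's worth checking that the two trees really match. This is a sketch rather than what I ran at the time: verify_copy is a made-up helper, and a plain `diff -rq` only compares file contents, not ownership, permissions, extended attributes or resource forks.

```shell
# verify_copy OLD NEW: succeed only if both trees have identical file contents.
verify_copy() {
  diff -rq "$1" "$2" >/dev/null 2>&1
}

if verify_copy /Users/me3x.hfs /Users/me3x; then
  echo "copy matches; safe to remove /Users/me3x.hfs"
else
  echo "copy differs or a directory is missing; keep /Users/me3x.hfs for now"
fi
```

OS X may create hidden files of its own on the new volume, so a reported difference is a prompt to look closer, not proof that data was lost.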

30 comments:

AlBlue said...

Thanks for the link. I also include a link to a Mac OS X installer on my blog, which might be more palatable than the hand-holding on the official site:

http://alblue.blogspot.com/2008/11/zfs-119-on-mac-os-x.html

jwhendy said...

While a user file resides on ZFS, am I correct that all system related files still must be on HFS+? In other words, only a directory under 'Users' can be on a separate ZFS partition; everything else under 'HD' (System, Library, Applications, etc.) must stay on the 'boot' HFS+ partition?

I'm dual booting FreeBSD right now and this is great, as I could share my storage partition for each OS under ZFS! I just want to make sure I get things straight and partition correctly:

- #1 GUID
- #2 OS X 'Top level'/boot (HFS+)
- #3 FreeBSD (UFS)
- #4 Shared storage (ZFS)

Does that look about right?

Thanks for the post!!
John

Shawn Ferry said...

For external disks, you can use zpool (and I have in the older releases). However, diskutil does some things more cleanly, so you no longer get unrecognized disk messages every time you plug in the disk.

Starting from scratch on a new laptop I also have an internal ZFS partition which I had mirrored to an external disk.

diskutil partitiondisk disk1 ZFS SanDiskCF 100%

Started partitioning on disk disk1
Creating partition map
Formatting disk1s2 as ZFS File System with name SanDiskCF
[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished partitioning on disk disk1
/dev/disk1
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *15.3 Gi    disk1
1:                        EFI                         200.0 Mi   disk1s1
2:                        ZFS SanDiskCF               14.9 Gi    disk1s2


zpool list

NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
SanDiskCF              14.9G    360K   14.9G     0%  ONLINE     -

Igor Minar said...

@AlBlue I'll include a link to your package in the post.

@jwhendy you can put some but not all system directories on ZFS. For example I already mounted /opt (used by macports) and /usr/local as ZFS file systems. It's very likely that /Applications could be mounted the same way too.

Basically, anything that is not needed for booting can be on ZFS.

Shawn Ferry said...

Some applications seem a little funky on ZFS.

Aperture, for example, seemed to have some strange slowdown issues with its library on ZFS, but is mostly fine with images on ZFS (although it doesn't show the mount names properly in the internal file browser).

Some applications also will not install on filesystem types that are not HFS/HFS+ or case sensitive FS. Nikon Capture NX is one that comes to mind.

I'm not quite comfortable enough to move my home directory as local disk pools don't always import on boot (I think because I added the pool as a slice manually after the fact).

jwhendy said...

Thanks for the response. I sat down yesterday and had some good success! A few questions, though.

- First, I have a backup usb drive that is bootable as it holds my backups created with Carbon Copy Cloner. I was doing my partitioning/zfs creation from that drive. At one point I forgot I was on that drive and issued the command 'sudo zfs set mountpoint=/Users/jwhendy tank/jwhendy' but what I _should_ have done was set the mountpoint to /Volumes/Macrophage/Users/jwhendy since / was actually the backup drive.

Anyway, I rebooted and tank/jwhendy was mounted on my _internal_ drive at /Users/jwhendy. After seeing that, I suspect that it will mount with respect to the drive being booted. How can I ensure it only mounts as part of my internal drive and not the external?

-Second, the tank partition shows up as a white 'image-type' volume called tank. My Users folder still looks like a house, but when I'm in it, the finder window says 'tank' at the top and the house logo for the folder in the quick-links in the Finder side menu also says tank next to it. Is this common? Is there any way to trick it further and have this be a little more seamless?

Thanks a ton for your help!
-John

jwhendy said...

Oh, and whoops - another comment. My whole goal in this exercise was to be able to share a storage partition with FreeBSD. I could see the partition from FreeBSD as /dev/ad5s3 but could not mount it. I created a folder in /media and tried:

mount -t zfs /dev/ad5s3 /media/temp

But had no luck. I tried changing the node name to something like ad5s3[a,c,etc.] but had no luck. Will this be possible for me?

Thanks,
-John

Igor Minar said...

@jwhendy I'll start with the last question. ZFS is architecture and OS agnostic, so in theory it should be possible to access your pool under FreeBSD. I've seen some reports that it works, but I haven't tried it myself.

The small inconsistencies when it comes to icons in Finder are normal. The ZFS support in Finder is not complete yet. I don't know about any workarounds, I haven't tried to find any yet.

Now, the first question regarding your mountpoints. I'm not quite sure I understand what you did and how the system is set up. If you booted from your usb drive, then I'd expect the usb drive to be mounted as /. Also you didn't make it clear where the zfs pool is located. Is it located on the usb drive or on the internal drive?

jwhendy said...

Thanks for responding...

- Re. the sharing. I found this statement, 'A pool cannot be shared across systems. ZFS is not a cluster file system.' here: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide. Is that stating that I can't do what I'm trying to do in accessing one pool from two separate OSs?

- Re. the mounting issue... I have two drives: Macrophage (disk0) and Backup (disk1)

I did:
# sudo gpt add -b ##### -i 3 -s #### -t 6A898CC3-1DD2-11B2-99A6-080020736631 disk0

# sudo zpool create tank /dev/disk0s3

# sudo zfs create tank/jwhendy

# sudo zfs set mountpoint=/Users/jwhendy tank/jwhendy

Notes: since zfs was created on disk0, that's the internal drive. I got the GUID ID from wikipedia and 'sudo gpt show disk0' indicates that it is of type 'Solaris Usr', so it worked.

So... I have the zfs binaries on both drives and what I think is happening is:

- When I boot into Macrophage (disk0), tank/jwhendy gets mounted at /Users/jwhendy, which is on disk0.

- When I boot into Backup, tank/jwhendy gets mounted at /Users/jwhendy, which is on disk1 (Macrophage is still disk0, but is at /Volumes/Macrophage, not /).

What the / directory actually is depends on which drive I boot into. I want my zfs partition to only mount on the /Users/jwhendy on disk0, regardless of if it's / on disk0 or /Volumes/Macrophage on disk0.

Does that make sense?


Thanks,
John

Igor Minar said...

The statement about sharing a pool between two OSes refers to using a ZFS pool by more than one OS at the same time, i.e. using it as a clustered filesystem. If I understood you correctly, what you want to do is to use the pool with your Mac and later reboot (if you have a multiboot system) or move the drive to a different box and use it with FreeBSD. If that's what you want then AFAIK, it should be possible.

Regarding the mountpoints, I think what's happening is that ZFS automatically imports the pool in both boot environments. And in either case it uses the mountpoint which you defined on that fs. A quick, untested thought: what if you set the mountpoint to /Volumes/tank/jwhendy and then create a symlink from /Users/jwhendy to that mount?
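
Spelled out as commands, that idea could look like the sketch below. It is just as untested as the thought itself: the helper names run and link_home are made up, and by default the script only prints what it would do (set APPLY=1 to execute).

```shell
# Print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# link_home POOL USER: mount POOL/USER under /Volumes and point /Users/USER at it.
link_home() {
  pool=$1 user=$2
  run sudo zfs set mountpoint="/Volumes/$pool/$user" "$pool/$user"
  # /Users/USER must not already exist as a directory, or ln would
  # create the link inside it instead of in its place.
  run sudo ln -s "/Volumes/$pool/$user" "/Users/$user"
}

link_home tank jwhendy
```

The symlink lives only on the drive where it was created, so when booted from the backup drive the pool would still mount under /Volumes/tank but that drive's /Users/jwhendy would be left alone.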

jwhendy said...

Excellent about sharing the partition across OS's. Your inference about what I am looking to do is exactly correct. I want disk0s3 to act as both my OS X /Users/jwhendy folder and also to be able to access the data within that folder from FreeBSD. I can see the slice from FreeBSD, but it's only showing as ad5s3 and I have not been able to mount it yet. Do you have any recollection of where the post was that you saw about sharing a partition? I haven't found anything like that using google...

I had thought about the symlink method... perhaps I will try that!

Thanks for all your input - you've been extremely helpful. I don't find many people doing this yet, so it's been great to have your input!


Thanks,
John

Igor Minar said...

I can't remember where I saw that reference. Try asking at the opensolaris/zfs mailing list.

I'm glad that you found the post helpful.

jelemans said...

I built a raidz zfs volume today on my office server.
It's made of external SATA drives.
Now, I'd like to be able to share it with the office users. However, it doesn't show up under Volumes on Server Admin under File Sharing.

Is there something I can do or is this a gotcha until we get official Apple support? TIA.

Igor Minar said...

No idea. I don't use Mac as a server. OpenSolaris is much better at that ;-)

Maybe try finding the corresponding terminal command, it's possible that just the GUI part is missing in Leopard.

jwhendy said...

I'll try the mailing list. I googled OpenSolaris mailing list, and am wondering if the list you mean is 'zfs-discuss' on this list: http://mail.opensolaris.org/mailman/listinfo?

I'll also try FreeBSD mailing lists and already have a post on the FreeBSD forums.

Thanks,
John

Igor Minar said...

yup.. that's the one.

jwhendy said...

Update: I did 'zpool import' in FreeBSD and then 'zfs list' showed tank but not tank/jwhendy... I tried to mount tank/jwhendy with 'zfs mount tank/jwhendy' but I got 'Mismatched versions: File system is version 2 on-disk format, which is incompatible with this software version 1! cannot mount 'tank/jwhendy': operation not supported'.

Is there a way to alter the 'software versions' to match? I'm not sure what the version refers to... Since it seemed to indicate that the FreeBSD was behind OS X (1 on FBSD, 2 on OS X), I tried zpool upgrade but got 'This system is currently running ZFS version 6. All pools are formatted using this version'.

Thoughts?

Thanks,
John

Igor Minar said...

I don't know what version of ZFS on-disk format is used in FreeBSD. Mac's port uses version 6 by default, but if you upgrade to the read/write build v119, you can upgrade to version 8.

jwhendy said...

Hi again... I gave a rest to FreeBSD for a while - too busy to mess around. I have a question on backing up my zfs users folder. I set up pools tank and tank/jwhendy for my users folder. Then I changed my Users folder via the System Preferences>Accounts>ctrl click your user. So now my user folder is set to /Volumes/tank/jwhendy.

I just tried running carbon copy cloner and created a test file on the desktop to see if CCC would follow the jwhendy users folder, but it did not.

Do you have an idea of what I could do to make sure my users folder continues to be backed up if I'm using CCC?

I have seen people set up time machine with this, but having used CCC for quite a while now, I'd prefer not to have to switch my backup program.


Thanks,
John

Igor Minar said...

no idea, I use zfs snapshots for backups. For me that was one of the main reasons to use ZFS in the first place.
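
A minimal sketch of what a dated snapshot could look like. The naming scheme and the snap_name helper are assumptions for illustration, not a description of my actual routine; the script only prints the zfs command so it can be reviewed first.

```shell
# snap_name FS [DATE]: build a snapshot name like tank/me3x@2009-01-13.
# DATE defaults to today.
snap_name() {
  printf '%s@%s\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

# Print the command for review; run it by hand once it looks right.
echo "sudo zfs snapshot $(snap_name tank/me3x)"
```

`zfs list` then shows the snapshot alongside the filesystem it belongs to.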

jwhendy said...

Hi,


Hopefully I'll be able to stop asking you questions soon... I'm wondering if you think that the legacy mounting of OS X conflicts with zfs at all? I'm noticing what I think to be odd things:

- I tried to mount my pool at /Users, but the Users folder disappears from the Finder (it was an empty folder when I set this mountpoint).

- I had a 'jwhendy' filesystem at pool/jwhendy, but since /Users didn't show up, it didn't matter. I can see it from 'ls' in the terminal.

- If I 'unmount' the pool drive shown on the desktop, Users reappears and now pool/jwhendy is visible in /Users/jwhendy and zpool status and zfs list all show both pool and pool/jwhendy both mounted and online. Is OS X conflicting here?

- Lastly, I had my pool mounted at /Users again ('technically at /Volumes/Macrophage/Users') and all was well until a reboot. Now I have a /Volumes/Macrophage and a /Volumes/Macrophage 1 and neither of their Users folders shows anything inside!

Do you ever run into issues with zfs's mount and OS X's?

Thanks for any input. Should I just stick with having my zfs pools completely outside the data structure of my HD (just on / vs. /somethingElse)?


Thanks,
John

Igor Minar said...

I only mounted a zfs file system as my home dir, not the /Users dir and haven't experienced any issues with that (except for the small known Trash and Finder problems).

I also mounted zfs as /opt for macports and /usr/local.

I haven't had any issues with any of these mounts.

jwhendy said...

Thanks - I'm guessing that the 'finder issue' is that it doesn't show up? The 'trash issue' is that you just delete permanently, correct?

I tried testing the snapshot feature and got something about not being able to unmount the volume when I tried a rollback. This is what I meant by the two mounting methods 'competing'. I don't think the issue was coming from the zfs side, but from the OS X side not letting the drive unmount.

Hope that made sense... what's your snapshot method and how would you go about exploring a snapshot from OS X? I didn't seem to be able to rollback and can't find a .zfs directory from which to explore the snapshot from the terminal. It is listed with 'zfs list' though - I just don't know what to do with it.


Thanks,
John

Igor Minar said...

The finder issue is that it displays pool name instead of the filesystem name in the left column and the trash issue is that you can't empty the trash via the GUI, but have to delete the items in the trash manually.

Inability to see and browse the .zfs directory is a current limitation of the mac port of zfs. In order to see and browse a snapshot you need to clone it. More info: http://zfs.macosforge.org/trac/wiki/issues
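
The clone workaround can be sketched like this. The snapshot name before-test, the clone name jwhendy-peek and the helper browse_plan are all made up for illustration; the script prints the commands instead of running them.

```shell
# browse_plan SNAPSHOT CLONE: print the commands to expose SNAPSHOT as a
# browsable filesystem named CLONE, and to remove it again afterwards.
browse_plan() {
  printf 'sudo zfs clone %s %s\n' "$1" "$2"
  printf 'zfs list %s\n' "$2"          # shows where the clone got mounted
  printf 'sudo zfs destroy %s\n' "$2"  # run once you are done browsing
}

browse_plan tank/jwhendy@before-test tank/jwhendy-peek
```

A clone is a writable filesystem that initially shares its blocks with the snapshot, so creating one is quick and takes almost no extra space.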


I suggest that you report the problems with rollbacks to the mac zfs team at http://lists.macosforge.org/mailman/listinfo/zfs-discuss

Shawn Ferry said...

@jwhendy I often need to resort to forcing the umount of a couple of my ZFS filesystems because they are busy. This is particularly true under my home directory which has selective migration to ZFS.

The rollback includes an unmount and remount. I am guessing that you could rollback if you first manually unmount.
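
As a sketch of that guess (not something verified on the Mac port): force the unmount first, roll back, then mount again. rollback_plan and the snapshot name before-test are hypothetical, and the script only prints the commands.

```shell
# rollback_plan FS@SNAP: print an unmount / rollback / mount sequence.
rollback_plan() {
  fs=${1%@*}   # filesystem part of FS@SNAP
  printf 'sudo zfs unmount -f %s\n' "$fs"
  printf 'sudo zfs rollback %s\n' "$1"
  printf 'sudo zfs mount %s\n' "$fs"
}

rollback_plan tank/jwhendy@before-test
```

Keep in mind that a rollback discards everything written to the filesystem after the snapshot was taken.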

Anonymous said...

Igor, seems like we were thinking about the same thing while waiting for our MacBook Pros :). All you did makes perfect sense to me, but later something I read some time ago came to mind: "using ZFS is better on whole disks, rather than on slices". Googling, I found this (from Sun, your employer :)):

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_Administration_Considerations

Quoting:

"Set up one storage pool using whole disks per system, if possible.
For production systems, use whole disks rather than slices for storage pools for the following reasons:
* Allows ZFS to enable the disk's write cache for those disks that have write caches. If you are using a RAID array
with a non-volatile write cache, then this is less of an issue and slices as vdevs should still gain the benefit of
the array's write cache.
* The recovery process of replacing a failed disk is more complex when disks contain both ZFS and UFS file systems on
slices.
* ZFS pools (and underlying disks) that also contain UFS file systems on slices cannot be easily migrated to other
systems by using zpool import and export features.
* In general, maintaining slices increases administration time and cost. Lower your administration costs by
simplifying your storage pool configuration model.
If you must use slices for ZFS storage pools, review the following:
* Consider migrating the pools to whole disks after a transition period.
* Use slices on small systems, such as laptops, where experts need access to both UFS and ZFS file systems.
* However, take great care when reinstalling OSes in different slices so you don't accidentally clobber your ZFS pools.
* Managing data on slices is more complex than managing data on whole disks."

What is your experience so far with your current ZFS/HFS+ (boot) setup? It would be very useful info for me.

Thanks,
-Mariano.

Igor Minar said...

Mariano,

Yes, you are right, the best practice on Solaris is to give ZFS the entire hard drive. But as you know, the ZFS port on Mac can't boot (yet) and there is usually only one drive in a MBP, so you don't have many options. But even if you had two disks, you supposedly wouldn't benefit much. Check out: http://alblue.blogspot.com/2008/11/zfs-119-on-mac-os-x.html?showComment=1234357560000#c6987585746077826466

My experience has been great so far. I did see 2-3 kernel panics since January, but that's a small penalty to pay for all the goodness. I haven't experienced any data loss or data corruption.

For backups I now use snapshots and I'm even able to send them to a pool on an external drive that I use for backups.
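
Roughly, sending a snapshot to another pool is a zfs send piped into a zfs receive. The sketch below prints such pipelines instead of running them; the pool name backup, the snapshot dates and the helper send_plan are assumptions for illustration only.

```shell
# send_plan FS SNAP DEST [PREV]: print a zfs send | zfs receive pipeline.
# With PREV, only the changes between PREV and SNAP are sent.
send_plan() {
  fs=$1 snap=$2 dest=$3 prev=${4:-}
  if [ -n "$prev" ]; then
    printf 'sudo zfs send -i %s@%s %s@%s | sudo zfs receive %s\n' \
      "$fs" "$prev" "$fs" "$snap" "$dest"
  else
    printf 'sudo zfs send %s@%s | sudo zfs receive %s\n' "$fs" "$snap" "$dest"
  fi
}

send_plan tank/me3x 2009-03-01 backup/me3x              # first, full copy
send_plan tank/me3x 2009-03-08 backup/me3x 2009-03-01   # later, incremental
```

An incremental receive expects the destination filesystem to be unchanged since the previous snapshot, so the backup copy is best treated as read-only.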

Anonymous said...

ZFS is ok on 10.6 SnowLeopard?

Regards.

Igor Minar said...

ZFS was completely removed from 10.6 a few months before the release, and 10.6 currently doesn't have even the read-only support that 10.5 has.

It is not clear why this happened or what Apple's future plans are. There is a lot of discussion about this move on the mac zfs mailing list, but so far there has been no official or even unofficial statement from Apple.

http://lists.macosforge.org/pipermail/zfs-discuss

Shawn Ferry said...

You can still download and install the zfs-119 build. I have found that it still works without any new or unexpected issues.