storage - ZFS over iSCSI high-availability solution

I am considering a ZFS/iSCSI-based architecture for an HA, scale-out, shared-nothing database platform built from wimpy nodes of plain PC hardware running FreeBSD 9.



Will it work? What are possible drawbacks?



Architecture




  1. Storage nodes have cheap, directly attached SATA/SAS drives. Each disk is exported as a separate iSCSI LUN. Note that no RAID (neither HW nor SW), partitioning, volume management or anything like that is involved at this layer. Just 1 LUN per physical disk.


  2. Database nodes run ZFS. A ZFS mirrored vdev is created from iSCSI LUNs from 3 different storage nodes. A ZFS pool is created on top of the vdev, and within that a filesystem which in turn backs a database.


  3. When a disk or a storage node fails, the respective ZFS vdev will continue to operate in degraded mode (but still have 2 mirrored disks). A different (new) disk is assigned to the vdev to replace the failed disk or storage node. ZFS resilvering takes place. A failed storage node or disk is always completely recycled should it become available again.



  4. When a database node fails, the LUNs previously used by that node become free. A new database node is booted, which recreates the ZFS vdev/pool from the LUNs the failed database node left behind. There is no need for database-level replication for high-availability reasons.
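To make steps 2 to 4 concrete, here is a minimal sketch of the `zpool` invocations I have in mind, written as Python helpers that only build the command lines (nothing is executed). The pool name `dbpool` and the device names `da1` to `da4` are hypothetical placeholders for the iSCSI LUNs as they would appear on a database node. I am assuming that step 4 would be a forced `zpool import` of the existing pool rather than a fresh `zpool create`, since the failed node never exported it.

```python
from typing import List

POOL = "dbpool"  # hypothetical pool name, not from any real setup


def create_mirror_cmd(pool: str, luns: List[str]) -> List[str]:
    """Step 2: one mirrored vdev built from LUNs on different storage nodes."""
    if len(luns) < 2:
        raise ValueError("a mirror needs at least two devices")
    return ["zpool", "create", pool, "mirror", *luns]


def replace_cmd(pool: str, failed: str, spare: str) -> List[str]:
    """Step 3: swap a failed LUN for a new one; ZFS then resilvers."""
    return ["zpool", "replace", pool, failed, spare]


def import_cmd(pool: str, force: bool = False) -> List[str]:
    """Step 4: take over an existing pool on a new database node.

    force=True adds -f, which is needed when the previous node did not
    export the pool. Assumption: the old node is really dead, because
    two hosts importing the same pool at once would corrupt it.
    """
    cmd = ["zpool", "import"]
    if force:
        cmd.append("-f")
    cmd.append(pool)
    return cmd


if __name__ == "__main__":
    # Hypothetical device names for the three LUNs and one spare.
    for cmd in (
        create_mirror_cmd(POOL, ["da1", "da2", "da3"]),
        replace_cmd(POOL, "da2", "da4"),
        import_cmd(POOL, force=True),
    ):
        print(" ".join(cmd))
```

The helpers return argument lists so they could be passed to `subprocess` later; whether the forced import is safe depends entirely on being sure the failed database node no longer touches the LUNs.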




Possible Issues




  • How to detect the degradation of the vdev? Check every 5s? Is any notification mechanism available with ZFS?


  • Is it even possible to recreate a new pool from existing LUNs making up a vdev? Any traps?
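For the first question, the polling variant could look roughly like the sketch below. It assumes that `zpool list -H -o name,health` is available on the database node and prints one tab-separated `name` and `health` pair per pool; the function names are my own, and this is only the naive "check every few seconds" approach, not a real notification mechanism.

```python
import subprocess
from typing import Dict


def parse_health(output: str) -> Dict[str, str]:
    """Parse `zpool list -H -o name,health` output into {pool: health}."""
    pools = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, health = fields
        pools[name] = health
    return pools


def unhealthy(pools: Dict[str, str]) -> Dict[str, str]:
    """Keep only pools whose health is anything other than ONLINE."""
    return {name: h for name, h in pools.items() if h != "ONLINE"}


def check_once() -> Dict[str, str]:
    """Run zpool once and return the pools that need attention."""
    out = subprocess.check_output(
        ["zpool", "list", "-H", "-o", "name,health"],
        universal_newlines=True,
    )
    return unhealthy(parse_health(out))
```

Calling `check_once()` from a loop with `time.sleep(5)` would give the 5s check; a non-empty result (for example a pool reported as `DEGRADED`) would then trigger the disk replacement from step 3.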

