Raid and BCache Notes

Day 1

Herein lie notes on the current drive configuration, recorded faithfully on this Thursday, December 19, in the year of our lord 2013.

Here's how it stands before doing anything:

  • /dev/sda: first 1TB drive, original data store, GRUB installed here

    • /dev/sda1 (664f7291-41b2-468b-8012-a4c98ef980d5): mounted on /boot
    • /dev/sda2 (mZLgi1-aRS7-uGpx-kfCK-sEc1-mH3w-TBkNp9): original LVM store
  • /dev/sdb: old 500GB raymond drive

    • /dev/sdb1: raymond /
    • /dev/sdb2: raymond swap
    • /dev/sdb3: raymond /home
    • /dev/sdb4: ???
  • /dev/sdc: SSD from nara (120GB)

    • /dev/sdc1 (2EE4DEE3E4DEABF9): old nara C:/
    • /dev/sdc2 (de359586-932d-4e49-8dad-1be5d8de7db1): old nara /boot
    • /dev/sdc3 (4a86e3b2-dfa8-43cd-806e-6385503a122e): old nara /
  • /dev/sdd: 1TB drive from nara

    • /dev/sdd1 (64BCE43BBCE4097E): old nara Z:/
    • /dev/sdd2 (25dbd16e-5338-4a9e-a24e-51719e597ba2): old nara /home

Now, I proceed to do the following:

  • back up raymond /home (see the consolidated sketch after this list)

    • e2fsck -f /dev/sdb3 # required for next step
    • resize2fs -M /dev/sdb3 # resize FS to minimum allowed
    • reported 1180715 blocks long, blocks are "4k"
    • python -c "print(1180715 * 4 * 1024)"
    • yields: 4836208640
    • lvcreate -n backup.raymond.home -L4836208640B platter
    • dd if=/dev/sdb3 of=/dev/platter/backup.raymond.home bs=4k count=1180715
    • verify that it still works by mounting backup.raymond.home
  • clear raymond's partition table on /dev/sdb

  • back up old nara C:/

    • ntfsresize -i /dev/sdc1 # to get the minimum size
    • ntfsresize -ns 75483463680 /dev/sdc1 # dry-run using reported size
    • ntfsresize -s 75483463680 /dev/sdc1 # real run, since last succeeded
    • lvcreate -n backup.nara.winC -L75483463680B platter
    • python -c "print(75483463680 / 4 / 1024)"
    • yields: 18428580.0
    • dd if=/dev/sdc1 of=/dev/platter/backup.nara.winC bs=4k count=18428580
    • mount and check it out
  • clear nara SSD's partition table on /dev/sdc

  • back up old nara /home

    • e2fsck -f /dev/sdd2
    • resize2fs -M /dev/sdd2
    • reported new size: 41320179 (4k) blocks
    • python -c "print(41320179 * 4 * 1024)"
    • yields: 169247453184
    • lvcreate -n backup.nara.home -L169247453184B platter
    • decide not to back this one up, as it's huge
    • sorry future me if this causes problems :(
  • clear old nara platter's partition table
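
All three backups follow the same shrink-then-dd pattern: shrink the filesystem to its minimum, convert the reported block count to bytes, carve out a matching LV, and dd the partition into it. Here's that pattern as a sketch for the ext case (the device, LV name, and block count vary per drive; the NTFS case swaps in ntfsresize for e2fsck/resize2fs):

# hedged sketch: back up an ext partition into an LV on the platter VG
DEV=/dev/sdb3            # partition to back up
LV=backup.raymond.home   # destination LV name
e2fsck -f "$DEV"         # required before resizing
resize2fs -M "$DEV"      # shrink to minimum; note the reported block count
BLOCKS=1180715           # from resize2fs output, in 4k blocks
lvcreate -n "$LV" -L"$(( BLOCKS * 4 * 1024 ))B" platter
dd if="$DEV" of="/dev/platter/$LV" bs=4k count="$BLOCKS"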

So, after all that, here's how it stands:

  • /dev/sda: original 1TB drive

    • /dev/sda1: /boot
    • /dev/sda2: current LVM store
  • /dev/sdb: 500GB from raymond

  • /dev/sdc: 120GB SSD from nara

  • /dev/sdd: 1TB from nara

Now, I will create a partition on /dev/sdd that matches the size of /dev/sda2.

  • blockdev --getsize64 /dev/sda2
  • gives the size of sda2 in bytes: 999666966528
  • python -c "print(999666966528 / 1024)"
  • yields 976237272.0
  • use fdisk to create /dev/sdd2 with size +976235272K
    • note: ~2M smaller! the new partition must be the smaller one, so that the slightly bigger sda2 can still join the array later!
  • change partition type to 0xda "Non-FS data"
    • we're using initrd boot, and this is recommended for that
  • write out
  • blockdev --getsize64 /dev/sdd2
  • says the size in bytes is: 999665172480, smaller (as we wanted)
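
That arithmetic as a sketch (the 2000K margin reproduces the +976235272K used above):

# hedged sketch: pick a KiB size for the new partition, a bit smaller than sda2
SRC_BYTES=$(blockdev --getsize64 /dev/sda2)
TARGET_K=$(( SRC_BYTES / 1024 - 2000 ))   # ~2M of headroom
echo "+${TARGET_K}K"                      # feed this size to fdisk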

Now we will create a RAID1 using just this /dev/sdd2 (sda2 will be added later, after migrating the LVM stuff off it).

  • mdadm --create --verbose /dev/md1 --level=1 --raid-devices=1 /dev/sdd2
  • realize mdadm balks at creating a single-device RAID1, so add --force
  • update mdadm.conf: mdadm --detail --scan > /etc/mdadm.conf
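
For the record, folding sda2 in later would look roughly like this (a sketch of the stated plan; it never actually happened on this array):

mdadm /dev/md1 --add /dev/sda2            # attach the second device
mdadm --grow /dev/md1 --raid-devices=2    # grow the mirror to two devices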

Now I'll add /dev/md1 to the LVM volume group, and start migrating:

  • pvcreate /dev/md1
  • vgextend platter /dev/md1

Remember that I still need to add bcache on top of this, so undo that:

  • vgreduce platter /dev/md1
  • pvremove /dev/md1

Yeah ok so bcache:

  • make-bcache -B /dev/md1 -C /dev/sdc

Now there is a /dev/bcache0 device, which we can use ;D

  • pvcreate /dev/bcache0

Oh but uggh, pvcreate didn't take. Employ the fix at http://www.redhat.com/archives/linux-lvm/2012-March/msg00005.html and get on with it.
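
As I understand it, that fix amounts to teaching LVM about the bcache block device type, roughly like this in the devices { } section of /etc/lvm/lvm.conf (the 16 is the assumed max partition count, following the linked thread):

types = [ "bcache", 16 ]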

  • pvcreate /dev/bcache0
  • vgextend platter /dev/bcache0

Note message "mpath major 254 is not dm major 253" on most lvm commands now. Choose to blissfully ignore.

As a test, move the itunes LV to the new bcache PV:

  • pvmove -i 5 -n itunes /dev/sda2 /dev/bcache0
  • mount it and poke around
  • remember that "oh yeah it's a whole disk, can't mount that"
  • assume it went fine

More daring now: move the iSCSI windows LV "nara.windows" while it's booted. It worked!

Day 2

Herein lie the events of the twentieth of December, in the year of our lord 2013, faithfully recorded by this one with uid 1000.

The command lvs -a -o+devices is a good way to see where each LV is stored.

I now move backup.raymond.home and backup.nara.winC over to /dev/bcache0.

It looks suspiciously like I'll need to register my bcache devices in my initrd, so I'll do a test reboot next. Hopefully I don't bork everything.

http://maestromony.blogspot.com/2013/09/gentoo-bcache.html

Following the above, I found the LVM part of startVolumes() in /usr/share/genkernel/defaults/initrd.scripts and just after the loop using setup_md_device, I added:

# offer every sd* and md* device to bcache; register_quiet
# silently skips anything without a bcache superblock
for I in /dev/sd*; do
    echo $I > /sys/fs/bcache/register_quiet
done
for I in /dev/md*; do
    echo $I > /sys/fs/bcache/register_quiet
done

And because genkernel uses the old-style raid autodetect:

  • fdisk /dev/sdd, set type of partition 2 to 0xfd "Linux raid autodetect"

(NOTE FROM THE FUTURE: after this the kernel failed to reload the partition table. I didn't note it at the time, but see just below...)

Now I re-create the initrd, and reboot, and pray.

  • genkernel --install --lvm --mdadm initramfs
  • mv /boot/initramfs-genkernel-x86_64-3.10.17-gentoo /boot/initrd-3.10.17-gentoo

On reboot the new 1TB drive failed to enumerate. I opened the case and disconnected the power and SATA cable, and was able to boot. I will now attempt to reconnect them and reboot.

Second failure. I've now disconnected the drive for good, and will come back to this at a later date. I lost my windows though! AUGH!

In the meantime, I've removed the lost LV and reduced the VG to only contain the original drive again, so I can re-install windows.

Day 3

Herein lie the events of Saturday, December 28, in the year of our lady of Discord 3179.

I've purchased two new 1TB WD Blacks, to replace the 1TB WD Black that died. I did a serial number lookup on the dead drive, and it was manufactured in YOLD 3175, so it's not an unreasonable death.

Now I will attempt to form a partial RAID5 out of these two disks, copy over the LVs from the existing platter VG, and then add this drive to the RAID.

Roughly, I will follow this guide: http://blog.mycroes.nl/2009/02/migrating-from-single-disk-to-3-disk.html

First, though, a note of where everything stands.

/dev/sda: (the existing drive)
  Model Family:     Western Digital Caviar Black
  Device Model:     WDC WD1002FAEX-00Y9A0
  Serial Number:    WD-WCAW34314874
  LU WWN Device Id: 5 0014ee 2b222798a
  Firmware Version: 05.01D05
  User Capacity:    1,000,204,886,016 bytes [1.00 TB]
  Sector Size:      512 bytes logical/physical
  Device is:        In smartctl database [for details use: -P show]
  ATA Version is:   ATA8-ACS (minor revision not indicated)
  SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Dec 28 22:58:58 2013 EST
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

/dev/sdb:
  Device Model:     WDC WD1003FZEX-00MK2A0
  Serial Number:    WD-WMC3F0520840
  LU WWN Device Id: 5 0014ee 20962e84d
  Firmware Version: 01.01A01
  User Capacity:    1,000,204,886,016 bytes [1.00 TB]
  Sector Sizes:     512 bytes logical, 4096 bytes physical
  Rotation Rate:    7200 rpm
  Device is:        Not in smartctl database [for details use: -P showall]
  ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
  SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Dec 28 23:00:00 2013 EST
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

/dev/sdc:
  Device Model:     WDC WD1003FZEX-00MK2A0
  Serial Number:    WD-WMC3F0520360
  LU WWN Device Id: 5 0014ee 2b40de2cf
  Firmware Version: 01.01A01
  User Capacity:    1,000,204,886,016 bytes [1.00 TB]
  Sector Sizes:     512 bytes logical, 4096 bytes physical
  Rotation Rate:    7200 rpm
  Device is:        Not in smartctl database [for details use: -P showall]
  ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
  SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Dec 28 23:00:42 2013 EST
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

/dev/sdd:
  Model Family:     SandForce Driven SSDs
  Device Model:     OCZ-AGILITY3
  Serial Number:    OCZ-EAWS2MS97GBOWKJI
  LU WWN Device Id: 5 e83a97 e858f55f8
  Firmware Version: 2.15
  User Capacity:    120,034,123,776 bytes [120 GB]
  Sector Size:      512 bytes logical/physical
  Rotation Rate:    Solid State Device
  Device is:        In smartctl database [for details use: -P show]
  ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
  SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Dec 28 23:01:15 2013 EST
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

From this, all the WD Black drives are identically sized, so I will create identical sd?2 partitions on these to match sda2. Fdisk sez:

Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x431bd553

Device    Boot     Start        End    Blocks  Id System
/dev/sda1           2048    1050623    524288  83 Linux
/dev/sda2        1050624 1953525167 976237272  83 Linux

All these partitions (sdb2 and sdc2) are 0xfd "Linux raid autodetect", as noted before. Now, to join these in a degraded RAID5:

  • mdadm --create -f /dev/md0 --level=5 --raid-devices=3 /dev/sdb2 /dev/sdc2 missing
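
A quick sanity check that the degraded array assembled (expect two of three devices active):

cat /proc/mdstat          # should show md0 as [3/2] [UU_]
mdadm --detail /dev/md0   # state should read "clean, degraded"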

And now, we create a block cache on top of this:

  • make-bcache -B /dev/md0
  • echo a2ace06d-e7b0-4ae0-9927-afecd5460865 > /sys/block/bcache0/bcache/attach

That UUID is the cache set already associated with sdd; cache set UUIDs show up as directory names under /sys/fs/bcache. So, now we have a /dev/bcache0. Add it to the VG:

  • pvcreate /dev/bcache0
  • vgextend platter /dev/bcache0

We're getting "mpath major 254 is not dm major 253." again. UUUHHH OHHH.

Reboot time. I'll see you after the gap.

Came back up okay, no problems yet (other than that message, which is still around). Now I'll migrate some less important LVs to the new array.

  • pvmove -i 5 -n nara.xubuntu /dev/sda2 /dev/bcache0
  • pvmove -i 5 -n portage /dev/sda2 /dev/bcache0

Now, as a test, I unmount and remount /usr/portage. It seems fine. Double-plus-fun reboot time.

Reboot was fine. Going for broke! I'm moving root! Seven save us.

  • pvmove -i 5 -n local.root /dev/sda2 /dev/bcache0

And because I don't trust anything ever:

  • genkernel --install --lvm --mdadm initramfs
  • mv /boot/initramfs-genkernel-x86_64-3.10.17-gentoo /boot/initrd-3.10.17-gentoo

Double-checked that the kernel command line is ok. Here goes nothing.

REBOOT.

Reboot went fine! lvs -o +devices sez that local.root is now entirely contained within /dev/bcache0, so I will now proceed to move everything over before I finally reboot.

Glad that modification to a random script I found on the internet worked.

Here's a reference of things that need movin':

  • cookie-clicker
  • downloads
  • local.swap
  • music
  • nara.freedos
  • nara.gentoo
  • nara.gentoo.home
  • nara.windows
  • nara.windows.data
  • windows-template

They were all moved in the same manner described above.

Realize at about 4AM that this whole thing can probably be done at once with a screen session running:

  • pvmove -i 5 /dev/sda2 /dev/bcache0
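
Something like this, assuming screen is installed:

screen -dmS pvmove pvmove -i 5 /dev/sda2 /dev/bcache0   # run detached
screen -r pvmove                                        # reattach to watch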

Cry a little bit. And go to bed.

Day 4

Herein lie the events of Pungenday, the 71st day of the Aftermath in the Year of our Lady of Discord 3179.

lvs -o +devices notes that all LVs now reside on /dev/bcache0. I will now reboot one last time before removing the old drive from the VG.

Since the reboot worked, I will now remove the drive.

  • vgreduce platter /dev/sda2
  • pvremove /dev/sda2

Now, one more reboot.

Now I will change the partition type on /dev/sda2 to 0xfd and add it to the RAID5, finally completing it.

  • fdisk /dev/sda
  • mdadm /dev/md/bifrost\:0 --add /dev/sda2

Note that Linux now thinks of the RAID array as /dev/md127.
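
That name is what the kernel picks when the array isn't listed in mdadm.conf; regenerating the config (as on day 1) should pin the name on future boots:

mdadm --detail --scan > /etc/mdadm.conf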

Status can be tracked with

  • watch cat /proc/mdstat

Once complete, one final reboot.

Which worked. It's done!
