My Sakura VPS (CentOS 7) stopped booting after a hardware replacement carried out as part of incident handling,
so this is a memo of how I recovered it.
http://support.sakura.ad.jp/mainte/mainteentry.php?id=22553
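I can't say whether this maintenance was the cause, but around that time the server became unreachable.
The server runs Nginx, Postfix, and PostfixAdmin. Logging in with Tera Term failed, so I tried the Sakura VPS console instead, but could not log in there either.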
https://urashita.com/archives/10425
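The symptoms looked similar to the ones described here, so I followed it and tried to enter rescue mode,
but after selecting "Continue" the dialog simply disappeared and the next screen never came up.
Honestly, I thought it was all over, but grasping at straws I called Sakura's phone support.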
Support: "If you can't get into rescue mode, there is nothing we can do."
I had tried to start rescue mode from the CentOS 7 image under Custom OS Install in the control panel, but the screen would not advance, so I looked for another way.
http://thegeekdiary.com/centos-rhel-7-how-to-boot-into-rescue-mode-or-emergency-mode/
This page describes how to start rescue mode by entering a command at boot, so I tried it.
(It is the section titled "Bootup into Rescue mode(target)".)
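A sketch of what that step looks like, assuming the linked section describes the usual GRUB2 edit on CentOS 7: press e at the boot menu, append a systemd target to the end of the kernel (linux16) line, then boot with Ctrl-x. The rest of the kernel line is elided here because it differs per machine; only the appended parameter is the point.

```text
linux16 ... systemd.unit=rescue.target
```

The change applies to that one boot only; nothing is written to the GRUB configuration.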
That did get the machine to boot, but it dropped into emergency mode as shown below.
[ 2.463483] XFS (dm-0): Internal error XFS_WANT_CORRUPTED_GOTO at line 1635 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_extent+0xfc/0x130 [xfs]
[ 2.465295] XFS (dm-0): Internal error xfs_trans_cancel at line 990 of file fs/xfs/xfs_trans.c. Caller xlog_recover_process_efi+0x16b/0x190 [xfs]
[ 2.466871] XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem
[ 2.467677] XFS (dm-0): Please umount the filesystem and rectify the problem(s)
[ 2.468373] XFS (dm-0): Failed to recover EFIs
Generating "/run/initramfs/rdsosreport.txt"
Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.
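When the log is longer than this, the affected device can be pulled out mechanically. The following is a minimal sketch: xfs_failed_devices is a hypothetical helper name, and it assumes the kernel messages have the same "XFS (device): ..." shape as the ones above and that grep, sed, and sort are available in the shell you are dropped into.

```shell
# Hypothetical helper: list the devices named in XFS error messages.
# Reads kernel log text on stdin and prints each device name once.
xfs_failed_devices() {
    grep -E 'XFS \([^)]+\): (Internal error|Corruption|Failed to recover)' \
        | sed -E 's/.*XFS \(([^)]+)\):.*/\1/' \
        | sort -u
}
```

It could be fed with something like journalctl -k | xfs_failed_devices. To see which volume a dm-N name belongs to, ls -l /dev/mapper normally shows the symlinks that point at it.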
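Apparently the XFS filesystem on "dm-0" is corrupted. (dm-0 is a device-mapper device, presumably the LVM root volume here rather than the boot partition.) Log recovery itself was failing, so I repaired it with the command below. Note that -L discards the journal, so any metadata changes still in the log are lost.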
:/# xfs_repair -Lv /dev/dm-0
Phase 1 - find and verify superblock...
- block cache size set to 372272 entries
Phase 2 - using internal log
- zero log...
zero_log: head block 12858 tail block 12488
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
- scan filesystem freespace and inode maps...
agi unlinked bucket 0 is 1505216 in ag 2 (inode=135722944)
agi unlinked bucket 8 is 1505224 in ag 2 (inode=135722952)
agi unlinked bucket 9 is 1505225 in ag 2 (inode=135722953)
agi unlinked bucket 10 is 1505226 in ag 2 (inode=135722954)
agi unlinked bucket 11 is 1505227 in ag 2 (inode=135722955)
agi unlinked bucket 7 is 1741575 in ag 1 (inode=68850439)
agi unlinked bucket 18 is 1741394 in ag 1 (inode=68850258)
agi unlinked bucket 19 is 1741395 in ag 1 (inode=68850259)
agi unlinked bucket 20 is 1741396 in ag 1 (inode=68850260)
agi unlinked bucket 21 is 1741397 in ag 1 (inode=68850261)
agi unlinked bucket 22 is 1741398 in ag 1 (inode=68850262)
sb_ifree 189, counted 166
sb_fdblocks 12020025, counted 12028607
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
correcting nblocks for inode 68850245, was 1186 - counted 1073
data fork in ino 68850449 claims free block 4303144
imap claims in-use inode 68850449 is free, correcting imap
imap claims a free inode 68850456 is in use, correcting imap and clearing inode
cleared inode 68850456
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
Phase 5 - rebuild AG headers and trees...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- traversal finished ...
- moving disconnected inodes to lost+found ...
disconnected inode 68850258, moving to lost+found
disconnected inode 68850259, moving to lost+found
disconnected inode 68850260, moving to lost+found
disconnected inode 68850261, moving to lost+found
disconnected inode 68850262, moving to lost+found
disconnected inode 68850439, moving to lost+found
disconnected inode 135722944, moving to lost+found
disconnected inode 135722952, moving to lost+found
disconnected inode 135722953, moving to lost+found
disconnected inode 135722954, moving to lost+found
disconnected inode 135722955, moving to lost+found
Phase 7 - verify and correct link counts...
XFS_REPAIR Summary Thu Aug 31 02:00:59 2017
Phase Start End Duration
Phase 1: 08/31 02:00:57 08/31 02:00:57
Phase 2: 08/31 02:00:57 08/31 02:00:57
Phase 3: 08/31 02:00:57 08/31 02:00:59 2 seconds
Phase 4: 08/31 02:00:59 08/31 02:00:59
Phase 5: 08/31 02:00:59 08/31 02:00:59
Phase 6: 08/31 02:00:59 08/31 02:00:59
Phase 7: 08/31 02:00:59 08/31 02:00:59
Total run time: 2 seconds
done
:/#
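Since -L throws away the journal, it is usually treated as the last step rather than the first. In my case the mount-time log recovery had already failed ("Failed to recover EFIs"), which is why I ended up there. A rough sketch of the escalation order follows; DEV, MNT, DRY_RUN, run, and repair_xfs are all names made up for this illustration, and the device and mount point are assumptions to adjust.

```shell
# Sketch only: DEV and MNT are assumptions, adjust before use.
DEV="${DEV:-/dev/dm-0}"
MNT="${MNT:-/mnt/xfscheck}"   # hypothetical scratch mount point

# run() executes a command. With DRY_RUN=1 it only prints the command and
# reports failure, so every escalation step below is shown without touching the disk.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
        return 1
    fi
    "$@"
}

repair_xfs() {
    # Step 1: a normal mount replays the journal; if it succeeds, stop here.
    run mkdir -p "$MNT"
    if run mount -t xfs "$DEV" "$MNT"; then
        run umount "$MNT"
        return 0
    fi
    # Step 2: no-modify check, for information only (result is not acted on).
    run xfs_repair -n "$DEV"
    # Step 3: regular repair; xfs_repair refuses to run while the log is dirty.
    if run xfs_repair "$DEV"; then
        return 0
    fi
    # Step 4: last resort, zero the log and repair (recent metadata changes are lost).
    run xfs_repair -L "$DEV"
}
```

Running DRY_RUN=1 repair_xfs first prints the commands it would try, in order, without executing any of them.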
Finally, reboot and confirm that the server comes up.
reboot
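Once the server is back up, it is worth looking at what the repair moved into lost+found, since the disconnected inodes listed above end up there named after their inode numbers. A small sketch for listing them: list_orphans is a hypothetical helper, and the default path assumes lost+found sits at the root of the repaired filesystem.

```shell
# Hypothetical helper: list entries that xfs_repair moved into lost+found.
# $1 is the lost+found directory; prints name, kind, and size in bytes per entry.
list_orphans() {
    dir="${1:-/lost+found}"
    find "$dir" -mindepth 1 -maxdepth 1 | sort | while IFS= read -r entry; do
        name=$(basename "$entry")
        if [ -d "$entry" ]; then
            # Directories are only flagged; their contents are not summed.
            printf '%s\tdir\t-\n' "$name"
        else
            printf '%s\tfile\t%s\n' "$name" "$(( $(wc -c < "$entry") ))"
        fi
    done
}
```

From there each entry can be inspected by hand to decide whether it needs to be restored or can be deleted.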
And so it all ended well.