Crashed machine
Bringing up a crashed machine from firmware mode (shutdown -i5, or system crash into firmware hard boot)
Move the orphaned file systems to their backup machines. Run outage to see where each abandoned file system should go:
    tjau_root > outage
    ==========================================
    The following auth processes should be active on the tj node
    NO AUTH PROCESS'S FOUND
    ==========================================
    The following filesystems can be moved off of tjau TO:
    05000 Mir001 Sa=62,94 Move to: tjbu Sb=82,74 ...
    15000 Mir003 Sa=72,84 Move to: tjcu Sc=92,64 ...
    rtj1  Mir002 Sa=63,95 Move to: tjbu Sb=83,75 ...
    rtj2  Mir004 Sa=73,85 Move to: tjcu Sc=93,65 ...
    ===========================================
    The following filesystems can be moved off of tjbu TO:
    25000 Mir005 Sb=62,94 Move to: tjcu Sc=82,74 ...
    35000 Mir007 Sb=72,84 Move to: tjau Sa=92,64 ...
    rtj3  Mir006 Sb=63,95 Move to: tjcu Sc=83,75 ...
    rtj4  Mir008 Sb=73,85 Move to: tjau Sa=93,65 ...
    ============================================
    The following filesystems can be moved off of tjcu TO:
    45000 Mir009 Sc=62,94 Move to: tjau Sa=82,74 ...
    55000 Mir011 Sc=72,84 Move to: tjbu Sb=92,64 ...
    rtj5  Mir010 Sc=63,95 Move to: tjau Sa=83,75 ...
    rtj6  Mir012 Sc=73,85 Move to: tjbu Sb=93,65 ...
This shows that file system 45000, which normally runs on TJCU, should be moved to TJAU if TJCU is not functioning. (No release is needed, since the machine has crashed and its file systems are not currently in use.)
    tjau_root > activate -y -f tj45000
This activates the dumped file system on its temporary machine, TJAU (as listed in the outage output above).
Repeat the above step for each of the other orphaned file systems. In this case that is 55000, which needs to be moved to TJBU.
Spoolers (rtj1, etc.) don't need to be moved to a temporary machine unless the machine is to be out of service for a while.
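The lookup and repeat steps above can be sketched as a small dry-run helper. This is a minimal sketch, not part of the site tooling: plan_moves is a hypothetical shell function that reads saved outage output on standard input, picks the section for the crashed machine, skips spoolers, and prints the activate command for each remaining file system along with the machine to run it on. It assumes the row layout shown in the outage output above, that the activate argument is tj followed by the file system number (generalized from the single example tj45000), and that the command is run on the target machine, as the tjau_root prompt suggests. It only prints commands; it does not run anything.

```shell
#!/bin/sh
# plan_moves DOWN_HOST < saved-outage-output
# Hypothetical helper (not a site tool). Dry run only: prints the commands
# an operator would type, and executes none of them.
plan_moves() {
    awk -v down="$1" '
        # Section header, e.g.
        # "The following filesystems can be moved off of tjcu TO:"
        # The machine name is the second-to-last field.
        /can be moved off of/ { host = $(NF - 1); next }

        # Data row inside the section for the crashed machine, e.g.
        # "45000 Mir009 Sc=62,94 Move to: tjau Sa=82,74 ..."
        host == down && $4 == "Move" && $5 == "to:" {
            # Spoolers (rtj1, ...) are skipped, since they only need to
            # move when the machine will be out of service for a while.
            if ($1 ~ /^r/) next
            # ASSUMPTION: activate takes "tj" + the file system number,
            # and is run on the target machine named after "Move to:".
            printf "on %s: activate -y -f tj%s\n", $6, $1
        }
    '
}

# Example: TJCU has crashed. The input is copied from the outage output.
plan_moves tjcu <<'EOF'
The following filesystems can be moved off of tjcu TO:
45000 Mir009 Sc=62,94 Move to: tjau Sa=82,74 ...
55000 Mir011 Sc=72,84 Move to: tjbu Sb=92,64 ...
rtj5 Mir010 Sc=63,95 Move to: tjau Sa=83,75 ...
rtj6 Mir012 Sc=73,85 Move to: tjbu Sb=93,65 ...
EOF
# prints:
# on tjau: activate -y -f tj45000
# on tjbu: activate -y -f tj55000
```

Printing the plan rather than running it keeps the operator in control of each activation. If the machine is expected to be down for a while, the spooler filter can be removed so that rtj rows are listed too; the name to pass to activate for a spooler is not shown in this document and would need to be confirmed first.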
After activating the file systems on the proper machines, bring up the downed machine. A phone call to the location may be needed for a hard reboot.