Recovering a Failed Physical Machine (Manual)

Caution: If you need to recover or replace a PM in a ztC Edge system, use the instructions in ztC Edge 100i/110i Systems: Replacing a Node (R013Z) or ztC Edge 200i/250i Systems: Replacing a Node (R019Z). (If needed, see Replacing Physical Machines (Automated) for additional details.) Avoid using the manual procedure described in this topic unless specifically instructed by your authorized Stratus service representative.

Recover a physical machine (PM), or node, when it cannot boot or when it fails to become a PM of a dual-node ztC Edge system. In some cases, the ztC Edge Console displays the state of a failed PM as Unreachable (Syncing/Evacuating).

To recover a PM, you must reinstall the Stratus Redundant Linux release that the PM has been running. However, recovering a failed PM differs from installing the software for the first time. The recovery preserves all data, but it re-creates the /boot and root file systems, reinstalls the Stratus Redundant Linux system software, and attempts to connect to the existing system. (If you need to replace the physical PM hardware instead of recovering the system software, see Replacing Physical Machines (Manual).)

To reinstall the system software, you can allow the system to automatically boot the replacement node from a temporary Preboot Execution Environment (PXE) server on the primary PM. As long as each PM contains a full copy of the most recently installed software kit (as displayed on the Upgrade Kits page of the ztC Edge Console), either PM can initiate the recovery of its partner PM with PXE boot installation. If needed, you can also manually boot the replacement node from USB installation media.

Use one of the following procedures, depending on whether you want to install the software by PXE boot or from USB installation media.

Caution: The recovery procedure deletes any software installed in the host operating system of the PM and all PM configuration information entered before the recovery. After you complete this procedure, you must manually reinstall all of your host-level software and reconfigure the PM to match your original settings.

Prerequisites:
  1. Determine which PM you need to recover.
  2. If you want to use a USB medium to install the system software on the replacement PM, create a bootable USB medium as described in Creating a USB Medium with System Software.

    When creating the USB medium, ensure that it contains the most recently installed upgrade kit. For example, if the release shown in the masthead of the ztC Edge Console window is version 1.2.0-550, where 550 is the build number, the kit you select to create the USB medium on the Upgrade Kits page must also be version 1.2.0-550. If the system detects a different build on the target PM, it automatically overrides the recovery process, initializes all data on the target PM, and uses PXE boot installation to reinstall the most recently installed software kit on the PM with no user interaction.

  3. If using a USB medium, connect a keyboard and monitor to the replacement PM so that you can follow the installation process and specify settings.
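
The version check in prerequisite 2 can be expressed as a short comparison. The following Python sketch is only an illustration: it assumes version strings have the release-build form shown in the example (such as 1.2.0-550), and it assumes you copy both strings by hand from the masthead and the Upgrade Kits page. The function names parse_version and kit_matches_system are hypothetical and are not part of any Stratus tool or API.

```python
import re

# Assumed format, taken from the example "1.2.0-550":
# a dotted release number, a hyphen, then the build number.
VERSION_PATTERN = re.compile(r"^(\d+(?:\.\d+)*)-(\d+)$")


def parse_version(text):
    """Split a string such as '1.2.0-550' into ('1.2.0', 550). Hypothetical helper."""
    match = VERSION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return match.group(1), int(match.group(2))


def kit_matches_system(masthead_version, kit_version):
    """Return True only if both the release and the build number are identical."""
    return parse_version(masthead_version) == parse_version(kit_version)


if __name__ == "__main__":
    # Values typed in by hand from the console pages (illustrative only).
    print(kit_matches_system("1.2.0-550", "1.2.0-550"))
    print(kit_matches_system("1.2.0-550", "1.2.0-551"))
```

The comparison covers the build number as well as the release because, as noted in prerequisite 2, a different build on the target PM causes the system to override the recovery process.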

Related Topics

Maintenance Mode

Managing Physical Machines

ztC Edge Console

Physical Machines Page