Copying disks with Linux

From Try-AS/400
PXE booted PC as SCSI Attachment Station

Older machines in particular are very picky about tape drives. As existing drives fail and working tape cartridges become increasingly scarce, an alternative to regular full backups on tape is needed. Offline backups are becoming a viable alternative, despite the extra effort of (temporarily) removing the disk(s) and hooking them up to a SCSI-equipped Linux PC.

IBM "blessed" AS/400 disks differ in some ways from ordinary disks, which makes it difficult to cope with them on Linux. Fortunately, most problems can be avoided by using the SCSI Generic facilities to talk to the device(s) in question. For this, the sg3_utils package must be installed. The lsscsi package is also required, for checking devices on the bus from within Linux.

This article assumes basic knowledge about SCSI:

  • how to avoid duplicate device IDs,
  • how to terminate the bus properly (at the cable ends).

Please see the articles in the See also section for further details.

Backup

  • Attach the disk and boot the PC,
  • use lsscsi -g to identify the disk(s) on the bus and list their SCSI Generic device nodes,
  • duplicate the disk's content with sg_dd (example):
sg_dd blk_sgio=1 if=/dev/sg0 bs=520 of=outfile_520.dd verbose=2 sync=1
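
Since the image may end up being the only copy of the disk, it can be worth recording a checksum right after the copy finishes. The following is a minimal sketch, assuming sha256sum from GNU coreutils is available. The name record_checksum is a hypothetical helper, not part of sg3_utils, and outfile_520.dd is the image name from the example above.

```shell
# Record a SHA-256 checksum next to an image file and verify it immediately.
# record_checksum is a hypothetical helper name, not part of sg3_utils.
record_checksum() {
    image=$1
    # Store the checksum in a sidecar file named <image>.sha256.
    sha256sum "$image" > "$image.sha256" || return 1
    # Re-read the image and compare it against the stored checksum.
    sha256sum -c "$image.sha256"
}

# Example: record_checksum outfile_520.dd
```

Running sha256sum -c outfile_520.dd.sha256 again later shows whether the stored image has changed since the backup.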

It is very important to know that (with some exceptions)

  • PowerPC based AS/400s use 522 bytes per sector, and
  • CISC architecture AS/400s use 520 bytes per sector.

If in doubt, use dmesg to scroll through the kernel message buffer and locate the lines where the SCSI disk(s) are identified along with their block size. It is important to use the correct block size in the bs= parameter.
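
If the kernel messages are ambiguous, the block size can also be queried from the drive itself. The following is a minimal sketch, assuming that sg_readcap from sg3_utils prints a line of the form "Logical block length=520 bytes"; the exact wording may differ between versions. The name block_length is a hypothetical helper, and /dev/sg0 is the example device used above.

```shell
# Extract the logical block length from sg_readcap output read on stdin.
# Assumption: the output contains a line like "Logical block length=520 bytes".
# block_length is a hypothetical helper name, not part of sg3_utils.
block_length() {
    sed -n 's/.*[Ll]ogical block length=\([0-9][0-9]*\) bytes.*/\1/p' | head -n 1
}

# Example (device path as used elsewhere in this article):
#   sg_readcap /dev/sg0 | block_length
```

The printed number is the value to pass as bs= to sg_dd. If nothing is printed, the output format did not match the assumption and should be read by eye.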

Reformat

You can reformat drives to a matching block size like this:

sg_format -v -v -6 --format --size=520 /dev/sg0

This is a purely optional step and shown for completeness only.

Restore

First, make sure the new target disk has been formatted to the correct block size. A proven way to do this is to IPL the target system from CD or tape, and use the DST-provided utilities to do the formatting.

Then you can write back the image to a given disk:

sg_dd blk_sgio=1 bs=520 of=/dev/sg1 if=outfile_520.dd verbose=2 sync=1
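
Before running such a restore, it may help to confirm that the image file is a whole number of sectors at the intended block size. A size that does not divide evenly suggests a truncated copy or a wrong bs= value. The following is a minimal sketch; image_sectors is a hypothetical helper name, and the file name and block size are the ones from the examples in this article.

```shell
# Print the number of whole sectors in an image file, or fail if the file
# size is not an exact multiple of the block size.
# image_sectors is a hypothetical helper name, not part of sg3_utils.
image_sectors() {
    image=$1
    bs=$2
    [ -r "$image" ] || return 1
    # wc -c prints the size in bytes; tr strips padding some systems add.
    size=$(wc -c < "$image" | tr -d ' ')
    if [ $((size % bs)) -ne 0 ]; then
        echo "size $size is not a multiple of $bs" >&2
        return 1
    fi
    echo $((size / bs))
}

# Example: image_sectors outfile_520.dd 520
```

The sector count printed here can be compared with the number of blocks the target disk reports, to check that the image fits.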

Conclusion

A test run, copying a smaller disk to a larger one for a CISC system and then test-IPLing from the copy, eventually ended with the system halting after a main storage dump. Judging by the SRC displayed on the panel, used here as a crude IPL progress meter, the machine finished reassembling the LID (parts of the "kernel") into RAM and crashed right after enabling full paging. At this point in the IPL sequence, that probably means it couldn't find the boot device for further tasks. It did, however, manage to load SLIC into RAM and, this being a CISC machine, MULIC/FULIC as well.

Unresolved questions:

  • Is the crash related to the size difference, or to the fact that this is an entirely different disk?[1]
  • Does similar behavior occur when doing this for PPC systems?

The condition shown can be corrected by an ordinary reinstall of SLIC and the OS, which wipes the existing disk and requires proper install media. Note: This does not affect MULIC/FULIC, which is preserved.

While the procedure shown is not advisable as the sole backup method, it is definitely a viable way to preserve MULIC/FULIC data from (partly defective) disks. MULIC/FULIC is machine-specific code for CISC models, and apparently contains the system serial number. If the system serial number changes (because the system is IPLed from a different machine's MULIC tape), the system password is invalidated. Recovering from that condition involves contacting IBM.

See also

Weblinks

Footnotes

  1. DST in OS/400 has a facility to dump a disk's contents to another disk, but it's unknown whether it does any "magic" while doing so.