
The AS/400 system has some rather unique features when taken as a whole, considering both the hardware and the software that make up the system. The following are the main areas:

  • High Level Machine
  • Object Based Design
  • Single Level Storage
  • Hardware Integration
  • Software Integration

Each of these topics is discussed in the following sections. Along the way, we also highlight features that are unique when compared with other, more common platforms.

High Level Machine

The AS/400, like the System/38 before it, is built on the concept of layers of abstraction, where higher layers are isolated from and protected from the details of how the lower level layers are actually implemented. Rather than dealing with a CPU with memory and I/O devices, in CPF and XPF, users see only "objects" presented by OS/400 within a vast single-level storage addressing scheme.

What do we mean by a "High Level Machine"? The Machine Interface (MI) provides a built-in relational database, with MI instructions that deal directly with this database. Here is a list of the database management, journal management and commitment control instructions:

DataBase Management MI instructions

ACTCR Activate Cursor
CPYDSE Copy Data Space Entries
CRTCR Create Cursor
CRTDS Create Data Space
CRTDSINX Create Data Space Index
DBMAINT Data Base Maintenance
DEACTCR DeActivate Cursor
DELDSEN Delete Data Space Entry
DESCR Destroy Cursor
DESDS Destroy Data Space
DESDSINX Destroy Data Space Index
ENSDSEN Ensure Data Space Entries
ESTDISKR Estimate Size of Data Space Index Key range
INSDSEN Insert Data Space Entry
INSSDSE Insert Sequential Data Space Entries
MATCRAT Materialize Cursor Attributes
MATDSAT Materialize Data Space Attributes
MATDSIAT Materialize Data Space Index Attributes
MODDSAT Modify Data Space Attributes
MODDSIA Modify Data Space Index Attributes
RLSDSEN Release Data Space Entries
RETDSEN Retrieve Data Space Entries
SETCR Set Cursor
UPDSEN Update Data Space Entry

Journal Management MI instructions

APYJCHG Apply Journal Changes
CRTJP Create Journal Port
CRTJS Create Journal Space
DESJP Destroy Journal Port
DESJS Destroy Journal Space
JRNLD Journal Data
JRNLOBJ Journal Object
MATJPAT Materialize Journal Port Attributes
MATJSAT Materialize Journal Space Attributes
MATJOAT Materialize Journaled Object Attributes
MATJOBJ Materialize Journaled Objects
MODJP Modify Journal Port
RETJENT Retrieve Journal Entries

Commitment Control MI instructions

COMMIT Commit
CRTCB Create Commit Block
DECOMMIT Decommit (roll back)
DESCB Destroy Commit Block
MATCBATR Materialize Commit Block Attributes
MODCB Modify Commit Block


Layered Architecture

CPF and XPF were originally designed and implemented using a layered architecture. XPF was, and is, the internal name for the "Extended Control Program Facility" and mainly refers to what we know as OS/400.

At the lowest level is the actual hardware. Firmware (microcode) creates the ISA, or instruction set architecture, as seen by the developers of the first layer above the hardware, known as the LIC or Licensed Internal Code. This is roughly the equivalent of a "kernel" in modern OS terms, but it is closer to a "micro-kernel" a la the Mach micro-kernel. In fact, IBM participated in the Mach research project with Carnegie Mellon University (CMU), and IBM Rochester Labs was influenced by some of its design features.

For System/38 and early AS/400 models, microcoded CISC processors were used, and the architecture they implemented was called the Internal MicroProgramming Interface (IMPI). IBM used a proprietary language called PL/MP, similar to PL/S on the mainframe, for developing this LIC layer.

The "kernel" is called Licensed Internal Code (LIC) and handles hardware dependencies. This includes the use of multiple processors with dedicated memory just for I/O tasks. AS/400 uses a "hierarchy of microprocessors" to perform various tasks. The LIC kernel was orignally written in PL/MP, and this is what implements the next layer above, the Machine Interface or MI instructions. When IBM moved from CISC to RISC architecture hardware, much of the LIC was rewritten in C++ to create the "SLIC" (or "System Licensed Internal Code") for the IBM PowerPC-AS 64-bit processors.

Above the MI "virtual machine" created by the Licensed Internal Code (LIC) is XPF. aka., OS/400, and its components that systems administrators and users normally see and work with. IBM calls this the Extended Control Program Facility (or XPF), that evolved directly from the IBM System/38 CPF[1], the immediate predecessor of the AS/400.

CPF and XPF (or OS/400) know nothing about disk drives. XPF sees only "objects" that reside in the single-level storage space. (See the Single Level Storage topic below.)

CPF and XPF are mostly written in PL/MI, a variation of IBM's PL/S that generates MI program binary creation templates, just as the MI assembler does. MI provides direct visibility of, and access to, the MI objects that reside in the vast single-level storage address space.

Machine Interface (MI)

The MI, or Machine Interface, is the virtual instruction set architecture (ISA) used for everything that runs in single-level storage, aka the XPF MI "virtual machine."

OS/400 compilers do not directly output native CPU instructions, as that would defeat the purpose of the MI "virtual machine" and the benefits of single-level storage, which hides many of the details of the physical hardware from the software layers that run "above the MI." MI assembler language instructions, and their binary form, may be thought of as a kind of intermediate code, conceptually similar to Java bytecode. Retaining this binary program creation template makes program objects larger than expected, but this "MI template" contains the MI binary instructions that enable unprecedented upward compatibility, by allowing the system to re-translate those MI instructions to a new CPU architecture whenever IBM upgrades the AS/400 hardware platform. Unlike on most other platforms, this translation is transparent and invisible to the users of the system. No recompilation from source code or other steps are needed.

Program objects restored to a newer machine with different hardware, processor architecture, DASD technology, etc., can simply be started, as on the original platform. The start-up will take extra time for the LIC to regenerate the internal instructions for the current hardware in use, upon "first touch." This is somewhat like the JIT compilation commonly used with JVMs, but it is an AOT ("Ahead-Of-Time") compilation, since the newly generated instructions are stored permanently, encapsulated within the program object, for future use. The program then starts and runs as native code on a possibly completely different or newly upgraded CPU architecture. Subsequent invocations will not incur this translation overhead again.

IBM exploited this possibility when they switched the hardware from the custom 48-Bit CISC IMPI CPU architecture [2] to the 64-bit PowerPC-AS architecture in the mid-1990s.

NOTE: originally, IBM allowed customers to "remove observability" from program objects, to save some disk space. Removing the binary program creation template also removed the ability to debug the program at a symbolic level, so many ISVs and software vendors did this to try to better protect their intellectual property. However, once the MI template was removed, the object could no longer be migrated to a new CPU architecture, as there was no longer any way for the LIC to regenerate the new ISA instructions from the (now absent) MI program creation template.
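
For illustration, a sketch of how observability was typically removed; the library and program names here are hypothetical:

CHGPGM PGM(MYLIB/MYPGM) RMVOBS(*ALL)

Once the creation data was gone, the only way back was to recompile the program from source.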

Upward Compatibility

IBM has a long history of building platforms that span multiple decades of upward compatibility at the binary level. The AS/400 is no exception, and is perhaps the best example of this. Programs compiled on ancestor platforms (System/38) or older versions can be restored onto a newer AS/400 system and usually run without difficulty.
Also, higher-level language source code (COBOL, RPG, ...) written for older releases of OS/400 usually compiles without any changes on newer versions and releases of OS/400.

These features may not seem to be of much interest, at first glance. But, this enables developing an application on an older AS/400 system and then running it on any newer AS/400e, iSeries, System i or IBM i system. This could be seen as a base requirement for more free and Open Source Software: it is often much easier to acquire an older "used" machine, at a more affordable price, than it is to acquire a currently supported model.

Unfortunately, transferring objects from newer versions to older releases is not quite as easy. The SAVLIB command writes objects into a pre-created save file and records the OS/400 release on which it was run. A different target release can be specified, but the newer the OS version in use, the newer the oldest release that can be targeted: V4R5, for example, permits V3R2 as the oldest target release, plus selected V4 releases (see the example below).
Transfers from older to newer systems are usually quite painless, though.
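
A sketch of saving a library for an older target release; the library and save file names are hypothetical:

CRTSAVF FILE(QGPL/MYSAVF)
SAVLIB LIB(MYLIB) DEV(*SAVF) SAVF(QGPL/MYSAVF) TGTRLS(V3R2M0)

The TGTRLS parameter tells the system to write the save data in a format the older release can restore.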

Object Based Design

The AS/400 is an object-based system. Everything in the system — programs, files, user profiles, message queues — is implemented as an object. Every object has two parts: a description that defines the valid ways of using the object; and a functional part, that contains the encapsulated contents of the object used to implement its behavior.

If an object is defined as a program, its description specifies that the object contains executable, read-only, compiled code. The only operations allowed on such objects are those that make sense for a program. For example, you can write into the middle of a data file, but you cannot write into the middle of compiled code; the system does not allow it to happen. Object encapsulation ensures data integrity for all objects in the system.

Object-based design also has some important security implications.

All objects on the AS/400 system are owned by an owning user profile. The object owner, or a system security officer, can then grant or revoke authority to each object for other users or for a group profile (see the example below).
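
A sketch of granting and revoking object authority; the object, library, and user names are hypothetical:

GRTOBJAUT OBJ(MYLIB/MYFILE) OBJTYPE(*FILE) USER(ALICE) AUT(*USE)
RVKOBJAUT OBJ(MYLIB/MYFILE) OBJTYPE(*FILE) USER(*PUBLIC) AUT(*ALL)

GRTOBJAUT grants and RVKOBJAUT revokes object authorities such as *USE, *CHANGE, or *ALL.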

As part of the fundamental design, AS/400 objects are one of many reasons that the AS/400 system has an excellent reputation for security and integrity, when set up and used properly.


Single Level Storage

The AS/400’s massive 64-bit address space can address 18 quintillion bytes of data. Architecturally, the AS/400 is designed to be capable of even more than this: MI pointers are 16 bytes long, aligned on a 16-byte boundary, and so they contain 128 bits, of which at least 96 bits are reserved for addressing.

Mapped into this 64-bit space is the "real" storage: disk drives and main memory. Customers do not need to be aware of any of the storage technologies that implement the huge address space, because the AS/400 manages them automatically. As far as customers are concerned, all programs and data simply reside in this massive space. Users do not need to worry about where a program resides; they need only refer to it by name.

Similarly, customers do not need to worry about making extensions to files that are full. The AS/400 handles this automatically. And when customers add more storage devices to the machine, they do not need to redistribute data across them; the system recognizes the new available storage and uses it. Most AS/400 installations do not even have a traditional DBA or database administrator as they do not need one. The system performs this type of work on its own, automatically.

Processing business applications in a multi-application, multi-user environment involves frequent switching between different tasks. Thanks to its single-level storage, the AS/400 accomplishes this much more efficiently than most conventional systems. Switching to a new task on the AS/400 is greatly simplified: there is no need to create a separate address space before the execution of a new task can begin (as is done on Unix and Windows systems). The AS/400 is designed for the very frequent task switching that characterizes modern commercial business environments. Single-level storage simplifies storage management, delivers exceptionally good performance by avoiding the need to swap address spaces, and provides very natural support for sharing data and information across jobs and users.

AS/400 is truly a paging system

IBM learned in the late 1960s and early 1970s that paging I/O is typically faster and less expensive than ordinary disk file I/O.

All AS/400 hard disks (DASD) are mapped into the vast single-level storage (SLS) address space. Paging is done on demand, as applications refer to addresses that are not currently in real main storage (memory). Applications simply request contents from addresses, and the layer below XPF, the LIC (or SLIC on RISC-based AS/400s), takes care of bringing the data into main storage when a page fault occurs. This means that applications above the MI are not aware of virtual memory or paging, per se.

This also explains why the AS/400 platform lacks a traditional disk-resident "file system." Since all disks are part of the vast virtual memory space of SLS, everything visible to XPF is an object at a given address, with certain properties. This also permits easy sharing of these objects between requesting processes (or jobs).

Another advantage of this SLS approach is the ability to do very fast context switches.[3] This is the reason why even older, slower IMPI machines with a 100% loaded CPU still provided astonishingly fast interactive response times when compared with other common platforms of that era.

Database tables in single-level storage

Conceptually, the rows of data in an AS/400 database physical file member (aka an MI data space) are laid out one column after another, and one row after another, just like a big array in virtual storage. So, to fetch a given row from any particular database table (or file), one needs only to know the starting address of the first row and the record length. Multiply the record length by the desired record number (less one, since the first row sits at the base address), add that to the base address, and you have computed the address of the desired data in virtual memory (SLS). Then it is simply a matter of paging in the desired page(s).
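
A quick worked example with hypothetical numbers: if the first row of a file starts at base address B and the record length is 128 bytes, then record number 1,000 starts at B + (1,000 - 1) x 128 = B + 127,872. The LIC then pages in whatever page(s) cover that range of addresses.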

In reality, it is not quite that simple, but almost. IBM added specialized MI instructions for database management, journal management, and commitment control, to permit safe concurrent access to and updating of the data in database tables (files) across the entire platform (by many interactive and batch jobs, accessing the same database table at the same time.)

Disk Allocations

The usual way the system allocates space on secondary storage (aka: hard disks) is to spread objects over as many disks as possible. In modern RAID terminology, this is referred to as spindles (from the stack of rotating platters' axis), while IBM AS/400 documentation refers to the same concept as disk arms: The actuator moving the read/write heads over the desired cylinder — a set of tracks — on the rotating platters.[4]

This spreading of data has the same advantages and disadvantages as RAID 0:

  • With an increasing count of disk arms, the chance that a given data block is located on an idle drive increases, which decreases latency by exploiting parallelism: the more applications that run, the more data can be picked up from individual disks in parallel, and the more overall processing work can be done.
  • Since data is stored in the system just once, the failure of even a single disk has hard-to-predict ramifications. Objects become damaged because parts of them are no longer accessible.[5]

From early on, the concept of a dedicated I/O processor allowed for easy implementation of data mirroring, compensating through redundancy for data loss caused by a malfunctioning disk drive. The underlying weakness was recognized early on in System/38 CPF -- if you lost one disk drive, you lost the entire single-level storage and had to reload the entire system from the last known good set of backup tapes. (Hence the aforementioned emphasis in the IBM documentation on establishing good backup procedures.)

Later, the well-known RAID-5 and RAID-6 schemes were made available through DASD IOPs (I/O processors). IBM actually "invented" what we now know as "RAID" on the System/38, where it was known as "parity" protection. For every two DASD volumes, a third volume of the same type and size was used to store the exclusive OR of the same sectors from the first two drives. If you lose either drive A or B, you simply XOR the remaining drive's data with the "parity" data to recreate the missing data. (A XOR B -> C; B XOR C -> A; A XOR C -> B.) This allows a system to continue to run, with somewhat degraded performance, until the failed drive can be replaced. Modern RAID schemes with striping, etc., are just more elaborate schemes using the same underlying concepts and principles of overlapping parity, similar to the ECC error-correcting codes used in memory chips, but on a much larger scale, across entire disk drives.

Memory Pools

The platform's reliance on paging as the primary means of bringing data and code into the faster main memory for processing has some implications on memory-constrained machines. RAM was once extraordinarily expensive. Without precautions, it was easy to start many programs at the same time, each one competing for presence in RAM. When seriously overwhelmed with concurrent requests, this leads to thrashing: the machine is completely busy with paging activity but no longer does any useful work, because of the competing tasks.

OS/400 introduced the concept of memory pools to contain this problem. A memory pool mainly has these attributes:

  • Size,
  • Shared or Private,
  • Maximum Activity Level.

A pool can more or less be translated to "RAM cache": what is already in the pool's memory is effectively cached, because it does not need to be handled by the paging logic.

Pool Size

The Size of a memory pool is just how much RAM is set aside for applications being directed to run in this pool. More RAM means more cache and less disk activity for shoveling work into RAM.

Shared or Private Pool

Shared pools allow more than one group of applications to utilize a memory pool, while private pools are reserved for just one group. See Subsystems below for the explanation of this grouping. This brings in another consideration: The more pools one creates, the more unused memory is left over from excess memory partitioning. [6]

Maximum Activity Level

The Maximum Activity Level of a memory pool is the overall count of applications eligible to actively run in a given pool in parallel. This count only considers applications that require actual access to the CPU, not sleeping ones waiting for some application-external event. The setting reduces competition for memory amongst applications within that pool and, to a lesser extent, overall competition for main memory system-wide (see the example below).
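
A sketch of setting the size and activity level of a shared pool; the pool choice and the values are hypothetical and depend entirely on the machine and workload:

CHGSHRPOOL POOL(*SHRPOOL1) SIZE(65536) ACTLVL(10)

SIZE is given in kilobytes, so this reserves roughly 64 MB for the pool and allows up to ten jobs in it to compete for the CPU at once.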

Pool Summary

It is important to consider that this competition for RAM is not limited to actual code being in main memory, but also applies to data. Heavy paging can be quite normal for a batch job reading through a huge database file, and only a minor brake on other running applications. As long as the application code stays in main memory, that application can run again as soon as more data arrives in main memory to be processed. Because batch applications (ideally) use a different memory pool than interactive applications, there is no competition for main memory between these two application classes, each being confined to its respective pool.[7]

Many books have been written about performance tuning the AS/400 system. As you can see, there are many aspects to consider when deciding which values of these settings lead to the sweet spot of a machine doing maximum (batch) work while minimally affecting interactive users. Dynamic scenarios have been described where interactive users are not considered during non-office hours, giving batch jobs more memory and thus allowing faster processing.

In current reality, there are two groups of users, and both of them rarely need to consider the points outlined above:

  • Hobbyists are most often the sole users of a system. Thus, overall system activity levels are low — most often, the machine idles.
  • Professional users with current systems have RAM in abundance. In addition, the increasing ubiquity of solid state storage with its intrinsic lack of mechanical positioning latency boosts I/O throughput.

For both groups, the directions given in the Post-Install Optimizations article give an almost perfect starting point, letting the system adjust itself into a good partitioning of RAM among the existing shared pools.

Expert Cache

In short, Expert Cache is a pool setting that lets the system itself decide the block size of data being read from disk into memory. Without it, transferred blocks are of a fixed size, which increases processing overhead; on the other hand, smaller data transfers decrease the wait time until some other disk access can be satisfied. CHGSHRPOOL PAGING(*CALC) switches Expert Cache on for a given pool.
The usual recommendation is to enable it and allow the system to adjust itself. Note: the default setting is disabled.

Subsystems

A subsystem is a way to group application programs. Subsystems are defined by a subsystem description object (*SBSD), which allows many attributes to be set to prepare the desired run-time environment (see the sketch below). See Subsystems for a more in-depth illustration of the underlying concept.
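
A sketch of a subsystem description with its own pools; the names and values are hypothetical:

CRTSBSD SBSD(MYLIB/BATCHSBS) POOLS((1 *BASE) (2 32768 5)) TEXT('Batch work')
ADDJOBQE SBSD(MYLIB/BATCHSBS) JOBQ(MYLIB/BATCHQ) MAXACT(2)

The POOLS parameter attaches pool definitions (here the shared *BASE pool plus a 32 MB private pool with activity level 5), while job queue and routing entries determine which jobs run in which pool.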

Running Application Programs: Activation and Activation Groups

Since programs already exist as objects in the single-level storage space, there is no need to load them from disk as common operating systems need to do.[8] Actually running such an application program needs some housekeeping, though. Commonly a program needs a stack and a heap for variables and other internal data structures to be able to run. The AS/400 is no different in this case, although this machinery is buried very deep in the LIC's innards.

Actually starting a program to be run on an AS/400 is called activation. This task creates the necessary storage as outlined above and adds the program to the job scheduling queue. This is also necessary to provide a properly isolated environment when multiple users start the same application.

In complete contrast to common platforms, it is possible to share some of this space between programs. This shared space is called an activation group. That is, when creating a new program (by compiling and linking), one can tell the compiler/linker whether the program is to be run in

  • the default activation group, that contains most of the OS,
  • the caller's activation group, to allow a more dynamic approach, or
  • a new activation group that is to be created for every instance the program has,
  • a common activation group with a programmer specified name.

As long as any program in an activation group is activated, so is the activation group. Data structures will be destroyed only when the last program terminates, or the job ends,[9] or when a group is explicitly deallocated with the RCLACTGRP command (see the sketch below). For details see the weblinks.
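
A sketch of binding a program into a named activation group and later reclaiming it; the library, module, and group names are hypothetical:

CRTPGM PGM(MYLIB/ORDERS) MODULE(MYLIB/ORDERS) ACTGRP(ORDERAPP)
RCLACTGRP ACTGRP(ORDERAPP)

ACTGRP(*NEW) or ACTGRP(*CALLER) could be specified instead, matching the choices listed above.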

Sharing an activation group shares (amongst other data):

  • Static variables,
  • Open files and file pointer positions,
  • and more…

There is a certain intersection of functionality between a common activation group and fork()ing an application on Linux/Unix platforms. It is not exactly the same, but maybe it gives you an idea.
On OS/400, programs that share an activation group must exist as separate objects, while on Linux/Unix all code is either linked into a monolithic program or pulled in with the aid of shared libraries.[10]

Standardized UI

The whole UI revolves mainly around forms and menus, accompanied by static text. The layouts of these screens are somewhat standardized. AS/400 screens are very recognizable because of this.

See also: About Green Screens and mouse-clickable UIs.

OS/400 Command syntax

Experienced users often prefer not to have to repeatedly navigate through nested menus, so commands may be entered on any command line instead of choosing menu options. The OS/400 commands consist of two or three parts, built of abbreviated English words. Abbreviations are often formed by omitting vowels and shortening the result to three characters or less. The first part is typically a verb and the second component designates an object type to act upon, optionally followed by additional groups of (usually) three characters each. (Occasionally, only one or two characters are used for the final part.) This is followed by zero or more parameters or arguments.

A few examples may illustrate the previous statements:

wrksplf - Work with Spooled Files
chgmsgq - Change Message Queue Properties
crtdspf - Create Display File
dltf - Delete File
dspmsg - Display Messages (from Message Queue)

Command Prompting

A very nice feature of OS/400 is the ability to type in a command name, and press F4=Prompt. This presents a "fill-in-the-blanks" form on the display screen.

Command Syntax

OS/400 commands use a very IBM-style syntax, similar to what is used for TSO CLISTs, for IDCAMS (VSAM access method services), etc. Each command specifies a command name, followed by one or more "keyword(value)" pairs.

DLTF FILE(MYLIB/MYFILE)

or

DSPPFM FILE(FILE1) MBR(MEMBER01)


Some commands allow the first few keywords to be specified positionally, as follows:

DLTF MYLIB/MYFILE

or

DSPPFM FILE1 MEMBER01


See also IBM i Control Language and CL Tricks.

Outstanding Reliability

Software is a complex thing built by humans. Humans make errors. Thus, software contains errors. This is an inevitable fact.

IBM did not manage to make OS/400 bug-free, but it caught most problems that would lead to a complete crash of the operating system. There may still be edge cases, but they are usually hard to hit.

Also, from early on IBM built multiple measures into the OS so that defective sectors on hard disks are not catastrophic. Since there is no classical file system, read errors most often affect only one or a few objects, which can easily be restored from backup.[11]

Reliability is not the same as availability. Some tasks can only be done when OS/400 is running in a restricted state.[12] This includes but is not limited to a full system backup or certain hardware failure recovery procedures.

Rapid Development

OS/400 comes with a set of tools and supportive facilities within the OS to ease development of application programs. Usually, development comprises:

  • Definition and creation of database tables, display forms and printer output,
  • Writing code that references these files and shovels data just by issuing proper READ and WRITE (among other) requests to these files.

The mentioned files are described in a format called Data Definition Specifications, or DDS. It provides data field definitions, field-definition references between files, arbitrary placement of static text (not for database files), and a multitude of flags for shaping content to fit the desired presentation.
The Screen Design Agent program (strsda) provides a pseudo-WYSIWYG-interface for creating screen forms and menus.
The Report Layout Utility program (strrlu) provides a pseudo-WYSIWYG-interface for creating layouts for printer output files.
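
Once the DDS source members exist, the corresponding objects are created with commands like the following (a sketch; the library, file, and member names are hypothetical):

CRTPF FILE(MYLIB/CUSTOMER) SRCFILE(MYLIB/QDDSSRC) SRCMBR(CUSTOMER)
CRTDSPF FILE(MYLIB/CUSTD) SRCFILE(MYLIB/QDDSSRC) SRCMBR(CUSTD)
CRTPRTF FILE(MYLIB/CUSTLIST) SRCFILE(MYLIB/QDDSSRC) SRCMBR(CUSTLIST)

CRTPF creates a physical (database) file, CRTDSPF a display file, and CRTPRTF a printer file from their DDS descriptions.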

Actual programming can be done in RPG, COBOL, and C/C++. The output of these compilers (objects called modules) can be linked together to create an actual program object or a shared library (a service program). Any language may call functions written in any other language, as long as the parameters have been defined compatibly.
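
A sketch of that flow with hypothetical names: compile a module from each language, then bind them into one program, or export one of them as a service program:

CRTRPGMOD MODULE(MYLIB/ORDENTRY) SRCFILE(MYLIB/QRPGLESRC)
CRTCMOD MODULE(MYLIB/ORDCALC) SRCFILE(MYLIB/QCSRC)
CRTPGM PGM(MYLIB/ORDERS) MODULE(MYLIB/ORDENTRY MYLIB/ORDCALC)
CRTSRVPGM SRVPGM(MYLIB/ORDUTIL) MODULE(MYLIB/ORDCALC) EXPORT(*ALL)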

The compilable CL "shell" script language and the REXX interpreter help ease traditional automation tasks at the command-line level.

Programming is done within the Program Development Manager collection of programs (strpdm). The most used component is the Work with Members application, wrkmbrpdm, which lists the contents of source physical files. This in turn calls the Source Entry Utility (SEU) with the appropriate parameters for actually writing or editing code in the desired existing or new file member.

SEU provides a line-type-aware prompt form for RPG and DDS, easing the matching of the appropriate columns. The prompt can be invoked as usual, with F4.
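
SEU can also be started directly, without going through PDM; a sketch with hypothetical library and member names:

STRSEU SRCFILE(MYLIB/QRPGLESRC) SRCMBR(HELLO) TYPE(RPGLE) OPTION(2)

OPTION(2) opens the member for editing; TYPE sets the member type so SEU applies the matching line prompts.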

See also: How to program an Application

Times they are a changin'

Today, development tasks are most often done on Windows PCs, with IBM-provided tools that pull a local copy of a source file for editing.

Also, since the late 1990s, IBM has been trying to establish Java as the language of choice, along with helper programs for shifting away from the old-fashioned character-based applications (green screens) to a purely web-browser-based way for users to interact with the system.

Also, the AS/400 language, RPG, has been slowly but constantly enhanced and modernised. While most examples in this collection of knowledge are based on ILE RPG IV, newer OS releases were accompanied by compilers that allow escaping the punched-card appeal of RPG more and more. V5 introduced the possibility to write actual code (not definition statements, though) in free form, while as of V7 all parts of an RPG program may be written without being forced to place statements at a certain position within a line. This brings a load of syntactical changes to be learned.

Examples of positional code are slowly vanishing from Internet websites. Looking up how to do things in positional code is getting increasingly difficult. Asking around on mailing lists for positional code examples sometimes yields unhelpful and cynical comments.
Still, many companies have old to very old code to maintain that predates the compilers allowing free syntax. Providing mainly positional RPG examples is meant to help with understanding this old way of writing RPG, while examples for newer techniques are easily found with the internet search engine of your choice.

Weblinks

Footnotes

  1. Control Program Facility
  2. similar to the CISC CPUs in IBM S/370 mainframes of the 1970s-1980s.
  3. According to Frank Soltis in his books Inside the AS/400, and Fortress Rochester.
  4. Disks with two actuators were once built, but these were a development on the fringes and quickly vanished from the market.
  5. The necessity of a solid, and tested backup and restore concept is self-explanatory, but also emphasized in IBM AS/400 documentation.
  6. OS/400 has a function called performance adjustment. This function periodically checks paging activity and adjusts memory allocations between pools, as well as the pool's activity levels.
  7. There is a certain degree of competition for disk I/O, though. As long as both pools need to access data on a common set of disks, I/O latency for both pools will go up, and response time for interactive applications will degrade. Again, this latency increase should happen most often for paging of data, not application code.
  8. Or did in earlier times: Linux currently automatically does an mmap() call if you do an fopen().
  9. A job is defined as work that needs to be done by the system. Work can be done by a single program or by multiple programs.
  10. A concept that also exists on the AS/400. There, they are called service programs.
  11. You don't have a backup? Own fault! No backup, no pity.
  12. In Unix this would be called Single User Mode, with not much more than init and a root-shell running.