z990 NetMessage-protocol-based processor to support element communication interface

IBM Journal of Research and Development, May-Jul 2004, by Axnix, C., Engler, E., Hegewald, S., Hesmer, T., et al.

The communication interface between support element applications and applications running on the zSeries system processors is an essential part of the zSeries system design. For example, the interface is used to load firmware during startup, to perform service actions such as configuring or deconfiguring I/O channels, and for many other functions. It must be fast, reliable, and failsafe. A special hardware interface in the clock chip is used to connect the service infrastructure (support element and cage controller) to the central electronic complex (CEC). Four firmware parties are involved in the communication: the support element, the cage controller, and two firmware layers running on the processors in the CEC (millicode and i390 code). Starting with the z900, the interface between the support element and cage controller was implemented using the NetMessage protocol, whereas the interface between the cage controller and processors still used the legacy service-word communication protocol from previous IBM S/390 models. This meant that the cage controller had to translate the NetMessage protocol from the support element side to the legacy service-word protocol toward the CEC side. In the z990, the communication interface between the support element and the CEC was generally replaced by the NetMessage protocol. This paper describes the new design and structure of the support element to CEC communication.

Introduction

The IBM z990 server is a large-scale server designed to meet the needs of customers at the high end of the marketplace. Such servers are capable of running multiple operating systems at the same time. The maintenance and management of these servers is done concurrently. With the IBM z900 server, an out-of-band system control structure was introduced to manage these complex systems [1]. Such management tasks include testing the hardware before the operating system is loaded, loading the operating systems, concurrent repair, concurrent upgrade, reporting of and recovering from errors, etc. To accomplish these tasks, a distributed service and control subsystem consisting of redundant support elements, cage controllers, and communication links has been introduced (Figure 1). (A more detailed, in-depth description of the system control structure can be found in [1].)

The firmware responsible for executing system management tasks runs on different system components (the processor module itself, the cage controller, and the support element). Certain system management tasks require the cooperation of firmware components on the central electronic complex (CEC), the cage controller (CC), and the support element (SE). For example, to replace an I/O card, it must be removed from the SE view of the system configuration, the firmware running on the processors must be informed that it can no longer access this hardware, and when it can be powered off safely, the CC in the I/O cage where the card resides must be instructed to power off the card. This is just one simple example of a management task, but it shows the need for communication between the involved firmware components. It can be seen in Figure 1 that the support element is connected to the cage controllers via a redundant service network, which is realized by Ethernet connections. The cage controllers in the CEC cage are connected to the processor modules via the XMsg-engine hardware in the clock chip. When firmware components residing on the support element have to communicate with firmware components running on the processors, they require the assistance of firmware on the cage controller because there is no direct hardware connection between the support element and the processors.

With the IBM z990 server, the method by which the support element firmware communicates with firmware components running on the processors has been changed. A new protocol has been introduced, and the design, structure, and implementation of the firmware components involved in communication tasks have been changed.

Motivation

Figure 2 shows the new communication infrastructure designed and implemented for the IBM z990 server. First, a short explanation of the new structure is given, and then the rationale for the new design is enumerated.

The support element is connected to the cage controllers via a redundant Ethernet connection, while the cage controllers are connected to the processor module via the XMsg-engine hardware. The standard Transmission Control Protocol/Internet Protocol (TCP/IP) suite is used for SE communication with the cage controllers. On the side of the CC and processor module, the XMsg-engine handler is used to access the XMsg-engine hardware.

Both TCP and the XMsg-engine handler present the same logical interface: a byte-stream-oriented interface (see [2]). That is, each offers a plain send/receive interface for unformatted data and guarantees that the data arrives in the same order in which it was sent.
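The byte-stream contract described here can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen for illustration only: the names `ByteStream`, `InMemoryStream`, and `read_exactly` do not come from the z990 firmware, and the in-memory buffer merely stands in for either transport (TCP or the XMsg-engine handler). The sketch shows only the two properties named in the text: the data is unformatted, and it arrives in the order sent.

```python
from typing import Protocol


class ByteStream(Protocol):
    """Hypothetical byte-stream interface: plain send/receive of unformatted data."""

    def send(self, data: bytes) -> None: ...

    def receive(self, max_bytes: int) -> bytes: ...


class InMemoryStream:
    """Illustrative stand-in for a transport; it preserves byte order (FIFO)."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def send(self, data: bytes) -> None:
        # Bytes are appended in the order sent; message boundaries are not kept.
        self._buffer.extend(data)

    def receive(self, max_bytes: int) -> bytes:
        # Return up to max_bytes from the front of the buffer (may be fewer).
        chunk = bytes(self._buffer[:max_bytes])
        del self._buffer[:max_bytes]
        return chunk


def read_exactly(stream: ByteStream, count: int) -> bytes:
    """Collect exactly `count` bytes; one receive call may return fewer."""
    collected = bytearray()
    while len(collected) < count:
        chunk = stream.receive(count - len(collected))
        if not chunk:
            raise EOFError("stream ended before the requested bytes arrived")
        collected.extend(chunk)
    return bytes(collected)
```

Because a byte stream keeps no record boundaries, any message structure has to be defined by a protocol layered on top of it, which is the role the NetMessage protocol plays in this design.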

The NetMessage protocol is the protocol for the application level (layers 4-7 in the TCP/IP protocol suite [2]). It is used for communication between the support element firmware components and the firmware components running on the CEC. It defines the format of the data exchanged via TCP/IP and the XMsg engine. While the support element and the CEC firmware use the NetMessage protocol to "understand" each other, the cage controller requires no notion of a protocol. The CC serves only as a bridge/router [2, 3] between the Ethernet and the XMsg-engine hardware. The CC simply has to forward each byte arriving at the TCP protocol level to the XMsg engine. This approach has several advantages; for example, the CC firmware is not affected by any protocol changes in the application layer, and no protocol translation has to be done.
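The bridging role of the CC can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual cage controller firmware: ordinary stream sockets created with `socket.socketpair()` stand in for both the TCP connection and the XMsg-engine handler, and the names `forward_bytes` and `demo` are hypothetical. The point it shows is that the forwarding loop never inspects the payload.

```python
import socket


def forward_bytes(source: socket.socket, sink: socket.socket,
                  chunk_size: int = 4096) -> int:
    """Copy bytes from source to sink until source reports end of stream.

    The payload is never parsed, so the loop is independent of whatever
    application-level protocol the two endpoints speak.
    """
    total = 0
    while True:
        chunk = source.recv(chunk_size)
        if not chunk:  # empty result: the sender closed its side
            break
        sink.sendall(chunk)
        total += len(chunk)
    return total


def demo(payload: bytes = b"\x01\x02opaque application bytes") -> bytes:
    """Send a payload through the bridge and return what the far end receives."""
    # Assumption: socket pairs stand in for the SE-to-CC TCP link and the
    # CC-to-CEC XMsg-engine link; the real hardware path is not modeled.
    se_end, cc_tcp_end = socket.socketpair()
    cc_xmsg_end, cec_end = socket.socketpair()
    try:
        se_end.sendall(payload)
        se_end.shutdown(socket.SHUT_WR)       # signal end of stream to the bridge
        forward_bytes(cc_tcp_end, cc_xmsg_end)
        cc_xmsg_end.shutdown(socket.SHUT_WR)  # signal end of stream to the receiver
        received = bytearray()
        while chunk := cec_end.recv(4096):
            received.extend(chunk)
        return bytes(received)
    finally:
        for sock in (se_end, cc_tcp_end, cc_xmsg_end, cec_end):
            sock.close()
```

Changing the byte layout of `payload` requires no change to `forward_bytes`, which mirrors the advantage the text attributes to the bridge design.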

 
