Harold W. Lawson, Lawson Konsult AB, email@example.com
Sivert Wallin, Teknogram AB, firstname.lastname@example.org
Berit Bryntse, Teknogram AB, email@example.com
Bertil Friman, Friman Data Konsult AB, firstname.lastname@example.org
The properties of the Automatic Train Control System that has provided a reliable and safe function in Sweden since 1980 are described. Via an engineering view of the problem domain, an architecture evolved in the mid-1970s that has been a key factor in the success of ATC. ATC version 1 functioned properly from 1980 to 1993 without a single change in the software. Since 1993, ATC version 2 has continued this outstanding record and has been adapted for new markets and new requirements. In Sweden, there are approximately 1000 ATC locomotive installations of the on-board system. The operating system core has been re-utilized several times for new product versions as well as the "black box" recorder and more than 20 ATC simulators. ATC is examined from the architectural, development and maintenance as well as the verification points of view. Finally, lessons learned from ATC as well as further usage of the concepts in Sweden are reviewed.
The safety of millions of train passengers depends upon reliable safety-related equipment and functions throughout the railway system. One of the important functions is the monitoring of the behavior of train drivers; that is, assuring that they abide by speed limits, signal status and other conditions. There have been several train accidents in Europe and elsewhere during the past twenty years that the proper operation of this function would have prevented. This function, now often referred to as Automatic Train Protection (ATP), has been implemented since the late 1970s in Sweden as the Automatic Train Control (ATC) system.
In this paper, key properties of the Swedish ATC system are presented. In particular, the on-board system conceived and developed by Standard Radio and Telefon AB in the 1970s, now owned by ATSS (Ansaldo Transporti Signal System), and further developed and maintained by Teknogram AB will be examined. The major reasons for the success of this system in providing safe train control for over 20 years are cited.
The paper is co-authored by four people who have participated in different roles in respect to the Standard Radio ATC system; namely, as architect, developers and maintainers, and verifier of the most recent versions of the software.
ATC FUNCTION AND ENVIRONMENT
The ATC on-board equipment is used in conjunction with railway trackside equipment, such as speedboards and signaling system, to maintain and increase the safety of the trains, and also to increase the capacity of the railway system.
To meet the demands of increased efficiency of railway transportation on both existing and new tracks, the train speed must be increased and the trains must operate with shorter intervals. This requirement increases the demands on both the safety system and the train drivers, thus leaving little room for human error. The high degree of accuracy of the ATC system minimizes the risk of driver error. The ATC should be considered as being a complement to the existing optical signal systems. The primary intention is not to control the driver but to lighten his workload, and provide supplementary information that is not available in the optical signal system.
Initially (in 1980 when the first ATC systems were installed), the plan for the Swedish state railways (SJ) was that the train should be driven entirely according to the external optical signals, and that the ATC system should be considered only as a safety back-up. With the advent of the X2000 high-speed trains (200 km/h), it turned out that the optical system was insufficient for presentation of all information needed, e.g. earlier warning for restrictions ahead, and different speeds for various train types. Also, after operational experience with the ATC system had been accumulated, it turned out that the ATC system could be trusted for presentation of information not otherwise available along the track.
The resulting system is a very efficient, robust, and safe combination, well matching more expensive and complicated systems being used elsewhere.
If the driver should lose concentration for a moment, the ATC will then take over the control of the train by applying the brakes. This brake application continues until the driver manually acknowledges to the system that he is once more capable of controlling the train. If the driver should fail to regain control, the ATC will continue to brake the train to a standstill.
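The intervention behavior described above can be illustrated with a small state machine. This is only a sketch under stated assumptions: the names `Mode`, `step` and `acknowledged` are hypothetical, a single permitted speed stands in for the real supervision criteria, and the rule that an acknowledgement is accepted only once the speed is back within the limit is an assumption made here, not a documented property of ATC.

```python
from enum import Enum


class Mode(Enum):
    SUPERVISING = "supervising"   # driver in control, ATC only monitoring
    BRAKING = "braking"           # ATC brake application in progress


def step(mode, speed_kmh, permitted_kmh, acknowledged):
    """Advance the hypothetical supervision logic by one evaluation cycle.

    Returns (new_mode, brake_applied).
    """
    if mode is Mode.SUPERVISING:
        if speed_kmh > permitted_kmh:
            return Mode.BRAKING, True          # driver failed to act: take over
        return Mode.SUPERVISING, False
    # BRAKING: hold the brake until the driver acknowledges.
    # Assumption: the acknowledgement only counts once the speed is safe again.
    if acknowledged and speed_kmh <= permitted_kmh:
        return Mode.SUPERVISING, False
    return Mode.BRAKING, True                  # includes braking to a standstill
```

Without an acknowledgement the function keeps returning `brake_applied = True`, which corresponds to the train being braked to a standstill.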
The vehicle on-board equipment is portrayed in Figure 1 and consists of the following major components:
· An antenna mounted underneath the vehicle that activates the track equipment (transponders) by transmitting a continuous signal and receiving transponder messages to be evaluated and used to supervise the safe travel of the train.
· A set of computer equipment that evaluates the transponder messages, presenting the information to the driver and braking the train to a safe speed level if the driver should fail to take the correct actions, i.e. not braking the train or exceeding speed limits. The driver has to manually cancel each ATC brake application by pushing a brake release button.
· Cab equipment consisting of a driver's ATC panel used by the driver to enter into the ATC system the data that is relevant to that specific train, and all other communication with the ATC equipment. The panel also keeps the driver informed of current speed limits and target speed limits at speedboards and signals ahead.
· Vehicle interfacing devices, such as speedometer connection, main brake pipe pressure sensor and one or more brake valves.
Figure 1. On-board ATC Equipment
The wayside equipment consists of track mounted transponders transmitting messages (telegrams) to the vehicle when activated by the antenna of the vehicle (see Figure 2). Each type of information generates a unique message (telegram). The transponders are combined into groups of minimum two and maximum five transponders. A transponder group can be valid for the current or the opposite direction of travel, or for both travel directions.
The transponders in a group can have either a fixed code or be coded by an encoder connected between the signaling system and the transponder, in such a way that the transponder group can give information corresponding to the current signal aspect to the on-board equipment.
When a vehicle with an active ATC travels over a transponder group, each transponder will be activated by the energy received from the antenna of the vehicle. The coded message is continuously transmitted to the vehicle equipment as long as the transponder is active. A valid combination of transponders will transmit all the information necessary for the vehicle equipment to evaluate the message and take the required action. The on-board equipment will detect either a faulty message or an invalid combination of transponders and notify the driver accordingly.
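The group rules stated above (two to five transponders per group, validity per direction of travel, and rejection of faulty messages or invalid combinations) can be sketched as follows. The telegram layout is not given in this paper, so `Telegram`, its `payload` and `checksum_ok` fields, and the three result strings are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Validity(Enum):
    FORWARD = "forward"   # valid for the current direction of travel
    REVERSE = "reverse"   # valid for the opposite direction
    BOTH = "both"


@dataclass(frozen=True)
class Telegram:
    payload: int          # hypothetical coded content
    checksum_ok: bool     # stand-in for the real integrity check


MIN_GROUP, MAX_GROUP = 2, 5   # group size limits stated in the text


def evaluate_group(telegrams, validity, travelling_forward=True):
    """Classify a received transponder group as 'accept', 'ignore' or 'fault'."""
    if not MIN_GROUP <= len(telegrams) <= MAX_GROUP:
        return "fault"                        # invalid combination of transponders
    if any(not t.checksum_ok for t in telegrams):
        return "fault"                        # faulty message
    if validity is Validity.BOTH:
        return "accept"
    applies = (validity is Validity.FORWARD) == travelling_forward
    return "accept" if applies else "ignore"
```

A result of `"fault"` corresponds to the case where the on-board equipment notifies the driver; `"ignore"` covers a group that is valid only for the opposite direction.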
Figure 2. ATC Transmission System Overview
THREE PROCESSOR REDUNDANCY
In order to provide for fault-tolerance, a three-processor solution with majority logic comparison of outputs is utilized for the on-board system. The same program is executed on all three processors; thus, the redundancy protects primarily against processor hardware failures.
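The principle of majority comparison can be shown with a two-out-of-three voter. This is a software illustration only; how the comparison is actually realized in the ATC equipment is not described here, and the function name `majority_vote` and its return convention are assumptions.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Return (value, disagreeing_channels) for three processor outputs.

    value is None when no two channels agree.
    """
    outputs = (a, b, c)
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        return None, [0, 1, 2]                # no majority: no trusted output
    return value, [i for i, out in enumerate(outputs) if out != value]
```

Because all three channels run the same program, a voter of this kind masks a single hardware fault but cannot detect a design error that all three processors share.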
TIME LINE FOR THE PRODUCT
The Standard Radio and Telefon ATC product became the property of ATSS (Ansaldo Transporti Signal System) in 1990. Since 1984, a significant part of the further development of the product and maintenance has been contracted to Teknogram AB. In the following time line, major product events are presented.
1973 Standard Radio decides to enter the train control market place. Swedish State Railways (SJ) requests proposals on a transmission system
1974 Standard Radio, Philips, Ericsson Signal develop transmission solutions
1975 SJ selects the Ericsson Signal approach for the transmission system. Standard Radio starts work on an on-board system concept. SJ favors the Standard Radio on-board mechanical structure. Work on the software architecture concept begins.
1976 A problem related architecture evolves. Guidance for development, production, testing, and maintenance
1977-79 Standard Radio selected for the on-board system for SJ trains. Development, testing and verification. Contract to Ericsson Signal for on-board system for SL trains only *). Integration of transmission and on-board systems followed by validation.
*) SL – Stockholm’s Local Traffic. Utilizes a different on-board solution based upon N-version programming. Three different program solutions deployed on each of the three processors. Inherited by Elektrisk Byrå AB, ABB Signal AB and finally Adtranz AB.
1980 Installation of ATC1 on SJ locomotives.
1980-93 ATC1 operates successfully without any changes in software
1988-92 ATC2 plan: SJ, NSB*, EB-Signal, Standard Radio-ATSS, Teknogram.
Further development based upon ATC1, testing, verification, validation.
*) NSB - Norwegian State Railways
1993 Installation of ATC2
After this, two further developments of ATC2 followed. ATC2.1 was developed especially for the Västervik line, where radio-based control was employed instead of the transponder system. ATC2.2 was developed for integration in the locomotives that will travel over the Öresund bridge. In this case, Teknogram has also developed an interface PC-board and software based upon the same operating system as ATC2 for communication with the Siemens solution utilized on the Danish railways. This system will begin operation during the summer of 2000 when the bridge officially opens.
In addition to the main ATC products, a separate PC-board and software running under the same operating system was developed to function as the "black box" recorder for ATC. The recorder collects information for up to three days of train operation and includes telegram information and all transitions of speed greater than 2 km per hour. The most recent version of the recorder utilizes flash memories. Earlier versions utilized solid state memories that required constant power.
It was hoped by Standard Radio that ATC1 would be an export product. Unfortunately, this market did not materialize. Several potential customers, including British Railways, examined the product but decided not to buy it. This is very unfortunate, since the system has now been proven to work reliably for train traffic for over 20 years, a truly impressive record. The cost of one single serious accident would most likely pay for the installation of the system.
Since 1990 the solutions utilized in ATC1 and ATC2 have been further exploited by AT Signal System. ATSS along with Teknogram have been involved in several installations of ATC. The installations have included an ATP (Automatic Train Protection) system for Keretapi Tanah Melayu Berhad of Malaysia (installation 1996), ATP for Hamersley Iron Ore Railways in Australia (installation 1998), the ATC system for Roslagsbanan in suburban Stockholm (installation during 2001), and ASES (Advanced Speed Enforcement System) for New Jersey Transit in USA. While some of these versions have been programmed in Ada, they are based upon the same architecture and operating system core concepts.
Further, Teknogram AB has successfully utilized the same architecture and operating system to develop and market more than 20 train simulators. Consequently, the ATC architecture has been the basis for the Teknogram business concept.
In summary, ATC is a truly exceptional example of the reuse of a concept and operating system solution over a period of more than twenty years.
ATC SOFTWARE STATISTICS
As indicated in the time line, two major versions have been developed, along with two minor variations on the second version, for utilization by the Swedish Railways (SJ). Their sizes in terms of number of procedures, number of assembly instructions and number of memory bytes are as follows:
Version    Number of     Number of       Number of
           Procedures    Instructions    Bytes
ATC-1      157           4116            10365 *)
ATC-2      308           10281           26284 **)
ATC-2.1    313           10523           27029 **)
ATC-2.2    339           11178           29522 **)
*) Motorola 6800 microprocessors
**) Motorola 68HC11 microprocessors
The small size, clear structure, and simplicity of the software solution have led to many advantages in respect to verification as well as further development and maintenance as described below.
In 1975, the consultant services of Harold Lawson were contracted by Standard Radio to assist Roger Andersson, project leader, and Sivert Wallin, chief designer, in the conceptualization of the architecture. Following a review of the work done to date on the software, Harold Lawson and Sivert Wallin re-examined the fundamental requirements of the ATC function and developed the problem oriented architecture that has successfully provided product stability as well as a sound basis for further development under the entire twenty year life cycle of the ATC product.
The major conceptual aspect of the design is the treatment of the system as being continuous in time as opposed to being discrete and event driven. Given the fact that a 250 millisecond resolution (dT) of the state of the train in respect to its environment was determined to be sufficient to maintain stability, it became clear that the simplest approach was to simply execute all relevant processes (procedures) during this period of time. So a cyclic time driven approach became the basis for solution.
This simplification meant that the processors only needed to be interrupted by two events: one interrupt to keep track of time (1 millisecond) and one interrupt when information from a transponder is available. The time in the 250 ms dT is more than adequate to perform all processing. Adding more structure to the problem, for example via an event driven operating system approach, would have had negative consequences in terms of complexity and cost as well as reliability and risk, thus affecting safety. The fundamentals of the approach were documented in Lawson, 1975. The operating system organization is illustrated in Figure 3.
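The cyclic, time driven approach can be sketched as a simulation in a few lines. The class and method names below are hypothetical and the code is not derived from the ATC sources; it only illustrates the two interrupt sources and the fixed-order execution of all procedures once per dT. For brevity the cycle is started from the tick handler, whereas real firmware would more likely run it from a background loop that waits for the tick count.

```python
CYCLE_MS = 250   # dT from the text
TICK_MS = 1      # period of the timer interrupt


class CyclicExecutive:
    def __init__(self, procedures):
        self.procedures = list(procedures)   # run in this fixed order every cycle
        self.ms_in_cycle = 0
        self.cycles = 0
        self.pending_telegrams = []

    def on_timer_tick(self):
        """Stand-in for the 1 ms timer interrupt."""
        self.ms_in_cycle += TICK_MS
        if self.ms_in_cycle >= CYCLE_MS:
            self.ms_in_cycle = 0
            self.run_cycle()

    def on_transponder(self, telegram):
        """Stand-in for the transponder interrupt: only buffer the data."""
        self.pending_telegrams.append(telegram)

    def run_cycle(self):
        """Execute every procedure once; no priorities, no preemption."""
        telegrams, self.pending_telegrams = self.pending_telegrams, []
        state = {"cycle": self.cycles, "telegrams": telegrams}
        for procedure in self.procedures:
            procedure(state)
        self.cycles += 1
```

The interrupt handlers do no application work of their own. All decisions are taken inside `run_cycle`, so the order of execution is the same in every cycle regardless of when a telegram arrives.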
Figure 3. Operating System Structure
As development proceeded, it became clear that treating the application software in a "circuit like" manner made sense and led to highly simplified coding of processes (procedures). While it would have been useful to deploy a higher level language in the solution, it was deemed unnecessary due to the low volume of code that was expected. Experience has indicated that this was a reasonable decision at that time. On the other hand, it was decided to comment the code in a higher level language. In earlier versions of the product, the Motorola MPL (a PL/I derivative) was employed. In later versions, a more Pascal-like annotation has been consistently employed. In system tests, the MPL or Pascal versions, respectively, have been executed in parallel with the assembly language version in order to achieve system verification.
As the product concept evolved, the key factors and goals became evident as documented in a comprehensive software plan by Lawson, 1976.
"A comprehensive plan for the specification, development, testing, verification, production and maintenance of the software components of the ATC project is presented. The goal is to produce reliable software parts to complement the three processor Motorola 6800 system so that a trustworthy total system is provided. A further goal is to assure that the software constituent remains reliable under the life time of the product. That is, that future modifications to the software will not affect the reliability due to oversights concerning design features and software component interrelationships."
"The key to a successful software product lies in the ability to decompose the system to be implemented into well defined units such as processes, procedures, blocks, etc. Further, the operation, inputs, and outputs of these units must be well specified and the specification must serve as a control over the implementation, testing, production, and maintenance."
"In the ATC project, the process is the unit to which the system structure has been decomposed. A process should be viewed as a testable component, precisely as a hardware component (integrated circuit). It must have a clear specification and have a well defined component test procedures."
"A system can never be more reliable than its components and their interconnections. Assuming that each software component has been tested, the interconnections of subsystems of components and finally the total system must be developed, tested, and verified systematically."
Thus, it is clear that even at this early point in the product's history, the importance of architecture as a controlling factor for the life-cycle of the product was identified. Even though ownership of the product has changed and its development and maintenance have changed management, the fundamental concepts established in the mid-1970s are still in place and have led to a successful solution for train safety.
DEVELOPMENT AND MAINTENANCE PERSPECTIVE
The early development work was based upon using a PDP-15 computer both for simulation as well as for assembly language translation. The target system based upon Motorola 6800 processors was connected to the PDP-15 so that both procedure and system testing could be well controlled.
Due to the simplicity of the architecture, many advantages were discovered, for example:
- The structure of procedures provided clear points of built-in controls that aided in testing and fault isolation.
- The stack pointer must be returned to the same point in each execution cycle providing a general control of proper cycle execution.
- No wild loops can occur.
- No backward jumps are permitted other than in well controlled loops in procedures.
- Quick reliable changes can be made and verified thus reducing costs.
- The operating system core can easily be reused by removing procedures and incorporating new procedures for new functionality (recorder, simulator).
For ATC2, the target system changed to Motorola 68HC11 processors. The development system was moved from the PDP-15 to a PC-based solution that, as with the older version, also provides for system test via simulation.
Verification is carried out via module testing, code inspection, and system test. Early verifications of ATC were carried out by SINTEF at the Technical University in Trondheim, Norway. Bertil Friman has been involved in verification of ATC2, the latest of which is reported in Friman, 1999. The report describes the verification of the ATC2.2 version that will be used for trains crossing the Öresund bridge between Denmark and Sweden.
The circuit-like software procedures of the ATC system have, since the beginning of the ATC project, been tested by running them in parallel with equivalent software circuits written in a high level language, and comparing the results. Back in 1975-76 when the original ATC was developed, this was done by connecting the target system (6800-based) directly to the bus of a minicomputer (PDP-15). The high level version was then run on the minicomputer, which was also used to control the execution of the target system and to compare the results. The same principle, although more refined, is also used today. The high level version is now written in Pascal and run on a PC. The PC has direct read/write access to the 64k byte memory space of the target system, which is now based on the 68HC11. (A similar system based on the 68331 processor is under development.) This configuration makes it possible to test approximately 1000 value combinations per second. Two million combinations can be tested in roughly half an hour. If a software circuit has a small number of input variables, then it can be tested exhaustively. If the number of input variables is large, then the value ranges are limited to values around min, max and close to the decision points in the code.
Back in 1988, when the major revision of ATC that resulted in ATC-2 was started, it was decided that because of the increased complexity of the program, it would be subject to a thorough and detailed inspection. This inspection was contracted to Bertil Friman at Friman Datakonsult AB. The inspection was mainly done by the use of informal proof techniques. A goal was defined, and then an informal proof was built up to see if it was satisfied.
It was soon noticed that most goals were associated with variables and their contents. A (simplified) goal could for instance be that the variable HS (main signal speed) should always be zero after the passage of a stop signal transponder. Since most goals were associated with variables, the goal-proof-technique was successively replaced by a systematic analysis of individual variables. This analysis was done by tracing all places where a variable could be assigned a new value, and for each such place, finding out the real world conditions that were associated with the variable change. These real world conditions could often be directly checked against sentences in the requirement specification.
Associating real world conditions to places in the code where a variable changes value requires an incremental analysis of variables. First variables that only depend on hardware inputs must be analyzed. Then variables that depend on these variables can be analyzed and so on. Sometimes two or more variables can be dependent on each other in a circular fashion. Analyzing such a loop requires more effort because all involved variables have to be analyzed together.
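The incremental order of analysis can be derived mechanically from the dependencies between variables. The sketch below is not the VTR tool; it is an illustration that uses Tarjan's strongly connected components algorithm, with a hypothetical function name and input format. Variables that depend only on hardware inputs come first, and variables that depend on each other in a circular fashion are returned as one group.

```python
def analysis_order(depends_on):
    """Group variables into an inspection order.

    depends_on maps each variable to the variables it is computed from.
    Each returned group can be analysed once all earlier groups are done.
    A group with more than one member is a circular dependency whose
    variables have to be analysed together.
    """
    index, low, on_stack = {}, {}, set()
    stack, groups = [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in depends_on.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:               # v is the root of a component
            group = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                group.append(w)
                if w == v:
                    break
            groups.append(sorted(group))

    for v in depends_on:
        if v not in index:
            visit(v)
    return groups
```

As a usage example, with `HS` taken from the text and all other variable names invented for illustration, the input `{"HS": ["TELEGRAM"], "TELEGRAM": [], "V_MAX": ["HS", "BRAKE"], "BRAKE": ["V_MAX", "SPEED"], "SPEED": []}` yields `[["TELEGRAM"], ["HS"], ["SPEED"], ["BRAKE", "V_MAX"]]`. The last group is a loop that must be analysed as a unit.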
The variable based inspection method has been very successful both for ironing out special case errors and for enhancing the confidence in the ATC system.
Johan Fredrik Lindeberg and Øystein Skogstad at SINTEF encouraged at an early stage the development of CASE tools to support the code inspection. Several such tools have been developed. The most important is VTR (=Variable TRacer) which is directly associated with the variable based inspection method.
The bulk of the system testing of ATC is done with the use of a simulator. The ATC system is tested by simulating the train start-up and travel on the rails that are equipped with transponders. The simulator has handles, buttons and indicators that correspond to handles, buttons and indicators in the locomotive cabin. The transponders are simulated with a file that contains their positions (from the starting point) and telegrams. A new scenario (use case) is tested by editing a track file and executing the new version on the simulator. After a track file has been changed, it can be run on the simulator instantly. On some occasions, an interesting scenario has been discussed on the phone and at the same time been tested on the simulator. This is a superb troubleshooting mechanism. Many parties have contributed track files including Teknogram, ATSS, Banverket and Adtranz. Each track file is accompanied by a specification of how the ATC system shall react at each place on the route. ATSS has an archive containing hundreds of track files that can be used for the validation of new versions of the ATC system.
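The idea of a track file with an accompanying specification of expected reactions can be illustrated with a toy replay. The actual track file format is not described in this paper, so the line format `<position_m> <telegram> <expected_limit>`, the `Lnn` telegram notation and the function `toy_atc` are all invented for this sketch; `toy_atc` merely stands in for the system under test.

```python
def parse_track_file(text):
    """Parse lines of the hypothetical form '<position_m> <telegram> <expected_limit>'."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # allow comments and blank lines
        if not line:
            continue
        position, telegram, expected = line.split()
        entries.append((int(position), telegram, int(expected)))
    return sorted(entries)                       # route order from the starting point


def toy_atc(limit, telegram):
    """Stand-in for the system under test: telegram 'Lnn' sets the limit to nn."""
    return int(telegram[1:]) if telegram.startswith("L") else limit


def run_scenario(entries, atc=toy_atc, initial_limit=0):
    """Replay the transponders in route order and collect deviations."""
    limit, deviations = initial_limit, []
    for position, telegram, expected in entries:
        limit = atc(limit, telegram)
        if limit != expected:
            deviations.append((position, telegram, expected, limit))
    return deviations
```

Because a scenario is just a text file, changing a position or a telegram and rerunning `run_scenario` takes seconds, which matches the quick edit-and-run cycle described above.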
Quick cycle-time simulation has been a key ingredient in the ATC project since its beginning. The first simulator was a program that ran on the same PDP-15 mini computer that was used to assemble the code. It was directly, over the PDP-15-bus, connected to the development version of the ATC system. Today, the simulator uses a 68HC11 CPU with essentially the same operating system and program structure as the ATC program itself. A PC is used for storing the track files and for controlling the parameters of the simulation through the screen and keyboard.
There are several lessons that can be learned from the ATC product experience. These lessons could well be applied in other products, particularly safety critical computer-based systems. Some of the most significant lessons are as follows:
Architecture is a key aspect
The definition and consequent deployment of a problem relevant architecture is a key factor for success. While it is important to have well defined work processes for all life cycle stages of a product, a good architecture reduces the need for heavy processes with multiple activities and tasks. Decision-making is simplified when decisions are bounded by the architectural concepts.
Engineering view is superior to software view
Instead of creating significant quantities of software, an engineering view of the functions to be performed was taken. The analogy between hardware circuits and the logic of the software, later called software circuits (see Hansson et al., 1996, 1997), provides a strong, simplifying solution. We can conclude that software, especially in large quantities, is dangerous but can be controlled with the proper engineering viewpoint.
Do not add more structure than necessary
Adding more structure to a solution than necessary for achieving desired behaviors leads to unnecessary complexity thus costs and risks. This pitfall is very common, even for safety critical systems. Operating systems and programming languages that provide elaborate structures for interrupt handling, multi-tasking, etc. complicate verification, further development, and especially maintenance. In addition, complex methods and tools are often deployed. All of these supporting methods and tools implicitly become a part of the product. Together they often are an overkill solution leading to increased cost and risk.
Verification is a vital aspect of safety critical systems
All safety critical systems must be verified in respect to their specifications and safe behavior in various situations. The combination of module testing, code inspection, and system test via simulation has proved to be an adequate approach for ATC. Simplicity in the architecture and code structure simplifies verification and contributes significantly to safety verification.
FURTHER DEVELOPMENT OF THE CONCEPTS
The architectural concepts developed for ATC have been used in other projects in Sweden. During the early 1990s, Harold Lawson, the ATC architect, participated in the Nutek-sponsored Prometheus project for the automotive industry. The engineering view of software was once again proposed as a means of developing the logic for safety critical functions in vehicles in the BASEMENT system (see Hansson et al., 1996, 1997). A methodology based upon the use of "software circuits" evolved during this project.
The work on BASEMENT also led to the development, by Arcticus AB, of an operating system concept called Rubus (see Lundbäck, Eriksson and Lawson, 1995). Rubus identifies two types of tasks to be performed; namely time driven (called Red) and event driven (called Blue). As in the ATC solution, execution is carried out in time intervals (dT), where the Red tasks are always executed first and the time remaining in dT is available for Blue task execution. Rubus has been successfully applied in developing several embedded system products including the Limited Slip Coupling device developed by Haldex Traction AB and now incorporated in all new Volkswagen automobiles as well as for medical equipment at Siemens-Elema. Arcticus has also produced supporting development tools and utilized them in providing embedded systems solutions for Volvo Construction Machines.
Lawson, 1990 and Lawson, 1992b, reported on the importance of architectural philosophy as a key to the engineering of computer-based systems. ATC was cited as one of the case studies in these articles. Ideas related to how to evolve the concepts into a complete resource adequate model called CYCLONE have been reported (see Lawson, 1992a). Further development of the CYCLONE model for distributed and parallel execution has also been reported (see Lawson and Svensson, 1993).
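The division of a dT interval between Red and Blue tasks can be sketched as a simple planning function. This is not the Rubus API; the function name, the representation of tasks as `(name, cost_ms)` pairs, and the first-in-first-out treatment of deferred Blue tasks are assumptions made purely for illustration.

```python
def plan_interval(red_tasks, blue_queue, dt_ms):
    """Plan one dT interval: all Red tasks first, then Blue tasks while time remains.

    Tasks are (name, cost_ms) pairs. Returns (schedule, deferred_blue).
    """
    red_cost = sum(cost for _, cost in red_tasks)
    if red_cost > dt_ms:
        raise ValueError("Red tasks do not fit in dT")
    schedule = [name for name, _ in red_tasks]
    remaining = dt_ms - red_cost
    deferred = []
    for name, cost in blue_queue:
        if not deferred and cost <= remaining:
            schedule.append(name)
            remaining -= cost
        else:
            deferred.append((name, cost))    # keep queue order for the next interval
    return schedule, deferred
```

The check on `red_cost` reflects the requirement that the time driven work must always fit within dT. Blue work is best effort and is carried over when the interval is full.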
The Automatic Train Control system produced by Standard Radio in the late 1970s has proven to be a successful product. It is based upon an engineering view of the problem domain that led to a straight-forward architecture. The architectural concept has been a key factor in relation to further development, maintenance and verification of this successful product.
The concepts used in the ATC product have been further developed. However, given the success of the approach, it is surprising that more safety critical systems are not constructed in a similar manner.
Several people have had important roles related to ATC. In this regard, the authors wish to thank Bengt Sterner, Bengt Wenning, Roger Andersson, Johan Fredrik Lindeberg, Øystein Skogstad, Lennart Backing, Folke Nordlander, and Bertil Sjöbergh.
B. Friman, 1999. Software Validation Inspection Report for Combined Danish-Swedish ATC System Version 2.2, Validation report, June 4, 1999. (ATSS Company Confidential)
H. Hansson, H. W. Lawson, M. Strömberg, and S. Larsson, 1996, BASEMENT: A Distributed Real-Time Architecture for Vehicle Application, Real Time Systems, The International Journal of Time-Critical Computing Systems, Vol. 11, No. 3, November, 1996.
H. Hansson, H. W. Lawson, O. Bridal, C. Eriksson, S. Larsson, H. Lön, and M. Strömberg, 1997. BASEMENT: An Architecture and Methodology for Distributed Automotive Real-Time Systems, IEEE Transactions on Computers, Vol. 46, No. 9, September, 1997.
H.W. Lawson, 1975. Recommendations for Software Organization and Execution Control for the MPU, Consultants Report to Standard Radio and Telefon AB, October 23, 1975
H.W. Lawson, 1976. Preliminary Proposal for a Comprehensive Software Plan for ATC, Consultants Report to Standard Radio and Telefon AB, November 9, 1976
H.W. Lawson, 1990. Philosophies for Engineering Computer-Based Systems, IEEE Computer, Vol. 23, No. 12, pp. 52-63, December, 1990.
H.W. Lawson, 1992a. CYCLONE - An Approach to the Engineering of Resource Adequate Cyclic Real-Time Systems, Real Time Systems, The International Journal of Time-Critical Computing Systems, Vol. 4, No. 1, February, 1992.
H. W. Lawson, 1992b. Engineering Predictable Real-Time Systems, appearing in Real Time Computing, Springer Verlag, 1994, Lectures from a NATO Advanced Study Institute, October, 1992.
H. W. Lawson and B. Svensson 1993. An Architecture for Time-Critical Distributed/Parallel Processing, Proceedings of the EUROMICRO Workshop on Parallel and Distributed Processing, IEEE Computer Society Press, January 1993.
K-L Lundbäck, C. Eriksson, and H.W. Lawson, 1995. A Real-Time Kernel Integrated with an Off-Line Scheduler, Proceedings of the 3rd IFAC/IFIP Workshop on Algorithms and Architectures for Real-Time Control, Ostend-Belgium, 1995.