Definitely not Windows 95: What operating systems keep things running in space?


ESA’s Solar Orbiter mission will face the Sun from inside Mercury’s orbit at its closest approach.

ESA/ATG medialab

ESA’s recently launched Solar Orbiter will spend years in one of the least welcoming places in the Solar System: the Sun. During its mission, Solar Orbiter will get 10 million kilometers closer to the Sun than Mercury does. And, remember, Mercury is close enough to sustain temperatures of up to 450°C on its Sun-facing surface.

To withstand such temperatures, Solar Orbiter will rely on an intricately designed heat shield. This heat shield, however, will only protect the spacecraft when it is pointed directly at the Sun – there is not enough protection on the sides or the back of the probe. ESA has therefore developed a real-time operating system (RTOS) for Solar Orbiter that can act under very strict requirements. The maximum allowable off-pointing from the Sun is only 6.5 degrees. Any off-pointing greater than 2.3 degrees is acceptable only for a very short period of time. When something goes wrong and a dangerous fault is detected, Solar Orbiter will have only 50 seconds to react.

“We have extremely high requirements for this mission,” says Maria Hernek, Head of Flight Software Systems at ESA. “Typically, rebooting a platform like this takes about 40 seconds. Here we had a total of 50 seconds to find the problem, isolate it, get the system back up and running, and take recovery action.”

I’ll say it again: this operating system, located far out in space, needs to reboot and recover on its own within 50 seconds. Otherwise, Solar Orbiter is toast.

Bone billiard ball

To deal with such unforgiving deadlines, spacecraft like Solar Orbiter are almost always run by real-time operating systems, which work in a totally different way from the ones you and I know from an average laptop. The criteria by which we judge Windows or macOS are pretty straightforward. They perform a calculation, and if the result of that calculation is correct, the task is considered to have been performed correctly. The operating systems used in space add at least one more central criterion: a calculation must be done correctly within a strictly specified period. When a deadline is not met, the task is considered failed and is terminated. And in spaceflight, a missed deadline quite often means that your spacecraft has already turned into a fireball or strayed into an incorrect orbit. There is no point in continuing to process such tasks; everything must keep to a very precise clock.
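To make that extra criterion concrete, here is a minimal sketch in Python. It is not flight software and not how any particular RTOS is built; the names `run_with_deadline` and `sum_to` are made up for illustration, and a "tick" is simply modeled as one step of a generator. The point it shows is that a correct answer delivered late still counts as a failure, and the task is abandoned.

```python
def run_with_deadline(task, deadline_ticks):
    """Advance a generator-based task one tick at a time.

    Returns ("ok", result, ticks_used) if the task finishes in time,
    otherwise abandons it and returns ("missed", None, deadline_ticks).
    """
    ticks = 0
    while ticks < deadline_ticks:
        ticks += 1
        try:
            next(task)                    # one tick of work
        except StopIteration as done:
            return ("ok", done.value, ticks)
    task.close()                          # deadline passed: terminate the task
    return ("missed", None, ticks)


def sum_to(n):
    """Hypothetical workload: adds one number per tick, then returns the sum."""
    total = 0
    for i in range(1, n + 1):
        total += i
        yield                             # give control back after each tick
    return total


print(run_with_deadline(sum_to(3), deadline_ticks=5))  # ('ok', 6, 4)
print(run_with_deadline(sum_to(3), deadline_ticks=3))  # ('missed', None, 3)
```

In the second call the sum would still have come out as 6 one tick later, but by real-time rules that no longer matters.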

Time, as measured by that clock, is divided into discrete ticks. To simplify, space operating systems are typically designed so that each task is performed within a set number of allocated ticks. It may take three ticks to read the data from the sensors; four more ticks are devoted to firing the engines, and so on. Each possible task is assigned a specific priority, so that a higher-priority task can preempt a lower-priority one. That way, a software designer knows exactly which task is going to be performed in a given scenario and how long it will take to complete.
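The tick-and-priority idea can be sketched as a toy fixed-priority, preemptive scheduler. This is only an illustration under simple assumptions: the task names, the priority numbers (larger means more urgent), and the release ticks are invented, and only the three-tick and four-tick budgets come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int   # assumption: larger number = more urgent
    release: int    # tick at which the task becomes ready
    ticks: int      # ticks of work the task has been allocated


def schedule(tasks, horizon):
    """Return which task runs in each tick, always picking the most urgent ready task."""
    remaining = {t.name: t.ticks for t in tasks}
    timeline = []
    for tick in range(horizon):
        ready = [t for t in tasks if t.release <= tick and remaining[t.name] > 0]
        if not ready:
            timeline.append("idle")
            continue
        current = max(ready, key=lambda t: t.priority)
        remaining[current.name] -= 1
        timeline.append(current.name)
    return timeline


tasks = [
    Task("read_sensors", priority=1, release=0, ticks=3),
    Task("fire_engines", priority=2, release=1, ticks=4),
]
timeline = schedule(tasks, horizon=8)
print(timeline)
# ['read_sensors', 'fire_engines', 'fire_engines', 'fire_engines',
#  'fire_engines', 'read_sensors', 'read_sensors', 'idle']
```

The sensor read starts first, gets preempted the moment the engine task is released, and resumes only once the engines are done. Run it again with the same inputs and the timeline is identical, which is exactly what the designer wants.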

To compare this with the operating systems we are all familiar with, just look at any speed comparison between modern smartphones. In the one made by EverythingApplePro, the iPhone XS Max and the Samsung S10 Plus go head-to-head opening a set of popular apps. Before the test, both phones are restarted and their caches are cleared. The Samsung opens all the apps in 2 minutes 30 seconds and the iPhone in 2 minutes 54 seconds. In the second round, all the apps are closed and reopened without restarting or clearing RAM. Because the apps are still in RAM, the Samsung finishes in 46 seconds, and the iPhone does it in 42 seconds. That’s a huge difference of roughly two minutes between the first try and the second. But if the phones ran the kind of real-time operating system used for spaceflight, opening those apps would take exactly the same amount of time no matter how many times you tried it, down to the millisecond.

Beyond timing, space operating systems have more tricks up their sleeves. Real-time operation is one thing; determinism is another. If you got Craig Federighi to take part in one of these speed comparisons, gave him full access to the iPhone about to be tested, and asked him to predict exactly how long it would take that iPhone to complete the test, he would probably have no idea. Sure, he would probably say something like “fast” or “fast enough,” or even “blazingly fast,” but nothing more specific than that. Neither iOS nor Android is a deterministic system. The number of factors that could potentially affect the speed results is so huge that it’s next to impossible to make such accurate predictions. But if the phone were running a space-grade operating system, an engineer with access to the system would know exactly what causes what, and in what sequence, and could calculate the exact time needed for a given task. Space-grade software must be fully predictable and operate within very precise time frames.
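What "calculating the time in advance" can look like is shown in the sketch below. It uses textbook response-time analysis for periodic tasks under fixed-priority preemptive scheduling, which is a standard technique from the real-time literature and not something the article attributes to ESA or any specific mission. The task costs and periods are invented numbers, each task's deadline is assumed to equal its period, and `worst_case_response` is a hypothetical helper name.

```python
def worst_case_response(tasks, index):
    """Worst-case response time, in ticks, of tasks[index].

    tasks is a list of (cost, period) pairs sorted from highest to lowest
    priority. Returns None if the task cannot be guaranteed to finish
    within its period (assumed here to be its deadline).
    """
    cost, period = tasks[index]
    higher = tasks[:index]
    r = cost
    while True:
        # Each higher-priority task can interrupt ceil(r / its period) times.
        interference = sum(-(-r // t) * c for c, t in higher)
        nxt = cost + interference
        if nxt > period:
            return None
        if nxt == r:
            return r          # fixed point reached: this is the bound
        r = nxt


# (cost in ticks, period in ticks), highest priority first; invented values.
tasks = [(1, 4), (2, 6), (3, 12)]
bounds = [worst_case_response(tasks, i) for i in range(len(tasks))]
print(bounds)  # [1, 3, 10]
```

With these numbers the lowest-priority task is guaranteed to finish within 10 ticks of being released, comfortably inside its 12-tick period, and that figure is known before the software ever runs.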

Shoot the Moon (and beyond) with VxWorks

During the Apollo era, operating systems were tailor-made for each mission. Of course, some of the code was reused – parts of the software designed for the Apollo program were carried over to Skylab and the Shuttle program, for example. But for the most part, things had to be built from scratch.

Eventually, NASA’s preferred operating system solution came from Wind River, a company based in Alameda, California. Wind River released a fully operational commercial real-time operating system called VxWorks in 1987. Although VxWorks was not the first such system, it quickly became the most widely deployed of them all, and it soon caught the eye of NASA mission designers.

The first mission to fly VxWorks was the Clementine Moon probe, also known as the Deep Space Program Science Experiment. In the early 1990s, Clementine marked NASA’s move away from giant Apollo-style programs. Everything was supposed to be light, developed quickly, and on a tight budget. One of the design choices made for the Clementine probe was therefore to use VxWorks, and the system made a good enough impression to get a second date: VxWorks was also the choice for the Mars Pathfinder mission.

But all was not rosy for this RTOS. One bug – the priority inversion problem – caused a lot of trouble for NASA’s ground control team. Shortly after landing, the Pathfinder system began to reboot for no apparent reason, delaying the transmission of collected data to Earth. It took three weeks to find the problem and another 18 hours to fix it; the problem turned out to be buried deep in the mechanics of VxWorks.
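Priority inversion is easier to see in a toy model than to describe. The sketch below is not Pathfinder's code and does not use any VxWorks API; the three tasks, their priorities, and their tick counts are all invented. A low-priority task grabs a shared lock, a high-priority task then needs the same lock, and a medium-priority task that never touches the lock keeps the low one from finishing. The `inherit` flag switches on priority inheritance, a widely used remedy in which the lock holder temporarily borrows the priority of whoever is waiting on it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # assumption: larger number = more urgent
    release: int       # tick at which the task becomes ready
    ticks: int         # ticks of work needed
    needs_lock: bool   # True if all of its work happens while holding the shared lock


def simulate(tasks, inherit, horizon=20):
    """Return {task name: tick in which it finished} for a single shared lock."""
    remaining = {t.name: t.ticks for t in tasks}
    owner = None
    finished = {}
    for tick in range(horizon):
        ready = [t for t in tasks if t.release <= tick and remaining[t.name] > 0]
        # A task is blocked if it needs the lock and someone else holds it.
        blocked = [t for t in ready if t.needs_lock and owner not in (None, t.name)]
        runnable = [t for t in ready if t not in blocked]
        if not runnable:
            continue

        def effective(t):
            # With inheritance, the lock holder runs at its waiters' priority.
            if inherit and t.name == owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        current = max(runnable, key=effective)
        if current.needs_lock and owner is None:
            owner = current.name               # acquire the lock
        remaining[current.name] -= 1
        if remaining[current.name] == 0:
            finished[current.name] = tick
            if owner == current.name:
                owner = None                   # release the lock
    return finished


tasks = [
    Task("low", priority=1, release=0, ticks=3, needs_lock=True),
    Task("high", priority=3, release=1, ticks=1, needs_lock=True),
    Task("medium", priority=2, release=1, ticks=4, needs_lock=False),
]
print(simulate(tasks, inherit=False)["high"])  # 7
print(simulate(tasks, inherit=True)["high"])   # 3
```

Without inheritance, the most urgent task in the system waits behind four ticks of medium-priority work it has nothing to do with. With inheritance, it finishes four ticks sooner, because the low-priority lock holder is allowed to get out of the way first.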

