Service unavailability on 17 May 2023: the incident report


On Wednesday, 17 May 2023, a connectivity problem disrupted the RESTENA network, affecting the availability of Restena's services. With the problem identified and now solved, Restena has published the corresponding incident report.

The Restena Foundation operates its core services redundantly across at least two different locations, to keep both the probability and the impact of outages as small as possible.

A number of technical elements make sure that the two locations appear logically as a single one, so that the service is provided to end users irrespective of the actual location.

On Wednesday, 17 May 2023, one of these key elements experienced a hardware issue. The issue was identified and the element was scheduled for replacement under controlled conditions, which was expected to be a routine operation that had already been carried out before.

This routine operation, however, triggered inexplicable, catastrophic side effects at both locations simultaneously, so that none of the redundant services worked at either location. The accompanying symptoms were inconclusive, so fault finding and restoration to normal operation took an extended amount of time.

Restena takes this matter very seriously. Beyond the immediate short-term measures taken to restore services to working order, its technical team is already discussing possible architectural improvements to minimise the possibility of such an issue recurring.