Article by: Dave Anderson, Digital Performance Expert, Dynatrace
“The outage that Facebook users are experiencing is just another example of how difficult it’s becoming to keep the world’s software working perfectly. It’s hard for consumers to understand or even care about the complexity involved in delivering a constant flow of app improvements – but more often than not, problems like these are caused by some sort of software update, as appears to have been the case this time.
“The reality is that developing apps for the hundreds of variations in devices and operating systems is a never-ending cycle of release, monitor, fix, repeat, and this process is becoming harder to manage. Today’s applications live in complex enterprise cloud ecosystems where a tiny and seemingly insignificant change to a single line of code can have far-reaching implications for user experience.
“Adding to the challenge, software release cycles that were previously measured in weeks shrank to days and have now been reduced to just seconds. These very rapid release cycles are used to fix bugs, optimise the app and make sure security is up to standard, which is likely why we’re seeing the issue with Facebook’s Android app, following what could have been just a minor update.
“The scale of that challenge has simply become too great for human capabilities to overcome. That’s why organisations are increasingly turning to monitoring and intelligence platforms that can provide real-time situational awareness. At the heart of these platforms is AI that can pinpoint and even predict problems before they hit. To take this even further, some organisations are already using that intelligence to enable application self-healing, which can prevent outages like this from occurring without the need for manual intervention.”