SRE

Breaking Clouds

Cloud infrastructure is a necessity in our modern digital world. Understanding and preparing for failures in that infrastructure is critical for the reliability of our services. Failures can be viewed as learning opportunities and as a way to improve our system design. They can inform proactive problem-solving, foster effective incident response, and guide how we approach future design challenges. Chaos Engineering plays a vital role in testing the resilience of our systems.

Above-the-line / Below-the-line framework

An introduction to the "Above-the-line / Below-the-line" framework and why the way you look at your system's design is mostly wrong.