Book Summary

Book Summary: Becoming an Effective Software Engineering Manager by James Stanier

Introduction: The book's introduction gives an overview of the software engineering manager role and the skills and qualities needed to excel in it. The author emphasizes that software engineering managers must be effective communicators, strategic thinkers, and leaders who can work collaboratively with their team members, stakeholders, and […]


Book Summary: SRE, Part 4, Best Practices for Building Monitoring and Alerting

Monitoring is a crucial aspect of Site Reliability Engineering (SRE) because it allows teams to detect, diagnose, and resolve issues in distributed systems. In this article, we’ll explore the principles and best practices for monitoring distributed systems. First principle: Measure what matters. Teams should identify key performance indicators (KPIs) that directly impact user […]


Book Summary: SRE, Part 3, Service Level Indicators (SLIs), Objectives (SLOs), and Agreements (SLAs)

In this article, we are going to learn the core terminology of Site Reliability Engineering (SRE). It’s important to understand these terms because they are widely used in the software industry today. I know that learning terminology might sound boring or complex, but I will try to make it as simple and practical as possible.


Book Summary: Site Reliability Engineering, Part 2, Error Budgets and Service Level Objectives (SLOs)

It would be nice to build 100% reliable services, ones that never fail, right? Absolutely not. Doing so would be a bad idea because it is very expensive and it limits how fast new features can be developed and delivered to users. Also, users typically won’t notice the difference between […]


Book Summary: Site Reliability Engineering, Part 1, How a service would be deployed at Google scale

How do you deploy an application so that it works well at large scale? Of course, there is no easy answer to such a question. It would probably take an entire book to explain. Fortunately, in the Site Reliability Engineering book, Google briefly explains what it might look like. They explain how to deploy a sample service […]
