Site Reliability Engineering
Measuring and Managing Reliability


Below are the top discussions from Reddit that mention this online Coursera course from Google Cloud.

Offered by Google Cloud. Service level indicators (SLIs) and service level objectives (SLOs) are fundamental tools for measuring and ...


Taught by
Google Cloud Training

and 13 more instructors

Offered by
Google Cloud

Reddit Posts and Comments

0 posts • 6 mentions • top 6 shown below

r/devops • comment
11 points • 2fplus1

r/devops • comment
1 points • bodiug

I recommend doing and progress from there.

r/sre • comment
1 points • DevAtHeart

The SLO article on the Google blog is really nice. I also learned there is a Coursera course on SLOs. Does anyone have feedback on that one?

r/webdev • comment
1 points • nwss00

You don't need pure coding courses, as your experience will get you up to speed quickly with whatever coding tasks are thrown your way.

You need a course on often neglected but crucial topics: deployment, reliability and resiliency.

Topics such as blue/green deployment, global load balancer configuration, horizontal & vertical scaling, ensuring no single point of failure, etc...

This course by Google Cloud is very good.

r/sre • comment
1 points • ali_str

This is based on the book, with some examples and exercises:

r/sysadmin • comment
1 points • philipstorry

Back in the 90's I pushed myself through certifications for messaging platforms (specifically Exchange Server and Lotus Notes). I paid for the first few, and later my employer paid for them.

I haven't bothered with certifications since the mid 2000s.

I suspect that I came to the same conclusion that much of the industry did - a qualification is at best a baseline, and at worst just a slip of paper.

The natural thing for me to do when I got my MCP in Exchange Server - according to Microsoft - would be to continue studying and do about four or five more exams to get an MCSE. I didn't do that because it was a steep price in both time and money, and not much benefit. Paper MCSEs were being churned out of mills around the world, devaluing the qualification in much the same way that they'd done to Novell's CNE and CNA before then.

If I learned every possible obscure corner of Windows 2000 Server, it might get me up to speed a little faster in a new environment - but probably not. Most of the problems you're going to encounter will be specific to that environment - network layout, bandwidth between sites, architectural decisions taken much earlier that are now showing their age. These are not things that are in exams, and whilst exams may help you identify them, the fix process will likely be slow anyway.

So you only need to know a solid, usable and - most importantly - reusable core of most technologies. Everything else you can research.

Even when you do have prior knowledge and say immediately "I know what this is", it's wise to make a few other checks just in case you're on the wrong track. Thinking you know the answer and immediately acting without verification can sometimes make problems worse.

So with this in mind, I think that today what you need to do is:

  1. Demonstrate that you can learn things quickly.
  2. Demonstrate that you can learn at least one complex area (almost) entirely.
  3. Show that you can use what you have learned to make changes.

To that end, I'd recommend everyone have a specialisation at some point in their career (it doesn't need to be maintained rigorously), and that they round themselves out by taking short and small courses.

In that regard something like CBT Nuggets or Udemy can be effective, but only if time for learning is given by management.

Every environment has its own challenges, whether technical, procedural or cultural. Qualifications merely give a baseline of product knowledge, and aren't even reliable for that. So we should minimise the time we spend on them.

The rest of the world has known that for years, of course. There's an age-old half-joke of "How did my History degree qualify me to be a Procurement Manager?" (Or some similar job.) The answer is that it showed that they could learn, that they could organise themselves, and that they could apply themselves. The business world long ago realised that training is often an indicator, and not much more. We should take the hint.

By the way, if you're looking for a good course to do, I recommend the Coursera course "Site Reliability Engineering: Measuring and Managing Reliability". It was created with Google, takes about 4 weeks, and was a well-delivered course.

The reason I recommend it is that I'm a traditional sysadmin of 25 years in the industry, who may or may not have to move to some kind of DevOps role in the future. I took that course because I wanted to see how things were different, and it changed the way I viewed my IT infrastructure. I now look at my Nagios box and ask "what am I measuring, and why, and how is it useful?". When I retire that box I will be changing how we do our monitoring...

That's the kind of training we should hope for. Not something where you can say "This is what I learned", but where you can say "This is what I learned and what I now want to change."

Short courses on scripting, or on the fundamentals of technologies not yet used in your environment, are a lot more likely to bring about that kind of result.