Guru Meditation
    ____  ____   __   ____ ___ _____ ___  ____   ____ 
   |___ \| ___| / /_ | __ )_ _|_   _/ _ \|  _ \ / ___|
     __) |___ \| '_ \|  _ \| |  | || | | | |_) | |  _ 
    / __/ ___) | (_) | |_) | |  | || |_| |  _ <| |_| |
   |_____|____/ \___/|____/___| |_(_)___/|_| \_\\____|

  1. You should not base your IT strategy on the knowledge level of an airline magazine.
  2. Absence of a signal is itself a signal.
  3. Nobody wants backups, everybody wants restores.
  4. The severity of an incident is measured by the number of rules broken in resolving it.
  5. The only other person who knows how this works is also on vacation.
  6. Turning things off permanently is surprisingly difficult.
  7. Big data is much bigger than you think. Think carefully before using Big Data solutions for solving commodity problems.
  8. Naive usage of cloud infrastructure is, in a very high number of situations, an expensive and limited approach.
    If you are serious about the non-functional attributes of your system, this has a huge impact on your software and system architecture.
  9. Without strong IT governance/strategy, the technology hipsters and product owners will efficiently implement a cloud platform vendor lock-in.
  10. Do not run your Git repositories (platform and GitOps) on the same cloud that runs your production system.
  11. The most important cloud use case: If your corporate IT is slow or horrible, the cloud can save your ass.
  12. strace(1)/ktrace(1) doesn't lie. Unless somebody's been playing LD_PRELOAD games.
  13. If they promise to implement and roll out the "security things" after project launch, accept that it's not gonna be good anymore.
  14. You're bound by the CAP theorem much more often than you may think.
  15. The hardest problem in SRE is fighting the urge to solve a different, more interesting problem than the one at hand.
  16. One in a Million is next Tuesday.
  17. "Ancient" is a very relative term when it comes to software and protocols.
  18. Do 👏 Not 👏 Monkey 👏 Around 👏 With 👏 /etc/hosts.
  19. XML will exist as long as YAML and JSON do not have standardized support for schemas, queries, document conversion and programming-language-independent structural and streamed processing.
    Dear Hipster, think how awesome it would be if you could have automatic formatting, syntax highlighting, validation, assisted completion and automatic conversion for your Kubernetes YAMLs?
  20. The source you're looking at is not the code running in production.
  21. Yes, it's DNS! DNS is the Achilles heel of many systems. Gain a deep understanding and incorporate it deeply and carefully into the concept of your platform.
  22. Yes, it's MTU when it's not DNS!
  23. Very few operations are truly idempotent.
  24. There are very few network restrictions creative and determined use of ssh(1) port forwarding can't overcome.
  25. NTP synchronization being defective may not be a root cause, but it sure didn't help.
  26. Take your time when setting up clusters and get to know the system in depth, otherwise the cluster will become much more unstable or inefficient than an unclustered system.
  27. There is very little software that is simultaneously stable, secure, high-performance and usable with little knowledge.
  28. DockerHub images and Helm charts are 98% not usable without extra effort, because you still have to do your homework regarding security, reliability, efficiency and performance.
  29. When setting up systems, you're at 30% of the configuration and engineering effort if the system does functionally what it is supposed to do.
    The remaining time needs to be spent on non-functional requirements (security, stability, performance, ...).
  30. Autoscaling only makes sense within narrow limits.
    Without deep planning (e.g. resource limits of your subsystems) of the maximum scaling, it is a sure way to unnecessarily reduce the availability and stability of your system.
  31. The sum of all fan-out resource pools of an application tier should be less than the sum of the fan-in resource pools
    (HTTP connection pools < database connection pools).
  32. The sum of all fan-out resource pools of an application tier should be less than the sum of fan-out resource pools of the next application tier
    (database connection pools < maximum number of database connections).
  33. Scaling applications in the cloud is more complicated than you think. With the same costs, it's easier with your own infrastructure than the marketing departments of hyperscalers would have you believe.
  34. You build it, you ruin it. Quality and development speed suffer when infrastructure development is left in the hands of teams that see it as a secondary topic. Intensive collaboration between feature teams and SRE teams creates powerful organizations.
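
  Rules 30 to 32 boil down to arithmetic that can be checked before the autoscaler finds out the hard way.
  The following is a minimal sketch, not a definitive implementation: it reads the parenthetical examples of rules 31 and 32 literally,
  and the Tier class, its field names and all numbers are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Hypothetical description of one application tier (all values are made up)."""
    max_replicas: int            # upper bound configured for autoscaling
    http_pool_per_replica: int   # HTTP connection pool size per instance
    db_pool_per_replica: int     # database connection pool size per instance


def check_pools(tier: Tier, replicas: int, db_max_connections: int) -> list[str]:
    """Return a message for every rule that is violated at the given replica count."""
    http_total = replicas * tier.http_pool_per_replica
    db_total = replicas * tier.db_pool_per_replica
    problems = []
    # Rule 31, read literally: HTTP connection pools < database connection pools.
    if not http_total < db_total:
        problems.append(f"rule 31: http pools ({http_total}) >= db pools ({db_total})")
    # Rule 32: database connection pools < maximum number of database connections.
    if not db_total < db_max_connections:
        problems.append(
            f"rule 32: db pools ({db_total}) >= db max connections ({db_max_connections})"
        )
    return problems


def max_safe_replicas(tier: Tier, db_max_connections: int) -> int:
    """Largest replica count whose summed DB pools stay strictly below the DB limit."""
    return (db_max_connections - 1) // tier.db_pool_per_replica


if __name__ == "__main__":
    tier = Tier(max_replicas=12, http_pool_per_replica=8, db_pool_per_replica=10)
    # Check the configured autoscaling ceiling, not just today's replica count (rule 30).
    for problem in check_pools(tier, tier.max_replicas, db_max_connections=100):
        print(problem)
    print("max safe replicas:", max_safe_replicas(tier, db_max_connections=100))
```

  Running the check against the autoscaling ceiling rather than the current replica count is the point: the configuration that is fine at 4 replicas is the one that exhausts the database at 12.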

THE SRE WISDOM   [github scoopex]  [mastodon]  [matrix]