I already posted about this event a few weeks ago, focusing on my experience and the organization: https://cloudinthealps.mandin.net/2017/11/03/velocity-london-2017/
This time, I would like to share a short summary of what I have learned during these 4 days.
The first two days were a Kubernetes training, so nothing very specific here. I learnt a lot about Kubernetes, which is to be expected 🙂
During the two conference days, I attended the keynotes, and several sessions.
The keynotes are difficult to sum up, as they were very different, and each was a succession of short talks. I have attended several large-scale conferences in the past, and this was the first time I felt that the speakers were really at the edge of research and technology. They were not there to sell us their new product, but to share where their work was headed, what the outcomes could be, and so on.
They broached subjects ranging from bio-software to chaos engineering, from blockchain to edge computing. Some talks were really oriented toward IT & DevOps, and some were bringing a completely different view on our world.
Overall, it felt energizing to hear so many brilliant minds talk about what is mostly our future!
The sessions were a bit more down to earth and provided data, content and feedback that we could bring back home to drive some changes. I was surprised that most sessions concentrated on general information and feedback, and not so much on specific tools and solutions. I expected more sessions about the DevOps toolchains (Chef, Puppet, Gitlab, Sensu and so on). Actually, even when the sessions were presented by software companies (Datadog, Yahoo, Bitly, Puppet, PagerDuty), they never sold their product. Instead, they used their experience and data to provide very useful insights and feedback.
What I brought back can be split into two categories: short-term improvements and decisions that could be implemented as soon as I got back (which I partly did), and trends that need to be thought about and analyzed, and then maybe crafted into a new offer or approach.
In the first category:
• Blameless post-mortems. A lot of data was analyzed, with one takeaway for us: keep the story focused and short. If you do not have anything to add apart from the basic timeline… maybe you’re not the right team to handle the post-mortem 🙂
• Solving overmonitoring and alert fatigue. This talk was a game changer for me. What Kishore Jalleda (https://twitter.com/KishoreJalleda) stated was this: you may stop monitoring applications and services that are not respectful of the team handling their alerts. For example, if you get more than X alerts every day from an application, you may go to the owner of the application and say “as you are generating too much noise, we will disable monitoring for a moment, until the situation comes back to something that is manageable by the 24/7 team”. Of course, you have to help the product team get back on track and identify what is monitoring and what is alerting (https://cloudinthealps.mandin.net/2017/05/12/monitoring-and-alerting/). And you need top management support before you go and apply that 🙂
• On the same topic, a session about monitoring containers came down to the same issue: how do you monitor the health of your application? Track the data 🙂
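To make the alert-fatigue rule above a bit more concrete, here is a minimal sketch in Python of how one might spot the applications that exceed a daily alert budget. Everything in it is an assumption for illustration: the talk's "X" is not given, so `MAX_ALERTS_PER_DAY` is a made-up value, and `noisy_applications` and the `(application, day)` input format are hypothetical names, not something presented at the session.

```python
from collections import Counter
from datetime import date

# Hypothetical threshold: the "X alerts per day" from the talk is not specified.
MAX_ALERTS_PER_DAY = 20


def noisy_applications(alerts, max_per_day=MAX_ALERTS_PER_DAY):
    """Return the sorted names of applications that exceeded the daily alert budget.

    `alerts` is an iterable of (application_name, day) tuples, one per alert.
    """
    # Count alerts for each (application, day) pair.
    per_app_per_day = Counter(alerts)
    # An application is "noisy" if any single day went over the budget.
    noisy = {
        app
        for (app, _day), count in per_app_per_day.items()
        if count > max_per_day
    }
    return sorted(noisy)


# Example with made-up data: "billing" fires 25 alerts in one day, "search" only 3.
sample_alerts = (
    [("billing", date(2017, 10, 18))] * 25
    + [("search", date(2017, 10, 18))] * 3
)
print(noisy_applications(sample_alerts))  # prints ['billing']
```

The resulting list would only be a starting point for the conversation with the application owners, not an automatic switch: as noted above, disabling monitoring needs management support and a plan to help the product team reduce the noise.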
The second group covered mostly higher-level topics, on how to organize your teams and company for a successful DevOps transformation. I noted an ever-spreading use of the term “SRE”, which I would describe as misused most of the time. At the very least, SRE now seems to designate any team or engineer in charge of running your infrastructure.
Another trend, in terms of organization, was the model built around this famous SRE team, which provides tooling and best practices to each DevOps/Feature/Product team. I’ll probably post about this at length sometime later.