PaaS and Managed Services

If you know me, or have read some of my previous articles, you will know that I am a big fan of PaaS services.

They provide an easy way for architects and developers to design and build complex applications, without having to spend a lot of time and resources on components that can be used out of the box. And they relieve us IT admins of having to manage lower-level components and irrelevant questions. These questions are the ones that led me to switch my focus to cloud platforms a few years ago. One day I’ll write an article on my personal journey 🙂

Anyway, my subject today concerns the later stages of the application lifecycle. Let’s say we have designed and built a truly modern app, using only PaaS services. To be concrete, here is a possible design.

 

I will not dig into this design; that is not my point today.

My point is: now that it is running in production, how do you manage and monitor the application and its components?

I mean from a Managed Services Provider perspective, what do you expect of me?

I recently heard an approach that I did not agree with, but that had its benefits. I will start with that one, and then share my own.

The careful position

What I heard was a counterpoint to the official Microsoft standpoint, which is “we take care of the PaaS components, just write your code properly and run it”. I may have twisted the words here… The customer’s position was then: “we want to monitor that the PaaS components are indeed running, and that they meet their respective SLAs. And we want to handle security, from code scanning to intrusion detection”.

This vision is both heavy and light on the IT team. The infrastructure monitoring is quite easy to define and build: you just have to read the SLA of each component and find the best probe to check it. Nothing very fancy here.
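To make the "read the SLA, find a probe" idea concrete, here is a minimal sketch in Python. It assumes a probe that simply reports success or failure at regular intervals; the `ProbeResult` type, the function names and the 99.9% target are illustrative assumptions, not something taken from a specific Azure SLA.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    timestamp: float  # seconds since the start of the measurement window
    ok: bool          # True if the component answered correctly


def availability(results):
    """Return the fraction of successful probe results (0.0 to 1.0)."""
    if not results:
        raise ValueError("no probe results to evaluate")
    return sum(1 for r in results if r.ok) / len(results)


def meets_sla(results, sla_target):
    """Compare the measured availability with an SLA target such as 0.999."""
    return availability(results) >= sla_target


# Hypothetical data: one probe per minute, 1000 probes, 2 failures.
results = [
    ProbeResult(timestamp=float(i * 60), ok=(i not in (10, 500)))
    for i in range(1000)
]

print(f"availability: {availability(results):.4f}")
print("SLA met:", meets_sla(results, sla_target=0.999))
```

With two failures out of a thousand probes, the measured availability falls just short of the assumed 99.9% target, which is exactly the kind of result this position wants to be able to show to the provider.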

The security part is more complicated, as it requires you to handle vulnerability scanning (including code scanning, which is more often a developer skill) as well as vulnerability watching.

This vulnerability scanning and intrusion detection part is difficult, as you are using shared infrastructure in Azure datacenters, and you are not allowed to run these kinds of tools there. I will write a more complete article on what we can do on this front, and how, sometime this year.

Then comes the remediation process that will need to be defined, including the emergency iteration, as you will have some emergencies to handle on the security front.

The application-centric position

My usual approach is somewhat different. I tend to work with our customers to focus on the application, from an end-user perspective. Does that user care that your cloud provider did not meet the SLA for the Service Bus you are using? Probably not. However, he will call when the application is slow or not working at all, or when he runs into behavior he did not expect. What we focus on is finding out which metrics on each PaaS component actually tell us something about the application’s behavior. And if the standard metrics are not sufficient, we work on writing new ones, or composites, that let us know whether everything is running smoothly or not.
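As an illustration of what a composite could look like, here is a small Python sketch. The metric names and limits are made up for the example; in a real project they would come out of the discussion with the application owners.

```python
def composite_health(metrics, limits):
    """Return (score, breaches).

    score is the share of metrics that are within their limit (1.0 = all good),
    breaches is the list of metric names that exceed their limit.
    """
    breaches = [name for name, limit in limits.items() if metrics[name] > limit]
    score = 1 - len(breaches) / len(limits)
    return score, breaches


# Hypothetical limits, chosen for what they mean to the end user.
limits = {
    "queue_backlog": 1000,      # messages waiting in the queue
    "http_error_rate": 0.02,    # share of failed requests
    "p95_response_ms": 800,     # 95th percentile response time
}

# Hypothetical current values read from the PaaS components.
metrics = {
    "queue_backlog": 120,
    "http_error_rate": 0.05,
    "p95_response_ms": 640,
}

score, breaches = composite_health(metrics, limits)
print(f"health score: {score:.2f}, breaches: {breaches}")
```

The point of the composite is that a single number (and a short list of culprits) says more about the application than three separate dashboards.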

The next step would be, if you have the necessary time and resources, to build a Machine Learning solution that will read the data from each of the components (PaaS and code) and be able to determine that an issue is going to arise.
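A full Machine Learning solution is well beyond a blog snippet, but the starting point is often much simpler. The sketch below is not ML: it is a plain statistical check (a z-score against recent history) that flags a value as unusual, assuming you already collect a series of measurements such as response times. The function name and the threshold of 3 are illustrative assumptions.

```python
from statistics import mean, pstdev


def is_anomalous(history, value, z_threshold=3.0):
    """Flag value if it is more than z_threshold standard deviations
    away from the mean of the recent history."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical response times in milliseconds.
history = [200, 210, 190, 205, 195]

print(is_anomalous(history, 208))  # close to the usual values
print(is_anomalous(history, 260))  # far outside the usual range
```

A real predictive model would combine many such signals from the PaaS components and the code, but a baseline like this already tells you when something has drifted away from normal.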

In that approach we do not focus on the cloud provider SLAs. We will know from our monitoring that a component is not working, and work with the provider to solve that, but it’s not the focus. We also assume that the application owners already have code scanning in place. At least we suggest that they should have it.

Monitoring and alerting

Today is another rant day, or, to put it politely, a day for a clarification that needs to be made.

As you probably know by now, I’m an infra/Ops guy. So monitoring has always been our core interest and tooling.

There are many tools out there, some dating back to the pre-cloud era, some brand new and cloud oriented, some focused on the application, some on the infrastructure. And with some tuning, you can always find the right one for you.

But beware of a fundamental and very common misunderstanding: monitoring is not alerting, and vice versa.

Let me explain a bit. Monitoring is the action of gathering information about the value of a probe. This probe can measure anything, from CPU load to an application return code. Monitoring will then store this data and give you the ability to graph/query/display/export it.

Alerting is one of the possible actions taken when a probe reaches a defined value. The alert can be an email sent to your Ops team when a certain CPU reaches 80%, or it could be a notification on your iPhone when your spouse gets within 50m of your home.

Of course, most tools have both abilities, but that does not mean that you need to mix them and set up alerting for every probe that you have set up.
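To show the separation in code, here is a minimal Python sketch. It does not mimic any particular tool; the `Monitor` class and the `check_alert` function are hypothetical names used only to keep "storing values" and "acting on a value" apart.

```python
from collections import defaultdict


class Monitor:
    """Monitoring: store every sample so it can be queried or graphed later."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, probe, timestamp, value):
        self._samples[probe].append((timestamp, value))

    def query(self, probe):
        """Return all (timestamp, value) samples recorded for a probe."""
        return list(self._samples[probe])


def check_alert(monitor, probe, threshold):
    """Alerting: one possible action, taken when the latest value of a probe
    reaches the threshold. Returns a message, or None if all is quiet."""
    samples = monitor.query(probe)
    if not samples:
        return None
    _, value = samples[-1]
    if value >= threshold:
        return f"ALERT {probe}={value} (threshold {threshold})"
    return None


monitor = Monitor()
monitor.record("cpu_percent", 0, 45)
monitor.record("cpu_percent", 60, 82)
monitor.record("disk_free_gb", 60, 120)  # monitored, never alerted on

print(monitor.query("cpu_percent"))
print(check_alert(monitor, "cpu_percent", threshold=80))
```

Every probe goes through `record`, but only the ones you explicitly pass to `check_alert` can ever wake somebody up.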

My regular use case is a cloud-based IoT solution. We would manage the cloud infrastructure backing the IoT devices and application. In that case we would usually have a minimum of two alerting probes. These two probes would be the number of live connected devices, and the response time of the cloud infrastructure (based on an application scenario).

And that would be it for alerting, in a perfect world. Yes, we would have many statistics and probes gathering information about the state of the cloud components (web applications, databases, load balancers etc.). And these would make nice and pretty graphs, and provide data for analytics. But in the end, who cares if a CPU on one instance of the web app reaches 80%? As long as the response time is still acceptable and there is no significant variation in the number of connected devices, everything is fine.

When one of the alerting probes starts blinking, you then need to look into the other probes and statistics to figure out what is going on.
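Here is a rough Python sketch of that IoT setup. Only the two probes named above can raise an alert; everything else is kept as context for troubleshooting. The probe names, the 10% drop and the 2000 ms limit are assumptions for the sake of the example.

```python
# The only two probes allowed to trigger an alert (names are hypothetical).
ALERTING_PROBES = {"connected_devices", "response_time_ms"}


def evaluate(current, baseline_devices, max_drop=0.10, max_response_ms=2000):
    """Return the list of alerts raised by the two alerting probes."""
    if baseline_devices <= 0:
        raise ValueError("baseline_devices must be positive")
    alerts = []
    drop = (baseline_devices - current["connected_devices"]) / baseline_devices
    if drop > max_drop:
        alerts.append(f"connected devices dropped by {drop:.0%}")
    if current["response_time_ms"] > max_response_ms:
        alerts.append(f"response time is {current['response_time_ms']} ms")
    return alerts


def triage_context(current):
    """Everything that is monitored but never alerts: read it only
    once one of the alerting probes has fired."""
    return {k: v for k, v in current.items() if k not in ALERTING_PROBES}


# Hypothetical snapshot of all probes at one point in time.
current = {
    "connected_devices": 850,
    "response_time_ms": 450,
    "web_cpu_percent": 83,   # above 80%, but nobody gets paged for it
    "db_load_percent": 40,
}

alerts = evaluate(current, baseline_devices=1000)
if alerts:
    print("alerts:", alerts)
    print("context for troubleshooting:", triage_context(current))
```

In this snapshot the CPU at 83% stays silent; what triggers the alert is the 15% drop in connected devices, and the CPU figure only shows up afterwards as part of the investigation.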

About the solution

There are so many tools available to alert and monitor that there cannot be a one-size-fits-all answer.

Some tools are focused on gathering data and alerting, but not really on the graphing/monitoring part (like Sensu, or some basic Nagios setups), and some are good at both (Nagios+Centreon, New Relic). Some are mostly application oriented (Application Insights, New Relic), some are focused on infrastructure, or even hardware (HPE SIM for example).

I have worked with many, and they all have their strengths and weaknesses. I will not use this blog to promote one or the other, but if you’re interested in discussing the subject, drop me a tweet or an email!

The key thing here is to keep your alerting to a minimum, so that your support team can work in a decluttered environment and be very reactive when an alert is triggered, rather than having a ton of fake alarms, false warnings and “this is red but it’s normal, don’t worry” 🙂

Note: credit for the idea behind this post goes to a colleague of mine, and the second screenshot comes from a tool another colleague wrote.

WPC 2016

It has been almost a year since my first Worldwide Partner Conference, organized by Microsoft in Toronto.

At the time, I wanted to share some insights, and some tips to survive the week.

Before WPC, I attended several TechEd Europe and VMworld Europe events, in various locations over the years. WPC is slightly different as it is a partner-dedicated event, without any customers or end users. It gives a very different tone to the sessions and discussions, as well as a very good opportunity to meet with Microsoft execs.

As it was my first time, I signed up for the FTA (First Time Attendee) program, which gave me access to a mentor (someone who had already attended at least once) and a few dedicated sessions to help us get the most out of the conference.

 

The buildup weeks

In the months preceding the event, Microsoft will be pushing to get you registered. They are quite right to do so, for two reasons.

First, the registration fee is significantly lower when you register early. So if you are certain to attend, save yourself a few hundred dollars and register as soon as you can. Note that you may even register during the event for the next one.

Second, the hotels fill up very quickly, and if you want to be in a decent area, or even in the same place as your country delegation, be quick!

 

A few weeks before the event, I had a phone call with my mentor, who gave me some advice and opinions, as well as pointers on how to survive the packed 5 days. This helped me focus on the meetings with potential partners, and meetings with microsoftees, rather than on the sessions themselves. More on that subject later.

During that period, you are also given the opportunity to complete your online WPC profile, which may help you get in touch with other partners, and organize some meetings ahead of time.

 

You also get the session schedule, which lets you organize your coming days, and see what the focus is.

I was surprised, a few days before the event, to learn that we had “graduated” in the Microsoft partner program, from remotely managed to fully managed. So we had a new PSE (the Microsoft representative handling us as a partner) who was very helpful and set up a lot of meetings with everyone we needed to meet from Microsoft France. For a first-timer, it helped to be guided by someone who knew the drill.

I was very excited to get there, and a bit anxious as we were scheduled to meet a lot of people, in addition to my original agenda with many sessions planned.

 

 

The event

I’ll skip the traveling part and will just say that I was glad I came one day early, so that I had time to settle into my new timezone, visit a bit and get comfortable with the layout of the city and the venue.

I will not give you a blow-by-blow account, but I will try to sum up the main points that I found worth noting.

The main point, and I am still struggling to decide whether it was a good or a bad one, is that we met almost only people from France, microsoftees or partners. I was a bit prepared for that, having heard the talk from other attendees, but it is still surprising to realize that you have traveled halfway across the world to spend 5 days meeting with fellow countrymen.

 

There is an explanation for that: this is the one time in the year when all the Microsoft execs are available to all partners, and they are all in the same place. So it is a good opportunity to meet them all, at least for your first event. I may play things differently next time.

Nevertheless, we managed to meet some interesting partners from other countries, and started some partner-to-partner relationships from there.

 

I did not go to any sessions, other than the ones organized for the French delegation. These were kind of mandatory, and all the people we were meeting were going there too. But I cancelled all my other plans to attend sessions.

I did not really miss these technical sessions, as I work exclusively on cloud technologies, which are rather well documented and discussed all year round in dedicated events and training sessions. But on some other subjects, technical or more business/marketing, some sessions looked very interesting and I might be a bit more determined to attend those next time.

 

I attended the keynotes, which varied in interest and quality. They are a great show, and mostly entertaining. The level of interest is different for every attendee, depending on your role and profile.

What I did not expect, even with my experience of other conferences, was the really packed schedule. A standard day ran like this:

  • 8.30 to 11.30 : keynote
  • 11.30 to 18.30 (sometimes more) : back to back meetings, with a short time to get a sandwich
  • 19.30 to whatever time suits you : evening events, either country-organized, or general events.
  • 22.30 get to bed, and start again

 

You may also insert into that schedule a breakfast meeting, or a late business talk during a party/event.

So, be prepared 🙂

 

 

A word on the parties/events: some countries organize day trips to do some sightseeing together. Niagara Falls is not far from Toronto, so it was a choice destination for many of them. We had an evening of BBQ on one of the islands facing Toronto, with splendid views of the city skyline at sunset. Some of the parties are just dinners in quiet places, others are more hectic parties in nightclubs. The main event is usually a big concert, with nothing businesslike, and everything fun-oriented!

 

The cooldown time

There are a few particularities to this event, mostly linked to Microsoft’s organization and fiscal year schedule.

The event takes place at the beginning of Microsoft’s fiscal year. This means that microsoftees get their annual targets right before the event, and start fresh from there.

The sales people from MS also have a specific event right after WPC in July, which means they are 120% involved in July, and will get you to commit to yearly target numbers and objectives during the event.

To top that, August is a dead month in France, where almost every business is closed or slowed to a crawl. That means that when you get to September, the year will start for good, but Microsoft will already be starting to close its first quarter!

 

Practical advice

Remember to wear comfortable shoes, as you will walk and stand almost all day long. Still in the clothing department, bring a jacket/sweater, as A/C is very heavy in these parts. We had a session in a room set at 18°C, when it was almost 30°C outside…

The pace of your week may really depend on the objectives you set with your PSE. Our first year was mostly meeting with Microsoft France staff. Next year may not be the same.

And obviously, be wise with your sleep and jetlag: those are very long days, especially when English is not your native language.

 

This year the event will be hosted in Washington DC, in July, and it has been rebranded Inspire.

I will not comment on the name in particular, but anything sounds better than WPC 🙂