Azure is experiencing DNS issues in all regions

https://news.ycombinator.com/item?id=19812919

Man… Azure seems to be an order of magnitude worse than AWS and GCP when it comes to reliability.

Seems like they have tons of global dependencies within their services, which cause these cascading failures rather often… Wasn’t it only a few months ago that we were reading about a global outage that affected auth?

When it comes to outages, recent Azure downtimes alone broke our SLA five times over for the whole year…

Though for whatever reason I can’t convince the higher-ups that the switch was a stupid idea. But hey! We got some credits to spend in Azure as compensation. (Too bad we had to pay our clients with real cash.)

Coincidentally, a neighbor of mine recently retired from consulting and told me, a couple of days ago, that he’s still getting a lot of calls from people wanting to move off Azure due to reliability issues. He said he’s thinking of jumping back in because of the money he figures is now on the table.

Azure has been magnitudes more expensive and magnitudes more unreliable than AWS, DigitalOcean, or Heroku for the one client of ours that demands we use it to host their stuff.

Honestly, I can’t think of a single reason why I would recommend them over anyone else at this point, no matter what your hosting or storage or computing needs were. Can’t see a single area where they are better than the competition.

I completely agree. I ran a 100+ node compute cluster a couple of years back, and the uptime of our nodes was three nines at most. This wouldn’t have been bad if they had had any form of reliable storage or quick recovery, but at the time their S3 competitor was limited to 20 TB per bucket (because it was just NetApp hardware).

Azure is a freaking mess. If it hadn’t been Microsoft we were selling to, we’d have never used it.

Magnitudes more expensive? Huh? On what product?

We used to use AWS and had issues a lot. I’ve seen global GCS outages. Today we were affected by the Azure outage, but only for about 15 minutes.

Yep, we’ve just had one of our environments alert us. Services totally unresponsive – even the Azure portal was unresponsive. Affecting services we run in the US and Europe.

This would explain why my app kept throwing this exception when attempting to call an Azure SQL instance:

  System.ComponentModel.Win32Exception: No such host is known
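For what it’s worth, a quick way to confirm it’s DNS (and not the app or the SQL layer) is to try resolving the server’s hostname directly, outside the application. A minimal Python sketch — the hostname here is a made-up placeholder, not a real server:

```python
import socket

def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # "No such host is known" surfaces as a resolution error here too
        return False

# Hypothetical Azure SQL server name; substitute your own.
print(resolves("example-server.database.windows.net"))
```

If this returns False while the server is supposedly up, the failure is in name resolution, which matches the Win32Exception above.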

I wonder if this DNS outage will cause data loss like the last one…

The entire “here’s free ELA credits for Azure, please please Mr. Sr. Director/CIO, use Azure” pitch seems to be working, but then they go and do stuff like this.

One would think that after the 2016 Dyn outage people would strongly consider having an alternate DNS provider…
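Delegating a zone to nameservers run by two independent providers is one way to do that: the registrar-level NS set simply lists both. A hypothetical fragment (the provider names are invented for illustration):

```
; example.com delegated to two independent DNS providers (hypothetical names)
example.com.  86400  IN  NS  ns1.provider-a.net.
example.com.  86400  IN  NS  ns1.provider-b.org.
```

The catch is that both providers must serve identical zone data, which usually means keeping them in sync via zone transfers (AXFR) or each provider’s API.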

It’s DNS, so it is somewhat inherently global. Route53 isn’t region specific either, so I could see an issue with that having a global effect, too.

DNS is also inherently distributed. This should make it resilient to all of the most common outage scenarios, and is likely why AWS offers a 100% uptime SLA for Route 53.

I’ll be interested in the post-mortem from Azure on this one.

Sure, but that’s hypothetical, and I don’t recall AWS having any such issue in recent history.

The us-east-1 S3 outage in early 2017 comes to mind. The AWS status page went down and no one could figure out what was going on, because the status page had a hard dependency on us-east-1 S3.
