This is part of a series on diagnosing your website outage issues. This is part four; links to the other parts are here.
In Part 1 of this series we covered the overview of what could have broken to cause your website to go down. In Part 2, we started working through those possible issues by diagnosing DNS issues. In Part 3 we diagnosed routing issues. Now that we know your domain’s DNS is good and the routes are good, we’re going to start looking at any layers you may have in your architecture.
The term “layers” refers to things like firewalls or Content Distribution Networks (CDNs) that may be present in your architecture. If you don’t use these things, you can skip to the next section, which I will link to here when it is ready.
A typical website architecture looks like this:
website visitor -> web hosting server
There are no layers involved in this architecture. Your visitor simply hits your website directly. That works just fine and represents probably 80% of the use cases out there, but an increasing number of website owners are starting to employ firewalls and CDNs to secure and speed up their sites. If you employ a firewall such as The Mighty Sucuri CloudProxy, your architecture changes to look like this:
website visitor -> Sucuri CloudProxy -> web hosting server
If you harken back to Part 3, where we discussed routing, you will recognize that this change in architecture introduces another point of failure for your website. How do you test those parts to ensure they are functioning? There are a few options.
If your firewall uses the architecture I’ve indicated above, then that necessarily means that the DNS A records for your domain are pointed to the firewall and the firewall is configured to point to the IP address of your web hosting server. Therefore, it is possible to test both flows:
- you -> firewall -> web host and
- you -> web host
Using this site as an example, here’s how to do that.
Running ping against my domain, one of the tools we learned about in Part 1, shows me that the DNS A records for slumpedoverkeyboarddead.com and www.slumpedoverkeyboarddead.com point to 18.104.22.168. Another tool, host, shows me that this IP address belongs to Sucuri, which is expected as Sucuri is my firewall provider:
$ host -n 22.214.171.124
126.96.36.199.in-addr.arpa domain name pointer cloudproxy10006.sucuri.net.
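If you want to script that ownership check rather than eyeball it, here is a minimal sketch. The function names ptr_name and ptr_belongs_to are my own inventions for illustration, and the sketch assumes the host output is a single line in the "domain name pointer" form shown above.

```shell
# Extract the PTR hostname from one line of `host <ip>` output.
# Prints nothing if the line is not a "domain name pointer" answer.
# (Hypothetical helper, not part of any tool.)
ptr_name() {
    printf '%s\n' "$1" | sed -n 's/^.* domain name pointer \(.*\)\.$/\1/p'
}

# Succeed when the PTR hostname is the provider's domain or a subdomain of it.
# $1 = a line of `host` output, $2 = provider domain, e.g. sucuri.net
ptr_belongs_to() {
    case "$(ptr_name "$1")" in
        "$2"|*".$2") return 0 ;;
        *) return 1 ;;
    esac
}
```

You could call it as ptr_belongs_to "$(host YOUR_FIREWALL_IP)" sucuri.net, substituting the IP address your own A records point to. Keep in mind that a reverse DNS name is only a hint about who owns an address, not proof.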
In addition, as a good website owner who is not wholly and wilfully ignorant of the products I buy, I know the IP address assigned to my domain from my web host. I am not going to share that here because obscurity is a valid security layer and while it is not impossible to find the real hosting IP of this site, I revel in making things harder than they have to be. Let’s call my real hosting IP address 188.8.131.52. I just made that up but it turns out to be an IP address in New Jersey. I discovered that by using the MaxMind GeoIP tool.
So, with this information I can now revise the two architectures I outlined above to this:
- me -> 184.108.40.206 -> 220.127.116.11
- me -> 18.104.22.168
This gives me enough information to test things. First, I am going to test if my site comes up when I bypass the firewall IP at 22.214.171.124. The long and painful way to do this is to change my DNS records at my DNS provider to my hosting IP of 126.96.36.199 but that can take hours to propagate throughout the global Internet. I want to test now, not 4 hours from now. Luckily, when computers request the IP address of a domain, they use a fairly standard way of doing so which gives us the opportunity to trick our own computers into thinking the IP address for this site is something different than it is.
When you type a domain name into your browser, or click a link to a site, your browser checks to see if it knows the IP address of that site. Computers can be configured to do this in many ways, but almost nobody messes with the default setup. Computers almost always start by checking a local file on their disks named ‘hosts’ for the domain and if the domain is not in there, then the computer proceeds to cast its net wider and request DNS resolution from the network or from the Internet. This means that you can trick your computer into seeing any IP address you’d like for your domain by simply adding the appropriate entry into your hosts file. Using this magic, we can bypass the Sucuri CloudProxy and request my site directly from my web host. This allows me to compare the results of how my site loads in two ways: when going through the CloudProxy and when bypassing that layer.
The location of your hosts file depends on the operating system you’re using. I run Linux so I know my hosts file is at /etc/hosts. If you don’t know where your hosts file is, this Wikipedia entry can probably tell you. I am reasonably certain that all hosts files have the same format, so to bypass my firewall and access this site directly, I would enter this into my hosts file:
188.8.131.52 slumpedoverkeyboarddead.com www.slumpedoverkeyboarddead.com
This tells my computer that the IP address for this site is 184.108.40.206 and not 220.127.116.11 as the public DNS for this domain says.
A couple of notes:
- Notice how I put both the naked domain and www domain in my hosts file. This tells my computer that both those domains are at that IP. If you only put one and your site redirects requests to the other, you can end up running invalid tests.
- Astute readers may notice that if a firewall can be bypassed like this, then what good is the firewall? A very valid question. The truth is that I have more security on this site than I am letting on and this bypass method would not work even if someone were to discover the real IP address of it. But if you’re running a firewall on your site and you are able to bypass it this easily, then you should fix that. Talk to your firewall provider or systems administrator to find out how.
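Before testing, it is worth confirming that the hosts file really does override both names. The following is a rough sketch rather than a polished tool: hosts_lookup and hosts_override_ok are hypothetical helper names, 203.0.113.7 is a placeholder documentation address standing in for your real hosting IP, and the match is case-sensitive.

```shell
# Print the IP that a hosts-format file maps a name to (first match wins).
# $1 = hostname to look up, $2 = hosts file path (defaults to /etc/hosts)
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }              # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++)     # every field after the IP is a name
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Succeed only if every listed name maps to the expected IP.
# $1 = expected IP, $2 = hosts file path, remaining arguments = names to check
hosts_override_ok() {
    local expected file name
    expected=$1
    file=$2
    shift 2
    for name in "$@"; do
        [ "$(hosts_lookup "$name" "$file")" = "$expected" ] || return 1
    done
}
```

For example, hosts_override_ok 203.0.113.7 /etc/hosts example.com www.example.com succeeds only when both the naked and www names point at the bypass address. Note that host and dig query DNS servers directly and ignore the hosts file, so they will keep reporting the firewall IP even while the override is active.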
Now that I have configured my system to bypass the firewall it is just a matter of loading the site into a browser to see what happens. If the website looks just as broken or missing as it did before setting up the bypass, then you have successfully eliminated the firewall layer as the cause of your issues. If the site looks fine when bypassing the firewall, then you have some pretty good evidence that the firewall is the issue and it’s reasonable to open a support ticket with that provider and give them the results of your test so they can investigate.
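The same comparison can be roughed out on the command line. This sketch uses a different technique from the hosts file edit: curl's --resolve option, which pins a hostname to an IP address for a single request. It assumes curl is installed and the site is served over HTTPS on port 443. The function names and the verdict labels are made up for illustration, and an HTTP status code is only a crude stand-in for how the site actually looks in a browser.

```shell
# Fetch the home page of a site while forcing the connection to a given IP.
# $1 = site hostname, $2 = IP to connect to (firewall IP or hosting IP)
# Prints the HTTP status code; curl reports 000 if the request fails outright.
status_via() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 15 \
         --resolve "$1:443:$2" "https://$1/"
}

# Treat any 2xx or 3xx status as "the site answered".
is_ok() {
    case "$1" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the two paths and print a verdict label.
# $1 = status through the layer, $2 = status when bypassing it
layer_verdict() {
    if is_ok "$1" && is_ok "$2"; then
        echo "no-failure-seen"        # both paths work right now
    elif is_ok "$2"; then
        echo "layer-suspect"          # broken via the layer, fine direct
    elif is_ok "$1"; then
        echo "direct-access-blocked"  # the host may refuse direct requests
    else
        echo "layer-eliminated"       # broken both ways: look elsewhere
    fi
}
```

A run might look like layer_verdict "$(status_via example.com FIREWALL_IP)" "$(status_via example.com HOSTING_IP)", with your own domain and addresses filled in. The direct-access-blocked case is what you would expect to see if the web host only accepts traffic from the firewall.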
These are the basic steps to troubleshoot issues that may be caused by the various layers in your architecture. I’ve used my firewall as an example, but there are innumerable possible configurations people can have set up, so it’s not possible to give examples of them all. But the basic troubleshooting steps are the same in all cases:
- Identify all the layers in your route from your computer to your website
- Eliminate each of them one-by-one using the bypass method I showed
- At the point the site comes back up, you have identified the problem layer
- If you have eliminated all the layers and the site is still down, then the problem is likely your web host
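The elimination steps above can be sketched as a small function. It is only an illustration: find_problem_layer and the "layer=status" argument format are my own assumptions, where each status is the HTTP status code you observed with that layer (and every layer in front of it) bypassed.

```shell
# Walk the layers from the outermost inward and report the first one whose
# bypass brings the site back (a 2xx or 3xx status).
# Each argument looks like "cdn=502" or "firewall=200".
find_problem_layer() {
    local pair
    for pair in "$@"; do
        case "${pair#*=}" in                 # the status after the "="
            2??|3??)
                echo "${pair%%=*}"           # the layer name before the "="
                return 0
                ;;
        esac
    done
    # Every layer was bypassed and the site is still down.
    echo "web host"
}
```

For instance, find_problem_layer cdn=502 firewall=200 prints firewall, because the site only came back once the firewall was taken out of the path.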
The next step is to examine what is happening on the web hosting server itself. We’re going to start with diagnosing problems with SSL (TLS) certificates and how those problems can completely break your site from the inside out. That will be the topic of part 5 which I will link to here when it is ready.