Fun with Curl

curl is one of those quintessential *nix tools that adheres beautifully to the “one tool, one task” philosophy: it exists to issue requests against web servers. As sysadmins, we’re usually more concerned with how the web server responds to a request than with how the page renders, so a quick CLI tool like curl is a good fit. It also lets us spoof things like user agents and referers to see how a web site responds to different browsers or referring pages.

Let’s look at this site:

$ curl http://slumpedoverkeyboarddead.com | head

I get the page content back, but I’ve spared you all that output. I’m almost never interested in the whole page; I usually want just the HTTP headers and what the server does with my request. So, let’s look at just the headers:

$ curl -I http://slumpedoverkeyboarddead.com

Now we’re getting somewhere. The server returns a 200 OK response, so I know the server is healthy and will give me content if I ask for it.
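When the status code is all I care about, curl's --write-out option can print just that. What follows is a minimal sketch rather than a definitive health check: the helper names http_status and is_success are made up for illustration, and unlike -I this sends a normal GET and throws the body away.

```shell
# Hypothetical helpers for illustration; only the curl options are real.

# Print the status code curl received for a URL.
# -s hides the progress meter, -o /dev/null discards the body,
# -w '%{http_code}' prints the numeric status after the transfer.
# curl prints 000 here if it never got an HTTP response at all.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Succeed (exit 0) only for 2xx status codes.
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage:
#   is_success "$(http_status http://slumpedoverkeyboarddead.com)" && echo healthy
```

Keeping the curl call and the 2xx check separate means the check can be reused on a status code that came from somewhere else, such as a log line.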

A lot of sites will return a 301 redirect to a new location rather than a 200. You could theoretically use the value of the Location: header in the 301 and issue another curl command against it, or you could just tell curl to follow the redirects:

$ curl -I -L www.phoenixhollow.ca

From this, we can see that the .ca domain redirects to the .com version and the final destination is happy with a 200 response.
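With -I and -L together, curl prints one block of headers per hop, which gets noisy on long redirect chains. Here is a rough sketch of a filter that boils that output down to one line per hop. The function name redirect_chain is hypothetical, and it assumes each hop's headers start with an HTTP/ status line, which is how curl prints them.

```shell
# Hypothetical filter: summarise `curl -sIL` output read from stdin.
# Prints "<status> -> <location>" for each redirect hop and just
# "<status>" for a response that carries no Location header.
redirect_chain() {
  tr -d '\r' | awk '
    toupper($1) ~ /^HTTP\// {
      # A new status line starts a new hop; flush the previous one first.
      if (seen) printf "%s%s\n", code, (loc == "" ? "" : " -> " loc)
      code = $2; loc = ""; seen = 1
    }
    tolower($1) == "location:" { loc = $2 }
    END {
      if (seen) printf "%s%s\n", code, (loc == "" ? "" : " -> " loc)
    }
  '
}

# Usage:
#   curl -sIL www.phoenixhollow.ca | redirect_chain
```

The Location value is printed exactly as the server sent it, so a relative redirect shows up as a relative path.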

What about HTTPS sites with invalid certificates? No problem: the -k switch tells curl to accept them anyhow so you can get on with your life.

Note: props to Sean Walberg for correcting my curl-fu; I was habitually using -k for every HTTPS request, which is not needed with valid certs.

$ curl -I https://expired.badssl.com/

Without -k, curl refuses the expired certificate. With it:

$ curl -I -k https://expired.badssl.com/
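Before reaching for -k, it helps to confirm that the failure really is a certificate problem and not DNS or a timeout. curl's exit code says which. The sketch below maps a handful of documented exit codes to plain English; the function name explain_curl_exit is made up for illustration and the list is far from complete.

```shell
# Hypothetical helper: translate a few of curl's documented exit codes.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "could not connect" ;;
    28) echo "timed out" ;;
    35) echo "TLS handshake failed" ;;
    60) echo "certificate verification failed" ;;
    *)  echo "curl exit code $1 (see EXIT CODES in the curl man page)" ;;
  esac
}

# Usage:
#   curl -sI -o /dev/null https://expired.badssl.com/; explain_curl_exit $?
```

If the answer is anything other than a certificate verification failure, -k will not help.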

Maybe I need something more exotic. A friend is complaining that when he clicks a Facebook link to my site, he is being redirected to some other site. But when I load the offending page, it works fine for me. We can tell curl to go get that page with a Facebook referer set and see if that changes the behaviour. Hint: if it does, there’s a high chance my site is infected with something bad.

$ curl -I -L --referer "https://www.facebook.com/my-awesome-referer" http://slumpedoverkeyboarddead.com
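To make a referer-dependent redirect obvious, one option is to compare where curl ends up with and without the referer set. This is only a sketch: final_url and report_destinations are hypothetical helper names, and it uses a GET with the body discarded rather than the HEAD request that -I sends.

```shell
# Hypothetical helpers for illustration; only the curl options are real.

# Print the URL curl ends up at after following redirects.
# usage: final_url URL [extra curl options...]
final_url() {
  url=$1
  shift
  curl -s -o /dev/null -L -w '%{url_effective}' "$@" "$url"
}

# Compare two destinations and say whether they match.
report_destinations() {
  if [ "$1" = "$2" ]; then
    echo "same destination: $1"
  else
    echo "DIFFERENT: plain=$1 with-referer=$2"
  fi
}

# Usage:
#   plain=$(final_url http://slumpedoverkeyboarddead.com)
#   fb=$(final_url http://slumpedoverkeyboarddead.com \
#        --referer "https://www.facebook.com/my-awesome-referer")
#   report_destinations "$plain" "$fb"
```

A DIFFERENT result is only a hint that something is keying off the referer; the headers from the original command are still where to look for what the server actually did.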

I don’t have an example page set up to redirect based on referer, but a quick look in the access log shows that the server received the Facebook referer.

By the way, if you’re not using Papertrail to centralize your system logs, you’re missing out. But I digress.

Now on to the last line, see the curl/7.42.1 bit? That is the User Agent string, which tells the web server what application sent the request. Most web requests come from browsers, so this field usually identifies the browser. I didn’t use a browser here, and we can see that by default curl sends its own name and version as the User Agent.

Maybe I want to hide that. Or maybe, as in the previous example, some page is behaving differently based on what User Agent it sees. The second case is reasonably common for web-delivered malware targeted at a specific operating system. For example, malware can be told to look at the User Agent string of a request and only serve bad software to Windows PCs, but not to Linux or Mac computers. So let’s spoof the User Agent so I look like Firefox on Windows (I am actually using Chrome on ChromeOS):

$ curl -I -L --referer "https://www.facebook.com/my-awesome-referer" -A "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1" http://slumpedoverkeyboarddead.com

Taking a look at the logs, we can see that I no longer appear to be curl’ing, I seem to be using Firefox on Windows.
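If I suspect a page is picky about User Agents, trying several in one go saves retyping. A rough sketch, assuming a text file with one User Agent string per line; the function name probe_agents and the file name agents.txt are hypothetical.

```shell
# Hypothetical helper: read User Agent strings from stdin, one per line,
# and print "<status> <user agent>" for each request made to the URL.
# usage: probe_agents URL < agents.txt
probe_agents() {
  while IFS= read -r ua; do
    [ -n "$ua" ] || continue   # skip blank lines
    # </dev/null keeps curl away from the list being read by the loop.
    code=$(curl -s -o /dev/null -A "$ua" -w '%{http_code}' "$1" </dev/null)
    printf '%s %s\n' "$code" "$ua"
  done
}

# Usage:
#   probe_agents http://slumpedoverkeyboarddead.com < agents.txt
```

The status code is a coarse signal: a page can return 200 to everyone and still serve different content, so identical codes do not prove identical behaviour.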

As with almost all *nix tools, curl has a million possible uses and many more options than I’ve covered here. These are just the most common ones for me right now. Having the power to grab web content under arbitrary conditions can be invaluable in the troubleshooting process.