Intercepting t.co links using DNS rewrites

Sunday, January 29, 2023

When someone links to something on twitter, either by embedding something or just pasting a URL, twitter will front it with its own t.co link. This means that you cannot verify what the URL is until you click it and your browser goes to the end result via t.co. I only really noticed this properly when my DNS sinkholing server (Adguard home) started blocking t.co links and I was getting an error when say, clicking a linked news article.

The obvious fix for this would be to just add t.co to my DNS allow list so these requests can go through. However, the fact that you cannot see the URL until you’ve already navigated through to it irks me a lot, I’d rather verify the link I’m navigating to is something I want to visit.

There are browser extensions that solve this problem by modifying the DOM to uncloak the links (e.g. twitter-link-deobfuscator) which works pretty well, but this solution is limited to the browser and does not work on the Twitter app on Android. Other options are to copy the t.co link into a link uncloaking website, but this is fiddly and annoying, or install an app on your phone

So I was looking for a more general solution that works across devices, what if there was a way of “intercepting” when you click on t.co links, unwrapping where it eventually leads to and presenting the user an interstitial page detailing this, with an option to continue forward.

Enter the unwrapper, a small service I wrote this weekend that does exactly this, but it abuses a lot of safeguards we have in place for the web so comes at a high price.

The Unwrapper

The service is just a go server that, when receiving a request for t.co, makes a HEAD request to the shortened link and extracts the Location header before the redirect is followed. If this is found, the value in the header is returned and rendered on a simple interstitial page.

But how do you intercept the calls to t.co? The magic comes in by using (abusing?) the DNS rewriting feature on Adguard home to intercept DNS requests for t.co and return the IP of my reverse proxy, and then adding a proxy rule to forward all requests for the host t.co to the go service.

Screenshot displaying the architecture of how the components fit together. The client makes a DNS request to Adguard Home, which rewrites t.co to return the IP of the reverse proxy, which then forwards requests to 'the unwrapper' go service

Surprisingly this worked pretty well, although with a huge caveat - this is your run of the mill Man in the Middle (MitM) attack. Browsers will, quite rightfully, complain that the certificate being presented from my reverse proxy is not valid for t.co so the user should not continue, or proceed with extreme caution.

To mitigate this I did something bad. Well, I’m already doing something bad, I run my own self-signed Certificate Authority (CA) and my reverse proxy uses certs signed by this CA. This root CA is trusted on my devices so I can access various internal services that run on my network when I’m connected to it, or via VPN when I’m remote.

With the badness in place, I figured why not add to it by adding t.co to the Subject Alternative Name on the cert for my reverse proxy? Now the browser has no problem and doesn’t complain anymore.

Screenshot displaying chrome trusting my self-signed certificate for t.co

This is obviously a terrible solution and not recommended, but it works, and works on all my things, including the Twitter app on my phone. It’s bad though, I’m probably looking past a lot of other security issues that I’ve just opened myself up to.

During my testing, another interesting issue quickly arose, a lot of the links cloaked under t.co tend to be links to other link shortening services like bit.ly, buff.ly, trib.al (as it turns out….the list is endless) meaning the obfuscated link issue still remains.

Going deeper

So to get around this I extended the service to work with multiple “known” URL shortening services and “follow” the trail until you reach the end. Most of them work the same way, a simple redirect and Location header, so following the trail is really just finding where the chain stops, and throw an error if a cycle is detected.

It’s actually quite funny to see how many hops some of these links can take you on, the deepest I’ve ever seen is a link from NYTimes taking you on a 9 hop voyage around various link shortening services. I’m guessing people are pasting short URLs to other short URLs into social media distribution platforms which just adds to the chain.

Screenshot displaying the unwrapper service for a link from the NY times, it shows one link jumping between 9 different short URLs to reach the actual underlying URL

Bonus feature

With the “real” link URL now avaialble it’s often polluted with various query parameters used for tracking or affiliate descriptors, so the service also strips these out and presents a “cleaned” version of the link alongside the original which I can then choose to click on or copy to send to others.

Lessons learned

I’m not sure whether I’m going to run this system full time as the obvious security safeguards put in place have just been overridden with hacky self signed certificates, DNS rewriting and other bad things, but it’s been a useful exercise in exploring what’s possible, even if it is absolutely awful. Probably should just unblock t.co on my adblocker and put the cowboy hacks aside for the time being 🤠