[UPDATE] My web service Proxy Checker is now online! You can use it to test the security of proxy servers.
[UPDATE 2] I've expanded my analysis to over 25,000 proxies. Here are my findings.
Since my 2013 post "Why are free proxies free?" became popular on Reddit and HN, I thought I'd write a follow-up in which I look for proxies on the web that use the technique I described there.
So I wrote a dead simple script (actually a PHP function) that requests JavaScript files from various locations and checks them for altered content.
If you don't care about the code, jump down to the actual scanning.
I'm serious about "dead simple" because this is the whole function:
/**************************************************************************/
/* scanProxy function by Christian Haschek christian@haschek.at */
/* It's intended to be used with php5-cli .. don't put it on a web server */
/* */
/* Requests a specific file ($url) via a proxy ($proxy) */
/* if first parameter is set to false it will retrieve */
/* $url without a proxy. CURL extension for PHP is required. */
/* */
/* @param $proxy (string) is the proxy server used (eg 127.0.0.1:8123) */
/* @param $url (string) is the URL of the requested file or site */
/* @param $socks (bool) true: SOCKS proxy, false: HTTP proxy */
/* @param $timeout (int) timeout for the request in seconds */
/* @return (string) the content of requested url */
/**************************************************************************/
function scanProxy($proxy,$url,$socks=true,$timeout=10)
{
    $ch = curl_init($url);
    //CURLOPT_HTTPHEADER expects a list of "Name: value" strings, so the user agent is set directly
    curl_setopt($ch, CURLOPT_USERAGENT, "Proxyscanner/1.0");
    curl_setopt($ch, CURLOPT_HEADER, 0); //we don't need headers in our output
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //return output as string
    $proxytype = ($socks?CURLPROXY_SOCKS5:CURLPROXY_HTTP); //socks or http proxy?
    if($proxy)
    {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
        curl_setopt($ch, CURLOPT_PROXYTYPE, $proxytype);
    }
    $out = curl_exec($ch); //false if the request failed
    curl_close($ch);
    return trim((string)$out); //a failed request comes back as an empty string
}
If you want the actual and complete script I was using for the scanning, you can find it here. I don't provide you with a proxy list but it shouldn't be hard to get one.
I am currently developing an open source web service, Proxy Check, which is based on this function. On the site you will be able to test proxies you've found on the web. I plan to release it in the coming days or weeks.
All you have to do is pass it the proxy, the URL of the file/script/image/page you want to check, and whether the proxy is SOCKS or HTTP, and it will return the data. If you pass false as the proxy, it will not use any proxy. You can use that to gather reference data.
So a simple use case would be:
//requesting a JS file from this blog
$proxy_data = scanProxy('127.0.0.1:9050','https://blog.haschek.at/js/blog.js');
//requesting reference data so we can check if something is altered
$reference_data = scanProxy(false,'https://blog.haschek.at/js/blog.js');
if(!$proxy_data) //the proxy sent nothing back
    echo "[-] Proxy is down\n";
else if(!$reference_data) //the direct request failed, so there is nothing to compare against
    echo "[?] No reference data\n";
else if($proxy_data!=$reference_data) //the proxy sent something, but it differs from the reference
    echo "[!] Proxy modified the content!\n";
else
    echo "[+] Proxy did not modify the content\n";
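If you want to run this comparison over a whole list of proxies, it helps to pull the decision into a function that only looks at the two response bodies. Here is a minimal sketch; the function name classifyProxyResponse and the verdict strings are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not part of the original script):
// turns the proxied body and the reference body into a verdict string.
function classifyProxyResponse($proxy_data, $reference_data)
{
    if ($reference_data === '' || $reference_data === false) {
        return 'no-reference'; // direct request failed, nothing to compare against
    }
    if ($proxy_data === '' || $proxy_data === false) {
        return 'down';         // the proxy returned nothing
    }
    return ($proxy_data === $reference_data) ? 'clean' : 'modified';
}

echo classifyProxyResponse('alert(1);', 'alert(1);'), "\n";        // clean
echo classifyProxyResponse('alert(1);/*ad*/', 'alert(1);'), "\n";  // modified
echo classifyProxyResponse('', 'alert(1);'), "\n";                 // down
```

In a real scan, both arguments would come from scanProxy(), once with the proxy and once with false.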
You can do all sorts of analytics with this function:
- Check if a proxy hides your IP by requesting http://ip.haschek.at, which prints the IP it sees, and comparing the result with the reference data (your public IP)
- Check if a proxy also tunnels HTTPS traffic. If not, it might be because the owner of the server wants only clear text so they can extract data from it
- Check if a proxy is adding anything to static websites (eg: ads)
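The IP check from the list above could look roughly like this. It's only a sketch: it assumes the IP service answers with nothing but the address as plain text, and the function name proxyHidesIp is hypothetical. Anything that doesn't look like an IP address (an error page, an empty response) is treated as "unknown" instead of being counted either way.

```php
<?php
// Hypothetical helper: compares the IP seen through the proxy with the IP seen directly.
// Returns true (hidden), false (not hidden) or null (one of the answers was not an IP).
function proxyHidesIp($ipViaProxy, $ipDirect)
{
    $viaProxy = trim((string)$ipViaProxy);
    $direct   = trim((string)$ipDirect);

    // An error page or an empty body is not a usable answer
    if (filter_var($viaProxy, FILTER_VALIDATE_IP) === false ||
        filter_var($direct, FILTER_VALIDATE_IP) === false) {
        return null;
    }

    return $viaProxy !== $direct;
}

var_dump(proxyHidesIp("203.0.113.7\n", '198.51.100.2')); // bool(true)
var_dump(proxyHidesIp('198.51.100.2', '198.51.100.2'));  // bool(false)
var_dump(proxyHidesIp('<html>blocked</html>', '198.51.100.2')); // NULL
```

The addresses in the example calls are documentation-range placeholders. In practice you would pass in the two results of scanProxy() for the IP service, one fetched through the proxy and one fetched with false.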
Let's test 443 free proxies
I harvested proxies from various sources, all of which I found via Google.
What are we checking for?
- Is HTTPS allowed?
- Is JS modified?
- Are static websites modified?
- Will it hide my IP?
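To give an idea of how these four checks fit together, here is a rough sketch of a per-proxy check. It is not the script I used for the scan. The function name runChecks, the keys of $urls and the rule "HTTPS is allowed if anything comes back for an https:// URL" are assumptions made for this example. The fetcher is passed in as a callable so that you can plug in scanProxy or a fake for testing.

```php
<?php
/**
 * Hypothetical sketch: runs the four checks against one proxy.
 * $fetch is any callable ($proxy, $url) => string that fetches directly
 * when $proxy is false, e.g. 'scanProxy'.
 * $urls needs the (made-up) keys 'https', 'js', 'html' and 'ip'.
 */
function runChecks(callable $fetch, $proxy, array $urls)
{
    $results = array();

    // HTTPS: assumed to be allowed if the proxy returns anything for an https:// URL
    $results['https_allowed'] = $fetch($proxy, $urls['https']) !== '';

    // JS and static HTML: modified if both requests answered but the bodies differ
    foreach (array('js', 'html') as $kind) {
        $reference = $fetch(false, $urls[$kind]);
        $viaProxy  = $fetch($proxy, $urls[$kind]);
        $results[$kind . '_modified'] =
            $reference !== '' && $viaProxy !== '' && $viaProxy !== $reference;
    }

    // IP: hidden if the IP service reports a different address through the proxy
    $directIp = $fetch(false, $urls['ip']);
    $proxyIp  = $fetch($proxy, $urls['ip']);
    $results['ip_hidden'] = $proxyIp !== '' && $proxyIp !== $directIp;

    return $results;
}
```

With the function from this post you would call it as runChecks('scanProxy', '127.0.0.1:9050', $urls), where $urls holds the four test URLs you picked. Note that scanProxy defaults to SOCKS, so an HTTP proxy would need a small wrapper that passes $socks=false.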
The results
Test | Result |
---|---|
Tested | 443 (100%) |
Online | 199 (44.9% of tested) |
Offline | 244 (55.1% of tested) |
No HTTPS | 157 (79% of online) |
Modified JS | 17 (8.5% of online) |
Modified HTML | 33 (16.6% of online) |
IP not hidden | 0 (0% of online) |
Not modifying content | 149 (75% of online) |
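If you want to check the percentages yourself: Online and Offline are relative to the 443 tested proxies, everything else is relative to the 199 that were online. A small sketch (the helper name pct is made up for this example):

```php
<?php
// Hypothetical helper: formats $part / $whole as a percentage with one decimal.
function pct($part, $whole)
{
    if ($whole == 0) {
        return 'n/a'; // nothing to divide by
    }
    return number_format($part / $whole * 100, 1, '.', '') . '%';
}

$tested = 443;
$online = 199;

// Counts taken from the table; the second value is the denominator
$rows = array(
    'Online'                => array(199, $tested),
    'Offline'               => array(244, $tested),
    'No HTTPS'              => array(157, $online),
    'Modified JS'           => array(17,  $online),
    'Modified HTML'         => array(33,  $online),
    'Not modifying content' => array(149, $online),
);

foreach ($rows as $label => $row) {
    printf("%-22s %3d (%s)\n", $label, $row[0], pct($row[0], $row[1]));
}
```

This gives 78.9% for "No HTTPS" and 74.9% for "Not modifying content", which the table rounds to 79% and 75%.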
So 75% of all proxies are safe, right?
Just because a proxy doesn't actively modify your content does not mean it's safe to use. The only way to use a free proxy and be somewhat safe is if it's HTTPS-capable and you're only surfing on HTTPS-enforced sites.
Only 21% of the online proxies allowed HTTPS traffic
Aftermath
Free proxy servers on the web tend to be offline, no surprise there, but I didn't expect so many proxies to block HTTPS traffic. It could be because they want you to use HTTP so they can analyze your traffic and steal your logins.
Only 17 of the 199 online proxies (8.5%) modified JS, most of them to inject ads. In two cases, though, the modification was just an error message or a web filter warning.
33 proxy servers (16.6%) were actively modifying static HTML pages to inject ads. Most of them injected the following code right before the ending