How to find out where the link with redirects leads: all intermediate sites and cookies

Links with Redirects

When you click a link with a redirect, you end up not on the site the link appears to point to, but on a different one. The best-known example is links produced by URL shortening services, for example: http://bit.do/fbb2f

Little can be said about such a link until you click it.

Redirects are often used on sites to mask external links: a visitor sees what looks like a link to an internal page of the site, for example http://hackware.ru/?goto=1, but clicking it leads to an external resource.

Analyzing links with redirects can be necessary, for example, when investigating phishing links or pages with long redirect chains: you open a site in the browser, it sends you to the next page without any action on your part, then to another, and so on.

How to identify all page redirects

For Linux, there is a special program called Hoper – it does exactly what interests us: it shows all the redirects made.

In Kali Linux, the program is installed as follows:

sudo apt install libcurl4-openssl-dev
sudo gem install gemspec hoper

The same commands will probably work on other Linux distributions as well.

In BlackArch, the program is installed like this:

sudo pacman -S hoper

However, at the moment it does not work in BlackArch, although it used to.

Using the program is simple:

hoper URL

For example:

hoper http://hackware.ru/?goto=1

Hoper has a number of problems. The main one is that it does not show all the redirects a link makes.

Other disadvantages:

  • requires Ruby
  • does not work in BlackArch
  • does not show cookies

If you want something to be done well, then do it yourself!

The program for revealing all redirects

The task seems pretty simple: you follow the link, see where it leads, go there, see where the next link leads, and repeat until there is nowhere left to go.

But there are the following difficulties:

  • relative redirects: the target can be absolute, for example https://hackware.ru/, or relative, for example “/blog” or “/”. A relative target cannot be followed as is; it first has to be turned into a correct absolute URL (apparently, Hoper does not know how to do this)
  • different response codes: several HTTP response codes signal a redirect, all of the form 3xx, and a parser has to account for this. At the same time, you cannot wait for response code 200 alone as the signal to stop crawling, because 404, 403 and other codes also mean that the chain has ended
  • cookie-dependent redirects: some sites set cookies and choose where to redirect based on them
  • some services actively counteract bots
  • redirects can be performed not only using HTTP headers, but also using JavaScript and HTML methods.
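
The first two difficulties can be illustrated with a minimal sketch. The helper names is_redirect_code and resolve_url are hypothetical (they are not part of Hoper), only the most common redirect codes are listed, and "../" segments are not collapsed, so this is an illustration rather than a complete URL resolver:

```shell
# Hypothetical helpers for illustration only.

# is_redirect_code CODE
# Succeeds for the common 3xx codes that point to a new address via Location.
is_redirect_code() {
    case "$1" in
        301|302|303|307|308) return 0 ;;
        *) return 1 ;;   # 200, 403, 404, 5xx and so on: stop following
    esac
}

# resolve_url BASE LOCATION
# Prints LOCATION as an absolute URL, using BASE (the page that sent it) as context.
# The body runs in a subshell, so its variables do not leak into the caller.
resolve_url() (
    base=$1
    loc=$2
    scheme=${base%%://*}      # e.g. "https"
    rest=${base#*://}         # e.g. "hackware.ru/blog/page?x=1"
    rest=${rest%%[?#]*}       # drop the query string and fragment
    host=${rest%%/*}          # e.g. "hackware.ru"
    path=${rest#"$host"}      # e.g. "/blog/page" (empty if BASE has no path)
    case "$loc" in
        http://*|https://*) printf '%s\n' "$loc" ;;                 # already absolute
        //*) printf '%s\n' "$scheme:$loc" ;;                        # same scheme, other host
        /*)  printf '%s\n' "$scheme://$host$loc" ;;                 # relative to the site root
        *)   printf '%s\n' "$scheme://$host${path%/*}/$loc" ;;      # relative to the current directory
    esac
)

resolve_url 'https://hackware.ru/blog/page?x=1' '/about'
resolve_url 'https://hackware.ru/blog/page?x=1' 'next'
```

The two calls at the end print https://hackware.ru/about and https://hackware.ru/blog/next.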

Example of a redirect using <meta http-equiv='refresh'>:

<meta http-equiv='refresh' content='1;url=https://pay2u.space/d/5d9a67e7a054c'>
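
A rough way to pull the target out of such a tag with standard tools is sketched below. The helper extract_meta_refresh is hypothetical, it expects the whole tag to sit on one line, and its regular expressions are a heuristic rather than a real HTML parser:

```shell
# extract_meta_refresh (hypothetical helper)
# Reads HTML on stdin and prints the URL of the first meta refresh tag, if any.
extract_meta_refresh() {
    # 1) keep only meta tags that declare http-equiv=refresh (any quoting, any case)
    # 2) cut out the "url=..." part of the content attribute
    # 3) drop the "url=" prefix and any quotes around the address
    grep -E -i -o "<meta[^>]*http-equiv=[\"']?refresh[^>]*>" |
        grep -E -i -o "url=[\"']?[^\"'> ]+" |
        head -n 1 |
        cut -c5- |
        tr -d "\"'"
}

printf '%s\n' "<meta http-equiv='refresh' content='1;url=https://pay2u.space/d/5d9a67e7a054c'>" |
    extract_meta_refresh
```

For the tag from the example above this prints https://pay2u.space/d/5d9a67e7a054c; for a page without a meta refresh it prints nothing.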

JavaScript redirect example:

<script>window.location.href = "http://fara.host/?o08z";</script>

Example of a JavaScript redirect with additional obfuscation:

<body><script>function ready(callback){ if(document.readyState!='loading') callback(); else if (document.addEventListener) document.addEventListener('DOMContentLoaded', callback); else document.attachEvent('onreadystatechange', function(){if (document.readyState=='complete') callback();});}ready(function(){ var options = { excludes: {canvas: true, fonts:true} }; Fingerprint2.get(options, function(components) { var fingerprint = Fingerprint2.x64hash128(components.map(function (pair) { return pair.value }).join(), 64); location.href = window.location.protocol + "//" + window.location.host + '/check-unique/index?unique_code='+fingerprint+'&link_type=partner&code=5d9a67e7a054c&u=&url=http://goldenreceiptwin.top/&upgrade=9c69cd6a8a0c0'})});</script></body>
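
The difference between the two JavaScript examples can be shown with a simple pattern. In this sketch, extract_js_redirect is a hypothetical helper that only recognizes a string literal assigned directly to location or location.href; other forms such as location.replace(...) are deliberately not covered:

```shell
# extract_js_redirect (hypothetical helper)
# Reads HTML/JavaScript on stdin and prints the first URL that is assigned
# to location or location.href as a plain string literal.
extract_js_redirect() {
    grep -E -o "location(\.href)?[[:space:]]*=[[:space:]]*[\"'][^\"']+[\"']" |
        head -n 1 |
        sed "s/^[^\"']*[\"']//; s/[\"']\$//"   # strip everything up to the opening quote, then the closing quote
}

# Plain assignment: the address is found
printf '%s\n' '<script>window.location.href = "http://fara.host/?o08z";</script>' |
    extract_js_redirect

# Address assembled at run time: nothing is printed
printf '%s\n' "location.href = window.location.protocol + \"//\" + window.location.host + '/check-unique/index'" |
    extract_js_redirect
```

The first call prints http://fara.host/?o08z. The second prints nothing, because the address only exists after the browser has executed the code, which is why pattern matching alone cannot fully handle obfuscated redirects.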

I solved the first three problems in my script; the fourth is only partially solved. The fifth is also only partially solved: more patterns can be added to detect redirects, but obfuscation cannot be defeated completely. By the way, if you would like to contribute examples of redirects to add to this script, write in the comments.

To use the script, create the dest-finder.sh file:

gedit dest-finder.sh

And copy into it:

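#!/bin/bash
# Follow a link hop by hop, print every redirect and the cookies collected on the way.

LINK=$1
COUNTER=1
COOKIES=/tmp/cookies.txt
UA='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'

# Make the scheme explicit (curl defaults to http:// anyway) so relative redirects can be resolved
[[ "$LINK" =~ ^https?:// ]] || LINK="http://$LINK"

# Start every run with an empty cookie jar
rm -f "$COOKIES"

echo "Received link for analysis: $LINK"
echo
while true; do
    # Request only the headers, sending and saving cookies
    HEADER=$(curl -s -I -A "$UA" --cookie-jar "$COOKIES" -b "$COOKIES" "$LINK")
    # Value of the Location header with all whitespace (including the trailing CR) removed
    LOCATION=$(echo "$HEADER" | grep -E -i '^Location: ' | head -n 1 | cut -d ' ' -f 2- | tr -d '[:space:]')
    CODE=$(echo "$HEADER" | head -n 1 | tr -d '\r')
    if [[ -z "$LOCATION" ]]; then
        # No HTTP redirect: download the page and look for a JavaScript or meta refresh redirect
        BODY=$(curl -s -A "$UA" --cookie-jar "$COOKIES" -b "$COOKIES" "$LINK")
        LOCATION=$(echo "$BODY" | grep -E -i "(location\.href)|(http-equiv=[\"']?refresh)" | grep -E -o "https?:[^\"'<> ]+" | head -n 1)
        if [[ -z "$LOCATION" ]]; then
            # Nothing left to follow: report the result and the collected cookies
            echo "Final destination: $LINK"
            echo
            echo "The following cookies were set during redirects: "
            awk '$1 != "#"' "$COOKIES" 2>/dev/null
            exit
        fi
    fi
    echo "Hop # : $COUNTER"
    echo "Received HTTP response code: $CODE"
    echo "Redirected to $LOCATION"
    echo ""

    if [[ "$LOCATION" =~ ^https?:// ]]; then
        LINK="$LOCATION"
    else
        # Relative redirect: build an absolute URL from the current one
        ORIGIN=$(echo "$LINK" | grep -E -o '^https?://[^/?#]+')
        if [[ "$LOCATION" == //* ]]; then
            LINK="${LINK%%:*}:$LOCATION"          # same scheme, another host
        elif [[ "$LOCATION" == /* ]]; then
            LINK="$ORIGIN$LOCATION"               # path from the site root
        else
            DIR="${LINK%%[?#]*}"                  # current URL without query string and fragment
            DIR="${DIR#"$ORIGIN"}"                # keep only the path
            LINK="$ORIGIN${DIR%/*}/$LOCATION"     # path relative to the current directory
        fi
    fi
    COUNTER=$((COUNTER + 1))
done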

Usage:

bash dest-finder.sh URL

Enclosing links in single quotes is recommended, since they may contain characters such as “&” that have a special meaning in the Bash shell.

For example:

bash dest-finder.sh 'http://hackware.ru/?goto=1'

Now all four redirects are shown.

How to view cookies set by redirects

I wondered how to find out which cookies are set during redirects. As I already mentioned, some redirects can only be processed correctly if you take into account the cookies the sites set, which is why the script above saves cookies and sends them back. Since the cookies are collected anyway, the script prints everything it received at the end of its run.

If you want the cookies to be displayed after each redirect, you can achieve this by slightly editing the script above.
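
One possible way to do that is sketched here. The helper print_cookies is hypothetical; it could be called inside the loop, for example right after the "Redirected to" line, as print_cookies /tmp/cookies.txt. It assumes the Netscape cookie file format that curl writes: seven tab-separated fields per cookie, with HttpOnly cookies prefixed by #HttpOnly_.

```shell
# print_cookies JAR (hypothetical helper)
# Lists the cookies stored in a curl cookie jar as "domain<TAB>name=value".
print_cookies() {
    # Nothing to show if curl has not written the jar yet
    [ -f "$1" ] || return 0
    # Comment and empty lines contain no tabs, so they have fewer than 7 fields
    # and are skipped; the #HttpOnly_ marker is removed from the domain field.
    awk -F '\t' 'NF >= 7 { sub(/^#HttpOnly_/, "", $1); print $1 "\t" $6 "=" $7 }' "$1"
}

# Possible use inside the loop of the script:
# print_cookies /tmp/cookies.txt
```

Because curl rewrites the cookie jar when each request finishes, calling the helper once per hop shows the cookies accumulated up to that hop.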

By the way, if you just want to see what cookies the site sets (even if there is no redirect on the page), then the previous script will also work:

bash dest-finder.sh 'https://www.youtube.com/watch?v=UFEz5fPYqNc'

Online service that shows where a link leads and which cookies websites set

Above is the source code of a simple script that does not require installation and needs nothing beyond Bash, curl and standard command-line utilities. If you would rather use an online service, so as not to bother even with launching the script, here it is: https://suip.biz/?act=hoper

This service used to be based on Hoper, but now it uses my script.

That is, it is suitable for you if:

  • you need to know all the intermediate pages of a redirect chain
  • you just need to see the cookies that the web page sets (even if it does not have a redirect)

If you find bugs in how the script works, write here in the comments and I will definitely fix them.

2 Comments to How to find out where the link with redirects leads: all intermediate sites and cookies

  1. Will says:

    Hello Alex, thank you for this information, it is useful.

    'Hoper' returned nothing for some reason, but your script worked fine. Thanks!

    I have been tormented recently by 'snowshoe' spam, hosted on an American Cloudflare server.

    So I'm getting interested in tracking down the source of this rubbish.

    • Alex says:

      Hello! Thanks for your feedback. I am glad to know that you have found the script useful. By the way, I recommend you to enclose the URL in single quotes because it can contain special characters such as “&” that break the command.
