How to download complete website from archive.org?

Archive.org is the home of the Internet Archive and its Wayback Machine, which stores snapshots of anything from a single web page to an entire website so that they can still be accessed in the future, even if the website goes down or is shut down for good. It archives many types of websites and files, such as PDFs, images, videos, and audio. That makes Archive.org a great way to recover a website that you closed or lost in the past and now want to start again.

We know it takes a lot of work to design, develop, and add content to a website. In this article, we will show you how to download an entire website from archive.org.

There could be numerous reasons for downloading a website from archive.org and some of them could be:

  • You want to download your old website
  • You want to get back the contents of a website that you forgot to renew
  • You want to download a copy of some other website

Whatever your reason for downloading a website from archive.org, it is fine as long as you use the content ethically. So, let’s begin.

Before we begin, make sure that you have the following things installed on your PC:

Things you will need

1. Ruby

You can download Ruby for free from http://rubyinstaller.org/

2. Wayback Machine Downloader

“Wayback Machine Downloader” is a script written in Ruby that downloads a website from archive.org. You can get the script for free from github.com/hartator/wayback-machine-downloader.

Download the zip file from the above URL and extract it. I recommend extracting it to the “C:\wayback” directory, as that makes this tutorial easier to follow.

Once you have installed Ruby and extracted the Wayback Machine Downloader script, follow the steps below.

Step 1 - Setting up the path

The most important step is to make sure the command prompt can find Ruby and to move into the Wayback Machine Downloader folder. Open the command prompt (cmd) and follow the instructions below:

  • Type path=<path of the Ruby bin directory>. For example, Ruby is installed in C:\Ruby23-x64\bin on my PC, so I typed path=C:\Ruby23-x64\bin. This replaces the search path for the current command prompt window only; to keep the existing entries as well, type path=%path%;C:\Ruby23-x64\bin instead.
  • Once the Ruby path is set, change to the directory that contains the script by typing cd followed by the path where you extracted Wayback Machine Downloader.
  • In my case the script is in C:\wayback\bin, so I typed cd C:\wayback\bin

Step 2 - Downloading the website

The next step is to download the website from archive.org. This is quite simple. All you need to do is type the following command, replacing the example address with the address of the website you want:

  • ruby wayback_machine_downloader http://your-website.com

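If you want to download the website as of a particular timestamp, add the “--timestamp” option to the above command. Newer releases of the script replace this option with “--from” and “--to”, so if “--timestamp” is rejected, run the script with “--help” to see which options your version supports. Example: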

  • ruby wayback_machine_downloader http://your-website.com --timestamp 20060716231334
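The timestamp in the example follows the 14-digit YYYYMMDDhhmmss pattern that the Wayback Machine uses in its snapshot URLs. As a small sketch, the Ruby snippet below converts such a value into a readable date so that you can double-check it before starting a download. The helper name describe_timestamp is made up for this illustration and is not part of the downloader script.

```ruby
require 'time'

# Hypothetical helper (not part of wayback_machine_downloader):
# turns a 14-digit Wayback Machine timestamp into a readable string.
def describe_timestamp(stamp)
  # Reject anything that is not exactly 14 digits before parsing.
  unless stamp =~ /\A\d{14}\z/
    raise ArgumentError, "expected 14 digits, got #{stamp.inspect}"
  end

  # Parse year, month, day, hour, minute, second and print them back out.
  Time.strptime(stamp, '%Y%m%d%H%M%S').strftime('%Y-%m-%d %H:%M:%S')
end

puts describe_timestamp('20060716231334') # prints 2006-07-16 23:13:34
```

Save it as a .rb file of your choice and run it with the ruby command, just like the downloader script.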

Step 3 - Locating the downloaded files

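By default, the downloaded files are stored in a “websites” folder inside the directory from which you ran the command, in a subfolder named after the website’s domain. If you followed the steps above, that is the “bin\websites” folder of the Wayback Machine Downloader directory (C:\wayback\bin\websites).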
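To check what was actually saved, a short Ruby sketch like the one below counts the files under the download folder and prints the first few. The path C:/wayback/bin/websites assumes the directory layout used in this tutorial, so adjust it if you extracted the script elsewhere. The helper name downloaded_files is hypothetical.

```ruby
# Hypothetical helper: returns every regular file below a folder.
# Hidden files (names starting with a dot) are skipped by Dir.glob's defaults.
def downloaded_files(root)
  Dir.glob(File.join(root, '**', '*')).select { |path| File.file?(path) }
end

# Assumed location, based on the C:\wayback layout used in this tutorial.
root = 'C:/wayback/bin/websites'
files = downloaded_files(root)

puts "#{files.size} files found under #{root}"
files.first(10).each { |path| puts path }
```

If the count is zero, the download either failed or was run from a different directory than the one assumed here.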

Isrg Team
Isrg Team is a member of Digital Pradesh News Networks, a collective of journalists, reporters, writers, editors, lawyers, advocates, professors, and scholars affiliated with the Digital Pradesh News Network.