HTTrack copies Web sites to your local disk for backup and offline viewing. WebHTTrack is the Linux version and WinHTTrack is the Windows version. Both are free, so you do not have to juggle licences, and both are open source, so anyone can inspect the code.
This article is based on the Linux version. The Windows version is almost the same: there are minor installation differences, and after installation they work the same way. You can download both from www.httrack.com.
There are a small number of legitimate uses for Web site copying.
If you have an old HTML-based Web site, you can back it up with HTTrack. After you upgrade the site to Drupal, there is no reason to back up the HTML representation of the files. Instead, back up the database and the image upload directory.
You have your new Drupal site and want to present it to people with sections highlighted. You could back up the site using HTTrack, then edit the HTML to add highlighting.
You can also create a version to be shipped on CD as a product catalogue.
A public site could be backed up regularly then used as point-in-time evidence.
You want to read a free online manual while sitting in a park, without paying for expensive wireless access. You can copy the manual at home or in the office using cheap broadband, then read it at your leisure in a different location.
Is there a free online manual read frequently by a lot of your staff? You could set up a proxy server to cache the site containing the manual or you could copy the manual to your local server using HTTrack. Think about the site owner missing out on advertising revenue because of your copy. The owner might not make enough money to continue working on the manual. Consider a donation to keep the manual development active.
Copying a small part of a book or Web site may count as fair use if you are a student studying the subject presented by the material you are copying. Copying a small part may also be legitimate for reviews.
Copying the whole site might be legitimate if the site owner asked you to quote on redeveloping the site. Copying the site is legitimate when you are paid to redevelop the site and the copy will be good proof that you transferred everything from the existing site.
A site uses a particular theme or navigation and you want to test a change. You could make a copy then edit the copy. In most cases you will need only a small sample to test your change, not the whole site.
Start Ubuntu 9.10. Other distributions of Linux and earlier versions of Ubuntu might require a more complicated installation.
Select Applications > Ubuntu Software Center.
Search for httrack.
Select the right arrow.
The rest is based on WebHTTrack version 3.43.5-1ubuntu1
The authentication screen pops up. Type in your password and select Authenticate.
WebHTTrack is now installed. Close the Ubuntu Software Center.
Select Applications > Internet.
There are two entries for HTTrack and both are duplicated.
Browse Mirrored Websites is the link to see the Web site copies. All it does is open Firefox at the index page of a directory named websites in your home directory.
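The behaviour of that menu entry can be approximated in a few lines of Python. This is only a sketch of the idea, not how the menu entry is actually implemented. The function name mirror_index_url and the file name index.html are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def mirror_index_url(home: Optional[Path] = None) -> str:
    """Return a file:// URL for the index page of the websites directory.

    Assumption: the index page is a file named index.html directly
    inside <home>/websites.
    """
    base = (home if home is not None else Path.home()) / "websites"
    return (base / "index.html").as_uri()
```

Passing the result to webbrowser.open() from the Python standard library opens the page in your default browser, which may or may not be Firefox.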
WebHTTrack Website Copier is the program to copy Web sites. Select the link to open WebHTTrack in Firefox and to start copying Web sites.
You can select a language; English is the default. Select Next >>.
Select an existing project or create a new project. You can give a project a category. You can change the base path, which defaults to /home/example/websites. Select Next >>.
Type in the URL of the Web site you want to copy. I will use the example of the Inkscape keyboard and mouse reference version 0.46 at http://www.inkscape.org/doc/keys046.html.
Look in the options to check that they fit what you want.
I like to select [*] Get HTML files first!. Under Browser ID, you can change the browser identity from the default:
Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)
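The browser identity is the User-Agent header sent with each request, and the Web server sees it in place of your real browser name. The following Python sketch illustrates the general idea with the standard library. It is not HTTrack's own code, the function name build_request is made up for the example, and nothing is sent over the network.

```python
import urllib.request

# Default browser identity quoted in this article.
DEFAULT_BROWSER_ID = "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"


def build_request(url, browser_id=DEFAULT_BROWSER_ID):
    """Prepare (but do not send) a request carrying the given browser identity."""
    return urllib.request.Request(url, headers={"User-Agent": browser_id})


req = build_request("http://www.inkscape.org/doc/keys046.html")
# urllib stores header names in capitalised form, hence "User-agent".
print(req.get_header("User-agent"))
```

Some sites serve different pages to different browsers, so changing the identity can change what you get in the copy.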
There is an HTML footer you might like to remove or change. There are also scan rules that control which files are copied, and the default is:
+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
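The patterns on the line above are scan rules: a leading + accepts matching addresses and a leading - rejects them. The Python sketch below is a simplified model of how such a list could be evaluated, assuming that a later rule overrides an earlier one and that * matches any run of characters. HTTrack's real matcher supports more syntax than this, and the host names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_RULES = "+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*"


def parse_rules(text):
    """Split a rule string into (allow, pattern) pairs."""
    rules = []
    for token in text.split():
        sign, pattern = token[0], token[1:]
        if sign not in ("+", "-"):
            raise ValueError("rule must start with + or -: " + token)
        rules.append((sign == "+", pattern))
    return rules


def decide(address, rules, default=None):
    """Return True (accept), False (reject), or default if no rule matches.

    Assumption: every rule is checked in order and the last match wins.
    The address is given without the leading http://.
    """
    verdict = default
    for allow, pattern in rules:
        if fnmatchcase(address, pattern):
            verdict = allow
    return verdict
```

With these rules, decide("www.example.com/logo.png", parse_rules(DEFAULT_RULES)) accepts the image, while an image under ad.doubleclick.net/ is rejected because the rejecting rule comes later in the list.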
There are options to limit the depth of the copy, the inclusion of external files, the speed of the copy, and a lot of other things to keep you out of trouble with various sites.
Select Next >>.
There is an option to save your settings but not start the copying.
Select Start >>.
At the end of the process there is a link to browse the Web site copy.
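If you would rather script a copy than click through the screens, the httrack command-line program accepts similar settings. The Python sketch below only builds the command. It assumes httrack is installed and on your PATH, and it uses the -O (output path), -r (depth) and -F (browser identity) options. The helper name build_httrack_command and the project directory inkscape-keys are made up for the example.

```python
import shlex


def build_httrack_command(url, output_dir, depth=None, browser_id=None, rules=()):
    """Return an argument list for the httrack command-line program."""
    cmd = ["httrack", url, "-O", output_dir]
    if depth is not None:
        cmd.append("-r%d" % depth)      # limit how many links deep to follow
    if browser_id:
        cmd += ["-F", browser_id]       # browser identity sent to the server
    cmd += list(rules)                  # scan rules such as +*.png
    return cmd


cmd = build_httrack_command(
    "http://www.inkscape.org/doc/keys046.html",
    "/home/example/websites/inkscape-keys",  # hypothetical project directory
    depth=2,
    rules=["+*.png", "+*.css"],
)
print(shlex.join(cmd))  # the command as you would type it in a terminal
```

To run the copy, pass the list to subprocess.run(cmd, check=True). Start with a small depth so a mistake does not download more than you intended.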
Experiment with a small site of your own. Copy only free open material.
HTTrack is a useful tool for Web developers, may be useful for Web site owners, and may be useful for people placing reference material on intranets.