Site Retriever

1999-02-05
Function
Downloads an entire web site and copies its content hierarchy to a local directory.
Applications
Applications include evidence gathering, mirroring, and caching.
Known Competitors
GetBot: a shareware GUI product for Wintel
Current Features
Command-line version only.
Respects the Robots Exclusion Protocol.
Future Features
Needs a Graphical User Interface (GUI).
Needs to be able to start from a page other than the specified branch root, so that sibling branches can be downloaded even if their common parent cannot.
Needs an option to convert absolute links to relative links during the download.
Known Bugs
None.

Version

The latest version is 1.0a1, 1999-02-05.

Requires Java 2 (1.2).

License

Site Retriever
Version 1.0a1, 1999-02-05
Binary Code License

This binary code license ("License") contains rights and
restrictions associated with use of the accompanying

Site Retriever Version 1.0a1, 1999-02-05

software and documentation ("Software"). Read the License
carefully before using the Software. By using the Software
you agree to the terms and conditions of this License.

1.  License to Distribute. Licensee is granted a royalty-free
right to reproduce and distribute the Software provided that
Licensee: (i) distributes the Software complete and
unmodified; (ii) does not distribute additional software
intended to replace any component(s) of the Software; (iii)
does not remove or alter any proprietary legends or notices
contained in the Software; (iv) only distributes the Program
subject to a license agreement that protects ANSER's
interests consistent with the terms contained herein; and
(v) agrees to indemnify, hold harmless, and defend ANSER and
its licensors from and against any claims or lawsuits,
including attorneys' fees, that arise or result from the use
or distribution of the Program. 

2.  Restrictions. (a) Software is confidential copyrighted
information of ANSER and title to all copies is retained by
ANSER and/or its licensors. Except as otherwise provided by
law for purposes of decompilation of the Software, Licensee
shall not translate, reverse engineer, disassemble,
decompile, or otherwise attempt to derive the source code of
Software. Software may not be leased, assigned, or
sublicensed, in whole or in part, except as specifically
authorized in Section 1. (b) Software is not designed or
intended, and ANSER expressly disclaims any representations or
warranties (either expressed or implied), for use (i) in
online control of aircraft, air traffic, aircraft navigation
or aircraft communications; or (ii) in the design,
construction, operation or maintenance of any nuclear
facility.  

3.  Trademarks and Logos. This License does not authorize
Licensee to use any ANSER name, trademark or logo.

4.  Disclaimer of Warranty. Software is provided "AS IS,"
without a warranty of any kind. ALL EXPRESS OR IMPLIED
REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED
WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED.

5.  Limitation of Liability.   IN NO EVENT WILL ANSER OR ITS
LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR
FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR
PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE
THEORY OF LIABILITY, RELATING TO THE USE, DOWNLOAD,
DISTRIBUTION OF OR INABILITY TO USE SOFTWARE, EVEN IF ANSER
HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

6.  Termination.  Licensee may terminate this License at any
time by destroying all copies of Software. This License will
terminate immediately without notice from ANSER if Licensee
fails to comply with any provision of this License. Upon
such termination, Licensee must destroy all copies of
Software.

7.  Maintenance and Support.  No upgrades or support are to
be provided to Licensee under the terms of this License.

8.  Export Regulations. Software, including technical data,
is subject to U.S. export control laws, including the U.S.
Export Administration Act and its associated regulations,
and may be subject to export or import regulations in other
countries. Licensee agrees to comply strictly with all such
regulations and acknowledges that it has the responsibility
to obtain licenses to export, re-export, or import Software.
Software may not be downloaded, or otherwise exported or
re-exported (i) into, or to a national or resident of, Cuba,
Iraq, Iran, North Korea, Libya, Sudan, Syria or any country
to which the U.S. has embargoed goods; or (ii) to anyone on
the U.S. Treasury Department's list of Specially Designated
Nations or the U.S. Commerce Department's Table of Denial
Orders.

9.  Restricted Rights. Use, duplication or disclosure by the
United States government is subject to the restrictions as
set forth in the Rights in Technical Data and Computer
Software Clauses in DFARS 252.227-7013(c) (1) (ii) and FAR
52.227-19(c) (2) as applicable.

10. Governing Law. Any action related to this License will
be governed by West Virginia law and controlling U.S. federal
law. No choice of law rules of any jurisdiction will apply.

11. Severability. If any of the above provisions are held to
be in violation of applicable law, void, or unenforceable in
any jurisdiction, then such provisions are herewith waived
or amended to the extent necessary for the License to be
otherwise enforceable in such jurisdiction.   However, if in
ANSER's opinion deletion or amendment of any provisions of the
License by operation of this paragraph unreasonably
compromises the rights or increases the liabilities of ANSER or
its licensors, ANSER reserves the right to terminate the
License and refund the fee paid by Licensee, if any, as
Licensee's sole and exclusive remedy.

Installation and Execution

  1. Download siteretriever.zip.
    Extract SiteRetriever.jar from the zip file.

  2. Install Java on your computer if you do not already have it.

  3. To run Site Retriever from the command line, enter
         java -jar SiteRetriever.jar

    When run this way, it prints a help message similar to the following:
    
    SiteRetriever Copyright 1998-1999 Analytic Services, Inc.
    Version 1.0a1, 1999-02-05
    http://nexos.anser.org:8080/java/app/siteretriever/
    David Wallace Croft, croftd@anser.org
    
    Downloads an entire web site and copies its content hierarchy
    to a local directory.
    
    Command-line Arguments:
      1.  your e-mail address;
      2.  root of web site branch to download; and
      3.  local destination directory.
    
    Example:
    java -jar SiteRetriever.jar croftd@anser.org http://www.anser.org/ C:\mirror\
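
The help message says the content hierarchy is copied to a local directory. A minimal sketch of how a page URL might be mapped to a file under the destination directory follows. The class name LocalPathMapper, the method toLocalPath, and the choice of "index.html" for directory URLs are assumptions made for illustration; they are not taken from Site Retriever, and the code uses current Java APIs rather than Java 1.2.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: not part of Site Retriever.
final class LocalPathMapper {

    /** Maps a page URL located under rootUrl to a file path under destDir. */
    static Path toLocalPath(String rootUrl, String pageUrl, Path destDir) {
        String rootPath = URI.create(rootUrl).getPath();
        String pagePath = URI.create(pageUrl).getPath();
        if (rootPath == null || pagePath == null || !pagePath.startsWith(rootPath)) {
            throw new IllegalArgumentException("Page is not under the root: " + pageUrl);
        }
        // Keep only the part of the path below the branch root.
        String relative = pagePath.substring(rootPath.length());
        while (relative.startsWith("/")) {
            relative = relative.substring(1);
        }
        // Assumption: a directory URL is stored as index.html.
        if (relative.isEmpty() || relative.endsWith("/")) {
            relative += "index.html";
        }
        Path base = destDir.normalize();
        Path target = base.resolve(relative).normalize();
        // Refuse paths such as "../x" that would escape the destination directory.
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes destination: " + pageUrl);
        }
        return target;
    }

    public static void main(String[] args) {
        Path dest = Paths.get("mirror");
        System.out.println(toLocalPath(
            "http://www.anser.org/", "http://www.anser.org/docs/a.html", dest));
        System.out.println(toLocalPath(
            "http://www.anser.org/", "http://www.anser.org/", dest));
    }
}
```

The check against the normalized destination directory is a defensive choice: it keeps a URL containing ".." segments from writing outside the mirror directory.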
    
         

Usage Notes

Site Retriever abides by the Robots Exclusion Protocol, so some pages may not be downloaded if the webmaster has chosen to block them from softbot access.
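
To illustrate what honoring the Robots Exclusion Protocol involves, here is a minimal sketch of reading a robots.txt file and testing a path against its Disallow rules. It follows the original convention that a robot uses the record naming it, and otherwise the "*" record. The class RobotsRules and the robot name "SiteRetriever" are hypothetical; the source does not say which user-agent token the tool sends. Later extensions such as Allow lines and wildcards are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: not Site Retriever's actual parser.
final class RobotsRules {

    /** Returns the Disallow path prefixes that apply to the given robot name. */
    static List<String> disallowedPrefixes(String robotsTxt, String robotName) {
        List<String> specific = new ArrayList<>();
        List<String> wildcard = new ArrayList<>();
        String name = robotName.toLowerCase(Locale.ROOT);
        boolean sawSpecific = false;      // some record named this robot
        boolean matchesSpecific = false;  // current record names this robot
        boolean matchesWildcard = false;  // current record applies to "*"
        boolean inAgentLines = false;     // reading consecutive User-agent lines

        for (String raw : robotsTxt.split("\\r?\\n")) {
            int hash = raw.indexOf('#');  // strip comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;                 // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {      // a new record begins
                    matchesSpecific = false;
                    matchesWildcard = false;
                }
                inAgentLines = true;
                if (value.equals("*")) {
                    matchesWildcard = true;
                } else if (!value.isEmpty()
                        && name.contains(value.toLowerCase(Locale.ROOT))) {
                    matchesSpecific = true;
                    sawSpecific = true;
                }
            } else {
                inAgentLines = false;
                // An empty Disallow value permits everything, so it adds no prefix.
                if (field.equals("disallow") && !value.isEmpty()) {
                    if (matchesSpecific) specific.add(value);
                    if (matchesWildcard) wildcard.add(value);
                }
            }
        }
        return sawSpecific ? specific : wildcard;
    }

    /** A path is allowed unless it starts with one of the disallowed prefixes. */
    static boolean isAllowed(String path, List<String> disallowed) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nDisallow: /private/\n";
        List<String> rules = disallowedPrefixes(robots, "SiteRetriever");
        System.out.println(isAllowed("/private/notes.html", rules));
        System.out.println(isAllowed("/public/index.html", rules));
    }
}
```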

Site Retriever only downloads pages that appear to be in the same branch as the root URL specified on the command line. For example, if

    http://nexos.anser.org:8080/~croftd/

were specified on the command line, only pages that match the following criteria would be downloaded:
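
As an illustration only, the following sketch shows one plausible same-branch test: a candidate URL is accepted when its scheme, host, and port equal those of the root and its path begins with the root's path. These rules, and the names BranchFilter and isInBranch, are assumptions made for this example and are not Site Retriever's documented criteria.

```java
import java.net.URI;

// Hypothetical sketch of a same-branch test; the real criteria may differ.
final class BranchFilter {

    /** Returns true if candidateUrl appears to lie under the branch rooted at rootUrl. */
    static boolean isInBranch(String rootUrl, String candidateUrl) {
        URI root = URI.create(rootUrl).normalize();
        URI candidate = URI.create(candidateUrl).normalize();
        if (root.getScheme() == null || root.getHost() == null) {
            throw new IllegalArgumentException("Root must be an absolute URL: " + rootUrl);
        }
        // Assumption: an omitted port and an explicit default port are treated as different.
        return root.getScheme().equalsIgnoreCase(candidate.getScheme())
            && root.getHost().equalsIgnoreCase(candidate.getHost())
            && root.getPort() == candidate.getPort()
            && candidate.getPath().startsWith(root.getPath());
    }

    public static void main(String[] args) {
        String root = "http://nexos.anser.org:8080/~croftd/";
        System.out.println(isInBranch(root, "http://nexos.anser.org:8080/~croftd/java/index.html"));
        System.out.println(isInBranch(root, "http://nexos.anser.org:8080/java/app/siteretriever/"));
    }
}
```

Normalizing both URLs first means a link such as ".../~croftd/../other/" is judged by the path it actually resolves to.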

Your e-mail address is provided on the command line so that webmasters may contact you if they have problems or concerns with your use of Site Retriever on their site. Your e-mail address is passed to the web server with each file request as part of the standard request header.
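
A minimal sketch of attaching a contact address to a request follows. It assumes the address travels in the HTTP "From" request header, which is the standard header for this purpose; the source does not name the header Site Retriever uses. The class ContactHeader and the method openWithContact are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical sketch: not Site Retriever's actual request code.
final class ContactHeader {

    /**
     * Prepares a connection for pageUrl with the e-mail address set in the
     * From header. No network traffic occurs until the caller connects or
     * reads from the returned connection.
     */
    static URLConnection openWithContact(String pageUrl, String email) {
        try {
            URLConnection connection = URI.create(pageUrl).toURL().openConnection();
            connection.setRequestProperty("From", email);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection =
            openWithContact("http://www.anser.org/", "croftd@anser.org");
        System.out.println(connection.getRequestProperty("From"));
    }
}
```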

Version History

First release.

Feedback

Feature suggestions, bug reports, comments, and questions may be directed to David Wallace Croft at croftd@nexos.anser.org.

The latest version of this web page can be found at
http://nexos.anser.org:8080/java/app/siteretriever/