
Project Description

HarvestMan is a multithreaded offline browser. It offers many features for customizing offline browsing, including URL filters, word filters, domain filters, URL priorities, depth-fetching, fetch levels, file limits, time limits, and support for the robot exclusion protocol. It is useful for downloading an entire Web site, or selected files from a Web site, to the hard disk for later offline browsing. It supports the HTTP, HTTPS, and FTP protocols and can work across proxies.
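
As a rough illustration of how filters like the ones listed above can combine, here is a minimal Python 3 sketch. It is not HarvestMan's own code or configuration format: the function `make_filter` and all of its parameters are hypothetical names chosen for this example, and the robots.txt rules are passed in as lines rather than fetched.

```python
import re
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def make_filter(allowed_domain, exclude_pattern, max_depth, robots_lines):
    """Build a predicate that decides whether a URL should be fetched.

    All names here are illustrative assumptions, not HarvestMan options.
    """
    robots = RobotFileParser()
    robots.parse(robots_lines)          # parse robots.txt rules given as lines
    exclude = re.compile(exclude_pattern)

    def allowed(url, depth):
        if depth > max_depth:                          # depth limit
            return False
        if urlparse(url).hostname != allowed_domain:   # domain filter
            return False
        if exclude.search(url):                        # URL filter
            return False
        return robots.can_fetch("*", url)              # robot exclusion

    return allowed


allowed = make_filter(
    "example.com",
    r"\.(gif|jpg)$",
    2,
    ["User-agent: *", "Disallow: /private/"],
)
print(allowed("http://example.com/index.html", 1))
print(allowed("http://example.com/private/a.html", 1))
```

Checking the cheap conditions (depth, domain) before the regular expression and robots lookup keeps the common rejection path fast.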

System Requirements

No system requirements are defined.

Information regarding project releases and project resources. Note that the information here is quoted from the Freecode.com page, and the downloads themselves may not be hosted on OSDN.

2005-09-09 11:32
1.4.6

The install scripts were fixed. They had problems
working with Python 2.4.
Tags: Minor bugfixes

2005-08-20 16:22
1.4.5

This release fixes a bug in the regular expression for localizing
URLs, a bug related to resuming a project by reading back its project
file, and errors with a few command line options that were not working
correctly. It adds a subdomain flag to the command line.
Tags: Minor bugfixes

2005-08-02 10:00
1.4.5 beta 1

This release adds new, user-friendly command line options; a new nocrawl command line flag for downloading only the given URLs, similar to wget; support for the .chm, .cfm, .cfml, .php4, and .aspx Web page extensions; and a duplicate-link bugfix for the URL tree printing option. Other minor bugfixes were made, and readme.txt was updated.
Tags: Development, Minor feature enhancements

2005-07-21 23:59
1.4.5 alpha 2

This release replaces lists at critical places with the new collections.deque data structure, which improves performance when run with Python 2.4. A bug with HTTP redirect handling that requires cookies has been fixed. Many bugs that created invalid URL (HTTP 404) errors have been fixed. The modules htmlparser and cookiemgr have been removed, since they are no longer used. The default locale has been changed to 'C'. Bugs in the logger.py, connector.py, and config.py modules have been fixed.
Tags: Development, Major bugfixes
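
To show why swapping a list for collections.deque can help a crawler, here is a minimal sketch of a breadth-first URL queue. It is written for modern Python 3 and is not taken from HarvestMan: the function `crawl_order` and the `links` mapping are hypothetical stand-ins for a real fetch step.

```python
from collections import deque


def crawl_order(seed_urls, links):
    """Return URLs in breadth-first order using a deque as the work queue.

    `links` is a hypothetical dict mapping a URL to the URLs it links to;
    a real crawler would fetch and parse each page instead.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    order = []
    while queue:
        # popleft() is O(1); the list equivalent, pop(0), shifts every
        # remaining element and is O(n).
        url = queue.popleft()
        order.append(url)
        for child in links.get(url, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


links = {"a": ["b", "c"], "b": ["c", "d"]}
print(crawl_order(["a"], links))
```

The saving grows with queue length, so it matters most on large crawls where thousands of URLs are pending at once.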

2005-05-27 22:43
1.4.5 a1

The config file format has been changed from text to XML. There is a new HTML parser based on the SGMLParser module, and the dependency on HTML Tidy has been removed. A new archive feature archives project files to tar.bz2 or tar.gz archives. Project caching has changed: Web page data is compressed before being written to the cache, there is an option for writing the cache in DBM format, and URL headers are also written to the cache. A new junk filter filters out banner ads and similar URLs. This release works with Python 2.4.
Tags: Development, Major feature enhancements
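
The caching and archiving changes described above can be sketched with the Python standard library. This is an illustration in modern Python 3 under stated assumptions, not HarvestMan's actual cache layout: the key scheme (`body:` and `headers:` prefixes), the JSON encoding of headers, and the function names `cache_page`, `load_page`, and `archive_project` are all invented for this example.

```python
import dbm
import json
import os
import tarfile
import zlib


def cache_page(db_path, url, headers, body):
    """Write compressed page data and its headers to a DBM cache."""
    with dbm.open(db_path, "c") as db:  # "c": create the file if missing
        db[("body:" + url).encode("utf-8")] = zlib.compress(body)
        db[("headers:" + url).encode("utf-8")] = json.dumps(headers).encode("utf-8")


def load_page(db_path, url):
    """Read a page back from the cache, decompressing its body."""
    with dbm.open(db_path, "r") as db:
        body = zlib.decompress(db[("body:" + url).encode("utf-8")])
        headers = json.loads(db[("headers:" + url).encode("utf-8")].decode("utf-8"))
    return headers, body


def archive_project(project_dir, archive_path, compression="bz2"):
    """Pack a project directory into a tar.bz2 (or tar.gz with "gz") archive."""
    with tarfile.open(archive_path, "w:" + compression) as tar:
        tar.add(project_dir, arcname=os.path.basename(project_dir))
```

Storing headers alongside the body lets a later run compare them against the server's response and skip pages that have not changed, which is one plausible reason for caching them.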

Project Resources