Simple Python Distributed Indexing

SPyDI is a powerful engine for creating
distributed full-text indexing systems and
distributed search engines. It supports
harvesting and crawling (pull methods) and push
methods (via a Web interface or SPyRO Web
services). It supports the boolean and vector
information retrieval models. It has few
dependencies and comes with its own HTTP
server, an HTML-embedded page language
(called pyew and wey pages), and a session
manager. It can use the SMTP support of the
Python library, and the default modules can be
replaced with more capable ones (Apache,
exim, etc.).
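
To illustrate the difference between the boolean and vector retrieval
models mentioned above, here is a minimal Python sketch. It is not
SPyDI code and does not use SPyDI's API; the toy corpus and the names
DOCS, build_index, boolean_and, and cosine_rank are assumptions made
up for this example. The boolean model returns the unordered set of
documents that contain every query term, while the vector model ranks
documents by cosine similarity of raw term counts.

```python
import math
from collections import Counter

# Toy corpus; the document ids and texts are invented for illustration.
DOCS = {
    "d1": "python distributed indexing engine",
    "d2": "distributed search engine with crawling",
    "d3": "static photo album generator",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop words)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index, terms):
    """Boolean model: a document matches only if it has every term."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def cosine_rank(docs, query):
    """Vector model: rank matching documents by cosine similarity."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = {}
    for doc_id, text in docs.items():
        d = Counter(tokenize(text))
        dot = sum(q[t] * d[t] for t in q)
        if dot:  # skip documents that share no term with the query
            d_norm = math.sqrt(sum(v * v for v in d.values()))
            scores[doc_id] = dot / (q_norm * d_norm)
    return sorted(scores, key=scores.get, reverse=True)


INDEX = build_index(DOCS)
```

With this corpus, boolean_and(INDEX, ["distributed", "engine"]) matches
d1 and d2 without ordering them, while cosine_rank(DOCS, "distributed
indexing") places d1 ahead of d2 because d1 contains both query terms.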

Zgal generates a static HTML photo album with thumbnails. The Web
gallery supports rendered themes, transparent rollover buttons,
captions, background images, and advertisement blockers for Lycos,
Geocities, Tripod, etc.

XPath Methods

XPath Methods allows XPath queries on ParsedXML XML documents (and possibly other DOM implementations) in Zope. XPath is a relatively simple but still quite powerful query language used to address portions of XML documents. When you call an XPath Method you will retrieve a set of DOM nodes which you can then display in a Web page using DTML or ZPT, or which you can issue operations upon using, for instance, Python scripts.
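
As a rough illustration of what an XPath query returns, the following Python sketch selects a set of nodes from a small XML document. It does not use Zope, ParsedXML, or the XPath Methods product; it relies on the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The sample document and the names XML, root, nodes, and titles are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Sample document invented for illustration.
XML = """
<catalog>
  <book lang="en"><title>Indexing Basics</title></book>
  <book lang="de"><title>Volltextsuche</title></book>
  <book lang="en"><title>Query Languages</title></book>
</catalog>
"""

root = ET.fromstring(XML.strip())

# Select the <title> child of every <book> whose lang attribute is "en".
# The result is a list of element nodes, in document order.
nodes = root.findall("./book[@lang='en']/title")

# The nodes can then be displayed or processed further.
titles = [node.text for node in nodes]
```

Here titles holds the text of the two English-language titles; in Zope the equivalent node set would be handed to DTML, ZPT, or a Python script instead.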

Xapian Fu

XapianFu is a Ruby library for working with Xapian databases. It builds on the GPL licensed Xapian Ruby bindings, but provides an interface more in-line with "The Ruby Way"(tm) and is considerably easier to use. For example, you can work almost entirely with Hash objects, and XapianFu will handle converting the Hash keys into Xapian term prefixes when indexing and when parsing queries. It also handles storing and retrieving hash entries as Xapian::Document values. XapianFu basically gives you a persistent Hash with full text indexing (and ACID transactions).

Docindex document management

Docindex is an open, extensible system that permits Web-based catalog searches and access-controlled fetch from a group of document repositories on multiple CVS (extensible to other) servers. Documents remain under CVS version control and are made available to Web users using bookmarkable URLs pointing to specific versions or branches.

OpenEphyra is a question answering (QA) system. It retrieves answers to natural language questions from the Web and other sources. OpenEphyra comes with implementations of algorithms that proved effective in Carnegie Mellon's Ephyra system, which participated in the TREC evaluations. It is platform independent and can be set up in just a few minutes. The goal of this project is to give researchers the opportunity to develop new QA techniques without worrying about the end-to-end system.

FTPSearch/Agent is a fully functional FTP indexing
and search engine for medium-sized local
networks (up to 200 users). It features an
associative extension of the search results
system that helps you gather relevant results.

Pyndex is a simple and fast full-text indexer implemented in Python. Pyndex also includes an easy to use Bayesian classifier. It uses Metakit as its storage back-end. It works well for quickly adding a search feature to an application, and is also well suited to in-memory indexing and searching. It can handle phrase queries. It performs best in applications involving a few thousand documents, but its scaling is mostly limited by available memory.
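
To show how an in-memory indexer can answer phrase queries, here is a minimal Python sketch of a positional inverted index. It is not Pyndex's API and does not use Metakit or a Bayesian classifier; the class PositionalIndex and its methods add and phrase are hypothetical names chosen for this example. Each term maps to the positions where it occurs in each document, and a phrase matches when the terms appear at consecutive positions.

```python
from collections import defaultdict


class PositionalIndex:
    """In-memory index mapping term -> {doc_id: [positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        """Record the position of every whitespace-separated term."""
        for pos, term in enumerate(text.lower().split()):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def phrase(self, query):
        """Return ids of documents that contain the terms consecutively."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set()
        first = self.postings.get(terms[0], {})
        for doc_id, starts in first.items():
            for start in starts:
                # Every later term must sit exactly `offset` places
                # after the first term in the same document.
                if all(
                    start + offset
                    in self.postings.get(term, {}).get(doc_id, ())
                    for offset, term in enumerate(terms[1:], start=1)
                ):
                    hits.add(doc_id)
                    break
        return hits


index = PositionalIndex()
index.add(1, "fast full text indexer in Python")
index.add(2, "text in full colour")
```

Both documents contain "full" and "text", but index.phrase("full text") matches only document 1, because only there do the two words appear next to each other in that order. Since everything lives in Python dictionaries, such an index is bounded by available memory.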

phpContact is a PHP3 application that allows you to
easily add a contact manager to your Web
site. Its features include configurable summary
field views, adding new contacts, and administration
tools to modify and delete contacts.

GFXIndex creates thumbnails (small representations of the original images) and HTML files to make an album that helps you organize your pictures and publish them on a Web page.

Search::Xapian is a Perl XS frontend to the Xapian C++ search library. It is a fairly complete wrapper: most features of the Xapian library are made available for use from Perl. Xapian is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. It's fast and scalable to hundreds of millions of documents.

fbrowse is a Web-based file browser. It supports
image previews and selectable styles.

Foxtrot full text

Foxtrot is full-text indexing software for PDF, 1 and 2, MS Word, and XLS files.
The package provides two different frontends: a
Google-like searching tool implemented with
Perl-Gtk and a PHP-based Web interface. The
backend scans directories asynchronously, converts
files to text, and indexes them in a MySQL

Web Spider

Web Spider is an application to download Web
pages, images, and audio clips with an extensible
searching algorithm and an eye-catching user

Exporia is a PHP photo album that stores all non-graphic data in MySQL. You can write and display comments on pictures in multiple languages. Exporia's directory structure is straightforward, and it works with PHP's safe mode.

