Swish-e

From SME Server
Revision as of 19:52, 13 March 2009 by Elmarconi (talk | contribs) (Swish-e is a free open source system for indexing web pages or other files)


Description

http://www.swish-e.org

Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.

Forum link

http://forums.contribs.org/index.php/topic,43486.0.html

Installation

Download the RPMs from http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/

wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-debuginfo-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-devel-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-api-2.4.5-4.i386.rpm
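
The five downloads above differ only in the package name, so they can also be generated in a loop. The following is a minimal sketch; the helper name swish_rpm_urls and the two variables are made up for illustration, while the base URL, version, and package names are copied from the commands above.

```shell
# Base URL and version string taken from the wget commands above.
BASE_URL="http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386"
VERSION="2.4.5-4"

# Print one download URL per package (hypothetical helper, not part of swish-e).
swish_rpm_urls() {
    for pkg in swish-e swish-e-debuginfo swish-e-devel swish-e-perl swish-e-perl-api; do
        echo "$BASE_URL/$pkg-$VERSION.i386.rpm"
    done
}

# To download everything, feed the list to wget on standard input:
# swish_rpm_urls | wget -i -
```

Printing the URLs first makes it easy to check the list before anything is fetched.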

Install the packages, pulling dependencies from the dag repository, by issuing the following command on the SME Server shell:

yum --enablerepo=dag localinstall swish-e-2.4.5-4.i386.rpm swish-e-d* swish-e-p*

There is no need to reboot.

Setup

To have swish-e index .doc, .xls, and .pdf files, install the following packages:

yum install perl-Spreadsheet-ParseExcel --enablerepo=dag
yum install perl-MIME-Types --enablerepo=dag
yum install xpdf

Test the filters:

swish-filter-test 
swish-filter-test -man
swish-filter-test -headers /path/to/xlsfile.xls
swish-filter-test -headers /path/to/docfile.doc
swish-filter-test -headers /path/to/pdffile.pdf

Configuration

As I was not interested in indexing web pages, only files in ibays, I used the following spider: /usr/libexec/swish-e/DirTree.pl

I modified it so that it indexes only .doc, .xls, and .pdf files:

sub check_path {
   my $path = shift;
   return 1 if $path =~ /\.doc$/;  # true if the path ends in .doc
   return 1 if $path =~ /\.xls$/;  # true if the path ends in .xls
   return 1 if $path =~ /\.pdf$/;  # true if the path ends in .pdf
   return 0;  # otherwise false
}
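
Before indexing, it can help to preview which files the three extensions in check_path are meant to select. The following is a rough shell equivalent using find, not part of DirTree.pl; the function name list_indexable is made up for illustration, and the match is case-sensitive, so a file named REPORT.DOC would not be listed.

```shell
# List regular files under directory $1 whose names end in .doc, .xls, or .pdf.
# Hypothetical helper for previewing; swish-e itself does not use it.
list_indexable() {
    find "$1" -type f \( -name '*.doc' -o -name '*.xls' -o -name '*.pdf' \) | sort
}

# Example (ibayname is a placeholder):
# list_indexable /home/e-smith/files/ibays/ibayname/files
```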

Next, create a config file named ibay.cfg:

IndexDir /usr/libexec/swish-e/DirTree.pl
SwishProgParameters /home/e-smith/files/ibays/ibayname/files
StoreDescription HTML <body> 20000
# replace to make links to UNC
# works in IE, needs fix for Firefox
ReplaceRules remove /home/e-smith/files/ibays
ReplaceRules prepend //smeservername
ReplaceRules replace /files/ /
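
To see what the three ReplaceRules lines do to a stored path, the rewrite can be approximated with sed. This is only a sketch: it assumes the rules are applied in the order listed and as plain string operations, the function name rewrite_path is made up, and report.doc is an invented example file.

```shell
# Approximate the three ReplaceRules with sed (an illustration, not swish-e itself).
rewrite_path() {
    printf '%s\n' "$1" | sed \
        -e 's|/home/e-smith/files/ibays||g' \
        -e 's|^|//smeservername|' \
        -e 's|/files/|/|g'
}

rewrite_path /home/e-smith/files/ibays/ibayname/files/report.doc
# prints //smeservername/ibayname/report.doc
```

The result has the form //server/share/file, which is what the UNC comment in the config refers to.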

Next, run swish-e. The index files will be placed in the current directory.

swish-e -c ibay.cfg -S prog -v 9

This should create both index.swish-e and index.swish-e.prop in the current directory.
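
A quick way to confirm that indexing worked is to check for both files and then run a test query. The following is a minimal sketch; the function name index_ready and the search word report are placeholders chosen for illustration.

```shell
# Succeed only if both index files exist in directory $1 (default: current directory).
# Hypothetical helper, not part of swish-e.
index_ready() {
    dir=${1:-.}
    [ -f "$dir/index.swish-e" ] && [ -f "$dir/index.swish-e.prop" ]
}

# Example: search the new index for a word, showing at most 10 results.
# index_ready . && swish-e -f index.swish-e -w report -m 10
```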

Under construction

swish.cgi

Under construction

Options

Under construction

Usage

Under construction