Difference between revisions of "Swish-e"
Revision as of 10:36, 15 March 2009
Description
Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.
Forum link
http://forums.contribs.org/index.php/topic,43486.0.html
Installation
Download the RPMs from http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-debuginfo-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-devel-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-api-2.4.5-4.i386.rpm
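The five downloads above share one base URL and one version string, so they can also be fetched in a loop. The following is only a sketch: the helper names rpm_urls and fetch_rpms are made up for illustration, and the package list is copied from the wget commands above.

```shell
# Base URL, version and package names are taken from the wget commands above.
BASE_URL="http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386"
PACKAGES="swish-e swish-e-debuginfo swish-e-devel swish-e-perl swish-e-perl-api"

# Print one download URL per package (hypothetical helper).
rpm_urls() {
    for pkg in $PACKAGES; do
        echo "$BASE_URL/$pkg-2.4.5-4.i386.rpm"
    done
}

# Download every RPM into the current directory (hypothetical helper).
# Stops at the first failed download.
fetch_rpms() {
    for url in $(rpm_urls); do
        wget "$url" || return 1
    done
}
```

Running rpm_urls on its own prints the list for review; fetch_rpms performs the actual downloads.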
Install the downloaded packages together with their dependencies by issuing the following command on the SME Server shell. The command enables Dag's repository for this one run.
How to enable Dag's repository: http://wiki.contribs.org/Dag
yum --enablerepo=dag localinstall swish-e-2.4.5-4.i386.rpm swish-e-d* swish-e-p*
There is no need to reboot. Test:
swish-e -h
Setup Part 2
To have swish-e index .doc, .xls and .pdf files, we need some additional packages:
yum install --enablerepo=dag perl-Spreadsheet-ParseExcel perl-MIME-Types xpdf
Test filter:
swish-filter-test
swish-filter-test -man
swish-filter-test -headers /path/to/xlsfile.xls
swish-filter-test -headers /path/to/docfile.doc
swish-filter-test -headers /path/to/pdffile.pdf
Configuration
As I was not interested in indexing web pages, only files in ibays, I used the following spider: /usr/libexec/swish-e/DirTree.pl
I modified its check_path function so that it only indexes .doc, .xls and .pdf files:
sub check_path {
    my $path = shift;
    return 1 if $path =~ /\.doc$/;  # accept paths ending in .doc
    return 1 if $path =~ /\.xls$/;  # accept paths ending in .xls
    return 1 if $path =~ /\.pdf$/;  # accept paths ending in .pdf
    return 0;                       # reject everything else
}
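Before indexing, it can help to preview which files the extension filter above would let through. The sketch below uses find with the same three case-sensitive suffixes. It assumes DirTree.pl walks every subdirectory and applies check_path to regular files only; list_indexable is a hypothetical helper, not part of swish-e.

```shell
# List the files under a directory whose names end in .doc, .xls or .pdf,
# mirroring the suffix tests in check_path (hypothetical helper).
list_indexable() {
    find "$1" -type f \( -name '*.doc' -o -name '*.xls' -o -name '*.pdf' \) | sort
}
```

For example, list_indexable /home/e-smith/files/ibays/ibayname/files prints the candidate files for the ibay used in the configuration.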
Next create a config file: ibay.cfg
IndexDir /usr/libexec/swish-e/DirTree.pl
SwishProgParameters /home/e-smith/files/ibays/ibayname/files
StoreDescription HTML <body> 20000
# replace to make links to UNC
# works in IE, needs fix for Firefox
ReplaceRules remove /home/e-smith/files/ibays
ReplaceRules prepend //smeservername
ReplaceRules replace /files/ /
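To see what the three ReplaceRules do to a file path, the rewrite can be imitated with sed. This is only an emulation under the assumption that swish-e applies the rules in the order listed; preview_unc is a hypothetical helper and report.pdf is a made-up file name.

```shell
# Imitate the remove / prepend / replace rules with sed, in that order.
# swish-e itself does not use sed; this only previews the resulting path.
preview_unc() {
    printf '%s\n' "$1" | sed \
        -e 's|/home/e-smith/files/ibays||' \
        -e 's|^|//smeservername|' \
        -e 's|/files/|/|'
}

preview_unc /home/e-smith/files/ibays/ibayname/files/report.pdf
```

For the sample path this prints //smeservername/ibayname/report.pdf, which is the UNC-style link the rules are meant to produce.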
Next, run swish-e. The index files will be placed in the current directory.
swish-e -c ibay.cfg -S prog -v 9
This should create both index.swish-e and index.swish-e.prop in the current directory.
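A quick way to confirm that indexing produced both files is a small check like the one below. It is a sketch only; index_ready is a hypothetical helper, not part of swish-e.

```shell
# Succeed only if both index files exist and are non-empty in the given
# directory (defaults to the current directory). Hypothetical helper.
index_ready() {
    dir="${1:-.}"
    [ -s "$dir/index.swish-e" ] && [ -s "$dir/index.swish-e.prop" ]
}
```

Once the check passes, a test search can be run with swish-e -f index.swish-e -w someword, where someword stands for any word expected to occur in the indexed files.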
Under construction
swish.cgi
Under construction
Options
Under construction
Usage
Under construction