Difference between revisions of "Swish-e"
Latest revision as of 19:56, 11 May 2009
Description
Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.
Forum link
http://forums.contribs.org/index.php/topic,43486.0.html
Please add comment there so I can merge it here later!
Installation
Download the RPMs from http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-debuginfo-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-devel-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-2.4.5-4.i386.rpm
wget http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386/swish-e-perl-api-2.4.5-4.i386.rpm
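The five downloads differ only in the package name, so they can also be generated in a loop. This is a minimal sketch for a POSIX shell; the helper name rpm_urls is made up for illustration, while the base URL, version and package names are the ones used in the wget commands above.

```shell
#!/bin/sh
# Base URL and package names are taken from the wget commands above.
BASE_URL="http://rpmbuild.joshr.com/swish-e-release/2.4.5-4/centos-4-i386"

# rpm_urls (hypothetical helper): print one download URL per package.
rpm_urls() {
    for pkg in swish-e swish-e-debuginfo swish-e-devel swish-e-perl swish-e-perl-api; do
        echo "$BASE_URL/$pkg-2.4.5-4.i386.rpm"
    done
}

# To download instead of just listing, feed the URLs to wget:
#   rpm_urls | wget -i -
```

Listing the URLs first makes it easy to check them before anything is fetched.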
Install the downloaded packages together with their dependencies by issuing the following command on the SME Server shell. The command enables Dag's repository.
How to enable Dag's repository: http://wiki.contribs.org/Dag
yum --enablerepo=dag localinstall swish-e-2.4.5-4.i386.rpm swish-e-d* swish-e-p*
There is no need to reboot. Test:
swish-e -h
Setup Part 2
To have swish-e index .doc, .xls and .pdf files we need some additional packages:
yum install --enablerepo=dag perl-Spreadsheet-ParseExcel perl-MIME-Types xpdf catdoc
Test filter:
swish-filter-test
swish-filter-test -man
swish-filter-test -headers /path/to/xlsfile.xls
swish-filter-test -headers /path/to/docfile.doc
swish-filter-test -headers /path/to/pdffile.pdf
Configuration
As I was not interested in indexing web pages, just files in ibays, I used the following spider: /usr/libexec/swish-e/DirTree.pl
I modified its check_path function so that it only indexes .doc, .xls and .pdf files:
sub check_path {
    my $path = shift;
    # Use the match operator =~ here; a plain = would assign instead of test.
    return 1 if $path =~ /\.doc$/;  # true if the path ends in .doc
    return 1 if $path =~ /\.xls$/;  # true if the path ends in .xls
    return 1 if $path =~ /\.pdf$/;  # true if the path ends in .pdf
    return 0;                       # otherwise false
}
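To preview which files in an ibay would pass this check without running the indexer, the same extension test can be mirrored in shell. This is only a sketch: would_index is a hypothetical helper, not part of swish-e, and like the Perl patterns it is case-sensitive, so REPORT.PDF is not matched.

```shell
#!/bin/sh
# would_index (hypothetical helper): mirrors check_path's extension test.
# Returns 0 (success) for names ending in .doc, .xls or .pdf, 1 otherwise.
would_index() {
    case "$1" in
        *.doc|*.xls|*.pdf) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: list the files under an ibay that would be indexed.
# find /home/e-smith/files/ibays/ibayname/files -type f | while read -r f; do
#     would_index "$f" && echo "$f"
# done
```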
Next create a config file: ibay.cfg
# ibay.cfg, a swish-e config file
#
IndexDir /usr/libexec/swish-e/DirTree.pl
#
SwishProgParameters /home/e-smith/files/ibays/ibayname/files
#
StoreDescription HTML <body> 20000
#
# replace to make links to UNC
# works in IE, needs fix for Firefox
ReplaceRules remove /home/e-smith/files/ibays
ReplaceRules prepend //smeservername
# Next line will not work if you have dir's called "files"...
ReplaceRules replace /files/ /
#
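The effect of the three ReplaceRules lines on a stored path can be imitated with sed. This is a rough sketch, not swish-e's own implementation: rewrite_path is a hypothetical helper, the sample file names are invented, and it assumes that "replace" acts on every occurrence of /files/, which is what the warning comment in the config suggests.

```shell
#!/bin/sh
# rewrite_path (hypothetical helper): imitates the three ReplaceRules with sed.
# Assumption: "replace" acts on every occurrence, hence the g flag.
rewrite_path() {
    printf '%s\n' "$1" | sed \
        -e 's|/home/e-smith/files/ibays||' \
        -e 's|^|//smeservername|' \
        -e 's|/files/|/|g'
}

rewrite_path /home/e-smith/files/ibays/ibayname/files/docs/report.pdf
# prints //smeservername/ibayname/docs/report.pdf

# A subdirectory that is itself called "files" disappears from the link:
rewrite_path /home/e-smith/files/ibays/ibayname/files/archive/files/a.doc
# prints //smeservername/ibayname/archive/a.doc
```

The second call shows the problem with directories called "files": the resulting link no longer points at the real location.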
Next, run swish-e. The index files will be placed in the current directory.
swish-e -c ibay.cfg -S prog -v 9
This should create both index.swish-e and index.swish-e.prop in the current dir.
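A quick way to confirm that the run produced both files is sketched below. The helper name index_ready is made up for illustration; the two file names are the ones mentioned above.

```shell
#!/bin/sh
# index_ready (hypothetical helper): succeed only if both files that
# swish-e is expected to write exist and are non-empty in directory $1
# (defaults to the current directory).
index_ready() {
    dir=${1:-.}
    [ -s "$dir/index.swish-e" ] && [ -s "$dir/index.swish-e.prop" ]
}

if index_ready .; then
    echo "index files found"
else
    echo "index files missing"
fi
```

Once the files exist, a search can be tried directly from the shell, for example with swish-e -f index.swish-e -w someword, where someword is a placeholder for a term you expect to be in the documents.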
swish.cgi
As a proof of concept I have set up this basic configuration in /home/e-smith/files/ibays/Primary/cgi-bin
Copy (or symlink) swish.cgi. I prefer a copy, as I can modify the script without losing the original.
cp /usr/libexec/swish-e/swish.cgi /home/e-smith/files/ibays/Primary/cgi-bin/
Create /home/e-smith/files/ibays/Primary/cgi-bin/.swishcgi.conf:
return {
    swish_index    => '/home/e-smith/files/ibays/Primary/cgi-bin/index.swish-e',
    title          => 'Just a Sample Title',  # Not required, but recommended
    #
    # Next line to make it clickable
    #
    prepend_path   => 'file:////',
    #
    link_property  => 'swishdocpath',
    title_property => 'swishtitle',
};
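The swish_index setting points at an index inside the cgi-bin directory, while the indexing run writes its files into whatever directory it was started from. If those two differ, the index files have to be copied over (or the indexing run started from cgi-bin). A minimal sketch, in which install_index is a hypothetical helper and the source directory is an assumption:

```shell
#!/bin/sh
# install_index (hypothetical helper): copy both index files from the
# directory where swish-e was run ($1) to the directory named in
# swish_index ($2).
install_index() {
    src=$1
    dest=$2
    cp "$src/index.swish-e" "$src/index.swish-e.prop" "$dest/"
}

# Example call, assuming the index was built in the current directory:
#   install_index . /home/e-smith/files/ibays/Primary/cgi-bin
```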
Options
Under construction
Usage
Search should now be available at http://smeservername/cgi-bin/swish.cgi