Difference between revisions of "WebFilter"

From SME Server
m (Unnilennium moved page WebFiltering to WebFilter)
(28 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{Languages}}
+
{{Languages|WebFiltering}}
  
  
Line 6: Line 6:
 
[http://www.firewall-services.com Firewall Services]<br>
 
[http://www.firewall-services.com Firewall Services]<br>
 
mailto:daniel@firewall-services.com
 
mailto:daniel@firewall-services.com
 
+
=== Version ===
 +
{{ #smeversion: smeserver-webfilter }}
 +
[[Version::contrib9|fws]][[Has SME9::true| ]]
  
 
=== Description ===
 
=== Description ===
 
This contrib brings 3 new features for squid proxy, and provides a simple panel to control most of it:
 
This contrib brings 3 new features for squid proxy, and provides a simple panel to control most of it:
 
*URL Filtering (with [http://squidguard.org/ squidGuard])
 
*URL Filtering (with [http://squidguard.org/ squidGuard])
Several categories of domain names and URLs are downloaded from the University of Toulouse and updated every night (you can get more informations on these lists [http://dsi.ut-capitole.fr/blacklists/ here]), in french). You can then just choose which catagories you want to block. You can enter a list of ip addresses which won't be filtered, and a local blacklist and whitelist.   
+
Several categories of domain names and URLs are downloaded from the University of Toulouse and updated every night (you can get more informations on these lists [http://dsi.ut-capitole.fr/blacklists/ here]), in french). You can then just choose which categories you want to block. You can enter a list of ip addresses which won't be filtered, and a local blacklist and whitelist.   
*On the fly antivirus scanning (using ([http://squidclamav.darold.net/ squidclamav])
+
*On the fly anti-virus scanning (using [http://squidclamav.darold.net/ squidclamav])
When enabled, all web trafic will be scanned before being sent to the client
+
When enabled, all web traffic will be scanned before being sent to the client
 
*log every requests in a MySQL database
 
*log every requests in a MySQL database
Every request passing through squid is logged in a database, making it easier to analyze squid logs. There's no frontend for this, but you can use your favorite mysql client to see which domains are the most visited, which user eats all your bandwidth, etc...
+
Every request passing through squid is logged in a database, making it easier to analyze squid logs. There's no front-end for this, but you can use your favourite mysql client to see which domains are the most visited, which user eats all your bandwidth, etc...
  
This contrib can replace dansguardian if you have simple filtering requirements. It's really easy to configure, but is also less powerfull. Dansguardian is a real content scanner (it analyze the content of the pages while squidguard only look at the URLs for example).
+
This contrib can replace dansguardian if you have simple filtering requirements. It's really easy to configure, but is also less powerful. Dansguardian is a real content scanner (it analyse the content of the pages while squidguard only look at the URLs for example).
  
 
===Requirements===
 
===Requirements===
  
*SME Server 8 (not tested and not supported on SME 7)
+
*SME Server 8 or 9
 
*You need to configure both [[Epel]] and [[Fws]] repositories
 
*You need to configure both [[Epel]] and [[Fws]] repositories
 +
 
=== Screenshots ===
 
=== Screenshots ===
  
Line 28: Line 31:
 
[[File:Webfilter_2.png|webfilter panel]]
 
[[File:Webfilter_2.png|webfilter panel]]
  
=== Installation ===
+
=== Installation 8.x and 9.x===
 
To install the contrib, simply run the following command:
 
To install the contrib, simply run the following command:
  
Line 36: Line 39:
 
  sv t /service/httpd-e-smith
 
  sv t /service/httpd-e-smith
  
You can then access the new panel in the server-manager. The first time you access it, you might have an empty category list. Just click the save button at the bottom of the page, wait a few minutes and try again (the list is empty because categories hasn't been downloaded yet). Now, you should be able to enable URL and AV filtering, and choose which categories you want to block. The next settings modification might take a long time (several minutes, you may also have a imeout error displayed). This is expected and is because squidGuard databases need to be compiled. After this, settings change should be fast.
+
You can then access the new panel in the server-manager. The first time you access it, you might have an empty category list. Just click the save button at the bottom of the page, wait a few minutes and try again (the list is empty because categories hasn't been downloaded yet). Now, you should be able to enable URL and AV filtering, and choose which categories you want to block. The next settings modification might take a long time (several minutes, you may also have a timeout error displayed). This is expected and is because squidGuard databases need to be compiled. After this, settings change should be fast.
 +
 
 +
===AV filtering and smartphones applications stores===
 +
When AV filtering is enabled, the AV engine overrides the client's UserAgent with its own, and this will break access to some websites, like the iOS AppStore and Android GooglePlay. To get arround this problem, just add the following in the whitelist:
 +
 
 +
clients.google.com
 +
android.clients.google.com
 +
*.phobos.apple.com
 +
 
 +
With this, those appstores won't be scanned by the AV engine, and they will work just as before.
  
 
===Customize category lists===
 
===Customize category lists===
Category lists are simple text files in /var/lib/squidGuard/blacklists. Each category is a directory, adn each directory may have a file names domains and another named urls. Each directory in /var/lib/squidGuard/blacklists will be displayed in the panel of the server-manager, except if it's listed in the DisabledCategories prop. You can see which categories are disabled with:
+
Category lists are simple text files in /var/squidGuard/blacklists. Each category is a directory, and each directory may have a file named '''domains''' and another named '''urls'''. Each directory in /var/lib/squidGuard/blacklists will be displayed in the panel of the server-manager, except if it's listed in the DisabledCategories prop. You can see which categories are disabled with:
 
  db configuration getprop squidguard DisabledCategories
 
  db configuration getprop squidguard DisabledCategories
This lets you ignore some useless category, and make the panel for simple.
+
This lets you ignore some useless categories, and hide them from the panel.
 
The default config update all the categories each night. This is done in the cron job /etc/cron.daily/squidGuard, which calls /etc/e-smith/events/actions/squidguard-update-databases. If you don't want to auto update those lists, you can disable this feature:
 
The default config update all the categories each night. This is done in the cron job /etc/cron.daily/squidGuard, which calls /etc/e-smith/events/actions/squidguard-update-databases. If you don't want to auto update those lists, you can disable this feature:
db configuration setprop squidguard AutoUpdate disabled
+
db configuration setprop squidguard AutoUpdate disabled
Then, you'll be able to manage the list the way you want. Remember you need to recompile squidGuard databases if you modify files in a list.
+
You can add your own categories. If they don't already exists, they won't be deleted or modified by the update feature.
 +
 
 +
===Denied page===
 +
With the default configuration, denied requests are redirected to https://hostname.domain.tld/squidGuard/cgi-bin/blocked.cgi with various parameters (like IP address, username, client group, category etc...). Username will be empty (only -), this is because squid authentication is disabled. If you enable squid authentication (with custom templates), you'll be able to log username. The downside is that you'll have to configure all your browsers to use squid as proxy, because authentication is not compatible with transparent proxying.
 +
 
 +
If you want to change the blocked page, you can. First, copy the default page to another name:
 +
 
 +
cp -a /usr/share/squidGuard/cgi-bin/blocked.cgi /usr/share/squidGuard/cgi-bin/custom.cgi
 +
 
 +
Now, you can edit this new file to your need. Then, just select it as the default blocked page:
 +
 
 +
db configuration setprop squidguard RedirectUrl \
 +
http://hostname.systemname.com/squidGuard/cgi-bin/custom.cgi?clientaddr=%a&clientname=%n&clientuser=%i&clientgroup=%s&targetgroup=%t&url=%u
 +
signal-event http-proxy-update
  
 
===MySQL logs===
 
===MySQL logs===
MySQL loging of clients requests is handled by a independant daemon called squid-db-logd. It monitors squid access log and squidGuard deny log in realtime, parse it and put everything in the database called squid_log. In this database, the table access_log list all the access while the deny_log only list denied pages. This feature may need a lot of space. On a busy server, you can easily reach 3GB / month only for the database (and more for the dump when you backup your server). To lmit the needed space, a cron job rotate and compress the access_log and deny_log tables each month. Old tables are also removed. The default config keeps one year of log. You can change this setting with (value is in day and default is 365)
+
MySQL loging of clients requests is handled by a independent daemon called squid-db-logd. It monitors squid access log and squidGuard deny log in real time, parse it and put everything in the database called squid_log. In this database, the table access_log list all the access while the deny_log only list denied pages. This feature may need a lot of disk space. On a busy server, you can easily reach 3GB / month only for the database (and more for the dump when you backup your server). To limit the needed space, a cron job remove the oldest entries. The default config keeps one year of log. You can change this setting with (value is in day and default is 365)
 
  db configuration setprop squid-db-logd Retention 180
 
  db configuration setprop squid-db-logd Retention 180
  
If you want to completly disable this feature, you can stop this daemon:
+
If you want to completely disable this feature, you can stop this daemon:
 
  db configuration setprop squid-db-logd status disabled
 
  db configuration setprop squid-db-logd status disabled
 
  sv d /service/squid-db-logd
 
  sv d /service/squid-db-logd
Line 64: Line 89:
 
*get all the pages requested by the client 192.168.7.50 on Oct 12 2012 between 10pm and 11 pm, and export the result in /tmp/result.csv
 
*get all the pages requested by the client 192.168.7.50 on Oct 12 2012 between 10pm and 11 pm, and export the result in /tmp/result.csv
  
  echo SELECT date_day,date_time,url,username INTO OUTFILE '/tmp/result.csv' FIELDS TERMINATED BY ','
+
  echo "SELECT date_day,date_time,url,username INTO OUTFILE '/tmp/result.csv' FIELDS TERMINATED BY ','
 
  OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n'
 
  OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n'
 
  FROM access_log WHERE client_ip='192.168.7.50' AND date_day='2012-10-08' AND date_time>'22:00:00' AND date_time<'23:00:00';" mysql squid_log
 
  FROM access_log WHERE client_ip='192.168.7.50' AND date_day='2012-10-08' AND date_time>'22:00:00' AND date_time<'23:00:00';" mysql squid_log
 +
 +
===Uninstall===
 +
If you want to uninstall this contrib, just run:
 +
yum remove squidGuard squidclamav
 +
expand-template /etc/squid/squid.conf
 +
squid -k reconfigure
 +
expand-template /etc/httpd/conf/httpd.conf
 +
sv t /service/httpd-e-smith
 +
 +
And if you want to remove every trace of it:
 +
rm -rf /var/log/squid-db-logd
 +
rm -rf /var/log/squidGuard
 +
rm -f /home/e-smith/db/mysql/squid_log.dump
 +
echo "drop database squid_log;" | mysql
 +
rm -rf /var/squidGuard
 +
rm -f /etc/squid/squidGuard.conf
 +
rm -f /etc/squidclamav.conf
 +
 +
===Sources===
 +
You can find the srpm in our repo here: http://repo.firewall-services.com/centos/5/SRPMS/
 +
You can also browse sources and clone the contrib from our git repo here: https://gitweb.firewall-services.com/?p=smeserver-webfilter;a=summary
 +
 +
===Panel and translation===
 +
The panel is translated in English, French, Dutch and Italian.
 +
 +
For now, this contrib is not available for translation in pootle (because it's in our own GIT repo). If you want to help with translation, you can get the file /etc/e-smith/locale/en-us/etc/e-smith/web/functions/webfilter (or directly from [https://gitweb.firewall-services.com/?p=smeserver-webfilter;a=blob_plain;f=root/etc/e-smith/locale/en-us/etc/e-smith/web/functions/webfilter here]) translate it, and send it back to us by mail at tech @ firewall-services . com
 +
{{#bugzilla:columns=id,product,version,status,summary |sort=id|order=desc |component=smeserver-webfilter|noresultsmessage="No open bugs found."}}
 +
[[Category:Contrib]]
 +
[[Category:Contrib:webfiltering]]

Revision as of 05:26, 18 April 2021



Maintainer

Daniel B.
Firewall Services
mailto:daniel@firewall-services.com

Version

Devel 10:
smeserver-webfilter
The latest version of smeserver-webfilter is available in the SME repository, click on the version number(s) for more information.


fws

Description

This contrib brings 3 new features to the squid proxy, and provides a simple panel to control most of them:

  • URL filtering (with squidGuard)

Several categories of domain names and URLs are downloaded from the University of Toulouse and updated every night (more information on these lists is available, in French, at http://dsi.ut-capitole.fr/blacklists/). You can then simply choose which categories you want to block. You can also enter a list of IP addresses which won't be filtered, as well as a local blacklist and whitelist.

  • On-the-fly anti-virus scanning (using squidclamav)

When enabled, all web traffic is scanned before being sent to the client.

  • Logging of every request in a MySQL database

Every request passing through squid is logged in a database, making it easier to analyze squid logs. There's no front-end for this, but you can use your favourite mysql client to see which domains are the most visited, which user eats all your bandwidth, etc.

This contrib can replace dansguardian if you have simple filtering requirements. It's really easy to configure, but it is also less powerful. Dansguardian is a real content scanner: for example, it analyses the content of the pages, while squidGuard only looks at the URLs.

Requirements

  • SME Server 8 or 9
  • You need to configure both Epel and Fws repositories

Screenshots

[Two screenshots of the webfilter panel]

Installation 8.x and 9.x

To install the contrib, simply run the following commands:

yum --enablerepo=epel --enablerepo=fws install smeserver-webfilter
signal-event http-proxy-update
expand-template /etc/httpd/conf/httpd.conf
sv t /service/httpd-e-smith

You can then access the new panel in the server-manager. The first time you access it, you might see an empty category list. Just click the save button at the bottom of the page, wait a few minutes and try again (the list is empty because the categories haven't been downloaded yet). You should now be able to enable URL and AV filtering, and choose which categories you want to block. The next settings change might take a long time (several minutes, and a timeout error may even be displayed). This is expected: the squidGuard databases need to be compiled. After this, settings changes should be fast.
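
While waiting for the first download, a quick check from the shell can tell whether any categories have arrived yet. The sketch below is only an illustration: it assumes the lists live in /var/squidGuard/blacklists and that each category is a sub-directory. The function name count_categories is made up for this example, and another directory can be passed as an argument.

```shell
#!/bin/bash
# Hypothetical helper: count the category directories below a blacklist directory.
# The default path is an assumption; pass another directory as the first argument.
count_categories() {
    local dir="${1:-/var/squidGuard/blacklists}"
    local count=0 entry
    # A missing directory simply counts as "no categories yet".
    [ -d "$dir" ] || { echo 0; return 0; }
    for entry in "$dir"/*/; do
        # When nothing matches, the glob stays literal and fails the -d test.
        [ -d "$entry" ] && count=$((count + 1))
    done
    echo "$count"
}

# Example: report the state of the default directory.
echo "categories found: $(count_categories)"
```

If this still reports 0 after a few minutes, the download has probably not completed.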

AV filtering and smartphone application stores

When AV filtering is enabled, the AV engine overrides the client's User-Agent with its own, and this breaks access to some websites, like the iOS App Store and Android Google Play. To get around this problem, just add the following to the whitelist:

clients.google.com
android.clients.google.com
*.phobos.apple.com

With this, those appstores won't be scanned by the AV engine, and they will work just as before.

Customize category lists

Category lists are simple text files in /var/squidGuard/blacklists. Each category is a directory, and each directory may have a file named domains and another named urls. Each directory in /var/squidGuard/blacklists will be displayed in the panel of the server-manager, unless it's listed in the DisabledCategories property. You can see which categories are disabled with:

db configuration getprop squidguard DisabledCategories

This lets you hide categories you don't need from the panel.

The default configuration updates all the categories each night. This is done by the cron job /etc/cron.daily/squidGuard, which calls /etc/e-smith/events/actions/squidguard-update-databases. If you don't want those lists to be updated automatically, you can disable this feature:

db configuration setprop squidguard AutoUpdate disabled

You can add your own categories. As long as their names don't match a category that already exists in the downloaded lists, they won't be deleted or modified by the update feature.
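
The directory layout described above can be sketched as follows. This is an illustration under stated assumptions, not the contrib's own code: it assumes one directory per category with optional domains and urls files, and it assumes that DisabledCategories holds a comma-separated list of directory names. The function names add_category and visible_categories are invented for this example, and the demo runs in a scratch directory rather than on the real lists.

```shell
#!/bin/bash
# Sketch only: create a custom category and list what the panel would show.
# Assumptions: one directory per category, optional "domains" and "urls" files,
# and DisabledCategories being a comma-separated list of directory names.

# add_category BASE_DIR NAME: create the category directory with empty list files.
add_category() {
    local base="$1" name="$2"
    mkdir -p "$base/$name" || return 1
    touch "$base/$name/domains" "$base/$name/urls"
}

# visible_categories BASE_DIR DISABLED_CSV: print every category not disabled.
visible_categories() {
    local base="$1" disabled=",$2," entry name
    for entry in "$base"/*/; do
        [ -d "$entry" ] || continue
        name=$(basename "$entry")
        case "$disabled" in
            *",$name,"*) ;;          # listed as disabled: hidden
            *) echo "$name" ;;
        esac
    done
}

# Demo in a scratch directory, so the real lists are never touched.
demo=$(mktemp -d)
add_category "$demo" mycompany
add_category "$demo" ads
echo "one.example.com" >> "$demo/mycompany/domains"
visible_categories "$demo" "ads"
rm -rf "$demo"
```

On a real server, the base directory would be the blacklist directory, and the second argument could come from the getprop command shown earlier in this section.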

Denied page

With the default configuration, denied requests are redirected to https://hostname.domain.tld/squidGuard/cgi-bin/blocked.cgi with various parameters (IP address, username, client group, category, etc.). The username will be empty (only -) because squid authentication is disabled. If you enable squid authentication (with custom templates), you'll be able to log the username. The downside is that you'll have to configure all your browsers to use squid as a proxy, because authentication is not compatible with transparent proxying.

To change the blocked page, first copy the default page under another name:

cp -a /usr/share/squidGuard/cgi-bin/blocked.cgi /usr/share/squidGuard/cgi-bin/custom.cgi

You can now edit this new file to suit your needs. Then, just select it as the default blocked page:

db configuration setprop squidguard RedirectUrl \
'http://hostname.systemname.com/squidGuard/cgi-bin/custom.cgi?clientaddr=%a&clientname=%n&clientuser=%i&clientgroup=%s&targetgroup=%t&url=%u'
signal-event http-proxy-update
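
The RedirectUrl value is easy to get wrong by hand, so it can help to assemble it in a small function and review it before applying it. The following is a minimal sketch: the %a, %n, %i, %s, %t and %u placeholders are copied from the command above, while build_redirect_url and its two arguments are names invented for this example. It only prints the command and does not run it.

```shell
#!/bin/bash
# Sketch: assemble the RedirectUrl value for a custom blocked page.
# build_redirect_url and its arguments are hypothetical names for illustration.
build_redirect_url() {
    local host="$1" script="$2"
    local params='clientaddr=%a&clientname=%n&clientuser=%i&clientgroup=%s&targetgroup=%t&url=%u'
    # Passing the pieces as printf arguments keeps their percent signs literal.
    printf 'http://%s/squidGuard/cgi-bin/%s?%s\n' "$host" "$script" "$params"
}

url=$(build_redirect_url hostname.systemname.com custom.cgi)
# Print the command instead of running it, so the value can be reviewed first.
# The single quotes stop the shell from treating "&" as a background operator.
echo "db configuration setprop squidguard RedirectUrl '$url'"
```

Replace hostname.systemname.com with the name of your own server before using the printed command.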

MySQL logs

MySQL logging of client requests is handled by an independent daemon called squid-db-logd. It monitors the squid access log and the squidGuard deny log in real time, parses them and puts everything in a database called squid_log. In this database, the table access_log lists all requests, while deny_log lists only denied pages. This feature may need a lot of disk space: on a busy server, you can easily reach 3GB / month for the database alone (and more for the dump when you back up your server). To limit the space needed, a cron job removes the oldest entries. The default configuration keeps one year of logs. You can change this setting with the following command (the value is in days, and the default is 365):

db configuration setprop squid-db-logd Retention 180
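
To pick a Retention value, a rough size estimate may help. The sketch below simply scales the "3GB / month on a busy server" figure quoted above. The 30-day month and the function name estimate_gb are simplifying assumptions for this example, and a quieter server will use far less.

```shell
#!/bin/bash
# Rough sizing sketch based on the "3GB / month on a busy server" figure.
# GB_PER_MONTH and the 30-day month are simplifying assumptions, not measurements.
GB_PER_MONTH=3

# estimate_gb DAYS: approximate database size in GB for a retention period.
estimate_gb() {
    local days="$1"
    echo $(( days * GB_PER_MONTH / 30 ))
}

for days in 90 180 365; do
    echo "Retention $days days: about $(estimate_gb "$days") GB"
done
```

With these assumptions, the Retention value of 180 used above corresponds to roughly 18 GB on a busy server.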

If you want to completely disable this feature, you can stop this daemon:

db configuration setprop squid-db-logd status disabled
sv d /service/squid-db-logd

Here are some examples of queries you can run:

  • Get the top 30 most visited domains
echo "SELECT DOMAIN,COUNT(DOMAIN) AS occurances FROM access_log GROUP BY DOMAIN ORDER BY occurances DESC LIMIT 30;" | mysql squid_log
  • Get the top 10 most used blocked categories
echo "SELECT category,COUNT(category) AS occurances FROM deny_log GROUP BY category ORDER BY occurances DESC LIMIT 10;" | mysql squid_log
  • Get all the pages requested by the client 192.168.7.50 on Oct 8 2012 between 10pm and 11pm, and export the result to /tmp/result.csv
mysql squid_log <<'EOF'
SELECT date_day,date_time,url,username INTO OUTFILE '/tmp/result.csv' FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n'
FROM access_log WHERE client_ip='192.168.7.50' AND date_day='2012-10-08' AND date_time>'22:00:00' AND date_time<'23:00:00';
EOF
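
The same columns can be combined in other ways. As a hedged sketch, the function below builds a "top domains for one client" query from the domain and client_ip columns used in the examples above. The function name top_domains_query and the hits alias are invented for this example, and the query is only printed, not executed.

```shell
#!/bin/bash
# Sketch: build a "top domains for one client" query from columns used above
# (domain and client_ip in access_log). The function name is hypothetical.
top_domains_query() {
    local ip="$1" limit="${2:-10}"
    # Accept only dotted-quad addresses and numeric limits, so that nothing
    # unexpected ends up inside the SQL text.
    [[ "$ip" =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || { echo "invalid IP: $ip" >&2; return 1; }
    [[ "$limit" =~ ^[0-9]+$ ]] || { echo "invalid limit: $limit" >&2; return 1; }
    echo "SELECT domain,COUNT(domain) AS hits FROM access_log WHERE client_ip='$ip' GROUP BY domain ORDER BY hits DESC LIMIT $limit;"
}

# Print the query; on the server it could be piped to "mysql squid_log".
top_domains_query 192.168.7.50 5
```

Checking the arguments before building the SQL text keeps a mistyped address from producing a malformed query.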

Uninstall

If you want to uninstall this contrib, just run:

yum remove squidGuard squidclamav
expand-template /etc/squid/squid.conf
squid -k reconfigure
expand-template /etc/httpd/conf/httpd.conf
sv t /service/httpd-e-smith

And if you want to remove every trace of it:

rm -rf /var/log/squid-db-logd
rm -rf /var/log/squidGuard
rm -f /home/e-smith/db/mysql/squid_log.dump
echo "drop database squid_log;" | mysql
rm -rf /var/squidGuard
rm -f /etc/squid/squidGuard.conf
rm -f /etc/squidclamav.conf

Sources

You can find the srpm in our repo here: http://repo.firewall-services.com/centos/5/SRPMS/

You can also browse sources and clone the contrib from our git repo here: https://gitweb.firewall-services.com/?p=smeserver-webfilter;a=summary

Panel and translation

The panel is translated into English, French, Dutch and Italian.

For now, this contrib is not available for translation in pootle (because it's in our own Git repo). If you want to help with translation, you can get the file /etc/e-smith/locale/en-us/etc/e-smith/web/functions/webfilter (or download it directly from https://gitweb.firewall-services.com/?p=smeserver-webfilter;a=blob_plain;f=root/etc/e-smith/locale/en-us/etc/e-smith/web/functions/webfilter), translate it, and send it back to us by mail at tech @ firewall-services . com

ID     Product       Version  Status       Summary (4 tasks)
12307  SME Contribs  10.0     UNCONFIRMED  Problem activating url filtering and category filtering smeserver-webfilter
12065  SME Contribs  10.0     IN_PROGRESS  update to httpd 2.4 syntax and add systemd changes for SME10 [smeserver-webfilter]
11978  SME Contribs  10.0     IN_PROGRESS  import to SME10 (smeserver-webfilter)
10199  SME Contribs  9.1      UNCONFIRMED  problemi con WebFiltering