Difference between revisions of "Dansguardian"

From SME Server
 
[[Category:Howto]]
 
== Dansguardian ==
 
  
 
'''Dansguardian web content filtering HOWTO install & configure on sme 7.x'''
 
'''Contributors'''
  
 
Thanks to Stephen Noble at dungog.net for providing rpms & information generally. This HOWTO requires command line control to edit configuration files & restart the dansguardian service after configuration changes.
 
 
Dungog.net sells a commercial implementation of Dansguardian for sme server which adds a server manager panel to allow GUI control of all Dansguardian functionality & settings.
 
'''Information'''
  
 
To have a proper understanding of how Dansguardian works and the importance of certain configuration settings you should read the detailed installation notes and Manual at the Dansguardian web site http://dansguardian.org
 
 
The information on the Dansguardian website is of a generic nature and some of it is NOT applicable to sme server installations, refer to the instructions in this HOWTO in preference.
 
'''Installation instructions'''
  
 
Warning - Do not upgrade dansguardian v2.9 over previous v2.8 (or earlier) installations as there are substantial changes. (The recommendation from Dansguardian is to edit the new configuration files/lists rather than try to edit your old ones)
 
'''Modifying Dansguardian configuration'''
  
 
You need to manually modify configuration files /etc/dansguardian/dansguardian.conf and /etc/dansguardian/dansguardianf1.conf and /etc/dansguardian/dansguardianf2.conf and /etc/dansguardian/dansguardianfn.conf
 
 
Ctrl o and Ctrl x
 
'''Modifying other Dansguardian configuration files'''
  
 
You will need to change other config files to suit your site requirements:
 
 
Some of the default settings in these files will prevent access to certain web sites and file types, which may conflict with your site requirements. See details in the "Further customisation" section at the end of this Howto or at http://dansguardian.org  
 
'''Modifying the default html error message page'''
  
 
You may also want to tailor the html template for the error message displayed when Dansguardian blocks a site, see
 
 
pico -w /etc/dansguardian/languages/ukenglish/template.html
 
'''Starting Dansguardian'''
  
 
After install & initial configuration you must manually start Dansguardian to enable web content filtering
 
 
/etc/init.d/dansguardian start
 
'''Stopping Dansguardian'''
  
 
If you need to stop Dansguardian (ie to disable filtering or test your system without Dansguardian running)
 
 
/etc/init.d/dansguardian stop
 
'''Restarting Dansguardian'''
  
 
You will need to restart Dansguardian after making any configuration changes (so they can take effect)
 
 
/etc/init.d/dansguardian restart
 
'''Status check of Dansguardian'''
  
 
If you need to check that Dansguardian is running
 
 
/etc/init.d/dansguardian status
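
Since every configuration change needs a restart, the restart and status commands above can be combined into one small helper. This is only a sketch: the function name restart_and_check and its optional argument are illustrative and are not provided by the rpms.

```shell
#!/bin/sh
# Sketch: restart Dansguardian after a configuration change, then confirm it is running.
# restart_and_check is a hypothetical helper name, not part of the dansguardian rpms.
restart_and_check() {
    # Init script used throughout this HOWTO; an alternative can be passed for testing.
    init_script="${1:-/etc/init.d/dansguardian}"

    # Only ask for the status if the restart itself reported success.
    "$init_script" restart && "$init_script" status
}
```

After editing a configuration file you would then run restart_and_check with no arguments.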
 
'''Configuring your system to force Dansguardian usage & prevent bypassing'''

Dansguardian uses port 8080 for web proxy requests. If your browser does not use port 8080 then Dansguardian filtering will be bypassed. To force this usage & prevent users bypassing filtering you should do the following steps:

'''1) Configure your sme server to use Transparent Proxy port 8080 and to block direct access to the squid proxy port 3128 & redirect port 80 to port 8080'''

Note the functionality to create custom firewall rules using iptables is built in to the rpms provided by Stephen Noble

config setprop squid TransparentPort 8080

config setprop dansguardian portblocking yes

signal-event post-upgrade

reboot

To return Transparent Proxy port to default value and to disable portblocking

config setprop squid TransparentPort 3128

config delprop dansguardian portblocking

signal-event post-upgrade

reboot
  
'''2) Configure your workstation web browser to auto detect proxy port'''
  
 
Go to your workstation and open your browser eg Internet Explorer or Firefox or your preferred browser
 
 
Or alternatively use the server IP 192.168.1.1 (or whatever yours is) and use a port of 8080
 
'''Configuring Dansguardian to use Auth login'''
  
 
This functionality is built in to the rpms provided by Stephen Noble & requires enabling with a db command
 
config setprop squid RequireAuth pam

or

config setprop squid RequireAuth nsca

or

config setprop squid RequireAuth ident

expand-template /etc/squid/squid.conf

svc -t /service/squid

http://dansguardian.org/downloads/michaelpike/DGID.zip
  
'''Groups and Auth login'''
  
 
See http://dansguardian.org for Group configuration functionality in relation to Auth login (i.e. filtering users' access rights based on group membership)
 
'''Testing access'''
  
 
From a workstation web browser go to the site of www.sex.com or www.sex.com.au
 
'''General information re Blacklists'''
  
 
You can install blacklists from mesd.k12.or.us or alternatively use the commercial blacklist from URLBlacklist.com
 
'''Further customisation - configuration options'''
 
 
DansGuardian is highly configurable. The source code is available so you have the ultimate in configurability, although most people will be content with modifying the configuration files.  
 
 
There are two main configuration files, several banned lists and exception lists. These are all explained below:  
 
'''exceptionsitelist'''
 
This contains a list of domain endings that if found in the requested URL, DansGuardian will not filter the page. Note that you should not put the http:// or the www. at the beginning of the entries.  
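
To illustrate what matching on a "domain ending" means, here is a minimal shell sketch. It is not DansGuardian's own code: the function name host_matches_ending is hypothetical, and matching only on whole dot-separated labels is an assumption made for the example.

```shell
#!/bin/sh
# Sketch: does a host name end with one of the listed domain endings?
# host_matches_ending is a hypothetical helper; DansGuardian's real matching may differ.
# Usage: host_matches_ending HOST ENDING [ENDING...]
host_matches_ending() {
    host="$1"
    shift
    for ending in "$@"; do
        case "$host" in
            # Match the ending itself, or any host with the ending after a dot.
            "$ending" | *".$ending") return 0 ;;
        esac
    done
    return 1
}
```

With an entry of example.org, both example.org and www.example.org match, which is why entries are written without http:// or www.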
 
'''exceptioniplist'''
 
This contains a list of client IP addresses that should bypass the filtering, for example the IP address of the network administrator's computer.
  
'''exceptionuserlist'''
 
Usernames who will not be filtered (basic authentication or ident must be enabled).  
 
'''exceptionphraselist'''
 
If any of the phrases listed here appear in a web page then the filtering is bypassed. Care should be taken adding phrases to this file as they can easily stop many pages from being blocked. It would be better to put a negative value in the weightedphraselist.  
 
'''exceptionurllist'''
 
This contains URLs for parts of sites for which filtering should be switched off.
  
'''bannediplist'''
 
IP addresses of client machines to disallow web access to. Only put IP addresses here, not host names.  
 
'''bannedphraselist'''
 
This contains a list of banned phrases. The phrases must be enclosed between < and >. DansGuardian is supplied with an example list. You should not use phrases such as <sex> as this will block sites such as Middlesex University. The phrases can contain spaces. Use them to your advantage. This is the most useful part of DansGuardian and will catch more pages than PICS and URL filtering put together.

Combinations of phrases can also be used; if all the phrases in a combination are found in a page, the page is blocked. Exception phrases are no longer listed in this file - see exceptionphraselist.
  
'''banneduserlist'''
 
Usernames that, if basic proxy authentication is enabled, will automatically be denied web access.
  
'''bannedmimetypelist'''
 
This contains a list of banned MIME-types. If a URL request returns a MIME-type that is in this list, DansGuardian will block it. DansGuardian comes with some example MIME-types to deny. This is a good way of blocking inappropriate movies for example. It is obviously unwise to ban the MIME-types text/html or image/*.  
 
'''bannedextensionlist'''
 
This contains a list of banned file extensions. If a URL ends in an extension that is in this list, DansGuardian will block it. DansGuardian comes with some example file extensions to deny. This is a good way of blocking kiddies from downloading those lovely screen savers and hacking tools. You are a fool if you ban the file extension .html, or .jpg etc.  
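
The "URL ends in an extension" check can be pictured with the following shell sketch. It is an illustration only: the function name url_has_banned_extension is hypothetical, and lower-casing the URL before comparing is an assumption, not documented DansGuardian behaviour.

```shell
#!/bin/sh
# Sketch: does a URL end with one of the banned file extensions?
# url_has_banned_extension is a hypothetical helper, not DansGuardian code.
# Usage: url_has_banned_extension URL EXT [EXT...]   (extensions given in lower case)
url_has_banned_extension() {
    # Assumption for this sketch: compare case-insensitively by lower-casing the URL.
    lower=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    shift
    for ext in "$@"; do
        case "$lower" in
            *"$ext") return 0 ;;
        esac
    done
    return 1
}
```

For example, with .exe and .scr in the list a request for http://example.com/setup.EXE would match, while http://example.com/index.html would not.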
 
'''bannedregexpurllist'''
 
This contains a list of banned regular expression URLs. For more information on regular expressions, see http://www.opengroup.org/onlinepubs/7908799/xbd/re.html
 
  
 
Regular expressions are a very powerful pattern matching system. This file allows you to match URLs using this method.  
 
'''bannedsitelist'''
 
This file contains a list of banned sites. Entering a domain name here bans the entire site. For banning specific parts of a site, see bannedurllist. Also, you can apply a blanket ban to all sites except those specifically excluded in exceptionsitelist. You can also block sites specified only as an IP address, and include a stock squidGuard blacklists collection. To enable these blacklists, download them from the extras section http://dansguardian.org/?page=extras

Simply put them somewhere appropriate, un-comment the squidGuard blacklists collection lines at the bottom of the bannedsitelist file, and check the paths are correct. For URL blacklists, edit the bannedurllist in a similar way.
  
'''bannedurllist'''
 
This allows you to block specific parts of a site rather than the whole site. To block an entire site, see bannedsitelist. To enable squidGuard blacklists for URLs, you will need to download the blacklists and edit the squidGuard blacklists collection section at the bottom (as for bannedsitelist above).  
 
'''weightedphraselist'''
 
Each phrase is given a value either positive or negative and the values are added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. Once the naughtyness limit is reached (within dansguardian.conf), the page is blocked. See the Naughtyness Limit description within the dansguardian.conf section below.  
 
'''pics'''
 
This file allows you to finely tune the PICS filtering. Each PICS section comes with a description of the allowed settings and what they represent. The default settings with DansGuardian are set for youngish children, for example mild profanities and artistic nudity are allowed. PICS filtering can also be totally disabled / enabled using the enablePICS = on | off option.  
 
  
 
For more detailed information on PICS ratings, see http://www.w3.org/PICS/
 
 
 
   
 
   
'''contentregexplist'''
  
'''ICRA'''
 
The ICRA section is fairly self-explanatory. A value of 0 means nothing of that category is allowed, whereas a value of 1 allows it. For example,
 
 
allows nude art. For more in-depth information see http://www.rsac.org/
 
'''RSAC'''
 
 
 
 
RSAC is an older version of ICRA. The values here range from 0 meaning none allowed, through 2 (the default value), to 4, which allows wanton and gratuitous amounts of the given category. For more in-depth information see http://www.rsac.org/
 
'''evaluWEB'''
 
 
 
 
evaluWEB rating uses a system similar to the British Film classification system:
 
 
2 = 18 (Only suitable for viewers aged 18 and over)
 
'''SafeSurf'''
 
Similar to RSAC, but containing a larger range of categories with the range from 0 = full filtering to 9 = wanton and gratuitous. For more in-depth information, see http://www.safesurf.com
 
'''Weburbia'''
 
 
 
 
See evaluWEB. For more in-depth information, see http://www.weburbia.com/safe/index.shtml
 
'''Vancouver Webpages'''
 
 
 
 
This is yet another ratings scheme. See http://vancouver-webpages.com/VWP1.0/
 
'''dansguardian.conf & dansguardianf1.conf'''
 
The only setting that is vital for you to configure in the dansguardian.conf file is the accessdeniedaddress setting. You should set this to the address (not the file path) of your Apache server with the perl access denied reporting script. For most people this will be the same server as squid and DansGuardian. If you really want you can change this address to a normal html static page on any server.
 
'''Reporting Level'''
 
You can change the reporting level for when a page gets denied. It can say just 'Access Denied', or report why, or report why and what the denied phrase is. The last of these may be more useful for testing, but the middle option would be more useful in a school environment. Stealth mode logs what would be denied but doesn't do any blocking.
  
'''Logging Settings'''
 
This setting lets you configure the logging level. You can log nothing, just denied pages, text-based requests, or all requests. HTTPS requests only get logged when the logging is set to 3 - all requests.
  
'''Log Exception Hits'''
 
Log if an exception (user, ip, URL, or phrase) is matched and so the page gets let through. This can be useful for diagnosing why a site gets through the filter.  
 
'''Log File Format'''
 
This setting alters the format of the DansGuardian log file. Please note option 3 (standard log format) is not yet implemented.
  
'''Network Settings'''
 
These allow you to modify the IP address that DansGuardian is listening on, the port DansGuardian listens on, the IP address of the server running squid as well as the squid port. It is possible to configure the Access Denied reporting page here also.  
 
'''Content Filtering Settings'''
 
Here you can modify the location of the list files. Adjusting these locations is not recommended.  
 
'''Naughtyness limit'''
 
This setting refers to the weighted phrase limit over which the page will be blocked. Each weighted phrase is given a value either positive or negative and the values added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. See the weightedphraselist file for examples. As a rough guide, a value of 50 is for young children, 100 for older children, 160 for young adults.  
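
The arithmetic behind the naughtyness limit can be sketched in a few lines of shell. This is a simplified illustration, not DansGuardian's implementation: the helper names page_score and is_blocked, the phrase=weight argument format and the example weights are all made up for the sketch, each phrase is assumed to count once per page, and a page is assumed to be blocked only when the total goes over the limit.

```shell
#!/bin/sh
# Sketch: add up weighted phrases found in a page and compare against a limit.
# page_score and is_blocked are hypothetical helpers; the "phrase=weight" format
# is invented for this example and is not the weightedphraselist file format.

# Usage: page_score "page text" "phrase=weight" ["phrase=weight"...]
page_score() {
    text=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    shift
    total=0
    for entry in "$@"; do
        phrase=${entry%=*}     # text before the last "="
        weight=${entry##*=}    # integer after the last "=" (may be negative)
        case "$text" in
            # Assumption: each phrase is counted once, however often it appears.
            *"$phrase"*) total=$(($total + $weight)) ;;
        esac
    done
    printf '%s\n' "$total"
}

# Usage: is_blocked SCORE LIMIT  (succeeds when the score is over the limit)
is_blocked() {
    [ "$1" -gt "$2" ]
}
```

With illustrative weights of sex=60 and education=-30, a page containing both phrases scores 30, which stays under a limit of 50, while a page containing only the first phrase scores 60 and would be blocked.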
 
'''Show weighted phrases found'''
 
If enabled then the phrases found that made up the total which exceeds the naughtyness limit will be logged and, if the reporting level is high enough, reported.  
 
'''Reverse Lookups for Banned Sites and URLs'''
 
If set to on, DansGuardian will look up the forward DNS for an IP URL address and search for both in the banned site and URL lists. This would prevent a user from simply entering the IP for a banned address. It will reduce searching speed somewhat so unless you have a local caching DNS server, leave it off and use the Blanket IP Block option in the bannedsitelist file instead.  
 
'''Build bannedsitelist and bannedurllist Cache Files'''
 
This will compare the date stamp of the list file with the date stamp of the cache file and will recreate as needed. If a bsl or bul .processed file exists, then that will be used instead. It will increase process start speed by 300%. On slow computers this will be significant. Fast computers do not need this option.  
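
The date stamp comparison described above can be pictured with this shell sketch. It only illustrates the idea and is not how DansGuardian implements it: the function name cache_needs_rebuild is hypothetical, the file paths in the usage note are examples, and the -nt test is a common shell extension rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: decide whether a list cache file should be recreated.
# cache_needs_rebuild is a hypothetical helper, not part of DansGuardian.
# Usage: cache_needs_rebuild LIST_FILE CACHE_FILE
cache_needs_rebuild() {
    list_file="$1"
    cache_file="$2"
    # Rebuild when there is no cache yet, or the list was modified after the cache.
    [ ! -e "$cache_file" ] || [ "$list_file" -nt "$cache_file" ]
}
```

For example, cache_needs_rebuild /etc/dansguardian/bannedsitelist /etc/dansguardian/bannedsitelist.processed would succeed after the list has been edited, assuming the cache sits next to the list under that name.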
 
'''POST protection (web upload and forms)'''
 
This is for blocking or limiting uploads, not for blocking forms without any file upload. The value is given in kilobytes after MIME encoding and header information.  
 
'''Username identification methods (used in logging)'''
 
The proxyauth option is for when basic proxy authentication is used (obviously no good for transparent proxying). The ntlm option is for when the proxy supports the MS NTLM authentication. This only works with IE5.5 sp1 and later, and has not been implemented yet. The ident option causes DansGuardian to try to connect to an identd server on the computer originating the request.  
 
'''Forwarded For'''
 
This option adds an X-Forwarded-For: <clientIP> to the HTTP request header. This may help solve some problem sites that need to know the source IP.  
 
'''Max Children'''
 
This sets the maximum number of processes to spawn to handle the incoming connections. This will prevent DoS attacks killing the server with too many spawned processes. On large sites you might want to double or triple this number.  
 
'''Log Connection Handling Errors'''
 
This option logs some debug info regarding fork()ing and accept()ing which can usually be ignored. These are logged by syslog. It is safe to leave this setting on or off.
 

Revision as of 02:41, 10 July 2007

Dansguardian

Dansguardian web content filtering HOWTO install & configure on sme 7.x

Author: Ray Mitchell - mitchellcpa_AT_yahoo.com.au

Howto Release Date & Version: 10 July 2007 - v7.2

sme server version supported: 7.1.3


Contributors

Thanks to Stephen Noble at dungog.net for providing rpms & information generally. This HOWTO requires command line control to edit configuration files & restart the dansguardian service after configuration changes.

Dungog.net sells a commercial implementation of Dansguardian for sme server which adds a server manager panel to allow GUI control of all Dansguardian functionality & settings.


Information

To have a proper understanding of how Dansguardian works and the importance of certain configuration settings you should read the detailed installation notes and Manual at the Dansguardian web site http://dansguardian.org

Installation notes for the older version 2.4 are here: http://dansguardian.org/downloads/detailedinstallation2.4.html#further

The FAQ is here: http://sourceforge.net/docman/display_doc.php?docid=27215&group_id=131757

The information on the Dansguardian website is of a generic nature and some of it is NOT applicable to sme server installations. Where the two differ, follow the instructions in this HOWTO.


Installation instructions

Warning - Do not upgrade dansguardian v2.9 over previous v2.8 (or earlier) installations as there are substantial changes. (The recommendation from Dansguardian is to edit the new configuration files/lists rather than try to edit your old ones)

Please check the dungog.net web site for later versions http://sme.dungog.net/packages/smeserver/7.0/i386/html/index_dungog.html


Download the required rpms into an empty folder on your sme server using the Linux wget command

wget http://mirror.contribs.org/smeserver/contribs/rmitchell/smeserver/contribs/dansguardian/rpms/2.9.8-2/dansguardian-2.9.8-2.noarch.rpm

wget http://mirror.contribs.org/smeserver/contribs/rmitchell/smeserver/contribs/dansguardian/rpms/2.9.8-2/smeserver-dansguardian-2.9-3.el4.sme.noarch.rpm

wget http://mirror.contribs.org/smeserver/contribs/rmitchell/smeserver/contribs/dansguardian/rpms/2.8.0.6/dungog-blacklists-1.0-20061002.noarch.rpm

Install the rpms

rpm -Uvh *.rpm


Alternatively you can add the dungog repository & use yum --enablerepo to download & install

Add the dungog repository from dungog.net (with status disabled as recommended by sme developers) with the following command:

db yum_repositories set dungog repository BaseURL http://sme.dungog.net/packages/smeserver/7.0/i386/dungog/ EnableGroups yes GPGCheck no Name 'SME Server 7 - dungog' Visible yes status disabled

(the above command should all be on one line)

expand-template /etc/yum.conf

Then download & install the packages

yum --enablerepo=dungog install dansguardian smeserver-dansguardian dungog-blacklists

To view available updates

yum --enablerepo=dungog list updates


Modifying Dansguardian configuration

You need to manually modify the configuration files /etc/dansguardian/dansguardian.conf, /etc/dansguardian/dansguardianf1.conf, /etc/dansguardian/dansguardianf2.conf and /etc/dansguardian/dansguardianfn.conf

pico -w /etc/dansguardian/dansguardian.conf

You will initially need to change:

accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl'

for example to

accessdeniedaddress = 'http://www.mydomain.com/cgi-bin/dansguardian.pl'

Make any other required changes to suit your situation by carefully reviewing the other setting possibilities

Ctrl o (to save)

Ctrl x (to exit)
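The same edit can be scripted instead of made in pico. The following is a minimal sketch only, assuming the accessdeniedaddress line still has the stock form shown above. The helper name set_denied_address and the host www.mydomain.com are illustrative and not part of the rpms. The demonstration runs against a scratch file; back up /etc/dansguardian/dansguardian.conf before pointing the helper at it.

```shell
# set_denied_address FILE HOSTNAME
# Rewrites the host part of the accessdeniedaddress line in FILE.
# Hypothetical helper, not shipped with the dansguardian rpms.
set_denied_address() {
    file=$1
    host=$2
    tmp=$(mktemp) || return 1
    # Only the accessdeniedaddress line is touched; all other lines pass through.
    sed "/^accessdeniedaddress/s|http://[^/]*/|http://$host/|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file rather than the live configuration.
demo=$(mktemp)
echo "accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl'" > "$demo"
set_denied_address "$demo" www.mydomain.com
cat "$demo"
```

Writing back with cat rather than mv keeps the ownership and permissions of the original file.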

pico -w /etc/dansguardian/dansguardianf1.conf

You may initially need to change (to suit adult level of protection)

naughtynesslimit = 50

to

naughtynesslimit = 160 (or even 250 or 300 depending on your sensitivity/tolerance requirements)

Make any other required changes to suit your situation by carefully reviewing the other setting possibilities

Ctrl o and Ctrl x

pico -w /etc/dansguardian/dansguardianf2.conf

Make any required changes to suit your situation by carefully reviewing all the setting possibilities

Ctrl o and Ctrl x

pico -w /etc/dansguardian/dansguardianfn.conf

Make any required changes to suit your situation by carefully reviewing all the setting possibilities

Ctrl o and Ctrl x


Modifying other Dansguardian configuration files

You will need to change other config files to suit your site requirements:

You can read information in the beginning of each config file that explains usage & syntax

These are located in /etc/dansguardian/.....

eg

pico -w /etc/dansguardian/bannedextensionlist

make the required changes

Ctrl o and Ctrl x

Most users will need to change these 4 files as a minimum

exceptionsitelist

bannedsitelist

bannedurllist

bannedextensionlist

You should review ALL the dansguardian config files in /etc/dansguardian as part of your initial Dansguardian setup.

Some of the default settings in these files will prevent access to certain web sites and file types, which may conflict with your site requirements. See details in the "Further customisation" section at the end of this Howto or at http://dansguardian.org


Modifying the default html error message page

You may also want to tailor the html template for the error message displayed when Dansguardian blocks a site, see

/etc/dansguardian/languages/(languagename)/template.html

eg

pico -w /etc/dansguardian/languages/ukenglish/template.html


Starting Dansguardian

After install & initial configuration you must manually start Dansguardian to enable web content filtering

(Note that suitable links to start Dansguardian at startup/reboot are setup when the rpm is installed)

/etc/init.d/dansguardian start

Stopping Dansguardian

If you need to stop Dansguardian (ie to disable filtering or test your system without Dansguardian running)

/etc/init.d/dansguardian stop

Restarting Dansguardian

You will need to restart Dansguardian after making any configuration changes (so they can take effect)

/etc/init.d/dansguardian restart

Status check of Dansguardian

If you need to check that Dansguardian is running

/etc/init.d/dansguardian status
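Because every configuration change needs a restart, it can be convenient to wrap the restart and status commands above in one helper. This is a sketch under assumptions: dg_reload is a hypothetical name, and the init script path defaults to the one used in this Howto but can be overridden.

```shell
# dg_reload [INIT_SCRIPT]
# Restarts DansGuardian, then asks the init script for its status.
# INIT_SCRIPT defaults to the path used in this Howto.
dg_reload() {
    init="${1:-/etc/init.d/dansguardian}"
    [ -x "$init" ] || { echo "init script not found: $init" >&2; return 1; }
    # Status is only queried when the restart itself succeeded.
    "$init" restart && "$init" status
}

# Typical use after editing a list or .conf file:
#   dg_reload
```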


Configuring your system to force Dansguardian usage & prevent bypassing

Dansguardian uses port 8080 for web proxy requests. If your browser does not use port 8080 then Dansguardian filtering will be bypassed. To force this usage & prevent users bypassing filtering you should do the following steps:

1) Configure your sme server to use Transparent Proxy port 8080 and to block direct access to the squid proxy port 3128 & redirect port 80 to port 8080

Note the functionality to create custom firewall rules using iptables is built in to the rpms provided by Stephen Noble

config setprop squid TransparentPort 8080

config setprop dansguardian portblocking yes

signal-event post-upgrade

reboot

To return Transparent Proxy port to default value and to disable portblocking

config setprop squid TransparentPort 3128

config delprop dansguardian portblocking

signal-event post-upgrade

reboot

2) Configure your workstation web browser to auto detect proxy port

Go to your workstation and open your browser eg Internet Explorer or Firefox or your preferred browser

Change the settings for Connections to LAN

Select Auto detect proxy

Or alternatively use the server IP 192.168.1.1 (or whatever yours is) and use a port of 8080


Configuring Dansguardian to use Auth login

This functionality is built in to the rpms provided by Stephen Noble & requires enabling with a db command

Dansguardian supports different types of auth login ie nsca, pam & ident

Depending on your requirements, enable using the appropriate command. Most users of sme will probably use pam auth as that will authorise access against sme users and passwords.

For details regarding the various auth login methods & other configuration requirements, see http://dansguardian.org or Google

config setprop squid RequireAuth pam

or

config setprop squid RequireAuth nsca

or

config setprop squid RequireAuth ident

To disable Auth login

config delprop squid RequireAuth

To enable any of the above setting changes you must follow the command with:

expand-template /etc/squid/squid.conf

svc -t /service/squid

If you are using nsca auth, create the user & password authentication list (you don't require users to be valid sme users)

touch /home/e-smith/db/proxyusers

Enter user names & password combinations one by one using this command

htpasswd -b /home/e-smith/db/proxyusers username password

You can test the authentication list using the following command

/usr/lib/squid/ncsa_auth /home/e-smith/db/proxyusers

Then enter the username & password when asked

You will see an ERR or OK response

If you are using ident auth, you will require an ident client/server on your workstation available from:

http://dansguardian.org/downloads/michaelpike/DGID.zip


Groups and Auth login

See http://dansguardian.org re Group configuration functionality in relation to Auth login (ie filtering users access rights based on group membership)

The Group filter files are located in:

/etc/dansguardian/lists/fn/*

Edit these to suit your site requirements


Testing access

From a workstation web browser go to the site of www.sex.com or www.sex.com.au

You should receive a message advising the site is blocked. Try browsing to other sites with inappropriate content or a site on your banned site list and you should receive a site blocked message.

Remember that access to sites is controlled by settings in the config files.


General information re Blacklists

You can install blacklists from mesd.k12.or.us or alternatively use the commercial blacklist from URLBlacklist.com

If you choose to use or trial the lists from URLBlacklist.com, download the tgz file, uncompress it and move the contents to the /etc/dansguardian/blacklists directory. There is also a blacklist from dungog.net that was installed at the beginning of this HOWTO.
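Unpacking a downloaded blacklist archive might look like the sketch below. It assumes a gzip-compressed tar file; install_blacklists is a hypothetical helper, and the layout inside the archive is not known here, so list the contents with tar -tzf first and choose the destination accordingly.

```shell
# install_blacklists TARBALL DEST
# Extracts a .tgz blacklist archive under DEST, creating DEST if needed.
# Hypothetical helper; check the archive layout with "tar -tzf" beforehand.
install_blacklists() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example (paths are illustrative): if the archive unpacks into a top-level
# "blacklists" folder, extracting under /etc/dansguardian yields
# /etc/dansguardian/blacklists.
#   install_blacklists /root/downloads/blacklist.tgz /etc/dansguardian
```

Restart Dansguardian afterwards so the new lists are read.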


Further customisation - configuration options

DansGuardian is highly configurable. The source code is available so you have the ultimate in configurability, although most people will be content with modifying the configuration files.

After you have modified any configuration file, to apply the changes you will need to restart DansGuardian.

There are two main configuration files, several banned lists and exception lists. These are all explained below:

exceptionsitelist This contains a list of domain endings that if found in the requested URL, DansGuardian will not filter the page. Note that you should not put the http:// or the www. at the beginning of the entries.
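Since entries must not start with http:// or www., addresses copied from a browser need trimming before they go into the list. A small sketch follows; normalise_site is a hypothetical helper and example.com is a placeholder domain.

```shell
# normalise_site URL
# Strips a leading scheme, a leading "www." and any path, leaving the bare
# domain suitable for a site list entry. Hypothetical helper.
normalise_site() {
    printf '%s\n' "$1" |
        sed -e 's|^[a-zA-Z]*://||' -e 's|^www\.||' -e 's|/.*$||'
}

normalise_site "http://www.example.com/index.html"   # prints example.com
```

The output could then be appended to a list, for example with normalise_site "http://www.example.com/" >> /etc/dansguardian/exceptionsitelist, followed by a Dansguardian restart.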

exceptioniplist This contains a list of client IPs who you want to bypass the filtering. For example, the network administrator's computer's IP.

exceptionuserlist Usernames who will not be filtered (basic authentication or ident must be enabled).

exceptionphraselist If any of the phrases listed here appear in a web page then the filtering is bypassed. Care should be taken adding phrases to this file as they can easily stop many pages from being blocked. It would be better to put a negative value in the weightedphraselist.

exceptionurllist URLs in here are for parts of sites that filtering should be switched off for.

bannediplist IP addresses of client machines to disallow web access to. Only put IP addresses here, not host names.
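Because host names are not accepted in this file, a quick format check before adding an entry can catch mistakes. The sketch below is illustrative only: is_ipv4 is a hypothetical helper that checks the dotted-quad shape and does not verify that each number is 255 or less.

```shell
# is_ipv4 STRING
# Succeeds when STRING looks like a dotted-quad IPv4 address.
# Shape check only: octet ranges (0-255) are not validated.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

for entry in 192.168.1.10 pc1.mydomain.com; do
    if is_ipv4 "$entry"; then
        echo "$entry: ok"
    else
        echo "$entry: not an IP address, do not add"
    fi
done
```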

bannedphraselist This contains a list of banned phrases. The phrases must be enclosed between < and >. DansGuardian is supplied with an example list. You can not use phrases such as <sex> as this will block sites such as Middlesex University. The phrases can contain spaces. Use them to your advantage. This is the most useful part of DansGuardian and will catch more pages than PICS and URL filtering put together.

Combinations of phrases can also be used, which if they are all found in a page, it is blocked. Exception phrases are no longer listed in this file - see exceptionphraselist.

banneduserlist User names which, if basic proxy authentication is enabled, will automatically be denied web access.

bannedmimetypelist This contains a list of banned MIME-types. If a URL request returns a MIME-type that is in this list, DansGuardian will block it. DansGuardian comes with some example MIME-types to deny. This is a good way of blocking inappropriate movies for example. It is obviously unwise to ban the MIME-types text/html or image/*.

bannedextensionlist This contains a list of banned file extensions. If a URL ends in an extension that is in this list, DansGuardian will block it. DansGuardian comes with some example file extensions to deny. This is a good way of blocking kiddies from downloading those lovely screen savers and hacking tools. You are a fool if you ban the file extension .html, or .jpg etc.

bannedregexpurllist This contains a list of banned regular expression URLs. For more information on regular expressions, see http://www.opengroup.org/onlinepubs/7908799/xbd/re.html

Regular expressions are a very powerful pattern matching system. This file allows you to match URLs using this method.
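A pattern can be tried out on the command line before it is added to the list. This sketch uses grep -E, which follows POSIX extended syntax; the regex library DansGuardian was built with may differ in detail, so treat a match here as a first check only. The url_matches helper, the pattern and the URLs are all illustrative.

```shell
# url_matches PATTERN URL
# Succeeds when URL matches the extended regular expression PATTERN
# (case-insensitive). Hypothetical helper for trying patterns out.
url_matches() {
    printf '%s\n' "$2" | grep -Eiq -- "$1"
}

pattern='(casino|poker)'
for url in www.example.com/poker/index.html www.example.com/news; do
    if url_matches "$pattern" "$url"; then
        echo "$url: would match"
    else
        echo "$url: no match"
    fi
done
```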

bannedsitelist This file contains a list of banned sites. Entering a domain name here bans the entire site. For banning specific parts of a site, see bannedurllist. You can also apply a blanket ban to all sites except those specifically listed in exceptionsitelist, block sites specified only as an IP address, and include a stock squidGuard blacklists collection. To enable these blacklists, download them from the extras section http://dansguardian.org/?page=extras

Simply put them somewhere appropriate, un-comment the squidGuard blacklists collection lines at the bottom of the bannedsitelist file, and check the paths are correct. For URL blacklists, edit the bannedurllist in a similar way.

bannedurllist This allows you to block specific parts of a site rather than the whole site. To block an entire site, see bannedsitelist. To enable squidGuard blacklists for URLs, you will need to download the blacklists and edit the squidGuard blacklists collection section at the bottom (as for bannedsitelist above).

weightedphraselist Each phrase is given a value either positive or negative and the values are added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. Once the naughtyness limit is reached (within dansguardian.conf), the page is blocked. See the Naughtyness Limit description within the dansguardian.conf section below.

pics This file allows you to finely tune the PICS filtering. Each PICS section comes with a description of the allowed settings and what they represent. The default settings with DansGuardian are set for youngish children, for example mild profanities and artistic nudity are allowed. PICS filtering can also be totally disabled / enabled using the enablePICS = on | off option.

For more detailed information on PICS ratings, see http://www.w3.org/PICS/

contentregexplist


ICRA The ICRA section is fairly self-explanatory. A value of 0 means nothing of that category is allowed, whereas a value of 1 allows it. For example,

ICRAnudityartistic = 1

allows nude art. For more in-depth information see http://www.rsac.org/

RSAC RSAC is an older version of ICRA. The values here range from 0 meaning none allowed, through 2 (the default value), to 4, which allows wanton and gratuitous amounts of the given category. For more in-depth information see http://www.rsac.org/

evaluWEB evaluWEB rating uses a system similar to the British Film classification system:

0 = U (Universal, ie. suitable for even the youngest viewer)

1 = PG (Parental Guidance recommended)

2 = 18 (Only suitable for viewers aged 18 and over)

SafeSurf Similar to RSAC, but containing a larger range of categories with the range from 0 = full filtering to 9 = wanton and gratuitous. For more in-depth information, see http://www.safesurf.com

Weburbia See evaluWEB. For more in-depth information, see http://www.weburbia.com/safe/index.shtml

Vancouver Webpages This is yet another ratings scheme. See http://vancouver-webpages.com/VWP1.0/ for more information.


dansguardian.conf & dansguardianf1.conf The only setting that is vital for you to configure in the dansguardian.conf file is the accessdeniedaddress setting. You should set this to the address (not the file path) of your Apache server with the perl access denied reporting script. For most people this will be the same server as squid and DansGuardian. If you really want you can change this address to a normal html static page on any server.

Reporting Level You can change the reporting level for when a page gets denied. It can say just 'Access Denied', or report why, or report why and what the denied phrase is. The last option may be more useful for testing, but the middle option would be more useful in a school environment. Stealth mode logs what would be denied but doesn't do any blocking.

Logging Settings This setting lets you configure the logging level. You can log nothing, just denied pages, all text based requests, or all requests. HTTPS requests only get logged when the logging is set to 3 - all requests.

Log Exception Hits Log if an exception (user, ip, URL, or phrase) is matched and so the page gets let through. This can be useful for diagnosing why a site gets through the filter.

Log File Format This setting alters the format of the DansGuardian log file. Please note option 3 (standard log format) is not yet implemented.

Network Settings These allow you to modify the IP address that DansGuardian is listening on, the port DansGuardian listens on, the IP address of the server running squid as well as the squid port. It is possible to configure the Access Denied reporting page here also.

Content Filtering Settings Here you can modify the location of the list files. Adjusting these locations is not recommended.

Naughtyness limit This setting refers to the weighted phrase limit over which the page will be blocked. Each weighted phrase is given a value either positive or negative and the values added up. Phrases to do with good subjects will have negative values, and bad subjects will have positive values. See the weightedphraselist file for examples. As a rough guide, a value of 50 is for young children, 100 for older children, 160 for young adults.
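The arithmetic behind the limit can be illustrated with a toy calculation. This is a sketch, not DansGuardian's implementation: the "weight phrase" file format, the phrases, the weights and the page_score helper are all made up for the example, each phrase is counted once, and a page is treated as blocked when the total exceeds the limit.

```shell
# page_score WEIGHTS PAGE
# WEIGHTS holds lines of the form "<weight> <phrase>" (an illustrative format,
# not the real weightedphraselist syntax). Each phrase found in PAGE adds its
# weight once; the total is printed.
page_score() {
    total=0
    while read -r weight phrase; do
        # Skip blank lines and comment lines.
        case $weight in ''|'#'*) continue ;; esac
        if grep -qiF -- "$phrase" "$2"; then
            total=$((total + weight))
        fi
    done < "$1"
    echo "$total"
}

weights=$(mktemp)
page=$(mktemp)
cat > "$weights" <<'EOF'
50 explicit phrase
20 gambling
-30 medical education
EOF
echo "This medical education page mentions gambling addiction." > "$page"

limit=50
score=$(page_score "$weights" "$page")
if [ "$score" -gt "$limit" ]; then
    echo "score $score: blocked"
else
    echo "score $score: allowed"
fi
```

Here the positive weight for "gambling" (20) is outweighed by the negative weight for "medical education" (-30), so the page scores -10 and passes a limit of 50.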

Show weighted phrases found If enabled then the phrases found that made up the total which exceeds the naughtyness limit will be logged and, if the reporting level is high enough, reported.

Reverse Lookups for Banned Sites and URLs If set to on, DansGuardian will look up the forward DNS for an IP URL address and search for both in the banned site and URL lists. This would prevent a user from simply entering the IP for a banned address. It will reduce searching speed somewhat so unless you have a local caching DNS server, leave it off and use the Blanket IP Block option in the bannedsitelist file instead.

Build bannedsitelist and bannedurllist Cache Files This will compare the date stamp of the list file with the date stamp of the cache file and will recreate as needed. If a bsl or bul .processed file exists, then that will be used instead. It will increase process start speed by 300%. On slow computers this will be significant. Fast computers do not need this option.
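The date stamp comparison described here is the familiar "rebuild if the source is newer than the cache" test. A sketch of that idea in shell follows; needs_rebuild is a hypothetical helper, the file names are scratch files created for the demonstration, and this is not DansGuardian's own code.

```shell
# needs_rebuild LIST CACHE
# Succeeds when CACHE is missing or LIST has been modified more recently.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Demonstration with scratch files and explicit timestamps.
scratch=$(mktemp -d)
touch -t 200701010000 "$scratch/bannedsitelist.processed"   # older cache
touch -t 200707100000 "$scratch/bannedsitelist"             # list edited later

if needs_rebuild "$scratch/bannedsitelist" "$scratch/bannedsitelist.processed"; then
    echo "rebuild cache"
else
    echo "cache is current"
fi
```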

POST protection (web upload and forms) This is for blocking or limiting uploads, not for blocking forms without any file upload. The value is given in kilobytes after MIME encoding and header information.

Username identification methods (used in logging) The proxyauth option is for when basic proxy authentication is used (obviously no good for transparent proxying). The ntlm option is for when the proxy supports the MS NTLM authentication. This only works with IE5.5 sp1 and later, and has not been implemented yet. The ident option causes DansGuardian to try to connect to an identd server on the computer originating the request.

Forwarded For This option adds an X-Forwarded-For: <clientIP> to the HTTP request header. This may help solve some problem sites that need to know the source IP.

Max Children This sets the maximum number of processes to spawn to handle the incoming connections. This will prevent DoS attacks killing the server with too many spawned processes. On large sites you might want to double or triple this number.

Log Connection Handling Errors This option logs some debug info regarding fork()ing and accept()ing which can usually be ignored. These are logged by syslog. It is safe to leave this setting on or off.