[Server-devel] web filtering

Adrian Chadd adrian at squid-cache.org
Tue Aug 4 08:32:26 EDT 2009


2009/8/1 Martin Langhoff <martin.langhoff at gmail.com>:
> On Fri, Jul 31, 2009 at 11:59 PM, Joshua N Pritikin<jpritikin at pobox.com> wrote:
>> On Fri, Jul 31, 2009 at 10:46:33PM -0600, Martin Langhoff wrote:
>>>  - wiki material?
>>
>> Added, http://wiki.laptop.org/go/XS_Installing_Software#Internet_Filtering
>>
>> Is it worth mentioning DansGuardian? DansGuardian is free and does
>> content filtering, not just URL or IP address filtering.
>
> Yes, though it seems to be rather crude in its abilities. Not DG's
> fault though, all content filters are rather rough. Even spam filters,
> which can take their time in analysing content, mess up plenty...

DansGuardian (and a lot of the other Squid integration stuff) has a
nasty habit of sucking all of the filtering info into RAM to speed up
requests. It may also store a separate copy of the filtering-related
data in RAM for each user group you define.

Doing straight URL filtering from an on-disk hash table is actually
not all that difficult, and it works surprisingly well. I've drafted
up stuff to do this with Squid in the past for enormous (read:
million-line) filtering tables and the performance was very good. Of
course, the performance can then fluctuate based on how much free RAM
your box has left over for caching the table.
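
For illustration only, here is a rough sketch of the on-disk hash table
idea using Python's standard dbm module. It is not the code I drafted
for Squid; the function names (build_table, candidate_keys, is_blocked)
and the choice to match a host plus each of its parent domains are
assumptions made up for this example. It also isn't wired into Squid at
all; it only shows the lookup side.

```python
import dbm
from urllib.parse import urlsplit


def build_table(path, blocked_hosts):
    """Write each blocked host as a key in a new on-disk hash table."""
    # Flag "n" always creates a fresh, empty database at `path`.
    with dbm.open(path, "n") as db:
        for host in blocked_hosts:
            db[host.strip().lower().encode("utf-8")] = b"1"


def candidate_keys(url):
    """Yield the URL's host, then each parent domain (a.b.c, b.c, c)."""
    # hostname is None for URLs without a scheme; treat that as "no host".
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    for i in range(len(labels)):
        yield ".".join(labels[i:]).encode("utf-8")


def is_blocked(db, url):
    """Return True if the host or any parent domain is in the table."""
    # Each probe is a keyed lookup against the on-disk table, so the
    # whole list never has to be loaded into the process's own memory.
    return any(key in db for key in candidate_keys(url))
```

Usage would be something like build_table("/tmp/blocked", hosts) once,
then opening the table read-only with dbm.open("/tmp/blocked", "r") and
calling is_blocked(db, url) per request. Repeated lookups get faster or
slower depending on how much of the table the OS manages to keep in its
disk cache, which is the fluctuation mentioned above.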

I'd love to actively help out with this but I'm still knee-deep in
other projects.

HTH,



Adrian

