16. October 2018

Revisiting server-based antispam with Bogofilter

Some simple maintenance makes life interesting, but simplicity saves the day

One of the delights of running a little Raspberry Pi as a primary server for a couple of users, all their devices and several domains, is the achievement associated with getting quarts out of pint pots.  While in our case the main motivation is to use as little power as possible, as we are off-grid, I would probably use a Pi where I could even if we were on-grid, as it is so much more satisfying than simply throwing hardware at the delivery of office services.  One part of that equation is anti-spam, and a while ago I wrote about migrating from spamassassin, which is a resource hog, to dspam, which isn't. I then had to go back to spamassassin as dspam succumbed to bitrot and was no longer supported on Debian/Raspbian, before I found a way of running the venerable bogofilter as a server-based anti-spam solution. The previous posts are here and here.

While the initial setup of bogofilter against postfix and dovecot was not especially easy, as documentation is thin on the ground, we have been impressed with its accuracy, and its resource use is so light as to be mere noise rather than even registering in top or htop.  One design choice was whether to use the original Berkeley database or the more modern sqlite. Now I deeply distrust Berkeley database.  This is due to an incident 20 years ago when I ran Cyrus IMAP as an IMAP server in a company, and had to become quite adept at database recovery, as cyrus used Berkeley db as its message store.  When dovecot matured, with the option of using maildir, I migrated that company as quickly as I could to get away from the database trouble.  So I have shied away from Berkeley ever since.  Meanwhile, I find that there are quite a few instances of sqlite in my life, and none seem to give trouble.  As bogofilter can be installed under Debian in either its db or sqlite versions, I plumped for the sqlite option.

I was checking over the weekly database backups when I noticed that the bogofilter word list was 107MB in size. I wondered how fragmented it was likely to be, and soon found a utility I had never bothered to use before, bf_compact. How hard could it be to run this?  You run it against the folder that holds the database, along with any other files created if you use other features of bogofilter, and there are versions for Berkeley db and sqlite. I chose the sqlite version and it ran for a couple of minutes.

Then the error messages started.

Fortunately the utility's first job is to create a new version of the wordlist folder, so the .old directory still had the original files. I quickly copied that over, to get things working again, and took to search engines to find what went wrong.  Nothing much was available, but one post did note simply to use the native sqlite vacuum command. In my case, this was simply:

sqlite3 wordlist.db 'VACUUM;'

which reduced the file from 107MB to 85MB.  But more to the point, it reduced the average delay caused by running an email through bogofilter from around 4-5 seconds to 2-3 seconds.  This was the first time I had done any maintenance at all on the database since installing the system around July or August last year.
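Given the fright with bf_compact, a slightly more cautious version of that one-liner may be worth having. The following is only a sketch: the function name vacuum_wordlist and the .bak suffix are my own inventions, not anything that ships with bogofilter. It keeps a copy of the file, asks sqlite to check the database before touching it, and reports the size before and after. Bear in mind that VACUUM can need free disk space of up to about twice the size of the database while it runs, and I am assuming it is sensible to do this at a quiet moment for mail delivery.

```shell
#!/bin/sh
# Sketch: back up a bogofilter sqlite wordlist, then VACUUM it in place.
# The function name and the ".bak" suffix are illustrative assumptions.

vacuum_wordlist() {
    db="$1"
    backup="$db.bak"

    [ -f "$db" ] || { echo "no such file: $db" >&2; return 1; }

    # Keep a copy first, so a failed run can be rolled back by hand.
    cp -p "$db" "$backup" || return 1

    # Leave the database alone if sqlite itself reports damage.
    check=$(sqlite3 "$db" 'PRAGMA integrity_check;') || return 1
    if [ "$check" != "ok" ]; then
        echo "integrity check failed, leaving $db untouched" >&2
        return 1
    fi

    # wc pads its output on some systems, so strip the spaces.
    before=$(wc -c < "$db" | tr -d ' ')
    sqlite3 "$db" 'VACUUM;' || return 1
    after=$(wc -c < "$db" | tr -d ' ')

    echo "wordlist: $before -> $after bytes"
}
```

It would be called with the path to the wordlist, for example vacuum_wordlist wordlist.db from within the bogofilter folder. The backup copy is left in place afterwards, so it needs removing by hand once the filter is seen to be working.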

All in all, I still think bogofilter is a very good anti-spam solution for servers, even though its most common usage is at client level. I have it configured to maintain a word list globally, but it can be set up, perhaps it's best set up, to use per-user wordlists.  This will allow it to be trained to an individual's requirements.  I once worked at a company where a good subset of users had legitimate requirements to send and receive emails with words that would definitely trigger conventional anti-spam systems. Bogofilter would have allowed those individuals to have their own anti-spam database, which might have been more effective.
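For anyone wanting to try per-user wordlists, bogofilter's -d option selects the folder holding the wordlist, so the job is mostly one of picking a folder per mailbox. What follows is a rough sketch and not my running configuration: the base folder /var/lib/bogofilter and both function names are assumptions made up for the example, and how the user name reaches the script depends on how postfix or dovecot hands over the message.

```shell
#!/bin/sh
# Hypothetical layout: one wordlist folder per mailbox under BOGO_BASE.
BOGO_BASE="${BOGO_BASE:-/var/lib/bogofilter}"

# Print the wordlist folder for a user, rejecting names that could
# escape BOGO_BASE (empty names, slashes, leading dots).
wordlist_dir_for() {
    case "$1" in
        ''|*/*|.*) echo "bad user name: $1" >&2; return 1 ;;
    esac
    printf '%s/%s\n' "$BOGO_BASE" "$1"
}

# Classify one message from stdin against that user's own wordlist.
# -d selects the wordlist folder, -p passes the message through with an
# X-Bogosity header added, -e exits 0 for both spam and non-spam.
filter_for_user() {
    dir=$(wordlist_dir_for "$1") || return 1
    mkdir -p "$dir" || return 1
    bogofilter -d "$dir" -p -e
}
```

A delivery script would then run something like filter_for_user alice with the message on standard input. A new user starts with an empty wordlist, so each mailbox needs its own period of training before the results mean much.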

How effective is this? Well, we have had an average of around 2000-3000 spam attempts a month, with some peaks and some troughs, and excluding smtp probe attempts, which are around 10 times that number, many of which are blocked by fail2ban. The postfix defenses are pretty good, but it is rare for a spam message to get through both postfix and bogofilter. I would estimate about 2-5 a month at most when spammers change their techniques, but months can go by without ever seeing spam, other than those correctly classified and automatically put into the spam folder. These are moved to the user's junk folder and are automatically reclassified, helping with learning.

2. October 2018

Latest update - off-grid Linux IT services

The latest (October 2018) update on hardware and software choices when your power supply is limited

This is a technical blog post which may not be of general interest, and assumes a certain level of technical understanding

Continue reading

18. August 2018

Off-grid - making do and mending

A bit of resourcefulness and "it'll-come-in-handy-some-day" hoarding saves the day

Continue reading

2. May 2018

Weather Station Software

Our weather data goes back to 2011. How to maintain that collection while changing the weather station software

Continue reading

5. February 2018

Raspberry Pi home server - setting up OpenDKIM with postfix

Some vague notes about setting up openDKIM against multiple domains on a single instance, all domains using one key.

Continue reading

14. September 2017

Preparing digital photo files for use on the Web

Reducing size, adding copyright information and other watermarking is useful for pictures one may wish to use on the Internet. Selecting the images, or using drag 'n' drop graphically is easier than pushing specific files through a script. 

Note - I am no programmer, and my scripting abilities are severely limited. This works for me, but may not work for you. Use these ideas at your own risk.


Continue reading

23. August 2017

Lightweight anti-spam alternative for small servers

Spamassassin may be the standard anti-spam utility for servers, but it can't be considered to be either fast or low on resources.  Bogofilter may offer some advantages, but unlike spamassassin, tutorials and how-to's are thin on the ground.

Warning: this is a technical post, full of jargon and an expectation that it will be read by system administrators, so may not be of interest to all readers.

Integrating postfix, dovecot and bogofilter on a Raspberry Pi.

Edit: Some months have passed since installing bogofilter.  It is not as fast as a daemon, but not as slow as spamassassin, either.  As expected, it has taken a couple of months to build up accuracy, but the system is excellent, and now reliably marks some spam that always got through spamassassin. This seems a good way forward for lightweight email systems.

Continue reading

16. July 2017

The Calendar Hokey Cokey

A problem with the Nextcloud calendar can be resolved with a lot of to-ing and fro-ing, thanks to standards

Continue reading

16. June 2017

A Digital Re-think

Some thoughts on images in the digital world...

But note, as with all things photographic, this is all just opinion.

Continue reading

30. March 2017

A little dab in the wrong plaice makes you wonder...

Flying fish in the hills of Assynt

Continue reading

The Queen's Mite

Finding and photographing a bumble bee loaded with mites

Continue reading

26. February 2017

Lexie 2003 - 2017


The life and times of a wonderful Westie

Continue reading

27. January 2017

Tales of experience

Some stories from more than 25 years in corporate IT, some of which seem quite historical now

Continue reading

1. October 2016

Serving up some Pi

You really can run on a Raspberry Pi

Continue reading

13. July 2016

A new battery bank for our off-grid power supply


The final piece of the recent power revamp

Continue reading

9. April 2016

The Silver Darling - When you need a good Linux laptop


The Entroware Apollo laptop is a good option to be sure that new hardware runs well with Linux, and the system itself is good. But the process of buying a Linux laptop could be more straightforward.

Continue reading

8. March 2016

Music at home - HiFi gains a new source

Streaming music without corporate control allows you to listen to your music, not "consume" it

Self-sufficient hifi

Continue reading

7. March 2016

The Northern Lights


Forget the science, just enjoy them

Continue reading

2. March 2016

The Superwind's first winter

Looks like the right kit

Continue reading

20. June 2015

Despamming versus assassinating spam


Free software tools for ensuring email security and managing spam are good and self-managing, but they can be difficult to set up. This was proven by a recent prod to investigate options.

Edited, September 2016 - I have reluctantly reverted to other anti-spam systems.  Debian have stopped providing dspam, although there is a more recent version than the last packaged version. Worse, the version that is packaged for the Raspberry Pi, which has most to gain from dspam, does not work. This is a great pity, as dspam, for the 15 months I have used it, is accurate (96.20%), lightning fast, and uses very few resources.

Continue reading
