Monthly Archives: July 2007

messing with purchase prediction software

A few years ago Tesco caused a stir with their loyalty card, using purchase histories to offer targeted coupons to cardholders. That raised some privacy worries (IIRC about the same time Scott McNealy said “You have zero privacy anyway. Get over it.”) and made me start wondering about the software used to predict buying habits, and what it made of my purchases. Am I statistically average? Do I shop like a single person, or is there a category of ‘other-half forgotten bits’ if you sift enough data? xkcd had a slightly more low-tech version (don’t forget to hover over the image), and last night I started to wonder whether the recent ‘healthy eating initiatives’, announced to guarantee column inches (sorry, to save our children from obesity), might get linked into these systems soon. My shopping took place at 20:34, and consisted of:

Chocolate Cake
Vanilla Vodka
Paracetamol

By my estimation that should have had either the Samaritans calling me to make sure things were OK, or my Dr alerted to my unhealthy behaviour and sending me stern letters/phone messages warning me of the dangers of binge drinking, telling me I was going to have clogged arteries, and promising he wouldn’t be doing anything except saying ‘I told you so’ if I dared to turn up with high blood pressure.

In my defence all I can say is that the three items were unrelated, but I’m sure some profiling software knows better.


exim-4.67 vs. spammers

I’ve been more than usually annoyed by a particular set of spammers who keep pushing press releases for their Windows blogging software. Annoyed because it slips through SpamAssassin and has no clear sender for each new run (if I’m fast I can block it for a set, but it seems to come back every 90 days or so), yet it’s 100% identifiable by the Received: header lines. No, I’m not going to mention the domains, as they don’t deserve any more Google hits.

There’s a really neat new exim expansion condition called forany which can be used in an ACL to help out here, but be aware that these tests run on all Received: lines. If normal spam detection is like firing a shotgun wearing a blindfold whilst having directions shouted at you by an assistant, using this is pretty much like pointing the gun directly down and firing repeatedly: I’m sure there will be collateral damage from this at some stage, but I’m not an ISP or a company so I’m prepared to tolerate that. For now.

Anyway, put this in your main content ACL:

  deny message = This message has come via machines used to spam me in the past, and will not be delivered.
       condition = ${if def:h_Received:}
       condition = ${if forany{${readfile{/etc/exim/received_deny.list}{:}}}{match{$h_Received:}{$item}}}

And populate a plain text file (in the above example, /etc/exim/received_deny.list) with one string per line that you’d like to ban, e.g.:

somedomain.tld
someotherdomain.tld

So in the above case anything that mentions either of those domains in any Received: line will cause a match, and the email will be rejected with the custom error shown above. That means somedomain.tld, mail.somedomain.tld and blah.someotherdomain.tld will all give a positive match.

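As this is a pattern match against the header text, putting IP addresses in there is perfectly possible too. Bear in mind that exim’s match condition treats each entry as a regular expression, so an unescaped dot will match any character. Because the lookup is to an external file there is no need to HUP exim when adding to the file: new entries will be seen the next time the ACL is run.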
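
If you want a feel for what an entry would catch before adding it to the file, here is a rough Python approximation of the check. It is only a sketch, not exim’s own logic: the function name and the sample headers are made up for illustration, and Python’s re module is standing in for the PCRE engine exim uses, so edge cases may differ.

```python
import re

def received_denied(received_headers, deny_entries):
    """Return the first deny-list entry found in the Received: headers, else None.

    Hypothetical helper: approximates the forany/match test in the ACL above.
    """
    # $h_Received: expands to all Received: headers joined together,
    # so one search covers every hop the message took.
    combined = "\n".join(received_headers)
    for entry in deny_entries:
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines in the list
        # Each entry is used as a regular expression, so "." matches any character.
        if re.search(entry, combined):
            return entry
    return None

# Made-up sample data for illustration only.
headers = [
    "from mail.somedomain.tld (mail.somedomain.tld [192.0.2.10]) by mx.example.org",
    "from localhost by mail.somedomain.tld",
]
deny = ["somedomain.tld", "someotherdomain.tld"]

print(received_denied(headers, deny))
```

Feeding it a header that mentions none of the listed strings returns None, which corresponds to the ACL statement not firing.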

Enjoy !
