Archive for the ‘Mail’ Category

Spammers now subscribe

Friday, October 30th, 2009

During several years I’ve been setting mailing lists I admin to only accept posts from subscribers iA can with spamn order to avoid having to deal with very large amounts of spam posts.

While that is slightly awkward to users of the list, the huge benefit for me as admin has been the deciding factor.

Recently however, I’ve noticed how this way to prevent spam on the mailing lists have started to fail more and more frequently.

Now, I see a rapid growth in spam from users who actually subscribe first and then post their spam to the list. Of course, sometimes spammers happen to just fake the from address from a member of a list – like when a spammer fakes my address and sends spam to a list I am subscribed to, but it’s quite obvious that we also see the actual original spammer join lists and send spam as well.

It makes me sad, since I figure the next step I then need to take on the mailing lists I admin is to either spam check the incoming mails with a tool like spamassassin (and risk false positives or to not trap all spams) and/or start setting new members as moderated so that I have to acknowledge their first post to the list in order to make sure they’re not spammers.

Or is there any other good idea of what I can do that I haven’t thought of?

Why top-posting annoys me

Sunday, September 20th, 2009

This is hardly any news to anyone who cares, and those who should care the most are either not understanding what top-posting is in the first place or they’re not aware of that people like me think top-posting is an evil decease we need to extinguish.

My primary reason to hate top-posting is that it is fast and easy for the single user who writes the mail reply, but it gives more work to the large amounts of people who read it. When someone posts to a mailing list, one should rather expect that the single user would be the one to put in a little extra effort to make the result readable for the masses who will read it.

Top-posting also most often involves the habit of including the entire previous conversation in a quoted manner below.

A sensible post and quote ethic, is to only quote as much as you need from the previous conversation to make your point clear, and to respond in a way so that it is clear to what parts of the quotes you are referring to. That more or less implies doing “interlaced” or “inlined” posting, where you show a few lines of quotes and then a few lines of comments over and over until the end of the mail.

The act of doing bottom-posting but keeping the entire thing quoted above the new text you add is almost as bad as top-posting. You remove the focus of what you write by providing far too much irrelevant text. Remove the irrelevant parts!

These days large portions of the modern world use broadband connections so the actual size of the mail is not a concern for bandwidth or speed reasons, but you probably still want the receivers to focus on your actual point. Also, a lot of mails these days end up in web archives or similar so they are then searchable by internet search engines and browsable by future people and then you even more want the mail to be on topic to become more relevant and less misleading to searches.

In case it isn’t obvious: this of course primarily concerns mails sent to (largish) mailing lists.

HTTP Status Report

Wednesday, April 22nd, 2009

Mark Nottingham Mark Nottingham held a very interesting one hour talk on the status of HTTP and the work on HTTPbis on a QCon conference recently, and luckily for us HTTP geeks there’s this great video/presentation from that.

curl is mentioned at least twice in the slides, unfortunately it has a wrong fact on the second mention where it says curl uses “Pragma: no-cache” as it isn’t true anymore. It used to do that, but we’ve stopped doing it in curl since a while ago.

I’m a subscriber to the httpbis mailing list and a casual contributor, but nonetheless his summary and overview of the state was refreshing as I’ve not been able to keep up with all the details and I haven’t been tracking that working group from its start either.

Subject: Installer

Saturday, March 21st, 2009

This is a full quote from a genuine email I received just moments ago:

Subject: Installer

What URL do I put in to get free apps ?plese tell me

Sent from my iPhone

I have no words to describe it further.

Explanation for hjsdhjerrddf.com domains

Friday, January 23rd, 2009

In case you’ve checked some of your spam mails recently you might’ve discovered how a large amount of them include links to sites using seemingly very random names in the domain names. Like hjsdhjerrddf.com or qwetyqfweyqt.com and so on. Hammering-the-keyboard looking names.

The explanation behind these is quite simple and sad: ICANN allows for a “tasting period” before you pay for the domain. Thus spammers register all sorts of random names, spam the world with mails referring the users to these domains and then they return the domain names again before they’ve paid anything, and go on to the next names.

With a large enough set of people and programs doing this, a large amount of names will constantly be kept in use but not paid for and constantly changing owners.

Conclusion: wherever there’s a loophole in the system, someone is there to exploit it for the purpose of sending spam.

Please hide my email

Friday, November 7th, 2008

… I don’t want my employer/wife/friends to see that I’ve contributed something cool to an open source project, or perhaps that I said something stupid 10 years ago.

I host and co-host a bunch of different mailing list archives for projects on web sites, and I never cease to get stumped by how many people are trying hard to avoid getting seen on the internet. I can understand the cases where users accidentally leak information they intended to be kept private (although the removal from an archive is then not a fix since it has already been leaked to the world), but I can never understand the large crowd that tries to hide previous contributions to open source projects because they think the current or future employers may notice and have a (bad) opinion about it.

I don’t have the slightest sympathy for the claim that they get a lot of spam because of their email on my archives, since I only host very public lists and the person’s address was already posted publicly to hundreds of receivers and in most cases also to several other mailing list archives.

People are weird!

My best spam rules right now

Sunday, June 22nd, 2008

I’ve already before mentioned my antispam setup, but today I just ran a little check on my “hispam” mailbox (the spams with so high spam points that I never even bother to check them for false positives), 43MB of 7900+ spams (received during ~40 hours), to see which ones of my own handicrafted rules that get triggered the most. I use a set of 40+ custom spamassassin rules to help it trigger more mails as spam, since some of the very short mails seem to be hard to catch otherwise, and some of the mails are in many ways looking like mail I would normally get.

Anyway, my top-10 rules are:

  1. 1624 6.0 DS_BODY_DRUGBRAND      BODY: mentions drug brand
  2. 1428 6.0 DS_SUBJECT_DRUGBRAND   Subject mentions drug brand
  3. 828 6.0 DS_FROM_HAXX     spoofed haxx.se address
  4. 769 4.0 DS_BODY_DISCOUNT    BODY: mentions percent discount
  5. 745 4.0 DS_SUBJECT_DISCOUNT   subject mentions percent discount
  6. 415 2.1 DS_TO_OWNER   To contains -owner
  7. 200 6.0 DS_BODY_NODOCTOR  BODY: mentions “no doctor”
  8. 195 2.0 DS_MAILER_THEBAT  sent with the bat
  9. 189 6.0 DS_BODY_DESIGNBRANDS  BODY: mentions designer brand(s)
  10. 158 3.0 DS_BODY_REPLICAS  BODY: speaks of replicas

The first number is number of hits. The second is the “spam points” I assign a match. Then there’s the name of the rule and my description for it. The “spam points” can best be seen relative to the other rules, as what makes a single mail a spam in the end involves multiple factors that aren’t shown here.

Mail turned unreliable

Friday, May 23rd, 2008

I’ve always been proud of my ability to read and respond to email in a swift and reliable manner. I read and write emails every day, and most days I read mails more or less immediately as they land in my inbox.

However, during the recent year or so I’ve noticed that I’m no longer a reliable mail recipient. The amount of spam I get has made me tighten the screws so hard I get my share of false positives. The kind of mails that I need to rescue from my spam bin as they will otherwise suffer the death by delete. But how many do I miss? How often do I lose legitimate mails?

On some of the mailing lists I participate in, the spammers have started to send posts with my email in the From: field (circumventing the subscribers-only limitation), leading to me having to set my own mails as moderated to prevent spam to get posted… :-(

alpine in, pine out

Wednesday, May 21st, 2008

As one of the last living dinosaurs on the planet still using text-based email clients, I realized that pine has been replaced by alpine and I upgraded to that. When doing some reading up on the subject, I noticed that there’s another old grumpy guy still using this client. I’m not sure exactly what that says…

Anyway, the upside of this switch is that this client is now distributed under a proper open source license (Apache license 2.0), as that’s what I’ve been getting in my face from mutt users for years when I’ve explained what I use! (I mean the complaint that pine wasn’t proper open source)

My Antispam Measures

Monday, January 21st, 2008

I get a fair share of spam. I have something like 10 working private email addresses, I’m listed as recipient in numerous email aliases and they all end up in the same physical mailbox where I read them. I’ve also had my existing emails for many years and I’ve shown and used them publicly on the internet all the time. I’m a major spam email target now. A good day I get just 2000 spams, but bad days I’ve been well over 13000 spam emails.A can with spam

My biggest friends in this combat are: spamassassin and procmail.

I’ll describe how I have things setup, not as much as to inspire others but more to be able to get feedback from you on how I can or perhaps should improve my setup to get an even better email life.

  • I consider all mails with spam points >= 3 to be spam. I’ve also tweaked my spamassassin user_prefs to be harsher on (pure) HTML mail and a few other rules, and I’ve added a couple of my own rules to catch spams that previously did slip through a little too easy.
  • First, I filter out mail from trusted mailing lists that have their own antispam measures.
  • I catch what appears to be bounces (I have a huge regex) and if it looks like a bounce to an address I don’t send email from I nuke it immediately (and those could be a true bounce are saved in a dedicated mbox)
  • I have a white-list system that marks all incoming mails from previously marked friends as coming from a friend.
  • Mails from non-friends are passed through spamassassin. Those with spam points higher than N are put in the ‘hispam’ folder – of course with the intention that these are very very very unlikely to every have any false positives and can almost surely be deleted without check. N is currently 10 but I ponder on lowering it somewhat. Spams with less points than N are put in the ’spam’ folder, and I need to check that before I kill it because it happens that I get occasional false positives that end up there.
  • So, mails that aren’t from friends (or from a trusted mailing list) and aren’t marked as spam are then stored in the ’suspicious’ mailbox
  • Mails from friends or from trusted lists go directly into my mailbox, or into a dedicated mailbox (for lists with somewhat high traffic volumes).
  • Oh, a little additional detail: I “mark” my own outgoing mails with an additional custom header with no point whatsoever but to be able to detect when someone/something sends me mail using my own address…

My weakest point in all this right now is the fact that I don’t spam-check white-listed mails at all, so spams that are sent to me using my friends’ email addresses go through and annoy me.

BTW, I did use bogofilter in the past and for a while I actually ran both in parallel (both trained with rougly the same spam/ham boxes for the Bayes stuff) but quite heavily testing I performed at that time (a few years ago) showed that spamassissin caught a lot more spams than bogofilter, while bogofilter only caught a few extra so I dropped it then.