Spam and profanity modules for Drupal
Comment sections, on any website, are often where most of the action takes place. Sometimes they’re filled with engaging banter and precious insight. Other times they’re bulging with hatred and criticism. Although they’re a great tool for connecting content creators with content consumers, action needs to be taken against profanity, spam, and unnecessary (and sometimes downright mean) content.
Perhaps more so than other platforms, online communities have very lively comment sections. In Open Social, our community solution, comments are enabled by default for various user-generated content, including topics, events, and posts.
We even built comment-specific tools for our enterprise projects such as Crowd Innovation and Discussions. On GlobalDevHub, a platform we developed for the United Nations Development Program, there are discussions with hundreds and hundreds of comments. These active discussions make it harder for content managers and moderators to keep an eye on the quality of the discussion.
To support a healthy comment culture, we implemented (and can recommend) a few spam protection and profanity modules for Drupal.
Many platforms struggle with spam: unwanted messages, often posted with the intent to insult or harm. These messages are posted to probe for vulnerabilities or to get links published on your site, and many come from spambots that scan the website for places to put bogus content.
Luckily, there are a few things you can do to limit and prevent spam in your comment sections and discussions:
Require email registration. In Open Social, users have to register using an email confirmation in order to interact on the site. This removes anonymity and introduces accountability to your systems. This ensures that not just anybody interacts on the platform, but only those that put in the effort to join.
Implement Google’s reCAPTCHA. In our community platforms where spam bots are too aggressive, we use Google’s reCAPTCHA, which can be used as an “I’m not a robot” checkbox or even run completely invisibly with some custom development. This helps prevent spam and abuse in your community by distinguishing real users from automated software that behaves abusively.
Not interested in third-party tools? You could try Honeypot instead. This tool creates an invisible form field, which in most cases will be filled by spam bots but not by real users. The biggest downside of this module is that it disables cache on pages where it’s used. A similar alternative is the Antibot module, which protects forms while still being able to cache the page.
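To make the honeypot idea concrete, here is a minimal sketch in Python. It is not the Honeypot module's actual code, only an illustration of the technique described above. The field name `homepage_url` and the function `is_probable_bot` are hypothetical, and the sketch assumes the form hides that field from human visitors (for example with CSS).

```python
from typing import Mapping

# Hypothetical name of the hidden field; humans never see it, bots usually fill it.
HONEYPOT_FIELD = "homepage_url"


def is_probable_bot(form: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field contains any value."""
    # A missing or empty field is what a real browser submission looks like.
    return bool(form.get(HONEYPOT_FIELD, "").strip())


# Example submissions (made-up data for illustration only).
human_submission = {"comment": "Great post!", HONEYPOT_FIELD: ""}
bot_submission = {"comment": "Cheap watches", HONEYPOT_FIELD: "http://spam.example"}

for submission in (human_submission, bot_submission):
    verdict = "reject" if is_probable_bot(submission) else "accept"
    print(verdict, "->", submission["comment"])
```

The check is deliberately cheap: it needs no third-party service, which is the main appeal of this approach, but a bot that ignores hidden fields will still get through.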
Profanity filter modules
An even bigger problem than spambots is human spam: comments with profanity and offensive words. This issue has deep-seated roots. People abuse other people in many situations and places, not only on the internet, for example through extremism directed at other cultures and religions. The UN approaches this as a worldwide issue and promotes tolerance (for instance, through its International Day for Tolerance).
We don’t want spam to occur on our community platforms. Therefore, we needed to automatically filter user-generated content for spam and profanity. We previously used a content moderation service called Mollom but it ended its support and maintenance as of April 2018. So, let’s see what alternatives there are to Mollom.
But first, I’d like to say one important thing about profanity filters in general.
Profanity filters will never be perfect!
Why? Because language keeps changing, so you need to update your vocabulary frequently, and because there is a large number of exceptions that you should whitelist. As articles such as the Scunthorpe problem and the long-necked Giraffe and fluffy white bunny remind us, there are plenty of examples of problematic and badly implemented profanity filters. So, you should always be absolutely sure about your implementation of censorship.
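The Scunthorpe problem is easy to reproduce. The sketch below, in Python, compares a naive substring filter with a word-based filter that honors a whitelist. The vocabulary, the whitelist, and both function names are hypothetical and kept deliberately mild; this is not how any of the modules below are implemented.

```python
import re

BLOCKED_FRAGMENTS = ("ass", "hell")   # hypothetical, tiny vocabulary
ALLOWED_WORDS = {"class", "hello"}    # hypothetical whitelist of exceptions

WORD = re.compile(r"[A-Za-z']+")


def naive_filter(text: str) -> str:
    """Mask every blocked fragment, even inside harmless words."""
    for fragment in BLOCKED_FRAGMENTS:
        text = re.sub(re.escape(fragment), "*" * len(fragment), text,
                      flags=re.IGNORECASE)
    return text


def whitelist_filter(text: str) -> str:
    """Mask whole words containing a blocked fragment unless whitelisted."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        lowered = word.lower()
        if lowered in ALLOWED_WORDS:
            return word  # explicit exception, leave untouched
        if any(fragment in lowered for fragment in BLOCKED_FRAGMENTS):
            return "*" * len(word)
        return word
    return WORD.sub(mask, text)


print(naive_filter("Hello class"))                   # harmless words get mangled
print(whitelist_filter("Hello class, you dumbass"))  # only the insult is masked
print(whitelist_filter("shell"))                     # still masked: not on the whitelist
```

The last call shows why the whitelist never feels finished: every harmless word that happens to contain a blocked fragment has to be added by hand.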
But if you are looking for a profanity filter, then here are some of the options that we recommend.
The first one is Wordfilter. This module provides a suitable base for manually filtering out profanity. You can choose between the direct filtering of specified words or token filtering. It can be used as a filter for any text format.
The downside of using this module is that you need to populate the vocabulary manually. You can find a lot of bad-word lists on the internet, but let’s agree that it’s not the nicest work to do.
If you don’t want to spend too much time and effort on looking for vocabulary, you can choose to use third-party services. There are not too many options for Drupal, but there is one that I found that seems to work really well.
The second option is WebPurify. It is similar to Wordfilter in that it can be used with any text format, and it can also be used on node titles and plain text fields. It already contains a pretty big vocabulary of profanity, but you can also create your own blacklist and whitelist.
A similar and even more popular alternative to WebPurify is CleanTalk, but I was not able to make a profanity filter work there.
Prioritizing safety on your platform
Although comment sections should remain a part of the web - whether they are on YouTube, Facebook, Medium, or community platforms - we need to put in an effort to prevent profanity and spam. These sections need to be heavily moderated, manually or automatically, to avoid people being scammed or, worse, bullied.
Hopefully, I’ve been able to provide some useful suggestions when it comes to Drupal modules that help prevent spam and profanity. It’s necessary for us to prioritize safety and quality on our platforms, and these measures are effective first steps that you can take.
What are some steps that you have taken against spam and profanity? Are there any other modules you can recommend?