Detecting spam users automatically with a neural network

Spam Stopper

© Lead Image © Kirsty Pargeter,

Article from Issue 195/2017

Build a neural network that uncovers spam websites.

Website builders (online hosting services that provide tools for non-technical users to build their own websites) are frequently exploited by spammers looking for a convenient launching pad. Checking thousands, or sometimes millions, of web pages manually for evidence of a spammer is both tedious and inefficient.

In this article, I show how to build a spam-detecting neural network with Google's TensorFlow machine learning library [2] [3] and TFLearn [4], a library that provides a high-level API for TensorFlow. Even if you don't spend your days searching for spammers, the techniques described here will give you some insight into how to harness the power of neural networks for other complex problems.

Training Day

The neural network needs both positive and negative samples in order to learn. This solution starts with a manually compiled list of sample users divided into spammers and legitimate users, taking care to include both types in equal numbers. Alongside this classification (spammer or not spammer), the data set contains the name of the user or of the website that belongs to the user, the IP address with which the site is registered, and the language version associated with the site.
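
The following Python sketch illustrates one way such a training set could be represented and balanced. It is not the code from the full article: the field names, the sample records, the `balance` helper, and the octet-scaling in `ip_to_features` are all assumptions made for illustration, and the text does not say how the equal class sizes were actually achieved. Here the larger class is simply downsampled.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One manually labeled record (field names are illustrative)."""
    name: str         # user name or name of the user's website
    ip: str           # IP address with which the site is registered
    language: str     # language version associated with the site
    is_spammer: bool  # manual classification: spammer or not


def balance(samples, seed=0):
    """Downsample the larger class so both classes are the same size."""
    spam = [s for s in samples if s.is_spammer]
    legit = [s for s in samples if not s.is_spammer]
    n = min(len(spam), len(legit))
    rng = random.Random(seed)  # fixed seed keeps the selection repeatable
    balanced = rng.sample(spam, n) + rng.sample(legit, n)
    rng.shuffle(balanced)      # mix the classes before training
    return balanced


def ip_to_features(ip):
    """Scale each IPv4 octet to the range [0, 1].

    This is one possible numeric encoding, assumed here for illustration.
    """
    return [int(octet) / 255.0 for octet in ip.split(".")]


# Hypothetical records, invented for this example only.
samples = [
    Sample("cheap-pills-4u", "203.0.113.7", "en", True),
    Sample("win-big-now", "203.0.113.8", "en", True),
    Sample("best-loans-online", "203.0.113.9", "ru", True),
    Sample("annas-bakery", "198.51.100.23", "de", False),
    Sample("hiking-club", "192.0.2.45", "en", False),
]

balanced = balance(samples)
```

Downsampling discards data, which matters when the list is small. Collecting more samples of the underrepresented class is an alternative that keeps every record.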


