Detecting spam users automatically with a neural network

Future

The method described in this article has some limitations. Although a neural network can, in principle, approximate any complex function, the optimization process is not guaranteed to find the optimum solution; training can, for example, get stuck in a local minimum. In that case, the network achieves only a low level of accuracy.
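The following minimal sketch illustrates how an optimizer can miss the optimum. It is not the article's network: the one-dimensional function loss(), the learning rate, and the two starting points are assumptions chosen purely for illustration. Plain gradient descent is run twice on a curve with two valleys of different depth.

```python
def loss(x):
    """Toy non-convex loss with two valleys of different depth (assumed example)."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Derivative of loss(x)."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=500):
    """Follow the negative gradient from `start` for a fixed number of steps."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Same optimizer, same function, two different starting points.
x_from_right = gradient_descent(start=2.0)
x_from_left = gradient_descent(start=-2.0)

for label, x in (("start at  2.0", x_from_right), ("start at -2.0", x_from_left)):
    print(f"{label}: x = {x:.3f}, loss = {loss(x):.3f}")
```

The run that starts on the right settles in the shallower valley and never reaches the deeper one on the left. Training a real network can run into the same effect in many more dimensions, which is one reason to repeat training with different initial weights.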

A further potential problem is unbalanced or contradictory training data. It might happen, for instance, that purely by chance only the spammers in the training set have hyphens in their names, so the network learns to treat a hyphen as a sign of spam. In large networks, there is also the previously mentioned risk of overfitting, where the network learns the training data by heart but does not gain the ability to evaluate new, unknown data.
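Both problems can be checked for before and after training. The sketch below uses made-up toy data and hypothetical helper names (feature_rate_by_label, memorizer); it is not taken from the article's listings. The first part measures how often a feature occurs per class, and the second part uses a deliberately extreme "model" that only memorizes its training data to show what overfitting looks like on a held-out set.

```python
from collections import Counter


def has_hyphen(name):
    return "-" in name


def feature_rate_by_label(samples, feature):
    """For each label, return the share of samples for which feature(name) is true."""
    totals, hits = Counter(), Counter()
    for name, label in samples:
        totals[label] += 1
        if feature(name):
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(predict, samples):
    """Share of samples whose predicted label matches the true label."""
    return sum(1 for name, label in samples if predict(name) == label) / len(samples)


# Toy data, invented for this example: (user name, is_spammer)
train = [
    ("cheap-pills", True), ("best-loans", True), ("win-money", True),
    ("annas bakery", False), ("chess club", False),
    ("garden blog", False), ("photo studio", False),
]
validation = [("free-gift", True), ("book club", False)]

# Part 1: a feature that separates the classes perfectly deserves a second look.
rates = feature_rate_by_label(train, has_hyphen)
suspicious = rates[True] == 1.0 and rates[False] == 0.0
print("hyphen rate per class:", rates, "suspicious:", suspicious)

# Part 2: a lookup table learns the training data by heart and nothing else.
memory = dict(train)


def memorizer(name):
    return memory.get(name, False)  # unknown names default to "not a spammer"


train_accuracy = accuracy(memorizer, train)
validation_accuracy = accuracy(memorizer, validation)
print("train:", train_accuracy, "validation:", validation_accuracy)
```

A large gap between the accuracy on the training data and the accuracy on data the model has never seen is the usual warning sign of overfitting, which is why part of the labeled data should be held back for validation.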

Despite these limitations, the method lets you check far more pages than before, because the neural network pre-sorts potential spammers. If you find additional spammers manually, you can feed them into the network later as training data.
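This workflow might be sketched as follows. The scores are assumed to be the network's estimated spam probabilities per user; the function names presort and add_confirmed, the threshold of 0.5, and the sample values are hypothetical and only serve the illustration.

```python
def presort(scores, threshold=0.5):
    """Return the users with score >= threshold, most suspicious first."""
    flagged = [(name, score) for name, score in scores.items() if score >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in flagged]


def add_confirmed(training_data, confirmed_spammers):
    """Return the training data plus manually confirmed spammers as positive
    examples, skipping users that are already in the training data."""
    known = {name for name, _ in training_data}
    new_examples = [(name, True) for name in confirmed_spammers if name not in known]
    return training_data + new_examples


# Invented example values: user name -> spam probability estimated by the network
scores = {"cheap-pills": 0.93, "chess club": 0.08, "win-money": 0.71}
review_queue = presort(scores)

# Spammers confirmed by a human reviewer become training data for the next run.
training_data = [("cheap-pills", True), ("chess club", False)]
training_data = add_confirmed(training_data, ["cheap-pills", "free-gift"])

print(review_queue)
print(training_data)
```

A lower threshold sends more users to manual review and catches more spammers; a higher one saves reviewing time at the risk of missing some.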

The Author

Chris Hinze studies IT at the University of Erlangen-Nuremberg, Germany, and works at Benjamin Lochmann New Media GmbH as a web developer. His work there involves back ends for smartphone apps, and he has collaborated on automated spam recognition for http://homepage-baukasten.de.
