Detecting spam users automatically with a neural network

Future

The method described in this article has some limitations. Although a neural network can approximate almost any complex function, the optimization process is not guaranteed to find the optimal solution. When training ends in a poor solution, the network achieves only a low level of accuracy.
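
The effect is easy to reproduce on a toy problem. The following minimal sketch is not taken from the article's listings; it runs plain gradient descent on an arbitrarily chosen one-dimensional function with two valleys, and the function, learning rate, and starting points are all assumptions made for illustration. Depending on the starting point, the search settles in the deeper or the shallower valley.

```python
def loss(x):
    # Hypothetical non-convex "loss" with two valleys of different depth.
    return x**4 - 3 * x**2 + x


def gradient(x):
    # Analytic derivative of loss().
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=1000):
    # Plain gradient descent: repeatedly step against the gradient.
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


left = gradient_descent(start=-2.0)
right = gradient_descent(start=2.0)

for name, x in (("start=-2.0", left), ("start=+2.0", right)):
    print(f"{name}: x = {x:.3f}, loss = {loss(x):.3f}")
```

Both runs stop where the gradient vanishes, but only the run starting at -2.0 ends in the deeper valley. Training a network faces the same situation in many more dimensions, which is one reason why repeated runs with different random initializations can yield different accuracy.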

A further potential problem is unbalanced or contradictory training data. For instance, it might happen purely by chance that only the spammers in the training set have hyphens in their names, in which case the network may learn to treat a hyphen as a sign of spam. There is also the previously mentioned risk of overfitting in large networks, where the network memorizes the training data but does not gain the ability to evaluate new, unknown data.
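
A quick audit of the training data can reveal skewed classes and accidental correlations before training starts. The sketch below is an illustration under stated assumptions, not part of the article's listings: the sample records, the `has_hyphen` feature, and both helper functions are hypothetical. It counts how many samples each class has and how often the feature occurs per class.

```python
from collections import Counter

# Hypothetical training samples: (user name, label), 1 = spammer, 0 = legitimate.
samples = [
    ("cheap-pills", 1),
    ("best-casino", 1),
    ("win-money", 1),
    ("annas_bakery", 0),
    ("fotoclub", 0),
    ("tennisverein", 0),
    ("gitarrenkurs", 0),
    ("radtouren", 0),
]


def class_counts(data):
    # Number of samples per label; a strong skew indicates unbalanced data.
    return Counter(label for _, label in data)


def feature_rate_by_label(data, feature):
    # Fraction of samples per label for which feature(name) is true.
    hits, totals = Counter(), Counter()
    for name, label in data:
        totals[label] += 1
        if feature(name):
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def has_hyphen(name):
    # Hypothetical feature: does the user name contain a hyphen?
    return "-" in name


counts = class_counts(samples)
rates = feature_rate_by_label(samples, has_hyphen)
print("samples per class:", dict(counts))
print("hyphen rate per class:", rates)
```

A rate of 1.0 for one class and 0.0 for the other, as in this made-up data, is a warning sign that the feature may separate the classes only by accident of sampling. Collecting more varied examples is one way to address it.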

Despite these limitations, the method lets you check far more pages than before, because the neural network pre-sorts potential spammers. If you find additional spammers manually, you can feed them into the network later as training data.
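
One way to organize this feedback step is sketched below. It is a minimal illustration under stated assumptions: the record layout, the `merge_confirmed` helper, and the example names are hypothetical and do not come from the article's listings. Manually confirmed labels override earlier ones, so a corrected record never appears in the training set with two contradictory labels.

```python
def merge_confirmed(training, confirmed):
    # training, confirmed: lists of (user name, label) pairs,
    # where 1 = spammer and 0 = legitimate.
    # Later entries win, so manual decisions override the earlier labels.
    merged = dict(training)
    merged.update(confirmed)
    return sorted(merged.items())


training = [("fotoclub", 0), ("cheap-pills", 1), ("sportwetten24", 0)]
# Hypothetical result of a manual review: one relabeled user, one new spammer.
confirmed = [("sportwetten24", 1), ("kredit-sofort", 1)]

updated = merge_confirmed(training, confirmed)
print(updated)
```

The merged list can then be converted to feature vectors and used to train the network again in the same way as the original data.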

The Author

Chris Hinze studies IT at the University of Erlangen-Nuremberg, Germany, and works at Benjamin Lochmann New Media GmbH as a web developer. His work there involves back ends for smartphone apps, and he has collaborated on automated spam recognition for http://homepage-baukasten.de.
