Workflow-based data analysis with KNIME

Recommended Reading

The point of the exercise is to recommend articles that are closest to the reader's preferences. For example, if the reader is particularly interested in hardware and security, it would be a good idea to suggest articles from these categories, or even articles that belong to both categories at the same time. The workflow so far has already determined how strongly a given reader prefers each category, and the result is available in the form of a vector (Figure 5).

You can create a very similar vector for each article: The columns of the categories to which the article is assigned contain a 1, whereas category columns with no connection to the article contain a 0. The One-to-Many node makes this possible by transforming the article-category assignments (Table 2) into one binary vector per article (Figure 10).

Table 2: Categories Table

Article ID    Category
Article 11    Hardware
Article 11    Software
Article 31    Development

Figure 10: Output table of the One-to-Many node.
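Outside of KNIME, the same transformation can be sketched in a few lines of plain Python. The following is only an illustration of the resulting data, not of the node's internals: The function name one_to_many and the alphabetical column order are assumptions made for this example, and the input rows are taken from Table 2.

```python
# Article-to-category assignments as in Table 2 (one row per assignment).
assignments = [
    ("Article 11", "Hardware"),
    ("Article 11", "Software"),
    ("Article 31", "Development"),
]

def one_to_many(rows):
    """Turn (article, category) rows into one binary vector per article.

    Hypothetical helper for illustration; not part of KNIME.
    """
    # Fix a column order so that every vector uses the same layout.
    categories = sorted({category for _, category in rows})
    vectors = {}
    for article, category in rows:
        # Start each article with an all-zero vector, then mark its categories with 1.
        vector = vectors.setdefault(article, [0] * len(categories))
        vector[categories.index(category)] = 1
    return categories, vectors

categories, vectors = one_to_many(assignments)
print(categories)
for article, vector in vectors.items():
    print(article, vector)
```

Each article ends up with exactly one vector, and every vector has one position per category, which is what makes the vectors comparable with a reader's preference vector.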

Once the two vector types have been created, the Similarity Search node can simply determine the distance between the vector of a reader (which represents the reader's preferences) and the vector of an article (which indicates the categories to which the article is assigned). The smaller the distance, the better the reader's preferences match the categories in which the article is classified.
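To make the idea of a distance between the two vectors concrete, here is a minimal sketch that assumes the Euclidean distance; the article does not say which distance measure the workflow uses, and the Similarity Search node may be configured differently. The reader's preference values are invented for this example.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Assumed column order: Development, Hardware, Software
reader = [0.1, 0.8, 0.6]   # hypothetical preference weights of one reader
article_11 = [0, 1, 1]     # assigned to Hardware and Software
article_31 = [1, 0, 0]     # assigned to Development only

d11 = euclidean_distance(reader, article_11)
d31 = euclidean_distance(reader, article_31)

# The reader leans toward Hardware and Software, so Article 11 is closer.
print(d11 < d31)
```

With these values, the article that shares the reader's two strongest categories has the smaller distance and would therefore be the better match.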

From all articles that a certain reader has not yet read (see Row Reference Filter node), it is now easy to determine the article that has the shortest distance to the reader's preferences. This article is finally recommended for reading.

A loop (consisting of the Chunk Loop Start and Loop End nodes) corresponds to a For loop across all rows of the table. The loop determines the smallest distance of the vectors for each reader. The overall result with one article recommendation per reader (Figure 11) could then be written back to a database with the help of the Database Writer node.
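The filtering and looping steps described above can be summarized in a short Python sketch. It is an analogy to the workflow rather than a description of how the KNIME nodes are implemented: The function recommend, the sample readers, the additional Article 42, and the use of the Euclidean distance are all assumptions made for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance (assumed measure) between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recommend(reader_vectors, article_vectors, already_read):
    """Return {reader: article} with the closest unread article per reader."""
    recommendations = {}
    # One iteration per reader, comparable to one pass of the loop per table row.
    for reader, preferences in reader_vectors.items():
        seen = already_read.get(reader, set())
        # Keep only the articles this reader has not read yet.
        candidates = {a: v for a, v in article_vectors.items() if a not in seen}
        if not candidates:
            continue  # nothing left to recommend for this reader
        # Pick the unread article with the smallest distance to the preferences.
        recommendations[reader] = min(
            candidates, key=lambda a: distance(preferences, candidates[a])
        )
    return recommendations

# Assumed column order: Development, Hardware, Software
article_vectors = {
    "Article 11": [0, 1, 1],
    "Article 31": [1, 0, 0],
    "Article 42": [0, 1, 0],   # hypothetical extra article
}
reader_vectors = {
    "Reader A": [0.1, 0.8, 0.6],   # hypothetical preferences
    "Reader B": [0.9, 0.1, 0.2],
}
already_read = {"Reader A": {"Article 11"}}

result = recommend(reader_vectors, article_vectors, already_read)
print(result)
```

Reader A's best match, Article 11, is skipped because it has already been read, so the next-closest article is recommended instead. The resulting dictionary holds one recommendation per reader, which corresponds to the table that the workflow could write back to a database.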

Figure 11: The content of the meta node for article recommendations.

Conclusion

This article only covers a fraction of the nearly 2,000 nodes available for KNIME. Other exciting KNIME features include flow variables, the Workflow Coach, streaming, nodes for processing texts, and Deep Learning capabilities.

You can get an idea of the KNIME Analytics Platform by downloading and testing the software. Check out the documentation [3] and node guide [4] at the KNIME website for more on working with KNIME, or bring your questions to the KNIME Forum [5].

Infos

  1. KNIME: https://www.knime.com
  2. D3.js framework: https://d3js.org
  3. KNIME Documentation: https://www.knime.com/documentation
  4. KNIME Node Guide: https://www.knime.com/nodeguide
  5. KNIME Forum: https://www.knime.com/forum

The Authors

Alexander Fillbrunn is a PhD student at the Department of Bioinformatics and Information Mining at the University of Konstanz. He is particularly interested in the development of machine learning algorithms.

Martin Horn is a postdoc at the Department of Bioinformatics and Information Mining at the University of Konstanz, where he studies data analysis – of course using KNIME.
