Workflow-based data analysis with KNIME
Recommended Reading
The point of the exercise is to recommend articles that are closest to the reader's preferences. For example, if the reader is particularly interested in hardware and security, it would be a good idea to suggest articles from these categories – or even articles that are in both categories at the same time. The current workflow has already explored the extent to which a certain reader has a preference for each category, and the result is available in the form of a vector (Figure 5).
You can create a very similar vector for each article: The columns of the categories to which the article is assigned contain a 1, whereas category columns with no connection to the article contain a 0. The One-to-Many node makes this possible by transforming the article-category assignments (Table 2) into a binary vector per article (Figure 10).
Table 2
Categories Table
Article ID | Category |
---|---|
Article 11 | Hardware |
Article 11 | Software |
Article 31 | Development |
… | … |
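Outside of KNIME, the same transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not how the node is implemented; the sample rows follow Table 2, and the function name `one_to_many` is made up for this example.

```python
# Hypothetical article-to-category assignments, modeled on Table 2.
assignments = [
    ("Article 11", "Hardware"),
    ("Article 11", "Software"),
    ("Article 31", "Development"),
]


def one_to_many(rows):
    """Turn (article, category) pairs into one binary vector per article."""
    # One column per distinct category, in alphabetical order.
    categories = sorted({category for _, category in rows})
    vectors = {}
    for article, category in rows:
        # Start each article with an all-zero vector, then set assigned columns to 1.
        vector = vectors.setdefault(article, [0] * len(categories))
        vector[categories.index(category)] = 1
    return categories, vectors


categories, article_vectors = one_to_many(assignments)
print(categories)
print(article_vectors)
```

Each article ends up with one entry per category column, in the same column order as the reader vector must use later on.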
Once the two vector types have been created, the Similarity Search node can determine the distance between a reader's vector (which represents the reader's preferences) and an article's vector (which indicates the categories to which the article is assigned). The smaller the distance, the more closely the reader's preferences match the categories in which the article is classified.
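The following sketch shows what such a distance calculation might look like. It assumes Euclidean distance, which is only one possible choice; the article does not say which distance function the workflow uses. The preference values and the column order are invented for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Assumed column order: Development, Hardware, Software.
reader_preferences = [0.1, 0.8, 0.6]  # hypothetical preference scores
article_vectors = {
    "Article 11": [0, 1, 1],  # Hardware and Software
    "Article 31": [1, 0, 0],  # Development
}

distances = {
    article: euclidean_distance(reader_preferences, vector)
    for article, vector in article_vectors.items()
}
for article, distance in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{article}: {distance:.3f}")
```

With these sample values, the hardware and software article is closer to the reader than the development article, which matches the reader's stronger interest in those two categories.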
From all the articles that a given reader has not yet read (see the Row Reference Filter node), it is now easy to determine the one with the shortest distance to the reader's preferences. This article is then recommended for reading.
A loop (consisting of the Chunk Loop Start and Loop End nodes) corresponds to a for loop across all rows of the table. In each iteration, the loop determines the article with the smallest vector distance for one reader. The overall result, with one article recommendation per reader (Figure 11), could then be written back to a database with the help of the Database Writer node.
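As a rough analogy in Python, the loop might look like the sketch below. It again assumes Euclidean distance (via `math.dist`, available in Python 3.8 and later) and uses invented readers, preferences, and articles; writing the result to a database is left out.

```python
import math

# Assumed column order: Development, Hardware, Software.
article_vectors = {
    "Article 11": [0, 1, 1],
    "Article 31": [1, 0, 0],
    "Article 42": [0, 1, 0],
}

# Hypothetical readers with preference vectors and reading histories.
readers = {
    "Reader A": {"preferences": [0.1, 0.8, 0.6], "read": {"Article 11"}},
    "Reader B": {"preferences": [0.9, 0.1, 0.2], "read": set()},
}


def recommend_for_all(readers, article_vectors):
    """Return one recommended article (or None) per reader."""
    recommendations = {}
    for reader, profile in readers.items():  # one iteration per reader row
        candidates = {
            article: math.dist(profile["preferences"], vector)
            for article, vector in article_vectors.items()
            if article not in profile["read"]
        }
        recommendations[reader] = (
            min(candidates, key=candidates.get) if candidates else None
        )
    return recommendations


recommendations = recommend_for_all(readers, article_vectors)
print(recommendations)
```

The resulting dictionary plays the role of the result table, with one recommendation per reader.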
Conclusion
This article only covers a fraction of the nearly 2,000 nodes available for KNIME. Other exciting KNIME features include flow variables, the Workflow Coach, streaming, nodes for processing texts, and Deep Learning capabilities.
You can get an idea of the KNIME Analytics Platform by downloading and testing the software. Check out the documentation [3] and node guide [4] at the KNIME website for more on working with KNIME, or bring your questions to the KNIME Forum [5].
Infos
[1] KNIME: https://www.knime.com
[2] D3.js framework: https://d3js.org
[3] KNIME Documentation: https://www.knime.com/documentation
[4] KNIME Node Guide: https://www.knime.com/nodeguide
[5] KNIME Forum: https://www.knime.com/forum