Using Machine Learning to Filter the News

News Filter Hackathon Team wrote “In Oliver Wyman’s recent Hackathon, we chose to create a News Filtering solution. The idea was to allow users who are sensitive to certain issues to read the news without stumbling upon those issues. This could be useful in many cases, for example, to allow people with health anxiety to read the…”

A brief introduction to Ripple Down Rules (RDR)

Patrick Tschorn wrote “Ripple Down Rules (RDR) provide a structure for rule-based classifiers and an incremental construction method. Ripple Down Rules are organized as trees: a case (data to be classified) enters the root node and ripples down a particular path to receive its classification; each node comprises a list of rules; a rule has a set of…”

Accuracy, Precision and Recall: Multi-class performance metrics for supervised learning—Elixir

Patrick Tschorn wrote “A key aspect of judging whether a classifier is fit for purpose is measuring its predictive performance. Any commercial project that involves machine learning is well advised to establish the minimum predictive performance that a classifier has to achieve in order to be viable. In a similar vein, it is useful to establish a baseline…”

Outline of a trainable, streaming tokenizer for NLP with Elixir

Patrick Tschorn wrote “Virtually all NLP tasks require some form of tokenization, and in many cases the tokenizers provided by popular NLP libraries are adequate. If, however, the input material strays sufficiently from the norm, the available tokenizers may not be satisfactory and it may turn out that it is nearly impossible or far too costly to adapt…”

Reading ARFF files with Elixir

Patrick Tschorn wrote “If you are implementing a machine learning approach, you are likely to want to test it on publicly available datasets. A large number of these datasets use the ARFF file format established by Weka. I am not aware of any Elixir ARFF readers, so I am going to explore writing one (‘Arfficionado’) in this blog.…”

Building rule-based machine learning systems from scratch

Patrick Tschorn wrote “Sometimes, it is obvious that a project needs machine learning, but you can tell that simply pumping the data through all the algorithms in a popular library (and picking the one algorithm that performs least badly) is not the answer. Machine learning libraries cannot cover all algorithms, trade-offs and heuristics specific to arbitrary problem domains.…”

By total 13 (Leyendo el periódico) [CC BY 2.0], via Wikimedia Commons

Sentiment Analysis of News Articles

Sunghun Jung wrote “What is sentiment analysis? Sentiment analysis, in a nutshell, is used to predict whether a text is negative, neutral, or positive about a certain topic without having to read the full text. With the development of various Natural Language Processing (NLP) libraries, sentiment analysis has been an interesting area of exploration. So far, tweets and product…”