Computer Science Seminar: “Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking”

Tuesday, 24 October 2017, 14:30 - Room 1BC45 - Gabriele Tolomei


On Tuesday, 24 October 2017, at 14:30 in Room 1BC45, Gabriele Tolomei (Università degli Studi di Padova) will give a seminar entitled “Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking”.

Machine-learned models are often described as “black boxes”. In many real-world applications, however, models may have to sacrifice predictive power in favour of human interpretability. When this is the case, feature engineering becomes a crucial task, one that requires significant and time-consuming human effort. Whilst some features are inherently static, representing properties that cannot be influenced (e.g., the age of an individual), others capture characteristics that could be adjusted (e.g., the daily amount of carbohydrates consumed). Nonetheless, once a model is learned from the data, each prediction it makes on a new instance is irreversible, since every instance is assumed to be a static point in the chosen feature space. There are many circumstances, however, where it is important to understand (i) why a model outputs a certain prediction on a given instance, (ii) which adjustable features of that instance should be modified, and finally (iii) how to alter such a prediction when the mutated instance is input back to the model.

In this talk, I present a technique that exploits the internals of a tree-based ensemble classifier to offer recommendations for transforming true negative instances into positively predicted ones. I demonstrate the validity of this approach using a real-world online advertising application.

First, I provide the audience with some background knowledge on the online advertising domain and define the specific business problem from which this research originates and which it aims to solve. Concretely, I discuss the design and development of a Random Forest classifier that effectively separates two types of ads (instances): low quality (negative) and high quality (positive). Then, I introduce an algorithm that recommends changes aimed at transforming a low quality ad (negative instance) into a high quality one (positive instance). Finally, I present the main results and findings obtained when evaluating this approach on a subset of the active inventory of a large ad network, Yahoo Gemini.
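
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "exploiting the internals" of a tree ensemble, not the speaker's implementation. It assumes a toy forest of hand-written trees, a majority vote, Euclidean distance as the cost of a change, and a small margin `eps` for crossing a threshold; the names `FOREST`, `positive_paths`, `tweak_to_path` and `recommend` are all hypothetical. For every tree that votes negative, it walks each root-to-leaf path ending in a positive leaf, nudges only the adjustable features needed to follow that path, and keeps the cheapest nudge that flips the whole forest.

```python
from math import dist

# Toy trees (assumed for illustration). An internal node is
# (feature_index, threshold, left, right); the left branch is taken when
# x[feature_index] <= threshold. A leaf is a label: +1 (high quality) or -1.
TREE_A = (0, 0.5, -1, (1, 0.3, -1, +1))
TREE_B = (1, 0.4, -1, +1)
TREE_C = (0, 0.6, -1, +1)
FOREST = [TREE_A, TREE_B, TREE_C]


def predict_tree(tree, x):
    """Follow the tests from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def predict_forest(forest, x):
    """Majority vote over the trees (an odd number of trees avoids ties)."""
    votes = sum(predict_tree(tree, x) for tree in forest)
    return 1 if votes > 0 else -1


def positive_paths(tree, conditions=()):
    """Yield the (feature, threshold, go_left) tests of each path ending in +1."""
    if not isinstance(tree, tuple):
        if tree == 1:
            yield list(conditions)
        return
    feature, threshold, left, right = tree
    yield from positive_paths(left, conditions + ((feature, threshold, True),))
    yield from positive_paths(right, conditions + ((feature, threshold, False),))


def tweak_to_path(x, path, adjustable, eps):
    """Copy x and move each violated feature just across its threshold.

    Returns None when a non-adjustable (static) feature would have to change.
    """
    tweaked = list(x)
    for feature, threshold, go_left in path:
        if (tweaked[feature] <= threshold) == go_left:
            continue  # this test is already satisfied
        if feature not in adjustable:
            return None
        tweaked[feature] = threshold - eps if go_left else threshold + eps
    return tweaked


def recommend(forest, x, adjustable, eps=0.05):
    """Return the closest tweaked instance the forest labels +1, or None."""
    best = None
    for tree in forest:
        if predict_tree(tree, x) == 1:
            continue  # only trees that currently vote negative are inspected
        for path in positive_paths(tree):
            candidate = tweak_to_path(x, path, adjustable, eps)
            if candidate is None or predict_forest(forest, candidate) != 1:
                continue
            if best is None or dist(x, candidate) < dist(x, best):
                best = candidate
    return best


# Hypothetical low quality ad: feature 0 is static, feature 1 is adjustable.
suggestion = recommend(FOREST, [0.55, 0.2], adjustable={1})
```

In this toy setup `suggestion` differs from the input only in feature 1, which mirrors the distinction the abstract draws between static and adjustable features. A real Random Forest would expose its thresholds through its library's own tree structures rather than hand-written tuples.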
