Machine Learning and Human Interpretability

The key idea behind Conductrics is that marketing optimization is really a reinforcement learning problem, a class of machine learning problem, rather than an AB testing problem. Framing optimization as a reinforcement learning problem allowed us to provide, from the very beginning, not just AB and multivariate testing tools, but also multi-armed bandits, predictive targeting, and a type of multi-touch decision attribution using Q-learning. Our first release back in 2010 was in many ways more advanced than what the rest of the industry is providing today.

However, we discovered that no matter how accurate the machine learning might be, many clients and potential clients were uncertain about ceding control, even of low-risk aspects of the user experience, to automated systems if they couldn’t understand how the machine learning was making decisions.

This led us to reconsider what makes a good ML solution, especially for customer-facing applications, and to redesign our machine learning engine to be both accurate and human interpretable.

What is Machine Learning?

Machine learning can be thought of as sitting at the intersection of computer science and statistics. ‘Computer Science has focused primarily on how to manually program computers, Machine Learning focuses on the question of how to get computers to program themselves …’ – Tom Mitchell (see: http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf ).

For many tasks it makes relatively little difference if these programs are opaque to human introspection. We are almost exclusively concerned with performance for some task (‘is this a cat?’; how to win at Donkey Kong, etc.). Here, high capacity models, like deep learning, suffer little penalty for marginal increases in representational complexity.

However, for several valid reasons, marketers tend to be wary about ceding control of their customers’ experiences to black box methods. Firstly, they need assurances that they can trust automation to make reasonable decisions over a large space of possible environments. Secondly, companies often have internal and legal regulatory requirements around how, and which, data may be used for making marketing decisions. With the EU’s GDPR coming into effect in May 2018, this will be even more of a concern.

Why Human Understanding matters

1) Trust – People often need to understand something before they can trust it. This is true not only for marketers, but also for ML diagnostic systems used by doctors. Doctors are much more likely to consider a recommendation or decision from an automated system if the reasoning can be explained.

2) Insights – Not only do people need to trust these systems, but they also want to be able to glean insights from them – ‘I didn’t realize this type of offer would convert better on the weekends’.

3) Review and Accountability – Will this decision from our machine learning be consistent with our stated data policies? Can we be sure it is fair and non-discriminatory? Is it even legal?

4) Explanation – Can you communicate and explain the prediction, or decision, to users if they ask? You may be thinking, ‘who cares if you can explain it to users?’ Well, as of May 2018, the General Data Protection Regulation (GDPR) goes into effect in the EU. Not only does this change the regulations on how personal data can be used, stored, etc., it also places regulation on automated decision systems based on machine learning.

Under Article 22 of the GDPR, EU citizens will have the right of explanation for any ‘significant’ decision that was made via an automated ML system. While there is still ambiguity around what ‘significant’ will mean, and what will count as a satisfactory explanation, the main takeaway is that automated decisions are contestable. The penalty for being in breach is up to 4% of a company’s worldwide annual turnover, so it is a huge risk.

Human readable machine learning representation

While lacking the expressiveness, or capacity, of more complex representations, sparse decision trees have many appealing properties for the marketing use cases:

1) Human Readable – You can see, based on the features of the users, what action or output the system will take.

2) Discrete Rule Set – The rules cover all possible cases. This is useful for organizational or legal review before deploying, since one can predetermine the outcome for every input.

3) Loggable – Since the decision policy is represented as rules, rather than as a function, as in neural nets (deep or shallow) or regression, each decision can be logged with the exact policy/rule that was used. This lets the organization recall, for customer support or for a GDPR request for explanation, the exact reason for any particular decision that was made.
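To make the 'loggable' property concrete, here is a minimal Python sketch, not Conductrics' actual implementation. The rule ids, feature names (weekend, returning) and offer names are hypothetical, chosen only for illustration. Each decision is recorded along with the rule that produced it, so it can be looked up later.

```python
import json
from typing import Callable, Dict, List, Tuple

User = Dict[str, bool]
# (rule id, human-readable condition, predicate, action); all values hypothetical.
Rule = Tuple[str, str, Callable[[User], bool], str]

RULES: List[Rule] = [
    ("R1", "weekend visit", lambda u: u["weekend"], "offer_X"),
    ("R2", "returning customer", lambda u: u["returning"], "offer_Y"),
    ("R3", "default", lambda u: True, "offer_Z"),  # catch-all keeps the rule set complete
]


def decide(user: User, log: List[str]) -> str:
    """Return the action of the first matching rule and log which rule fired."""
    for rule_id, description, matches, action in RULES:
        if matches(user):
            log.append(json.dumps({
                "rule": rule_id,
                "why": description,
                "features": user,
                "action": action,
            }))
            return action
    raise RuntimeError("rule set must end with a catch-all rule")


decision_log: List[str] = []
chosen = decide({"weekend": False, "returning": True}, decision_log)
print(chosen, decision_log[0])
```

Because every log entry carries a rule id and a plain-language condition, a support agent or reviewer can trace a given decision back to a single rule rather than to the weights of a model.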

Conductrics’ Machine Learning Rules

The following is a simple example of a tree view for a set of decision rules that select between three possible user experiences (the ‘A’, ‘B’, and ‘C’ experiences are represented as bars in the audience nodes) generated by Conductrics’ predictive targeting engine:

The model begins with everyone in a single group (the Everyone Audience on the far left) – as you would with simple AB testing. Then, one by one, Conductrics finds the user features that are most useful for discriminating between the user experiences. (Tech note: unlike standard tree algorithms, which build decision rules directly from the data, Conductrics unpacks a sparse set of rules that are implicitly represented within a predictive model, i.e. a function approximation.) This is very similar to the game of 20 questions you may have played as a child. Conductrics tries to ‘ask’ the fewest questions possible, while still finding the best solution. In this case, we only need to include three user features: Rural or not; Mobile Device or not; and Registered or not.

It is also useful to see each targeted audience side by side, with additional details, such as audience size, as well as the predicted values of each option and the probability that each option is the best one.

When Conductrics is making automated decisions, it will follow the rules implied by this report, which as a program looks something like:
if ( Rural ) { return A; }
else if ( Mobile ) { return A; }
else if ( Registered ) { return C; }
else { return B; }
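Because the rules use only three yes/no features, every possible input can be enumerated ahead of deployment. The following Python sketch restates the rules above and lists the outcome for all eight feature combinations. The function and variable names are illustrative assumptions, not part of Conductrics' API.

```python
from itertools import product


def choose_experience(rural: bool, mobile: bool, registered: bool) -> str:
    """Python restatement of the targeting rules shown above."""
    if rural:
        return "A"
    if mobile:
        return "A"
    if registered:
        return "C"
    return "B"


FEATURES = ("rural", "mobile", "registered")

# Map every combination of the three boolean features to its experience.
policy_table = {
    combo: choose_experience(*combo)
    for combo in product([False, True], repeat=len(FEATURES))
}

for combo, experience in policy_table.items():
    print(dict(zip(FEATURES, combo)), "->", experience)
```

A table like this is the kind of artifact a legal or policy reviewer can sign off on, since it shows the outcome for every input the rules can receive.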

But how do you know if the targeting rules are working? Well, we can evaluate our little targeting program by running a type of AB test in which the predictive program is itself one of the test’s options. By default, Conductrics will take a random sample to use as a baseline to compare against the predictive targeting rules. For example, our targeting program returned about $2.77 per user, whereas a simple random play of the four base options returned just $2.10, for a lift of 31.4%.
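As a quick check on the arithmetic, lift is the relative improvement of the targeted value over the baseline value. The short Python sketch below uses the rounded per-user figures from the text; the function name is an illustrative assumption. With these rounded inputs the result comes out near 31.9%, so the 31.4% quoted above was presumably computed from unrounded values.

```python
def lift(treatment_value: float, baseline_value: float) -> float:
    """Relative improvement of the treatment over the baseline."""
    if baseline_value <= 0:
        raise ValueError("baseline value must be positive")
    return treatment_value / baseline_value - 1.0


targeted = 2.77         # rounded per-user value under the targeting rules (from the text)
random_baseline = 2.10  # rounded per-user value under random selection (from the text)

observed_lift = lift(targeted, random_baseline)
print(f"lift: {observed_lift:.1%}")
```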

Conductrics also includes each of the individual options under random selection, which can each also be used as baseline (just keep in mind to specify sample sizes beforehand).

Using the decision tree representation for the predictive control logic enables the machine learning to be both machine and human readable. As we see ever-increasing demand for customer-facing applications that embed AI and machine learning automation, expect to see a greater focus on the human interpretability of these systems.

If you would like to learn more about how we use ML at Conductrics, please get in touch.

About the author

Matt Gershoff
