List of Machine Learning and Data Science Resources – Part 2

This is a follow-up to the list of Machine Learning and Data Science resources I put up a little while ago. This post contains links to resources on clustering and Reinforcement Learning that I didn’t get to in the first post. Like the first one, it’s a bit haphazard and not meant to be definitive. Have fun, and feel free to post comments with your favorites.

Cinderella or the Ugly Stepchild – Reinforcement Learning

Almost every Machine Learning text or tutorial begins with something like ‘Machine Learning can be broken out into three primary tasks: Unsupervised Learning, Supervised Learning, and Reinforcement Learning.’ After a quick overview of RL, they then blow off any further discussion of it.

Well, let’s correct that a bit.

What is an RL problem? An RL problem is any problem where you are trying to figure out the best course of action to take in a particular environment; basically, it’s a problem of optimal control. A few examples: learning a robot controller, teaching a computer program to play a game like backgammon, or discovering the optimal ads to place in front of someone when they are browsing the web. All of these can be framed as RL problems.

Unlike supervised learning, where you have the correct answers to train your learner, in RL problems you only get a reward signal back from the environment – some measure of goodness (or badness) associated with an action or policy you have taken. So with RL problems, you need to interact with an environment in order to learn. Basically, it’s how we as people learn to navigate our world. We do something, and then see if it was good or not. If it was good, we keep doing it; if it was bad, we try something else.

Some related ideas from control theory are dynamic programming and optimal control.
First, take a peek at MDPs on Wikipedia, since MDPs are used as a framework for sequential decision problems.
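
To make the MDP framing concrete, here is a toy sketch. Everything here is invented for illustration – the states, transition probabilities, and rewards are made up, and the value-iteration loop is just the textbook version:

```python
# A toy Markov Decision Process and value iteration. The states,
# transitions, and rewards below are invented for illustration.

# each action maps to a list of (probability, next_state, reward) outcomes
mdp = {
    'browse': {
        'show_ad': [(0.1, 'buy', 1.0), (0.9, 'browse', 0.0)],
        'no_ad':   [(0.05, 'buy', 1.0), (0.95, 'browse', 0.0)],
    },
    'buy': {},  # terminal state: no actions
}

def value_iteration(mdp, gamma=0.9, sweeps=100):
    V = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        for s, actions in mdp.items():
            if actions:
                # Bellman backup: best expected reward-plus-discounted-value
                V[s] = max(
                    sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                    for outcomes in actions.values()
                )
    return V

V = value_iteration(mdp)
print(V)  # 'show_ad' turns out to be the better action in 'browse'
```

The point is just that once you write the problem down as states, actions, and transition probabilities, the optimal policy falls out of a simple iterative computation.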

TD, UMASS and the Pixies

While not cutting edge, you can’t go wrong grounding yourself with Rich Sutton’s and Andy Barto’s intro text on RL. Here is a link to the free online version.
Rich came up with Temporal Difference learning (TD) while he was working on his PhD at UMASS. Speaking of UMASS, check this out! Yeah! FWIW, I took Don Levine’s Avant Garde Film class back in the day, which allegedly inspired the Pixies to write Debaser.
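
To give a flavor of what TD learning actually does, here is a tiny TD(0) value-prediction sketch. The two-state chain and its rewards are made up for the example; see Sutton and Barto for the real treatment:

```python
# A minimal TD(0) sketch: learn state values from sampled transitions
# by bootstrapping off the value of the next state. Toy data only.

def td0(episodes, alpha=0.1, gamma=1.0):
    V = {}
    for episode in episodes:  # each step: (state, reward, next_state)
        for s, r, s2 in episode:
            v_next = V.get(s2, 0.0) if s2 is not None else 0.0
            # nudge V(s) toward the bootstrapped target r + gamma * V(s')
            V[s] = V.get(s, 0.0) + alpha * (r + gamma * v_next - V.get(s, 0.0))
    return V

# deterministic chain: A -> B -> done, with a reward of 1 on the last step
episode = [('A', 0.0, 'B'), ('B', 1.0, None)]
V = td0([episode] * 500)
print(V)  # both values creep toward 1.0
```

Notice that state A never receives a reward directly; its value is learned entirely from the value of B, which is the whole trick of TD.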

I’m looking at you Stanford.

Occasionally you bump into someone who wants to impress on you their data chops (c’mon, we all do it occasionally). Maybe they are involved in investing, or they have some startup or something in the valley. If they start dropping terms like ‘Stanford’, ‘Coursera’, and ‘Andy Ng’ along with the rest of the standard Big Data buzzwords, start to get worried.
However, if they are talking about Stanford and Prof Ng while talking about Reinforcement learning, that is a whole different story. Why? Helicopters of course! How cool is that?!
Sure, Ng has jumped off into deep learning, but his group at Stanford has done a ton of work on applied RL. Take a look at his CS229 lectures that cover RL.

Context is King

One of the many benefits of living in New York is that there is a great community of folks involved in Machine Learning. One of them is John Langford. John has both helped build some large-scale multi-armed bandit systems at Yahoo! (and I assume now at Microsoft) as well as helped push work on state-of-the-art algorithms. Take a look at his blog. Also, take a peek at John’s presentation with Alina Beygelzimer on Contextual Bandits (I make a cameo as the annoying question guy).

Also, if you want to play around with writing your own stuff, go get John Myles White’s book.

If you like the RL/Bandit stuff, but just want to apply it to your online app, please feel free to play around with Conductrics – you can get a free account!

Group and Mix – Clustering Stuff

What is clustering anyway and what is a good clustering algorithm? Take a look at Shai Ben-David’s talk on the theory of clustering to find out. Surprisingly, there really isn’t a lot of theory around the general problem, but Shai is trying to fix that.

The Magic of Expectation Maximization

If you just want to know how to ‘do’ clustering, you should probably have some idea of the EM algorithm. This video from Joaquin Quiñonero Candela covers the EM algorithm for clustering (K-Means, Gaussian Mixture Models). I couldn’t pass it up, since he starts with the Supervised, Unsupervised, and Reinforcement Learning setup, and then, as expected, blows off RL 🙂 Ha!
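
To make the E-step/M-step rhythm concrete, here is a tiny 1-D K-Means sketch – the ‘hard assignment’ cousin of EM for Gaussian mixtures. The data points and starting means are invented for the example:

```python
# A tiny 1-D k-means: alternate between assigning points to their
# nearest mean (the hard "E-step") and recomputing each mean from its
# assigned points (the "M-step"). Toy data, invented for illustration.

def kmeans_1d(xs, means, iters=20):
    for _ in range(iters):
        # E-step: assign each point to its nearest mean
        clusters = [[] for _ in means]
        for x in xs:
            i = min(range(len(means)), key=lambda j: abs(x - means[j]))
            clusters[i].append(x)
        # M-step: move each mean to the average of its points
        means = [sum(c) / len(c) if c else m
                 for c, m in zip(clusters, means)]
    return means

xs = [1.0, 1.2, 0.8, 9.0, 9.4, 8.6]
print(kmeans_1d(xs, [0.0, 5.0]))  # roughly [1.0, 9.0]
```

Full EM for a Gaussian mixture is the same dance with soft (probabilistic) assignments instead of hard ones.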

Information and a Brief Diversion: Kullback and Leibler

How do we measure whether things are different? Well, one way is via the KL-divergence. The KL-D comes up all of the time, which is not that surprising, since it is a measure of information gain. Also, it’s great to drop in conversations when you want to impress folks – hey, sometimes you just want to fit in with the data science gang. Anyway, this little guy is super important and will help form your mental bridge between Stats, Machine Learning, and Information Theory. Just remember that it is a measure, not a metric, and you will be fine.
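
Here is a quick sketch to make the ‘measure, not a metric’ point concrete – computing the KL-divergence between two invented discrete distributions and noting that it is not symmetric:

```python
# KL divergence between two discrete distributions, in bits.
# D(P || Q) = sum_i p_i * log2(p_i / q_i); toy distributions below.
from math import log2

def kl(p, q):
    # terms with p_i == 0 contribute nothing, by convention
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl(p, q), kl(q, p))  # not symmetric: D(P||Q) != D(Q||P)
```

That asymmetry (plus no triangle inequality) is exactly why it fails to be a metric, even though it behaves like a sensible measure of how one distribution diverges from another.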
For a bit more on information theory, take a look at Mr Information theory himself, David MacKay (and check out those shorts he is rocking!).
Also, after you get into information theory a bit (pun unintended), maybe revisit the idea of BIGDATA in your mind while thinking about entropy and Kolmogorov complexity. Does it make sense? Eh, I am not sure.

The Russians thought of it first – Convexity and Boyd

You think your fancy machine learning or data science method is so brand spanking new that VCs will shower money on you to get to market? Let Stephen Boyd disabuse you of your pretensions. While not exactly a course on Data Science, you will be surprised how much he covers in this class. Boyd walks you through lots of your favorite ML algos from a mathematical programming perspective (logistic regression, SVMs, etc.). This different approach can be really helpful for highlighting an idea or concept that may have slipped by you. For example, we mucked around with the Schur complement in classes I have taken; I never understood that it was the conditional variance for the multivariate normal until reviewing his class. I think you will find little connections like that happening all of the time with his class. He also does a nice job of walking you through different loss functions (hinge, quadratic, Huber). As a bonus, he is hilarious – I often chuckled to myself while watching his lectures on my subway rides. No way anyone ever caught on that I was watching Convex Optimization.
Also, you learn that the Russians did it all first.
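
If you want to poke at those loss functions yourself, here is a quick sketch using the standard textbook definitions (with one common scaling for Huber) – these are not lifted from Boyd’s slides:

```python
# The three loss functions mentioned above, as plain functions of the
# residual (for regression) or the margin (for classification).

def quadratic(r):
    return r * r

def huber(r, delta=1.0):
    # quadratic near zero, linear in the tails -> robust to outliers
    return r * r if abs(r) <= delta else delta * (2 * abs(r) - delta)

def hinge(margin):
    # classification loss: zero once the margin is at least 1
    return max(0.0, 1.0 - margin)

print(quadratic(3.0), huber(3.0), hinge(-0.5))  # 9.0 5.0 1.5
```

Comparing quadratic(3.0) with huber(3.0) shows the point of Huber: the penalty for a big residual grows linearly rather than quadratically, so outliers pull the fit around less.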

*** Update ***
I reached out to Stephen Boyd and he suggested this link. It has his book online as well as links to the courses. Also, in his words: ‘a particularly fun source of stuff, that not many people know about, is the additional exercises for the course, available on the book web site. this is dynamically updated (at least every quarter) and contains tons of exercises, many of them practical, on convex optimization and applications.’ To be honest, these exercises are a bit (well, maybe more than a bit) beyond my capacity, but have at it.

Please feel free to comment and sign up for Conductrics – start rolling out your own targeted experiments today!

Posted in Testing and Data Science | 1 Comment

A List of Data Science and Machine Learning Resources

Every now and then I get asked for help or pointers on a machine learning/data science topic. I tend to respond with links to resources by folks that I consider to be experts in the topic area. Over time my list has gotten a little larger, so I decided to put it all together in a blog post. Since it is based mostly on the questions I have received, it is by no means complete, or even close to a complete list, but hopefully it will be of some use. Perhaps I will keep it updated, or better yet, feel free to comment with anything you think might be of help.

Also, when I think of data science, I tend to focus on Machine Learning rather than the hardware or coding aspects. If you are looking for stuff on Hadoop, or R, or Python, sorry, there really isn’t anything here.

Neo Makes Cheese

Before you do anything else, start boning up on your linear (matrix) algebra. This is the single most important thing you can do to bootstrap your ML education.

This is the deal, and I don’t care what anyone else has told you: if you want to have any hope of understanding what is going on in Machine Learning, Data Science, Stats, etc., you have got to get a handle on Linear Algebra.

Painful! Trust me, I know. I got an ‘F’ the first time I took it in college. I had no idea what was going on. The only thing I really remember was the professor shouting at us after poor marks on homework and tests: ‘How can you make cheese if you don’t know where milk comes from!? It’s plain, common, ordinary horse sense!’

Kinda nuts, and it didn’t totally make sense, but his point was: you have got to have the basics down before you can actually make anything useful.

The flip side is that once you start getting a feel for linear algebra, you can much more easily hop around between various advanced topics. This is because much, though not all, of ML rests on applications of linear algebra.

Where to start? For me, there really is only one place I go to get a refresher on a topic I realize I don’t really understand, and that is Gilbert Strang’s undergrad class at MIT. Just an awesome intro course, and it makes me covet the students who get to go to MIT. See his class here: Linear Algebra Class

General Machine Learning

Now this resource is a big one – and I think just this link makes this post worth it. As far as I am concerned, it is one of the most valuable sites on the internet. Sure, maybe some of the other ‘disruptive’ educational sites are useful, but almost everything Machine Learning is in here and then some – mother lode, paydirt, or whatever you want to call it, you just hit it with this.
But don’t hoard it; pass this resource along.

Also, a good first lecture on ML is Iain Murray’s tutorial from a machine learning summer school. Here is the lecture: Murray Teaches ML

On to some TOPICS

LDA for Topic Models or How I Learned to Pronounce Dirichlet
One, this is useful. Two, data science folks are on about this, so if you want to fit in – or hit up any ‘Big Data’ VCs – you better be able to name-drop this, and be able to back it up if you get called out – not that the VC will be able to call you out ;).
David Blei is probably the best source to start looking for Topic modeling research applications

Matt Hoffman – Matt is over at Adobe now, but he wrote some Python code for online topic models (I think this is his research as well) – check it out.

From LDA to SVD to LSI
A non-Bayesian/probabilistic approach to topic modeling is Latent Semantic Indexing, where you use a version of the SVD (actually, I think it is basically principal components, which is a related eigendecomposition/factorization). There is a wiki page here on LSI to get you started. I know it is normally kinda lame to just link to Wikipedia, but it is a pretty good overview.

SVD or Recommending Everyone’s Favorite Factorization
If you didn’t feel the need to look over Strang’s course, take a look at this class on the SVD, which is the basis for most every recommendation system – I highly recommend getting at least the basic idea down.
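
Here is the basic idea in a few lines: take a low-rank approximation of a ratings matrix via the SVD. The matrix below is invented for the example – this is the kernel of the technique, not any particular production recommender:

```python
# Rank-2 approximation of a tiny invented "ratings" matrix via the SVD.
import numpy as np

R = np.array([[5., 5., 0., 1.],
              [4., 5., 1., 0.],
              [0., 1., 5., 4.],
              [1., 0., 4., 5.]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # keep only the top-2 singular values
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# the low-rank version smooths the matrix, filling the zeros with
# plausible values -- the basis for predicting unseen ratings
print(np.round(R_hat, 1))
```

The two "taste" directions the SVD finds here correspond to the two blocks of users who like the same pairs of items; the reconstruction fills in each user’s missing entries from the block they belong to.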

Bayesian Or Frequentist Or Who Cares?

Huge topic that I am not going to try to flesh out here, but Michael Jordan (the one at Berkeley, not Chicago) has a nice lecture contrasting the two: Bayesian or Frequentist

Also check out MJ on the Dirichlet Process – not the best audio, but since he also sets up the Parametric/Non-Parametric and Bayesian/Frequentist  classifications on slide two, it is worth bending your ear a bit. MJ DUNKS!

On to Non Parametric Bayesian approaches

There seems to be a bit of chatter in the startup/Big Data space around non-parametric Bayesian methods. If you want to see more after checking out MJ’s talk, take a look at David MacKay’s tutorial on the Gaussian Process.

Also check out Gaussian Processes for Machine Learning by Chris Williams and Carl Rasmussen here: ‘Gaussian Processes for Machine Learning’. What is great is that this book is online and free! What is not so great is that it is pretty hairy going, but take a peek at chapter 6; it has a comparison of GPs with Support Vector Machines (SVMs).

PMR Madness! Joints and Moralizing

I took Chris Williams’ class on Probabilistic Models and Reasoning several years back when I was studying AI at Edinburgh. Take a look at the slides etc. There is stuff on graphical models, the junction tree algorithm, etc. Chris is one of those crazy smart guys who I use in my mind’s eye as a litmus test for when someone is trying to pass themselves off as an expert in the space. Sort of a where-are-they-on-the-CW-scale – most often it’s pretty low 😉

Also see Carl’s talk on Gaussians

The First Order of Business – Let’s Get Stochastic

Online vs offline – everyone is all agog about Big Data, Hadoop, etc., but if you are interested in more than hardware/IT, you will want to think about how you are going to use data and what types of systems approaches are most appropriate. If you are going to be doing ML on a lot of data, you should be aware of Stochastic Gradient Descent (SGD). Léon Bottou is the man for this – here is a video and his home page.
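
The core of SGD fits in a handful of lines: update the model one example at a time instead of sweeping the whole dataset. Here is a toy sketch – online logistic regression on invented data, not anything from Bottou’s materials:

```python
# Bare-bones stochastic gradient descent: online logistic regression,
# updating the weights one (shuffled) example at a time. Toy data only.
import math
import random

def sgd_logistic(samples, lr=0.5, epochs=200, seed=0):
    rnd = random.Random(seed)
    samples = list(samples)  # local copy so we can shuffle freely
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        rnd.shuffle(samples)
        for x, y in samples:          # y is 0 or 1
            z = w[0] * x[0] + w[1] * x[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y                 # gradient of the log-loss wrt z
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
            b -= lr * g
    return w, b

# the first feature decides the label; the second is pure noise
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
w, b = sgd_logistic(data)
print(w, b)  # w[0] should dominate
```

The same loop structure works whether the data lives in a list or streams in from disk one record at a time, which is exactly why SGD matters for large datasets.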

Deep Learning or Six Degrees of Geoff Hinton

Here are just a few of those who have been students or postdocs at his lab: Yann LeCun (ANNs/Deep Learning), Chris Williams (GPs), Carl Rasmussen (GPs), Peter Dayan (Neuroscience and TD-Learning), Sam Roweis, and recently Iain Murray (see lecture above).

After the NYTimes had a piece on deep learning there was a fair amount of online chatter about it.  What is it? Let Yann LeCun tell you. Yann is a professor over at NYU, and has worked with Neural Nets for quite some time, using energy models and his convolutional net. Take a look at Yann’s presentation he gave to the Machine Learning summer school I attended back in ’08.

** Update **

Yann suggested the following updated links: 1) a recent invited talk at ICML 2012, and 2) some slides from a more recent IPAM summer school. I have not had the time to take a look, but since Yann suggested these personally, check them out.


As an aside, one of the great things about those PASCAL machine learning summer schools is that you get to hang out with these folks informally. So chat SVMs and feature selection with Isabelle Guyon at dinner, lunch with Rich Sutton, or talk shop with Yann over a glass of Pineau. If you can make it happen, I highly recommend attending one; the next one looks to be in Germany.

Also, feel free to peruse Geoff Hinton’s site for some goodness on Autoencoders and RBMs.

NLP, but not LDA

I couldn’t figure out how to conjugate the verb ‘to be’ until I was like 12, so, not surprisingly, I never really got into NLP.

I was, however, fortunate to take a class with Philipp Koehn while I was at Edinburgh, who has helped drive much of the recent work on machine translation – I’ll let him explain it to you here. You can also get his book if you are interested, Statistical Machine Translation, but to be honest, I haven’t read it.

You may have noticed that he used the term informatics. Here is a description/definition of it on the Informatics web site at the University of Edinburgh (there is a PDF at the bottom of that page with more detail). I actually think informatics is a better term than Data Science, but hey, out with the old, in with the new(ish).

Named Entity – The Subjects and Maximizing Entropy
If you are into NLP, you might want to figure out who the players are in text documents. You will need Named Entity Recognition to do this.
Maybe this is good – I have never used it.
If you want to build your own NE model, Maximum Entropy (MaxEnt) models have been used – here is a resource with MaxEnt for NLP. I admit, I don’t know what the state of the art is, or if this is still used, so feel free to comment with some better/newer stuff.

Okay, this post is getting long, so I will wrap it up. I didn’t get to Reinforcement Learning or clustering methods (other than LDA), so perhaps I will extend this post, or write a follow-up soon. Please feel free to add thoughts via the comments, and if you haven’t yet, please sign up for your free Conductrics account.
Sign Up for a free account!

Posted in Analytics, Testing and Data Science, Uncategorized | 7 Comments

Decision Attribution for Multi-Touch Optimization

Right now, online advertisers and marketers have a lot of interest in attribution analysis. Marketers are struggling to determine the impact of each individual campaign, across the various channels that make up their online marketing efforts. While they can discern the aggregate effects, it is much more difficult to disentangle the individual effectiveness of any given campaign or marketing channel.

This is actually a tough problem, and it is similar to the problem faced when attempting to optimize over multiple customer touch points across your own organization.

For basic single touch-point tests, it is fine to just run an AB-style test. Conductrics, like other solutions, learns the best-performing options by keeping track of the following:

  1. the number of times each test option is presented to users.
  2. the total value of the goals achieved (e.g. value of shopping cart) after each test option.

As you can see, it’s not too complicated really. Of course, when user targeting is activated, things get a bit more complicated under the hood, but the optimizer still uses essentially just these two pieces of information to drive the optimization forward.
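
As a sketch, those two tallies per option are enough to score a test. The class and numbers below are invented for illustration, not Conductrics internals:

```python
# Keep the two running tallies per test option: how often it was shown,
# and the total goal value observed afterwards. Numbers are invented.

class OptionStats:
    def __init__(self):
        self.count = 0          # times the option was presented
        self.total_value = 0.0  # summed value of goals achieved after it

    def record(self, goal_value=0.0):
        self.count += 1
        self.total_value += goal_value

    @property
    def mean_value(self):
        return self.total_value / self.count if self.count else 0.0

options = {'A': OptionStats(), 'B': OptionStats()}
options['A'].record(0.0)
options['A'].record(10.0)   # e.g. a $10 shopping cart
options['B'].record(0.0)
options['B'].record(0.0)
options['B'].record(12.0)

best = max(options, key=lambda k: options[k].mean_value)
print(best, options[best].mean_value)
```

Everything else in a single touch-point test is just bookkeeping around these two numbers.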

Things change a bit when we are solving multi-touch optimization problems.

Here is why. Not only do we need to know how many goals happen after each test option, but we also need to learn how often the user goes to the other touch-points in your application. This means that we need to learn a bit more about the dynamics of the site and how the test impacts those dynamics.

Site Dynamics

Learning the dynamics means keeping track of not only when users convert, but also how they move between the different touch-points. For example, let’s say we want to run tests on three pages simultaneously. The following is a simple graph of what that might look like.


Okay, so this is a little messy looking, but bear with me. We have three pages, each running an A/B-style test. The arrows represent the probability (how often) a user will move from one page to another after being exposed to one of the test options. The actual conversion event, or goal, happens off of Page C. By linking the three touch-points and the three tests, Conductrics will not only learn the option on Page C that has the highest goal conversion, but will also learn which options on Page A and Page B will ultimately drive the most traffic to Page C – sort of acting like lead generators for Page C.
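
Here is a sketch of why the dynamics matter. With made-up transition probabilities and a conversion that only happens off Page C, we can propagate the conversion value back to the upstream options (this is toy code, not the Conductrics implementation):

```python
# Propagate conversion value back through a page graph. All of the
# probabilities and the conversion rate below are invented.

# transitions[page][option] -> list of (probability, next_page);
# 'exit' means the user leaves without converting
transitions = {
    'A': {'a1': [(0.6, 'C'), (0.4, 'exit')],
          'a2': [(0.3, 'C'), (0.7, 'exit')]},
    'B': {'b1': [(0.5, 'C'), (0.5, 'exit')]},
}
conv_rate_C = 0.10  # chance of converting off Page C

value = {'C': conv_rate_C, 'exit': 0.0}
for page, opts in transitions.items():
    for opt, outcomes in opts.items():
        # expected downstream value of showing this option on this page
        value[(page, opt)] = sum(p * value[nxt] for p, nxt in outcomes)

# option a1 drives more traffic to Page C, so it is worth more on Page A
print(value[('A', 'a1')], value[('A', 'a2')])
```

Option a1 never converts anyone directly, yet it is clearly the better choice on Page A – which is exactly the lead-generator effect described above.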

Isn’t this Just Multivariate Testing?

In some ways, yes. Like multivariate testing, we are trying to optimize with more than one decision variable. However, Conductrics takes advantage of the fact that in multi-touch optimization, decisions take place at different times and places. That lets us employ a type of artificial intelligence sometimes known as approximate dynamic programming or reinforcement learning.

It’s sort of like learning how to play a good game of backgammon: the site may need to make a series of decisions before the user reaches a conversion event. Or there may be multiple conversion events, so that Conductrics needs to learn and harmonize the decisions taken at different locations in the site.

What is neat is that you don’t have to do anything special to set up Conductrics to learn this much more complicated type of problem. You can create a multi-touch agent just as you would a normal AB-style test agent. You just tell us the decision-points, and the decisions for each decision-point. That is it; we keep track of the connections and the math so you can just focus on improving your customers’ experiences.




Can I also use targeting?

You bet you can. Not only can you add targeting, but you are also free to set up multivariate-like tests at the touch-points – that’s what is so great about the Conductrics agent method: it gives you total flexibility in setting up your optimization efforts.

This table gives you some idea of the types of agents that you can set up, once you have signed up!


As you can see, AB Testing, Multi-Armed Bandits, and Behavioral Targeting can all be seen as just different versions of the same intelligent agent. Cool, right?! Please comment with any thoughts or questions, and don’t forget to go sign up for your free trial account.

Posted in Analytics, Testing and Data Science, Uncategorized | Tagged , , , , | Leave a comment

Conductrics’ Confidence and Lift Report – The Basics

While enabling Conductrics’ adaptive testing is a powerful way to auto-optimize your app, there are times when you might want to run  non-adaptive tests. Maybe you want to test something that you will apply in other media, or perhaps you want to do online research on your customers, to give you additional insights into their wants and needs.

Adaptive or Non-Adaptive – Your Choice

Regardless of the reasons, we have tried to make it super easy to toggle between adaptive and non-adaptive testing. In fact, in our UI you can just click a button at any time during the project to switch between the two modes of testing.

Adaptive Setting

Ok, so let’s say you decide to run a standard A/B style test. You are going to want to have some way to evaluate the results of your experiment. Your Conductrics account gives you a couple of ways to get at your results.

Accessing the Report

The first, and simplest way, is our Confidence report. This report is accessible via your online Conductrics’ console under the Confidence tab displayed for each of your agents.

 Report Tabs

Clicking on the Confidence tab brings you to the following report.


Lift Report

I won’t go into all the detail here but let me cover a few of the main parts of the report:

  1. Count – how many times each option has been selected.
  2. Estimated Value – the mean, or average, value of selecting each option, along with a visual representation of the confidence interval. These values are based on the measured success of each option against your selected conversion goals and their associated values.
  3. Confidence – a simple visual representation of the statistical hypothesis test that we run, which is used to determine whether we have enough evidence to conclude that the option in question is out-performing the default.
  4. Predicted Lift – how much of an improvement we should expect if we were to use only this option. We calculate this in two ways: the first uses the selected default as the baseline, so the result is how much better we should expect to do relative to the default. In some cases, say a totally new project, there really isn’t a default, so we also provide a lift calculation over the average value of all of the options.
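
The two lift calculations are simple enough to sketch with invented conversion values (these numbers are made up for illustration; the report computes the same ratios from your actual data):

```python
# Predicted lift, computed two ways: relative to the default option,
# and relative to the average of all options. Values are invented.

values = {'default': 0.040, 'option-b': 0.050, 'option-c': 0.044}

def lift(v, baseline):
    return (v - baseline) / baseline

avg = sum(values.values()) / len(values)
for name, v in values.items():
    print(name,
          round(lift(v, values['default']) * 100, 1),  # % vs the default
          round(lift(v, avg) * 100, 1))                # % vs the average
```

So here option-b shows a 25% lift over the default, but a more modest lift over the average of all three options – which is why having both baselines is handy.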

We like this report because you can get answers pretty quickly from just glancing at how well each option is doing.


The Reporting API

Of course, following our everything-with-an-API philosophy, you can also access the report results via the Conductrics Report API. I won’t get too bogged down in the API in this post, but here is a quick look in case you want to play around with it.

Getting the report results is as simple as calling the API, which looks something like the following for our report above:

This URL will return a JSON object with all of the data in the report, as well as some additional nitty-gritty that your stats person can poke around in. Here is a bit of the JSON for option-b in our report.

Report API

The JSON contains information about the agent as well as the option and what decision and decision-point it belongs to.  Additionally, if the agent is using targeting, we would get back a result for each segment or user feature. 

Along with the basic data points provided in the UI report, the API passes back the t-score and p-value, as well as the variance and standard deviation of the results. There are a few other data points in there, but they aren’t too important; we can discuss them in a future post on hypothesis testing. Oh, for those interested in these sorts of things, we use the Welch pairwise t-test, and apply a Bonferroni adjustment to help alleviate the bias that comes from making multiple comparisons.
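
For the curious, here is a sketch of that machinery: Welch’s t statistic, the Welch–Satterthwaite degrees of freedom, and a Bonferroni-adjusted significance level. The sample numbers are invented, and in practice something like scipy.stats.ttest_ind with equal_var=False does the heavy lifting:

```python
# Welch's t statistic and the Welch-Satterthwaite degrees of freedom,
# from summary stats (mean, variance, n) of two options. Toy numbers.
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    se2 = var1 / n1 + var2 / n2
    t = (mean1 - mean2) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((var1 / n1) ** 2 / (n1 - 1) +
                     (var2 / n2) ** 2 / (n2 - 1))
    return t, df

t, df = welch_t(0.052, 0.049, 5000, 0.040, 0.038, 5000)
alpha = 0.05 / 2  # Bonferroni: two non-default options => two comparisons
print(round(t, 2), round(df, 1), alpha)
```

The Bonferroni part is just the last line: divide your significance level by the number of pairwise comparisons you are making, so the family of tests stays honest.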

Also, some of you may have noticed that the Confidence report only provides pairwise comparisons. We will be porting over our multivariate analysis report (essentially main effects ANOVA) down the road, and look forward to your feedback.

Make your own Adaptive or standard AB tests. Sign up for a Free access today at Conductrics

Posted in Analytics, Conductrics API, Reporting, Testing and Data Science, Uncategorized | Tagged | Leave a comment

Using our jQuery Plugin for trying out new content on your web pages

I wanted to take a few minutes to show how easy it is to use the Conductrics service with our jQuery Plugin.

Getting Set Up

First, you’ll include the script file, just as you would for any jQuery plugin. Our file is called conductrics.jquery.js, which you should grab from Github and include somewhere after the script tag for jQuery itself, like so:

  • Replace the owner and apiKey codes with the ones from your Conductrics account (you can find these in the welcome email from us when you sign up for our service, or by signing into your account at
  • Also, make up an agent code for your test. Each test should have a unique code, so we can tell your tests apart. (You can use letters, numbers, and dashes in the code.)

Trying Out New Content via a Simple Show / Hide Test

Let’s say you have some new content somewhere on your page. It might be a new “contact us” form, call to action, or banner image. It doesn’t really matter.

Just to keep it simple, let’s assume it’s a new image, included in your page HTML using a normal img tag. 

<img src="new-image.jpg">

To have Conductrics show the new image to some people, and keep it hidden for others, use an ordinary jQuery selector to select the desired img tag, then call our plugin’s toggle method, like so:


Here’s what happens when the page loads in a user’s browser:

  1. The jQuery selector “finds” the image tag.
  2. The image is hidden initially while the Conductrics service is contacted.
  3. The plugin contacts the Conductrics service, which decides whether the image should be shown or hidden for this particular page view. At first, that’s just a random selection, but over time it will favor showing the image if it has proven to encourage conversion – or favor hiding it if it has proven to pull conversions down.
  4. When the plugin gets the show/hide decision from the Conductrics service, it shows the image or leaves it hidden as appropriate.

All of this happens very quickly. As far as the user is concerned, the image appears (or doesn’t appear) as a normal part of the overall page-loading process.

Of course, not just images

Because you can use any jQuery selector to indicate what you want to show or hide, you can test out just about any page content.

For instance, if you have some kind of new “callout” somewhere on the page (which might contain text, images, whatever), and it has new-callout as its ID, something like this:

<div id="new-callout">
  <h2>Check this out</h2>
  <p>Any content here...</p>
</div>

That means you can target the callout by its ID in the jQuery selector, which means testing it via show/hide would look like this:


Again, because you’re using normal jQuery selectors, your selector can use CSS class names, which is handy when the “new” content you want to test out is sprinkled around in different parts of a page.

<img src="new-banner.png" class="new-ux">
<div class="new-ux">Some new content</div>
<a href="..." class="new-ux">Some new link</a>

To show or hide all of the new content, track its effectiveness, and automatically favor the new content if it proves to actually work well for your visitors, you would use the same basic ‘toggle’ code as shown earlier with a jQuery selector that finds everything tagged with the ‘new-ux’ class, like so:


Beyond Showing and Hiding

This post kept it simple by just explaining how to show and hide content selectively.

But there’s a lot more that you can do with the plugin, especially if you know just a little bit of jQuery. It’s easy to do stuff like swap out content, decide whether to show interstitial overlays, decide whether to use animations, decide whether to use a new take on navigation, and more. Because you have the full power of jQuery available, you can create just about any kind of test you can dream up.

Take a look at the other methods available on the plugin’s Github page, and look for more blog posts here in the near future!

Posted in Code Examples, Conductrics API | Leave a comment

Bandit Optimization and Emetrics

This past week, I had the pleasure of getting to speak at Emetrics Stockholm. For those not aware, Emetrics is one of the largest digital analytics conferences, with local events in several locations throughout the year.

It was my first time in Stockholm. It is an absolutely gorgeous city. Plus, if you like Scotch, you are in luck, because Stockholm is a town with a deep Scotch shelf. I was even able to get a dram of Dallas Dhu – well, not a dram exactly, really more like 20 cl.

The conference was good clean fun, with a lot of interesting folks in attendance. Given the high level of interest in experimentation and AB-style testing, I thought it might be fun to give a presentation on the multi-armed bandit problem. I tried to keep it simple, and covered just Epsilon-Greedy and Upper Confidence Bounds (UCB), to give folks a flavor of the various approaches. I have put the slides here, so feel free to poke around.
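
To give a taste of the UCB side of the talk, here is the UCB1 selection rule in a few lines. The reward numbers are invented for the demo, and this is the standard textbook rule, not the slides’ exact code:

```python
# UCB1: pick the arm with the best mean-plus-uncertainty bonus,
# playing any untried arm first. Toy numbers for the demo.
import math

def ucb1_choose(counts, totals, t):
    # play any arm that hasn't been tried yet
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [totals[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
              for i in range(len(counts))]
    return scores.index(max(scores))

counts = [10, 10]    # plays per arm so far
totals = [6.0, 3.0]  # summed rewards per arm
print(ucb1_choose(counts, totals, sum(counts)))  # arm 0: better mean
```

The nice property is that an arm’s bonus shrinks as you play it, so arms are explored in proportion to how uncertain you still are about them.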

Shout outs go to Lars Johansson, the team from Outfox, and the Rising Media posse for putting it all together. Thanks for a great conference.



Posted in Analytics, Testing and Data Science, Travel | Tagged , , | Leave a comment

New Release of Conductrics!

We are super excited to announce the release of our updated version of Conductrics! After getting feedback from both developers and digital analysts, we have given Conductrics a complete refresh.  Here are a few of the major improvements:


The system has been optimized for speed, ensuring that Conductrics is superfast and can easily scale to meet any customer’s needs.


We have rewritten the API to make it even easier to get up and running in only a few minutes. In addition, we have opened up all of the reporting via a new report API. Of course, you can still use the Conductrics reporting UI to get insights from your projects, but with the new report API you have full flexibility in what data to display and how to display it.


Speaking of reports, we will be rolling out a set of advanced reporting tools. First up is our new Targeting Impact report. Conductrics has always maintained internal predictive models about the effectiveness of your optimization options across all targeting features and segments. Unfortunately, all of this useful information was previously accessible only to the Conductrics internal optimization processor, where it is used to make the best selections for each user. With the new Targeting Impact report, you can get access to this rich information via an intuitive graphical reporting tool. Now you can discover at a glance which optimization efforts work best for each type of user.


Best of all, we will be offering a free developer account! We want to make using Conductrics as open and as transparent as possible so that you can evaluate it fully. Go ahead, just sign up and get access to the complete Conductrics platform, which includes the following:

  1. AB and multivariate testing via API, with all of the statistical reporting you need to evaluate your tests.
  2. Automated optimization – great for when you have time-sensitive online campaigns, or if you want user targeting; let Conductrics optimize for you using its built-in machine learning methods.
  3. Complete reporting – you will get access to all of the reporting, so that business analytics team members can poke around and see all of the insights they can get from using Conductrics.

So go ahead and sign up, and if you have any questions or suggestions, just reach out to us – we would love to hear from you.


Posted in Conductrics API, Releases

Balancing Earning with Learning: Bandits and Adaptive Optimization

While AB Testing is currently the default method for optimizing online conversions, there are frameworks other than AB testing for selecting between competing optimization options.  One of the main alternative approaches is known as the Multi-Armed Bandit problem.  Solutions to these so-called bandit problems have several nice features that often make them a good choice for your optimization problems.

This post looks a little long, what am I going to learn?

You will learn how you might be able to squeeze even more from your optimization efforts by using alternatives to AB Testing.  Additionally, you should come away with a basic handle on what the multi-armed bandit problem is and how it relates to your optimization goals.

Why Bandits?

A bandit is another name for a slot machine, and online conversion optimization is in some ways analogous to the task of picking between slot machines with different payoffs.  So imagine walking into a casino where you have reason to believe that different machines have different payoffs.

If you knew beforehand which machine had the highest payoff, you would play that one exclusively to maximize your expected return from playing.  The difficulty is that you don’t know which machine is best beforehand, so you have to deduce which slot machine has the highest payoff.  To do this, you are going to have to try them all out in order to get a sense of the expected return from playing each machine.

But how to do this?  Run an AB style experiment?

You could do that, but remember: each time you play a poor performing machine, you lower the take you walk out of the casino with that night.  In order to maximize how much money you leave with, you will have to be efficient with how you collect your data.

Learn and Earn

The solutions to Bandit style problems try to account for this data acquisition cost. They do it by trying to balance using the current best guess about the performance of each option with the value of collecting more data to improve upon these guesses.

This tradeoff between using currently available information and collecting more data is often referred to as the explore/exploit problem, but some like to call it earning while learning.  You need to learn in order to figure out what works and what doesn’t, and to earn, you take advantage of what you have learned. This is what I really like about the Bandit way of looking at the problem: it highlights that collecting data has a real cost, in terms of opportunities lost.

Unlike AB or multivariate testing, which basically all use the same methodology (apologies to Fisher, Neyman, and Jeffreys), there is no single standard approach to Bandit solutions, but they all try to incorporate this notion of balancing exploration with exploitation. Technically, these are problems that try to minimize what’s known as regret.


Regret is the difference between your actual payoff and the payoff you would have collected had you played the optimal (best) options at every opportunity.
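To make that definition concrete, here is a tiny sketch of a regret calculation. The option values and play counts are hypothetical, chosen only to illustrate the bookkeeping:

```python
# Illustrative regret calculation (hypothetical values and play counts).
values = {"A": 2.00, "B": 1.00, "C": 0.50}  # true expected payoff per play
best = max(values.values())                  # the optimal option pays $2.00

# Suppose over 100 plays our policy chose A 70, B 20, and C 10 times.
pulls = {"A": 70, "B": 20, "C": 10}

actual_payoff = sum(values[k] * n for k, n in pulls.items())   # 165.0
optimal_payoff = best * sum(pulls.values())                    # 200.0
regret = optimal_payoff - actual_payoff                        # 35.0
```

Every play of a sub-optimal option adds to regret, which is why bandit methods shift selections toward the current best guess as quickly as the evidence allows.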

This can get a little confusing, so to make it clearer, let’s compare AB Testing to one of the simpler bandit algorithms.  While there are many different approaches to solving Bandits, one of the simplest is value-weighted selection, where the frequency with which each optimization option is selected is based on how much it is currently estimated to be worth. So if we have two options, ‘A’ and ‘B’, and ‘A’ has been performing better, we weight the probability of selecting ‘A’ higher than ‘B’.  We still randomly select between the two, but we give ‘A’ a better chance of being selected.
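A minimal sketch of value-weighted selection (the value estimates below are made up for illustration): each option is chosen with probability proportional to its current estimated value, so better performers get more traffic while weaker ones still get some.

```python
import random

# Current value estimates for each option (hypothetical numbers).
estimates = {"A": 2.0, "B": 1.0, "C": 0.5}

def pick_option(estimates):
    """Randomly select an option, weighted by its estimated value."""
    options = list(estimates)
    weights = [estimates[o] for o in options]
    return random.choices(options, weights=weights, k=1)[0]

# With these estimates, 'A' is selected about 57% of the time (2.0/3.5),
# 'B' about 29%, and 'C' about 14% -- every option keeps a nonzero chance.
counts = {o: 0 for o in estimates}
for _ in range(10_000):
    counts[pick_option(estimates)] += 1
```

In a real system the estimates would be updated after each observed reward, which in turn shifts the selection probabilities over time.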

First, let’s take a look at how AB Testing selects during learning compared to our Bandit method.  For our example, we consider a problem where we are trying to optimize between three possible options: ‘A’, ‘B’, and ‘C’.

In AB Testing, we see that each time we make a selection, each of the options gets the same chance to be selected.

In the Bandit approach, this isn’t the case.  The probabilities of being selected at each turn depend on the current best guess of how valuable each option is.  In the above case, Option A has the highest current estimated value, and so is getting selected more often than either option B or C; likewise, option B has a higher estimated value than option C.

Since the selection probabilities are based on the current best estimates for each option, the Bandit selection probabilities can change over time as we update our value estimates for each option.  Let’s imagine we set up an optimization project to try out the three possible options.

Unbeknownst to us at the start, let’s say that the top performing option is worth $2.00, the next best $1.00, and the worst option only $0.50. Let’s see how the Bandit’s continuous Learn and Earn balancing act looks compared to AB Testing in the graph below.

With the AB Testing approach, we try out each of the three options in equal proportions until we run our test at week 5, and then select the option with the highest value.  So we have effectively selected the best option 44% of the time over the 6 week period (33% of the time through week 5, and 100% at week 6), and the medium and lowest performing options 28% of the time each (33% of the time through week 5, and 0% at week 6).  If we then weight these selection proportions by the actual value of each option, we get an expected yield of about $1.31 per user over the 6 weeks (44%*$2 + 28%*$1 + 28%*$0.50 ≈ $1.31).

Now let’s look at the Bandit approach.  Remember, the bandit attempts to use what it knows about each option from the very beginning, so it continuously updates the probabilities that it will select each option throughout the optimization project.

In the above chart we can see that with each new week, the bandit reduces how often it selects the lower performing options and increases how often it selects the highest performing option.

So, over the 6 week period the bandit approach has selected the best option 72% of the time, the medium option 16% of the time, and the lowest performer 12% of the time.  If we weight by the average value of each option, we get an expected yield of about $1.66 per user.  In this case the bandit would have returned a 27% lift over the AB Testing approach – ($1.66-$1.31)/$1.31 – during the learning phase of the project.
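The arithmetic behind those expected yields can be checked directly. The selection shares below are the ones from the example (equal thirds for 5 of 6 weeks for AB, and 72%/16%/12% for the bandit):

```python
values = {"best": 2.00, "medium": 1.00, "worst": 0.50}

# AB Testing: equal thirds for weeks 1-5, then 100% best in week 6.
ab_share = {
    "best":   (5/6) * (1/3) + (1/6) * 1.0,  # ~44%
    "medium": (5/6) * (1/3),                # ~28%
    "worst":  (5/6) * (1/3),                # ~28%
}
ab_yield = sum(values[k] * s for k, s in ab_share.items())

# Bandit: selection shares from the example.
bandit_share = {"best": 0.72, "medium": 0.16, "worst": 0.12}
bandit_yield = sum(values[k] * s for k, s in bandit_share.items())

lift = (bandit_yield - ab_yield) / ab_yield

print(round(ab_yield, 2))      # 1.31
print(round(bandit_yield, 2))  # 1.66
print(round(lift, 2))          # 0.27
```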

The following chart shows the same idea slightly differently. Here we can see what the average value of each user will be during our optimization project.

For the first 5 weeks, the AB Test has an expected yield of $1.17 per user (33%*$2.00 + 33%*$1.00 + 33%*$0.50), and then an expected return of $2.00 once the best option has been selected.

The Bandit, on the other hand, starts at $1.17, like the AB Test, but then increases the average value continuously throughout the project. So you get more payoffs while learning.

In fact, the area under the bandit value curve and above the AB Testing value curve is the total extra payoff we get by using the bandit.  If we averaged around 1,000 visits per week, we would take in $4,666 while learning with AB Testing but $6,810 with Bandit learning – a lift of 46%.
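That area between the two curves can be sketched as a simple cumulative sum. The AB Test earns a flat $1.17 per user during the learning weeks; the bandit's weekly per-user values below are hypothetical stand-ins for the rising curve in the chart, not the article's exact figures:

```python
visits_per_week = 1000

# AB Testing: flat $1.17 per user while learning.
ab_weekly = [1.17] * 5

# Bandit: per-user value rising each week (illustrative numbers only).
bandit_weekly = [1.17, 1.30, 1.45, 1.58, 1.70]

ab_total = sum(v * visits_per_week for v in ab_weekly)
bandit_total = sum(v * visits_per_week for v in bandit_weekly)
extra = bandit_total - ab_total  # the area between the two value curves
```

The longer the bandit curve stays above the flat AB line, the bigger this incremental payoff gets, which is the whole point of earning while learning.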

Of course, the longer you extend your project, the smaller this incremental payout is as a share of the total.  This means that short-lived projects, like online campaigns, are particularly well suited for adaptive bandit methods – since they are active for only a limited time.

Why Should you Care?

There really are three main reasons to prefer Bandits over AB Testing:

  1. Earn while you Learn.  Data collection is a cost, and we ought to do it as efficiently as possible.  Using a bandit style approach lets us at least consider these costs while we run our optimization projects.
  2. Automation – the bandit approach is a natural way to formulate online optimization problems for automated selection with machine learning.  This is especially true when applying user targeting, as running correct AB tests becomes much more complicated in this situation.
  3. A changing world – Bandit approaches are useful when you have an optimization problem where the best options can change over time.  By letting the bandit method always leave some chance of selecting the poorer performing options, you give it the opportunity to ‘reconsider’ their effectiveness.  If there is evidence that an option is performing better than first thought, it will automatically be promoted and used more often. This provides a working framework for swapping out low performing options with fresh options in a continuous process.
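One simple way to always leave some chance for reconsideration is an epsilon-greedy rule (the article doesn't prescribe a specific method here, so this is just one common choice, with hypothetical estimates):

```python
import random

def epsilon_greedy(estimates, epsilon=0.1):
    """With probability epsilon, pick a random option, which keeps giving
    'poor' performers a chance to be reconsidered; otherwise exploit the
    current best guess."""
    if random.random() < epsilon:
        return random.choice(list(estimates))
    return max(estimates, key=estimates.get)

# Hypothetical running value estimates.
estimates = {"A": 2.0, "B": 1.0, "C": 0.5}

# ~90% of selections go to 'A', but B and C still get occasional traffic,
# so their estimates keep being updated if the world changes under us.
choice = epsilon_greedy(estimates)
```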

Of course, there is no guarantee that using a Bandit method will perform better than running an AB Test. However, there is also no guarantee that using either a Bandit or AB Test selection approach is necessarily going to be better than just guessing – but on average it should be.

To wrap up, Bandits are a continuous learning method for selecting between competing options. They balance selecting what currently seems to work best against the potential benefits of collecting more information to improve future results.

Make your own adaptive or standard AB tests. Sign up for free access today at Conductrics.

Posted in Testing and Data Science