What is it?
If you are already familiar with machine learning in general, skip ahead to “Human input”.
Machine learning is a field of software development that focuses on algorithms which predict certain variables from historic data.
It is widely used in the advertising and media buying industries to predict how likely a user profile is to perform an action, e.g. click a banner, make a purchase, download an app or fill out a form on a website.
Predicting the future
Companies develop complex software algorithms which analyse historic data and determine a pattern from it that can then be used to make the prediction.
As a simple example, if all the users who went to the website and bought a toaster after seeing a banner in a toaster campaign came from the London region, we can predict that serving more banners to Londoners will very likely yield more toaster purchases.
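The toaster example can be sketched in a few lines of Python. This is only an illustration, not how any particular ad platform works: the `events` list, its region names and the `purchase_rate_by_region` helper are all made up for the purpose.

```python
from collections import defaultdict

# Hypothetical historic log: one (region, bought_toaster) pair per user
# who saw the banner. The values are invented for illustration.
events = [
    ("London", True),
    ("London", True),
    ("London", False),
    ("Manchester", False),
    ("Leeds", False),
    ("Manchester", False),
]


def purchase_rate_by_region(events):
    """Return {region: purchases / banner views} from historic events."""
    views = defaultdict(int)
    purchases = defaultdict(int)
    for region, bought in events:
        views[region] += 1
        if bought:
            purchases[region] += 1
    return {region: purchases[region] / views[region] for region in views}


rates = purchase_rate_by_region(events)
# The region with the highest historic purchase rate is the "pattern".
best_region = max(rates, key=rates.get)
print(best_region, rates[best_region])
```

With so few events the result should not be trusted yet, but the mechanics are the same at any scale: count what happened, then serve more banners where the rate was highest.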
Such algorithms use the following data for their analysis:
- Geographic regions
- Websites where the banners are viewed
- Category of the website, e.g. travel or finance
- Time of day
- Day of week
- Platform, e.g. Windows, iPhone, Android or Apple Mac
- Browser (Chrome, Firefox, Safari)
- Search behaviour (e.g. users searching for “travel” related words)
- Banners clicked (e.g. users clicking “travel” related campaign banners)
- Demographic data (income bracket, children, pets, etc)*
* – Demographic data is either derived, i.e. assumed from browsing behaviour (a user regularly visiting pet supply websites is very likely to have a pet), or provided by companies who have confirmed this data through surveys.
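As a rough sketch of how the data points listed above might be handed to an algorithm, each banner view can be stored as one record and flattened into simple "name=value" flags that are easy to count. The `Impression` class, its field names and the example values are assumptions made for illustration, not a real ad-platform schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Impression:
    """One banner view, described by some of the data points listed above."""
    region: str
    website: str
    category: str
    hour_of_day: int
    day_of_week: str
    platform: str
    browser: str


def to_features(impression):
    """Turn each field into a 'name=value' flag the algorithm can count."""
    return {f"{name}={value}" for name, value in asdict(impression).items()}


# Hypothetical example record.
example = Impression(
    region="London",
    website="example-travel-site.test",
    category="travel",
    hour_of_day=20,
    day_of_week="Saturday",
    platform="Android",
    browser="Chrome",
)
features = to_features(example)
print(sorted(features))
```

Flags like these let the algorithm ask simple questions of the historic data, such as how often "region=London" appears among buyers compared with non-buyers.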
Anonymity
Most data out there these days is supplied anonymously. That is, there is no name, address, email, phone number attached to it, or any data that can lead to the real person.
PII (Personally Identifiable Information) has been a fairly hot topic for a while and is often misunderstood, misinterpreted and misused. Put simply, data is PII sensitive if it could be used to lead back to the real human.
IP addresses are sometimes considered PII sensitive, but the only way you could get to the human behind the address is by getting the data from the ISP who has it connected to a physical address.
Frequency
Pattern recognition is all about numbers. The more events the algorithm can look at, the more confident its prediction will be.
For example, if one user from London buys a toaster, we cannot assume another one will.
If 1,000 out of 10,000 do, we can.
Confidence in the pattern grows with the number of matching events, but only relative to the total number of events observed.
E.g. 10,000 out of 100,000 is a strong pattern, 10,000 out of 10,000,000,000,000 is not.
So all machine learning algorithms need a minimum number of events to predict confidently.
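One common way to put a number on "how confident" is a confidence interval around the observed rate. The sketch below uses the Wilson score interval, a standard statistical formula chosen here purely for illustration. It is not necessarily what any given ad algorithm uses, and the `wilson_interval` helper is a made-up name.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a success rate.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if total == 0:
        return (0.0, 1.0)  # no events: the rate could be anything
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# One buyer out of one user: the interval is very wide, so no real pattern.
small_low, small_high = wilson_interval(1, 1)
# 1,000 buyers out of 10,000 users: the interval is narrow around 10%.
large_low, large_high = wilson_interval(1000, 10000)

print(small_low, small_high)
print(large_low, large_high)
```

The more events there are, the narrower the interval becomes, which is one way of expressing the minimum number of events an algorithm needs before its prediction is worth acting on.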
Human input
Although it is not that common these days, machine learning works best when it is not left entirely to its own devices, but is fed small “training” or “learning” sets that assist the software in making its assumptions.
This is especially useful in language related and semantic machine learning, where the complexity is so vast and so “non digital/binary”, that making accurate predictions with software alone is difficult and very error prone.
So what is a training set?
A training set is a set of historic data with the prediction provided by a human.
By supplying enough of these, the software algorithm can find that crucial pattern, so that next time it can confidently make the prediction unaided.
Training sets can give the algorithm a head start, so that it starts predicting correctly faster, “straight out of the stables”.
They can also help course-correct the algorithm and speed up the detection of patterns that are obvious to a human but take a while for the software to conclude.
Don’t forget, an algorithm does not take the vast resources of the internet into account or some enormous “computer hive mind” database. It just uses some fairly simple (for the computer) mathematical formulas to calculate probability. It does not have all the enormous learned memories humans bring to the table.
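To make the idea of a training set concrete, here is a minimal sketch of a text classifier that learns from a handful of human-labelled phrases. It uses naive Bayes with add-one smoothing, picked only because it is a simple probability formula. The phrases, labels and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training set: historic text plus the label a human gave it.
training_set = [
    ("cheap flights to rome", "travel"),
    ("hotel deals in paris", "travel"),
    ("weekend city break flights", "travel"),
    ("best savings account rates", "finance"),
    ("compare mortgage rates", "finance"),
    ("low interest credit card", "finance"),
]


def train(examples):
    """Count how often each word appears under each human-supplied label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest (log) probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            seen = word_counts[label][word]
            score += math.log((seen + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(training_set)
print(predict(model, "flights and hotel in rome"))
print(predict(model, "credit card interest rates"))
```

The algorithm knows nothing about travel or finance beyond these six labelled phrases. Everything it predicts comes from counting words in the training set, which is why the quality of the human input matters so much.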
Closing words
Machine learning – like teaching children – is best done with human/adult assistance.
It can work without it, but it works even better with it.




