
21st Century AI

A blog about achieving meaningful Artificial Intelligence


Learning Without Supervision (Part 1)

Tuesday, August 10th, 2010

When I was a kid I read about Summerhill School (the book seems to be out of print but see here and here) and I was extraordinarily impressed. I think that, to some extent, a lot of the trouble I had in school came from wanting the stodgy elementary and high schools I attended to be more like this wonderful Summerhill, where students were simply encouraged to learn on their own, without supervision if they chose. I’ve never been much of a fan of rules, so perhaps it shouldn’t be much of a surprise that when I began my doctoral research in AI I was more interested in ‘unsupervised machine learning’ than in ‘case based’ (rules) reasoning.


Cookie Monster explains Unsupervised Machine Learning and Clustering

Unsupervised Machine Learning (specifically ‘clustering’, which is the form of machine learning that interests me), to quote Wikipedia, “is a class of problems in which one seeks to determine how the data are organized.” That’s a pretty succinct description of the subject, but there isn’t much more information on the page. Unsupervised Machine Learning (okay, I’m getting tired of typing this mouthful, so henceforth it will be ‘UML’) first appeared in the 1980s and was an outgrowth of psychology. At its core UML is not that different from the old Sesame Street song (“One of these things isn’t like the other…”).

Let’s look at Cookie Monster’s data set: he has four plates; three of the plates contain two cookies each and one plate contains three cookies. Technically, we would say that Cookie Monster’s data can be clustered using certain object attributes. In this case, the only significant object attribute for clustering is the number of cookies. So Cookie Monster’s data set of four objects (plates) and nine cookies would (should) be divided into two clusters: one cluster containing three plates, each with two cookies, and one cluster containing one plate with three cookies.
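
For readers who like to see things in code, here is a minimal Python sketch of Cookie Monster's data set. The plate names and the function cluster_by_attribute are made up for illustration, and the sketch simply groups plates that have exactly the same cookie count. That shortcut works only because this toy data has a single attribute with identical values; it is not meant to be the clustering method that a later post will describe.

```python
from collections import defaultdict

# Hypothetical encoding of Cookie Monster's four plates:
# plate name -> number of cookies (the only attribute we look at).
plates = {"plate_1": 2, "plate_2": 2, "plate_3": 3, "plate_4": 2}


def cluster_by_attribute(objects):
    """Group objects whose single attribute value is identical.

    This is exact-match grouping, the simplest possible stand-in for
    clustering; real data usually needs a notion of similarity instead.
    """
    clusters = defaultdict(list)
    for name, value in objects.items():
        clusters[value].append(name)
    return dict(clusters)


clusters = cluster_by_attribute(plates)

# Print one line per cluster, ordered by cookie count.
for cookies, members in sorted(clusters.items()):
    print(f"{cookies} cookies: {members}")
```

Running this puts the three two-cookie plates in one cluster and the lone three-cookie plate in the other, which is exactly the "one of these things" answer from the song.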

Obviously, this is an extremely simple problem involving just four objects and only one attribute that is found in all objects. But, hopefully, you can see how this is a very powerful concept. Instead of separating plates of cookies into clusters (or groups), imagine separating battlefields into similar tactical situations (my doctoral research), or 3D topographical data into likely places to drill for oil or find diamonds, or situations in a baseball game into those when it’s a good idea to send in a substitute runner, or…

Okay, that’s the general idea for UML; in the next episode we’ll push on and explain how we will accomplish clustering. Warning: there will be a little math involved.