Archive for the ‘Making’ Category

Memory (1)
Tuesday, July 24th, 2018

An extremely vivid memory from childhood—probably about seven or eight years old. Waking up and coming downstairs with an absolute, unshakable conviction that what I wanted to do with all of my spare time for the next few months was to build near-full-sized fairground rides in our back garden. I don’t know where this came from; prior to that point I had no especial interest in fairground rides, beyond the annual visit to the Goose Fair. I wanted to go into the garage immediately and start measuring pieces of wood, making designs, etc. It took my parents a couple of hours to persuade me that doing this was utterly impractical, against my deep, passionate protestations. Truly, I cannot think of anything before or since that I wanted to do with such utter conviction.

The Extensional Revolution
Wednesday, July 16th, 2014

We are on the threshold of an extensional revolution.
Philosophers draw a distinction between two ways of describing collections of objects. Intensional descriptions give some abstract definition, whereas extensional descriptions list all examples. For example, consider the difference between “the set of all polyhedra that can be made by joining together a number of identical, regular, convex polygons with the same number of polygons meeting at each vertex” (intensional), and “the set {tetrahedron, cube, octahedron, dodecahedron, icosahedron}” (extensional).
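To put the distinction in programming terms, here is a toy Python sketch. It uses a much simpler set than the polyhedra (the even numbers below twenty), purely because that one is easy to write down both ways:

```python
# Intensional: an abstract rule that membership is tested against.
def is_small_even(n):
    """Membership test for the set of even numbers below 20."""
    return 0 <= n < 20 and n % 2 == 0

# Extensional: the members are simply listed.
SMALL_EVENS = {0, 2, 4, 6, 8, 10, 12, 14, 16, 18}

# Both descriptions pick out exactly the same set.
assert {n for n in range(20) if is_small_even(n)} == SMALL_EVENS
```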
Despite its claims to be (amongst other things) the science of data, computer science has been very intensional in its thinking. Programs are treated as realisations of descriptive specifications, satisfying certain mathematically-described properties.
As more data becomes available, we can start to think about doing things in an extensional way. The combination of approximate matching and the availability of large numbers of examples is a very powerful paradigm for doing computing. We have started to see this already in some areas. Machine translation of natural language is a great example. For years, translation was dominated by attempts to produce ever more complex models of language, with the idea that eventually these models would be able to represent the translation process. More recently, the dominant approach has been “statistical machine translation”, where correlations between large-scale translated corpora are used to make decisions about how a particular phrase is to be translated. Instead of feeding the phrase to be translated through some engine that breaks it down and translates it via some complex human-built model, a large number of approximations and analogies are found in a corpus and the most dominant comparison is used. (I oversimplify, of course).
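As a toy illustration of the extensional idea only (nothing like a real statistical translation system—the corpus and the similarity measure here are invented for the purpose), a phrase can be “translated” by finding the closest stored example in a small parallel corpus and reusing its translation:

```python
import difflib

# A tiny, invented parallel corpus: source phrases paired with known translations.
PARALLEL_CORPUS = {
    "good morning": "bonjour",
    "good evening": "bonsoir",
    "thank you very much": "merci beaucoup",
    "where is the station": "où est la gare",
}

def translate_by_example(phrase):
    """Reuse the translation of the most similar stored example."""
    best_source = max(
        PARALLEL_CORPUS,
        key=lambda source: difflib.SequenceMatcher(None, phrase.lower(), source).ratio(),
    )
    return PARALLEL_CORPUS[best_source]

print(translate_by_example("good mornin"))  # -> bonjour (closest stored example)
```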
More simply, we can see how a task like spellchecking can be carried out by sheer force of data. If I am vacillating between two possible spellings of a word, I just put them both into Google and see which comes out with the most hits.
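In code, the same trick can be played against any sufficiently large body of text rather than Google; the corpus file below is just a stand-in for whatever text collection is to hand:

```python
import re
from collections import Counter

def pick_spelling(candidates, corpus_path="corpus.txt"):
    """Pick the candidate spelling that occurs most often in a large text corpus.

    corpus_path is a placeholder: any sufficiently large plain-text file will do.
    """
    with open(corpus_path, encoding="utf-8") as f:
        counts = Counter(re.findall(r"[a-z]+", f.read().lower()))
    return max(candidates, key=lambda word: counts[word.lower()])

# e.g. pick_spelling(["accommodate", "accomodate"]) returns whichever
# spelling the corpus actually uses more often.
```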
Once you start thinking extensionally, different approaches to complex problems start appearing. Could visual recognition problems be solved not by trying to find the features within the image that are relevant, but by finding all the images from a vast collection (like Flickr) that approximately match the target, and then processing the metadata? Could a problem like robot navigation or the self-driving car be solved by taking a vast collection of human-guided trajectories and just picking the closest one second-by-second (perhaps this corpus could be gained from a game, or from monitoring a lot of car journeys)? Can we turn mathematical problems from manipulations of definitions into investigations driven by artificially created data (at least for a first cut)?
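For the trajectory idea, the extensional core is nothing more than a nearest-neighbour lookup over recorded examples. The state representation and the recorded data below are invented purely to show the shape of it:

```python
import math

# An invented corpus: each recorded state is (speed, heading, distance_to_obstacle),
# paired with the control action a human took in that state.
RECORDED_EXAMPLES = [
    ((10.0, 0.00, 50.0), "straight"),
    ((10.0, 0.10, 12.0), "steer_left"),
    ((8.0, -0.05, 15.0), "steer_right"),
    ((3.0, 0.00, 2.0), "brake"),
]

def closest_action(state):
    """Second-by-second control: copy the action from the nearest recorded example."""
    _, action = min(
        RECORDED_EXAMPLES,
        key=lambda example: math.dist(state, example[0]),
    )
    return action

print(closest_action((9.5, 0.08, 13.0)))  # -> steer_left
```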
The possibilities appear endless.