"Real Artists Ship"

Colin Johnson’s blog


Archive for the ‘UniKentComp’ Category

Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.

Wednesday, April 26th, 2017

Here’s something interesting. It is common for people in entrepreneurship and startup culture to fetishise failure—“you can’t be a proper entrepreneur until you’ve risked enough to have had a couple of failed businesses”. There’s some justification for this—new business ventures need to try new things, and it is difficult to predict in advance whether they will work. Nonetheless, it is not an unproblematic stance—I have written elsewhere about how this failure culture makes problematic assumptions about people’s financial and personal ability to fail without disastrous consequences.

But, the interesting point is this. No-one ever talks like this about jobs, despite the reality that a lot of people are going to try out a number of careers before finding the ideal one, or simply switch from career to career as the work landscape changes around them during their lifetime. In years of talking to students about their careers, I’ve never come across students adopting this “failure culture” about employeeship. Why is it almost compulsory for a wannabe entrepreneur to say that, try as they might, they’ll probably fail with their first couple of business ventures; yet, it is deep defeatism to say “I’m going into this career, but I’ll probably fail but it’ll be a learning experience which’ll make me better in my next career.”?

Contradiction in Law

Saturday, February 11th, 2017

Why aren’t more legal-regulatory systems in conflict? A typical legal decision involves a number of different legal, contractual, and regulatory systems, each of which consists of thousands of statements of law and precedents, the latter only fuzzily fitting the current situation, with little meta-law to describe how these different systems and statements interact. Why, therefore, is it very rarely, if at all, that court cases and other legal decisions end up with a throwing up of hands and the judgement-makers saying “this says this, this says this, they contradict, therefore we cannot come to a well-defined decision”? Somehow, we avoid this situation—decisions are reached fairly definitively, albeit sometimes controversially. I cannot imagine that people framing laws and regulations have a sufficiently wide knowledge of the entire system to enable them to add decisions without contradiction. Perhaps something else is happening; the “frames” (in the sense of the frame problem in AI) are sufficiently constrained and non-interacting that it is possible to make statements without running the risk of contradiction elsewhere.

If we could understand this, could we learn something useful about how to build complex software systems?

The Fallacy of Formal Representations

Friday, September 9th, 2016

I went to an interesting talk by Jens Krinke earlier this week at UCL (the video will eventually be on that page). The talk was about work by him and his colleagues on observation-based program slicing. The general idea of program slicing is to take a variable value (or, indeed any state description) at a particular point in a program, and remove parts of the program that could not affect that particular value. This is useful, e.g. for debugging code—it allows you to look at just those statements that are influential on a statement that is outputting an undesirable value—and for other applications such as investigating how closely-coupled code is, helping to split code into meaningful sub-systems, and code specialisation.

The typical methods used in slicing are to use some formal model of dependencies in a language to eliminate statements. A digraph of dependencies is built, and paths that don’t eventually lead to the node of interest are eliminated. This has had some successes, but as Jens pointed out in his talk, progress on this has largely stalled for the last decade. The formal models of dependency that we currently have only allow us to discover certain kinds of dependency, and also using a slicer on a particular program needs a particular model of the language’s semantics to be available. This latter point is particularly salient in the contemporary computing environment, where “programs” are typically built up from a number of cooperating systems, each of which might be written in a different language or framework. In order to slice the whole system, a consistent, multi-language framework would need to be available.

As a contrast to this, he proposed an empirical approach. Rather than taking the basic unit as being a “statement” in the language, take it as a line of code; in most languages these are largely co-incident. Then, work through the program, deleting lines one-by-one, recompiling, and checking whether the elimination of that line makes a difference in practice to the output on a large, comprehensive set of inputs (this over-simplifies the process of creating that input test set, as programs can be complex entities where producing a thorough set of input examples can be difficult, as sometimes a very specific set of inputs is needed to generate a specific behaviour later in the execution; nonetheless, techniques exist for building such sets). This process is repeated until a fix point is found—i.e. none of the eliminations in the current round made a difference to the output behaviour for that specific input set. Therefore, this can be applied to a wide variety of different languages; there is no dependency on a model of the language semantics, all that is needed is access to the source code and a compiler. This enables the use of this on many different kinds of computer systems. For example, in the talk, an example of using it to slice a program in a graphics-description language was given, asking the question “what parts of the code are used in producing this sub-section of the diagram?”.
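The delete-and-observe loop described above can be sketched in a few lines. This is only a toy illustration, not the tool presented in the talk: the "program" is a list of Python source lines, "recompiling" is a call to `exec`, and the names `observe`, `slice_by_deletion` and `program` are all made up for this example. It deletes one line at a time, as in the description above, and repeats until a whole round removes nothing.

```python
def observe(lines, inputs, criterion):
    """Run the candidate program on every input and collect the value of
    the variable named by `criterion`. Returns None if any run fails."""
    results = []
    for x in inputs:
        env = {"x": x}  # the single input variable is assumed to be called x
        try:
            exec("\n".join(lines), {}, env)
            results.append(env[criterion])
        except Exception:
            return None  # does not compile or crashes: reject this candidate
    return results


def slice_by_deletion(lines, inputs, criterion):
    """Delete lines one by one, keeping a deletion only when the observed
    output is unchanged on every input; repeat until a fixed point."""
    expected = observe(lines, inputs, criterion)
    if expected is None:
        raise ValueError("the original program must run on all inputs")
    current = list(lines)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if observe(candidate, inputs, criterion) == expected:
                current = candidate  # this line made no observable difference
                changed = True
            else:
                i += 1  # this line matters, keep it and move on
    return current


# Hypothetical five-line program; we slice with respect to `total`.
program = [
    "a = x + 1",
    "b = x * 2",
    "c = a * a",
    "d = b - 3",
    "total = c + 1",
]
sliced = slice_by_deletion(program, inputs=[0, 1, 2, 5], criterion="total")
print(sliced)
```

In this toy the line defining `b` cannot be removed in the first round, because the line defining `d` still uses it and the candidate crashes; once `d` has gone, the second round removes `b` as well. That is why the process has to be iterated to a fixed point.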

Of course, there is a cost to pay for this. That cost is the lack of formal guarantee of correctness across the input space. By using only a sample of the inputs, there is a possibility that some behaviour was missed. By contrast, methods that work with a formal model of dependencies make a conservative guarantee that regardless of inputs, the slice will be correct. Clearly, this is better. But, there are limits to what can be achieved using those methods too; by using a model that only allows the elimination of a statement if it is guaranteed under that model to never have a dependency, it ignores two situations. The first of these is that the model is not powerful enough to recognise a particular dependency, even though it is formally true (this kind of thing crops up all over the place; I remember getting frustrated with the Java compiler, which used to complain that a particular variable value “might not have been initialised” when it was completely obvious that it must have been; e.g. in the situation where a variable was declared before an if statement and then given a value in both possible branches, and then used after that statement). The second—and it depends on the application as to whether this matters—is that perhaps a formal dependency might crop up so infrequently as to not matter in practice. By taking an empirical approach, we observe programs as they are being run, rather than how they could be run, and perhaps therefore find a more rapid route to e.g. bug-finding.

In the question session after the talk, one member of the audience (sorry, didn’t notice who it was) declared that they found this approach “depressing”. Not, “wrong” (though other people may have thought that). The source of the depression, I would contend, is what I will call the fallacy of formal representations. There is a sense that permeates computer science that because we have an underlying formal representation for our topic of study, we ought to be doing nothing other than producing tools and techniques that work on that formal representation. Empirical techniques are both dangerous—they produce results that cannot be guaranteed, mathematically, to hold—and a waste of time—we ought to be spending our time producing better techniques that formally analyse the underlying representation, and that it is a waste of time to piss around with empirical techniques, because eventually they will be supplanted by formal techniques.

I would disagree with this. “Eventually” is a long time, and some areas have just stalled—for want of better models, or just in terms of practical application to programs/systems of a meaningful size. There is a lot of code that doesn’t require the level of guarantee that the formal techniques provide, and we are holding ourselves back as a useful discipline if we focus purely on techniques that are appropriate for safety-critical systems, and dismiss techniques that are appropriate for, say, the vast majority of the million+ apps in the app store.

Other areas of study—let’s call them “science”—are not held up by the same mental blockage. Biology and physics, for example, don’t throw their hands up in the air and say “nothing can be done”, “we’ll never really understand this”, just because there isn’t an underlying, complete set of scientific laws available a priori. Instead, a primary subject of study in those areas is the discovery of those laws, or at least useful approximations thereto. Indeed, the development of empirical techniques to discover new things about the phenomena under study is an important part of these subject areas, to the extent that Nobel Prizes have been won (e.g. 1977; 2003; 1979; 2012; 2005) for the development of various measurement and observation techniques to get a better insight into physical or biological phenomena.

We should be taking—alongside the more formal approaches—an attitude similar to this in computer science. Yes, many times we can gain a lot by looking at the underlying formal representations that produce e.g. program behaviour. But in many cases, we would be better served by taking these behaviours as data and applying the increasingly powerful data science techniques that we have to develop an understanding of them. We are good at advocating the use of data science in other areas of study; less good at taking those techniques and applying them to our own area. I would contend that the fallacy of formal representations is exactly the reason behind this; because we have access to that underlying level, we cannot shake the conviction that, with sufficient thought and care, we can extract the information that we need from ratiocination about that material, rather than “resorting” to looking at the resulting behaviour in an empirical way. This also prevents the development of good intermediate techniques, e.g. those that use ideas such as interval arithmetic and qualitative reasoning to analyse systems.

Mathematics has a similar problem. We are accustomed to working with proofs—and rightly so, these are the bedrock of what makes mathematics mathematics—and also with informal, sketched examples in textbooks and talks. But, we lack an intermediate level of “data rich mathematics”, which starts from formal definitions, and uses them to produce lots of examples of the objects/processes in question, to be subsequently analysed empirically, in a data-rich way, and then used as the inspiration for future proofs, conjectures and counterexamples. We have failed, again due to the fallacy of formal representations, to develop a good experimental methodology for mathematics.

It is interesting to wonder why empirical techniques are so successful in the natural sciences, yet are treated with at best a feeling of depressed compromise, at worst complete disdain, in computer science. One issue seems to be the brittleness of computer systems. We resort (ha!) to formal techniques because there is a feeling that “things could slip through the net” if we use empirical techniques. This seems to be much less the case in, say, biological sciences. Biologists will, for example, be confident that they have mapped out a signalling pathway fairly accurately having done experiments on, say, a few hundred cells. Engineers will feel that they understand the behaviour of some material having carefully analysed a few dozen samples. There isn’t the same worry that, for example, there is some critical tight temperature range, environmental condition, or similar, that could cause the whole system to behave in a radically different way. Something about programs feels much more brittle; you just need the right (wrong!) state to be reached for the whole system to change its behaviour. This is the blessing and the curse of computer programming; you can do anything, but you can also do anything, perhaps by accident. A state that is more-or-less the same as another state can be transformed into something radically different by a single line of code, which might leave the first state untouched (think about a divide-by-zero error).

Perhaps, then, the fault is with language design, or programming practice. We are stuck with practices from an era where every line of code mattered (in memory cost or execution time), so we feel the need to write very tight, brittle code. Could we redesign languages so that they don’t have this brittleness, thus obviating the need for the formal analysis methods that are there primarily to capture the behaviours that don’t occur with “typical” inputs? What if we could be confident—even, perhaps, mathematically sure—that there were no weird pathological routes through code? Alternatively, what if throwing more code at a problem actually made us more confident of it working well; rather than having tight single paths through code, have the same “behaviour” carried out and checked by a large number of concurrent processes that interact in a way that doesn’t have the dependencies of traditional concurrency models (when was the last time that a biosystem deadlocked, or a piece of particle physics, for that matter?). What if each time we added a new piece of code to a system, we felt that we were adding something of value that interacted in only a positive way with the remainder of the code, rather than fearing that we have opened up some kind of potential interaction or dependency that will cause the system to fail? What if a million lines of code couldn’t be wrong?

Design Failures (1)

Thursday, September 8th, 2016

Here is an interesting design failure. A year or two ago, the entry gates at my local stations carried a message from a charity with the slogan “no-one in Kent should face cancer alone”. A good message, and basically well thought out. The problem is that it was printed across the two sides of the entry gates, which open when you put your ticket in: as a result, one side of the gate says “face cancer alone”, and this part of the message is separated out when the gates open:

"face cancer alone"

Interestingly, someone clearly noticed this. When a repeat of the campaign ran this year, with more-or-less the same message, it had been modified so that one side of the gate now says “don’t face cancer alone”:

"don't face cancer alone"

There’s a design principle in here somewhere, along the lines of thinking through the lifetime of a user of the system, not just relying on a static snapshot of the design to envision what it is like.

There’s no F in Strategy (and usually doesn’t need to be)

Thursday, February 11th, 2016

A while ago I read a little article whilst doing a management course that was very influential on me (I’ll find the reference and add it here soon). It argued that the process of building a team—in the strict sense a group of people who could really work closely and robustly together on a complex problem—was difficult, time-consuming and emotionally fraught, and that actually, for most business processes, there isn’t really any need to build a team as such. Instead, just a decently managed group of people with a well-defined goal was all that was needed for most activities. Indeed, this goes further; because of the stress and strain needed to build a well-functioning team in the strong sense of the word, it is really unproductive to do this, and risks fomenting a “team-building fatigue” in people.

I’m wondering if the same is true for the idea of strategy. Strategy is a really important idea in organisations, and the idea of strategic change is really important when a real transformation needs to be made. But, I worry that the constant demands to produce “strategies” of all sorts, at all levels of organisations, run the danger of causing “strategy fatigue” too. We have to produce School strategies, Faculty strategies, University strategies, all divided un-neatly into research, undergraduate, and postgraduate, and then personal research strategies, and Uncle Tom Cobleigh and all strategies. Really, we ought to be keeping the word and concepts around “strategy” for when it really matters; describing some pissant objective to increase the proportion of one category of students from 14.1% to 15% isn’t a strategy, it’s almost a rounding error. We really need to retain the term—and the activity—for when it really matters.

How? (1)

Sunday, July 12th, 2015

If we are writing a program that takes four numbers, a, b, c, and d, and adds the four of them together, how do we know that (for example) writing the expression a+b is a good program fragment to write? If we could understand that sort of question in general, we would be a long way towards building a scalable system that could write code automatically.

Seeming more Specialised than you Actually Are

Monday, June 29th, 2015

Sometimes it is important to present yourself as more specialised than you actually are. This can be true for individuals and for businesses. Take, for example, the following apparently successful businesses:

Woaah there! What’s happening here? Surely any decent web design company can provide a website for a doctor’s surgery? The specific company might provide a tiny little bit more knowledge, but surely the knowledge required to write a decent website is around 99 percent of the knowledge required to write a doctor’s surgery website. Surely, handling payments from parents for school activities is just the same as, well, umm, handling payments, and there are plenty of companies that do that perfectly well.

This, of course, misses the point. The potential customers don’t know that. They are likely to trust the over-specialised presentation rather than the generic one. Indeed, the generic one might sound a little bit shady, evasive or amateurish: “What kind of web sites do you make?”, “Well, all kinds really.”, “Yes, but what are you really good at?”, “Well, it doesn’t really matter, websites are all basically the same once you get into the code.”. Contrast that with “we make websites for doctors.” Simples, innit.

So that’s my business startup advice. Find an area that uses your skills, find some specialised application of those skills, then market the hell out of your skills in that specific area. You will know that your skills are transferrable—but, your potential customers won’t, and they will trust you more as a result.

I’ve noticed the same with trying to build academic collaborations. Saying “we do optimisation and data science and visualisation and all that stuff” doesn’t really cut it. I’ve had much more success starting with a specific observation—we can provide a way of grouping your data into similar clusters, for example—than trying to describe the full range of what contemporary data science techniques can do.

Similarly with courses. Universities have done well out of providing “MBA in Marketing for XX” or whatever, when the vast majority of the course might be generic marketing skills. Again, the point here is more one of trust than one of content.

Language (1)

Monday, October 27th, 2014

When we are learning creative writing at school, we learn that it is important to use a wide variety of terms to refer to the same thing. To refer to something over and over again using the same word is seen as “boring” and something to be avoided.

It is easy to think that this is a good rule for writing in general. However, in areas where precision is required—technical and scientific writing, policy documents, regulations—it is the wrong thing to be doing. Instead, we need to be very precise about what we are saying, and using different terminology for the sake of making the writing more “interesting” is likely to damn the future reader of the document to hours of careful analysis of whether you meant two different-but-overlapping words to refer to the same thing or not.

Significant (1)

Thursday, October 2nd, 2014

I really really really wish we hadn’t settled on the term “statistically significant”. There’s just too much temptation to elide from “these results show that situation X is statistically significantly different to situation Y” to “the difference between X and Y is significant” to “the difference between X and Y is important”.

Statistical significance is about deciding whether it is reasonable to say that the difference between two things is not due to sampling error. Two things can be statistically significantly different and yet the magnitude of the difference can be of no “significance” (in the day-to-day sense) to the situation at hand.
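A small worked example may make the gap concrete. The numbers here are invented purely for illustration: two groups of a million measurements each, with a common standard deviation of 15 and means of 100.1 and 100.0. The sketch uses a textbook two-sample z-test with a known standard deviation, and the helper name `two_sample_z` is made up for this post.

```python
import math


def two_sample_z(mean_x, mean_y, sd, n):
    """Two-sided z-test for two equal-sized groups that share a known
    standard deviation. Returns (z, p_value, effect_size)."""
    standard_error = sd * math.sqrt(2.0 / n)
    z = (mean_x - mean_y) / standard_error
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Standardised difference (Cohen's d): how big the gap is in units of sd.
    effect_size = abs(mean_x - mean_y) / sd
    return z, p_value, effect_size


# Invented figures: a 0.1-point gap on a scale with sd 15, a million per group.
z, p, d = two_sample_z(mean_x=100.1, mean_y=100.0, sd=15.0, n=1_000_000)
print(f"z = {z:.2f}, p = {p:.1e}, standardised difference = {d:.4f}")
```

With samples this large the p-value is tiny, so the two groups are very clearly distinguishable, yet the standardised difference is under one hundredth of a standard deviation. Whether a gap of that size matters is a question the test cannot answer.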

We really should have gone for a term like “robustly distinguishable” or something that doesn’t convey the idea that the difference is important or large in magnitude.

Life After Programming

Saturday, September 20th, 2014

What will eventually replace programming? As computing technology gets more advanced, will there be a point at which (most) tasks that are currently carried out by programming get carried out by some other method? I can think of two inter-related ideas.

The first is that more-and-more of the tasks that we currently do by programming get done by some kind of machine learning, some kind of abstraction from examples. We are already beginning to see this. Take, for example, the FlashFill feature in Microsoft Excel. This is a system where you highlight a number of columns, and then fill in a further column with examples of the calculation/transformation that you want to see achieved. As you fill in examples, a machine learning algorithm works behind the scene to learn a macro that matches the examples, and fills in the remaining columns automatically. If there are still errors, you can keep on feeding it more examples until it works. What is the analogy of this in other areas (e.g. database report generation)?
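To give a flavour of "abstraction from examples", here is a deliberately tiny sketch that searches for a short pipeline of string operations consistent with a couple of input/output pairs. It makes no claim about how FlashFill itself works; the set of primitives, the search strategy and the names `PRIMITIVES`, `run` and `learn` are all assumptions chosen for illustration.

```python
from itertools import product

# A hand-picked, hypothetical vocabulary of string transformations.
PRIMITIVES = {
    "upper": str.upper,
    "lower": str.lower,
    "first_word": lambda s: s.split()[0],
    "last_word": lambda s: s.split()[-1],
    "initial": lambda s: s[0],
}


def run(pipeline, text):
    """Apply the named primitives to the text, left to right."""
    for name in pipeline:
        text = PRIMITIVES[name](text)
    return text


def learn(examples, max_length=2):
    """Return the first pipeline, shortest first, that reproduces every
    (input, output) example, or None if the search space has no match."""
    for length in range(1, max_length + 1):
        for pipeline in product(PRIMITIVES, repeat=length):
            try:
                if all(run(pipeline, src) == dst for src, dst in examples):
                    return pipeline
            except IndexError:
                continue  # a primitive hit an empty string: not a match
    return None


# Two filled-in rows play the role of the user's examples.
examples = [("Ada Lovelace", "LOVELACE"), ("alan turing", "TURING")]
pipeline = learn(examples)

# The learned pipeline is then applied to the rows the user has not filled in.
print(pipeline, run(pipeline, "grace hopper"))
```

If the result on a new row is wrong, the remedy is the one described above: add that row as a further example and run `learn` again, which rules out the pipelines that disagree with it.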

The second is that code generation becomes so good that we don’t program any more, we just give specifications, test cases etc. and the programs “write themselves”. We are unlikely to see this happen wholesale. But, it should be possible using current technologies to create an additional kind of tab in an IDE—not one that contains user-written code, but one that contains examples, and the code to realise them gets generated behind the scenes with machine learning and data mining from vast code-bases.

A lot of work in machine learning of programs has foundered on the problem of “how do you specify a whole system so that it can be learned”. We might eventually get there, but I think we are more likely to see more fine-grained gains at the method/function/transformation level first.

Of course, there is still a need for some programming in these scenarios. But, it would play a role rather like that which operating systems, or firmware, play for the average programmer these days—usually something that you don’t have to explicitly worry about at all.

Teaching Specialities

Saturday, September 20th, 2014

University research often works well when there is a critical mass in some area. University degrees usually aim to give a balanced coverage of the different topics within the subject. This is usually seen as a problem—how can a set of staff with narrow research specialities deliver such a broad programme of studies?

One solution to this is to encourage staff to develop teaching specialities. That is, to develop a decent knowledge of some syllabus topic that is (perhaps) completely contrasted with their research interests.

One problem is that we are apologetic with staff about asking them to teach outside of their research area. Perhaps a little bit of first year stuff? Okay, but teaching something elsewhere in the syllabus? We tend to say to people “would you possibly, in extenuating circumstances, just for this year, pretty, pretty, please teach this second year module”. This is completely the wrong attitude to be taking. By making it sound like an exception, we are encouraging those staff to treat it superficially. A better approach would be to be honest about the teaching needs in the department, and to say something more like “this is an important part of the syllabus, no-one does research in this area, but if you are prepared to teach this area then we will (1) give you time in the workload allocation to prepare materials and get up to a high level of knowledge in the subject and (2) commit, as much as is practical, to making this topic a major part of your teaching for the next five years or more”.

In practice, this just makes honest the practice that ends up happening anyway. You take a new job, and, as much as the university would like to offer you your perfect teaching, you end up taking over exactly what the person who retired/died/got a research fellowship/moved somewhere else/got promoted to pro vice chancellor/whatever was doing a few weeks earlier. Teaching is, amongst other things, a pragmatic activity, and being able to teach anything on the core syllabus seems a reasonable expectation for someone with pretensions to being a university lecturer in a subject.

Is this an unreasonable burden? Hell no! Let’s work out what the “burden” of learning material for half a module is. Let’s assume—super-conservatively—that the person hasn’t any knowledge of the subject; e.g. they have changed disciplines between undergraduate studies and their teaching career, or didn’t study it as an option in their degree, or it is a new topic since their studies. We expect students, who are coming at this with no background, and (compared to a lecturer) comparatively weak study skills, to be able to get to grips with four modules each term. So, half a module represents around a week-and-a-half of study. Even that probably exaggerates the amount of time a typical student spends on the module; a recent study has shown that students put about 900 hours each year into their studies, a contrast with university assertions that 1200 hours is a sensible number of hours. So, we are closer to that half-module representing around a week’s worth of full-time work.

Would it take someone who was really steeped in the subject that long to get to grips with it? Probably not; we could probably halve that figure. On the other hand, we are expecting a level of mastery considerably higher than the student, so let’s double the figure. We are still at around a week of work; amortised over five years, around a day per year. Put this way, this approach seems very reasonable, and readily incorporable into workload allocation models.
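For anyone who wants to check the back-of-envelope sum, here it is spelled out. Two of the figures are assumptions that the argument above leaves implicit: a term of twelve teaching weeks and a five-day working week. The rest come straight from the paragraphs above.

```python
WEEKS_PER_TERM = 12      # assumption: not stated above
DAYS_PER_WEEK = 5        # assumption: a standard working week
MODULES_PER_TERM = 4     # students take four modules each term
ACTUAL_HOURS = 900       # hours per year students report studying
NOMINAL_HOURS = 1200     # hours per year universities assert
YEARS_TEACHING = 5       # commitment to teach the topic for five years

# Full-time weeks of study that half a module nominally represents.
half_module_weeks = 0.5 * WEEKS_PER_TERM / MODULES_PER_TERM

# Scale down by the ratio of actual to nominal student hours.
student_weeks = half_module_weeks * ACTUAL_HOURS / NOMINAL_HOURS

# Halve for the lecturer's background, double for the mastery required.
lecturer_weeks = student_weeks * 0.5 * 2

# Spread the one-off preparation over the years the topic is taught.
days_per_year = lecturer_weeks * DAYS_PER_WEEK / YEARS_TEACHING

print(half_module_weeks, student_weeks, lecturer_weeks, days_per_year)
```

Under those assumptions the figures come out at a week and a half, then a little over a week, and finally a little over a day per year, which matches the rounding used above.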

Justice (1)

Friday, September 12th, 2014

My heart sinks whenever I speak to a student who says “I thought that something was wrong (e.g. with marking), but I didn’t want to offend the lecturers by suggesting it.”. Sometimes the implication is worse—“I don’t want to bias lecturers against me in future classes by being seen to be a troublemaker.”, or “I didn’t want to challenge the accusation of plagiarism, even though I had a good explanation, because I don’t want the lecturers to mark me down on future assessments.”.

My impression is that universities are, on the whole, not like this. Indeed, the idea that we have time to pursue grudges like this, even if we had the inclination (and we don’t), seems risible from where I sit. Nonetheless, we have a genuine problem here; one of “justice being seen to be done” as well as justice being done.

Universities try to deal with complaints, plagiarism cases, problems with marking, etc. by having a clear, unbiased system—as much as there is a model at all, it is the judicial system. But, some students don’t see it like that. However much we emphasise that the process is neutral, there is always a fear of those exhortations being seen as a smokescreen to hide an even deeper bias. The same, of course, is true in the broader world—disadvantaged groups believe (in some cases correctly) that the justice system is set up against them, and no amount of exhortation that it is a neutral system will help.

What can we do? Firstly, I wonder if we need to explain more. In particular, we need to explain that things are different from school, that students are treated as adults at university, and that a university review process consists in a neutral part of the university making a fair judgement between the part of the university that is making the accusation and the student. Students entering the university system have only the school system to base their idea of an educational disciplinary/judicial system on, and that is a very different model. Certainly when I was at school, it was a rather whimsical system, which could have consequences for other aspects of school life. In particular, something which wound me up at the time was the reluctance of teachers to treat issues as substantive; if someone hit you over the head, and you put your hands up to stop them, then you were both seen as “fighting” and had to do detention. Universities are not like this, and perhaps we need to emphasise this difference more.

A second thing is to recruit student unions to play a greater role in the process. I’ve been on dozens of student disciplinary and appeal panels over the years, and the number of students who exercise their right to bring someone with them is tiny. If I were in their shoes, I’d damn well want a hard-headed union representative sat next to me. Speaking as someone who wants the best for everyone in these situations, I’d like them to be as nonconfrontational as possible; but, I wonder if making them slightly more adversarial would give a stronger reassurance that they were working fairly.

Thirdly, I wonder about the role of openness in these systems. One way that national judicial systems increase confidence in their workings is by transacting their business in public; only the rarest of trials are redacted. There is clearly a delicate issue around student and staff privacy here. Nonetheless, I wonder if there is some way in which suitably anonymised cases could be made public; or, whether we might regard the tradeoff of a little loss of privacy to be worth it in the name of justice being seen to be done. Certainly, the cases that go as far as the Office of the Independent Adjudicator are largely public.

You don’t want to do this (1)

Thursday, July 17th, 2014

It isn’t good when you see something like this:

It is recommended that you use a single window for this system. More than one may cause unexpected behaviour.

The Extensional Revolution

Wednesday, July 16th, 2014

We are on the threshold of an extensional revolution.

Philosophers draw a distinction between two ways of describing collections of objects. Intensional descriptions give some abstract definition, whereas extensional descriptions list all examples. For example, consider the difference between “the set of all polyhedra that can be made by joining together a number of identical, regular, convex polygons with the same number of polygons meeting at each vertex” (intensional), and “the set {tetrahedron, cube, octahedron, dodecahedron, icosahedron}” (extensional).
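The same contrast can be written down in code, as a rough sketch. The extensional version is simply a literal set. The intensional version is a rule: a regular p-sided polygon has an interior angle of 180(p-2)/p degrees, and q of them can only close up around a vertex if the angles sum to less than 360 degrees, which simplifies to q(p-2) < 2p. That this necessary condition picks out exactly the five solids is a classical result that the sketch takes on trust. The lookup table `NAMES` is there only so that the two sets can be compared, and all the names are made up for this example.

```python
# Extensional description: list the members.
extensional = {"tetrahedron", "cube", "octahedron", "dodecahedron", "icosahedron"}

# Naming table keyed by (p, q): faces are regular p-gons, q meet at each vertex.
NAMES = {
    (3, 3): "tetrahedron",
    (4, 3): "cube",
    (3, 4): "octahedron",
    (5, 3): "dodecahedron",
    (3, 5): "icosahedron",
}


def closes_up(p, q):
    """True when q regular p-gons fit around a vertex with angle to spare,
    i.e. q * 180 * (p - 2) / p < 360, rearranged to avoid division."""
    return q * (p - 2) < 2 * p


# Intensional description: state the rule and let it pick out the members.
# The search bound of 12 is arbitrary; no pair with p or q above 5 satisfies the rule.
solutions = [(p, q) for p in range(3, 12) for q in range(3, 12) if closes_up(p, q)]
intensional = {NAMES[pair] for pair in solutions}

print(sorted(solutions))
print(intensional == extensional)
```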

Despite its claims to be (amongst other things) the science of data, computer science has been very intensional in its thinking. Programs are treated as realisations of descriptive specifications, satisfying certain mathematically-described properties.

As more data becomes available, we can start to think about doing things in an extensional way. The combination of approximate matching + the availability of large numbers of examples is a very powerful paradigm for doing computing. We have started to see this already in some areas. Machine translation of natural language is a great example. For years, translation was dominated by attempts to produce ever more complex models of language, with the idea that eventually these models would be able to represent the translation process. More recently, the dominant model has been “statistical language translation”, where correlations between large scale translated corpora are used to make decisions about how a particular phrase is to be translated. Instead of feeding the phrase to be translated through some engine that breaks it down and translates it via some complex human-built model, a large number of approximations and analogies are found in a corpus and the most dominant comparison used. (I oversimplify, of course).

More simply, we can see how a task like spellchecking can be carried out by sheer force of data. If I am wavering between two possible spellings of a word, I just put them both into Google and see which comes out with the most hits.
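A rough sketch of the same idea, with a local list of words standing in for the search engine. The corpus and the function name pick_spelling are made up for illustration; a real version would need a far larger body of text.

```python
from collections import Counter


def pick_spelling(candidates, corpus_words):
    """Return the candidate spelling that occurs most often in the corpus.

    Ties go to whichever candidate is listed first.
    """
    counts = Counter(word.lower() for word in corpus_words)
    return max(candidates, key=lambda c: counts[c.lower()])


if __name__ == "__main__":
    # A toy corpus; the frequencies are invented for the example.
    corpus = "please keep this separate and separate that too seperate".split()
    print(pick_spelling(["seperate", "separate"], corpus))
```

Note that there is no dictionary and no model of English here: the answer comes entirely from counting examples.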

Once you start thinking extensionally, different approaches to complex problems start appearing. Could visual recognition problems be solved not by trying to find the features within the image that are relevant, but by finding all the images from a vast collection (like Flickr) that approximately match the target, and then processing the metadata? Could a problem like robot navigation or the self-driving car be solved by taking a vast collection of human-guided trajectories and just picking the closest one second-by-second (perhaps this corpus could be gained from a game, or from monitoring a lot of car journeys)? Can we turn mathematical problems from manipulations of definitions into investigations driven by artificially created data (at least for a first cut)?
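The navigation idea can be sketched as a nearest-neighbour lookup. Everything here is an assumption made for illustration: the state is a pair of numbers (say, position in lane and speed), the recorded corpus is three invented examples, and closest_action is a hypothetical name. It shows the shape of the approach, not a workable controller.

```python
import math


def closest_action(state, recorded):
    """Return the action a human took in the most similar recorded state.

    `recorded` is a list of (state, action) pairs; similarity is plain
    Euclidean distance between state tuples.
    """
    _, action = min(recorded, key=lambda pair: math.dist(pair[0], state))
    return action


if __name__ == "__main__":
    # Invented examples: (lane offset in metres, speed in m/s) -> action.
    recorded = [
        ((0.0, 10.0), "hold course"),
        ((0.8, 10.0), "steer left"),
        ((-0.8, 10.0), "steer right"),
    ]
    # Called once per time step with the current state.
    print(closest_action((0.7, 9.5), recorded))
```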

The possibilities appear endless.

Websites are Real, too

Tuesday, June 3rd, 2014

There are two radically different perspectives on what a website is. I only realised this a few weeks ago, and it suddenly made clear a number of confusing and frustrating conversations I’ve had over the years.

The first perspective sees a website as a brochure for the “real” thing. It is something that you read (that choice of word is very careful) before engaging with the “real” physical organisation. The second perspective sees the website as part of the reality of the thing. By interacting with the site (the choice of word is again very careful), you are interacting with the organisation, not engaging with some pre-real experience.

I noticed this when I was talking to a university marketing person a couple of weeks ago. What bemused me was that the marketing person kept asking “what is the message of this part of the website”? Of course, I understand the concept of a marketing message and why they are important—but, what I didn’t grok for a while was why that was the relevant question. This part of the website was the website “for” a new teaching facility and its activities (not a website “about” it). As such, it was going to contain a mixture of descriptions of the facility, signups for sessions, archival video material of activities, profiles of people involved, conversations about the activities, etc. My view was that the site was going to be a continuous part of the facility, just as much as the physical space and what happens in that space are, not just a one-shot “message”.

It is interesting to take a look at the history of the web with regard to this. Many early websites were seen as publications of their organisations, rather than being the online component of those organisations. This is obvious from their design; take, for example, this snapshot of the American Mathematical Society website from 1997 (thanks to the wonderful Wayback Machine for this):

e-math: website of the American Mathematical Society

the website isn’t “the online part of” the society; it is “e-math”; a website produced by the society. A snapshot from 2000 emphasises this even more:

e-math: The Web Site of the AMS

By 2001 the “e-math” branding has gone, replaced with just the organisation name; just like a modern web site would do:

American Mathematical Society

It is my impression that an increasing number of people see websites this second way, as a part of the reality of the organisation. To a digital native population in particular, the idea that the online experience is less “experiential” than the physical experience is otiose. Certainly, when I see a website for an organisation that is little more than a pamphlet, then I don’t think “oh goody, they have thought carefully about what they want to convey to me and distilled it down”; I think “this organisation isn’t anything more than a pamphlet”, or perhaps, to borrow an old slogan, “where’s the beef?”. (A related, but different, point is made by a well-known XKCD strip).

So, websites for rich, complex organisations need to contain their fair share of that richness (not “reflect” or “represent”; to have). In particular, for university websites, we really need to start undoing the tendency to move “real” content so that it can only be viewed by a restricted audience such as current students. That’s not to say that we can’t have marketing materials on the web, and indeed give them prominence (much as we have marketing materials in meatspace). But, to say that just because websites play a marketing role, they should be handed over to solely marketing purposes is to sorely misunderstand how a large number of people engage with the web experience.

Information Systems vs Innovation

Thursday, January 23rd, 2014

It is common for a particular sector to be dominated by one or two large, commercially-available (or, occasionally, freely available open-source) computer-based systems to manage the information within organisations. I wonder if this has the side effect of militating against innovation within organisations.

I was at a meeting a few weeks ago, to talk about the new student admissions computer system that the University is installing. This is the industry-standard system, used by probably 75% of UK universities. Obviously, there is some scope for such systems to be customised; nonetheless, they come saddled with a certain amount of assumption about the way in which information is being handled within that kind of organisation (otherwise, you would just use a generic database system).

As we discussed how this system was going to roll out, it was clear that we were making some kind of compromise between how we wanted to do admissions and how the system was set up to handle admissions. In quite a lot of these discussions, it became clear that the assumptions in the system were rather deeply embedded, and so the system “won” the battle of how to do this kind of activity. It became clear that we hadn’t just bought a system to manage our information; we had bought a whole heap of workflow assumptions with it too.

Does this matter? From an administrative point of view, perhaps not. The argument can just about be made that all organisations in the sector have broadly similar requirements, and the major players in the provision of information systems will gradually drift towards these requirements.

But, it might matter in terms of competition. One of the tenets of the current government’s policy is that systems improve by competition, and to drive competition we need diversity of practice so that new ideas come into the sector. But, in a situation like this, innovation (and thus diversity) is quashed because the systems that are managing the information can’t be readily adapted to handle innovative experiments with practice. As someone said at the meeting: “We’ve been working carefully on how to attract students to our courses for the last five years, and now we have to throw all that away because it isn’t supported by the admissions system.”

Terms of Art (1)

Sunday, January 19th, 2014

In most areas of human endeavour, we adopt words that have an everyday meaning and use them as the basis for terminology. For example, in physics, we talk about sub-microscopic objects having “spin” or “colour”. We don’t mean these in a literal way; we adopt these terms because we need to find names for things, and so we find something that is very loosely similar, and use that terminology. This doesn’t subsequently mean that we are allowed to take other properties of these labels and reason about the objects using those other properties (an elision that often seems to occur when word-drenched literary theorists wade into discussions of science).

When the day-to-day and technical usages of a word coincide, we can sometimes end up in a muddle. A couple of years ago I set a programming assignment about card games, and I used the word “stack” of cards. Despite being very careful to explain that this use of the word “stack” was not meant to imply that this piece of data should be represented by the data structure known as a “stack” (and, indeed, was best not), I still got lots of questions about this, and lots of submissions that did confuse the two. Perhaps I should have simplified it—but, there was a valuable learning point about requirements elicitation to be had from leaving it as it was.

Another example is the UK government report from years ago that talked about the UK needing a “web browser for education”. This got lambasted in the technical press—why on earth would the education sector need its own, special, web browser? Of course, what was meant was not a browser at all, but some kind of portal or one-stop-shop. But, this could have caused a multi-billion pound procurement failure.

I think that we have a cognitive bias towards assuming that the person we are talking to is trying to make some precise, subtle, point, even when the weight of evidence is that they have simply misunderstood, or are unfamiliar with the terminology.

I try to be aware of this when I am the non-expert, for example, when dealing with builders or plumbers.

This is a great danger in communication between people with different backgrounds. The person who is unfamiliar with the terminology can accidentally wade in looking like they are asking for something much more specific than they intended, because they accidentally use a word that has a technical meaning that they don’t intend.

3/5 ≠ 0.6

Monday, September 30th, 2013

Apple’s Pages word processor tries to be clever and anticipatory. When you create a table, by default cells in it are created as spreadsheet-like cells, rather than the text format that you might expect from a word processor. Furthermore, the type of the cells is “automatic”; that means that it waits until you type, and then determines what type to apply based on that.

I’ve just been using this for student marksheets. Interestingly, if I type a mark like 3/5 it “automatically” assumes that I am typing a date, and converts it to “3rd May 2013”. If I type something like “30/50” it assumes that I am typing a vulgar fraction, and converts it to “0.6”.
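I don’t know what rules Pages actually applies. The following is only a guess at the kind of logic that would produce this behaviour: try to read “a/b” as day/month, and fall back to a fraction when that isn’t a valid date. The function names guess_cell and parse_mark are made up, and the year is hard-wired purely for the example.

```python
import datetime
from fractions import Fraction


def guess_cell(text, year=2013):
    """Hypothetical 'automatic' typing: prefer a date, else a fraction."""
    a, b = (int(part) for part in text.split("/"))
    try:
        return datetime.date(year, b, a)  # read as day/month
    except ValueError:
        # Not a valid date (e.g. month 50), so treat it as a fraction.
        return float(Fraction(a, b))


def parse_mark(text):
    """What a marksheet wants: keep the score and the total separately."""
    scored, total = (int(part) for part in text.split("/"))
    return scored, total


if __name__ == "__main__":
    print(guess_cell("3/5"))    # a date, not a mark
    print(guess_cell("30/50"))  # a decimal, with the total thrown away
    print(parse_mark("3/5"))    # the pair the user actually typed
```

Under this guess, two marks that mean the same thing end up with different types, and neither of them is the one that was typed.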

This seems to fit into the “too helpful” area of HCI failures.

Multiscale Modelling (1)

Monday, December 31st, 2012

Multiscale modelling is a really interesting scientific challenge that is important in a number of areas. Basically, the issue is how to create models of systems where activities and interactions at a large number of different (temporal and/or spatial) scales happen at the same time. Due to computational costs and complexity constraints we cannot just model everything at the smallest level; yet, sometimes, small details matter.

I wonder if there is a role for some kind of machine learning here? This is a very vague thought, but I wonder if somehow we can use learning to abstract simple models from more detailed models, and use those simple models as proxies for the more detailed model, with the option to drop back into the detailed model when and only when it is specifically needed?
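As a very rough sketch of what this might look like, here is a cheap proxy built from a handful of runs of a detailed model, which drops back to the detailed model only when asked about something outside the range it has seen. Several things are assumptions for the sake of illustration: the “detailed model” is a toy formula, the “learning” is nothing more than linear interpolation between sampled points, and the rule for when to drop back (outside the sampled range) is the crudest one available.

```python
import bisect


def detailed_model(x):
    """Stand-in for an expensive fine-scale simulation."""
    return x ** 3 - 2 * x


class Surrogate:
    """Cheap proxy built from samples of a detailed model."""

    def __init__(self, model, sample_points):
        self.model = model
        self.xs = sorted(sample_points)
        self.ys = [model(x) for x in self.xs]  # the only up-front cost
        self.fallbacks = 0

    def __call__(self, x):
        # Outside the sampled range the proxy is untrusted: use the real model.
        if not (self.xs[0] <= x <= self.xs[-1]):
            self.fallbacks += 1
            return self.model(x)
        i = bisect.bisect_left(self.xs, x)
        if self.xs[i] == x:
            return self.ys[i]
        # Linear interpolation between the two neighbouring samples.
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


if __name__ == "__main__":
    proxy = Surrogate(detailed_model, [0, 1, 2])
    print(proxy(0.5))       # answered by the proxy
    print(proxy(3))         # answered by the detailed model
    print(proxy.fallbacks)  # how often the detailed model was needed
```

The hard part, which this sketch sidesteps, is knowing when the small details matter inside the range that the proxy thinks it understands.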

What Happens when Nothing Happens?

Wednesday, December 26th, 2012

More times than I’d like to think, when I talk to someone in a call centre, or fill out an online form, nothing happens. For example, a few weeks ago I had a perfectly clear and polite conversation with a call centre person from O2 about increasing my phone data allowance. They explained very clearly what the options were, and what the costs were, I chose one, they confirmed when it would start, and then…nothing happened!

This isn’t snark. I’m just interested to know what happens within the business logic of the organisation that leads from this seemingly clear conversation to no actual action. Do these requests get lost immediately after the request has been made, e.g. the person makes some notes and then gets another call and loses the notes, or the context of the notes, when they get a chance to return to them? Surely large organisations can’t be relying on such a half-arsed system?

Do requests get added to some kind of queue or ticket based system to be actioned elsewhere in the organisation, and then somehow time out after a while, or get put in a permanent holding position whilst more urgent queries are dealt with? Or, are the requests that I am making too unreasonable or complex, so that the company policy is to make sympathetic noises to the customer and then just ignore them once they have got them off the phone? I can imagine that this might occasionally be the case, but surely not for a request like the one above, which must be one of the simplest pieces of business logic for an organisation to execute.

Or, are there people in the organisation who are just being lazy and ticking off a lot of their work without actually doing it, like my schoolfriend who, for months on end, got all of the advantages of having a paper round without any of the actual work by systematically collecting a bag of papers every morning, then setting fire to them in a ditch in the local park?

This strikes me as something that would be almost impossible to research, and indeed very difficult even for companies to discover the cause of internally. Yet, this must be a massive issue; I would reckon that around 20% of interactions of this kind have resulted in the agreed action not happening. What can organisations do about this?