“Real Artists Ship”

Colin Johnson’s blog

Legacy Code (1)

June 24th, 2019

It’s fascinating what hangs around in code-bases for decades. Following a recent update, Microsoft Excel files in a certain format (the old style .xls files rather than the .xlsx files) started showing up with this icon:

(old excel icon)

Which I haven’t seen for a couple of decades. More interestingly, the smaller version of the icon was this one:

(old resedit icon)

What has this to do with Excel? It looked vaguely familiar, but I couldn’t place it. After a bit of thought and Googling around, I realised that this was the icon for a program called ResEdit, which was an editor for binary files that I remember using back in the pre-OS X days. Looking at this further, I realised that the last version of ResEdit was made in 1994.

How did these suddenly appear? There are occasional references to this happening on various Mac forums from the last few years. I suspect that they sit in collections of visual assets in codebases that have been under continuous development for the last 30 years or more, and that somehow some connection to the contemporary icon has been deleted or mis-assigned. I’m particularly surprised that Excel wasn’t completely written from scratch for OS X.

What do people think coding is like?

April 22nd, 2019

I wonder what activity non-coders think coding is like? I remember having a conversation with a civil servant a few years ago, where he struggled to understand why we were talking about coding being “creative” etc. I think that his point of view is not uncommon—seeing coding as something that requires both intellectual vigilance and slog, but is fairly “flat” as an activity.

Perhaps people think of it as like indexing a book? Lots of focus and concentration is needed, and you need some level of knowledge, and it is definitely intellectual, “close work”. But, in the end, it doesn’t have its ups and downs, and isn’t typically that creative; it’s just a job that you get on with.

Perhaps they think it is like what they think mathematics is like? Lots of pattern-matching, finding which trick fits which problem, working through lots of line-by-line stuff that kinda rolls out, albeit slowly and carefully, once you know what to do. This isn’t entirely absent from the coding process, but it doesn’t have the ups and downs that doing maths or doing coding has.

If people have a social science background, perhaps they think of “coding” in the sense of “coding an interview”—going through, step by step, assigning labels to text (and often simultaneously coming up with or modifying that labelling scheme). Again, this has the focus that we associate with coding, but again it is rather “flat”.

Perhaps it would be interesting to do a survey on this?

Differentiation in the Lecture Room

February 14th, 2019

Students come to university with a wide range of ability and prior knowledge, and take to different subjects with different levels of engagement and competence. This spread isn’t as wide as in other areas of education—after all, students have chosen to attend, been selected within a particular grade boundary, and are doing a subject of their choice—but there is still a decent amount of variation there.

How do we deal with this variation? In school education, they talk a lot about differentiation—arranging teaching and learning activities so that students of different levels of ability, knowledge, progress, etc. can work on a particular topic. I think that we need to do more of this at university; so much university teaching is either aimed at the typical 2:1 student, or is off-the-scale advanced. How can we make adjustments so that our teaching recognises the diversity of students’ knowledge and experience?

In particular, how can we do this in lectures? If we have a canonical, non-interactive lecture, can we do this? I think we can; here are some ideas:

Asides. I find it useful to give little parenthetical asides as part of the lecture. Little definitions, bits of background knowledge. I do this particularly for the cultural background knowledge in the Computational Creativity module, often introduced with the phrase “as you may know”. For example: “Picasso—who, as you may know, was a painter in the early-mid 20th century who invented cubism which plays with multiple perspectives in the same painting—was…”. This is phrased so that it more-or-less washes over those who don’t need it, but is there as a piece of anchoring information for those that do. Similarly for mathematical definitions: “Let’s represent this as a matrix—which, you will remember from your maths course, is a grid of numbers—…”. Again, the reinforcement/reminder is there, without patronising or distracting the students who have this knowledge by having a “for beginners” slide.

Additional connections. Let’s consider the opposite—those students who are very advanced, and have a good knowledge of the area more broadly. I differentiate for these by making little side-comments that connect to the wider course or other background knowledge. These are sometimes introduced with a phrase such as “if you have studied…” or “for those of you that know about…”. For example: “for those of you who have done an option in information retrieval, this might remind you of tf-idf”. Again, this introduces the connection without putting it on a slide and making it seem big and important for those students who are struggling to manage the basics, but gives some additional information and a spark of a connection for the students who are finding the material humdrum. (I am reminded of an anecdote from John Maynard Smith, who talked about a research seminar where the speaker had said “this will remind you of a phase transition in statistical physics”: “I can’t imagine a time in my life when anything will remind me of a phase transition”).

Code examples. A computing-specific one, this. I’ve found that a lot of students click into something once they have seen a code example. These aren’t needed for the high-flying coding ninjas, who can go from a more abstract description to working out how the code is put together. But, for many students, the code example is the point where all the abstract waffle from the previous few minutes clicks into place. The stronger students can compare the code that they have been writing in their heads to mine. I sometimes do the coding live, but I’ve sometimes chickened out and used a screencap video (this also helps me to talk over the coding activity). A particularly clear example of this was where I showed a double-summation in sigma notation to a group, to largely blank looks, followed by the same process on the next slide as a nested loop, where most students seemed to be following clearly.
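To illustrate the kind of translation described above, here is a minimal sketch in Python. The particular sum is an assumption made for illustration (the post doesn’t say which one was on the slide): the double summation of i × j for i from 1 to n and j from 1 to m, written as a nested loop. The function name double_sum is likewise hypothetical.

```python
def double_sum(n: int, m: int) -> int:
    """Sum of i * j over i = 1..n and j = 1..m, written as a nested loop.

    The summand i * j is an illustrative assumption, not the example
    that was actually shown in the lecture.
    """
    total = 0
    for i in range(1, n + 1):        # outer sigma: i runs from 1 to n
        for j in range(1, m + 1):    # inner sigma: j runs from 1 to m
            total += i * j           # the term being summed
    return total


print(double_sum(3, 2))
```

Each sigma becomes one loop, and the expression to the right of the sigmas becomes the line inside the innermost loop.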

Any other thoughts for differentiation tricks and tips specifically in the context of giving lectures?

Microtrends (1)

February 7th, 2019

Noticeable recent microtrend—people walking around, holding a phone about 40cm from their face, having a video chat on FaceTime/Skype. Been possible for years, but I’ve noticed a real uptick in this over the last few weeks.

On Bus Drivers and Theorising

February 7th, 2019

Why are bus drivers frequently almost aggressively literal? I get a bus from campus to my home most days (about a 2 kilometre journey), and there are two routes. Route 1 goes about every five minutes from campus, takes a fairly direct route into town, and stops at a stop about 100 metres from the West Station before turning off and going to the bus station. Route 2 goes about every half hour, takes a convoluted route through campus before passing the infrequently-used West Station stop, then goes on to the bus station.

Most weeks—it has happened twice this week—someone gets on at a route 1 stop, asks for a “ticket to the West Station”, and is told “this bus doesn’t go there”. About half the time they then get off, about half the time they manage to weasel out the information that the bus goes near-as-dammit there. I appreciate that the driver’s answer is literally true—there is a “West Station” stop and route 1 buses don’t stop there. But, surely the reasonable answer isn’t a bluff “the bus doesn’t go there” but instead to say “the bus stops about five minutes walk away, is that okay?”. Why are they—in what seems to me to be a kind of flippant, almost aggressive way—not doing that?

I realised a while ago that I have a tendency towards theorising. When I get information, I fit it into some—sometimes mistaken—framework of understanding. I used to think that everyone did this but plenty of people don’t. When I hear “A ticket to the West Station, please” I don’t instantly think “can’t be done” but I think “this person wants to go to the West Station; this bus doesn’t go there, but the alternative is to wait around 15 minutes on average, then take the long route around the campus; but, if they get on this bus, it’ll go now directly to a point about five minutes from where they want to get to, so they should get this one.” It is weird to think that lots of people just don’t theorise in that way much at all. And I thought I was the non-neurotypical one!

Coke, Pepsi, and Universities

February 2nd, 2019

Why does Coca-Cola still advertise? For most people in most of the world, it is a universal product—everyone knows about it, and more advertising doesn’t give you more information to help you make a purchasing decision. After a while, advertising spend and marketing effort is primarily about maintaining public awareness, keeping the product in the public eye, rather than giving people more information on which to make a decision. There is something of the “Red Queen” effect here; if competitors are spending a certain amount to keep their product at the forefront of public attention, then you are obliged to do so, even though the best thing for all of the companies involved, and for the public, would be to scale it down. (This is explained nicely in an old documentary called Burp! Pepsi vs. Coke: the Ice Cold War.) There’s a certain threshold where advertising/marketing/promotion tips over from informative to merely awareness-raising.

This is true for Universities as much as other organisations. A certain amount of promotional material is useful for prospective students, giving a feel of the place and the courses that are available. But, after a while, a decent amount of both students’ own fee money, and public investment, goes into spend over this threshold; mere spend for the purpose of maintaining awareness. However, in this case, we do have some mechanism to stop it. Perhaps universities should have a cap on the proportion of their turnover that they can spend on marketing activities, enforced by the withdrawal of (say) loan entitlements if they exceed this threshold.

On Exponential Growth and Fag Ends

January 9th, 2019

I have often been confused when people talking about family history—often people with good genealogical knowledge—talk about their family “coming from” a particular location in the distant past. Don’t they know anything about exponential growth? When you talk about your family living in some small region of north Norfolk 400 years ago, what does that mean? That’s (inbreeding aside) over 32,000 people! Surely they didn’t all live in a few local villages.

Now, I appreciate that this is a bit of an exaggeration. Over a few hundred years there will be some (hopefully fairly distant) inbreeding and so each person won’t have tens of thousands of distinct relatives. I appreciate, too, that people travelled less in the past, and that even if you are genuinely descended from thousands of distinct people, those people will have been more concentrated in the past. But, still, the intuition that “your family” (by which they are imagining, I think, a few dozen people at a time) “comes from somewhere” still seems a little off.
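The figure of over 32,000 can be checked with a quick back-of-the-envelope sketch in Python. The generation length of about 27 years is an assumption (the post doesn’t state one); it puts 400 years at roughly 15 generations, and the number of ancestor slots doubles with each generation back. As noted above, this counts slots in the tree rather than distinct people.

```python
def ancestors_at_generation(generations: int) -> int:
    """Ancestor slots in the family tree a given number of generations back."""
    return 2 ** generations


YEARS = 400
YEARS_PER_GENERATION = 27  # hypothetical average, not a figure from the post

generations = round(YEARS / YEARS_PER_GENERATION)
print(generations, ancestors_at_generation(generations))
```

With these assumptions the count comes out at 2 to the power 15, which is where a number just over 32,000 comes from.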

The naïve explanation is that they just don’t realise the scale of this growth. I would imagine that most people, asked for an intuitive stab at how many great-great-···-grandparents they had 400 years ago, would guess at a few dozen, not a number in the tens of thousands. Perhaps they have some cultural bias that a particular part of the family tree is the “main line”, perhaps that matrilineal or patrilineal lines are the important ones, and that other parts of the family are just other families merging in. Or, perhaps they recognise that in practice main lines emerge in families when there are particularly fecund sub-families, and other branches fade out.

Overall, these “fag ends” are not very well acknowledged. Most people depicted in fiction, e.g. in the complex family interconnections of soap operas, have a rich, involved family. There isn’t much depiction of the sort of family that I come from, which is at the ragged, grinding to a halt twig of a family tree.

Let’s think about my family as an example. Both of my parents were somewhat isolated within their families. My mother had three siblings, two of whom died in infancy. The other, my uncle, went on to have three children, two of whom in turn have had children and grandchildren, and the one who didn’t married into a vast family (his wife has something like ten siblings). By contrast, my mother had only me, who hasn’t had any children, and didn’t get on particularly well with her brother, so we were fairly isolated from her side of the family after my grandmother died. So, from the point of view of my grandmother’s position in the family tree, it is clear that my uncle’s line is the “main line” of the family.

On my father’s side, he was similarly at a ragged end. He had three sisters. One died fairly young (having had Down’s syndrome). The one he was closest to went to Australia and had a large family—four children, lots of grandchildren, etc; but, they were rather geographically isolated. The one that lived a few miles from us he wasn’t particularly close to, and only had one child, who remained child-free. He had one child from his first marriage (who had children and grandchildren and great-grandchildren, which bizarrely meant that by the age of 44 I was a great-great uncle), and had only me from his marriage to my mother. Again, there are big branches and fag ends: the branches of the family tree that dominate hugely are the Australian one, and the one starting from my half-brother, whereas mine (no children), and my aunt’s (she had only one child) are minor twigs.

So, perhaps there is some truth in the genealogist’s intuition after all. A small number of branches in the tree become the “main lines”, and others become “fag ends”, and there isn’t much in between. It would be interesting to formalise this using network science ideas, and test whether the anecdotal example that I have in my own family is typical when we look at lots of family trees.

On Responsibility

December 30th, 2018

When people collaborate on a codebase to build complex software systems, one of the purported advantages is that fixes spread. It is good to fix or improve something at a high level of abstraction, because then that fix not only helps your own code, but also redounds to improvements in code across the codebase.

However, people often don’t do this. Rather than fixing a problem with some class high up in the class hierarchy, or adding some behaviour to a well-used utility function, they instead write their own, local, often over-specialised version of it.

Why does this happen? One theory is about fear of breaking things. The fix you make might be right for you, but who knows what other changes it will have? The code’s intended functionality might be very well documented, but perhaps people are using abstruse features of a particular implementation to achieve something in their own code. In theory this shouldn’t happen, but in practice the risk:reward ratio is skewed towards not doing the fix.

Another reason—first pointed out to me by Hila Peleg—is that once you have fixed it, your name is in the version control system as the most recent modifier of the code. This often means that the code becomes your de facto responsibility, and questions about it then come to you. Particularly with a large code base and a piece of code that is well used, you end up taking on a large job that you hadn’t asked for, just for the sake of fixing a minor problem in your code. Better to write your own version and duck that responsibility.

Learning what is Unnecessary

December 28th, 2018

Learning which steps in a process are unnecessary is one of the hardest things to learn. Steps that are unnecessary yet harmless can easily be worked into a routine, and because they cause no problems apart from the waste of time, don’t readily appear as problems.

An example. A few years ago a (not very technical) colleague was demonstrating something to me on their computer at work. At one point, I asked them to google something, and they opened the web browser, typed the URL of the University home page into the browser, went to that page, then typed the Google URL into the browser, went to the Google home page, and then typed their query. This was not at a trivial time cost; they were a hunt-and-peck typist who took a good 20-30 seconds to type each URL.

Why did they do the unnecessary step of going to the University home page first? Principally because when they had first seen someone use Google, that person had been at the University home page, and then gone to the Google page; they interpreted being at the University home page as some kind of precondition for going to Google. Moreover, it was harmless—it didn’t stop them from doing what they set out to do, and so it wasn’t flagged up to them that it was a problem. Indeed, they had built a vague mental model of what they were doing—by going to the University home page, they were somehow “logging on”, or “telling Google that this was a search from our University”. It was only on demonstrating it to me that it became clear that it was redundant, because I asked why they were doing it.

Another example. When I first learned C++, I put a semicolon after the closing curly bracket at the end of each block. Again, this is harmless: all it does is to insert some null statements into the code, which I assume the compiler strips out at optimisation. Again, I had a decent mental model for this: a vague notion of “you put semicolons at the end of meaningful units to mark the end”. It was only when I started to look at other people’s code in detail that I realised that this was unnecessary.

Learning these is hard, and usually requires us to either look carefully at external examples and compare them to our behaviour, or for a more experienced person to point them out to us. In many cases it isn’t all that important; all you lose is a bit of time. But, sometimes it can mark you out as a rube, with worse consequences than wasting a few seconds of time; an error like this can cause people to think “if they don’t know something as simple as that, then what else don’t they know?”.

Gresham’s Law (1)

December 7th, 2018

Gresham’s Law for the 21st Century: “Bad cultural relativism drives out good cultural relativism.”

Human in the Loop (1)

November 19th, 2018

Places with a pretension to being high-end often put a human in the loop in the belief that it makes it a better service. This is particularly the case in countries where basic labour costs are cheap. The idea, presumably, is that you can ask for exactly what you want, and get it, rather than muddling through understanding the system yourself. But, this can sometimes make for a worse service, by putting a social barrier in the loop. For example, I have just gone to a coffee machine at a conference, where there was someone standing by it waiting to operate it. As a result, I got a worse outcome than if I had been able to operate it myself. Firstly, I was too socially embarrassed to ask for what I would have done myself—press the espresso button twice—because that seems like an “odd” thing to do. Secondly, I got some side-eye from the server when I didn’t take the saucer; as a northerner I don’t really believe in them. So, by trying to make this more of a “service” culture, the outcome was worse for me, both socially and in terms of the product that I received.

Vote For Me and I’ll Crush Your Dreams!

October 30th, 2018

My (hard right-wing) mother used to go on about how people in some countries needed a strong dictator to keep them under control. It is one of the remarkable features of the last few years that politicians in democratic countries have managed to persuade their own populations that it is in their interests to vote for near-dictators to keep them under control.

Computer Science

October 22nd, 2018

Is the current state of computer science education analogous to a situation where there were no business schools, and everyone who wanted to do “business studies” had to do economics instead?

The Map that Precedes the Territory

September 23rd, 2018

I’ve sometimes joked that I only have hobbies because they are necessary for me to indulge my meta-hobbies of project management, product design, and logistics. Sometimes, I worry that I get more pleasure from the planning that goes around an activity than doing the activity itself. Planning the travel and activities for a trip, assembling the well-organised and well-chosen set of accessories or tools for doing some craft, preferring to be the person who organises the meetings and makes up the groups rather than being a participant in the activity.

I wonder where this comes from? I think part of it is from growing up in a household where there wasn’t much money to spend on leisure stuff. As a result, I spent a lot of my childhood planning what I would do when I had things, making tables and catalogues of things, and endlessly going over the same small number of resources. I remember planning in great detail things like model railway layouts, devising complex electrical circuits, and filling notebook-after-notebook with code in anticipation of the day when I might finally have access to a computer to run it on—a computer which would be chosen not on a whim, but from detailed comparison tables I had drawn up from catalogues and ads so as to get the very best one for the limited money we had.

The intellectual resources I had access to were interesting. We had some books, bought from W.H. Smith, brought home from the school where my father taught, bought from a catalogue of discount improving educational books which was available at school (which introduced me to the excellent Usborne books which I still think are a model for exposition of complex concepts), or bought from the eccentric selection available at remainder shops (I particularly remember three random volumes of encyclopaedia that I had bought from one such shop). The local library was a good resource too, but I rapidly exhausted the books on topics of relevance to me, and just started reading my way through everything; one week I remember bringing home a haul of books on Anglicanism, resulting in my mother’s immortal line “You’re not going to become a bloody vicar, are you?”. Catalogues and the like were an endless source of information too; I remember endlessly poring over detailed technical catalogues such as the Maplin one, and spec sheets from computer shops, compiling my own lists and tables of electrical components, details of how different computers worked, etc. I remember really working through what limited resource I had; endlessly reading through the couple of advanced university-level science books that a colleague of my mother’s had given to her via a relative who had done some scientific studies at university.

There’s something to be said for trying damn hard to understand something that is just too difficult. I remember working for hours at a complex mathematical book from the local library about electrical motors, just because it was there and on an interesting topic, and learning linear and dynamic programming, university level maths topics, again because there happened to be a good book on it in the local library. These days, with access to a vast university library, books at cheap prices on Amazon, and talks on almost every imaginable topic available on YouTube, I think I waste a lot of time trying to find some resource that is just at my level, rather than really pushing myself to make my own meaning out of something that is on the very fringe of my level of possible understanding. Similarly, I remember the same for courses at University—I got a crazily high mark (88% or something) in a paper on number theory, where I had struggled to understand and the textbooks were pretty ropey, whereas the well-presented topics with nice neatly presented textbooks were the golden road to a 2:1 level of achievement.

Talking of lectures and YouTube etc., another thing that is now near impossible to have a feel for is the ephemerality of media. There were decent TV and radio programmes on topics I was interested in, science and technology and the like, but it seems incomprehensibly primitive that these were shown once, at a specific time, and then probably not repeated for months. How bizarre that I couldn’t just revisit it. But, again, it made it special; I had to be there at a specific time. I think this is why lecture courses remain an important part of university education. About 20 years ago I worked with someone called Donald Bligh, who wrote an influential book called What’s the Use of Lectures?, which anticipated lots of the later developments in the flipped classroom etc. He couldn’t understand why, with the technology available to deliver focused, reviewable, breakable-downable, indexable online material, we still obsessed about the live lecture. I have a lot of sympathy for that point of view, but I think lecture courses deliver pace and, at their best, model “thinking out loud”—particularly for technical and mathematical subjects. When everything is available at hand, we just get stuck in focus paralysis; I do that with things I want to learn, there are too many things and it is too easy when something gets hard to not persevere, and to turn to something else instead; or, I spend endless amounts of time in search of the perfect resource, one that is just at my level. This is what I wasn’t able to do, 30 years ago, in my little room with limited resources, and so I got on with the task at hand.

How can we regain this focus in a world of endless intellectual resource abundance? Some approaches are just to pace stuff out—even MOOCs, where the resources are at hand and could be released, box-set-like, all at once, nonetheless spoon them out bit-by-bit in an attempt to create a cohort and a sense of pace. Another approach is pure self-discipline; I force myself to sit down with a specific task for the day, and use techniques such as the Pomodoro technique to pace out my time appropriately. Others use technologies to limit the amount of time spent online, such as web-blockers that limit the amount of time spent either on the web in general, or specifically on distractors such as social media. But, I still think that we don’t have a really good solution to this.

Memory (3)

September 20th, 2018

When I was around 12 years old, we went for one of our regular family trips into the Derbyshire countryside. After lunch, I went off for a bike ride. I thought that I had communicated this to my parents, but they thought I had meant that I was going to ride my bike through the woods for 5-10 minutes, whereas I meant that I was going for an hour or two of riding.

When I got back, my family were worried sick about where I had got to. Later, I found out that my grandmother had at some point during my absence uttered the immortal line: “If he’s gone and cycled off a cliff, I’ll bloody well kill him!”.

Memory (2)

September 18th, 2018

At York university in the 1990s, there was a lane called “Retreat Lane” which was the start of the main route from campus into town. It was somewhat sketchy, and we were warned not to use it at night; it is good to see that it has since had proper lighting installed. There were three prominent pieces of graffiti on the walls and gates:

  • The words “WATFROD F.C. RULES OK” (yes, that spelling) in huge letters.
  • The words “Ah good the sea!” in chalk. That seems to have been there for years, it was still there a few years ago, people must re-chalk it from time-to-time (I would, I suppose, if I noticed it was fading).
  • The words “Meat is Murder” written at the top of a gate to a field that sometimes had cows in it. Later joined by various other (rather less sincerely meant) slogans, such as “Veg is Vomit” and “Fish is Foul”.

Computational Thinking (1)

September 18th, 2018

The idea of computational thinking has been used to denote a set of skills that should be promoted as part of a broad education. The term originates with work by Jeannette Wing (e.g. this CACM article) over a decade ago. Computational thinking has developed to mean two slightly different things. Firstly, the use of ideas coming out of computing for a wide variety of tasks, not always concerned with implementing solutions on computers. Systematic descriptions of processes, clear descriptions of data, ideas of data types, etc. are seen as valuable mental concepts for everyone to learn and apply. As a pithy but perhaps rather tone-deaf saying has it: “coding is the new Latin”.

A second, related, meaning is the kinds of thinking required to convert a complex real-world problem into something that can be solved on a computer. This requires a good knowledge of coding and the capabilities of computer systems, but it isn’t exactly the coding process as such: it is the process required to get to the point where the task is obvious to an experienced coder. These are the kind of tasks that are found in the Bebras problem sets, for example. We have found these very effective in checking whether people have the skills in abstraction and systematisation that are needed before attempting to learn to code; they test the kinds of things that are needed in computational thinking without requiring actual computing knowledge.

A thought that occurred to me today is that these problems provide a really good challenge for artificial intelligence. Despite being described as “computational thinking” problems, they are actually problems that test the kind of things that computers cannot do—the interstitial material between the messy real world and the structured problems that can be tackled by computer. This makes them exactly the sort of things that AI ought to be working towards and where we could gain lots of insight about intelligence. One promising approach is the “mind’s eye” visual manipulation described by Maithilee Kunda in this paper about visual mental imagery and AI.

Lift and What?!

September 12th, 2018

Your current SQL Servers can lift and shit without seeing any change,

Scared (1)

September 11th, 2018

As we approach the beginning of term, and a new cohort of students joining our universities, it is worth remembering that a decent number of our new students are arriving frightened of us, or assuming that we will look down on them. I think that the comment here, from a student admissions forum, is not untypical:
You never feel like you're inferior to the seminar leaders (despite their PhDs!) and nearly all of them are genuinely nice people.
It is important, in our first few interactions with them, to make it clear that this isn’t the case.


August 28th, 2018

What happens if you wrap an Engagement Chicken inside a Hand-knitted Sweater?