"Real Artists Ship"

Colin Johnson’s blog


Archive for the ‘Computing and IT’ Category

Harder than you Think

Wednesday, September 28th, 2022

Here’s an interesting example of where technology that tries to be helpful makes something really difficult when you step just a tiny bit outside the intended use case.

A colleague sent me a document with a QR code for an event—a link to Eventbrite. There was no URL for the event, and I wanted to send the link to a group of students via email. Ok—let’s see if we can scan the QR code from the document somehow. No dice. There isn’t, as far as I can tell, any way to make my laptop’s QR scanner look at something on the screen rather than through the camera.

Ok, let’s look at it on my phone. Pointing the phone camera at my laptop screen detects the QR code, but the phone automatically forwards it to the Eventbrite app, so I still don’t have a URL. Tried searching for the name of the event on Eventbrite, but it is a private, unlisted event, so it didn’t show up in search.

Tried next to see if there was a website that converted QR to URL. Hard to find—most searches for “QR to URL” took me to sites that created QR codes from URLs, even with “QR to URL”, “URL from QR” etc. in quotation marks. Most of the “successful” links were not services to do it, but code fragments for coders wanting to build this functionality into a program. Found two sites that claimed to do this; took a screengrab of the QR code and uploaded it to them—one didn’t recognise it, the other just hung with “processing…” for several minutes.

Eventual solution: downloaded a new QR reader app on my phone which, thankfully, didn’t automatically open the Eventbrite app. Copied-and-pasted the URL into the Notes app, which synced with my laptop, and from there I was able to paste it into an email.
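
(For the record, the kind of code fragment those searches kept turning up really is only a few lines. A minimal sketch in Python, assuming the pyzbar and Pillow libraries are installed and that screenshot.png is a hypothetical screengrab of the QR code:)

    # Decode a QR code from a screengrab rather than a live camera feed.
    from PIL import Image
    from pyzbar.pyzbar import decode

    for symbol in decode(Image.open("screenshot.png")):
        print(symbol.data.decode("utf-8"))  # prints the URL embedded in the QR code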

Acceptability of Deepfakes for Trivial Corrections: The Thin End of a Wedge?

Wednesday, June 17th, 2020

Clearly deepfakes are unacceptable, yes? It is morally unsound to create a fake video of someone saying or doing something, and to pass that off as a real recording of that person doing it.

But, what about a minor correction? I recently saw a video about personal development, talking about how people move through various stages of life, and making a number of very positive points and pieces of advice. I thought that this might be useful as part of a professional development session to show to a group of students. But, there was a problem. At some point, the speaker talks about life changes, and talks about adolescence, including a reference to “when people start to get interested in the opposite sex”. The heteronormativity of this made me flinch, and I certainly wouldn’t want this to be presented, unadorned, to a group of students. This is both because of the content as such, and because I wouldn’t want the session to be derailed onto a discussion of this specific point, when it was a minor and easily replaceable example, not core to the argument.

I suppose what I would typically do would be to use it, but offer a brief comment at the beginning: there is something in the video that is not germane to the main argument and is problematic, but on balance I thought the resource was worth using despite the problematic phrase. I might even edit it out. Certainly if I were handing out a transcript rather than using the video, I would cut it out and mark the omission with an ellipsis […]. But, these solutions might still focus attention on it.

So—would it be acceptable to use a deepfake here? To replace “when people start to get interested in the opposite sex” with “when people start to develop an awareness of sexuality”, for example? There seems something dubious about this—we are putting words into someone’s mouth (well, more accurately, putting their mouth around some words). But, we aren’t really manipulating the main point. It’s a bit like how smoking has been edited out of some films, particularly when they are to be shown to children—the fact of the character smoking isn’t a big plot point, it was just what a character happened to be doing.

So, is this acceptable? Not acceptable? Just about okay, but the thin end of the wedge?

Big Scary Words

Tuesday, May 19th, 2020

I once saw a complaint about a crowdfunded project that was going awry. The substance of the complaint was that, in addition to their many other failings, the people funded by the project had used some of the money to set up a company. Basically: “I paid you to make a widget, not to waste my money setting up a company”. There’s an interesting contrast in the view of the word “company” here. To someone in business, spending a few hundred pounds to register a company is a basic starting point, providing a legal entity that can take money, hold the legal rights to inventions in a safe way, provide protection from personal bankruptcy, etc. But to the person making the complaint, “setting up a company” no doubt meant buying a huge office building, employing piles of accountants and HR departments, and whatnot.

We see a similar thing with other terms—some things that are part of normal business processes sound like something special and frightening to people who aren’t engaging with these processes as part of their day-to-day life. For example, your data being put “on a database” can sound like a big and scary process, something out-of-the-ordinary, rather than just how almost all data is stored in organisations of any substantial size. Similarly, “using an algorithm” can sound like your data is being processed in a specific way (perhaps rather anonymous and deterministic—the computer “deciding your fate”), rather than “algorithm” simply being a word used to describe any computer-based process.

We need to be wary of such misunderstandings in describing our processes to a wider public.

Legacy Code (1)

Monday, June 24th, 2019

It’s fascinating what hangs around in code-bases for decades. Following a recent update, Microsoft Excel files in a certain format (the old style .xls files rather than the .xlsx files) started showing up with this icon:

(old Excel icon)

Which I haven’t seen for a couple of decades. More interestingly, the smaller version of the icon was this one:

(old ResEdit icon)

What has this to do with Excel? It looked vaguely familiar, but I couldn’t place it. After a bit of thought and Googling around, I realised that this was the icon for a program called ResEdit, which was an editor for binary files that I remember using back in the pre-OS X days. Looking at this further, I realised that the last version of ResEdit was made in 1994.

How did these suddenly appear? There are occasional references to this happening on various Mac forums from the last few years. I suspect that they are lurking in collections of visual assets in codebases that have been under continuous development for the last 30 years or more, and that somehow some connection to the contemporary icon has been deleted or mis-assigned. I’m particularly surprised that Excel wasn’t completely written from scratch for OS X.

What do people think coding is like?

Monday, April 22nd, 2019

I wonder what activity non-coders think coding is like? I remember having a conversation with a civil servant a few years ago, where he struggled to understand why we were talking about coding being “creative” etc. I think that his point of view is not uncommon—seeing coding as something that requires both intellectual vigilance and slog, but is fairly “flat” as an activity.

Perhaps people think of it as like indexing a book? Lots of focus and concentration is needed, and you need some level of knowledge, and it is definitely intellectual, “close work”. But, in the end, it doesn’t have its ups and downs, and isn’t typically that creative; it’s just a job that you get on with.

Perhaps they think it is like what they imagine mathematics to be like? Lots of pattern-matching, finding which trick fits which problem, working through lots of line-by-line stuff that kinda rolls out, albeit slowly and carefully, once you know what to do. This isn’t entirely absent from the coding process, but that picture lacks the ups and downs that doing maths or doing coding actually has.

If people have a social science background, perhaps they think of “coding” in the sense of “coding an interview”—going through, step by step, assigning labels to text (and often simultaneously coming up with or modifying that labelling scheme). Again, this has the focus that we associate with coding, but again it is rather “flat”.

Perhaps it would be interesting to do a survey on this?

Differentiation in the Lecture Room

Thursday, February 14th, 2019

Students come to university with a wide range of ability and prior knowledge, and take to different subjects with different levels of engagement and competence. This spread isn’t as wide as in other areas of education—after all, students have chosen to attend, been selected within a particular range of grades, and are doing a subject of their choice—but, there is still a decent amount of variation there.

How do we deal with this variation? In school education, they talk a lot about differentiation—arranging teaching and learning activities so that students of different levels of ability, knowledge, progress, etc. can work on a particular topic. I think that we need to do more of this at university; so much university teaching is either aimed at the typical 2:1 student, or is off-the-scale advanced. How can we make adjustments so that our teaching recognises the diversity of students’ knowledge and experience?

In particular, how can we do this in lectures? If we have a canonical, non-interactive lecture, can we do this? I think we can: here are some ideas:

Asides. I find it useful to give little parenthetical asides as part of the lecture. Little definitions, bits of background knowledge. I do this particularly for the cultural background knowledge in the Computational Creativity module, often introduced with the phrase “as you may know”. For example: “Picasso—who, as you may know, was a painter in the early-mid 20th century who invented cubism which plays with multiple perspectives in the same painting—was…”. This is phrased so that it more-or-less washes over those who don’t need it, but is there as a piece of anchoring information for those that do. Similarly for mathematical definitions: “Let’s represent this as a matrix—which, you will remember from your maths course, is a grid of numbers—…”. Again, the reinforcement/reminder is there, without patronising or distracting the students who have this knowledge by having a “for beginners” slide.

Additional connections. Let’s consider the opposite—those students who are very advanced, and have a good broad knowledge of the area. I differentiate for these by making little side-comments that connect to the wider course or other background knowledge. Sometimes introduced with a phrase such as “if you have studied…” or “for those of you that know about…”. For example: “for those of you who have done an option in information retrieval, this might remind you of tf-idf.”. Again, this introduces the connection without putting it on a slide and making it seem big and important to those students who are struggling to manage the basics, but gives some additional information and a spark of a connection for the students who are finding the material humdrum. (I am reminded of an anecdote from John Maynard Smith, who talked about a research seminar where the speaker had said “this will remind you of a phase transition in statistical physics”: “I can’t imagine a time in my life when anything will remind me of a phase transition”).

Code examples. A computing-specific one, this. I’ve found that a lot of students click into something once they have seen a code example. These aren’t needed for the high-flying coding ninjas, who can go from a more abstract description to working out how the code is put together. But, for many students, the code example is the point where all the abstract waffle from the previous few minutes clicks into place. The stronger students can compare the code that they have been writing in their heads to mine. I sometimes do the coding live, but I’ve sometimes chickened out and used a screencap video (this also helps me to talk over the coding activity). A particularly clear example of this was where I showed a double-summation in sigma notation to a group, to largely blank looks, followed by the same process on the next slide as a nested loop, where most students seemed to be following clearly.
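
(For concreteness, a minimal sketch of that last example in Python, with a small made-up grid of numbers; the real slide differs, but the idea is the double summation over i and j of a[i][j] written out as a nested loop:)

    # Total of all entries in an n-by-m grid: the double summation
    # over i and j of a[i][j], written as a nested loop.
    grid = [
        [1, 2, 3],
        [4, 5, 6],
    ]

    total = 0
    for i in range(len(grid)):          # outer summation: over rows
        for j in range(len(grid[i])):   # inner summation: over columns
            total += grid[i][j]

    print(total)  # 21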

Any other thoughts for differentiation tricks and tips specifically in the context of giving lectures?

Microtrends (1)

Thursday, February 7th, 2019

Noticeable recent microtrend—people walking around, holding a phone about 40cm from their face, having a video chat on FaceTime/Skype. Been possible for years, but I’ve noticed a real uptick in this over the last few weeks.

On Responsibility

Sunday, December 30th, 2018

When people collaborate on a codebase to build complex software systems, one of the purported advantages is that fixes spread. It is good to fix or improve something at a high level of abstraction, because then that fix not only helps your own code, but also redounds to improvements in code across the codebase.

However, people often don’t do this. Rather than fixing a problem with some class high up in the class hierarchy, or adding some behaviour to a well-used utility function, they instead write their own, local, often over-specialised version of it.
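
(In code terms, the pattern looks something like this; an entirely made-up Python sketch with illustrative function names:)

    # A shared utility, used across the codebase, that doesn't quite
    # handle the case a developer now needs (a missing price, say):
    def format_price(pence):
        return "£{:.2f}".format(pence / 100)

    # The fix that helps everyone would be to extend format_price itself.
    # What often appears instead is a local, over-specialised copy:
    def format_price_or_dash(pence):
        if pence is None:
            return "-"
        return "£{:.2f}".format(pence / 100)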

Why does this happen? One theory is about fear of breaking things. The fix you make might be right for you, but who knows what other effects it will have? The code’s intended functionality might be very well documented, but perhaps people are relying on abstruse features of a particular implementation to achieve something in their own code. In theory this shouldn’t happen, but in practice the risk:reward ratio is skewed towards not doing the fix.

Another reason—first pointed out to me by Hila Peleg—is that once you have fixed it, your name is in the version control system as the most recent modifier of the code. This often means that the code becomes your de facto responsibility, and questions about it then come to you. Particularly with a large codebase and a piece of code that is well used, you end up taking on a large job that you hadn’t asked for, just for the sake of fixing a minor problem that was getting in the way of your own code. Better to write your own version and duck that responsibility.

Learning what is Unnecessary

Friday, December 28th, 2018

Learning which steps in a process are unnecessary is one of the hardest things to learn. Steps that are unnecessary yet harmless can easily be worked into a routine, and because they cause no problems apart from the waste of time, don’t readily appear as problems.

An example. A few years ago a (not very technical) colleague was demonstrating something to me on their computer at work. At one point, I asked them to google something, and they opened the web browser, typed the URL of the University home page into the browser, went to that page, then typed the Google URL into the browser, went to the Google home page, and then typed their query. This was not a trivial time cost; they were a hunt-and-peck typist who took a good 20-30 seconds to type each URL.

Why did they do the unnecessary step of going to the University home page first? Principally because when they had first seen someone use Google, that person had been at the University home page, and then gone to the Google page; they interpreted being at the University home page as some kind of precondition for going to Google. Moreover, it was harmless—it didn’t stop them from doing what they set out to do, and so it wasn’t flagged up to them that it was a problem. Indeed, they had built a vague mental model of what they were doing—by going to the University home page, they were somehow “logging on”, or “telling Google that this was a search from our University”. It was only on demonstrating it to me that it became clear that it was redundant, because I asked why they were doing it.

Another example. When I first learned C++, I put a semicolon after the closing curly bracket at the end of each block. Again, this is harmless: all it does is to insert some null statements into the code, which I assume the compiler strips out at optimisation. Again, I had a decent mental model for this: a vague notion of “you put semicolons at the end of meaningful units to mark the end”. It was only when I started to look at other people’s code in detail that I realised that this was unnecessary.

Learning these is hard, and usually requires us to either look carefully at external examples and compare them to our behaviour, or for a more experienced person to point them out to us. In many cases it isn’t all that important; all you lose is a bit of time. But, sometimes it can mark you out as a rube, with worse consequences than wasting a few seconds of time; an error like this can cause people to think “if they don’t know something as simple as that, then what else don’t they know?”.

Computer Science

Monday, October 22nd, 2018

Is the current state of computer science education analogous to a situation where there were no business schools, and everyone who wanted to do “business studies” had to do economics instead?

Computational Thinking (1)

Tuesday, September 18th, 2018

The idea of computational thinking has been used to denote a set of skills that should be promoted as part of a broad education. The term originates with work by Jeannette Wing (e.g. this CACM article) over a decade ago. Computational thinking has developed to mean two slightly different things. Firstly, the use of ideas coming out of computing for a wide variety of tasks, not always concerned with implementing solutions on computers. Systematic descriptions of processes, clear descriptions of data, ideas of data types, etc. are seen as valuable mental concepts for everyone to learn and apply. As a pithy but perhaps rather tone-deaf saying has it: “coding is the new Latin”.

A second, related, meaning is the kind of thinking required to convert a complex real-world problem into something that can be solved on a computer. This requires a good knowledge of coding and the capabilities of computer systems, but it isn’t exactly the coding process as such: it is the process required to get to the point where the task is obvious to an experienced coder. These are the kinds of tasks that are found in the Bebras problem sets, for example. We have found these very effective in checking whether people have the skills in abstraction and systematisation that are needed before attempting to learn to code; they test the kinds of things that are needed in computational thinking without requiring actual computing knowledge.

A thought that occurred to me today is that these problems provide a really good challenge for artificial intelligence. Despite being described as “computational thinking” problems, they are actually problems that test the kind of things that computers cannot do—the interstitial material between the messy real world and the structured problems that can be tackled by computer. This makes them exactly the sort of things that AI ought to be working towards and where we could gain lots of insight about intelligence. One promising approach is the “mind’s eye” visual manipulation described by Maithilee Kunda in this paper about visual mental imagery and AI.

Innovative (1)

Tuesday, March 20th, 2018

The major social media companies have basically been providing the same, largely unchanging product, for the last decade. Yes—they are doing it very well, managing to scale the number of users and the amount of activity, and optimising the various conflicting factors around usability, advertising, etc. But, basically, Twitter has been doing the same schtick for the last decade. Yet, if media and government were looking to talk to an innovative, forward-looking company, they might well still turn to such companies.

By contrast, universities, where there is an enormous, rolling programme of change and updating, keeping up with research, innovating in teaching, all in the context of a regulatory and compliance regime that would be seen as mightily fuckoffworthy if imposed on such companies, are portrayed as the lumbering, conservative forces. Why is this? How have the social media companies managed to convey that impression—and how have we in higher education failed?

Design Puzzles (1)

Friday, March 2nd, 2018

What’s going on here?

# _____ DAY

This is the back of the packaging of my protein bar. What’s with the white stripe across the top left? It reads, basically, “# _____ DAY, fuelled by 12g of PRIMAL PROTEIN”. Presumably the # is a hashtag marker, and there is meant to be some text between that and “DAY”. Is this some kind of fill-in-the-blank exercise? I don’t think so; it seems rather obscure without any further cue. Did it at one point say something that they had to back away from for legal reasons: “# TWO OF YOUR FIVE A DAY”, perhaps? If so, why redesign it with a white block? Does packaging work on such a tight timescale that they were all ready to go, when someone emailed from legal to say “uh, oh, better drop that”, and so someone fired up InDesign and put a white block there? Surely it can’t be working on such a timescale that there wasn’t time enough to make it the same shade of red as the rest, or rethink it, or just blank out the whole thing. Is it just a production error? At first I thought it was a post-hoc sticker to cover up some unfortunate error, but it is a part of the printed packaging. A minor mystery indeed.

Variations on Folk Sayings (20)

Friday, November 24th, 2017

Every cloud computer has a very expensive data centre lining.

Guilt-free (1)

Thursday, August 31st, 2017

Slack—like email, but somehow with a lot less guilt about ignoring it.

Not There

Wednesday, August 30th, 2017

It’s surprising to me, in a world where social media is generally assumed to be ubiquitous, how many people have minimal-to-no online presence. Whilst I was sorting through piles of stuff from my Dad’s house (well, sorting out in the sense of looking at it and then putting it in a box in a storage unit), I came across a lot of things with names on—old school photos, programmes from concerts and plays at school with lists of pupils and teachers, lists of people who were involved in societies at University, details of distant family members, etc. Looking up some people online, I was surprised how often there was no online trace. I understand that some people might have changed names, gone to ground, died, or whatever, but a good third of people, I would say, had no or close-to-no online presence. Don’t quite know what to make of this, but it shows the idea that we are a completely online community to be unreliable.

int Considered Harmful; or, Are Computer Languages Too General

Friday, August 25th, 2017

The flexibility of computer languages is considered to be one of their sources of power. The ability for a computer to do, within limits of tractability and Turing-completeness, anything with data is considered one of the great distinguishing features of computer science. Something that surprises me is that we fell into this very early on in the history of computing; very early programmable computer systems were already using languages that offered enormous flexibility. We didn’t have a multi-decade struggle where we developed various domain-specific languages, and then the invention of Turing-complete generic languages was a key point in the development of computer programming. As-powerful-as-dammit languages were—by accident, or by the fact of languages already building on a strong tradition in mathematical logic etc.—there from the start.

Yet, in practice, programmers don’t use this flexibility.

How often have we written a loop such as for (int i=0;i<t;i++)? Why, given the vast flexibility to put any expression from the language in those three slots, do we hardly ever put anything other than a couple of different things in there? I used to feel that I was an amateurish programmer for falling into these clichés all the time—surely, real programmers used the full expressivity of the language, and it was just me with my paucity of imagination that wasn’t doing this.

But, it isn’t just me. Perhaps, indeed, the clichés are a sign of maturity of thinking, a sign that I have learned some of the patterns of thought that make a mature programmer?

The studies of Roles of Variables put some meat onto these anecdotal bones. Over 99% of variable usages in a set of programs from a textbook were found to be doing just one of around 10 roles. An example of a role is most-wanted holder, where the variable holds the value that is the “best” value found so far, for some problem-specific value of “best”. For example, it might be the current largest in a program that is trying to find the largest number in a list.
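
(A minimal Python sketch of that role, with a made-up list of numbers; largest_so_far is the most-wanted holder:)

    numbers = [3, 17, 2, 9, 41, 5]

    largest_so_far = numbers[0]   # most-wanted holder: the best value found so far
    for n in numbers[1:]:         # n takes each remaining value in turn
        if n > largest_so_far:
            largest_so_far = n

    print(largest_so_far)  # 41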

There is a decent argument that we should make these sorts of things explicit in programming languages. Rather than saying “int” or “string” in variable declarations we should instead/additionally say “stepper” or “most recent holder”. This would allow additional pragmatic checks to see whether the programmer was using the variable in the way that they think they are.
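
(What might that look like? A hypothetical sketch in Python; Stepper is an invented wrapper rather than a real language feature, but it shows the kind of pragmatic check meant here:)

    class Stepper:
        """Hypothetical role-typed variable: only ever steps forwards."""
        def __init__(self, value):
            self.value = value

        def step_to(self, new_value):
            if new_value <= self.value:
                raise ValueError("role violation: a stepper moved backwards")
            self.value = new_value

    i = Stepper(0)
    i.step_to(1)      # fine
    i.step_to(2)      # fine
    try:
        i.step_to(1)  # violates the declared role
    except ValueError as err:
        print(err)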

Perhaps there is a stronger argument though. Is it possible that we might be able to reason about such a restricted language more powerfully than we can a general language? There seems to be a tension between the vast Turing-complete capability of computer languages, and the desire to verify and check properties of programs. Could a subset of a language, where the role-types had much more restricted semantics, allow more powerful reasoning systems? There is a related but distinct argument that I heard a while ago that we should develop reasoning systems that verify properties of Turing-incomplete fragments of programs (I’ll add a reference when I find it, but I think the idea was at very early stages).

Les Hatton says that software is cursed with unconstrained creativity. We have just about got to a decent understanding of our tools when trends change, and we are forced to learn another toolset—with its own distinctive set of gotchas—all over again. Where would software engineering have got to if we had focused not on developing new languages and paradigms, but on becoming master-level skilled with the sufficiently expressive languages that already existed? There is a similar flavour here. Are we using languages that allow us to do far more than we ever need to, and consequently limiting the reasoning and support tools we can provide?

Variations on Folk Sayings (18)

Wednesday, May 31st, 2017

Stranger things have happened in C.

Worse than Nothing?

Saturday, May 27th, 2017

Here’s a thought, which came from a conversation with Richard Harvey t’other week. Is it possible for a degree to harm your job prospects? The example that he came up with was a third class degree in some vocational or quasi-vocational subject such as computer science. If you have a third class degree in CS, what does that say to prospective employers? Firstly, that you are not much of a high-flyer in the subject—that is a no-brainer. But, it also labels you as someone who is a specialist—and not a very good one! The holder of a third in history, unless they are applying specifically for a job relating to history, isn’t too much harmed by their degree. Someone sufficiently desperate will take them on to do something generic (this relates to another conversation I had about careers recently—what are universities doing to engage with the third-class employers that will take on our third-class graduates? Perhaps we need to be more proactive in this area, rather than just dismissive, but this requires a degree of tact beyond most people.). But a third-class computing/architecture/pharmacy student is stuck in the bind that they have declared a professional specialism, and so employers will not consider them for a generic role; whilst at the same time evidencing that they are not very good in the specialism that they have identified with. Perhaps we need to do more for these students by emphasising the generic skills that computer science can bring to the workplace—”computing is the new Latin” as a rather tone-deaf saying goes.

Machine Learning with Context (1)

Friday, March 3rd, 2017

Two interesting machine learning/AI challenges (emerging from a chat with my former PhD student Lawrence Beadle yesterday):

  1. Devise a system for automatically doing substitutions in online grocery shopping, one which recognises that a Manchester City-themed birthday cake is not an adequate substitute for a Manchester United-themed birthday cake, despite both being birthday cakes of the same weight and price, and both having the word “Manchester” in the name.
  2. Devise a forecasting system that will not predict that demand for turkeys will be enormous on December 27th, or flowers on February 15th.

Both of these need some notion of context, and perhaps even explanation; the sketch below shows why a purely feature-based matcher falls straight into the first trap.
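
(A deliberately naive Python sketch of such a substitution scorer, with entirely made-up items and weights; it scores on superficial features only, and so rates the rival team’s cake as a near-perfect match:)

    def naive_substitution_score(item, candidate):
        """Score a proposed substitution on superficial features only."""
        score = 0
        if item["category"] == candidate["category"]:
            score += 5            # same category of product
        if abs(item["price"] - candidate["price"]) < 0.50:
            score += 2            # similar price
        shared = set(item["name"].lower().split()) & set(candidate["name"].lower().split())
        score += len(shared)      # one point per shared word in the name
        return score

    requested = {"name": "Manchester United birthday cake", "category": "cakes", "price": 10.00}
    offered = {"name": "Manchester City birthday cake", "category": "cakes", "price": 10.00}

    print(naive_substitution_score(requested, offered))  # 10: a near-perfect score for a terrible substitution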