Is the current state of computer science education analogous to a situation where there were no business schools, and everyone who wanted to do “business studies” had to do economics instead?
Archive for the ‘Computing and IT’ Category
The idea of computational thinking as as set of skills that should be promoted as part of a broad education. The term originates with work by Jeanette Wing (e.g. this CACM article) over a decade ago. Computational thinking has developed to mean two, slightly different things. Firstly, the use of ideas coming out of computing for a wide variety of tasks, not always concerned with implementing solutions on computers. Systematic descriptions of processes, clear descriptions of data, ideas of data types, etc. are seen as valuable mental concepts for everyone to learn and apply. As a pithy but perhaps rather tone-deaf saying has it: “coding is the new Latin”.
A second, related, meaning is the kinds of thinking required to convert a complex real-world problem into something that can be solved on a computer. This requires a good knowledge of coding and the capabilities of computer systems, but is isn’t exactly the code process as such: it is the process required to get to the point where the task is obvious to an experienced coder. These are the kind of tasks that are found in the Bebras problem sets, for example. We have found these very effective in checking whether people have the skills in abstraction and systematisation that are needed before attempting to learn to code; they test the kinds of things that are needed in computational thinking without requiring actual computing knowledge.
A thought that occurred to me today is that these problems provide a really good challenge for artificial intelligence. Despite being described as “computational thinking” problems, they are actually problems that test the kind of things that computers cannot do—the interstitial material between the messy real world and the structured problems that can be tackled by computer. This makes them exactly the sort of things that AI ought to be working towards and where we could gain lots of insight about intelligence. One promising approach is the “mind’s eye” visual manipulation described by Maithilee Kunda in this paper about visual mental imagery and AI.
The major social media companies have basically been providing the same, largely unchanging product, for the last decade. Yes—they are doing it very well, managing to scale number of users and amounts of activity, and optimising the various conflicting factors around usability, advertising, etc. But, basically, Twitter has been doing the same schtick for the last decade. Yet, if media and government were looking to talk to an innovative, forward-looking company, they might well still turn to such companies.
By contrast, universities, where there is an enormous, rolling programme of change and updating, keeping up with research, innovating in teaching, all in the context of a regulatory and compliance regime that would be seen as mightily fuckoffworthy if imposed on such companies, are portrayed as the lumbering, conservative forces. Why is this? How have the social media companies managed to convey that impression—and how have we in higher education failed?
What’s going on here?
This is the back of the packaging of my protein bar. What’s with the white stripe across the top left? It reads, basically “# _____ DAY, fuelled by 12g of PRIMAL PROTEIN”. Presumably the the # is a hashtag marker, and there is meant to be some text between that and “DAY”. Is this some kind of fill-in-the-blank exercise? I don’t think so, it seems rather obscure without any further cue. Did it at one point say something that they had to back away from for legal reasons: “# TWO OF YOUR FIVE A DAY”, perhaps? If so, why redesign it with a white block? Does packaging work on such a tight timescale that they were all ready to go, when someone emailed from legal to say “uh, oh, better drop that” and so someone fired up Indesign and put a white block there. Surely it can’t be working on such a timescale that there wasn’t time enough to make it the same shade of red as the rest, or rethink it, or just blank out the whole thing. Is it just a production error? At first I thought it was a post-hoc sticker to cover up some unfortunate error, but it is a part of the printed packaging. A minor mystery indeed.
Every cloud computer has a very expensive data centre lining.
Slack—like email, but somehow with a lot less guilt about ignoring it.
It’s surprising to me, in a world where social media is generally assumed to be ubiquitous, how many people have minimal-to-no online presence. Whilst I was sorting through piles of stuff from my Dad’s house (well, sorting out in the sense of looking at it and then putting it in a box in a storage unit), I came across a lot of things with names on—old school photos, programmes from concerts and plays at school with lists of pupils and teachers, lists of people who were involved in societies at University, details of distant family members, etc. Looking up some people online, I was surprised how often there was no online trace. I understand that some people might have changed names, gone to ground, died, or whatever, but a good third of people, I would say, had no or close-to-no online presence. Don’t quite know what to make of this, but it shows how the idea that we are a completely online community to be unreliable.
The flexibility of computer languages is considered to be one of their sources of power. The ability for a computer to do, within limits of tractability and Turing-completeness, anything with data is considered one of the great distinguishing features of computer science. Something that surprises me is that we fell into this very early on in the history of computing; very early programmable computer systems were already using languages that offered enormous flexibility. We didn’t have a multi-decade struggle where we developed various domain-specific languages, and then the invention of Turing-complete generic languages was a key point in the development of computer programming. As-powerful-as-dammit languages were—by accident, or by the fact of languages already building on a strong tradition in mathematical logic etc.—there from the start.
Yet, in practice, programmers don’t use this flexibility.
How often have we written a loop such as for (int i=0;i<t;i++)? Why, given the vast flexibility to put any expression from the language in those three slots, hardly put anything other than a couple of different things in there? I used to feel that I was an amateurish programmer for falling into these clichés all the time—surely, real programmers used the full expressivity of the language, and it was just me with my paucity of imagination that wasn’t doing this.
But, it isn’t. Perhaps, indeed, the clichés are a sign of maturity of thinking, a sign that I have learned some of the patterns of thought that make a mature programmer?
The studies of Roles of Variables put some meat onto these anecdotal bones. Over 99% of variable usages in a set of programs from a textbook were found to be doing just one of around 10 roles. An example of a role is most-wanted holder, where the variable holds the value that is the “best” value found so far, for some problem-specific value of “best”. For example, it might be the current largest in a program that is trying to find the largest number in a list.
There is a decent argument that we should make these sorts of things explicit in programming languages. Rather than saying “int” or “string” in variable declarations we should instead/additionally say “stepper” or “most recent holder”. This would allow additional pragmatic checks to see whether the programmer was using the variable in the way that they think they are.
Perhaps there is a stronger argument though. Is it possible that we might be able to reason about such a restricted language more powerfully than we can a general language? There seems to be a tension between the vast Turing-complete capability of computer languages, and the desire to verify and check properties of programs. Could a subset of a language, where the role-types had much more restricted semantics, allow more powerful reasoning systems? There is a related but distinct argument that I heard a while ago that we should develop reasoning systems that verify properties of Turing-incomplete fragments of programs (I’ll add a reference when I find it, but I think the idea was at very early stages).
Les Hatton says that Software is cursed with unconstrained creativity. We have just about got to a decent understanding of our tools when trends change, and we are forced to learn another toolset—with its own distinctive set of gotchas—all over again. Where would software engineering have got to if we had focused not on developing new languages and paradigms, but on becoming master-level skilled with the already sufficiently expressive languages that already existed? There is a similar flavour here. Are we using languages that allow us to do far more than we ever need to, and subsequently limiting the reasoning and support tools we can provide?
Here’s a thought, which came from a conversation with Richard Harvey t’other week. Is it possible for a degree to harm your job prospects? The example that he came up with was a third class degree in some vocational or quasi-vocational subject such as computer science. If you have a third class degree in CS, what does that say to prospective employers? Firstly, that you are not much of a high-flyer in the subject—that is a no-brainer. But, it also labels you as someone who is a specialist—and not a very good one! The holder of a third in history, unless they are applying specifically for a job relating to history, isn’t too much harmed by their degree. Someone sufficiently desperate will take them on to do something generic (this relates to another conversation I had about careers recently—what are universities doing to engage with the third-class employers that will take on our third-class graduates? Perhaps we need to be more proactive in this area, rather than just dismissive, but this requires a degree of tact beyond most people.). But a third-class computing/architecture/pharmacy student is stuck in the bind that they have declared a professional specialism, and so employers will not consider them for a generic role; whilst at the same time evidencing that they are not very good in the specialism that they have identified with. Perhaps we need to do more for these students by emphasising the generic skills that computer science can bring to the workplace—”computing is the new Latin” as a rather tone-deaf saying goes.
Two interesting machine learning/AI challenges (emerging from a chat with my former PhD student Lawrence Beadle yesterday):
- Devise a system for automatically doing substitutions in online grocery shopping, including the case which recognises that substituting a Manchester City-themed birthday cake is not an adequate substitution for a Manchester United-themed birthday cake, despite them both being birthday cakes, of the same weight, same price, and both having the word “Manchester” in the name.
- Devise a forecasting system that will not predict that demand for turkeys will be enormous on December 27th, or flowers on February 15th.
Both of these need some notion of context, and perhaps even explanation.
Why aren’t more legal-regulatory systems in conflict? A typical legal decision involves a number of different legal, contractual, and regulatory systems, each of which consists of thousands of statements of law and precedents, that latter only fuzzily fitting the current situation, with little meta-law to describe how these different systems and statements interact. Why, therefore, is it very rarely, if at all, that court cases and other legal decisions end up with a throwing up of hands and the judgement-makers saying “this says this, this says this, they contradict, therefore we cannot come to a well-defined decision”. Somehow, we avoid this situation—decisions are come to fairly definitively, albeit sometimes controversially. I cannot imagine that people framing laws and regulations have a sufficiently wide knowledge of the entire system to enable them to add decisions without contradiction. Perhaps something else is happening; the “frames” (in the sense of the frame problem in AI) are sufficiently constrained and non-interacting that it is possible to make statements without running the risk of contradiction elsewhere.
If we could understand this, could we learn something useful about how to build complex software systems?
I often refer to the process of taking the content that I want to communicate and putting it into the 200-by-300 pixel box reserved for content in the middle of our University’s webpages as “putting the clutter in”. I get the impression that my colleagues on the Marketing and Communication team don’t quite see it this way.
Been doing quite a bit of python programming this week. So far I have managed to mistype python as:
Bought the an album called Sex from Amazon a few days ago (by the excellent jazz trio The Necks). Inevitably, this caused the following request for feedback to appear in my inbox a few days later:
Followed, inevitably, by the following when I next went onto the Amazon website:
I went to an interesting talk by Jens Krinke earlier this week at UCL (the video will eventually be on that page). The talk was about work by him and his colleagues on observation-based program slicing. The general idea of program slicing is to take a variable value (or, indeed any state description) at a particular point in a program, and remove parts of the program that could not affect that particular value. This is useful, e.g. for debugging code—it allows you to look at just those statements that are influential on a statement that is outputting an undesirable value—and for other applications such as investigating how closely-coupled code is, helping to split code into meaningful sub-systems, and code specialisation.
The typical methods used in slicing are to use some formal model of dependencies in a language to eliminate statements. A digraph of dependencies is built, and paths that don’t eventually lead to the node of interest are eliminated. This had had some successes, but as Jens pointed out in his talk, progress on this has largely stalled for the last decade. The formal models of dependency that we currently have only allow us to discover certain kinds of dependency, and also using a slicer on a particular program needs a particular model of the language’s semantics to be available. This latter point is particularly salient in the contemporary computing environment, where “programs” are typically built up from a number of cooperating systems, each of which might be written in a different language or framework. In order to slice the whole system, a consistent, multi-language framework would need to be available.
As a contrast to this, he proposed an empirical approach. Rather than taking the basic unit as being a “statement” in the language, take it as a line of code; in most languages these are largely co-incident. Then, work through the program, deleting lines one-by-one, recompiling, and checking whether the elimination of that line makes a difference in practice to the output on a large, comprehensive set of inputs (this over-simplifies the process of creating that input test set, as programs can be complex entities where producing a thorough set of input examples can be difficult, as sometimes a very specific set of inputs is needed to generate a specific behaviour later in the execution; nonetheless, techniques exist for building such sets). This process is repeated until a fix point is found—i.e. none of the eliminations in the current round made a difference to the output behaviour for that specific input set. Therefore, this can be applied to a wide variety of different languages; there is no dependency on a model of the language semantics, all that is needed is access to the source code and a compiler. This enables the use of this on many different kinds of computer systems. For example, in the talk, an example of using it to slice a program in a graphics-description language was given, asking the question “what parts of the code are used in producing this sub-section of the diagram?”.
Of course, there is a cost to pay for this. That cost is the lack of formal guarantee of correctness across the input space. By using only a sample of the inputs, there is a possibility that some behaviour was missed. By contrast, methods that work with a formal model of dependencies make a conservative guarantee that regardless of inputs, the slice will be correct. Clearly, this is better. But, there are limits to what can be achieved using those methods too; by using a model that only allows the elimination of a statement if it is guaranteed under that model to never have a dependency, it ignores two situations. The first of these is that the model is not powerful enough to recognise a particular dependency, even though it is formally true (this kind of thing crops up all over the place; I remember getting frustrated with the Java compiler, which used to complain that a particular variable value “might not have been initialised” when it was completely obvious that it must have been; e.g. in the situation where a variable was declared before an if statement and then given a value in both possible branches, and then used afterward that statement). The second—and it depends on the application as to whether this matters—is that perhaps a formal dependency might crop up so infrequently as to not matter in practice. By taking an empirical approach, we observe programs as they are being run, rather than how they could be run, and perhaps therefore find a more rapid route to e.g. bug-finding.
In the question session after the talk, one member of the audience (sorry, didn’t notice who it was) declared that they found this approach “depressing”. Not, “wrong” (though other people may have thought that). The source of the depression, I would contend, is what I will call the fallacy of formal representations. There is a sense that permeates computer science that because we have an underlying formal representation for our topic of study, we ought to be doing nothing other than producing tools and techniques that work on that formal representation. Empirical techniques are both dangerous—they produce results that cannot be guaranteed, mathematically, to hold—and a waste of time—we ought to be spending our time producing better techniques that formally analyse the underlying representation, and that it is a waste of time to piss around with empirical techniques, because eventually they will be supplanted by formal techniques.
I would disagree with this. “Eventually” is a long time, and some areas have just stalled—for want of better models, or just in terms of practical application to programs/systems of a meaningful size. There is a lot of code that doesn’t require the level of guarantee that the formal techniques provide, and we are holding ourselves up as a useful discipline if we focus purely on techniques that are appropriate for safety-critical systems, and dismiss techniques that are appropriate, for, say, the vast majority of the million+ apps in the app store.
Other areas of study—let’s call them “science”—are not held up by the same mental blockage. Biology and physics, for example, don’t throw their hands up in the air and say “nothing can be done”, “we’ll never really understand this”, just because there isn’t an underlying, complete set of scientific laws available a priori. Instead, a primary subject of study in those areas is the discovery of those laws, or at least useful approximations thereto. Indeed, the development of empirical techniques to discover new things about the phenomena under study is an important part of these subject areas, to the extent that Nobel Prizes have been won (e.g. 1977; 2003; 1979; 2012; 2005) for the development of various measurement and observation techniques to get a better insight into physical or biological phenomena.
We should be taking—alongside the more formal approaches—an attitude similar to this in computer science. Yes, many times we can gain a lot by looking at the underlying formal representations that produce e.g. program behaviour. But in many cases, we would be better served by taking these behaviours as data and applying the increasingly powerful data science techniques that we have to develop an understanding of them. We are good at advocating the use of data science in other areas of study; less good at taking those techniques and applying them to our own area. I would contend that the fallacy of formal representations is exactly the reason behind this; because we have access to that underlying level, we cannot convince ourselves that, with sufficient thought and care, we cannot extract the information that we need from ratiocination about that material, rather than “resorting” to looking at the resulting in an empirical way. This also prevents the development of good intermediate techniques, e.g. those that use ideas such as interval arithmetic and qualitative reasoning to analyse systems.
Mathematics has a similar problem. We are accustomed to working with proofs—and rightly so, these are the bedrock of what makes mathematics mathematics—and also with informal, sketched examples in textbooks and talks. But, we lack an intermediate level of “data rich mathematics”, which starts from formal definitions, and uses them to produce lots of examples of the objects/processes in question, to be subsequently analysed empirically, in a data-rich way, and then used as the inspiration for future proofs, conjectures and counterexamples. We have failed, again due to the fallacy of formal representations, to develop a good experimental methodology for mathematics.
It is interesting to wonder why empirical techniques are so successful in the natural sciences, yet are treated with at best a feeling of depressed compromise, at worst complete disdain, in computer science. One issue seems to be the brittleness of computer systems. We resort (ha!) to formal techniques because there is a feeling that “things could slip through the net” if we use empirical techniques. This seems to be much less the case in, say, biological sciences. Biologists will, for example, be confident what they have mapped out a signalling pathway fairly accurately having done experiments on, say, a few hundred cells. Engineers will feel that they understand the behaviour of some material having carefully analysed a few dozen samples. There isn’t the same worry that, for example, there is some critical tight temperature range, environmental condition, or similar, that could cause the whole system to behave in a radically different way. Something about programs feels much more brittle; you just need the right (wrong!) state to be reached for the whole system to change its behaviour. This is the blessing and the curse of computer programming; you can do anything, but you can also do anything, perhaps by accident. A state that is more-or-less the same as another state can be transformed into something radically different by a single line of code, which might leave the first state untouched (think about a divide-by-zero error).
Perhaps, then, the fault is with language design, or programming practice. We are stuck with practices from an era where every line of code mattered (in memory cost or execution time), so we feel the need to write very tight, brittle code. Could we redesign languages so that they don’t have this brittleness, thus obviating the need for the formal analysis methods that are there primarily to capture the behaviours that don’t occur with “typical” inputs. What if we could be confident—even, perhaps, mathematically sures—that there were no weird pathological routes through code? Alternatively, what if throwing more code at a problem actually made us more confident of it working well; rather than having tight single paths through code, have the same “behaviour” carried out and checked by a large number of concurrent processes that interact in a way that don’t have the dependencies of traditional concurrency models (when was the last time that a biosystem deadlocked, or a piece of particle physics, for that matter?). What if each time we added a new piece of code to a system, we felt that we were adding something of value that interacted in only a positive way with the remainder of the code, rather than fearing that we have opened up some kind of potential interaction or dependency that will cause the system to fail. What if a million lines of code couldn’t be wrong?
Here’s an interesting phenomenon about memory. I sometimes remember things in an indirect way, that is, rather than remembering something directly, I remember how it deviates from the default. Two examples:
- On my father’s old car, I remembered how to open the window as “push the switch in the opposite way to what seems like the right direction.”
- On my computer, I remember how to find things about PhD vivas as “really these ought to be classified under research, but there’s already a directory called ‘external examining’ under teaching, so go in there and look for the directory called extExams and then the sub-directory called PhD“.
It makes me wonder what other things that I do have a similar convoluted story in my memory, but where the process just all happens pre-consciously.
Brain-computer interfaces are currently at about the same stage of development as a system where a moderately skilled person throws golf balls at a keyboard from around 30 feet away.
My colleague Sally Fincher has pointed out that one interesting aspect of architecture and design academics is that the vast majority of them continue with some kind of personal practice in their discipline alongside carrying out their teaching and research work. This contrasts with computer science, where such a combination is rather unusual. It might be interesting to do a pilot scheme that gave some academic staff a certain amount of time to do this in their schedule, and see what influence it has on their research and teaching.
Interestingly, a large proportion of computer science students have a personal practice in some aspect of computing/IT. It is interesting to note quite how many of our students are running a little web design business or similar on the side, alongside their studies.
An interesting challenge for computational creativity research. Build a system which takes in a large dataset, and which builds an interesting and informative infographic from that data.