This is the first episode of a special Speaking With podcast series titled No Problem Too Big, where a panel of artists and researchers speculate on the end of the world as though it has already happened.
It’s not the world we grew up in. Not since artificial intelligence. The machines have taken control.
Three fearless researchers gather in the post-apocalyptic twilight: a computer scientist, a mechanical engineer and a sci-fi author.
Together, they consider the implications of military robots and autonomous everything, and discover that the most horrifying post-apocalyptic scenario might look something like unrequited robot love.
Joanne Anderton is an award-winning author of speculative fiction stories for anyone who likes their worlds a little different. More information about Joanne and her novels can be found here.
No Problem Too Big is created and hosted by Adam Hulbert, who lectures in media and sonic arts at the University of New South Wales. It is produced with the support of The Conversation and University of New South Wales.
Artificial intelligence is already helping determine your future – whether it’s your Netflix viewing preferences, your suitability for a mortgage or your compatibility with a prospective employer. But can we agree, at least for now, that having an AI determine your guilt or innocence in a court of law is a step too far?
Worryingly, it seems this may already be happening. When American Chief Justice John Roberts recently attended an event, he was asked whether he could forsee a day “when smart machines, driven with artificial intelligences, will assist with courtroom fact finding or, more controversially even, judicial decision making”. He responded: “It’s a day that’s here and it’s putting a significant strain on how the judiciary goes about doing things”.
Roberts might have been referring to the recent case of Eric Loomis, who was sentenced to six years in prison at least in part by the recommendation of a private company’s secret proprietary software. Loomis, who has a criminal history and was sentenced for having fled the police in a stolen car, now asserts that his right to due process was violated as neither he nor his representatives were able to scrutinise or challenge the algorithm behind the recommendation.
The report was produced by a software product called Compas, which is marketed and sold by Nortpointe Inc to courts. The program is one incarnation of a new trend within AI research: ones designed to help judges make “better” – or at least more data-centric – decisions in court.
While specific details of Loomis’ report remain sealed, the document is likely to contain a number of charts and diagrams quantifying Loomis’ life, behaviour and likelihood of re-offending. It may also include his age, race, gender identity, browsing habits and, I don’t know … measurements of his skull. The point is we don’t know.
What we do know is that the prosecutor in the case told the judge that Loomis displayed “a high risk of violence, high risk of recidivism, high pretrial risk.” This is standard stuff when it comes to sentencing. The judge concurred and told Loomis that he was “identified, through the Compas assessment, as an individual who is a high risk to the community”.
The Wisconsin Supreme Court convicted Loomis, adding that the Compas report brought valuable information to their decision, but qualified it by saying he would have received the same sentence without it. But how can we know that for sure? What sort of cognitive biases are involved when an all-powerful “smart” system like Compas suggests what a judge should do?
Now let’s be clear, there is nothing “illegal” about what the Wisconsin court did – it’s just a bad idea under the circumstances. Other courts are free to do the same.
Worryingly, we don’t actually know the extent to which AI and other algorithms are being used in sentencing. My own research indicates that several jurisdictions are “trialling” systems like Compas in closed trials, but that they cannot announce details of their partnerships or where and when they are being used. We also know that there are a number of AI startups that are competing to build similar systems.
However, the use of AI in law doesn’t start and end with sentencing, it starts at investigation. A system called VALCRI has already been developed to perform the labour-intensive aspects of a crime analyst’s job in mere seconds – wading through tonnes of data like texts, lab reports and police documents to highlight things that may warrant further investigation.
The UK’s West Midlands Police will be trialling VALCRI for the next three years using anonymised data – amounting to some 6.5m records. A similar trial is underway from the police in Antwerp, Belgium. However, past AI and deep learning projects involving massive data sets have been have been problematic.
Benefits for the few?
Technology has brought many benefits to the court room, ranging from photocopiers to DNA fingerprinting and sophisticated surveillance techniques. But that doesn’t mean any technology is an improvement.
While using AI in investigations and sentencing could potentially help save time and money, it raises some thorny issues. A report on Compas from ProPublica made clear that black defendants in Broward County Florida “were far more likely than white defendants to be incorrectly judged to be at a higher rate of recidivism”. Recent work by Joanna Bryson, professor of computer science at the University of Bath, highlights that even the most “sophisticated” AIs can inherit the racial and gender biases of those who create them.
What’s more, what is the point of offloading decision making (at least in part) to an algorithm on matters that are uniquely human? Why do we go through the trouble of selecting juries composed of our peers? The standard in law has never been one of perfection, but rather the best that our abilities as mere humans allow us. We make mistakes but, over time, and with practice, we accumulate knowledge on how not to make them again – constantly refining the system.
What Compas, and systems like it, represent is the “black boxing” of the legal system. This must be resisted forcefully. Legal systems depend on continuity of information, transparency and ability to review. What we do not want as a society is a justice system that encourages a race to the bottom for AI startups to deliver products as quickly, cheaply and exclusively as possible. While some AI observers have seen this coming for years, it’s now here – and it’s a terrible idea.
An open source, reviewable version of Compas would be an improvement. However, we must ensure that we first raise standards in the justice system before we begin offloading responsibility to algorithims. AI should not just be an excuse not to invest.
While there is a lot of money to be made in AI, there is also a lot of genuine opportunity. It can change a lot for the better if we get it right, and ensure that its benefits accrue for all and don’t just entrench power at the top of the pyramid.
I have no perfect solutions for all these problems right now. But I do know that when it comes to the role of AI in law we must ask in which context they are being used, for what purposes and with what meaningful oversight. Until those questions can be answered with certainty, be very very sceptical. Or at the very least know some very good lawyers.
The Crystal Maze, the popular UK television show from the early 1990s, included a puzzle that is very useful for explaining one of the main conundrums in artificial intelligence. The puzzle appeared a few times in the show’s Futuristic Zone, one of four zones in which a team of six contestants sought to win “time crystals” that bought time to win prizes at the Crystal Dome at the end of the show.
Never solved in the two-minute time frame, the puzzle was based on a network of connected red circles (see clip below). On the wall was written a clue: “No consecutive letters in adjacent circles”. The letters A to H were printed on circular plates which could be fitted onto each circle.
So what is the right approach? We might start by considering which circles are hardest to label. With a little thought, you might choose the two middle circles, since they have the most connections. Now consider which letters might best be put on them: A and H are natural candidates because they each have only one neighbour (B and G, respectively). We might put them into the grid like this:
We can now do some deduction to eliminate incompatible possibilities for the other circles. For example the top-left circle is connected to both of the central circles. Since no consecutive letters can appear in connected circles, it can’t now contain B or G. Similar reasoning can be applied to the top-right, bottom-left, and bottom-right circles:
The leftmost and rightmost circles have to be treated differently, since each is only adjacent to one central circle. On the left we can rule out B, and on the right we can rule out G:
Look carefully at the remaining options and only the leftmost circle still has G as a possibility, and only the rightmost circle has B. Once we put them in place, we can remove further possibilities from the adjacent circles:
It is now time to make another guess. It seems reasonable to start with the top-left circle and try its first possibility: C. This allows us to rule out D from the adjacent circle and C from the bottom left. If we now guess E for the top-right circle, the bottom-left circle has only one possibility left, D, which leaves just F for the bottom-right circle. We have a solution:
This puzzle is an example of a much wider class of decision-making problems that arise in our lives, such as rostering decisions in a hospital or factory, scheduling buses or trains, or designing medical experiments. To save us the aggravation of coming up with the best solutions, one of the challenges for artificial intelligence is to develop a general way of representing and reasoning about them.
One method is known as the constraint satisfaction problem. Just like our Crystal Maze puzzle, problems that fit this model involve a set of required decisions (“cover each circle with a plate”); a fixed set of possibilities (“use the plates from A to H provided”); and a set of constraints that allow only certain combinations of possibilities (“no consecutive letters in adjacent circles”). If you input the requirements for your particular problem into a piece of software known as a constraint solver, it can then try to solve it. It will do this in much the same way as we solved the puzzle: it combines guessing (we call this “search”) with deduction, ruling out possibilities that cannot be part of a solution based on the decisions made so far.
The greatest challenge for programmers in this field is that as you increase the size of the input problem, it quickly becomes much harder to find solutions. This is directly related to how the software “guesses” the answer. Although our guesses proved correct in our simple puzzle, in AI they can often lead us down blind alleys. With large problems there can be a vast number of possibilities and a similarly vast number of dead ends.
One key question is whether there is some way of reaching solutions without going down these alleys. As yet, we don’t know. This directly relates to one of the most important open questions in computer science, the P vs NP problem, for which the Clay Mathematics Institute in the US is offering Us$1m (£657,000) for a solution. It essentially asks whether every problem whose answer can be checked quickly by a computer can also be quickly solved by a computer.
Until someone solves it, the prevailing view is that it cannot. If so, our software does have to search through all the possible guesses, in which case we need to make it as efficient as possible. One important factor here is the search strategy – which decision we tell the computer to focus on next and which value we assign to it. Also very important is what we decide are the requirements for the particular problem. Mapping our puzzle to a constraint satisfaction template was straightforward, but in real life there are often many different options. Choosing the right strategy and model can be the difference between finding a quick solution and failing in any practical amount of time.
We have now reached the stage where the latest constraint-solving software can solve far more complex practical problems than, say, ten years ago. It was used to plan the scientific activities of the Philae comet lander last year, for instance. It also offers a better way of organising evacuation schedules for large-scale disasters.
Constraint solving has found most success with scheduling problems, but there are other similar AI tools that are more useful for other types of questions. We won’t go into them here, but they include the likes of propositional satisfiability, evolutionary algorithms and mathematical programming techniques. The job of specialists is to analyse a problem, identify which combination of tools will be the most successful for a particular case, and put together a bespoke piece of software. Once computers can do this analysis and identification, hopefully only a few years in the future, we will have made a huge leap forward. Meanwhile, the battle to make each of these tools as powerful as possible continues.
Deep Blue gained world-wide attention in 1997 when it defeated the then chess world champion Garry Kasparov. But playing chess was all that Deep Blue could do. Ask it to play another game, even a simpler one, such as checkers, and Deep Blue would not even know how to play at beginner level. The same is also true of many other programs that can beat humans. Computers that can play poker cannot play bridge.
This type of tailored software development is also apparent in systems that we rely on every day. A system that produces nurse rosters may not be able to cope with producing shift patterns for a factory, even though they are both personnel scheduling systems. Programs that plan delivery routes of an online supermarket cannot usually be used to schedule appointments for servicing home appliances, even though they are both examples of a Vehicle Routing Problem.
In recent years there has been a growing interest in a field called hyper-heuristics, which aims to develop more general computer systems. The idea is to build systems that are not tailored for just one type of problem, but which can be reused for a wide range of problems.
The figure below shows a typical hyper-heuristic framework. Let’s assume that this framework is being used to tackle a nurse rostering problem, where we have to assign nurses to work a certain number of shifts over a certain time period, say a week.
If we start with a possible shift pattern (perhaps from the previous week), we can do certain things to improve it. For example, we could move a nurse from one shift to another, we could swap two nurses or we could remove all nurses from a certain shift (say the Wednesday evening shift) and replace them with nurses that do not meet their contractual arrangements, just to give a few examples. These changes to the shift pattern are usually called heuristics.
The important thing is that we have a number of these low-level heuristics that we can use to improve the current roster. All these heuristics are placed in the bottom of the framework. We now choose one of these heuristics and execute it (for instance, swap one nurse with another). We repeat the process of choosing and executing a heuristic over and over again, in the hope that we will gradually get a better roster. The quality of the roster is measured by the evaluation function, which checks the outcome.
The key to this approach is to decide in which order to execute the low-level heuristics. This is where the top part of the framework comes into play. The hyper-heuristic looks at the state of the system and decides which heuristic to execute. This is repeated until we decide to stop (maybe after a certain period of time, or after we have executed the low-level heuristics a certain number of times).
What makes a hyper-heuristic different, from other heuristic-selecting algorithms, is the “domain barrier”. This stops the higher level hyper-heuristic knowing anything about the problem it is trying to solve. The hyper-heuristic only has access to data that is common to any problem. This includes how long each low-level heuristic took to execute, the track record of each low-level heuristic (how well it has performed), how pairs of low-level heuristics work with each other, to give just a few examples.
The benefit of the domain barrier is that we can replace the low-level heuristics, and the evaluation function, with another type of problem. As the hyper-heuristic has no knowledge of the problem being tackled we would hope that we can use the same higher level algorithm to tackle this new problem. And, indeed, this has been shown to be the case in a large number of scientific problems.
The challenge in hyper-heuristics lies in developing a robust high-level strategy that is able to adapt to as many different problems as possible. We are still some way off having a hyper-heuristic that is able to produce nurse rosters, plan deliveries and play poker, but, given the pace of progress in this field, we hope to achieve this goal in the not-too-distant future.