AI's ability to make - or assist with - important decisions is fraught: on the one hand, AI can *often* classify things very well, at a speed and scale that outstrip the ability of any reasonably resourced group of humans. On the other, AI is sometimes *very* wrong, in ways that can be terribly harmful.
-
If you'd like an essay-formatted version of this thread to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2024/10/30/a-neck-in-a-noose/#is-also-a-human-in-the-loop
1/
Locksmith
in reply to Dave Neary • • •Someone will suggest putting up a court of computers and jury composed of computers to judge computers.
MrC
in reply to Dave Neary • • •@dneary
"Machines can do the work, so that people have time to think."
"Machines should do the work - that's what they're best at. People should do the thinking - that's what they're best at."
I first heard this as a sample on a mashup album, but I tracked it down to this short film Henson made for IBM in 1968:
https://www.youtube.com/watch?v=_IZw2CoYztk
powersoffour
in reply to MrC • • •@ScotttSee @dneary that mashup album just had its 20th anniversary!
https://kleptones.bandcamp.com/album/a-night-at-the-hip-hopera-20th-anniversary-remaster
A Night At The Hip-Hopera (20th Anniversary Remaster), by The Kleptones
Captain Superfluous
in reply to n8chz • • •@n8chz
Until Doctorow comes up with something better, maybe I'll just rearrange the words from "human-in-the-loop" to "inhuman-loop".
@pluralistic
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/2
Bureaucracies and the AI pitchmen who hope to sell them algorithms are very excited about the cost-savings they could realize if algorithms could be turned loose on thorny, labor-intensive processes. Some of these are relatively low-stakes and make for an easy call: @brewsterkahle recently told me about the @internetarchive's project to scan a ton of journals on microfiche they bought as a library discard.
2/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/3
It's pretty easy to have a high-res scanner auto-detect the positions of each page on the fiche and to run the text through OCR, but a human would still need to go through all those pages, marking the first and last page of each journal and identifying the table of contents and indexing it to the scanned pages.
3/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/4
This is something AI apparently does *very* well, and instead of scrolling through endless pages, the Archive's human operator now just checks whether the first/last/index pages the AI identified are the right ones. A project that could have taken years is being tackled with unprecedented swiftness.
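Here's a minimal sketch of that division of labor - hypothetical names throughout, not the Archive's actual tooling: the model proposes each issue's boundary pages, and the human in the loop only confirms or corrects the proposals instead of paging through the whole fiche.
```python
# Minimal sketch of the review loop described above (hypothetical names, not
# the Internet Archive's actual tooling): the model proposes first/last/
# table-of-contents pages for each issue, and a human only confirms or
# corrects the proposals instead of scrolling every page.

from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Proposal:
    issue_id: str
    first_page: int   # model's guess at the issue's first page on the fiche
    last_page: int    # model's guess at the issue's last page
    toc_page: int     # model's guess at the table-of-contents page


def review_proposals(
    proposals: List[Proposal],
    confirm: Callable[[Proposal], Optional[Proposal]],
) -> List[Proposal]:
    """Run each model proposal past a human reviewer.

    `confirm` returns None to accept a proposal as-is, or a corrected
    Proposal to override the model.
    """
    accepted = []
    for p in proposals:
        correction = confirm(p)
        accepted.append(correction if correction is not None else p)
    return accepted


if __name__ == "__main__":
    proposals = [
        Proposal("journal-001", first_page=1, last_page=48, toc_page=3),
        Proposal("journal-002", first_page=49, last_page=96, toc_page=51),
    ]

    # A reviewer who accepts the first proposal and corrects the second.
    def reviewer(p: Proposal) -> Optional[Proposal]:
        if p.issue_id == "journal-002":
            return Proposal(p.issue_id, first_page=50, last_page=96, toc_page=51)
        return None

    for accepted in review_proposals(proposals, reviewer):
        print(accepted)
```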
4/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/5
The operator checking those fiche indices is something AI people like to call a "human in the loop" - a human operator who assesses each judgment made by the AI and overrides it should the AI have made a mistake. "Humans in the loop" present a tantalizing solution to algorithmic misfires, bias, and unexpected errors, and so "we'll put a human in the loop" is the cure-all response to any objection to putting an imperfect AI in charge of a high-stakes application.
5/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/6
But it's not just AIs that are imperfect. Humans are *wildly* imperfect, and one thing they turn out to be *very* bad at is supervising AIs. In a 2022 paper for *Computer Law & Security Review*, the mathematician and public policy expert Ben Green investigates the empirical limits on human oversight of algorithms:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3921216
6/
The Flaws of Policies Requiring Human Oversight of Government Algorithms
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/7
Green situates public sector algorithms as the latest salvo in an age-old battle in public enforcement. Bureaucracies have two conflicting, irreconcilable imperatives: on the one hand, they want to be fair, and treat everyone the same. On the other hand, they want to exercise discretion, and take account of individual circumstances when administering justice. There's no way to do both of these things at the same time, obviously.
7/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/8
But algorithmic decision tools, overseen by humans, seem to hold out the possibility of doing the impossible and having both objective fairness *and* subjective discretion. Because it is grounded in computable mathematics, an algorithm is said to be "objective": given two equivalent reports of a parent who may be neglectful, the algorithm will make the same recommendation as to whether to take their children away.
8/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/9
But because those recommendations are then reviewed by a human in the loop, there's a chance to take account of special circumstances that the algorithm missed. Finally, a cake that can be both had, *and* eaten!
For the paper, Green reviewed a long list of policies - local, national, and supra-national - for putting humans in the loop and found several common ways of mandating human oversight of AI.
9/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/10
First, policies specify that algorithms *must* have human oversight. Many jurisdictions set out long lists of decisions that *must* be reviewed by human beings, banning "fire and forget" systems that chug along in the background, blithely making consequential decisions without anyone ever reviewing them.
10/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/11
Second, policies specify that humans can exercise *discretion* when they override the AI. They aren't just there to catch instances in which the AI misinterprets a rule, but rather to apply human judgment to the rules' applications.
Next, policies require human oversight to be "meaningful" - to be more than a rubber stamp. For high-stakes decisions, a human has to do a thorough review of the AI's inputs and output before greenlighting it.
11/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/12
Finally, policies specify humans *can* override the AI. This is key: we've all encountered instances in which "computer says no" and the hapless person operating the computer just shrugs their shoulders apologetically. Nothing I can do, sorry!
All of this *sounds* good, but unfortunately, it doesn't work. How humans in the loop *actually* behave has been thoroughly studied; the findings have been published in reputable, peer-reviewed journals and replicated by other researchers.
12/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/13
The measures for using humans to prevent algorithmic harms represent theories, and those theories are testable, and they have been tested, and they are wrong.
For example, people (including experts) are highly susceptible to "automation bias." They defer to automated systems, even when those systems produce outputs that conflict with their own expert experience and knowledge.
13/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/14
A study of London cops found that they "overwhelmingly overestimated the credibility" of facial recognition and assessed its accuracy at 300% better than its actual performance.
Experts who are put in charge of overseeing an automated system get out of practice, because they no longer engage in the routine steps that lead up to the conclusion.
14/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/15
Presented with conclusions rather than problems to solve, experts lose their facility and familiarity with how all the factors that must be weighed to produce a conclusion fit together. Far from being the easiest step in reaching a decision, reviewing that final step without doing the underlying work can be *much harder* to do reliably.
15/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/16
Worse: when algorithms are made "transparent" by presenting their chain of reasoning to expert reviewers, those reviewers become *more* deferential to the algorithm's conclusion, not less - after all, now the expert has to review not just one final conclusion, but several sub-conclusions.
16/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/17
Even worse: when humans *do* exercise discretion to override an algorithm, it's often to inject the very bias that the algorithm is there to prevent. Sure, the algorithm might give the same recommendation about two similar parents who are facing having their children taken away, but the judge who reviews the recommendations is more likely to override it for a white parent than for a Black one.
17/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/18
Humans in the loop experience "a diminished sense of control, responsibility, and moral agency." That means that they feel less able to override an algorithm - and they feel less morally culpable when they sit by and let the algorithm do its thing.
All of these effects are persistent even when people know about them, are trained to avoid them, and are given explicit instructions to do so.
18/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/19
Remember, the whole reason to introduce AI is human imperfection. Designing an AI to correct human imperfection - one that only works when its human overseer is perfect - produces predictably bad outcomes.
19/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/20
As Green writes, handing an AI high-stakes decisions and relying on humans in the loop to prevent harm produces a "perverse effect": "alleviating scrutiny of government algorithms without actually addressing the underlying concerns." A human in the loop creates "a false sense of security" so algorithms are deployed for high-stakes tasks, and it shifts responsibility for algorithmic failures to the human, creating what Dan Davies calls an "accountability sink":
https://profilebooks.com/work/the-unaccountability-machine/
20/
The Unaccountability Machine - Profile Books
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/21
The human in the loop is a false promise, a "salve that enables governments to obtain the benefits of algorithms without incurring the associated harms."
So why are we still talking about how AI is going to replace government and corporate bureaucracies, making decisions at machine speed, overseen by humans in the loop?
21/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/22
Well, what if the accountability sink is a feature and not a bug? What if governments, under enormous pressure to cut costs, figure out how to also cut corners, at the expense of people with very little social capital, and blame it all on human operators? The operators become, in the phrase of Madeleine Clare Elish, "moral crumple zones":
https://estsjournal.org/index.php/ests/article/view/260
22/
Moral Crumple Zones: Cautionary Tales in Human-Robot Interaction
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/23
As Green writes:
> The emphasis on human oversight as a protective mechanism allows governments and vendors to have it both ways: they can promote an algorithm by proclaiming how its capabilities exceed those of humans, while simultaneously defending the algorithm and those responsible for it from scrutiny by pointing to the security (supposedly) provided by human oversight.
23/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/24
I'll be in Tucson, AZ from November 8-10: I'm the Guest of Honor at the TusCon science fiction convention:
https://tusconscificon.com/
--
Tor Books has just published two new, free "Little Brother" stories: "Vigilant," about creepy surveillance in distance education:
https://reactormag.com/vigilant-cory-doctorow/
And "Spill," about oil pipelines and indigenous landback:
https://reactormag.com/spill-cory-doctorow/
24/
Cory Doctorow
in reply to Cory Doctorow • • •Content warning: Long thread/eof
Image:
Cryteria (modified)
https://commons.wikimedia.org/wiki/File:HAL9000.svg
CC BY 3.0
https://creativecommons.org/licenses/by/3.0/deed.en
eof/
Spooky McBoneyface
in reply to Cory Doctorow • • •As a #UX designer, instead of focusing on the specific LLM, I'm fascinated by *our* models of the situation. The TechBoi crowd continually forces nuanced human decisions into a simplistic Procrustean bed.
We've seen this throughout tech's history, from early language translation software to 'smart' hand soap dispensers that don't recognize Black hands. Tech's vision is nearly always myopic. We'll get there, but only after hundreds of bad assumptions.
David LaFontaine
in reply to Spooky McBoneyface • • •Content warning: Long thread/eof
@scottjenson Hm. I’m not sure that we will “get there” in time to avoid catastrophic harms. I’m seeing a pervasive quasi-religious faith in the infallibility of Big Data, in all its myriad forms.
But there is also an increase in the stubborn “Eppur si muove” from what I refer to as “Gutter Galileos”:
Those users who persist in pointing out how the product/experience is broken, no matter what the pretty squiggly line charts insist.
David LaFontaine
in reply to David LaFontaine • • •Content warning: Long thread/eof
@scottjenson For example, at the end of this long and more than a little outlandish rant about the failure of the Skull & Bones video game from Ubisoft (they sank 11 years and $200MM into that turkey) … at about 23:33, comes an unexpected insight into how and why massive failures like this (and soon, so very many more) occur
Precisely because of the “Automation Bias” that dominates decision-making
https://youtu.be/fYMVFvk2K4g
David LaFontaine
in reply to David LaFontaine • • •Content warning: Long thread/eof
@scottjenson but before that, it’s a long discourse on how user experience is so very valuable, and how focusing on all the other things besides whether or not a game is actually, you know, fun to play?
Contributed to this utter disaster
Cory Doctorow
in reply to Matthew Maybe • • •@matthewmaybe Did you follow any of the cited, replicated, peer-reviewed research that documents the problems with this approach, and the fact that experts consistently overrate their own ability to override an algorithmic judgment for both accuracy and fairness?
It's possible that you're the expert who is immune to this well-documented effect, but it seems likely that every experimental subject in the data believed the same about themselves - and was demonstrably incorrect in that belief.