AI Ethics & Archives
This week I have been watching the recording of the UK National Archives' Annual Digital Lecture, titled "Turning over a new leaf: AI ethics in/through the archive," with speakers Dr Eleanor Drage and Dr Kerry McInerney. It's a really interesting talk and I definitely encourage everyone to watch it. I would characterize the tone as AI-neutral: they're not negative about AI, but they do want to ask questions and talk about how to use it ethically.
The most interesting thing, I think, was the toolkit Dr Drage developed for complying with the EU's regulations on high-risk AI systems, called HEaT (High Risk EU AI Toolkit). I couldn't find it on the web myself, but Dr Drage explained some aspects of it, and the very first thing the toolkit asks is "Is AI the right tool for the desired outcome?" I think that's really important, especially in higher ed but in lots of other domains too. Higher ed and other sectors have treated AI as a shiny new toy, and the powers that be have subsequently been throwing it at every problem and non-problem, regardless of whether it's really the best tool for the job. The Library of Congress's experiments with using AI to catalog records, which set the cataloging community in a bit of an uproar, are an example. Nobody seems to be asking "is AI really the best tool for this job?" Instead, everyone is so caught up in the shiny new thing that they're throwing it at everything, without any real consideration for the consequences and ethics of the situation.

And I really think there needs to be more conversation around what the proper uses of AI actually ARE. Many of them we've already been using for ages, such as analyzing data or natural language processing in search engines. Generative AI's arrival on the scene has excited non-tech folks in a way that previous uses of AI haven't. But that's no excuse for not considering whether a particular use case is really the best use of resources. And make no mistake, generative AI uses A LOT of resources, from environmentally unsustainable amounts of water and electricity to massive amounts of low-paid labor for training the models (a wrinkle I only just learned about myself).
So anyway, how does this relate to archives? I think it's an important conversation to have, as archives, particularly digital archives, are going to come under pressure to use AI in a host of ways, from cataloging to interpretation. But is AI really better than a human at these things? I would argue strongly that it is not. Really, it just puts a middle-man between the researcher or archivist and the data, because many humans will still be involved in training the AI to do these jobs in the first place, and humans will still be required to quality-check the output, especially while AI is so notoriously error-prone. We can't even get AI to stop giving us false information based on algorithmic predictive text yet; how can we possibly expect it to handle the complex higher-order thinking needed to accurately and ethically categorize disparate objects, or to interpret them? Humans don't even agree universally on lots of archival interpretations, and adding a machine to the mix isn't really going to make that better. In the talk, they also discussed how AI throws out the unexpected when looking at archives, while a human would look at the unexpected and derive meaning from it. We cannot wholesale replace archivists and researchers with machine learning, and I very much fear that is the next "great innovation" coming down the pipeline.