At a time when the fragility and vulnerability of digital records are increasingly evident, maintaining the trustworthiness of public archives is more important than ever.
Video and sound recordings can be manipulated to put words into the mouths of people who never said them, photographs can be doctored, and content can be added to or removed from videos. More recently, AI technology has “written” news articles that can mimic any writer’s style. All of these media and many other “born-digital” formats will come to form the public record. If archives are to remain an essential resource for democracy, able to hold governments to account, the records they hold must be considered trustworthy.
But is this really a problem for archives?
Until recently, it has not been. People trust archives, especially public archives. We are seen as experts, preserving and providing access to our holdings freely and over a lengthy period (since 1838 in the case of The National Archives in the UK). We could rest on our laurels. But the challenges that digital technologies bring to our practice must lead us to question whether this institutional or inherited trust is enough when faced with the forces of fakery that have emerged in the 21st century.
In 2017, The National Archives of the UK, in partnership with the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey and Tim Berners-Lee’s non-profit Open Data Institute, started to research how a new technology could be harnessed to serve on the side of archives. The ARCHANGEL project is investigating how blockchain can provide a genuine guarantee of the authenticity of the digital records held in archives: a way of publicly demonstrating our trustworthiness by proving that those records are authentic and unchanged.
Often considered synonymous with Bitcoin, blockchain is the technology that underpins a number of digital currencies, but it has the potential for far wider application. At root, it is the digital equivalent of a ledger, like a database but with two features that set it apart from standard databases. Firstly, the blockchain is append-only, meaning that data cannot be overwritten, amended or deleted; data can only be added. Secondly, it is distributed. No central authority or organisation has sole possession of the data. Instead, a copy of the whole database is held by each member of the blockchain and they collaborate to validate each new block before it is written to the ledger. As a result, each participant has an equal status in the network: equal responsibility, equal rights and an equal stake.
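To make the append-only idea concrete, here is a minimal sketch in Python. It is not the ARCHANGEL implementation: the names Ledger, block_hash and is_valid, the use of SHA-256 and the layout of each block are all assumptions made for illustration. Each block stores the hash of the block before it, so changing an earlier entry breaks every later link.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Ledger:
    """A toy append-only ledger: it offers no way to edit or delete a block."""

    def __init__(self):
        self._blocks = []

    def append(self, payload):
        """Seal the payload into a new block linked to the current last block."""
        prev_hash = self._blocks[-1]["hash"] if self._blocks else GENESIS_HASH
        index = len(self._blocks)
        block = {
            "index": index,
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
        self._blocks.append(block)
        return block

    def blocks(self):
        """Return a copy of the chain, as each member of a network might hold."""
        return [dict(block) for block in self._blocks]


def is_valid(blocks):
    """Recompute every hash and check every link; any member can run this."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(blocks):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True
```

In this sketch, every member holding a copy of the chain can run is_valid for themselves. A copy in which an earlier payload has been edited fails the check, because the stored hash no longer matches the recomputed one.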
As with any new technology, there are issues to be researched and resolved. The most common criticism is that participants controlling a majority of the network (the so-called “51% attack”) could collude to change the data written on the blockchain. This is less likely in the case of ARCHANGEL because it is a permissioned blockchain. This means that every member has been invited and their identity is known, unlike the Bitcoin network, where many of the members are anonymous.
A more practical issue that arose early on was what information about the records could be shared on an immutable database available to the public, in order to prove that the records were unchanged from the point of receipt by the archives. Every public archive holds records that are closed due to their sensitive content. This sensitivity sometimes extends to their filenames or descriptions, so adding these metadata fields to the blockchain would not be appropriate. We settled on a selection of fields that included an archival reference and the checksum, an effectively unique alphanumeric string generated by a mathematical algorithm that changes completely if even one byte is altered in the file. In this way, a researcher can compare the checksum of the record they download against the checksum on the blockchain (written when the record was first received, potentially many years previously) and see for themselves that the checksums match. As archives sometimes convert formats in order to preserve or present records to the public, the project has also developed a way of generating a checksum based on the content of a video file rather than its bytes. This enables the user to check that the video has not been altered for unethical reasons while in the archive’s custody.
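The byte-level comparison a researcher would carry out can be sketched in a few lines of Python. The article does not say which checksum algorithm ARCHANGEL uses, so SHA-256 is an assumption here, and the function names and sample record are hypothetical. The sketch does not cover the content-based video checksum mentioned above.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_registered(downloaded: bytes, registered_checksum: str) -> bool:
    """Compare a downloaded record against the checksum registered on receipt."""
    return checksum(downloaded) == registered_checksum


# Hypothetical record, registered when the archive first received it.
original = b"Minutes of the meeting held on 1 March."
registered = checksum(original)

# The same record with a single character changed.
altered = b"Minutes of the meeting held on 2 March."

print(matches_registered(original, registered))  # True
print(matches_registered(altered, registered))   # False
```

Only the checksum and an archival reference need to be published for this check to work, which is why sensitive filenames and descriptions can stay off the ledger.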
So, the ARCHANGEL blockchain enables an archive to upload metadata that uniquely identifies specific records, have that data sealed into a “block” that cannot be altered or deleted without detection, and share a copy of the data with each of the other trusted members of the network for as long as the archives (some of the oldest organisations in the world) maintain it.
In the prototype testing, we found that the key to engaging other archives lies in emphasising the shared nature of the network. Only by collaborating with partners can the benefits of an archival blockchain be realised by any of us. It is blockchain’s distributed nature that underpins the trustworthiness of the system and enables it to be more reliable, more transparent and more secure, and therefore effective in providing a barrier against the onslaught of synthetic content.
At the same time, the effort of the organisations to make the prototype work demonstrates their trustworthiness: in wanting to share the responsibility for proving the authenticity of the records they hold, they demonstrate their expertise and honesty.
The arms race with the forces of fakery that archives find themselves in is the reason why The National Archives is thinking about trust. We do not want people to trust archives only because of their longevity and expertise. Instead, we want to demonstrate their trustworthiness. We want to provide what Baroness Onora O’Neill said was needed in the BBC Reith Lectures in 2002:
“In judging whether to place our trust in others’ words or undertakings, or to refuse that trust, we need information and we need the means to judge the information.” O’Neill, A Question of Trust
This is what we think blockchain gives us as a profession: by being part of a network of trusted organisations which assure the authenticity of each other’s records, we demonstrate the trustworthiness of all of our records.
Acknowledgements
The ARCHANGEL Project would like to acknowledge the funding received from the EPSRC Grant Ref EP/P03151X/1.
Copyright
Header image: ‘Crown copyright 2019 courtesy of The National Archives’
Further details:
The project website is here: https://www.archangel.ac.uk/
For a more detailed paper about the project see: https://arxiv.org/pdf/1804.08342.pdf