Recordkeeping

The journey from a records management system to a digital preservation system

“People have had a lot of trouble getting stuff out of RecordPoint.”

This sentence was a little worrying to hear. It was 2015, and our archive was contemplating digital preservation for the first time. We didn’t really know what it was, or how it worked. Neither did anyone else: the idea of having a “digital preservation system” was met with blank stares around the office. “Is it like a database? Why not use one of our CMSs instead? Why do we need this?”

And so it was that I realised I was in over my head and needed outside help. I looked up state records offices to find out what they were doing, and realised there is such a thing as the job title “Digital Preservation Officer”. I contacted one of these “Digital Preservation Officers” to get on the right path.

The Digital Preservation Officer’s knowledge in that early conversation was invaluable, and helped us get over those first hurdles. She explained the basics: why digital preservation is important for an archive, how to get started, what the jargon means, and how to convince non-archivists that yes, it is necessary. And – the importance of figuring out what you want to preserve.

“We will need to preserve digital donations,” I listed, “and digitizations of our physical inventory. Plus, I manage our digital records management system, RecordPoint – if we’re serious about our permanent records we will need to preserve those as well.” (The international digital records management system standard, ISO 16175 Part 2, says that “long-term preservation of digital records… should be addressed separately within a dedicated framework for digital preservation or ‘digital archiving’”.)

It was at this point that the Digital Preservation Officer replied with the quote that began this article.

I don’t think she was quite right – getting digital objects and metadata out of RecordPoint was quite easy. The challenge, it turned out, would be getting the exported digital objects into our digital preservation system, Archivematica.

In the image shown below, the folders on the left represent the top level of a RecordPoint export of two digital objects. The folders on the right are what Archivematica expects in a transfer package.

In the example above, there are three folders for ‘binaries’ (digital objects) and two folders for ‘records’ (metadata). Immediately something doesn’t make sense – why are there three binary folders for two objects?

The reason is that the export includes not only the final version of the digital object but also all previous drafts. In my example there is only a single draft, but if a digital object had 100 drafts, they would all be included here. This is great for compliance, but not so great for digital preservation where careful appraisal is necessary. The priority when doing an ‘extract, transform, load’ (ETL) from RecordPoint to Archivematica would be to ensure that the final version of each binary made it across to the ‘objects’ folder on the right.
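Picking the final version out of a set of drafts can be sketched in a few lines. This is a minimal Python illustration rather than RPEAT's own R code. The `Binary` tuple, the record IDs, the paths, and the idea that every binary carries an integer version number are all assumptions made for the example, since the real layout of the export is not described here.

```python
from typing import NamedTuple


class Binary(NamedTuple):
    """One exported file. All three fields are hypothetical."""
    record_id: str
    version: int
    path: str


def final_versions(binaries: list[Binary]) -> dict[str, Binary]:
    """Keep only the highest-numbered version of each record."""
    latest: dict[str, Binary] = {}
    for item in binaries:
        current = latest.get(item.record_id)
        if current is None or item.version > current.version:
            latest[item.record_id] = item
    return latest


# Two records, one of which has an earlier draft: three binaries in total.
export = [
    Binary("REC-1", 1, "binaries/a/report_draft.docx"),
    Binary("REC-1", 2, "binaries/b/report.docx"),
    Binary("REC-2", 1, "binaries/c/minutes.pdf"),
]

finals = final_versions(export)
for record_id, item in sorted(finals.items()):
    print(record_id, item.path)
```

Only the entries in `finals` would be carried across to the ‘objects’ folder; the drafts stay behind in the records management system.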

An Archivematica transfer package should not consist only of the digital objects themselves, of course – you are not truly preserving digital objects unless you also preserve their descriptive metadata. This is why the ‘metadata’ folder on the right exists: you can optionally create a single CSV file, ‘metadata.csv’, which holds the metadata for every digital object in the submission, one object per line. Archivematica uses this CSV file as part of its metadata preservation process.
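As a rough illustration of that layout, the Python sketch below builds a two-row metadata.csv. The `filename` column and the Dublin Core style headers such as `dc.title` follow the format described in Archivematica's metadata import documentation; the file names and values are invented, and the details are worth checking against the documentation for your own Archivematica version.

```python
import csv
import io

# Hypothetical rows: one per digital object in the transfer.
rows = [
    {"filename": "objects/report.docx", "dc.title": "Annual report", "dc.date": "2014-06-30"},
    {"filename": "objects/minutes.pdf", "dc.title": "Board minutes", "dc.date": "2015-02-11"},
]


def build_metadata_csv(rows):
    """Return the text of a metadata.csv: a header line, then one line per object."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["filename", "dc.title", "dc.date"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


metadata_csv = build_metadata_csv(rows)
print(metadata_csv)
```

In a real transfer the text would be saved as ‘metadata.csv’ inside the ‘metadata’ folder rather than printed.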

In contrast, RecordPoint creates a separate metadata XML file for every digital object it exports. If you wanted to pull metadata across into the metadata CSV file for the Archivematica submission, you would need to go through every single metadata XML file in the export and copy and paste each individual metadata element. Based on a test, sorting the final record from the drafts and preparing its metadata for Archivematica might take two to four minutes per record. Assuming we have 70,000 records requiring preservation, transforming them all manually would take roughly 2,300 to 4,700 hours. Although technically possible, this is too much work to be achievable, and the tedious, detail-oriented nature of the work would make errors highly likely.

Fortunately, I knew the R programming language. R is used by statisticians to solve data transformation problems – and this was a data transformation problem! I created an application using a tool called R Shiny, providing a graphical interface that sits on the Archivematica server. I creatively called it RecordPoint Export to Archivematica Transfer (RPEAT). After running a RecordPoint export, you select the export to be transformed from a drop-down list in RPEAT and select the metadata to be included from a checklist. RPEAT then copies the final version of each digital object from the export into an ‘objects’ folder and trawls through each XML file to extract the required metadata. Finally, RPEAT creates a CSV file that contains all of the required metadata, and moves it into the ‘metadata’ folder. Everything is then ready for transfer into Archivematica.
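The copy-and-extract steps can be sketched as follows. This is a Python illustration of the general approach, not RPEAT itself, which was written in R. The export layout (`records/*.xml` alongside `binaries/`), the XML element names, and the `FinalBinary` element that points at the final version are all hypothetical stand-ins for whatever the real export contains.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_fields(xml_path: Path, wanted: list[str]) -> dict[str, str]:
    """Read one metadata XML file and return only the selected elements."""
    root = ET.parse(xml_path).getroot()
    return {name: (root.findtext(name) or "") for name in wanted}


def build_transfer(export_dir: Path, transfer_dir: Path, wanted: list[str]) -> list[dict[str, str]]:
    """Copy each record's final binary into objects/ and collect its metadata."""
    objects_dir = transfer_dir / "objects"
    objects_dir.mkdir(parents=True, exist_ok=True)
    (transfer_dir / "metadata").mkdir(parents=True, exist_ok=True)
    rows = []
    for xml_path in sorted((export_dir / "records").glob("*.xml")):
        # "FinalBinary" is an assumed element naming the final version's file.
        fields = extract_fields(xml_path, wanted + ["FinalBinary"])
        source = export_dir / "binaries" / fields.pop("FinalBinary")
        shutil.copy2(source, objects_dir / source.name)
        rows.append({"filename": f"objects/{source.name}", **fields})
    return rows


# Build a tiny fake export in a temporary folder and transform it.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp) / "export"
    (export_dir / "records").mkdir(parents=True)
    (export_dir / "binaries").mkdir()
    (export_dir / "binaries" / "report.docx").write_bytes(b"final version")
    (export_dir / "records" / "rec1.xml").write_text(
        "<Record><Title>Annual report</Title><Author>J. Smith</Author>"
        "<FinalBinary>report.docx</FinalBinary></Record>",
        encoding="utf-8",
    )
    rows = build_transfer(export_dir, Path(tmp) / "transfer", ["Title"])
    copied = (Path(tmp) / "transfer" / "objects" / "report.docx").exists()

print(rows, copied)
```

Here the `wanted` list plays the part of the checklist: only the ticked elements (just `Title` in this run) end up in the rows destined for the metadata CSV file.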

Pushing 212 records exported from RecordPoint through RPEAT, selecting the correct metadata from the checklist, and doing some quick human quality assurance took 7 minutes. Scaled up, transforming all 70,000 records this way would take fewer than 39 hours. Set against the two to four minutes per record of the manual process, RPEAT reduces the time taken to prepare records for Archivematica by roughly 98 to 99%.
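The scaling behind these figures is simple arithmetic, shown here as a short Python sketch. It assumes that processing time grows linearly with the number of records, which a single 212-record test cannot confirm, and it compares per-record times using the two-to-four-minute manual estimate given earlier.

```python
TEST_RECORDS = 212        # records in the RPEAT test run
TEST_MINUTES = 7          # minutes the test run took, including quality assurance
TOTAL_RECORDS = 70_000    # records assumed to need preservation

rpeat_seconds_per_record = TEST_MINUTES * 60 / TEST_RECORDS
rpeat_total_hours = TOTAL_RECORDS * rpeat_seconds_per_record / 3600

# Manual estimate from the earlier test: two to four minutes per record.
manual_seconds_per_record = (2 * 60, 4 * 60)
speedup = tuple(m / rpeat_seconds_per_record for m in manual_seconds_per_record)

print(f"RPEAT: {rpeat_seconds_per_record:.1f} s per record, "
      f"{rpeat_total_hours:.1f} hours for all records")
print(f"Per record, RPEAT is roughly {speedup[0]:.0f} to {speedup[1]:.0f} times faster")
```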

The advice that the Digital Preservation Officer provided all those years ago was invaluable, and I think in particular the warning on “getting stuff out of RecordPoint” was pertinent – but I wish to expand on her point. The challenge is not unique to RecordPoint – the challenge is ETL in general. At a meeting of Australia and New Zealand’s digital preservation community of practice, Australasia Preserves, in early 2019, other archivists shared their struggles to do ETL from records management systems into their digital archives. The ability to do ETL is an important addition to the growing suite of technical skills valuable to us digital preservation practitioners.

References

International Organization for Standardization. (2011). Information and documentation — Principles and functional requirements for records in electronic office environments — Part 2: Guidelines and functional requirements for digital records management systems (ISO 16175-2). Retrieved from https://www.saiglobal.com/.

Header image

Artem Sapegin on Unsplash

Recordkeeping and museum professionals – the same but different? A retrospective musing on the Archives and Records Association annual conference 2017

The author of this article co-planned and participated in a panel presentation and debate on this topic in August 2017. Now, in spring 2019, it seems an excellent opportunity to look back at the professional climate as it was 18 months ago and at how professional activities in this area have progressed since then.

‘Everybody is a Heritage Professional Nowadays: Should Archivist and Curator Remain as Separate Professions?’ – this was the title of the panel session which took place at the ARA’s annual conference in London, chaired by Adrian Steel (then Director of the Postal Museum), with Charlotte Berry (then Hereford Cathedral Archivist) and Iain Watson (Director of Tyne and Wear Archives & Museums) as co-panellists.

Each of the three panellists presented their response to the question of whether archivists and curators should remain separate professions or not. Suggested topics included whether:

  • each profession had skills that were unique
  • the job titles of archivist and curator empower or stifle professionals
  • it is just the professionals who retain this distinction, whereas the public see archivists and curators as much the same thing
  • two very similar and overlapping professional roles are necessary in times of increasing economic pressures
  • there is a need now for the ‘super’ heritage professional who can do both roles
  • archivists and curators are managers of resources or producers/editors of content

The viewpoints of the three panellists were diverse and wide-ranging, reflecting their own varied individual professional experiences – two as qualified archivists who now work widely with object collections and museum professional colleagues, and the third as a widely experienced museum professional who now manages a joint service employing museum and archive professionals in tandem.

Charlotte’s slot focused on the very many areas of professional overlap – the importance of collection expertise, understanding provenance and interconnectivity, and cross-sectoral standards of best practice and excellence. Technology and digitisation offer increasing opportunities for public access but can also potentially erode the unique skillset of the archivist – where thorough training in and understanding of legal history, palaeography/diplomatic, administrative history, original order and record types come under pressure as budgets buckle and services shrink. There are also fundamental differences between the two sectors, partly reflecting differences in the development of museums and archives, governance at a national level following the split of MLA and also in different routes for education, qualification and entry to the sectors. Charlotte feels strongly that ongoing workforce and professional development should celebrate the key differences and core skills within our two sectors, whilst encouraging archive professionals to learn from their museum colleagues in areas such as sustainability and resilience, engagement and advocacy. “The same but different” is a useful catchphrase embracing the numerous synergies, encouraging expertise within each profession and recognising the overly generic and homogenous nature of the ‘heritage professional’. With a foot placed firmly in each of textual and material culture, archives are well placed to bridge the gaps between the two and to act as a conduit between the museum and library sectors.

Adrian’s viewpoint developed from a wealth of experience working in a trailblazing joint heritage organisation navigating complex governance and legal requirements, where curators and archivists use one Collections Management system which enables one single public access catalogue – a huge benefit to both staff and the public alike. Definitions of the material being cared for can create both synergies and problems – paper material increasingly appears in both archive and museum collections, but is catalogued differently according to existing best practice – for example, a greetings card would be catalogued by colour, dimension, weight etc by curators, but only by recipient/sender by archivists. Handling the original collections is another area of different professional practice – although handling collections enable some museum object duplicates to be handled by the public, most items are accessed via exhibitions or viewing digital surrogates online and it remains the curators who can handle the originals. Conversely, archivists will typically encourage readers to come to the archive and do their own research, or to use digital surrogates and do their research from the comfort of their own desk at home. Professional approaches also differ within interpretation – Adrian suggested that archivists are trained to be more neutral and detached from the narratives held in their collections, and opt to leave the user to take what they will from their archival research and to put it into a wider historical or social context. Curators often have a stronger sense of duty to interpret on behalf of an object, to engage with wider campaigns which increase the social impact of the sector’s work and to engage in museum activism. Although often co-existing happily in mutual contradiction, these distinct aspects of the two professions should not be ironed out but should be facilitated and embraced through increasing collaboration and cross-sectoral working.

Iain explored how using physical definitions to create professional distinctions between curators (objects), libraries (published material) and archives (documentary materials) can be problematic in practice. The dividing lines between all three sectors are becoming increasingly blurred and indistinct at institutional levels, but the fact remains that archivists’ and curators’ shared responsibility is to make evidence available – “bad archivists write hiding aids”, not finding aids. Iain advocated strongly for introducing broad generic roles in a professional context where specialist skills, knowledge and experience can co-exist and be valued. Three roles would cover the principal functions – information/knowledge manager (holding knowledge about what the item is, what it contains and its significance), the conservator (responsible for physical care and preservation of the item) and the interpreter/learning officer/producer (interpreting and engaging with the item). He urged the archive sector to embrace a more proactive user-based and user-generated approach, where the recordkeeping professionals renounce their expert role and hand some of their power to the public. Archivist and curator are inherently inward-looking terms – instead, the key concern for us all is the user and how to find new ways and means of creative engagement within museums and archives.

Since summer 2017, archivists have continued to develop professional and academic interests in managing museum collections. The spring 2018 issue of the ARA’s journal Archives and Records was heavily oversubscribed and featured a wide range of international articles looking at sharing best practice and theory across archives and museums. Sessions on museums continue to appear at the 2018 and 2019 ARA annual conferences. A special issue of the ARA’s monthly membership magazine Arc celebrated object-centred engagement and projects in January 2019, and in the magazine, a call went out to assess membership need for training in managing object collections and to set up a new ARA Section. The first training day will take place in May 2019 and Charlotte is currently setting up a new Section for Archives and Museums for the Archives and Records Association (UK and Ireland).

Please contact Charlotte for further information if you’d like to find out more on what is proving to be an area of developing professional interest: archives@magd.ox.ac.uk.

Copyright

Banner image: MC: MP/1/24 Map of Romney Estates, Kent, 1614. With kind permission of the President and Fellows of Magdalen College, Oxford. ©Magdalen College Oxford