[Archivesspace_Users_Group] Component unique identifiers

Rees, John (NIH/NLM) [E] reesj at mail.nlm.nih.gov
Tue Nov 6 10:59:28 EST 2018

Wow, this is super helpful.

We’ve been noodling on a similar use case. All our existing repository projects leverage our MARC 035 field ids (Voyager ILS-supplied) to mint ids/Fedora PIDs, but now we’re embarking on ASpace projects that don’t always have a Voyager record, or have ID minting practices from other external systems that we can’t replicate in ASpace - or maybe don’t want to.

We’re still struggling with what the ID should actually be – we’re wary of using internally-generated IDs.


John P. Rees
Archivist and Digital Resources Manager
History of Medicine Division
National Library of Medicine

From: Rachel Aileen Searcy <rachel.searcy at nyu.edu>
Sent: Tuesday, November 06, 2018 9:38 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Component unique identifiers

Hi Adrienne,

We had a similar issue here at NYU. Previous digitization projects relied on the shorter Archivists' Toolkit refids for file naming, but this became untenable with the longer refids created by ArchivesSpace. We didn't want to change our inter-departmental workflows too radically, so we contracted with HM to develop a plugin called the Digitization Work Order plugin (here on GitHub<https://github.com/hudmol/digitization_work_order>). This plugin allows the user to select individual components from a resource record (including all if desired), which are then assigned sequential component unique identifiers that can be used for file naming or other purposes. The plugin also produces a CSV of descriptive information about those components, and automatically inserts the newly created identifier into each component's Component Unique Identifier field. We have some demo slides here<https://guides.nyu.edu/ld.php?content_id=26740399>, as well as instructions<https://docs.google.com/document/d/11kWxbFTazB6q5fDNBWDHJxMf3wdVsp8cd7HzjEhE-ao/edit> in our local ArchivesSpace manual. I'd also be happy to talk further to answer any questions.
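
For readers who want a feel for the general idea of sequential identifiers plus a CSV work order, here is a minimal sketch. It is not the plugin's code and makes no claim about the plugin's actual identifier format: the function names, the `prefix` argument, the zero-padding width, and the CSV columns are all assumptions made for illustration.

```python
import csv
import io

def assign_sequential_ids(components, prefix, start=1, width=4):
    """Return copies of the selected components with a sequential 'component_id'.

    `prefix` and `width` are hypothetical choices, not the plugin's format.
    """
    return [
        {**component, "component_id": f"{prefix}_{number:0{width}d}"}
        for number, component in enumerate(components, start=start)
    ]

def work_order_csv(rows):
    """Serialize the rows as CSV text with an assumed set of columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["component_id", "title", "uri"],
        extrasaction="ignore",  # skip any keys outside the assumed columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

selected = [
    {"title": "Letter", "uri": "/repositories/2/archival_objects/10"},
    {"title": "Photo", "uri": "/repositories/2/archival_objects/11"},
]
rows = assign_sequential_ids(selected, "MSS001")
print(work_order_csv(rows))
```

Identifiers of this kind carry no information about where a component sits in the hierarchy, so they are unaffected when components are rearranged.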


On Mon, Nov 5, 2018 at 2:54 PM Chris Mayo <mayoc at bc.edu> wrote:
Hi Adrienne,

We ran into a similar issue at Boston College when we migrated to ASpace from Toolkit. Our practice had been to combine the collection ID with an auto-generated refID to create component unique identifiers, but the auto-generated refIDs in ASpace were much too long for our needs.

What we eventually wound up doing is using the database primary key for a given archival object as the unique part of its component unique ID, so that any archival object we're planning to digitize gets a CUI with the format 'mmsID_NNNNN', where the numerical portion is pulled from the 'archival_object_NNNNN' at the end of the archival object's URL. The really handy part of this is that it lets us parse our CUIs to make API calls. It's also robust to rearrangement if you are only moving the archival object around within the collection hierarchy: the database key remains the same. It doesn't survive reprocessing, however, if you are deleting/rebuilding/combining archival objects, so we always make sure to begin the process of digitization after a collection has been processed or reprocessed. It makes the CUIs somewhat semantically meaningful, but only if you know what you are looking at. We're still not sure how we feel about that, but it's what works for us for now.
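
As a rough illustration of the scheme described above (a sketch, not Boston College's actual code), the two directions could look like this in Python. The function names, the example host, and the repository id are assumptions; the CUI itself does not encode the repository, so it has to be supplied separately. The path follows the ArchivesSpace API's `/repositories/:repo_id/archival_objects/:id` route.

```python
import re

# Hypothetical helpers for illustration only.
_TREE_ID = re.compile(r"archival_object_(\d+)$")
_CUI = re.compile(r"(.+)_(\d+)")

def cui_from_url(mms_id: str, url: str) -> str:
    """Build 'mmsID_NNNNN' from a URL ending in 'archival_object_NNNNN'."""
    match = _TREE_ID.search(url)
    if match is None:
        raise ValueError(f"no archival object id at end of URL: {url!r}")
    return f"{mms_id}_{match.group(1)}"

def api_path_from_cui(cui: str, repo_id: int) -> str:
    """Turn a CUI back into an API path for the archival object.

    `repo_id` is an assumption: it must come from somewhere other than the CUI.
    """
    match = _CUI.fullmatch(cui)
    if match is None:
        raise ValueError(f"not a CUI of the form mmsID_NNNNN: {cui!r}")
    return f"/repositories/{repo_id}/archival_objects/{int(match.group(2))}"

cui = cui_from_url(
    "99123",
    "https://aspace.example.edu/resources/12#tree::archival_object_45678",
)
print(cui)
print(api_path_from_cui(cui, repo_id=2))
```

Splitting on the last underscore keeps the parse unambiguous even if the collection-level prefix itself contains underscores.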

Hope that helps!

On Mon, Nov 5, 2018 at 11:00 AM Pruitt, Adrienne <Adrienne.Pruitt at tufts.edu> wrote:
Hello, all,

We’re hoping to move away from semantically meaningful component unique identifiers, but need some way to easily auto-generate a unique identifier that could be used for file-naming purposes in digitization projects. Working with legacy data, we have seen that there can be value in being able to easily associate a binary file floating around on a server somewhere with a relatively easily parsed identifier that links it to its related metadata. However, semantically meaningful identifiers based on collection structure are a rather brittle system, prone to breaking when collections are rearranged or reprocessed and easy to mis-enter when working with so many digits. We’re interested to hear how others are handling their identifiers (particularly with regard to digitization workflows).

Thank you!

Adrienne Pruitt | Collections Management Archivist
Digital Collections and Archives
Tufts University
adrienne.pruitt at tufts.edu | 617-627-0957
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org

Chris Mayo
Digital Production Librarian
Thomas P. O'Neill, Jr. Library
Boston College
chris.mayo at bc.edu

Rachel Searcy
Accessioning Archivist, Archival Collections Management
New York University Libraries
212.998.2539 | rachel.searcy at nyu.edu