[Archivesspace_Users_Group] Archivists' Toolkit to ArchivesSpace Migration Workflow

Corey Schmidt Corey.Schmidt at uga.edu
Wed Dec 11 15:50:54 EST 2019


Thank you for the advice!

I have made copies of the databases, but I do not have direct access to the MySQL databases themselves, as they are hosted by a third party. If I can get access to them, I should be able to experiment with cleaning them up directly with SQL. I have some experience working with SQLite, so it shouldn’t be too hard to learn. I’ll look into using Navicat too, as it seems like a pretty handy tool.



From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Noah Huffman
Sent: Wednesday, December 11, 2019 12:13 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Archivists' Toolkit to ArchivesSpace Migration Workflow

Hi Corey,

I’d highly recommend doing as much data cleanup as possible directly in your AT database, particularly extent type values, container type values, etc. Of course, be sure to back up your database first ☺

If you’re comfortable writing/running SQL queries, you can connect a free database management tool like MySQL Workbench to your AT backend database to review/clean your data in bulk.
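
As a rough illustration of that kind of bulk review and cleanup, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in for the AT MySQL backend, and the table name `extents`, the column `extent_type`, and the variant-to-preferred mapping are all hypothetical rather than taken from the real AT schema, so check the actual tables before adapting it.

```python
import sqlite3

# Stand-in for the AT backend. A real AT database is MySQL, and the
# table/column names below are hypothetical, not the actual AT schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE extents (id INTEGER PRIMARY KEY, extent_type TEXT)")
conn.executemany(
    "INSERT INTO extents (extent_type) VALUES (?)",
    [("Linear feet",), ("linear ft.",), ("Linear Feet",), ("Boxes",), ("boxes",)],
)

def count_values(connection):
    """Return each distinct extent type with the number of rows using it."""
    query = (
        "SELECT extent_type, COUNT(*) FROM extents "
        "GROUP BY extent_type ORDER BY extent_type"
    )
    return connection.execute(query).fetchall()

# Hypothetical cleanup map: variant spelling -> preferred value.
PREFERRED = {
    "linear ft.": "Linear feet",
    "Linear Feet": "Linear feet",
    "boxes": "Boxes",
}

def apply_cleanup(connection, mapping):
    """Rewrite each variant to its preferred value; return rows changed."""
    changed = 0
    for variant, preferred in mapping.items():
        cursor = connection.execute(
            "UPDATE extents SET extent_type = ? WHERE extent_type = ?",
            (preferred, variant),
        )
        changed += cursor.rowcount
    connection.commit()
    return changed

before = count_values(conn)      # review step: what values exist?
rows_changed = apply_cleanup(conn, PREFERRED)
after = count_values(conn)       # confirm the variants are gone
print(before)
print(rows_changed, after)
```

The review query comes first on purpose: listing distinct values with counts shows which variants exist before anything is changed, and the same query afterwards confirms the result.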

Alternatively, there are other database management tools like Navicat that let you review and edit the backend database tables in a graphical spreadsheet-like editor (without having to write SQL). I used Navicat to clean up a lot of AT data prior to migrating to ASpace. If you can do all your cleanup in the 14-day trial period, then you don’t have to buy the software!

Best of luck,

Noah Huffman
Archivist for Metadata, Systems, and Digital Records
David M. Rubenstein Rare Book & Manuscript Library
Duke University | 919-660-5982

From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Corey Schmidt
Sent: Wednesday, December 11, 2019 10:20 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Archivists' Toolkit to ArchivesSpace Migration Workflow


Thank you so much for the wonderful info!

I am just beginning to dive into the data, but I will look out for linked cross references and the creation of new data in ArchivesSpace. We don’t have any EADs that exist outside of AT so far as I’m aware, so the process of exporting from AT, cleaning up, then reimporting might be unnecessary as you pointed out. If it’s possible to update the AT database directly, that would save so much time and effort. The articles you linked were very helpful and should give me a basis for how to approach this and other potential problems.

Thanks again!

From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Custer, Mark
Sent: Tuesday, December 10, 2019 6:33 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Archivists' Toolkit to ArchivesSpace Migration Workflow

First of all, welcome to the ArchivesSpace community!  Your project will definitely make you an ArchivesSpace expert in no time.

Glancing at your workflow document, it looks like you’ve got all of your bases covered.  I would strongly suggest running the AT to ASpace migration a couple of times (just to get familiar with it), and also getting a few other folks to help spot-check the results (the more eyes / institutional experience the better), before you go ahead and do the final migration.  It’s been a long while since I used that tool.  I do remember, though, that when we started testing it out, the tool couldn’t be used to migrate multiple AT databases into a single ASpace database.  That feature was added a long time back, though, so you *should* be good to go with migrating your two AT databases into a single instance of ASpace, which I would highly recommend.

Regarding the AT to ASpace migration tool, just off the top of my head, I’d also add:

  1.  if you have linked cross references in your current EAD files, you’ll either want to make sure to run the migration with the “-refid_original” parameter, or set aside some time after the migration to go back and update those linked cross references (during the migration, all of the IDs in the AT for the notes and components will be replaced with ASpace IDs, so if you have cross references embedded in your finding aids, those will be broken).  We had a lot of cross references in our source files in the AT, so we opted for that first option, as you can see here:  https://cpb-us-w2.wpmucdn.com/campuspress.yale.edu/dist/3/30/files/2015/06/Unknown-272cqj2.png  (that screenshot was taken by Maureen Callahan, and is described in a lot more detail, along with other invaluable advice, in this blog post: https://campuspress.yale.edu/yalearchivesspace/2015/06/14/migration-step-by-step/)
  2.  unless things have changed, be prepared for that migration tool to occasionally add data where no data existed before (even when it’s not required to do so).  E.g. for our lists in the AT that did not have a title, after the migration we wound up with lists in ASpace that had a title of “Missing Title”.  That was something we missed early on, but later on we were able to delete all of those values from ASpace.  Ditto for other unexpected things, like if in the AT someone selected a container type of “Box” but left the container value blank, the AT to ASpace migration would create an auto-generated container number so that it could keep that container type of “Box” around without dropping it during the migration.  We had quite a few instances in the AT where someone would add “Box X, Folder”. That behaved fine in the AT, but once we migrated, that would turn into something like “Box X, Folder someweirdnumber” (ideally it would’ve just come over as “Box X”).
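
One way to catch the “Box X, Folder” cases before migrating is to look for container types that have no accompanying value. The sketch below is only an illustration in Python: it assumes the instance records have already been exported into dictionaries, and the field names (`container1_type`, `container1_value`, and so on) are hypothetical placeholders rather than the actual AT column names.

```python
def find_blank_indicators(instances):
    """Return (id, level, type) for containers that have a type but no value."""
    flagged = []
    for record in instances:
        for level in (1, 2, 3):
            # Hypothetical field names; missing or None is treated as blank.
            ctype = (record.get(f"container{level}_type") or "").strip()
            value = (record.get(f"container{level}_value") or "").strip()
            if ctype and not value:
                flagged.append((record["id"], level, ctype))
    return flagged

# Made-up sample records for illustration only.
sample = [
    {"id": 1, "container1_type": "Box", "container1_value": "12",
     "container2_type": "Folder", "container2_value": ""},
    {"id": 2, "container1_type": "Box", "container1_value": "3"},
    {"id": 3, "container1_type": "Box", "container1_value": None},
]

flagged = find_blank_indicators(sample)
for record_id, level, ctype in flagged:
    print(f"record {record_id}: container {level} has type {ctype!r} but no value")
```

Each flagged record can then either be given a real value or have the stray container type removed in the AT, so that the migration has nothing to auto-generate.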

And unless you have EAD files that aren’t in the AT already, I don’t think that you should need to go the route of importing EAD into ASpace.  If you do, there are more issues and tricks involved there, not to mention the fact that the more import options you allow during the migration, the more variety you’ll have in the results.  That’s not to say that EAD imports are bad in ASpace, since they’ve actually improved on the AT’s EAD import process.  Also, long after our migration, we still import EAD files into ASpace, especially when adding newly-created finding aids that are so large or complex that they’d be hard to create directly in ASpace. However, you’ll have to be mindful of all of the ways that those EAD imports can muddy up your database (the primary offender being the fact that the importer will add values to controlled value lists during the import process when it tries to parse them from free text data; see the comment here: https://github.com/archivesspace/archivesspace/blob/bc675bc12b72f6fb7818aae646958c80d54ff4de/backend/app/model/backend_enum_source.rb#L16).  In other words, free text like “2.5 Linear feet (4 boxes)” would result in a new extent type with a value of “Linear Feet (4 boxes)”, which is not a controlled extent type that anyone wants to have around.
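
To spot extent statements like that before an EAD import, something along these lines might help. This is a minimal Python sketch, not the ASpace importer’s actual parsing logic: it simply assumes that whatever follows the leading number would be treated as the extent type, and the controlled list shown is a hypothetical example.

```python
import re

# Hypothetical controlled list; substitute your own extent type values.
CONTROLLED_EXTENT_TYPES = {"linear feet", "cubic feet", "items"}

# A leading number, whitespace, then everything else as the candidate type.
EXTENT_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")

def split_extent(text):
    """Split free text into (number, type); return None without a leading number."""
    match = EXTENT_PATTERN.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2)

def needs_cleanup(text):
    """True if the text has no leading number or its type is not controlled."""
    parts = split_extent(text)
    if parts is None:
        return True
    return parts[1].lower() not in CONTROLLED_EXTENT_TYPES

for extent in ["2.5 Linear feet (4 boxes)", "2.5 Linear feet", "unknown"]:
    print(extent, "->", "needs cleanup" if needs_cleanup(extent) else "ok")
```

Anything flagged could be edited in the EAD first, for example by moving “(4 boxes)” into a separate container summary, so that only the controlled value reaches the extent type list.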

Okay, that’s probably more info than you wanted, but hopefully some of it’s helpful 😊

There are lots of others who have been through the same journey, so lots of folks to get advice from, including this recent article in the code4lib journal, https://journal.code4lib.org/articles/14871, which describes the migration to ASpace at Columbia University.

Keep us posted with how everything goes,


From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Corey Schmidt
Sent: Tuesday, 10 December, 2019 2:09 PM
To: archivesspace_users_group at lyralists.lyrasis.org
Subject: [Archivesspace_Users_Group] Archivists' Toolkit to ArchivesSpace Migration Workflow

Dear ArchivesSpace Community,

Hello, my name is Corey Schmidt. I am the ArchivesSpace Project Manager at the University of Georgia Libraries. I am responsible for managing the migration from Archivists’ Toolkit to ArchivesSpace and revamping our public interface to include resources from both the Hargrett Rare Book and Manuscript Library and the Richard B. Russell Library for Political Research and Studies.

I created a project workflow of how I expect the project to unfold over the next year and I was hoping to get people’s feedback on it. Specifically, I’m looking for help identifying any missing processes, changing the order of the steps involved, or anything else that should be advised moving forward. I derived a lot of information from publications by the University of Minnesota, Harvard University, Yale University, the Rockefeller Archive Center, and the Bentley Historical Library at the University of Michigan.

Thank you for your time and assistance! I have attached the workflow to this email.



Corey Schmidt
ArchivesSpace Project Manager | University of Georgia Libraries
Email: Corey.Schmidt at uga.edu
Phone: +1-706-542-8151
