[Archivesspace_Users_Group] [EXTERNAL] Re: ASpace database constraints

Mayo, Dave dave_mayo at harvard.edu
Mon Jan 6 10:48:30 EST 2020

You’re very welcome, and I’m real glad ASnake is working well for you.

Re: JSONModel vs EAD export – the JSONModel version is authoritative – that is, it’s “the record as the system understands it.”  Import/export to EAD are best effort and defined by the code doing the conversion.  So there can be differences, and those differences can be lossy; if you’re writing code to make scripted changes using the API, you’re much better off altering the JSON and reuploading it than trying to do so via EAD.
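
As a rough illustration of that "alter the JSON and reupload it" workflow, here is a minimal sketch. It assumes `client` is an already-authorized ASnakeClient (any object with requests-style `get`/`post` methods would do); the function name and the choice of editing `title` are hypothetical, picked only for the example.

```python
def update_resource_title(client, repo_id, resource_id, new_title):
    """Fetch a resource as JSON, change one field, and post it back.

    `client` is assumed to be an authorized asnake.client.ASnakeClient;
    this helper itself is illustrative, not part of ASnake.
    """
    uri = f"repositories/{repo_id}/resources/{resource_id}"

    resp = client.get(uri)
    if resp.status_code != 200:
        raise RuntimeError(f"GET {uri} failed with status {resp.status_code}")

    record = resp.json()
    record["title"] = new_title  # the scripted change

    # Post the whole record back to the same URI. The fetched JSON is sent
    # back otherwise unmodified, so fields such as lock_version go with it.
    update = client.post(uri, json=record)
    if update.status_code != 200:
        raise RuntimeError(f"POST {uri} failed with status {update.status_code}")
    return update.json()
```

Sending back the full record as fetched, rather than a partial document, keeps the edit limited to the fields the script actually touched.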


If you do want/need to upload EAD, there’s a plugin that provides an API route for doing so: https://github.com/lyrasis/aspace-jsonmodel-from-format

And if you want to get EAD out of the system:

Looking at the script, it appears to be just calling an API method, specifically this one: https://archivesspace.github.io/archivesspace/api/#get-export-metadata-for-a-resource-description

The definition of “export” in the ruby file called by the scripts gives us the params being used:

def export(id)
    params = "include_unpublished=false&include_daos=true&numbered_cs=true"
    url = URI("#{AppConfig[:backend_url]}/repositories/#{repo_id}/resource_descriptions/#{id}.xml?#{params}")
    get(url, :xml)

So, using the example values of 2 for repo_id and 42 for the resource ID, in ASnake this looks like:

repo_id = 2
resource_id = 42
resp = asnake_client.get(f'repositories/{repo_id}/resource_descriptions/{resource_id}.xml', params={'include_unpublished': False, 'include_daos': True, 'numbered_cs': True})
if resp.status_code == 200:
    # resp.text gives the EAD as a string; use resp.content for bytes
    ead = resp.text
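
To go one step further and write the exported EAD to disk, something like the following sketch could work. It assumes `client` is an authorized ASnakeClient; the function name, the `EXPORT_PARAMS` constant, and the output file naming are hypothetical choices made for the example.

```python
from pathlib import Path

# Same parameters the export script above passes to the API.
EXPORT_PARAMS = {
    "include_unpublished": False,
    "include_daos": True,
    "numbered_cs": True,
}

def save_ead(client, repo_id, resource_id, out_dir="."):
    """Export one resource as EAD and write it to <out_dir>/resource_<id>.xml.

    `client` is assumed to be an authorized asnake.client.ASnakeClient.
    Returns the path of the file that was written.
    """
    uri = f"repositories/{repo_id}/resource_descriptions/{resource_id}.xml"
    resp = client.get(uri, params=EXPORT_PARAMS)
    if resp.status_code != 200:
        raise RuntimeError(f"EAD export of {uri} failed with status {resp.status_code}")

    out_path = Path(out_dir) / f"resource_{resource_id}.xml"
    out_path.write_bytes(resp.content)  # write bytes as received from the API
    return out_path
```

Writing `resp.content` rather than `resp.text` avoids re-encoding the XML on the way to disk.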

Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of "Huebschen, Alan M" <ahueb2 at uis.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Date: Monday, December 23, 2019 at 10:42 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] [EXTERNAL] Re: ASpace database constraints

Thanks Dave!

ASnake is incredibly helpful; I've written scripts to automatically sort through our records and delete what needs to be removed for our database merge.

It looks like it is possible to upload JSON records through the ASpace API. Initially I was planning on uploading our modified records in EAD format, but it would be nice if I could streamline the entire process through the API.

Does anyone know if the JSON records obtained through the API differ from EAD records obtained using the ead_export script that comes with ASpace? At first glance it appears there might be some differences.

Or might there be a way to incorporate EAD export/import using ASnake and the ASpace API?

-Alan Huebschen

University of Illinois at Springfield
Brookens Library Information Systems

From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Mayo, Dave <dave_mayo at harvard.edu>
Sent: Friday, December 20, 2019 9:02 AM
To: Archivesspace Users Group
Subject: [EXTERNAL] Re: [Archivesspace_Users_Group] ASpace database constraints

Hi Alan,

So, it’s _possible_ to do a cascade delete of records in ASpace, but for a couple of reasons, I think it’s not necessarily a real good idea.  Solr is unhappy not just because there’s leftover info in other tables, but also probably because it can’t inherently see the bulk deletions in MySQL.

I think it’d probably be safer to delete things via the ArchivesSpace API; this way Solr will stay consistent throughout, and subsidiary records that depend on Resource _should_ all get deleted as well (I’m not quite comfortable saying “will,” but if not, you can also clean those up via the API).

There’s an actively maintained API client for ArchivesSpace, ArchivesSnake - https://github.com/archivesspace-labs/archivessnake, and a lot of example scripts are linked in the documentation.  I think in principle, this would be fairly straightforward; you’d need to collect the ids of the resources you want to delete, and then delete them.

So, just as a simplified example, if you wanted to delete all resources in repository 2:

from asnake.client import ASnakeClient

client = ASnakeClient(username="admin", password="admin", baseurl="http://path.to.backend")
client.authorize()  # log in so later requests carry a session token

# all_ids=True asks for a plain list of resource ids in the repository
repos_response = client.get('repositories/2/resources', params={"all_ids": True})
if repos_response.status_code != 200:
    raise RuntimeError("Something went wrong!")
for res_id in repos_response.json():
    delete_response = client.delete('repositories/2/resources/{}'.format(res_id))
    if delete_response.status_code != 200:
        print("Failed to delete {}".format(res_id))
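
For the more selective case described above (collect the ids you want gone, then delete just those), a sketch along these lines may be a useful starting point. It assumes `client` is an authorized ASnakeClient and that `resource_ids` was gathered beforehand by whatever criteria apply; the function name is hypothetical.

```python
def delete_resources(client, repo_id, resource_ids):
    """Delete the given resources one by one via the API.

    `client` is assumed to be an authorized asnake.client.ASnakeClient.
    Returns the list of ids whose DELETE did not come back with HTTP 200,
    so they can be inspected or retried afterwards.
    """
    failed = []
    for res_id in resource_ids:
        resp = client.delete(f"repositories/{repo_id}/resources/{res_id}")
        if resp.status_code != 200:
            failed.append(res_id)
    return failed
```

Returning the failures instead of only printing them makes it easier to rerun the script against just the records that were left behind.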

If you decide to try this and run into trouble, please feel free to email me; I’d be happy to help walk you through setup/troubleshooting.
Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of "Huebschen, Alan M" <ahueb2 at uis.edu>
Reply-To: Archivesspace Users Group <Archivesspace_Users_Group at lyralists.lyrasis.org>
Date: Friday, December 20, 2019 at 9:44 AM
To: Archivesspace Users Group <Archivesspace_Users_Group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] ASpace database constraints

Good morning all,

I've been working with some test instances of our database and I need to remove some records in bulk. We are currently running ASpace against a MySQL instance and I am attempting to remove all traces of specific records from the tables.

Easily enough I can delete the records from the resource table after disabling foreign key checks; however, it appears that there is information left over in other tables, making Solr an unhappy camper. I created an EER diagram in MySQL Workbench to try and figure out which records are tied together, but as someone who is fairly new to database work, it's a bit of a headache to wrap my mind around.

From the research I've done, some records can be set as a parent and with a cascade setting the child records in other tables will be removed when the parent is removed. I've looked at some of the table settings but I haven't been able to figure out what needs to be removed to clean up the db or what the proper order of removal would be.

Has anyone here removed resource table entries and their associated records with success? How can I go about figuring out what I need to remove and/or how to remove it?

-Alan Huebschen

University of Illinois at Springfield

Brookens Library Information Systems