[Archivesspace_Users_Group] [EXTERNAL] Re: ASpace database constraints

Huebschen, Alan M ahueb2 at uis.edu
Mon Dec 23 10:42:25 EST 2019


Thanks Dave!


ASnake is incredibly helpful. I've written scripts that automatically sort through our records and delete what needs to be removed for our database merge.

It looks like it is possible to upload JSON records through the ASpace API. Initially I was planning on uploading our modified records in EAD format, but it would be nice if I could streamline the entire process through the API.


Does anyone know if the JSON records obtained through the API differ from the EAD records obtained using the ead_export script that comes with ASpace? At first glance it appears there might be some differences.

Or might there be a way to incorporate EAD export/import using ASnake and the ASpace API?
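
One way to compare the two might be to pull the same resource both ways through the backend API. A minimal sketch, assuming client is an authorized ASnakeClient and that the EAD export is served from the resource_descriptions/<id>.xml endpoint; the fetch_both helper name and the ids are made up for illustration:

```python
def fetch_both(client, repo_id, resource_id):
    """Return (json_record, ead_xml) for a single resource.

    `client` is assumed to behave like an authorized ASnakeClient,
    i.e. client.get(path) returns a requests-style response object.
    """
    # JSON representation of the resource record
    json_resp = client.get(
        'repositories/{}/resources/{}'.format(repo_id, resource_id))
    # EAD (XML) export of the same resource
    ead_resp = client.get(
        'repositories/{}/resource_descriptions/{}.xml'.format(repo_id, resource_id))
    for resp in (json_resp, ead_resp):
        if resp.status_code != 200:
            raise RuntimeError('Request failed: HTTP {}'.format(resp.status_code))
    return json_resp.json(), ead_resp.text
```

Something like record, ead = fetch_both(client, 2, 1) would then give both versions to look at side by side.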


-Alan Huebschen

University of Illinois at Springfield

Brookens Library Information Systems


________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Mayo, Dave <dave_mayo at harvard.edu>
Sent: Friday, December 20, 2019 9:02 AM
To: Archivesspace Users Group
Subject: [EXTERNAL] Re: [Archivesspace_Users_Group] ASpace database constraints

Hi Alan,

So, it’s _possible_ to do a cascade delete of records in ASpace, but for a couple of reasons, I think it’s not necessarily a real good idea.  Solr is unhappy not just because there’s leftover info in other tables, but also probably because it can’t inherently see the bulk deletions in MySQL.

I think it’d probably be safer to delete things via the ArchivesSpace API; this way Solr will stay consistent throughout, and subsidiary records that depend on Resource _should_ all get deleted as well (I’m not quite comfortable saying “will,” but if not, you can also clean those up via the API).

There’s an actively maintained API client for ArchivesSpace, ArchivesSnake - https://github.com/archivesspace-labs/archivessnake; and a lot of example scripts are linked in the documentation.  I think in principle, this would be fairly straightforward; you’d need to collect the ids of the resources you want to delete, and then delete them.

So, just as a simplified example, if you wanted to delete all resources in repository 2:

from asnake.client import ASnakeClient
client = ASnakeClient(username="admin", password="admin", baseurl="http://path.to.backend")
client.authorize()  # log in before making requests
# Collect the ids of every resource in repository 2
repos_response = client.get('repositories/2/resources', params={"all_ids": True})
if repos_response.status_code != 200:
    raise RuntimeError("Something went wrong!")
# Delete each resource through the API rather than directly in MySQL
for res_id in repos_response.json():
    delete_response = client.delete('repositories/2/resources/{}'.format(res_id))
    if delete_response.status_code != 200:
        print("Failed to delete {}".format(res_id))
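
The same loop can be wrapped in a function with a dry-run switch, so you can see what would be deleted before anything is actually removed. This is only a sketch: delete_resources and its dry_run flag are names invented here, not part of ArchivesSnake, and client is assumed to be an authorized ASnakeClient as above.

```python
def delete_resources(client, repo_id, dry_run=True):
    """Delete every resource in one repository; return the ids that failed.

    `client` is assumed to behave like an authorized ASnakeClient.
    With dry_run=True nothing is deleted; the targets are only printed.
    """
    base = 'repositories/{}/resources'.format(repo_id)
    listing = client.get(base, params={'all_ids': True})
    if listing.status_code != 200:
        raise RuntimeError(
            'Could not list resources: HTTP {}'.format(listing.status_code))

    failed = []
    for res_id in listing.json():
        path = '{}/{}'.format(base, res_id)
        if dry_run:
            print('Would delete {}'.format(path))
            continue
        response = client.delete(path)
        if response.status_code != 200:
            # Keep going, but remember what needs a second look
            failed.append(res_id)
    return failed
```

Running delete_resources(client, 2) first and then delete_resources(client, 2, dry_run=False) gives you a chance to check the list, and the returned ids tell you which records to clean up by hand.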

If you decide to try this and run into trouble, please feel free to email me; I’d be happy to help walk you through setup/troubleshooting.

--
Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of "Huebschen, Alan M" <ahueb2 at uis.edu>
Reply-To: Archivesspace Users Group <Archivesspace_Users_Group at lyralists.lyrasis.org>
Date: Friday, December 20, 2019 at 9:44 AM
To: Archivesspace Users Group <Archivesspace_Users_Group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] ASpace database constraints


Good morning all,



I've been working with some test instances of our database and I need to remove some records in bulk. We are currently running ASpace against a MySQL instance and I am attempting to remove all traces of specific records from the tables.



I can easily delete the records from the resource table after disabling foreign key checks; however, it appears that information is left over in other tables, making Solr an unhappy camper. I created an EER diagram in MySQL Workbench to try and figure out which records are tied together, but as someone who is fairly new to database work, it's a bit of a headache to wrap my mind around.



From the research I've done, some records can be set as a parent, and with a cascade setting the child records in other tables will be removed when the parent is removed. I've looked at some of the table settings but I haven't been able to figure out what needs to be removed to clean up the db or what the proper order of removal would be.



Has anyone here removed resource table entries and their associated records with success? How can I go about figuring out what I need to remove and/or how to remove it?



-Alan Huebschen

University of Illinois at Springfield

Brookens Library Information Systems