[Archivesspace_Users_Group] AT Migration problem with repository processing note

Custer, Mark mark.custer at yale.edu
Thu Feb 4 11:32:48 EST 2016


We had a need to do something similar as part of our migration (the migration was last year, which I only mention because my memory is a bit foggy about this now).  We used the second approach that Noah described:  creating match points by concatenating the resourceIdentifier + refId from the AT.  With this approach, we didn’t have any false matches, because the resource identifiers have to be unique in the AT, and the reference IDs have to be unique per resource, so I’d say that’s the way to go.

Here’s the function that we used for this in SQL, which someone in our IT department wrote for us in a few minutes after hearing about our need to do something recursive-like in our AT database (Thanks, Steelsen!):

DELIMITER $$
DROP FUNCTION IF EXISTS `your_at_database_name_here`.`getResourceFromComponent` $$
CREATE FUNCTION `your_at_database_name_here`.`getResourceFromComponent` (GivenID INT) RETURNS VARCHAR(1024)
DETERMINISTIC
BEGIN
                DECLARE rv INT;
    DECLARE tp INT;
    DECLARE ch INT;
SET tp = GivenID; /*There is no component 0 so this will be returned if first hit is top level*/
    SET ch = GivenID;
    WHILE ch > 0 DO
        SELECT IFNULL(parentResourceComponentId,-1) INTO ch FROM
        (SELECT parentResourceComponentId FROM resourcescomponents WHERE resourceComponentId = ch) A;
        IF ch > 0 THEN
            SET tp = ch; /*Keep replacing with the next value up the tree until you hit -1 which means the parent was null*/
        END IF;
    END WHILE;
select resourceId into rv from resourcescomponents where resourceComponentId = tp;
    RETURN rv;
END $$
DELIMITER ;

With that, you can then do something like this:

select getResourceFromComponent(4444);

…which will give you the AT resourceIdentifier for the resource component that has an id = 4444.

Hopefully that (or a similar approach) will help,

Mark




From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Noah Huffman
Sent: Thursday, February 04, 2016 11:12 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] AT Migration problem with repository processing note

Hi Ian,

I’m not sure I have a great solution for this…

The refIDs in AT should be unique within the context of a resource and those refIDs should match the ASpace refID values before the underscore and extra characters assigned by the migarator (e.g. ‘ref64’ in AT becomes ‘ref64_h5n’ in ASpace).

I wonder if you could try matching on a combination of things, like the first part of the refID before the underscore and the ResourceComponent title?  These data elements are both in ATs ResourcesComponents table.  There might be some false matches here if you have lots of common titles like “Correspondence,” but if your titles are somewhat unique it might be a good strategy.  Depending on how many Repository Processing notes you’re trying to move, you could also review the matches before pushing the updates to ASpace.

A better strategy would be to match on the first part of the refID string and also the resource identifier (‘resourceIdentifier’ field in AT’s Resource table and ‘identifier’ field in AS’s resource table).  Determining the resource identifier based on a component’s refID might require some more advanced SQL.  I haven’t done this, but others on the list might have.

Any ideas?

-Noah

From: archivesspace_users_group-bounces at lyralists.lyrasis.org<mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org> [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Hardy, Ian
Sent: Thursday, February 04, 2016 9:50 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org<mailto:archivesspace_users_group at lyralists.lyrasis.org>>
Subject: Re: [Archivesspace_Users_Group] AT Migration problem with repository processing note

Hi Noah and others, one problem we're running into in moving these repository processing notes is that there doesn't appear to be a consistent identifier shared by the toolkit ResourcesCompoenents and Aspace archival_object table. In particular the persitentIDs in toolkit are not unique in Aspace, so the Aspace migrator adds some extra characters at the end to create it's identifier, the ref_ID. Anyone have a recommended methodology for matching between the platforms?

Thanks,

Ian

On Mon, Feb 1, 2016 at 4:37 PM, Hardy, Ian <ihardy at email.gwu.edu<mailto:ihardy at email.gwu.edu>> wrote:
Thanks Noah and Maureen, I was able to update some test repository processing notes using Noah's scripts as a starting point for interacting with the API. I think this will do the trick for us.

On Mon, Feb 1, 2016 at 10:52 AM, Noah Huffman <noah.huffman at duke.edu<mailto:noah.huffman at duke.edu>> wrote:
Christie,

I have a script that sort of does what Maureen suggests.  Was going to mention it earlier, but it's a bit undercooked....

https://github.com/noahgh221/archivesspace-duke-scripts/blob/master/duke_archival_object_metadata_adder.py<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_noahgh221_archivesspace-2Dduke-2Dscripts_blob_master_duke-5Farchival-5Fobject-5Fmetadata-5Fadder.py&d=AwMGaQ&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=KdUdoHRjsxlxxbhvIQVs_HjNucajSMLTQKfZJigxEoo&s=jp009w-j5oaIIwIn_q56kxhqnS8C5zrlBnjseo8C2tc&e=>

It can read a two-column spreadsheet (as TSV) and batch add Repository Processing notes to archival objects via the API based on ref_ID values in the spreadsheet:

The example above is a modified version of a script that folks at the Bentley wrote:
https://github.com/djpillen/bentley_scripts/blob/master/update_archival_object.py<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_djpillen_bentley-5Fscripts_blob_master_update-5Farchival-5Fobject.py&d=AwMGaQ&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=KdUdoHRjsxlxxbhvIQVs_HjNucajSMLTQKfZJigxEoo&s=FcmOPa2NB0nuOlkdbcSOotdEKfi3YBlfoBIDiLhRY7A&e=>

The comments in the script should help you figure out what you might need to modify.  For full disclosure, I'm a Python noob, so this is probably terribly written, but I can confirm that it works for my use case.

-Noah

-----Original Message-----
From: archivesspace_users_group-bounces at lyralists.lyrasis.org<mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org> [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org<mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org>] On Behalf Of Callahan, Maureen
Sent: Monday, February 01, 2016 10:43 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org<mailto:archivesspace_users_group at lyralists.lyrasis.org>>
Subject: Re: [Archivesspace_Users_Group] AT Migration problem with repository processing note

Hey Christie,

Doing it now, we would probably write a script to do an update using the API (better built-in validation, fewer opportunities to do something stupid). During the migration, there's a choice to keep or re-assign refids - you're definitely going to want to keep those to help match up the components.

It's worth noting that we made those SQL updates because of mistakes in the migrator. I'm interested to know if those ever got fixed.

MC


> On Feb 1, 2016, at 10:36 AM, Peterson, Christie <cspeterson at email.gwu.edu<mailto:cspeterson at email.gwu.edu>> wrote:
>
> Hi Noah,
>
> Yep, that's the kind of thing we're probably going to end up doing.
>
> Many thanks!
>
> Christie
> _______________________________________________
> Archivesspace_Users_Group mailing list
> Archivesspace_Users_Group at lyralists.lyrasis.org<mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lyralists.lyrasis.
> org_mailman_listinfo_archivesspace-5Fusers-5Fgroup&d=AwICAg&c=-dg2m7zW
> uuDZ0MUcV7Sdqw&r=JgH2YCQ8D3P9-Lm_x4bv3d2CZBYlbx6hxnLFHtfovi8&m=n1N3sMK
> X9kb8hCXfWyWw-9rmsPmqB9BhN_6Kckdfo5g&s=XXk6iOoNajjfK6Ebn6n3Oe4cvYGTMPW
> o6wDb_Hto0q8&e=

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org<mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group<https://urldefense.proofpoint.com/v2/url?u=http-3A__lyralists.lyrasis.org_mailman_listinfo_archivesspace-5Fusers-5Fgroup&d=AwMGaQ&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=KdUdoHRjsxlxxbhvIQVs_HjNucajSMLTQKfZJigxEoo&s=5IlEg6G6_5s7Hwonjt7wZtU5RGdvO4vntjYAIgV7pcU&e=>
_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org<mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group<https://urldefense.proofpoint.com/v2/url?u=http-3A__lyralists.lyrasis.org_mailman_listinfo_archivesspace-5Fusers-5Fgroup&d=AwMGaQ&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=KdUdoHRjsxlxxbhvIQVs_HjNucajSMLTQKfZJigxEoo&s=5IlEg6G6_5s7Hwonjt7wZtU5RGdvO4vntjYAIgV7pcU&e=>



--
Ian Hardy
Systems Specialist
GW Libraries
ihardy at gwu.edu<mailto:ihardy at email.gwu.edu>
helpdesk: (202) 994-8278



--
Ian Hardy
Systems Specialist
GW Libraries
ihardy at gwu.edu<mailto:ihardy at email.gwu.edu>
helpdesk: (202) 994-8278
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/attachments/20160204/06412236/attachment.html>


More information about the Archivesspace_Users_Group mailing list