[Archivesspace_Users_Group] EAD Importer and DAOs

Mayo, Dave dave_mayo at harvard.edu
Fri May 25 20:55:04 EDT 2018


IIRC, a lot of the smaller DB-level limits (and some of the larger ones) are the result of various defaults or arbitrary guesses - before saying that something is part of the data model and we should just live with it, one should always positively verify that it's legitimately part of the data model.

In this case, I can actually negatively verify this: in the commit where the file_version:caption field is added, the DB migration doesn't specify a length (and so gets the default of 255), but the schema, which does specify one, reads:

> "caption" => {"type" => "string", "maxLength" => 16384},

So it's definitely a bug - I'll file a ticket, and I'm planning to make a pull request to fix this, though it won't really be a solution to your problem until it gets accepted and you update.  

If you have DB access, I think you'd be fine to just go directly in and run:

ALTER TABLE `file_version` MODIFY caption VARCHAR(16384);

Since you're expanding it, you won't risk truncating data.
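
If you don't have DB access, or just want to know which files will trip this before you import, a pre-import check along the lines Tim and Brian discuss below could look roughly like the following. This is only a sketch: the function names and the CAPTION_LIMIT constant are made up for illustration, and it assumes the title lives on <dao> as either a plain title attribute or a namespaced one (e.g. xlink:title), so it matches on the local name only.

```python
import xml.etree.ElementTree as ET

# Current width of file_version.caption as described in this thread;
# adjust if your database column has been widened.
CAPTION_LIMIT = 255


def local_name(name):
    """Strip an XML namespace, e.g. '{urn:isbn:1-931666-22-9}dao' -> 'dao'."""
    return name.rsplit("}", 1)[-1]


def long_dao_titles(ead_xml, limit=CAPTION_LIMIT):
    """Return the title of every <dao> whose title attribute is longer than limit."""
    root = ET.fromstring(ead_xml)
    too_long = []
    for elem in root.iter():
        if local_name(elem.tag) != "dao":
            continue
        for attr, value in elem.attrib.items():
            # Matches both title="..." and xlink:title="..."
            if local_name(attr) == "title" and len(value) > limit:
                too_long.append(value)
    return too_long
```

To run it against a file, pass in the raw bytes, e.g. long_dao_titles(open("my_ead.xml", "rb").read()), where my_ead.xml is a placeholder for your own export. It only reports the long titles and doesn't change them, so the full title is still there for digital_object:title.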

- Dave Mayo
ASpace Core Committer's Group

On 5/25/18, 9:34 AM, "archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of Brian Harrington" <archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of brian.harrington at lyrasis.org> wrote:

    Hi Tim,
    
    I agree it seems a bit backwards to change the data model to suit the importer, which is one of the reasons I decided to pose the question to the list.  There could be valid reasons (display issues?) for limiting the length of the caption, but these things are often assigned somewhat arbitrarily, so I thought I would ask.  If there are reasons for keeping the caption at 255, then I think it makes sense to truncate it in the importer, rather than just having things die on a database error.
    
    I currently use a modified version of Mark Custer’s schematron <https://github.com/fordmadox/schematrons/blob/master/ArchivesSpace-EAD-validator.sch> to check EADs prior to import, and can certainly add code to check <dao> @titles.  The problem with doing that is the double use of @title for both digital_object:title and file_version:caption.  Since ASpace supports long titles, and the archivist presumably assigned a long title for a reason, I would hate to shorten it before import just to make sure that it fits when re-used as a caption.
    
    Thanks,
    
    Brian
    
    > On May 24, 2018, at 5:25 PM, Timothy Dilauro <timmo at jhu.edu> wrote:
    > 
    > Hi Brian,
    > 
    > I don't think it's a good idea to change the data model just to avoid imports failing, though there may be other rationales that result in such a change.
    > 
    > In the meantime, it might be useful to write some XSLT or some other custom code to perform sanity checks against ASpace restrictions ahead of EAD import attempts. In that manner, those non-conformant captions (and anything else you check on) could be tweaked before import.
    > 
    > Cheers,
    > ~Tim
    > 
    >> On May 23, 2018, at 2:39 PM, Brian Harrington <brian.harrington at lyrasis.org> wrote:
    >> 
    >> 
    >> Currently when importing an EAD, <dao>s are used to create digital objects.  As part of this process, the @title attribute is used for both the digital object title, and the caption under file versions.  I've recently run into a fun issue with <dao>s with @titles longer than 255 characters.  These titles are OK for digital_object:title, which is VARCHAR(8704) but too long for file_version:caption, which is VARCHAR(255).  So the import fails.
    >> 
    >> Should this be considered a bug?  If it is, and if one were theoretically considering a PR, would it make more sense to harmonize the length of the title and caption, or truncate the caption to 255 characters?  My inclination is just to increase the maximum length of captions, and rely on people to show restraint, but I know that other people might have different opinions.
    >> 
    >> Thanks,
    >> 
    >> Brian
    >> 
    >> --
    >> Brian Harrington
    >> Migration Specialist
    >> LYRASIS
    >> brian.harrington at lyrasis.org
    >> skype: abbistani
    >> 
    >> 
    >> _______________________________________________
    >> Archivesspace_Users_Group mailing list
    >> Archivesspace_Users_Group at lyralists.lyrasis.org
    >> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
    >> 
    > 
    
