[Archivesspace_Users_Group] Invalid EAD export

Chris Fitzpatrick Chris.Fitzpatrick at lyrasis.org
Tue Sep 15 09:30:56 EDT 2015



Hi Chris,


No, unfortunately it won't.

The problem is that there are so many edge cases in how people use EAD that it's pretty much impossible to predict them all. I've probably devoted over 30 hours to the problems with <p> tags alone.


The best bet is for you to either make a plugin or send us a pull request with the changes you think should be made.


best,  Chris.



Chris Fitzpatrick | Developer, ArchivesSpace
Skype: chrisfitzpat  | Phone: 918.236.6048
http://archivesspace.org/


________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Chris Powell <sooty at umich.edu>
Sent: Tuesday, September 15, 2015 3:25 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Invalid EAD export

Aside from the theoretical question of export validity, this is a case where ArchivesSpace itself is importing valid EAD and exporting invalid EAD when no changes have been made to the resource at all. Is it safe to assume that the

content.strip.start_with?("<p")

check will be modified for the next release?

On Tue, Sep 15, 2015 at 9:18 AM, Chris Fitzpatrick <Chris.Fitzpatrick at lyrasis.org> wrote:



Hi All,

I actually think it's probably more important that people have a way to get their data out rather than attempting to disallow export based on the validation to EAD. If you have problems in your records, how would you know where to find them if you were not allowed to export?

Right, so if the exporter detects that the note starts with markup, it assumes you've already inserted <p> tags in there, and so it leaves the content as it is. This is because a lot of folks put the <p> tags in their notes manually.

If you have a different use case, it's pretty easy to modify the exporter.



best, chris.



Chris Fitzpatrick | Developer, ArchivesSpace
Skype: chrisfitzpat  | Phone: 918.236.6048
http://archivesspace.org/


________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Chris Powell <sooty at umich.edu>
Sent: Tuesday, September 15, 2015 3:07 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Invalid EAD export

Good eye, Dallas! So yes, it looks like it just needs a better match -- on either "<p>" or "<p " -- to resolve the issue.
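
One way the tighter match suggested here could look, as a rough sketch only. The constant `P_OPEN` and the method `starts_with_p?` are hypothetical names for illustration and are not taken from the ArchivesSpace exporter.

```ruby
# Hypothetical stricter check (assumption, not the exporter's code):
# true only when the note opens with an actual <p> element, i.e. "<p"
# followed by whitespace, ">" or "/". "<persname>" no longer matches.
P_OPEN = /\A<p[\s>\/]/

def starts_with_p?(content)
  content.strip.match?(P_OPEN)
end

puts starts_with_p?("<p>Francis Steiner was born ...</p>")
puts starts_with_p?("<p id=\"bio1\">Francis Steiner was born ...</p>")
puts starts_with_p?("<persname>Francis Steiner</persname>was born ...")
```

With a check like this, a note that opens with persname would be treated as unwrapped, which is presumably the behaviour wanted in this case.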

I think, just on principle, ASpace should never export invalid EAD.

On Tue, Sep 15, 2015 at 8:58 AM, Dallas Pillen <djpillen at umich.edu> wrote:
Could this be what's causing the issue?

https://github.com/archivesspace/archivesspace/blob/master/backend/app/exporters/serializers/ead.rb#L18-L28

if ( content.strip.start_with?("<p") will match both "<p>" and "<persname>", so if a note starts with either of those (or any other tag that begins with p) the content will not get wrapped in a <p> tag on export.
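
The false match can be reproduced in plain Ruby outside ArchivesSpace. This is only a sketch: the helper `looks_wrapped?` is a hypothetical name that mirrors the quoted condition, and the sample notes are made up for illustration.

```ruby
# Hypothetical helper mirroring the quoted condition; not the exporter's code.
def looks_wrapped?(content)
  content.strip.start_with?("<p")
end

samples = [
  "<p>Francis Steiner was born ...</p>",
  "<persname>Francis Steiner</persname>was born ...",
  "Plain text note"
]

# Prints whether each sample would be treated as already wrapped in <p>.
samples.each do |note|
  puts "#{looks_wrapped?(note)}  #{note}"
end
```

The first two samples are both treated as already wrapped, even though only the first one starts with a real <p> element.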

On Tue, Sep 15, 2015 at 8:36 AM, Chris Powell <sooty at umich.edu> wrote:
I am suspicious -- the ONLY instance where this occurs is when the initial tag is persname.  It seems to me this could be some sort of failed test to see if <p> is already there.

On Tue, Sep 15, 2015 at 3:10 AM, Chris Fitzpatrick <Chris.Fitzpatrick at lyrasis.org> wrote:

Hi Chris,


Yes, the <p> is one of the more unfortunate aspects of EAD.

For this use case ( where you start the note with markup ), you have to add your own <p> tags to wrap the note.

b,chris.



Chris Fitzpatrick | Developer, ArchivesSpace
Skype: chrisfitzpat  | Phone: 918.236.6048
http://archivesspace.org/


________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Chris Powell <sooty at umich.edu>
Sent: Monday, September 14, 2015 5:42 PM
To: archivesspace_users_group
Subject: Re: [Archivesspace_Users_Group] Invalid EAD export

Please disregard the hyphen in the example EAD import! The hazards of cutting and pasting out of Internet Explorer.

<bioghist encodinganalog="545"><p><persname>Francis Steiner</persname>was born January 16, 1895 in New Jersey, to German parents. He was the oldest of three children.</p><p>A communist and conscientious objector [etc.] </p><p>There is no information regarding Francis Steiner after his last letter of November 7, 1920. </p></bioghist>

On Mon, Sep 14, 2015 at 11:15 AM, Chris Powell <sooty at umich.edu> wrote:
Hello --

It appears that if the first word or phrase in any of the "notes" elements that contain a text block and support mixed content (such as abstract, bioghist, or scopecontent) is wrapped in a persname, the EAD export is invalid because all paragraphs lack p element wrappers.

I've tested this with other elements starting the first paragraph, and with persname starting the second paragraph, and neither causes problems. Only persname at the start of the first paragraph does.

Example bioghist EAD prior to import:

<bioghist encodinganalog="545">-<p><persname>Francis Steiner</persname>was born January 16, 1895 in New Jersey, to German parents. He was the oldest of three children.</p><p>A communist and conscientious objector [etc.] </p><p>There is no information regarding Francis Steiner after his last letter of November 7, 1920. </p></bioghist>

Example bioghist EAD after export:

<bioghist id="aspace_6e16003b2d18f8ad6c487cd5712fc162"><head>Biographical / Historical</head><persname>Francis Steiner</persname>was born January 16, 1895 in New Jersey, to German parents. He was the oldest of three children. A communist and conscientious objector [etc.] There is no information regarding Francis Steiner after his last letter of November 7, 1920. </bioghist>
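
A quick way to spot this symptom in an exported note, sketched with Ruby's REXML library. The method name `loose_text?` is hypothetical, the two sample strings are shortened versions of the examples above, and the check is only a rough heuristic: it does not validate against the EAD DTD or schema.

```ruby
require "rexml/document"

# Hypothetical check: true when a note element has non-whitespace text
# sitting directly under it instead of inside a child such as <p>.
def loose_text?(xml)
  root = REXML::Document.new(xml).root
  root.texts.any? { |t| !t.value.strip.empty? }
end

before_import = "<bioghist><p><persname>Francis Steiner</persname>was born January 16, 1895.</p></bioghist>"
after_export  = "<bioghist><head>Biographical / Historical</head><persname>Francis Steiner</persname>was born January 16, 1895.</bioghist>"

puts loose_text?(before_import)
puts loose_text?(after_export)
```

The shortened import example should report no loose text, while the shortened export example should be flagged.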


Chris Powell
University of Michigan
Digital Library Production Service


_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group







--
Dallas Pillen
Project Archivist

  Bentley Historical Library<http://bentley.umich.edu/>
  1150 Beal Avenue
  Ann Arbor, Michigan 48109-2113
  734.647.3559
  Twitter<https://twitter.com/umichBentley> Facebook <https://www.facebook.com/bentleyhistoricallibrary>





