[Archivesspace_Users_Group] Invalid EAD export

Chris Powell sooty at umich.edu
Tue Sep 15 09:35:00 EDT 2015


Thanks, we will do that.  As Robin Wendler's message points out, this is
likely a widespread issue and one that is difficult to fix.
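
For reference, here is a minimal sketch of the prefix check discussed in the quoted thread, next to the stricter match suggested there (on either "<p>" or "<p "). Only the start_with?("<p") expression comes from the thread. The method names loose_p_check? and strict_p_check? are hypothetical, and this is not the actual ArchivesSpace exporter code.

```ruby
# Hypothetical helpers for illustration; not the ArchivesSpace serializer.

# The check quoted in the thread: true for ANY tag whose name begins
# with "p" (<p>, <persname>, <ptr>, ...).
def loose_p_check?(content)
  content.strip.start_with?("<p")
end

# A stricter sketch: true only when the note starts with a <p> element,
# i.e. "<p" followed immediately by ">" or whitespace.
def strict_p_check?(content)
  content.strip.match?(/\A<p[\s>]/)
end

samples = [
  "<p>He was the oldest of three children.</p>",
  "<p id=\"a1\">He was the oldest of three children.</p>",
  "<persname>Francis Steiner</persname> was born January 16, 1895.",
  "A note with no markup at all."
]

# Print both results side by side for each sample note.
samples.each do |note|
  puts "loose=#{loose_p_check?(note)} strict=#{strict_p_check?(note)} #{note}"
end
```

For the note that starts with <persname>, the loose check reports true while the stricter one reports false, so an exporter using the stricter test would still wrap that note in <p>. Other edge cases, such as notes that mix wrapped and unwrapped paragraphs, are not covered by this sketch.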

On Tue, Sep 15, 2015 at 9:30 AM, Chris Fitzpatrick <
Chris.Fitzpatrick at lyrasis.org> wrote:

>
>
> Hi Chris,
>
>
> No, unfortunately it won't.
>
> The problem is that there are so many edge cases in how people use EAD
> that it's pretty much impossible to predict them all. I've probably devoted
> over 30 hours to the problems with <p> tags alone.
>
>
> The best bet is for you to either make a plugin or send us a pull request
> with what you think the changes should be.
>
>
> best,  Chris.
>
>
> Chris Fitzpatrick | Developer, ArchivesSpace
> Skype: chrisfitzpat  | Phone: 918.236.6048
> http://archivesspace.org/
>
>
> ------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Chris Powell <sooty at umich.edu>
> *Sent:* Tuesday, September 15, 2015 3:25 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] Invalid EAD export
>
> Aside from the theoretical question of export validity, this is a case
> where ArchivesSpace itself is importing a valid EAD and exporting invalid
> EAD when no changes have been made to the resource at all.  Is it safe to
> assume that the
>
> ( content.strip.start_with?("<p")
>
> will be modified for the next release?
>
> On Tue, Sep 15, 2015 at 9:18 AM, Chris Fitzpatrick <
> Chris.Fitzpatrick at lyrasis.org> wrote:
>
>>
>>
>> Hi All,
>>
>> I actually think it's probably more important that people have a way to
>> get their data out rather than attempting to disallow export based on the
>> validation to EAD. If you have problems in your records, how would you know
>> where to find them if you were not allowed to export?
>>
>> Right, so if the exporter detects that the note starts with markup, it
>> assumes you've already inserted <p> tags and leaves the note as is. This
>> is because a lot of folks put the <p> in their notes manually.
>>
>> If you have a different use case, it's pretty easy to modify the
>> exporter.
>>
>>
>> best, chris.
>>
>>
>>
>> Chris Fitzpatrick | Developer, ArchivesSpace
>> Skype: chrisfitzpat  | Phone: 918.236.6048
>> http://archivesspace.org/
>>
>>
>> ------------------------------
>> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
>> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
>> Chris Powell <sooty at umich.edu>
>> *Sent:* Tuesday, September 15, 2015 3:07 PM
>> *To:* Archivesspace Users Group
>> *Subject:* Re: [Archivesspace_Users_Group] Invalid EAD export
>>
>> Good eye, Dallas! So yes, it looks like it just needs a better match --
>> on either "<p>" or "<p " -- to resolve the issue.
>>
>> I think, just on principle, ASpace should never export invalid EAD.
>>
>> On Tue, Sep 15, 2015 at 8:58 AM, Dallas Pillen <djpillen at umich.edu>
>> wrote:
>>
>>> Could this be what's causing the issue?
>>>
>>>
>>> https://github.com/archivesspace/archivesspace/blob/master/backend/app/exporters/serializers/ead.rb#L18-L28
>>>
>>> The check content.strip.start_with?("<p") will match both "<p>" and
>>> "<persname>", so if a note starts with either of those (or any other tag
>>> that begins with p) the content will not get wrapped in a <p> tag on export.
>>>
>>> On Tue, Sep 15, 2015 at 8:36 AM, Chris Powell <sooty at umich.edu> wrote:
>>>
>>>> I am suspicious -- the ONLY instance where this occurs is when the
>>>> initial tag is persname.  It seems to me this could be some sort of failed
>>>> test to see if <p> is already there.
>>>>
>>>> On Tue, Sep 15, 2015 at 3:10 AM, Chris Fitzpatrick <
>>>> Chris.Fitzpatrick at lyrasis.org> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>>
>>>>> Yes, the <p> is one of the more unfortunate aspects of EAD.
>>>>>
>>>>> For this use case ( where you start the note with markup ), you have
>>>>> to add your own <p> tags to wrap the note.
>>>>>
>>>>> b,chris.
>>>>>
>>>>>
>>>>>
>>>>> Chris Fitzpatrick | Developer, ArchivesSpace
>>>>> Skype: chrisfitzpat  | Phone: 918.236.6048
>>>>> http://archivesspace.org/
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
>>>>> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
>>>>> Chris Powell <sooty at umich.edu>
>>>>> *Sent:* Monday, September 14, 2015 5:42 PM
>>>>> *To:* archivesspace_users_group
>>>>> *Subject:* Re: [Archivesspace_Users_Group] Invalid EAD export
>>>>>
>>>>> Please disregard the hyphen in the example EAD import! The hazards of
>>>>> cutting and pasting out of Internet Explorer.
>>>>>
>>>>> <bioghist encodinganalog="545"><p><persname>Francis
>>>>> Steiner</persname>was born January 16, 1895 in New Jersey, to German
>>>>> parents. He was the oldest of three children.</p><p>A communist and
>>>>> conscientious objector [etc.] </p><p>There is no information regarding
>>>>> Francis Steiner after his last letter of November 7, 1920. </p></bioghist>
>>>>>
>>>>> On Mon, Sep 14, 2015 at 11:15 AM, Chris Powell <sooty at umich.edu>
>>>>> wrote:
>>>>>
>>>>>> Hello --
>>>>>>
>>>>>> It appears that if the first word or phrase in any of the "notes"
>>>>>> elements that contain a text block and support mixed content (such as
>>>>>> abstract, bioghist or scopecontent) is wrapped in a persname, the EAD
>>>>>> export is invalid because all paragraphs lack p element wrappers.
>>>>>>
>>>>>> I've tested this with other elements starting the first paragraph and
>>>>>> with persname starting the second paragraph, and those do not cause
>>>>>> problems; only persname at the start of the first paragraph does.
>>>>>>
>>>>>> Example bioghist EAD prior to import:
>>>>>>
>>>>>> <bioghist encodinganalog="545">-<p><persname>Francis
>>>>>> Steiner</persname>was born January 16, 1895 in New Jersey, to German
>>>>>> parents. He was the oldest of three children.</p><p>A communist and
>>>>>> conscientious objector [etc.] </p><p>There is no information regarding
>>>>>> Francis Steiner after his last letter of November 7, 1920. </p></bioghist>
>>>>>>
>>>>>> Example bioghist EAD after export:
>>>>>>
>>>>>> <bioghist
>>>>>> id="aspace_6e16003b2d18f8ad6c487cd5712fc162"><head>Biographical /
>>>>>> Historical</head><persname>Francis Steiner</persname>was born January 16,
>>>>>> 1895 in New Jersey, to German parents. He was the oldest of three children.
>>>>>> A communist and conscientious objector [etc.] There is no information
>>>>>> regarding Francis Steiner after his last letter of November 7, 1920.
>>>>>> </bioghist>
>>>>>>
>>>>>>
>>>>>> Chris Powell
>>>>>> University of Michigan
>>>>>> Digital Library Production Service
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Archivesspace_Users_Group mailing list
>>>>> Archivesspace_Users_Group at lyralists.lyrasis.org
>>>>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> *Dallas Pillen *Project Archivist
>>>
>>>
>>>   Bentley Historical Library <http://bentley.umich.edu/>
>>>   1150 Beal Avenue
>>>   Ann Arbor, Michigan 48109-2113
>>>   734.647.3559
>>>   Twitter <https://twitter.com/umichBentley> Facebook
>>> <https://www.facebook.com/bentleyhistoricallibrary>
>>>
>>>
>>>
>>
>>
>>
>
>
>