[Archivesspace_Users_Group] Invalid EAD export

Chris Powell sooty at umich.edu
Tue Sep 15 09:25:24 EDT 2015


Aside from the theoretical question of export validity, this is a case
where ArchivesSpace itself imports valid EAD and exports invalid EAD
when no changes have been made to the resource at all.  Is it safe to
assume that the check

content.strip.start_with?("<p")

will be modified for the next release?
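
For what it's worth, here is a minimal sketch of the difference between
the current prefix test and a stricter one that only accepts a real <p>
tag. The method names (loose_p_check?, strict_p_check?, wrap_note) are
hypothetical, this is not the actual exporter code in ead.rb, and it
assumes a single-paragraph note for simplicity.

```ruby
# Hypothetical helpers for illustration; not the ArchivesSpace exporter itself.

# The test quoted in this thread: true for ANY tag whose name begins with "p".
def loose_p_check?(content)
  content.strip.start_with?("<p")
end

# Stricter test: "<p" must be followed by whitespace, ">" or "/",
# so "<p>" and "<p id=...>" match but "<persname>" does not.
def strict_p_check?(content)
  !!(content.strip =~ /\A<p[\s>\/]/)
end

# Assumed wrapping behaviour: add <p> only when the note does not
# already start with one.
def wrap_note(content)
  strict_p_check?(content) ? content : "<p>#{content}</p>"
end

persname_note = "<persname>Francis Steiner</persname>was born January 16, 1895."

puts loose_p_check?(persname_note)   # true  (mistaken for an existing <p>)
puts strict_p_check?(persname_note)  # false (so the note would get wrapped)
puts wrap_note(persname_note)
```

With a test along these lines, a note that opens with <persname> would
still be wrapped, while notes that already open with <p> or <p ...> would
be left alone.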

On Tue, Sep 15, 2015 at 9:18 AM, Chris Fitzpatrick <
Chris.Fitzpatrick at lyrasis.org> wrote:

>
>
> Hi All,
>
> I actually think it's probably more important that people have a way to
> get their data out rather than attempting to disallow export based on the
> validation to EAD. If you have problems in your records, how would you know
> where to find them if you were not allowed to export?
>
> Right, so if the exporter detects the note starts with markup, it assumes
> you've already inserted <p> in there, and so it leaves them. This is
> because a lot of folks put the <p> in their notes manually.
>
> If you have a different use case, it's pretty easy to modify the exporter.
>
>
> best, chris.
>
>
>
> Chris Fitzpatrick | Developer, ArchivesSpace
> Skype: chrisfitzpat  | Phone: 918.236.6048
> http://archivesspace.org/
>
>
> ------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Chris Powell <sooty at umich.edu>
> *Sent:* Tuesday, September 15, 2015 3:07 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] Invalid EAD export
>
> Good eye, Dallas! So yes, it looks like it just needs a better match -- on
> either "<p>" or "<p " -- to resolve the issue.
>
> I think, just on principle, ASpace should never export invalid EAD.
>
> On Tue, Sep 15, 2015 at 8:58 AM, Dallas Pillen <djpillen at umich.edu> wrote:
>
>> Could this be what's causing the issue?
>>
>>
>> https://github.com/archivesspace/archivesspace/blob/master/backend/app/exporters/serializers/ead.rb#L18-L28
>>
>> The check content.strip.start_with?("<p") will match both "<p>" and
>> "<persname>", so if a note starts with either of those (or with any other
>> tag that begins with p) the content will not get wrapped in a <p> tag on
>> export.
>>
>> On Tue, Sep 15, 2015 at 8:36 AM, Chris Powell <sooty at umich.edu> wrote:
>>
>>> I am suspicious -- the ONLY instance where this occurs is when the
>>> initial tag is persname.  It seems to me this could be some sort of failed
>>> test to see if <p> is already there.
>>>
>>> On Tue, Sep 15, 2015 at 3:10 AM, Chris Fitzpatrick <
>>> Chris.Fitzpatrick at lyrasis.org> wrote:
>>>
>>>> Hi Chris,
>>>>
>>>>
>>>> Yes, the <p> is one of the more unfortunate aspects of EAD.
>>>>
>>>> For this use case ( where you start the note with markup ), you have to
>>>> add your own <p> tags to wrap the note.
>>>>
>>>> b,chris.
>>>>
>>>>
>>>>
>>>> Chris Fitzpatrick | Developer, ArchivesSpace
>>>> Skype: chrisfitzpat  | Phone: 918.236.6048
>>>> http://archivesspace.org/
>>>>
>>>>
>>>> ------------------------------
>>>> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
>>>> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
>>>> Chris Powell <sooty at umich.edu>
>>>> *Sent:* Monday, September 14, 2015 5:42 PM
>>>> *To:* archivesspace_users_group
>>>> *Subject:* Re: [Archivesspace_Users_Group] Invalid EAD export
>>>>
>>>> Please disregard the hyphen in the example EAD import! The hazards of
>>>> cutting and pasting out of Internet Explorer.
>>>>
>>>> <bioghist encodinganalog="545"><p><persname>Francis
>>>> Steiner</persname>was born January 16, 1895 in New Jersey, to German
>>>> parents. He was the oldest of three children.</p><p>A communist and
>>>> conscientious objector [etc.] </p><p>There is no information regarding
>>>> Francis Steiner after his last letter of November 7, 1920. </p></bioghist>
>>>>
>>>> On Mon, Sep 14, 2015 at 11:15 AM, Chris Powell <sooty at umich.edu> wrote:
>>>>
>>>>> Hello --
>>>>>
>>>>> It appears that if the first word or phrase in any of the "notes"
>>>>> elements that contain a text block and support mixed content (like
>>>>> abstract, bioghist or scopecontent) is wrapped in a persname, the EAD
>>>>> export is invalid because all paragraphs lack p element wrappers.
>>>>>
>>>>> I've tested this with other elements to start the first paragraph and
>>>>> persname to start the second paragraph and those do not cause problems,
>>>>> only persname to start the first paragraph.
>>>>>
>>>>> Example bioghist EAD prior to import:
>>>>>
>>>>> <bioghist encodinganalog="545">-<p><persname>Francis
>>>>> Steiner</persname>was born January 16, 1895 in New Jersey, to German
>>>>> parents. He was the oldest of three children.</p><p>A communist and
>>>>> conscientious objector [etc.] </p><p>There is no information regarding
>>>>> Francis Steiner after his last letter of November 7, 1920. </p></bioghist>
>>>>>
>>>>> Example bioghist EAD after export:
>>>>>
>>>>> <bioghist
>>>>> id="aspace_6e16003b2d18f8ad6c487cd5712fc162"><head>Biographical /
>>>>> Historical</head><persname>Francis Steiner</persname>was born January 16,
>>>>> 1895 in New Jersey, to German parents. He was the oldest of three children.
>>>>> A communist and conscientious objector [etc.] There is no information
>>>>> regarding Francis Steiner after his last letter of November 7, 1920.
>>>>> </bioghist>
>>>>>
>>>>>
>>>>> Chris Powell
>>>>> University of Michigan
>>>>> Digital Library Production Service
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Archivesspace_Users_Group mailing list
>>>> Archivesspace_Users_Group at lyralists.lyrasis.org
>>>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>>
>> *Dallas Pillen* Project Archivist
>>
>>
>>   Bentley Historical Library <http://bentley.umich.edu/>
>>   1150 Beal Avenue
>>   Ann Arbor, Michigan 48109-2113
>>   734.647.3559
>>   Twitter <https://twitter.com/umichBentley> Facebook
>> <https://www.facebook.com/bentleyhistoricallibrary>
>>
>>
>
>