[Archivesspace_Users_Group] ampersand issue

Chris Powell sooty at umich.edu
Wed Dec 9 10:03:36 EST 2015


So, you are inputting character entities, like &oslash; or &aacute;, and not
numeric entities or the UTF-8 characters themselves?
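
A minimal sketch of why the distinction matters, in case it helps. XML
predefines only five named entities (&amp;, &lt;, &gt;, &apos;, &quot;), so
a named entity such as &oslash; parses only if a DTD declares it, while a
numeric reference or the literal UTF-8 character needs no declaration. The
snippet uses only Python's standard library; parse_title and the sample
title are illustrative assumptions, not ArchivesSpace code.

```python
import html
import xml.etree.ElementTree as ET


def parse_title(xml_text):
    """Return the element text, or None if the snippet is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


# Three ways of writing the same character (U+00F8) in a title.
literal = parse_title("<unittitle>Br\u00f8derbund</unittitle>")
numeric = parse_title("<unittitle>Br&#248;derbund</unittitle>")
named = parse_title("<unittitle>Br&oslash;derbund</unittitle>")

# The named form is an HTML entity, so an HTML-aware unescape resolves it.
from_html = html.unescape("Br&oslash;derbund")

print(literal, numeric, named, from_html)
```

Here the literal and numeric forms both parse to the same string, and the
named form is rejected because no DTD declares it.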

On Wed, Dec 9, 2015 at 9:06 AM, Novakovic, Julia <
jNovakovic at museumofplay.org> wrote:

> Hi Brian,
>
>
>
> The quotation marks worked fine, but ø still reads as  ø !
> Screenshot attached.
>
>
>
> [Our Director of IT will confirm for me if our XML files are encoded as
> UTF-8.]
>
>
>
> Thanks!
>
>
>
> --Julia
>
>
>
>
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:
> archivesspace_users_group-bounces at lyralists.lyrasis.org] *On Behalf Of *Brian
> Hoffman
> *Sent:* Tuesday, December 08, 2015 5:15 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] ampersand issue
>
>
>
> Hi Chris,
>
>
>
> You are right - I tested two records with these titles:
>
>
>
>  "AmpTest & <emph>a</emph>"
>
>  "AmpTest &amp; <emph>a</emph>"
>
>
>
> they *both* export identically as:
>
>
>
> <unittitle>AmpTest &amp; <emph>a</emph></unittitle>
>
>
>
> I also tried a similar experiment with the &lt; entity and got different
> results:
>
>
>
> "AmpTest < A"
>
> exports as
>
> <unittitle>AmpTest < A</unittitle>
>
> which is invalid XML.
>
>
>
> "AmpTest &lt; A"
>
> exports as
>
> <unittitle>AmpTest &amp;lt; A </unittitle>
>
> which is probably going to seem wrong to any user who would try to do it
> this way.
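
The two outcomes described above (a raw "<" written out unescaped, and an
already-escaped "&lt;" escaped a second time) can be reproduced with a
short sketch. It uses Python's standard library only; wrap_raw,
wrap_escaped and is_well_formed are hypothetical helpers for illustration
and say nothing about how the ArchivesSpace exporter is implemented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_raw(title):
    """Build a unittitle by plain concatenation, with no escaping."""
    return "<unittitle>" + title + "</unittitle>"


def wrap_escaped(title):
    """Build a unittitle after escaping &, < and > in the title."""
    return "<unittitle>" + escape(title) + "</unittitle>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw_export = wrap_raw("AmpTest < A")             # bare "<" in text
safe_export = wrap_escaped("AmpTest < A")        # "<" becomes "&lt;"
double_export = wrap_escaped("AmpTest &lt; A")   # "&" becomes "&amp;"

print(raw_export, is_well_formed(raw_export))
print(safe_export, is_well_formed(safe_export))
print(double_export, ET.fromstring(double_export).text)
```

The double-escaped export is well-formed, but a reader of the EAD sees the
literal text "&lt;" rather than a less-than sign.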
>
>
>
> I’m wondering whether most users intend to key in actual XML mixed content
> or just text with inline markup that corresponds to EAD.
>
>
>
> Julia, what happens if you cut and paste this text “øøøøøø” into a new
> resource record title and save it?
>
>
>
> Brian
>
> On Dec 8, 2015, at 4:14 PM, Chris Fitzpatrick <
> Chris.Fitzpatrick at lyrasis.org> wrote:
>
>
>
>
> Hi Brian,
>
>
>
> Hm, trying to think what the issues are with having "AmpTest &
> <emph>a</emph>" stored in the DB?
>
> The exporter converts the & into &amp;, but you're thinking this would be
> an import problem?
>
>
>
> The other option, I think, would be to add things to the MixedContent
> parser, which turns all the wonderful EAD "mixed content" into actual HTML.
>
>
>
>
> Julia:
>
>
>
> Do you know if your XML files are saved as UTF-8? I wonder if you have an
> encoding issue that might be causing this.
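
For what an encoding mismatch of this kind typically looks like, here is a
small sketch (standard-library Python; the sample title is taken from
Julia's message, and the choice of cp1252 and Latin-1 as the "wrong"
encodings is an assumption for illustration, not a diagnosis of her files).

```python
title = "Br\u00f8derbund"

# Case 1: the file is UTF-8 but is read as Windows-1252.
utf8_bytes = title.encode("utf-8")
mojibake = utf8_bytes.decode("cp1252")

# Case 2: the file is Latin-1 but is read as UTF-8.
latin1_bytes = title.encode("latin-1")
replaced = latin1_bytes.decode("utf-8", errors="replace")

# Reading the bytes with the encoding they were written in round-trips.
round_trip = utf8_bytes.decode("utf-8")

print(mojibake)    # two characters appear where the single one was
print(replaced)    # the byte is swapped for U+FFFD, the replacement character
print(round_trip)
```

If the title shows up as two odd characters or as a replacement mark, the
file's actual encoding and its declared encoding probably disagree.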
>
>
>
> best, Chris.
>
>
>
> Chris Fitzpatrick | Developer, ArchivesSpace
> Skype: chrisfitzpat  | Phone: 918.236.6048
> http://archivesspace.org/
>
>
> ------------------------------
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Galligan, Patrick <PGalligan at rockarch.org>
> *Sent:* Tuesday, December 8, 2015 10:07 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] ampersand issue
>
>
>
> I also didn’t have any issues with quotation marks. Maybe they were smart
> quotes or something?
>
>
>
> Patrick Galligan
>
> Rockefeller Archive Center
>
> Assistant Digital Archivist
>
> 914-366-6386
>
>
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org [
> mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org
> <archivesspace_users_group-bounces at lyralists.lyrasis.org>] *On Behalf Of *Brian
> Hoffman
> *Sent:* Tuesday, December 08, 2015 4:04 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] ampersand issue
>
>
>
> Julia,
>
>
>
> I’m not having any trouble saving a title with ‘ø’ (see screenshot). Are
> you on a windows machine?
>
>
>
>
>
> Brian
>
>
>
> <image001.png>
>
>
>
> On Dec 8, 2015, at 3:50 PM, Novakovic, Julia <jNovakovic at museumofplay.org>
> wrote:
>
>
>
> Similarly, we have found that special characters like ø or quotation marks
> in the title field do not appear correctly, while they appear fine in other
> notes fields throughout imported collections. I have had to go through
> manually and change the characters to something that closely resembles the
> actual characters we want. [For example, Brøderbund to Broderbund … Gerald
> A. (“Jerry”) Lawson papers to Gerald A. (‘Jerry’) Lawson papers.] I would
> also appreciate clarification like Brian has outlined below.
>
>
>
> Thanks!
>
> --Julia
>
>
>
>
>
> Julia Novakovic
>
> Archivist
>
> Associate Editor, *American Journal of Play*
>
> *The Strong*
>
> One Manhattan Square
>
> Rochester, NY 14607 U.S.A.
>
> Tel 585-410-6307
>
> Fax 585-423-1886
>
> jnovakovic at museumofplay.org
>
> www.museumofplay.org
>
>
>
> The Strong is home to:
>
> International Center for the History of Electronic Games | National Toy
> Hall of Fame | World Video Game Hall of Fame
> Brian Sutton-Smith Library and Archives of Play | Woodbury School | *American
> Journal of Play*
>
>
>
>
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org [
> mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org
> <archivesspace_users_group-bounces at lyralists.lyrasis.org>] *On Behalf Of *Brian
> Hoffman
> *Sent:* Tuesday, December 08, 2015 3:23 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] ampersand issue
>
>
>
> Currently unittitle gets imported as-is to the title field of resources
> and components. So when you import the example you get two records whose
> title field is "AmpTest &amp; A"
>
>
>
> We could change this so that you get "AmpTest & A" instead, but what
> happens when a user imports a unittitle like: "AmpTest &amp;
> <emph>A</emph>"? We can’t unescape the &amp; entity without leaving the XML
> zone, so what happens to the <emph> tag and content?
>
>
>
> To me this is part of a larger still unresolved question of what exactly
> the data type of ASpace text field is supposed to be. Is it XML? If so, I
> think we should be assuming that archivists are managing XML data here, and
> it should be ‘&amp;’. If it isn’t, and yet we still want to support inline
> styling, we need to come up with a set of rules for what kind of inline
> pseudo-markup is allowed and for how it maps to EAD on export.
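
To make the two readings of the title field concrete, here is a rough
sketch (standard-library Python). It treats the same unittitle content
either as plain text, where entities are resolved and inline tags are
lost, or as XML, where both are kept. parse_fragment, as_plain_text and
as_xml are hypothetical names; this is one possible rule set, not what
ArchivesSpace does.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment):
    """Parse mixed content by wrapping it in a unittitle element."""
    return ET.fromstring("<unittitle>" + fragment + "</unittitle>")


def as_plain_text(fragment):
    """Plain-text reading: entities are resolved and inline tags dropped."""
    return "".join(parse_fragment(fragment).itertext())


def as_xml(fragment):
    """XML reading: inline tags stay, so the ampersand stays escaped."""
    return ET.tostring(parse_fragment(fragment), encoding="unicode")


content = "AmpTest &amp; <emph>A</emph>"
plain = as_plain_text(content)
xml_form = as_xml(content)

print(plain)
print(xml_form)
```

The plain-text form displays cleanly but cannot round-trip the <emph>
markup, which is the trade-off raised above.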
>
>
>
> Brian
>
> On Dec 8, 2015, at 1:50 PM, Chris Fitzpatrick <
> Chris.Fitzpatrick at lyrasis.org> wrote:
>
>
>
>
> Hi,
>
> I think I understand..
>
> So, the title in the imported XML is
>
> <unittitle>AmpTest &amp; A</unittitle>
>
> but you want the EAD converter to switch this to be "AmpTest & A"?
>
> If that's the case, it seems like a pretty easy thing to add to the
> converter...
>
>
>
> b,chris
>
>
>
>
>
> Chris Fitzpatrick | Developer, ArchivesSpace
> Skype: chrisfitzpat  | Phone: 918.236.6048
> http://archivesspace.org/
>
>
> ------------------------------
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org<
> archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Galligan, Patrick <PGalligan at rockarch.org>
> *Sent:* Tuesday, December 8, 2015 6:58 PM
> *To:* Archivesspace Users Group
> *Subject:* Re: [Archivesspace_Users_Group] ampersand issue
>
>
>
> Christine,
>
>
>
> I’d like to circle back to this issue.
>
>
>
> I was doing some testing with the merged pull request, and while it no
> longer just deletes the escaped character, it actually adds “&amp;” to the
> display of the title.
>
>
>
> I’ve also noticed that while it corrects the unittitle on the highest
> level, it doesn’t seem to work with series levels later.
>
>
>
> Attached is a screenshot and the EAD that I imported into AS. Has anyone
> else run into this issue? Has anyone found a viable solution so far?
>
>
>
> Patrick Galligan
>
> Rockefeller Archive Center
>
> Assistant Digital Archivist
>
> 914-366-6386
>
>
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org[
> mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org
> <archivesspace_users_group-bounces at lyralists.lyrasis.org>] *On Behalf Of *Christine
> Di Bella
> *Sent:* Thursday, December 03, 2015 10:10 AM
> *To:* Archivesspace Users Group
> *Subject:* [Archivesspace_Users_Group] FW: ampersand issue
>
>
>
> Forwarded for Matt Francis.
>
>
>
> (I believe there has been some work on this issue very recently by the
> University of Michigan folks. See this thread on Github -
> https://github.com/archivesspace/archivesspace/issues/332 - and the
> associated merged pull request. – Christine)
>
> <https://github.com/archivesspace/archivesspace/issues/332>
>
> EAD Import - Problem with Escaped Characters · Issue #332 ·
> archivesspace/archivesspace
>
> It looks like there is an issue with certain escaped characters (& and <)
> getting dropped in at least and tags. What we suspect is happening is that
> escaped characters are b...
>
>
>
>
> *From:* MATTHEW R FRANCIS [mailto:mrf22 at psu.edu <mrf22 at psu.edu>]
> *Sent:* Wednesday, December 2, 2015 3:20 PM
> *To:* archivesspace users group-bounces <
> archivesspace_users_group-bounces at lyralists.lyrasis.org>
> *Subject:* ampersand issue
>
>
>
> All,
>
>
>
> We are currently on ASpace v1.4.1 and recently observed an issue when
> we try to import EAD files that contain ampersands in the XML. We are now
> curious whether others have experienced this and/or whether anyone knows a
> fix for the issue.
>
>
>
> Currently, when we import files with an ampersand coded as "&amp;", as seen in:
>
>
>
> <image001.png>
>
>
>
>
>
> The "&amp;" does not appear to be rendered in ASpace in any form, as seen
> in:
>
>
>
>
>
> <image002.png>
>
>
>
> In looking through JIRA it does not appear that this specific
> issue/behavior has been reported, but before reporting it as a bug we were
> hoping to determine if this was a universal issue, or perhaps just local.
>
>
>
> Thanks for the help and feedback.
>
>
>
> -Matt
>
>
>
> *Matt** Francis*
>
> Archivist for Collection Management
>
> Special Collections Library
> Penn State University
>
> _______________________________________________
> Archivesspace_Users_Group mailing list
> Archivesspace_Users_Group at lyralists.lyrasis.org
> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>
>