[Archivesspace_Users_Group] ampersand issue with PDF button in 2.1.2 public interface
Custer, Mark
mark.custer at yale.edu
Fri Sep 22 09:41:31 EDT 2017
Dave, Benn:
Another important point here is that the ASpace staff interface attempts to handle both “ & ” and “ &amp; ”. Those spaces are (quite unfortunately) important here. &amp; is a special HTML and XML entity reference for the ampersand character. A nice, short overview is provided here: https://mrcoles.com/blog/how-use-amersands-html-encode/
Benn, as you’ve discovered, when there aren’t any spaces, it looks like those attempts to handle both types of references go out the window. So, AT&T, b&w, &c., &c., &c., cause problems. The really problematic part is that the problems and the “solutions” vary depending on whether you’re exporting data, storing it in titles vs. notes, displaying it in the PUI, creating a PDF, &c. I just recently discovered another wrinkle that puts us into a real catch-22. Here it is:
We’ve got a note with “&c.” in ArchivesSpace. It would be great if that could be handled okay everywhere, whether ArchivesSpace forced us to use &c OR &amp;c (and I don’t care which way!), but as Dave explains, there aren’t clear distinctions for ASpace to make in every situation right now.
With that “&c” in the note, here’s what I wind up with:
* The PUI displays the note correctly. Yay!
* The EAD exporter won’t wrap paragraph elements around the note, which results in invalid EAD. Boo.
If I edit that note to instead be “&amp;c”, here’s what I wind up with:
* The PUI displays everything in the note up until that character. Everything else in the note silently falls away into oblivion (but surely this would be an easy bug fix, I hope???). Boo.
* The EAD exporter works perfectly. Yay!
Dave, since you included the line about being on the Core Committer’s Group, does that mean that this issue will be discussed soon? ☺ (I know that it wouldn’t be an easy thing to tackle, but it would make a lot of metadata folks out there pretty happy!)
Mark
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Mayo, Dave
Sent: Friday, 22 September, 2017 8:49 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] ampersand issue with PDF button in 2.1.2 public interface
Hi Benn,
This is a recurring issue I hit over both Harvard and Smith’s collections – it’s a consequence of ASpace not really having a distinction between mixed content and plaintext content.
Unfortunately, there isn’t really a good solution. The best solution as far as I’ve been able to figure is to use the HTML/XML entity for ampersand (&amp;) wherever it appears in a context that’s treated by the interface/etc as markup; title fields _definitely_ fall under that category. There’s unfortunately no reliable guide to what fields are “mixed content” and what fields are “plaintext content” because, well, the underlying system doesn’t track that distinction – it’s up to how the fields are eventually displayed/used to build exports/etc.
As to _how_ to fix it – well, it depends somewhat on whether you can be ABSOLUTELY SURE you don’t have any HTML/XML entities in your title fields. If you are ABSOLUTELY SURE of this, you should be able to make the change via API or on the SQL level, but if you DO have entities, it gets a lot harder, to the point where manual review is probably appropriate.
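To make the distinction above concrete, here is a minimal sketch in Python of the kind of cleanup pass one might run over title strings before writing them back via the API or SQL. It is not ASpace code; the function name and the regular expression are assumptions made up for illustration. It escapes only those ampersands that are not already the start of something shaped like an entity or character reference.

```python
import re

# An "&" is treated as bare unless it is immediately followed by something
# shaped like a reference: a name (&amp;), a decimal code (&#38;), or a
# hex code (&#x26;), each terminated by ";".
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Replace bare ampersands with &amp;, leaving existing references alone.

    Hypothetical helper for illustration; not part of ArchivesSpace.
    """
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    for title in ["AT&T", "b&w", "Tom &amp; Jerry", "R&D; notes"]:
        print(repr(title), "->", repr(escape_bare_ampersands(title)))
```

The last sample shows why manual review can still be needed: "R&D; notes" contains a plain ampersand, but "&D;" has the shape of an entity reference, so a pattern like this leaves it untouched. If you cannot rule out cases like that in your data, an automated pass is not safe on its own.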
- Dave Mayo
ASpace Core Committer’s Group Member
From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Benn Joseph <benn.joseph at northwestern.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Date: Thursday, September 21, 2017 at 4:21 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] ampersand issue with PDF button in 2.1.2 public interface
Hi all,
We've encountered an issue with the v2.1.2 Print-to-PDF button in the public interface--apparently for any resource record with an ampersand that is followed immediately by another character that is not a space (e.g. "b&w" or "AT&T"), the ampersand is misinterpreted and causes the Print-to-PDF button to fail with an error. For me, that error is just "something went wrong", but the log shows this (when it gets tripped up on "b&w"):
RuntimeError (Failed to clean XML: The reference to entity "w" must end with the ';' delimiter.):
So we're guessing ArchivesSpace is thinking "&w" should be "&w;", and so forth for any other string of text with an ampersand. I checked this by going into a record that wouldn’t print and changing the lone suspect ampersand (“AT&T” to “AT and T”) and the PDF generated just fine.
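The same class of failure can be reproduced outside ArchivesSpace with any strict XML parser. The following is a minimal sketch using Python's standard library, not the code path the PDF button actually uses; the helper name and the wrapping <title> element are assumptions for illustration, and Python's parser words its error differently from the log line above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the text parses as XML content inside a wrapper element.

    Hypothetical helper for illustration; the <title> wrapper is arbitrary.
    """
    try:
        ET.fromstring(f"<title>{fragment}</title>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for text in ["b&w", "AT&T", "AT and T", "b&amp;w"]:
        print(repr(text), "well-formed:", is_well_formed(text))
```

A raw ampersand in "b&w" or "AT&T" makes the fragment ill-formed, while "AT and T" and the escaped "b&amp;w" both parse, which matches what happened when the ampersand was edited out of the record.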
This doesn't impact being able to just view resource records in the public interface, it's just the PDF function that isn't working. It's a problem, though, because we want to be able to use that PDF functionality but we also have a lot of ampersands in our resource records! Has anyone else experienced this issue or possibly come up with a fix?
Thanks,
--Benn
Benn Joseph
Head of Archival Processing
Northwestern University Libraries
Northwestern University
www.library.northwestern.edu
benn.joseph at northwestern.edu
847.467.6581