[Archivesspace_Users_Group] ampersand issue with PDF button in 2.1.2 public interface

Mayo, Dave dave_mayo at harvard.edu
Fri Sep 22 08:48:51 EDT 2017


Hi Benn,

This is a recurring issue I hit over both Harvard and Smith’s collections – it’s a consequence of ASpace not really having a distinction between mixed content and plaintext content.

Unfortunately, there isn’t really a good solution.  The best approach, as far as I’ve been able to figure, is to use the HTML/XML entity for ampersand (&amp;) wherever it appears in a context that’s treated by the interface/etc. as markup; title fields _definitely_ fall under that category.  There’s unfortunately no reliable guide to which fields are “mixed content” and which are “plaintext content” because, well, the underlying system doesn’t track that distinction – it’s up to how the fields are eventually displayed/used to build exports/etc.

As to _how_ to fix it – well, it depends somewhat on whether you can be ABSOLUTELY SURE you don’t have any HTML/XML entities in your title fields.  If you are ABSOLUTELY SURE of this, you should be able to make the change via API or on the SQL level, but if you DO have entities, it gets a lot harder, to the point where manual review is probably appropriate.
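
For illustration only, here is a rough Python sketch of the text transformation involved. It assumes you have already pulled the title strings out by whatever means you prefer; it does not touch the ArchivesSpace API or database, and the names escape_bare_ampersands and has_entity_like_text are made up for this example. It escapes only ampersands that do not already look like the start of an entity, and it flags any string containing entity-like text so a person can review it.

```python
import re

# An ampersand that does NOT already begin something shaped like an entity
# reference (&amp;, &#38;, &#x26;). Hypothetical helper, not part of ArchivesSpace.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

# Anything shaped like an entity reference, whether or not it was meant as one.
ENTITY_LIKE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def escape_bare_ampersands(text):
    """Replace bare ampersands with &amp;, leaving entity-like text untouched."""
    return BARE_AMPERSAND.sub("&amp;", text)


def has_entity_like_text(text):
    """Return True if the string contains something that looks like an entity.

    Such strings are ambiguous (e.g. "R&D; budgets") and are candidates
    for manual review rather than automatic rewriting.
    """
    return ENTITY_LIKE.search(text) is not None


if __name__ == "__main__":
    for title in ["b&w photographs", "AT&T records", "Smith &amp; Sons", "R&D; budgets"]:
        print(repr(title), "->", repr(escape_bare_ampersands(title)),
              "| review:", has_entity_like_text(title))
```

Note that a string like "R&D; budgets" is left alone and flagged, because the pattern cannot tell it apart from a real entity; that ambiguity is why manual review is probably appropriate if entities might be present.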

- Dave Mayo
ASpace Core Committer’s Group Member
From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Benn Joseph <benn.joseph at northwestern.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Date: Thursday, September 21, 2017 at 4:21 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] ampersand issue with PDF button in 2.1.2 public interface


Hi all,

We've encountered an issue with the v2.1.2 Print-to-PDF button in the public interface--apparently for any resource record with an ampersand that is followed immediately by another character that is not a space (e.g. "b&w" or "AT&T"), the ampersand is misinterpreted and causes the Print-to-PDF button to fail with an error. For me, that error is just "something went wrong", but the log shows this (when it gets tripped up on "b&w"):



RuntimeError (Failed to clean XML: The reference to entity "w" must end with the ';' delimiter.):



So we're guessing ArchivesSpace is thinking "&w" should be "&w;", and so forth for any other string of text with an ampersand. I checked this by going into a record that wouldn’t print and changing the lone suspect ampersand (“AT&T” to “AT and T”) and the PDF generated just fine.
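
As a small illustration of that guess, the sketch below uses Python's standard-library XML parser, which is a different parser from whatever the PDF cleanup step uses, so the error wording will not match the log above. The helper name is_well_formed and the wrapping title element are assumptions made for the example. It shows only that a strict XML parser rejects "b&w" and accepts the same text once the ampersand is written as an entity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML content inside a wrapper element."""
    try:
        # Wrap the text in a hypothetical <title> element so it forms a document.
        ET.fromstring("<title>" + fragment + "</title>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for text in ["b&w", "AT&T", "b&amp;w", "AT and T"]:
        print(repr(text), "parses:", is_well_formed(text))
```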



This doesn't impact being able to just view resource records in the public interface, it's just the PDF function that isn't working. It's a problem, though, because we want to be able to use that PDF functionality but we also have a lot of ampersands in our resource records! Has anyone else experienced this issue or possibly come up with a fix?



Thanks,

--Benn

Benn Joseph
Head of Archival Processing
Northwestern University Libraries
Northwestern University
www.library.northwestern.edu
benn.joseph at northwestern.edu
847.467.6581
