[Archivesspace_Users_Group] Print to PDF job failing

Zachary L Pelli zachary.pelli at shu.edu
Mon Nov 26 15:26:15 EST 2018


Mark,

Regarding my instance in particular, would you recommend running that SQL block? I do not see a way to edit the XML from within AS itself, so I think that leaves either the MySQL database or the API.

Regards,
Zach

-----Original Message-----
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> On Behalf Of Custer, Mark
Sent: Monday, November 26, 2018 2:59 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

Lydia, all:

I don't know if it's a bug with the AT migrator, but I raised this issue in 2014 when providing a lot (!) of feedback about the AT migration process.  There are a couple of complicated issues, namely:

All of those xlink namespace prefixes are treated and stored as text, not as XML (in both the AT and ASpace).  So, it can be a bit dangerous and inefficient to change them in bulk.  I had hoped that the migration tool would address that issue, but the decision at the time was not to have the migration tool do this.  Because of that, we decided to address this prior to our migration with the following (albeit inelegant) SQL update:  https://github.com/YaleArchivesSpace/migrationSQL/blob/master/AllDatabasesPreMigration.sql#L1-L26  (that largely did the trick, if I recall correctly).

All of those @target attributes are also stored as text.  That was a lot messier, so the AT migration tool eventually added the ability to keep the ID values as is (e.g. "ref33" ==> "ref33") during the migration process, but you had to pass that argument (with the "-refid_original" flag) to the migration tool to keep those values as is.

Anyhow, given the complexities of the migrations, it's hard to say what's a bug and what's behaving as expected (at least it's hard for me to remember everything involved).  That said, Maureen did an AMAZING job of writing up our steps:  https://campuspress.yale.edu/yalearchivesspace/2015/06/14/migration-step-by-step/ .  If anything in there is something that can be fixed with the AT migration tool, I'd be happy to provide more feedback on those tickets.  That said, I'm honestly not sure how best to capture those there at this point (and I assumed that all of those issues were recorded in a previous bug/feature tracking system used by ASpace).

Mark



-----Original Message-----
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Tang, Lydia
Sent: Monday, 26 November, 2018 1:23 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

Hi Zach, Ed, and Mark,
It sounds like this is a bug with the AT Migrator.  Would you be willing to create a bug report for this in JIRA?  On a quick search, it didn’t seem like this issue has been logged yet.
https://archivesspace.atlassian.net/secure/Dashboard.jspa
It would be great if you could include in the tags “ATMigrator.”  Thanks!
Lydia
-on behalf of Dev. Pri.

From: <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of "Custer, Mark" <mark.custer at yale.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Date: Monday, November 26, 2018 at 12:36 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

Removing the ns2: prefix altogether should work, as should updating those ns2: prefixes to be xlink: prefixes (but it’s not good to have that in the text in the first place, so I wouldn’t advise that).  I’m not sure why ASpace’s normal EAD cleaning process isn’t changing those prefixes to xlink prefixes during the export, though.  Another issue (seeing that this came from the AT) is that the @target attributes in the EAD file don’t match any @id attributes (e.g. “ref16” does not equal “aspace_ref16_p33”).  This won’t prevent the file from converting to a PDF, but it’s still a broken link.
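
[Editor's sketch, not part of Mark's message] One way to list the broken links described above is to compare every @target value against the set of @id values in the exported EAD. This is only a minimal sketch: the function name and the tiny sample document are hypothetical, and it assumes the file already parses (i.e. the ns2 problem has been dealt with first).

```python
import xml.etree.ElementTree as ET


def find_broken_targets(ead_xml: str) -> list[str]:
    """Return the @target values that match no @id anywhere in the document."""
    root = ET.fromstring(ead_xml)
    # Unprefixed attributes carry no namespace, so plain names work here.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    targets = {el.get("target") for el in root.iter() if el.get("target") is not None}
    return sorted(targets - ids)


# Hypothetical two-component sample, using the id/target values from the thread.
sample = (
    '<ead xmlns="urn:isbn:1-931666-22-9">'
    '<c id="aspace_ref16_p33"><ref target="ref16">See also</ref></c>'
    '<c id="aspace_ref20"><ref target="aspace_ref20">Self</ref></c>'
    '</ead>'
)

print(find_broken_targets(sample))  # ['ref16']
```

Running this over a real export would give a worklist of targets to repair, whichever way the repair ends up being done.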

All that said, if there’s no need to have a valid EAD file, then probably the easiest way to fix the PDF conversion issue (and something that ASpace could handle easily) would be to add a second namespace prefix for the xlink attributes.  So, just changing this:

<ead xmlns="urn:isbn:1-931666-22-9" xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">

To this:

<ead xmlns="urn:isbn:1-931666-22-9" xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:ns2="http://www.w3.org/1999/xlink"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">

Given the broken links, though, which were also caused by the AT to ASpace migration (and the fact that ASpace prepends “aspace_” during its EAD export process), I’d say that some data updates would have to happen at some point, though.  But the above trick should allow you to create the PDF for this file with the least amount of editing.
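
[Editor's sketch, not part of Mark's message] The reason the second declaration works is that a namespace-aware parser ignores the prefix itself and only looks at the URI it is bound to. The short Python example below illustrates this with the standard library parser (expat), which is an assumption on my part: ASpace uses Saxon, but the namespace rules are the same. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Both prefixes are bound to the same URI, as in the edited <ead> start tag.
doc = f'''<ead xmlns="urn:isbn:1-931666-22-9"
  xmlns:xlink="{XLINK}"
  xmlns:ns2="{XLINK}">
  <ref xlink:actuate="onrequest">declared prefix</ref>
  <ref ns2:actuate="onrequest">migrated prefix</ref>
</ead>'''

root = ET.fromstring(doc)
first, second = list(root)

# After parsing, the two attributes have the same expanded name.
print(first.attrib == second.attrib)        # True
print(second.get(f"{{{XLINK}}}actuate"))    # onrequest
```

So once ns2 is declared, the hard-coded ns2:actuate attributes are read as ordinary xlink attributes.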



From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Busch, Ed
Sent: Monday, 26 November, 2018 11:58 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

I just remove ns2: from ns2:actuate.

Ed

From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Zachary L Pelli
Sent: Monday, November 26, 2018 11:51 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

Thanks for the reply, guys. I have attached the EAD file. This is indeed likely a carry-over from AT. I do not see a namespace declaration in the file, but I do see the ns2 prefixes. So would the solution be to eliminate the ns2 prefixes?

Regards,
Zach

From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Custer, Mark
Sent: Tuesday, November 20, 2018 3:43 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

Zach,

I’d suggest exporting the EAD file and taking a look at the file that way (and I’d be happy to take a look if you can send it to me).

Those “ns2” namespace prefixes are likely from Archivist’s Toolkit, which at some point started appending that prefix for the xlink namespace (which is fine, but everything that looks like XML in the AT and ASpace is treated as text, and things can get messy when namespace prefixes are hardcoded in that text!).  ArchivesSpace has a process to clean the XML upon export, which generally fixes a lot of those hard-coded namespace prefixes, but I’m honestly not sure why you’d be getting that error without seeing the entire EAD file, since you shouldn’t even need a valid EAD file for the PDF process to potentially still work.  There are other reasons why the PDF file might not be created, but I can’t think of why that type of invalidity would cause it to fail on its own.  Anyhow, the issue that’s being reported is that the EAD file that ASpace will produce for this record will have the following at the top of the file:

xmlns:xlink="http://www.w3.org/1999/xlink"

Whereas what you have further down in your file on that ref tag is “ns2:”, and if you export that same file from the AT (or whatever else caused the ns2 prefix to get in there), you’d see this at the top of the file:

xmlns:ns2="http://www.w3.org/1999/xlink"

And “xlink” does not equal “ns2”, even though they’re both trying to stand in for the same namespace.

All that said, the only way that I can think to troubleshoot the issue is investigating the EAD file itself, since the process for creating a PDF from the staff interface is 1) export the EAD, then 2) convert that EAD into a PDF file.
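
[Editor's sketch, not part of Mark's message] Since the PDF job is export first, then convert, a quick check of the exported file outside ASpace can show where the parse fails. The sketch below uses Python's standard library parser rather than Saxon, so the wording of the message will differ from the job log; the function name and sample text are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_parse_error(xml_text: str):
    """Return (line, column, message) for the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position reported by the parser
        return line, column, str(err)
    return None


# Hypothetical export: xlink is declared, but the ref uses the undeclared ns2 prefix.
sample = (
    '<ead xmlns="urn:isbn:1-931666-22-9" xmlns:xlink="http://www.w3.org/1999/xlink">\n'
    '  <ref ns2:actuate="onrequest">The Alert</ref>\n'
    '</ead>'
)

print(first_parse_error(sample))
```

For a real export, the text could be read from the downloaded file and passed to the same function.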

Mark




From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Busch, Ed
Sent: Tuesday, 20 November, 2018 2:59 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Print to PDF job failing

The collection probably has a component title of the form <title ns2:actuate="onrequest" render="italic">The Alert</title>. These will generate an error when creating a finding aid in PDF format. The ns2: prefix inside the title tag should be removed. For the example above, that gives <title actuate="onrequest" render="italic">The Alert</title>.

So, you get to figure out which one it is. If you have access to the backend DB, you can probably come up with a query to find it. Or you can go through your Resource component lines looking for it.
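
[Editor's sketch, not part of Ed's message] If the component text can be pulled out by some means (a database query, the API, or an export), a small script can flag the records that contain the prefix and show what the cleaned text would look like. This is a rough sketch only: the `notes` dictionary and its keys are hypothetical stand-ins for however the text is retrieved, and nothing here writes back to ASpace.

```python
import re

# Matches the ns2: prefix only where it is followed by an attribute name and "=",
# e.g. ns2:actuate="onrequest". It does not touch an xmlns:ns2 declaration.
NS2_ATTR = re.compile(r'\bns2:(?=[A-Za-z_][\w.-]*\s*=)')


def find_ns2(notes: dict[str, str]) -> list[str]:
    """Return the keys of the records whose text contains an ns2: attribute."""
    return [key for key, text in notes.items() if NS2_ATTR.search(text)]


def strip_ns2(text: str) -> str:
    """Drop the ns2: prefix from attribute names, leaving the attribute itself."""
    return NS2_ATTR.sub("", text)


# Hypothetical records keyed by an arbitrary identifier.
notes = {
    "component-1": '<title ns2:actuate="onrequest" render="italic">The Alert</title>',
    "component-2": '<title render="italic">Another title</title>',
}

print(find_ns2(notes))                   # ['component-1']
print(strip_ns2(notes["component-1"]))
```

Reviewing the flagged records by hand before changing anything is probably wise, since the prefixes are stored as plain text.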

Good luck!

Ed Busch, MLIS
Electronic Records Archivist
Michigan State University Archives
Conrad Hall
943 Conrad Road, Room 101
East Lansing, MI 48824
517-884-6438
buschedw at msu.edu




From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Zachary L Pelli
Sent: Tuesday, November 20, 2018 2:52 PM
To: archivesspace_users_group at lyralists.lyrasis.org
Subject: [Archivesspace_Users_Group] Print to PDF job failing

Hey all,

One of our archivists is having an issue with a Print to PDF job with a particular collection (other large collections work fine).

In the log within AS Background Jobs section, it gives this:

Generating PDF for John M. Oesterreicher papers org.xml.sax.SAXParseException; lineNumber: 28; columnNumber: 76; The prefix "ns2" for attribute "ns2:actuate" associated with an element type "ref" is not bound.
net.sf.saxon.s9api.DocumentBuilder.build(net/sf/saxon/s9api/DocumentBuilder.java:379)
java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:453)
org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:314)
RUBY.parse(/var/local/archivesspace/archivesspace250/gems/gems/saxon-xslt-0.8.2.1-java/lib/saxon/xml.rb:28)
RUBY.XML(/var/local/archivesspace/archivesspace250/gems/gems/saxon-xslt-0.8.2.1-java/lib/saxon/processor.rb:58)
RUBY.XML(/var/local/archivesspace/archivesspace250/gems/gems/saxon-xslt-0.8.2.1-java/lib/saxon/xml.rb:10)
RUBY.to_fo(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/AS_fop.rb:32)
RUBY.to_pdf(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/AS_fop.rb:38)
RUBY.block in run(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/job_runners/print_to_pdf_runner.rb:39)
var.local.archivesspace.archivesspace250.data.tmp.jetty_minus_0_dot_0_dot_0_dot_0_minus_8089_minus_backend_dot_war_minus___minus_any_minus_.webapp.WEB_minus_INF.app.lib.request_context.open(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/request_context.rb:24)
RUBY.run(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/job_runners/print_to_pdf_runner.rb:13)
var.local.archivesspace.archivesspace250.data.tmp.jetty_minus_0_dot_0_dot_0_dot_0_minus_8089_minus_backend_dot_war_minus___minus_any_minus_.webapp.WEB_minus_INF.app.lib.background_job_queue.invokeOther43:run(var/local/archivesspace/archivesspace250/data/tmp/jetty_minus_0_dot_0_dot_0_dot_0_minus_8089_minus_backend_dot_war_minus___minus_any_minus_/webapp/WEB_minus_INF/app/lib//var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/background_job_queue.rb:126)
var.local.archivesspace.archivesspace250.data.tmp.jetty_minus_0_dot_0_dot_0_dot_0_minus_8089_minus_backend_dot_war_minus___minus_any_minus_.webapp.WEB_minus_INF.app.lib.background_job_queue.run_pending_job(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/background_job_queue.rb:126)
RUBY.block in start_background_thread(/var/local/archivesspace/archivesspace250/data/tmp/jetty-0.0.0.0-8089-backend.war-_-any-/webapp/WEB-INF/app/lib/background_job_queue.rb:169)
org.jruby.RubyProc.call(org/jruby/RubyProc.java:289)
org.jruby.RubyProc.call(org/jruby/RubyProc.java:246)
java.lang.Thread.run(java/lang/Thread.java:748)

Has anyone encountered this problem before?

Regards,

Zach Pelli
Digital Collections Infrastructure Developer Seton Hall University Libraries
973.275.2046

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group

