[Archivesspace_Users_Group] Question about OAI harvest of MARCXML records

Andy Boze Boze.1 at nd.edu
Fri Mar 10 14:45:15 EST 2023


Hi, all.

Before I get to the question, let me give some background. We've been 
successfully harvesting EAD records from ASpace. We're currently running 
v2.8.1, and when we test the harvest on v3.3 we consistently run into 
timeout problems: sometimes ASpace simply stops responding, and other 
times it returns an error. Some of our records are very large, but this 
happens even when we request relatively small records.

As a work-around, we wanted to try harvesting records in MARCXML format. 
It doesn't provide all of the data that are included in the EAD record, 
but it's good enough for our purposes.

The problem we have with harvesting records in MARCXML format is that 
ASpace returns not only resource records (the only records returned 
when we request EAD) but also records for archival objects, which we 
don't want. That is, we want records with an identifier of

<identifier>oai:und//repositories/2/resources/1301</identifier>

but not

<identifier>oai:und//repositories/2/archival_objects/673199</identifier>
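
To make the distinction concrete, here is a minimal sketch of a 
client-side filter that keeps only the resource records from a 
harvested response. It assumes the response uses the standard OAI-PMH 
2.0 namespace and that identifiers follow the two patterns shown above. 
The function name resource_records is hypothetical, and this is 
post-processing of an already harvested response, not a way to make 
ASpace return fewer records.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace (assumed to be what the server emits).
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def resource_records(oai_xml):
    """Return the <record> elements whose identifier points at a resource.

    Hypothetical helper: it relies only on the identifier patterns quoted
    in this message ("/resources/" versus "/archival_objects/").
    """
    root = ET.fromstring(oai_xml)
    kept = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=OAI_NS
        )
        # Keep resources; archival objects and anything else are skipped.
        if "/resources/" in identifier:
            kept.append(record)
    return kept
```

Records for archival objects are simply dropped, so the approach only 
helps if the harvested XML can be processed before it is loaded 
anywhere else.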

When I add set=fonds to the OAI URL, I do get just resources (plus 
deleted records), which is pretty much what I would expect, but not all 
of our resources are fonds. When I add set=collections, I start getting 
archival objects as well as resources. And without specifying a set, I 
get a mix of resources and archival objects. (Our harvester also doesn't 
allow us to request specific records, just a set and a beginning/ending 
date.)
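
For reference, the requests described above can be sketched as follows. 
The verb and the metadataPrefix, set, from, and until parameters are 
standard OAI-PMH; the base URL is a placeholder, the oai_marc prefix is 
an assumption about what ASpace calls its MARCXML format, and the 
function name list_records_url is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None,
                     from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords URL from a set and a date range.

    Hypothetical helper mirroring what our harvester lets us configure:
    a set plus beginning and ending dates, never individual records.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


# Placeholder base URL; "oai_marc" is assumed, "fonds" is the set tried above.
url = list_records_url("https://example.org/oai", "oai_marc", set_spec="fonds")
```

Leaving set_spec out corresponds to the unrestricted harvest, which is 
the case that returns a mix of resources and archival objects.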

So, my question is: Is there a way to harvest MARCXML records only for 
resources?

I hope this makes sense. I'm not an archivist, so I hope I'm stating 
things adequately.

Andy

-- 
Andy Boze, Associate Librarian
University of Notre Dame
271H Hesburgh Library
(574) 631-8708

