<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">I believe that the ArchivesSpace OAI feed uses system_mtime for the OAI timestamps, but the dates shown in the staff interface are usually user_mtime values. (system_mtime is used because it propagates through the record hierarchy: if you change a child archival_object but not its parent, user_mtime is updated on the archival_object only, but system_mtime is updated on both.) <div class=""><br class=""></div><div class=""><br class=""></div><div class="">But if I understand what you’re saying, the timestamps differ when using the harvester compared to, for example, what you see using the OAI sample form. (Is that correct?) </div><div class=""><br class=""></div><div class="">If that is the case, then I do think it’s an issue with the harvester. </div><div class=""><br class=""></div><div class="">We’re using this branch:</div><div class=""><a href="https://github.com/sdm7g/oai-harvest/tree/fix-pyoai" class="">https://github.com/sdm7g/oai-harvest/tree/fix-pyoai</a></div><div class=""><br class=""></div><div class="">Which is a patched version of oaiharvest <a href="https://github.com/bloomonkey/oai-harvest" class="">https://github.com/bloomonkey/oai-harvest</a> . </div><div class=""><br class=""></div><div class="">For an oai_marc payload, you may be able to use the upstream version, which you can install using ‘pip install oaiharvest’. (My fixes are needed for the EAD payload: EAD export in ArchivesSpace has more bugs in it, so I’m using the XML parser’s recover option to recover from and log parse errors that would otherwise halt the harvest before completion. I don’t think MARC payloads, being simpler, have the same issues, and I know oai_dc has no similar glitches. If you do run into parse errors with oai_marc, you can use my version with the added ‘--recover’ command-line option.
) </div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">If your OAI endpoint is public and you send me the URL, I can try my harvester and look at the results. </div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">-- Steve Majewski</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><b class="">$</b> oai-harvest -h</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">usage: oai-harvest [-h] [--db DATABASEPATH] [-p METADATAPREFIX] [-r TOKEN]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> [-f YYYY-MM-DD] [-u YYYY-MM-DD] [-s SET] [-b HH:MM HH:MM]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> [-d DIR] [--delete | --no-delete] [-l LIMIT]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> [--create-subdirs | --subdirs-on SUBDIRS]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> [--recover | --no-recover]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> provider [provider ...]</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; 
line-height: normal; min-height: 18px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">Harvest records from an OAI-PMH provider.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal; min-height: 18px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">positional arguments:</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> provider OAI-PMH Provider from which to harvest. This may be</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> the base URL of an OAI-PMH server, or the short name</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> of a registered provider. 
You may also specify "all"</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> for all registered providers.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal; min-height: 18px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">optional arguments:</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -h, --help show this help message and exit</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --db DATABASEPATH, --database DATABASEPATH</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> Path to provider registry database. 
Currently supports</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> sqlite3 only.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -p METADATAPREFIX, --metadataPrefix METADATAPREFIX</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> the metadataPrefix of the format (XML Schema) in which</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> records should be harvested.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -r TOKEN, --resume-from TOKEN</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> start at the given resumption TOKEN</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -f YYYY-MM-DD, --from YYYY-MM-DD</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> harvest only records added/modified after this date.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -u YYYY-MM-DD, --until YYYY-MM-DD</span></div><div style="margin: 0px; font-stretch: normal; font-size: 
15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> harvest only records added/modified up to this date.</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -s SET, --set SET harvest only records within this set</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -b HH:MM HH:MM, --between HH:MM HH:MM</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> harvest only between the first and the second wall</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> clock time (enables incremental harvesting)</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -d DIR, --dir DIR where to output files for harvested records. default:</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> current working path</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --delete respect the server's instructions regarding deletions,</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> i.e. 
delete the files locally (default)</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --no-delete ignore the server's instructions regarding deletions,</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> i.e. DO NOT delete the files locally</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> -l LIMIT, --limit LIMIT</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> limit the number of records to harvest from each</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> provider</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --create-subdirs create target subdirs (based on / characters in</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> identifiers) if they don't exist. 
To use something</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> other than /, use the newer--subdirs-on option</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --subdirs-on SUBDIRS create target subdirs based on occurrences of the</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> given characterin identifiers</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --recover create XMLParser with (recover=True) option: parser</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> will try to continue to parse broken XML payloads</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""> --no-recover default is --no-recover</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal; min-height: 18px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">Copyright (c) 2013, the University of Liverpool <<a href="http://www.liv.ac.uk" class="">http://www.liv.ac.uk</a>>. 
All</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">rights reserved. Distributed under the terms of the BSD 3-clause License</span></div><div style="margin: 0px; font-stretch: normal; font-size: 15px; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><<a href="http://opensource.org/licenses/BSD-3-Clause" class="">http://opensource.org/licenses/BSD-3-Clause</a>>.</span></div></div><div class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><br class=""></span></div><div class=""><br class=""></div><div class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Apr 22, 2020, at 3:06 PM, Kevin W. Schlottmann <<a href="mailto:kws2126@columbia.edu" class="">kws2126@columbia.edu</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html; charset=utf-8" class=""><div dir="ltr" class=""><div class="">Dear AS List,</div><div class=""><br class=""></div><div class="">We rely on the OAI feed to pipe updated records to various places, on a nightly basis. We recently came across some odd behavior, and we are hoping list members might have some suggestions. <br class=""></div><div class=""><br class=""></div><div class="">We have a few resource records that have been recently updated, show the correct updated time in the staff GUI, and have the correct updated time when downloaded directly using the OAI getRecord command [1].</div><div class=""><br class=""></div><div class="">However, in our bulk OAI download of all records, using pyoaiharvester [2], the record's datestamp is somehow stuck on an earlier date. 
<br class=""></div><div class=""><br class=""></div><div class="">Even stranger, if we add the 'from' parameter to [2] manually with the correct date value, we *get* the records, with the correct datestamp. <br class=""></div><div class=""><br class=""></div><div class="">We are digging into this with help from Lyrasis, but we don't have an answer yet. My guess is an issue with the harvester, but it's not immediately obvious what it would be. Other avenues we're looking at are issues with the resumption token or with the indexer (the latter often being the cause of AS issues, anecdotally). Questions for the list:</div><div class=""><br class=""></div><div class="">1) Is there anything known in the OAI implementation that might cause this off datestamp behavior? <br class=""></div><div class=""><br class=""></div><div class="">2) Since this may be an issue with the harvester, does anyone have a preferred OAI harvester that handles marcxml? </div><div class=""><br class=""></div><div class="">Best,</div><div class=""><br class=""></div><div class="">Kevin<br class=""></div><div class=""><br class=""></div><div class="">[1] getRecord command; getting it as a single record has the right datestamp:<br class="">https://{oaiendpoint}?verb=GetRecord&identifier=oai:columbia//repositories/2/resources/6381&metadataPrefix=oai_marc</div><div class=""><br class=""></div><div class="">[2] Using the pyoaiharvester library (<a href="https://github.com/vphill/pyoaiharvester" target="_blank" class="">https://github.com/vphill/pyoaiharvester</a>). <br class="">python /.../as_reports/pyoaiharvester/pyoaiharvest.py -l
{oaiendpoint}
-m oai_marc -s collection -o /.../archivesspace/oai/20200419.asRaw.xml</div><div class=""><br class=""></div><div class="">-- <br class=""><div dir="ltr" data-smartmail="gmail_signature" class="">Kevin Schlottmann<br class="">Head of Archives Processing<br class="">Rare Book & Manuscript Library<br class="">Butler Library, Room 801<br class="">Columbia University<br class="">535 W. 114th St., New York, NY 10027<br class="">(212) 854-8483</div></div></div>
_______________________________________________<br class="">Archivesspace_Users_Group mailing list<br class=""><a href="mailto:Archivesspace_Users_Group@lyralists.lyrasis.org" class="">Archivesspace_Users_Group@lyralists.lyrasis.org</a><br class="">http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group<br class=""></div></blockquote></div><br class=""></div></body></html>