[Archivesspace_Users_Group] Retrieving tree info via API (what are "waypoints"?)

Trevor Thornton trthorn2 at ncsu.edu
Tue Jul 23 14:22:04 EDT 2019


First of all, great documentation (in the code; the API documentation less
so, but we're working on that) 👍

To close the loop on this thread (for anyone still interested):
For what I'm doing I need the container info, which is included in the
.../tree/... responses. Basically I'm re-creating a version of the AS
resource tree to provide a browsable view of a resource hierarchy in
another application. So the process will be something like this (for a
resource with URI */repositories/1/resources/123*):

   1. Call */repositories/1/resources/123/tree/root* to get the
   resource-level data and its children (up to 200)
   2. If the value for "waypoints" in the response is greater than 1, call
   */repositories/1/resources/123/tree/waypoint?offset=n* for each
   additional waypoint (n = 1 through # of waypoints - 1) to get the rest of
   the children
   3. Then for each child record that has children of its own, I'll provide
   a link to see the next level, which will call
   */repositories/1/resources/123/tree/node?node_uri=[URI OF THE RECORD THAT
   WAS CLICKED]* (NOTE: node_uri is a required parameter for this endpoint
   but that's not mentioned in the API documentation).
   This provides a response similar to the *.../tree/root* endpoint but
   with data for the archival object record instead of the resource
   4. Repeat step 2 if there is more than one waypoint at this level,
   including the current node URI as *parent_node* in the GET params
   5. Repeat steps 3 & 4 until you get to the end

I *think* this is close to right.
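
The steps above can be sketched roughly as follows. This is only an illustration, not tested against a live ArchivesSpace instance: `fetch` is a hypothetical callable (path, params) -> parsed JSON that you would supply yourself, and the sketch assumes each child record carries "uri" and "child_count" keys and that "precomputed_waypoints" is an object of objects whose innermost value is the list of children.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical: fetch(path, params) performs an authenticated GET and
# returns the parsed JSON body. It is not part of any ArchivesSpace client.
Fetch = Callable[[str, Dict[str, Any]], Any]
Node = Dict[str, Any]


def first_waypoint(node: Node) -> List[Node]:
    """Children already embedded in a tree/root or tree/node response.

    Assumed shape: {"precomputed_waypoints": {<key>: {<offset>: [children]}}}.
    """
    for by_offset in node.get("precomputed_waypoints", {}).values():
        for children in by_offset.values():
            return list(children)
    return []


def children_of(fetch: Fetch, resource_uri: str, node: Node,
                parent_uri: Optional[str] = None) -> Iterator[Node]:
    """Steps 1-2 and 4: the embedded first waypoint, then offsets 1..n-1."""
    yield from first_waypoint(node)
    for offset in range(1, node.get("waypoints", 0)):
        params: Dict[str, Any] = {"offset": offset}
        if parent_uri is not None:
            # Below the root, the waypoint call also names the parent record.
            params["parent_node"] = parent_uri
        yield from fetch(f"{resource_uri}/tree/waypoint", params)


def walk(fetch: Fetch, resource_uri: str, node: Optional[Node] = None,
         parent_uri: Optional[str] = None,
         depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Steps 3 and 5: yield (depth, child) for every record, depth-first."""
    if node is None:
        node = fetch(f"{resource_uri}/tree/root", {})
    for child in children_of(fetch, resource_uri, node, parent_uri):
        yield depth, child
        if child.get("child_count", 0) > 0:
            sub = fetch(f"{resource_uri}/tree/node", {"node_uri": child["uri"]})
            yield from walk(fetch, resource_uri, sub, child["uri"], depth + 1)
```

In a browsable UI you would probably call children_of() only when a node is expanded rather than walking everything up front; walk() is just the eager version of the same loop.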

Thanks again for your help!

On Tue, Jul 23, 2019 at 12:51 PM Majewski, Steven Dennis (sdm7g) <
sdm7g at virginia.edu> wrote:

>
> I believe that for the next level of archival_objects you have to GET
> /repositories/$REPO/archival_objects/$ID/children, but check the API docs.
>
>
> Note that there is also a GET
> /repositories/$REPO/resources/$ID/ordered_records method that gives you the
> whole hierarchy, but minimal info about each record:  { ref:,
> display_string:, depth:, level: }
>
> I don’t think I knew about that one the first time I was wrestling with
> this sort of task.
> If you’re working against the backend API and aren’t worried about
> real-time display updates, it might make more sense to walk the
> ordered_records output if you want more complete info on the resource’s
> children.
>
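
A small sketch of what "walking" the ordered_records output could look like: rebuilding a nested tree from the flat, depth-annotated list. It assumes you have already pulled the list of { ref, display_string, depth, level } entries out of the response (how that list is wrapped in the response body is not shown in this thread), and that entries arrive in display order. The function name and the "children" key are made up for illustration.

```python
from typing import Any, Dict, List


def nest_ordered_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rebuild a hierarchy from a pre-ordered list with a 'depth' per entry."""
    roots: List[Dict[str, Any]] = []
    stack: List[Dict[str, Any]] = []  # stack[d] = latest node seen at relative depth d
    base = records[0]["depth"] if records else 0
    for rec in records:
        node = {**rec, "children": []}
        depth = rec["depth"] - base
        # Drop anything at this depth or deeper; what is left ends at the parent.
        del stack[depth:]
        if stack:
            stack[-1]["children"].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots
```

Each "ref" in the result can then be fetched individually if the minimal fields are not enough.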
>
> — Steve.
>
>
> On Jul 23, 2019, at 12:11 PM, Trevor Thornton <trthorn2 at ncsu.edu> wrote:
>
> Just found that file in the repo before I saw your message and I think I
> understand now - thanks!
>
> So, if you're looking at a node below the root (an ArchivalObject) that
> has >200 children, you would hit the ".../tree/waypoint" endpoint however
> many times and include "parent_node" in the GET params with the
> ArchivalObject URI, right?
>
> On Tue, Jul 23, 2019 at 11:57 AM Majewski, Steven Dennis (sdm7g) <
> sdm7g at virginia.edu> wrote:
>
>>
>> So the next question is how do you make the subsequent calls to retrieve
>> the next 200, etc.?
>>
>>
>>
>> You call  /repositories/$repo/resources/$id/tree/waypoint?offset=$N  23
>> times.
>> (You already got the first batch in .precomputed_waypoints in the call
>> to .../tree/root.)
>>
>>
>> I found the documentation note in the source I was looking for:
>>
>> https://github.com/archivesspace/archivesspace/blob/master/backend/app/model/large_tree.rb
>>
>>
>> # What's the big idea?
>> #
>> # ArchivesSpace has some big trees in it, and sometimes they look a lot like big
>> # sticks.  Back in the dark ages, we used JSTree for our trees, which in general
>> # is perfectly cromulent.  We recognized the risk of having some very large
>> # collections, so dutifully configured JSTree to lazily load subtrees as the
>> # user expanded them (avoiding having to load the full tree into memory right
>> # away).
>> #
>> # However, time makes fools of us all.  The JSTree approach works fine if your
>> # tree is fairly well balanced, but that's not what things look like in the real
>> # world.  Some trees have a single root node and tens of thousands of records
>> # directly underneath it.  Lazy loading at the subtree level doesn't save you
>> # here: as soon as you expand that (single) node, you're toast.
>> #
>> # This "large tree" business is a way around all of this.  It's effectively a
>> # hybrid of trees and pagination, except we call the pages "waypoints" for
>> # reasons known only to me.  So here's the big idea:
>> #
>> #  * You want to show a tree.  You ask the API to give you the root node.
>> #
>> #  * The root node tells you whether or not it has children, how many children,
>> #    and how many waypoints that works out to.
>> #
>> #  * Each waypoint is a fixed-size page of nodes.  If the waypoint size is set
>> #    to 200, a node with 1,000 children would have 5 waypoints underneath it.
>> #
>> #  * So, to display the records underneath the root node, you fetch the root
>> #    node, then fetch the first waypoint to get the first N nodes.  If you need
>> #    to show more nodes (i.e. if the user has scrolled down), you fetch the
>> #    second waypoint, and so on.
>> #
>> #  * The records underneath the root might have their own children, and they'll
>> #    have their own waypoints that you can fetch in the same way.  It's nodes,
>> #    waypoints and turtles the whole way down.
>> #
>> # All of this interacts with the largetree.js code in the staff and public
>> # interfaces.  You open a resource record, and largetree.js fetches the root
>> # node and inserts placeholders for each waypoint underneath it.  As the user
>> # scrolls towards a placeholder, the code starts building tracks ahead of the
>> # train, fetching that waypoint and rendering the records it contains.  When a
>> # user expands a node to view its children, that process repeats again (the node
>> # is fetched, waypoint placeholders inserted, etc.).
>> #
>> # The public interface runs the same code as the staff interface, but with a
>> # small twist: it fetches its nodes and waypoints from Solr, rather than from
>> # the live API.  We hit the API endpoints at indexing time and store them as
>> # Solr documents, effectively precomputing all of the bits of data we need when
>> # displaying trees.
>>
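
The page arithmetic in that comment can be checked with a couple of lines. This assumes the waypoint count is simply the child count divided by the waypoint size, rounded up, which matches both figures quoted in this thread (1,000 children -> 5 waypoints; 4,780 children -> 24) but is an inference, not something read out of the backend code. Both function names are made up.

```python
from typing import List


def waypoint_count(child_count: int, waypoint_size: int) -> int:
    """Number of fixed-size pages needed to hold child_count nodes (rounded up)."""
    return (child_count + waypoint_size - 1) // waypoint_size


def remaining_offsets(child_count: int, waypoint_size: int) -> List[int]:
    """Offsets still to request once waypoint 0 has come back with the node itself."""
    return list(range(1, waypoint_count(child_count, waypoint_size)))
```

In practice the "waypoints" value in the response already gives you the count, so this is mainly useful as a sanity check.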
>>
>>
>>
>>
>> On Jul 23, 2019, at 11:08 AM, Trevor Thornton <trthorn2 at ncsu.edu> wrote:
>>
>> Thanks, Steve. That makes sense, and I tested with a resource with >1000
>> top level children and I see that only 200 of them are included, which
>> corresponds to the value for "waypoint_size" in the response:
>>
>> {
>>    "child_count":4780,
>>    "waypoints":24,
>>    "waypoint_size":200
>> ...
>>
>>
>> So the next question is how do you make the subsequent calls to retrieve
>> the next 200, etc.?
>>
>> On Tue, Jul 23, 2019 at 10:52 AM Majewski, Steven Dennis (sdm7g) <
>> sdm7g at virginia.edu> wrote:
>>
>>> I believe the rationale of the waypoints was that initially, it was
>>> expected that resource children/ archival objects would fall into a more
>>> balanced tree structure, but it turned out that there were many flat
>>> hierarchies with hundreds of top level children, and getting all of the
>>> children at once was not working very efficiently. So with the waypoint
>>> calls, you may only be getting some of the children, but the display can
>>> start populating the tree display while making additional calls for the
>>> rest.
>>>
>>> I may have some postman examples and internal notes around somewhere:
>>> I’ll see what I can dig out.
>>>
>>> — Steve.
>>>
>>>
>>> On Jul 23, 2019, at 9:05 AM, Trevor Thornton <trthorn2 at ncsu.edu> wrote:
>>>
>>> Hi everybody-
>>>
>>> I'm building a service using these API endpoints (or I think I am):
>>> [:GET] /repositories/:repo_id/resources/:id/tree/root
>>> <http://archivesspace.github.io/archivesspace/api/#fetch-tree-information-for-the-top-level-resource-record>
>>> [:GET] /repositories/:repo_id/resources/:id/tree/node
>>> <http://archivesspace.github.io/archivesspace/api/#fetch-tree-information-for-an-archival-object-record-within-a-tree>
>>>
>>> These incorporate the concept of "waypoints", which I admit that I'm not
>>> familiar with in this context, and it isn't explained very well in the
>>> documentation. This is what I have to work with (these are elements
>>> included in the API response):
>>>
>>>    - child_count – the number of immediate children
>>>    - waypoints – the number of “waypoints” those children are grouped
>>>    into
>>>    - waypoint_size – the number of children in each waypoint
>>>    - precomputed_waypoints – a collection of arrays (keyed on child
>>>    URI) in the same format as returned by the ’/waypoint’ endpoint. Since a
>>>    fetch for a given node is almost always followed by a fetch of the first
>>>    waypoint, using the information in this structure can save a backend call.
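
To make the nesting of "precomputed_waypoints" concrete, here is a minimal sketch that flattens it into rows. It assumes the shape described in this thread: an outer object with a single key, whose value is an object keyed by waypoint offset (as a string, since these are JSON object keys), whose value is the array of child records. What the outer key contains is not verified here, and the function name is made up.

```python
from typing import Any, Dict, List, Tuple


def unwrap_precomputed(precomputed: Dict[str, Dict[str, List[Any]]]
                       ) -> List[Tuple[str, int, Any]]:
    """Flatten {outer_key: {offset: [children]}} into (outer_key, offset, child) rows."""
    rows: List[Tuple[str, int, Any]] = []
    for outer_key, by_offset in precomputed.items():
        # Sort numerically so offset "10" does not come before offset "2".
        for offset, children in sorted(by_offset.items(), key=lambda kv: int(kv[0])):
            for child in children:
                rows.append((outer_key, int(offset), child))
    return rows
```

Seen this way, the precomputed structure is just waypoint 0 delivered early: the same list a call to the waypoint endpoint would return, wrapped in two levels of keys.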
>>>
>>> Can anyone explain what exactly waypoints are and how they are different
>>> from children? In the examples I've seen, the "precomputed_waypoints"
>>> element in the response looks like a convoluted way (an array value of the
>>> lone element in an object, which is itself the value of the lone element in
>>> another object) to provide the children nodes of the given node (or root).
>>> What's the difference?
>>>
>>> Thanks,
>>> Trevor
>>>
>>> --
>>> Trevor Thornton
>>> Applications Developer, Digital Library Initiatives
>>> North Carolina State University Libraries
>>> _______________________________________________
>>> Archivesspace_Users_Group mailing list
>>> Archivesspace_Users_Group at lyralists.lyrasis.org
>>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>>>
>>>
>>
>>
>>
>
>
>


-- 
Trevor Thornton
Applications Developer, Digital Library Initiatives
North Carolina State University Libraries