[nfais-l] NFAIS Enotes, 2014, No. 1

Jill O'Neill jillmwo at gmail.com
Wed Feb 5 08:12:56 EST 2014


*NFAIS Enotes, 2014, No. 1*

*Written and compiled by Jill O'Neill*



*Google Scholar, Jane Austen and Changes in Scholarly Search*



In going through old documents on a computer recently, I came across a
short piece I had written back in 2006 about Google Scholar and a feature
it had added that was supposed to direct researchers to recently published
items. At that time, Google Scholar was experiencing serious issues, poor
metadata quality being the most significant one. Relevancy ranking
algorithms were mysterious. The scope of the content included in the
service was unclear. Full-text availability was really a hit-or-miss
proposition. Overall, as an information service, it had drawbacks for
the too-casual student.

The search query I used in that evaluation was ["Jane Austen" Church
clergy Georgian], a standard query that I have used since Google Scholar
was launched in 2004 as a way of tracking the service's scope and added
enhancements. In June of 2006, that query yielded a paltry 127 results. Run
in 2014, that same query yielded 1,220 results.


What is now included in that larger result set in 2014? On my first page of
results, the ten links presented included six scanned titles searchable on
Google Books, one primary cited document from the Hampshire County Council
in the UK, and three items from digital repositories. Running a query from
the sciences such as [alkaloids "conium maculatum" livestock] retrieves a
smaller set of hits, but all of the items on the first page are either from
journal publishers (Elsevier, Sage) or specialized aggregators such as
InformaHealthCare.com.



A set of 2013 exchanges on the Web4Lib listserv outlined some of the pros
and cons, in the eyes of the information professional, of using Scholar in
place of a discovery service. The service is free and comes with a
user-friendly interface, *but* there is no clear outline of what is covered
in the resource and Google could pull the plug on it at any time without
apology. Local holdings aren't included in Google Scholar, which puts
librarians at a disadvantage. They can't make clear to patrons what content
is immediately accessible and what may entail a waiting period.

 Google has not worried too much about soothing that irritation. On the
Google Scholar FAQ, in response to the question *Do you cover PubMed?
JStor? Elsevier?,* the response reads "*We index research articles and
abstracts from most major academic publishers and repositories worldwide,
including both free and subscription sources. To check current coverage of
a specific source in Google Scholar, search for a sample of their article
titles in quotes*." Even that response is clearly directed towards the
primary end user, rather than the information professional who may be
trying to assist that user with their research. (
http://scholar.google.com/intl/en/scholar/help.html#coverage)
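
For anyone who wants to try the FAQ's suggestion on more than one or two titles, here is a minimal sketch in Python. It only builds the quoted-title search links; it does not query Google or interpret results. The search address and its `q` parameter are my assumption about how Scholar search links are formed, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Scholar search links take the form <base>?q=<query>.
SCHOLAR_SEARCH = "https://scholar.google.com/scholar"


def quoted_title_query(title: str) -> str:
    """Wrap a title in double quotes so it is searched as an exact phrase."""
    # Drop any embedded double quotes and collapse runs of whitespace.
    cleaned = " ".join(title.replace('"', " ").split())
    return f'"{cleaned}"'


def coverage_check_urls(titles):
    """Return one search link per sample title, for checking by hand."""
    return [
        f"{SCHOLAR_SEARCH}?{urlencode({'q': quoted_title_query(t)})}"
        for t in titles
    ]


if __name__ == "__main__":
    sample = [
        "Just Google It: Digital Research Practices of Humanities Scholars",
    ]
    for url in coverage_check_urls(sample):
        print(url)
```

Whether a given title actually turns up still has to be judged by a person looking at the results page.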



Of course, given the rapidly expanding options available, scholarship need
not take the form of either journal articles or monographs. I was much
taken by a digital humanities project that reproduced an art gallery
visited by Austen in 1813 (http://www.whatjanesaw.org/). The project was
sufficiently impressive to capture attention from the mainstream press
(http://www.nytimes.com/2013/05/25/books/what-jane-saw-is-an-online-trip-for-jane-austen-fans.html).



There was a journal article written about the project, one identified by
the following preferred citation:

Barchas, Janine (2012) "Digitally Reconstructing the Reynolds Retrospective
Attended by Jane Austen in 1813: A Report on E-Work-in-Progress," ABO:
Interactive Journal for Women in the Arts, 1640-1830: Vol. 2: Iss. 1,
Article 13.

DOI: http://dx.doi.org/10.5038/2157-7129.2.1.12

Available at: http://scholarcommons.usf.edu/abo/vol2/iss1/13



The problem is that *What Jane Saw* is not adequately captured in that
traditional scholarly article, at least in part because the article is
focused on the building of the digital environment rather than on the
author's scholarship, which actually focuses on the concept of celebrity in
the Georgian era. The more traditional outputs of her scholarship (article
and monograph) have appropriate metadata assigned. The digital construction
itself, however, is poorly served. Unlike the article cited above, the
digital exhibit lacks reliable metadata and thus can easily be obscured
in subsequent searching. Yes, you can find *What Jane Saw* in Google
Scholar if you search using that phrase, but it displays the author's name
not as *Janine Barchas* but instead as *WJ Saw*.


At this year's  NFAIS Annual Conference, we'll be hearing from scholars who
are making similarly creative investigations, leveraging digital content
and new technologies.  Yet Google isn't particularly concerned about the
long-term retrievability of some of this digital content in the context of
standard library practice.  In this context, it may be interesting to read
the work of younger scholars, in particular, this related paper by Dutch
researchers entitled, *Just Google It: Digital Research Practices of
Humanities Scholars* (http://arxiv.org/abs/1309.2434).


While limited to Dutch and Flemish humanists, the authors' study concludes
that "*It is probable that considerations of convenience supersede the
principles of provenance and context. Google might not cover all the
relevant sources, but it does probably cover the most**.  Furthermore in
terms of efficiency, relying on Google instead of searching in multiple
alternative more refined search systems, within the websites of specialized
institutions, and subsequently comparing the results, saves time and energy*."
Added emphasis in boldface is mine. Further on in the paper, the authors
essentially shrug off concerns that digital practice is unlikely to mirror
analog library and information science practice.



Throughout 2013, a series of BioMedCentral articles carried an ongoing
discussion of the issues surrounding use of the tool. Jean-François
Gehanno (University of Rouen) delved into the use of Google Scholar for
purposes of developing systematic reviews
(http://www.biomedcentral.com/1472-6947/13/7), and Martin Boeker of
Uniklinik-freiburg.de took that work a step further
(http://www.biomedcentral.com/1471-2288/13/131). To clarify, systematic
reviews emerged in recent decades as a way of providing health-care
professionals with data from a broad spectrum of research trials in order
to enable evidence-based practices. Quoting from a leading institute:
*Systematic reviews aim to find as much as possible of the research relevant
to the particular research questions, and use explicit methods to identify
what can reliably be said on the basis of these studies... Such reviews then
go on to synthesize research findings in a form which is easily accessible
to those who have to make policy or practice decisions. In this way,
systematic reviews reduce the bias which can occur in other approaches to
reviewing research evidence*
[http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=67]. The work by both Gehanno
and Boeker indicates that Google Scholar is inadequate to the search and
recall necessary to ensure a high-quality systematic review. Lack of
precision was a major drawback to its use.
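
For readers less familiar with the two measures, recall is the share of all relevant studies that a search actually finds, and precision is the share of retrieved items that turn out to be relevant. The short Python sketch below shows the arithmetic. The record identifiers in it are hypothetical placeholders, not figures from either study, and the function name is my own.

```python
def precision_recall(retrieved: set, relevant: set):
    """Return (precision, recall) for one search against a known relevant set."""
    hits = retrieved & relevant  # relevant items the search actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    retrieved = {"a", "b", "c", "d"}  # what the search returned
    relevant = {"a", "b", "e"}        # what a reviewer would want to see
    p, r = precision_recall(retrieved, relevant)
    print(f"precision={p:.2f} recall={r:.2f}")
```

A systematic review needs recall as close to complete as possible, while low precision means wading through many irrelevant hits to get there.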

 The library community largely agrees with this assessment (
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3733758/), although now and
again, there are those suggesting that discovery services do no better at
the task (
http://musingsaboutlibrarianship.blogspot.com/2013/11/is-summon-alone-good-enough-for.html#.UqDZ8_RDsRp).
 Others suggest that the problem lies less with discovery service
technologies and more with "weaknesses in the educational process for
systematic review methodologies, and in the level of methodological
expertise on the part of the authors, editors, and reviewers of the
scholarly journals." (
http://etechlib.wordpress.com/2013/01/23/whats-wrong-with-google-scholar-for-systematic-reviews/).
The information and research communities seem to be at odds over what
should be accepted as best practices in service to scholarship.



Among content providers and librarians, it is understood that Google views
Scholar as something of a low-priority service. Google knows the tool
gets used and they do occasionally beef up the offering, but the aura of
being Anurag Acharya's private "20% time" project still remains. There were
numerous upgrades in 2012 but only two improvements in 2013. The most
recent upgrade in November had to do with saving discovered items from a
search query - essentially mirroring most bibliographic management
software. (
http://googlescholar.blogspot.com/2013/11/google-scholar-library.html).  It
is an improvement but not an overly dramatic one and, again, Google's
target is the end-user rather than the information professional. To me,
however, the most telling indicator of Google's valuation of Scholar lies
in the fact that Scholar has not yet been optimized for Google Now, the
voice command interface that works with mobile and wearable technologies -
everything from the Google search app on one's iPod Touch up through Google
Glass.



Meanwhile, Google has been upgrading other varieties of search, as well as
acquiring companies like DeepMind and Nest.  Tech analysts have their
various theories as to Google's strategic direction (the Internet of
Things, etc.), but what does it mean in the context of what NFAIS member
organizations do? Click through on the following headline links to two
stories regarding deep learning initiatives at Google:



(1) If this doesn't terrify you... Google's computers OUTWIT their humans

'Deep learning' clusters crack coding problems their top engineers can't

http://www.theregister.co.uk/2013/11/15/google_thinking_machines/

From the story: *By working hard to give its machines greater capabilities,
and local, limited intelligence, Google can crack classification problems
that its human experts can't solve*.



(2) More on DeepMind: AI Startup to Work Directly With Google's Search Team

http://recode.net/2014/01/27/more-on-deepmind-ai-startup-to-work-directly-with-googles-search-team/

From the story: "...*sources said Deep Mind is actually being inserted into
Google's oldest team: Search. Or as search is known at Google today, the
"Knowledge Group" - so called because it no longer finds keywords on Web
pages, but instead connects larger concepts*."



Even now, Google is expanding beyond what we might think of as "simple"
linked data and into a more complex and challenging environment of
presenting useful answers.



Last year, I could run a very broad query "Jane Austen" and Google would
pop up one of its enhanced Knowledge cards. Off to the right of my screen,
a box displays a portrait of Jane Austen, her birth and death dates, etc.
The box wasn't all that useful except in the most limited function of
looking up such information. (Google's designers of that card display
didn't think that links referring the user to Austen's novels should take
precedence over links to film and television adaptations of her works,
any more than they would think to include links to the digital scholarship
of *What Jane Saw*.)  Still, that display was most users' introduction to
the idea of linked data aggregating answers in response to a search.



This month, a new wrinkle has emerged.  If I run the query I noted at the
beginning of this issue of Enotes ("Jane Austen" Church clergy Georgian) in
the mainstream version of Google today, I will see grey notations with each
hit indicating the source of that link if Google deems that source to be
"widely recognized as notable online" (See
http://insidesearch.blogspot.com/2014/01/more-information-about-websites-to-help.html
).



Suddenly the user sees a recent book title published by Ashgate. Click on a
small downward-pointing grey arrow and Google displays a smaller Google
Knowledge box that notes what Ashgate is -- an academic book and journal
publisher based in the UK. Does that information about Ashgate (taken from
Wikipedia) lend credibility to the book title *Jane Austen's Anglicanism*
in the mind of the user? Conceivably. The link below that one is to a
relevant PDF document posted by the University of Nebraska - Lincoln, where
the author of that title teaches. Does Google plan on linking up that data
for the researcher as well? Conceivably.



Not all links get this indicator. A link to Irene Collins' *Jane Austen and
The Clergy* on Google Books receives no such bolstering. A little deeper
into the search results, there are such notable sources listed as the
University of Florida and Goodreads. Still, Google is making an attempt to
provide users with a means of discovering reliable content, as well as a
pointer to some entity that might aid in gaining access.



Google isn't delivering particularly worthwhile information from an
educational or research standpoint, but it's easy to see how something more
competitive might eventually emerge. Google has the capability of matching
up Jane Austen with book content from Ashgate Publishing as well as an
honors thesis housed in the University of Florida's digital repository.
When they succeed in harnessing their machines' "deep learning" - the fruit
of their engineers' creativity as well as the DeepMind acquisition -- will
they be building the advanced information service of the future? At that
point, will we even consider it to be an advanced service?

There are members of NFAIS that are leveraging linked data in ways similar
to Google's Knowledge Graph approach. When their services emerge, they will
be far more authoritative than Google's current tool, but will users
notice? They will, if NFAIS members move nimbly. Sometimes it is to this
community's advantage to be focused on satisfying user needs that Google
doesn't deem a priority.







Want to learn more about how new forms of content and big data techniques
are changing publishing? Plan to attend the 2014 NFAIS Annual Conference,
*Giving Voice to Content: Re-envisioning the Business of Information,*
scheduled for February 23-25, 2014 in Philadelphia, PA (see:
http://nfais.org/event?eventID=530).





2013 NFAIS Supporters



Access Innovations, Inc.

Accessible Archives, Inc.

American Psychological Association/PsycINFO

American Theological Library Association

Annual Reviews

CAS

CrossRef

Data Conversion Laboratory, Inc.

Defense Technical Information Center

EBSCO Information Services

Getty Research Institute

The H. W. Wilson Foundation

Information Today, Inc.

IFIS

Modern Language Association

OCLC

Philosopher's Information Center

ProQuest

RSuite CMS

Scope e-Knowledge Center

TEMIS, Inc.

-- 
Jill O'Neill
jillmwo at gmail.com
http://www.linkedin.com/in/jilloneill