<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">So, it sounds like if a resource was
*ever* set to "Publish" at the resource level, Google / other
search engines will have crawled it and the EAD-PDF link will be
there in search engines and live on, even if that resource
subsequently gets unpublished.<br>
<br>
If in our conversion from AT, *all* resources were set to
"publish" (instead of just publishing the ones with the status
"completed"), then that most likely explains this situation. I do
recall that in our post-conversion QC work a year ago, we did
"unpublish" whatever Resources had a status other than
"completed". So these are resources that could not have been
"published" for long (though more than a couple of months, which
is long enough to be crawled) and have been "unpublished" at the
resource level for a year now, and yet are discoverable via their
EAD-PDF link.<br>
<br>
Mang mentions that if a visibility check could be introduced when
generating EAD-PDF, etc., the problem could be solved. <br>
That sounds like it would be helpful to have built into
ArchivesSpace, no? <br>
<br>
For now though, we will have to make a plan for dealing with our
unpublished resources which are discoverable in search engines. <br>
<br>
Thanks for helping to think this through,<br>
Amanda<br>
<br>
<br>
<br>
On 7/5/2016 3:17 PM, Mang Sun wrote:<br>
</div>
<blockquote cite="mid:a7f6460b-9508-0184-86d7-2f463ac12dd3@rice.edu"
type="cite">
<p><br>
</p>
<p>Amanda, <br>
<br>
<br>
When did you change the Publish? flag of those resources to "No"?
I vaguely recall that, at the very beginning, all resources were
set to Publish? YES. If that is true, Google has already
crawled and indexed the shortcut link to the EAD-PDF for each
resource, even if its Publish? flag was set to NO later on. Because
in our current version, 1.4.2, the code underlying the link
to the EAD-PDF seemingly doesn't check the Publish? flag of
resources, the shortcut link in question (which can even be
assembled manually by following a pattern) will remain valid,
and Google will keep crawling and indexing it even for
unpublished resources that were ever published. If a visibility
check could be introduced when generating EAD-PDF, etc., the
problem could be solved. To prevent Google from remembering a
shortcut link with our current version, a new resource should be
set to Publish? NO from the very beginning, but this still
can't prevent power users from handcrafting the link to get the
EAD-PDF output of an invisible resource if they know the
generated or assigned resource number. <br>
<br>
<br>
Mang </p>
<p><br>
</p>
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 7/5/2016 2:52 PM, Custer, Mark
wrote:<br>
</div>
<blockquote
cite="mid:BN3PR08MB13189C2B339CE4A63825BB418C390@BN3PR08MB1318.namprd08.prod.outlook.com"
type="cite">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:12.0pt;
font-family:"Times New Roman",serif;
color:black;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
span.issue-link
{mso-style-name:issue-link;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;
color:black;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">Amanda,<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">So,
it sounds like the PUI is working as expected in that
case, but that the ASpace PDF conversion process is
including everything from each finding aid, whether it’s
listed as published or not. Is that right? If so, it
should just be a simple update to the ASpace PDF
stylesheet, and that type of change should definitely be
in the core code.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">I’ll
look to see if there’s an open issue for this, but if
there’s not, I can create one in JIRA. I’ve made a couple
updates to the core ASpace PDF stylesheet, and I hope to
make a few more before the next PUI is released.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">Mark<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:windowtext">From:</span></b><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:windowtext">
<a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org">archivesspace_users_group-bounces@lyralists.lyrasis.org</a>
[<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org">mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org</a>]
<b>On Behalf Of </b>Amanda Focke<br>
<b>Sent:</b> Tuesday, 05 July, 2016 2:45 PM<br>
<b>To:</b> <a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:archivesspace_users_group@lyralists.lyrasis.org">archivesspace_users_group@lyralists.lyrasis.org</a><br>
<b>Subject:</b> Re: [Archivesspace_Users_Group]
unpublished resource showing up as PDF download in a
Google search<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Hello Mang and all -<br>
<br>
What I did was search Google for something I know is from
one of our unpublished finding aids, <br>
<br>
such as this text string:<br>
"10002. Genetic regulatory proteins -1 (laci with altered
ligand responsivity. Kathleen Matthews"<br>
<br>
and the result was that the entire ArchivesSpace-generated
PDF version of the (unfinished / unpublished) finding aid
is available as the 2nd hit from Google's results list.<br>
<br>
<br>
I was hoping to attend the ArchivesSpace webinar which is
going on right now to see if this issue has been resolved,
<br>
but the webinar is full. I'll just wait for the recording
and if my questions aren't answered there, will follow up
with ArchivesSpace folks.<br>
<br>
Amanda <br>
<br>
<br>
<br>
<br>
<br>
On 7/5/2016 9:48 AM, Mang Sun wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p>Amanda, <o:p></o:p></p>
<p>I am just back. I can't seem to reproduce the Google
hit by searching Google for "Randall Hulet", and I don't
see a problem with our public interface when searching for
"Randall Hulet". Can you send me a screenshot of
your Google results for the title of this archival
object? <o:p></o:p></p>
<p>Mang<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">On 6/15/2016 1:59 PM, Amanda Focke
wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">I think this may be <span
class="issue-link"><span style="color:#3B73AF">AR-583
or AR-278</span></span><span
class="issue-link">, which both seem to say they are
resolved, so maybe if we upgrade this summer to the
new version this will be fixed....</span><br>
<br>
<span class="issue-link">Amanda</span><br>
<br>
On 6/14/2016 4:29 PM, Amanda Focke wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal" style="margin-bottom:12.0pt">Hello
-- <br>
<br>
We have an *unpublished* Resource in our ArchivesSpace
instance which is showing up <br>
when I search a text string from it in Google.<br>
<br>
I search a text string from that resource and I get a
hit (in Google) coming from our ArchivesSpace offering
a "printer friendly download" of the full PDF for the
Resource. <br>
<br>
I double-checked the Resource; it is definitely
"unpublished" at the top level, although it has
components which are marked as published (I'm not sure
why those are published but it shouldn't matter if the
parent is unpublished). <br>
<br>
Has anyone noticed this behavior? <br>
Thanks,<br>
Amanda<br>
<br>
<o:p></o:p></p>
<div>
<p class="MsoNormal">-- <br>
<b>Amanda Focke, CA, DAS</b><br>
Asst. Head of Special Collections<br>
Woodson Research Center <br>
Fondren Library MS-44<br>
Rice University <br>
6100 Main St. <br>
Houston, TX 77005<br>
713-348-2124 | <a moz-do-not-send="true"
href="mailto:afocke@rice.edu">afocke@rice.edu</a><br>
Website: <a moz-do-not-send="true"
href="https://urldefense.proofpoint.com/v2/url?u=http-3A__library.rice.edu_woodson&d=CwMD-g&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=qrl1p9pdF8AKUWh4QzJttjsQJvj57JscK0PiJy-NDGM&s=3SIN67f0Tro00gQKJHxLbmDWmnRPz399UpBuwNe5Xr4&e=">http://library.rice.edu/woodson</a><br>
Blog: <a moz-do-not-send="true"
href="https://urldefense.proofpoint.com/v2/url?u=http-3A__woodsononline.wordpress.com_&d=CwMD-g&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=qrl1p9pdF8AKUWh4QzJttjsQJvj57JscK0PiJy-NDGM&s=dMbXm9sY5G9VGaxj-ur6CTV2KvUNNmKIK2Y0_39Ne5g&e=">http://woodsononline.wordpress.com/</a><o:p></o:p></p>
</div>
<p class="MsoNormal"><br>
<br>
<br>
<o:p></o:p></p>
<pre>_______________________________________________<o:p></o:p></pre>
<pre>Archivesspace_Users_Group mailing list<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="mailto:Archivesspace_Users_Group@lyralists.lyrasis.org">Archivesspace_Users_Group@lyralists.lyrasis.org</a><o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="https://urldefense.proofpoint.com/v2/url?u=http-3A__lyralists.lyrasis.org_mailman_listinfo_archivesspace-5Fusers-5Fgroup&d=CwMD-g&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=s7ciGQfUJeaV_ryx908hbeXDoU9aqDwDN0Z0VbfsJ3Y&m=qrl1p9pdF8AKUWh4QzJttjsQJvj57JscK0PiJy-NDGM&s=N0QkZjMA44kL7h0mu-ZlNla8zK2LgHWQ4PAEFM4eAhg&e=">http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group</a><o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre><o:p> </o:p></pre>
<pre><o:p></o:p></pre>
</blockquote>
</blockquote>
</blockquote>
</div>
</blockquote>
</blockquote>
<br>
<br>
<div class="moz-signature">-- <br>
<b>Amanda Focke, CA, DAS</b><br>
Asst. Head of Special Collections<br>
Woodson Research Center <br>
Fondren Library MS-44<br>
Rice University <br>
6100 Main St. <br>
Houston, TX 77005<br>
713-348-2124 | <a class="moz-txt-link-abbreviated" href="mailto:afocke@rice.edu">afocke@rice.edu</a><br>
Website: <a href="http://library.rice.edu/woodson">http://library.rice.edu/woodson</a><br>
Blog: <a href="http://woodsononline.wordpress.com/">http://woodsononline.wordpress.com/</a></div>
</body>
</html>