[Irtalk] FW: [Web4lib] Ranking Web of Repositories and pdf suffixes

Smith, Ina <ismith@sun.ac.za>
Thu Jan 20 17:59:50 SAST 2011

Congratulations to all who have been included in this ranking!

Kind regards

-----Original Message-----
From: web4lib-bounces at webjunction.org [mailto:web4lib-bounces at webjunction.org] On Behalf Of Isidro F. Aguillo
Sent: 20 January 2011 17:54
Cc: Web4lib at webjunction.org
Subject: [Web4lib] Ranking Web of Repositories and pdf suffixes

Thanks for your suggestions. We have just added them to the new edition 
of the Ranking Web of Repositories that has been published today.

The Ranking Web of Repositories has been published since 2008, with two
editions a year, usually at the end of January and July. The January 2011
edition is available from here:


This edition consists of more than 1,200 repositories, ranked according 
to a composite index that combines activity indicators (size, rich files 
and Scholar) and impact (link visibility). The repositories should have 
their own web domain or subdomain and include at least peer-reviewed 
papers to be considered (services that contain only archives, databanks 
or learning objects are not ranked). A separate ranking (Top Portals) is 
devoted to national services, international platforms and portals of 
journals, but it is still in beta.

The list is headed by the subject repositories: SSRN is first, arXiv
second, and close behind them are CiteSeerX and RePEc. PMC and other
repositories that do not use suffixes are being penalized both by the
search engines and by the Ranking.

As usual, we welcome any comments, suggestions or criticism.


On 20/01/2011 2:34, Michael wrote:
> The IANA and RFC documents would probably be the best place to find 
> this, though it's hard if you don't know the structures or archives of 
> the protocols behind the Internet.
> Here's the RFC 3778 for PDF from the RFC Archives:
> http://www.rfc-archive.org/getrfc.php?rfc=3778
> and look at the file extension identified clearly as .pdf (under 8. 
> IANA Considerations)
> Whether or not archival/repository groups would handle naming 
> files/documents as it should be done (including extensions), is 
> problematic, IMHO. They *should* follow standards and protocols, but 
> perhaps they do not by choice, ignorance, or for other reasons.
> Best,
> Michael
> Michael aka DrWeb | E-mail: DrWeb2 at gmail.com | Twitter: @DrWeb2
> On Tue, Jan 18, 2011 at 11:55 PM, Isidro F. Aguillo
> <isidro.aguillo at cchs.csic.es> wrote:
>     Dear Thomas:
>     Thank you for your comments. This is exactly the reason for asking
>     for an "official" statement supporting one of the two views. My
>     problem is that search engines use the suffixes in some
>     filtering options, and many people think, as you do, that adding
>     a suffix is neither mandatory nor needed. I am searching for
>     documents stating mandates or recommendations, if they exist.
>     On 18/01/2011 18:07, Thomas Dowling wrote:
>         On 01/18/2011 04:08 AM, Isidro F. Aguillo wrote:
>             Dear colleagues:
>             A large number of pdf files currently available from many
>             repositories are not using the .pdf suffix at all. Although
>             this is not a major problem, I think it is a "bad practice",
>             but I do not know of any document stating this.
>             Could you help me on this issue?
>         Why do you think it's a bad practice?
>         There are no files on the web - only data streams.  What *ought*
>         to matter is not ".pdf" at the end of the file name but
>         "application/pdf" at the start of the stream.
>         That said, many browsers remain clueless about default file
>         names for saving and downloading (if it's PDF, "output.php" is
>         not a good guess for a "Save As..." option).  They benefit from
>         a little handholding, so when I spit PDF out of a script, I
>         usually tack on "/Something_Sensible.pdf" at the end of the URL.
>         Thomas Dowling
>         tdowling at ohiolink.edu
>         _______________________________________________
>         Web4lib mailing list
>         Web4lib at webjunction.org
>         http://lists.webjunction.org/web4lib/
>     -- 
>     ****************************************
>     Isidro F. Aguillo, HonPhD
>     The Cybermetrics Lab
>     CSIC
>     Albasanz, 26-28. Madrid 28037. Spain
>     isidro.aguillo at cchs.csic.es
>     ****************************************
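
To make the two views in the quoted thread concrete, here is a minimal
sketch in Python of the two signals under discussion: the ".pdf" suffix in
the URL path and the "application/pdf" media type in the Content-Type
header. It also shows the habit Thomas describes of tacking a sensible file
name onto a script URL. The function names and the example URL are
hypothetical, chosen only for illustration, and the last helper assumes the
server ignores extra path segments after the script name, which is not
guaranteed for every setup.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def has_pdf_suffix(url: str) -> bool:
    """True if the URL path (ignoring query and fragment) ends in .pdf."""
    return urlsplit(url).path.lower().endswith(".pdf")


def is_pdf_media_type(content_type: str) -> bool:
    """True if a Content-Type header value names application/pdf."""
    # Drop any parameters after ';' and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


def with_pdf_filename(url: str, filename: str) -> str:
    """Append '/<filename>' to the URL path, keeping query and fragment.

    Assumption: the server passes the extra path segment to the script
    (or ignores it) instead of returning an error.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/" + quote(filename)
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# Hypothetical repository URL that serves a PDF from a script.
script_url = "http://repo.example.org/output.php?id=7"
friendly_url = with_pdf_filename(script_url, "Something_Sensible.pdf")
```

With these helpers, a stream can be recognised as PDF from its header even
when has_pdf_suffix(script_url) is false, while friendly_url carries the
suffix that suffix-based filters look for.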

Isidro F. Aguillo, HonPhD
The Cybermetrics Lab
Albasanz, 26-28. Madrid 28037. Spain

isidro.aguillo at cchs.csic.es
