[IRTalk] Open source data repository technologies

Ina Smith Ina at assaf.org.za
Wed Mar 6 10:20:05 SAST 2019

Dear Colleagues

From the AOSP landscape study, it was clear that open access institutional repositories are well established on the African continent (179 African repositories are registered on OpenDOAR<http://v2.sherpa.ac.uk/view/repository_by_country/002.html>). The majority of these repositories use DSpace<https://duraspace.org/dspace/> open source software, and considerable capacity exists among African system administrators and librarians.

Alternative data repository software options include Invenio 3<https://invenio-software.org/> (open source, for large-scale repositories; highly scalable, up to 100+ million records and petabytes of files) and Dataverse<https://dataverse.org/> (an open source research data repository with many features; see https://github.com/IQSS/dataverse/releases/tag/v4.10 ), as well as the technology options mentioned in the World Bank Toolkit<http://opendatatoolkit.worldbank.org/en/technology.html> (a great resource to guide you in setting up your data repository service).

Ideally, a data repository should form part of a science gateway<https://sciencegateways.org/new-to-gateways>, which brings together shared equipment and instruments, computational services, advanced software applications, collaboration capabilities, data repositories, and networks.

For those interested in looking at DSpace as an option: the following information on how DSpace can serve as a data repository was recently shared via the DSpace mailing list (Bram Luyten<mailto:bram at atmire.com>):

Examples of DSpace used as data repositories
University of Exeter https://ore.exeter.ac.uk/repository/handle/10871/14881 (single item, multiple TBs of data)
University of Nottingham Research Data Management Repository https://rdmc.nottingham.ac.uk/
Swiss Federal Institute of Technology in Zurich (ETH Zurich) https://www.research-collection.ethz.ch/
University of Cambridge https://www.repository.cam.ac.uk/browse?type=type&value=Dataset
DRYAD https://datadryad.org/
Indiana University https://dataworks.iupui.edu/
Smithsonian Libraries https://repository.si.edu/handle/10088/27850

Strengths of DSpace as a data repository
- File type agnostic. You're not limited to any specific file type or particular size.
- No theoretical file size limit. Even though there might be limits in other places (OS, underlying software), DSpace itself has no known limit of data size.
- Flexible metadata schemas, allowing you to align with DataCite and other metadata schemas.
- DOI integration with DataCite (connected with DataCite for automatic DOI minting).
- Different workflows and rules are possible on a per collection basis, giving an excellent starting point for a mixed Publication/Data set repository.
- Advancing URLs, e.g. where a researcher wants a permanent URL for their data set so they can send it to publishers, but would also like the dataset submission to refer to the permanent URL of the published paper. DSpace-CRIS can generate the links to the datasets while the publication is being submitted, and it generates the reciprocal link (from the dataset to the publication) automatically, so a repository administrator does not need to reopen the dataset items and add the link to the publication manually.
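
To illustrate the reciprocal linking described in the last point, here is a minimal Python sketch. It is not DSpace-CRIS code and does not use any DSpace API: the in-memory `items` dictionary, the identifiers `dataset-1` and `paper-1`, and the `link` function are all hypothetical. The relation names are borrowed from the DataCite relation types IsSupplementTo and IsSupplementedBy purely as an example of an inverse pair.

```python
# Illustrative sketch only: hypothetical in-memory items,
# not the DSpace-CRIS data model or API.

# Each relation type maps to its inverse (names borrowed from DataCite).
INVERSE = {
    "IsSupplementTo": "IsSupplementedBy",
    "IsSupplementedBy": "IsSupplementTo",
}


def link(items, source_id, relation, target_id):
    """Record source -> target and, automatically, the inverse target -> source."""
    pairs = (
        (source_id, relation, target_id),
        (target_id, INVERSE[relation], source_id),
    )
    for src, rel, dst in pairs:
        relations = items[src].setdefault("relations", [])
        if (rel, dst) not in relations:  # avoid duplicates on repeated calls
            relations.append((rel, dst))


# Hypothetical records: one data set and the paper it supports.
items = {
    "dataset-1": {"title": "Example data set"},
    "paper-1": {"title": "Example paper"},
}

# Linking once from the data set side also updates the paper.
link(items, "dataset-1", "IsSupplementTo", "paper-1")
```

The point of the sketch is that only one side of the link is entered by hand; the other side follows from the inverse table, so nobody has to reopen the second record.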

DSpace-CRIS consists of a data model describing objects of interest to Research and Development and a set of tools to manage the data. Standard DSpace is used to deal with publications and data sets, whereas DSpace-CRIS adds other CRIS entities: Researcher Pages, Projects, Organization Units and Second Level Dynamic Objects (single entities specialized by a profile, such as Journal, Prize or Event; each profile can define its own set of properties and nested objects). For more info, see: https://dspace-cris.4science.it/handle/123456789/15

Kind regards

Ina Smith
Project Manager: African Open Science Platform<http://africanopenscience.org.za/>
Academy of Science of South Africa (ASSAf)
DOAJ Ambassador, Southern Africa Region<https://blog.doaj.org/2016/09/07/the-doaj-ambassadors-biographies/>
LIASA Librarian of the Year 2016<https://loy2016blog.wordpress.com/>

ORCID: http://orcid.org/0000-0002-9710-3668

Switchboard: +27 12 349 6600
Tel: +27 12 349 6641
Fax: +27 (0) 86 576 9512
Email: ina at assaf.org.za


1st Floor Block A, The Woods, 41 De Havilland Crescent, Persequor Park
Meiring Naudé Road, Lynnwood 0020, Pretoria, South Africa.

PO Box 72135, Lynnwood Ridge 0040, Pretoria, South Africa.

Website: www.assaf.org.za<http://www.assaf.org.za/>
