DSpace at ILRI: A semi-technical overview of “CGSpace”
![Page 1: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/1.jpg)
A semi-technical overview of “CGSpace”
DSpace at ILRI
Alan Orth
KAINET ‘Open Data and Open Science’ Workshop
Nairobi, Kenya, 18 June 2015
![Page 2: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/2.jpg)
History of DSpace at ILRI
● 2009: ILRI launches Mahider (“repository” in Amharic)
● 2010: Other CGIAR centers and programs join our platform and share hard / soft costs
● 2011: Rebranded as “CGSpace”
● 2015: 9 CGIAR centers, ~50,000 items, ~250k hits/month
![Page 3: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/3.jpg)
“CGSpace” in June, 2015
![Page 4: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/4.jpg)
How we use DSpace
● Content people embedded in each department help capture results (presentations, papers, brochures, etc)
● Primary location for institutional outputs!
● No posting PDFs on corporate website!
● Integrate with website and blogs via RSS feeds
● Direct ALL traffic to DSpace!
● For data sets, videos, etc we make a metadata-only accession with a link to eg YouTube
![Page 5: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/5.jpg)
DSpace hierarchies
● Communities, sub-communities, and collections
● Tempting to model after organization hierarchy!
● (we did)
● … but organization hierarchies change!
![Page 6: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/6.jpg)
Mostly organized by output type now...
![Page 7: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/7.jpg)
Metadata
● Standard Dublin Core is available
● No AGROVOC
● You can create custom controlled vocabularies in arbitrary namespaces, eg: cg.subject.ilri
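A small sketch of how a field name such as cg.subject.ilri breaks down into schema, element, and optional qualifier. The `MetadataField` type and `parse_field` helper are hypothetical names made up for illustration; they are not part of DSpace.

```python
from typing import NamedTuple, Optional


class MetadataField(NamedTuple):
    schema: str               # namespace, eg "dc" or "cg"
    element: str              # eg "subject"
    qualifier: Optional[str]  # eg "ilri"; None when the field is unqualified


def parse_field(name: str) -> MetadataField:
    """Split a dotted field name into schema, element, and optional qualifier."""
    parts = name.split(".")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"unexpected field name: {name!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return MetadataField(parts[0], parts[1], qualifier)


custom = parse_field("cg.subject.ilri")
standard = parse_field("dc.title")
print(custom)
print(standard)
```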
![Page 8: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/8.jpg)
Custom metadata in ILRI report
Not AGROVOC!
![Page 9: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/9.jpg)
“Discovery” facets
● Context-aware metadata summaries
● Side effect: helps spot metadata inconsistencies!
● … Open Access, Open access, open Access, etc.
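The same kind of inconsistency can be spotted in an exported list of values. This is a minimal sketch, assuming the values of one field are already available as a plain list; the sample values and the `find_case_variants` helper are made up for illustration.

```python
from collections import defaultdict


def find_case_variants(values):
    """Group values that differ only in letter case or surrounding whitespace."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().casefold()].add(value)
    # Keep only the normalized keys that appear with more than one spelling.
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}


# Hypothetical values of one metadata field, as a facet might list them.
sample = ["Open Access", "Open access", "open Access", "Limited Access", "Open Access"]
variants = find_case_variants(sample)
print(variants)
```

Each key in the result is a normalized value, and the list beside it holds the spellings that should probably be merged into one.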
![Page 10: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/10.jpg)
Search engine optimization (SEO)
Help Google Scholar consume your content!
● XML sitemaps
● Consistent domain name, eg: cgspace.cgiar.org
● Persistent links for resources
● Website speed and HTTPS both a plus
● Sign up for Google Webmaster Tools to submit sitemap, control indexing, see stats, etc
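For a sense of what an XML sitemap contains, here is a minimal sketch that builds one with Python's standard library. DSpace can produce its own sitemaps, so this only illustrates the file's shape. The example.org URL and the `build_sitemap` helper are assumptions, not anything taken from CGSpace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical item URL on a placeholder domain.
sitemap_xml = build_sitemap(["https://repository.example.org/handle/123456789/1"])
print(sitemap_xml)
```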
![Page 11: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/11.jpg)
Sitemap view in Google Webmaster Tools
![Page 12: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/12.jpg)
Importance of persistent links
● Website addresses change…
● mahider.ilri.org -> cgspace.cgiar.org
● But resources stay the same!
http://hdl.handle.net/10568/67073
● “Handle” service from handle.net
● Everything under prefix 10568 is CGSpace
● Default DSpace handle prefix is 123456789!
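A handle URL splits into a prefix (the repository) and a suffix (the item). The sketch below takes the example URL from the slide apart. The `split_handle` helper is a hypothetical name, and the two constants simply restate the prefixes mentioned above.

```python
from urllib.parse import urlparse

CGSPACE_PREFIX = "10568"            # prefix quoted on the slide
DSPACE_DEFAULT_PREFIX = "123456789"  # default prefix of an unconfigured DSpace


def split_handle(url):
    """Return (prefix, suffix) for a URL like http://hdl.handle.net/10568/67073."""
    path = urlparse(url).path.strip("/")
    prefix, _, suffix = path.partition("/")
    if not prefix or not suffix:
        raise ValueError(f"not a handle URL: {url!r}")
    return prefix, suffix


prefix, suffix = split_handle("http://hdl.handle.net/10568/67073")
print(prefix, suffix)
print("registered prefix in use:", prefix != DSPACE_DEFAULT_PREFIX)
```

A check like the last line is a quick way to notice a repository still running on the default prefix.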
![Page 13: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/13.jpg)
dc.identifier.uri specifies an item’s persistent Uniform Resource Identifier (URI)
![Page 14: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/14.jpg)
Getting data INTO DSpace
● Day-to-day submission is manual, by a small army of editors
● One-time batch uploads of items from other systems in CSV format (InMagic!)
● OAI-PMH for metadata only
● OAI-ORE for metadata + bitstreams (eg, from another DSpace or Sharepoint, etc)
● SWORD (haven't tried)
● REST API (DSpace 5+, haven't tried)
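Preparing a batch upload mostly means renaming the columns of a legacy export to metadata field names. The following is only a sketch of that step: the legacy column names, the `FIELD_MAP` mapping, and the sample row are all invented, and the exact column layout your DSpace version expects should be checked against its documentation.

```python
import csv
import io

# Hypothetical mapping from legacy export columns to metadata field names.
FIELD_MAP = {
    "Title": "dc.title",
    "Year": "dc.date.issued",
    "Subject": "cg.subject.ilri",
}


def convert_rows(legacy_csv_text):
    """Rewrite a legacy CSV export so its columns are named after metadata fields."""
    reader = csv.DictReader(io.StringIO(legacy_csv_text))
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=list(FIELD_MAP.values()), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Trim stray whitespace, a common problem in exports from older systems.
        writer.writerow({target: row[source].strip() for source, target in FIELD_MAP.items()})
    return output.getvalue()


legacy = "Title,Year,Subject\nExample report ,2015,LIVESTOCK\n"
converted = convert_rows(legacy)
print(converted)
```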
![Page 15: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/15.jpg)
Getting data OUT OF DSpace
● REST API for structured JSON or XML
● OAI-PMH for metadata
● OAI-ORE for metadata + bitstreams (PDFs, etc)
● RSS feeds for websites / blogs
● XML sitemaps for search engines*
*Google discontinued the use of OAI for discovering site content in 2008! http://googlewebmastercentral.blogspot.com/2008/04/retiring-support-for-oai-pmh-in.html
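As an illustration of the OAI-PMH route, the sketch below builds a ListRecords request URL and pulls titles out of a response. The base URL is a placeholder, the sample response is hand-written and trimmed to the bare minimum, and both helper functions are hypothetical. A real harvester would also need to fetch the URL and follow resumption tokens.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(oai_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(oai_xml)
    return [element.text for element in root.iter(f"{{{DC_NS}}}title")]


# Hand-written, heavily trimmed stand-in for a real response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url("https://repository.example.org/oai/request")
titles = extract_titles(SAMPLE_RESPONSE)
print(request_url)
print(titles)
```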
![Page 16: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/16.jpg)
CCAFS website, driven by Drupal + DSpace APIs
![Page 17: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/17.jpg)
“Latest outputs” on project blog populated via RSS, links to CGSpace
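A “latest outputs” box like this boils down to reading titles and links from a feed. The sketch below assumes an RSS 2.0 feed; the sample feed is hand-written with placeholder URLs, and `latest_outputs` is a hypothetical helper. A real site would fetch the feed over HTTP and cache it.

```python
import xml.etree.ElementTree as ET

# Hand-written RSS 2.0 sample with placeholder items.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Latest outputs</title>
    <item>
      <title>Example brief</title>
      <link>https://repository.example.org/handle/123456789/1</link>
    </item>
    <item>
      <title>Example poster</title>
      <link>https://repository.example.org/handle/123456789/2</link>
    </item>
  </channel>
</rss>"""


def latest_outputs(feed_xml, limit=5):
    """Return up to `limit` (title, link) pairs in feed order."""
    channel = ET.fromstring(feed_xml).find("channel")
    items = channel.findall("item")[:limit]
    return [(item.findtext("title"), item.findtext("link")) for item in items]


for title, link in latest_outputs(SAMPLE_FEED, limit=1):
    print(f"{title}: {link}")
```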
![Page 19: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/19.jpg)
Skills needed in your organization
Besides content people(!)...
● Prioritize Linux systems administration experience (Tomcat, httpd, PostgreSQL, DNS, SSH, git)
● General: computer science background
● Web developers a diverse bunch...
● Java development experience doesn't hurt
![Page 20: DSpace at ILRI: A semi-technical overview of “CGSpace”](https://reader036.vdocuments.us/reader036/viewer/2022062313/55b95774bb61eb94788b4823/html5/thumbnails/20.jpg)
Extra considerations
● Item mapping
● Maintenance tasks (background batch jobs)
● Backups of assetstore and PostgreSQL!
● Altmetrics tracks social media mentions
● Separate production / development environments
● CGSpace server is $80/month
● ~20GB of PDFs, ~8GB of Solr data