Exploring Map-Based Discovery Services in the Digital Library Environment


DESCRIPTION

Presented by Brianna Marshall at the 2012 Special Libraries Association Annual Conference.

TRANSCRIPT

Exploring Map-Based Discovery Services in the Digital Library Environment

1. Consultants vs. in-house mapping. Does your institution have a budget that enables you to consider hiring an outside service to do the mapping for you? Most do not, in which case you’ll need to rely on your department’s staff.

2. Gather a team. Project manager, researcher, programmer, etc. Maybe even an intern if you can wrangle one! While many of the first steps will be research-based, as the project progresses it is helpful to have people who can support each other and develop realistic workflows.

3. Open source? Does your institution have a preference on whether you use an open source tool to do your geocoding and/or mapping? For some, an institutional preference will determine which services are possibilities.

4. Take a hard look at the data you want to map. The ubiquity of GPS systems in our society can send the message that mapping large sets of data is stress-free. However, many people overlook the time it will take to clean up the data so that it is in mappable shape. Be realistic about what you are working with!

5. Try things out. See what happens. After mulling over the aforementioned considerations, start reading. Pick some tools and explore what works for your institution’s needs.
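The "hard look at the data" in step 4 can start with a quick script that counts how many records are even candidates for mapping. The records and field names below are hypothetical examples, not the DLP's actual metadata; the point is the audit pattern, which works with any field layout.

```python
# Quick audit of how mappable a set of item records is.
# These records and the "location" field name are made-up examples.
records = [
    {"title": "Courthouse, 1910", "location": "Bloomington, Indiana"},
    {"title": "Main Street parade", "location": ""},
    {"title": "Unidentified farm", "location": None},
    {"title": "Monroe County fair", "location": "Monroe County, Indiana"},
]

def has_location(record):
    """True if the record carries any non-empty location string."""
    value = record.get("location")
    return bool(value and value.strip())

mappable = [r for r in records if has_location(r)]
print(f"{len(mappable)} of {len(records)} records have location data")
```

Running a tally like this before picking tools tells you up front how much cleanup lies between you and a map.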

In May 2012 I began a 3-credit, 180-hour internship in the Indiana University Digital Library Program (DLP), where I have since been immersed in learning about ways that the DLP can incorporate mapping and discovery services for their online archival collections, Image Collections Online (ICO). As I wrap up my internship in the upcoming weeks, whether the DLP will choose to implement any of my findings remains to be seen. It is a process that requires energy and forethought, but looking back it is apparent to me that mapping an image-based collection is much more straightforward than I previously thought. On this poster I’ve included some aspects of the mapping process to consider if your institution is interested in utilizing map-based discovery services—I hope you find it helpful!

Mapping image-based collections tackles the more literal interpretation of mapping: figuring out where the photo was taken and mapping it to that location. Alternatively, images could be mapped according to what repository contains them, if applicable. Mapping text-based collections is less common, though it is becoming more and more prevalent as scholars take locations that are mentioned within a novel or historical document and map them to reveal new ways of thinking about the topic. With the rise of digital humanities scholarship, an interest in mapping has even spawned the term “spatial humanities.” In the case of the DLP, my co-intern has done research and prototyping on ways to map mentions of locations in the Indiana Magazine of History (IMH). In this instance, the mapped locations act as a discovery portal to the IMH just as the images I have mapped act as a discovery portal to ICO.

Research Experiment Implement

• Check out my internship blog, Info Apprentice, at: http://infoapprentice.wordpress.com

• Peruse the resources I read during my research on the Map-Based Discovery Service team’s Zotero bibliography, Map Interfaces: https://www.zotero.org/groups/map_interfaces

• Contact me at [email protected] with any questions you may have!

The data available to you will make or break your efforts. Depending on the size of the dataset you want to map, you will likely want to automate the process by running a script that will take your data and map it for you in batches. APIs are great for this purpose, so no worries there. However, your ability to use an API depends on the data you have. In order for most services to map items, you will need latitude and longitude coordinates or an address for each item. This could prove to be a significant challenge, depending on the level of detail in the metadata you have.
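The batch workflow described above can be sketched as follows. The `geocode` function here is a stand-in stub with a made-up lookup table; a real script would call a service such as the Google Geocoding API at that point and parse latitude and longitude out of its JSON response.

```python
import time

# Stand-in geocoder: a real version would query a web service here.
# The lookup table is a hypothetical stub so the batching logic runs.
def geocode(address):
    known = {
        "Bloomington, Indiana": (39.1653, -86.5264),
        "Indianapolis, Indiana": (39.7684, -86.1581),
    }
    return known.get(address)  # None when the address can't be resolved

def geocode_batch(addresses, pause=0.0):
    """Geocode a list of addresses, pausing between calls because most
    free geocoding APIs enforce rate limits."""
    results = {}
    for address in addresses:
        results[address] = geocode(address)
        time.sleep(pause)
    return results

coords = geocode_batch(["Bloomington, Indiana", "Nowhere, XX"])
```

Keeping the unresolved addresses (the `None` results) is deliberate: that leftover list is exactly the metadata you will need to clean up by hand.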

Things to consider

• Do you have data for the items you want to map, and if so, does this data contain locations?

• If yes, are the locations within structured or unstructured fields?

• How granular is the location-related metadata? Is it at the county level, city level, or street level?

Incomplete metadata doesn’t necessarily mean that an item isn’t mappable. One solution for less-than-stellar metadata is geoparsing: geoparsers are natural language processors capable of plucking place names from unstructured text. These place names are then resolved against a gazetteer. If geoparsing is not the way your institution wants to go, you can clean up your data manually, or there are services available that will clean your data for a fee. In the case of the DLP, ICO contains a multitude of collections, each of which has had different catalogers, leading to vastly different fields and levels of completion. Because of these many data issues, one option the DLP is strongly considering is to create guidelines to pass on to the individual collections detailing the information needed to create maps. This way, rather than mapping being a service the DLP provides directly, moving forward the catalogers can invest their time in creating mappable metadata if their collection values that service. This negates the need for the DLP to retroactively clean up massive quantities of unhelpful metadata. Of course, this precise scenario is only directly applicable to a digital library setting; smaller collections interested in mapping could also consider altering their cataloging workflow and map only newly created records.
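The geoparsing idea can be illustrated in miniature. Real geoparsers use trained language models and full gazetteers such as GeoNames; the toy version below just pulls capitalized words out of free text and resolves them against a small, made-up gazetteer, which is enough to show the extract-then-resolve pattern.

```python
import re

# Toy geoparser: extract capitalized words from unstructured text and
# resolve them against a gazetteer. This gazetteer is a made-up sample;
# production geoparsers use NLP models and resources like GeoNames.
GAZETTEER = {
    "Bloomington": (39.1653, -86.5264),
    "Indianapolis": (39.7684, -86.1581),
}

def geoparse(text):
    """Return (place name, coordinates) pairs for gazetteer matches."""
    candidates = re.findall(r"\b[A-Z][a-z]+\b", text)
    return [(name, GAZETTEER[name]) for name in candidates if name in GAZETTEER]

places = geoparse("The photo shows a parade in Bloomington before the fair.")
```

The gazetteer step is what turns a mere string match into mappable coordinates, and it is also where ambiguity creeps in (which Bloomington?), which is why real geoparsers add disambiguation logic.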

• Google Geocoding API

• Yahoo! Maps Geocoding API

• Geocoder.us

• GPS Visualizer

• BatchGeo