Putting the “where” in the archives: Internet mapping and archival records
As keepers of recorded and artifactual history, archival repositories provide communities with the raw materials to support collective memory and create an effective “sense of place.” Part of this requires exposing the underlying geographical locations whose history is documented by archival records. But traditional archival principles of arrangement and description primarily emphasize provenance, respect des fonds, and temporal organization rather than the spatial aspects of records. Internet-based GIS tools such as Google Maps and Google Earth offer opportunities for archives to present records in new and exciting ways, and can help better connect archives to the communities they serve.
Archives play a key role in the creation and maintenance of a sense of place. As keepers of recorded and artifactual history, archival repositories provide communities with the raw materials to support the collective memory of the past. But a sense of place encompasses more than just recorded history. It’s not just that “something happened”; it’s that “something happened here” — in this particular location. Furthermore, the idea of a sense of place implies that we can see the parallels and connections between what happened in the past and the location we now inhabit. As David Glassberg (1998) has observed, “Historical consciousness and place consciousness are inextricably intertwined. We attach histories to places, and the environmental value attached to a place comes largely from the memories and historical associations we have with it.”
The deep, intrinsic connections between places and the historical record will become apparent to anyone observing the contemporary American landscape. Historical signs, markers, and monuments can be found in virtually every community in the country, marking not just that something happened, but also the geographical location where it occurred. As J.J. Prats (2007), creator of online historical marker database hmdb.org, has noted, “National and global events all happened somewhere....But the richness of history is in its local details, details that can be insignificant on the global stage: the home of an individual who made a difference, a natural feature, building, byway, or something interesting that happened nearby. History is not just about the high and mighty.” Tourists travel to historic sites not because it’s the only way to learn history, but because they are able to connect much more readily with the events of the past if they feel that they can experience the landscape in which the events occurred.
This search for historical consciousness is not limited to those far from home. People who live, work, and play in a given location are equally likely to develop interest in its past, albeit in different ways. As Glassberg (1998) points out, “By and large tourists look for novelty in landscape, what is not back home, while local residents look at the landscape as a web of memory sites and social interactions.” Geography, then, is an important part of historical consciousness. Because archives are institutions concerned with preserving the recorded past and making it available, it follows that they should have some level of concern about how their records relate to the underlying geography of the places they document. I would argue, however, that the geographic (spatial) aspects of archival records have received scant attention in traditional methodologies of archival arrangement and description, and that embracing these aspects can suggest new ways for archives to make their collections visible, useful, and relevant to researchers and the general public.
While archivists have always done an excellent job of acquiring, arranging, and describing materials that document the past, they have done relatively little to help reveal the deep connections between these materials and the places they document. The fact is, virtually every record held by an archive relates somehow to one or more places.
Traditional archival organization has focused on the centrality of provenance and respect des fonds — the idea that the creator of the records and his/her groupings represent the critical aspects of archival organization. These factors are certainly important, and they play a major role in the process of historical research. Furthermore, in an era before computerized indexing, organizing in this way helped ensure that materials would remain comprehensible and findable when removed from their original contexts. Archivists had to choose a single organizational pattern that would provide the best and most comprehensible access to collections. While there existed card file indexes and other retrieval tools that provided other methodologies to access collections, these tools were labor intensive to compile and somewhat cumbersome to use. Library classification techniques have been applied only minimally to archival records.
To the degree that archivists have moved beyond principles of provenance and original order, their efforts have often focused on temporal characteristics of records. The finding aid for a collection usually contains both bulk and inclusive dates, and folders and series are often labeled by date. In the absence of any sort of original order, archivists often default to a chronological ordering of materials. The temporal focus also extends to archival context. Historical/biographical timelines have long been a fixture in many archival finding aids, and this convention is explicitly included in the XML schema for Encoded Archival Description.
The archival profession, however, has not entirely come to grips with the need for spatial retrieval — that is, the ability to retrieve records based on the places they document rather than the identity of the creator. In her 2002 article “The Death of the Fonds and the Resurrection of Provenance: Archival Context in Space and Time,” Laura Millar argues that overreliance on the traditional archival principle of respect des fonds has resulted in fragmented, incomprehensible collections that make it difficult for researchers to reconstruct the true context of records. Instead, she argues, archivists should observe the importance of temporal and spatial information in the fields of archaeology and museum studies, and should consider how these concepts can help provide meaning to records. “The archaeologist protects the information that defines the piece of pottery in time and space. The curator also traces the movements of a work of art over time and space in order to ensure its integrity. But the archaeologist does not declare unilaterally that one fragment of pottery is a chamber pot. And the curator does not pretend that one painting is the sum total of Picasso’s work. Rather than pretend we have the fonds, archivists should explain what we have in hand, explain its temporal and spatial history, and let users create the linkages and so establish their own definition of the ‘whole.’”
Until archives develop improved ways to support spatial retrieval of archival records, they are crippled in their ability to help researchers understand the complex relationship between records and the places they relate to. Insufficient spatial tools also limit the ability of archives to tie their own collections to the unique communities in which they operate — a crucial aspect of archival outreach.
Part of the reason archives have historically done relatively little to support the use of spatial information has been the sheer complexity of this type of project. The broad provenance of a collection can generally be determined with a cursory glance at its accession records and folder titles. Even dates can easily be culled from a pile of documents and can be represented as ranges. (Again, the convention has been codified into archival standards like DACS and EAD, which support indicating the inclusive dates and bulk dates of collections.)
But documenting spatial data within a collection can be harder. Places are harder to pick out, and they are less easily represented as a simple range that can be reflected in a folder title or finding aid.
Even if the places can be identified, effective retrieval based on places has historically been difficult: merely including an LCSH-style listing of geographical names in a finding aid has done little to help researchers locate and relate spatial confluences between items in diverse collections.
Archaeologists, whose work ties them much more closely to particular sites, have been leaders in applying geographical analysis techniques to historical data. As the authors of a 2006 study observed, “...visualization of archaeological information is one of the most exciting ways in which computer technology can be employed in archaeology… [These techniques allow] visual interpretation of data through representation, modeling, display of solids, surfaces, properties or animation, what is rarely amenable to traditional paper publication” (Meyer et al., 2006). The fact that archival records are often found in attics and basements rather than underground should in no way inhibit the application of the same exciting technologies to archival information retrieval.
A Geographical Information System (GIS) is a computer system capable of capturing, storing, analyzing, and displaying geographically referenced information; that is, data identified according to location. The power of a GIS comes from the ability to relate different information in a spatial context and to reach a conclusion about this relationship (USGS, 2007). The results are particularly striking when applied to data from otherwise unrelated sources. As Loren Siebert (2000) puts it, “Geographic information systems are designed to record spatial features and related information, display them, and analyze their conditions and spatial relationships. These capacities enable spatial historical research and extend its analytical reach.” Investigative journalists have made good use of these techniques to craft stories about topics like crime, poverty, and other important issues — analyzing quantitative information to identify geographic patterns.
GIS use in the humanities has been somewhat more sporadic, partly because of the high entry barriers posed by expensive software with steep learning curves. Archivists and historians are not always on the bleeding edge of the technological adoption curve. That said, the idea of using GIS tools to facilitate historical scholarship and archival information retrieval is not new: a number of notable examples were documented by Ian Gregory, Karen Kemp, and Ruth Mostern (2003). All of these projects, however, made use of relatively complex GIS applications that would be out of the reach of most small archival repositories. Steven Morris (2006) provides a good overview of some of the complexities of geospatial implementations in academic library settings.
But GIS techniques need not be limited to large–scale statistical data sets crunched by math wizards. A prime example is an interactive map produced by the September 11 Digital Archive (http://911digitalarchive.org), a digital library created as a joint venture between George Mason University and the City University of New York. This archive is composed of electronic materials gathered after the September 11 attacks. A recently added feature highlights photographs and textual accounts documenting what was happening at 9:00 a.m. on the day of the attacks. The materials from this time period are plotted onto a map of Manhattan, graphically showing where and when each account was generated. This small example hints at the powerful ways archival data can be “remixed” and made accessible when appropriate metadata is available.1 (See Figure 1 and Figure 2.)
The problem with “traditional” GIS is that it remains the domain of specialists. Many GIS projects make use of advanced software custom created for particular applications. Even standardized programs like ArcView from ESRI can be very expensive, and often have steep learning curves. The complexity of the software has necessitated specialization to the point where GIS experts are generally bought, not made. Few librarians and archivists (with the possible exception of map librarians) have the time to develop and maintain expertise on this type of complex and highly specialized software.
When Martyn Jessop described the application of GIS to a project documenting forced migrations in Macedonia in 2005, he noted the emergence of “a new style of digital projects in the humanities that make use of image, spatial database and text-based technologies.” However, he still made use of a “traditional” GIS tool (MapInfo), and noted that the complexity of such systems was still a perceived entry barrier. Hill (2004) offered a similar diagnosis, noting that “geospatial access to the information in digital libraries remains an underdeveloped capability today, too often perceived to be a capability exclusively associated with GIS or with special collections that hold geospatial objects such as maps and aerial photographs.” Contributing to this perception, she observed, were “the technological and intellectual challenges of integrating spatial representation and access into basic digital library practices and the extra dimensions of managing and using geospatial resources.”
However, in the past few years, things have begun to change. The availability of web–based mapping tools has, in a very short time, dramatically reduced entry barriers for those interested in using GIS technologies. Since 2005, we have begun to see free GIS tools emerge on the web, led by Google’s twin products, Google Maps and Google Earth. Microsoft offers a similar service in its Virtual Earth toolkit.
Part of this change is due to a new philosophy in online systems development, which has been nicknamed the “Web 2.0 phenomenon” (O’Reilly, 2005). Although various commentators have quibbled over its precise definition, Web 2.0 technologies generally include some or all of the following attributes:
Google’s mapping offerings are particularly representative of these trends. Both Google Earth and Google Maps make use of an XML format known as KML, which allows users to import and export geographical data sets. Although it includes many advanced capabilities, the basic KML data format is extraordinarily simple. Figure 3 presents a simple KML file that places two points in the Fenway neighborhood of Boston, Massachusetts:
The example above includes two “placemarks,” or markers, associated with certain points on the map: one corresponding to Simmons College, and the other to the Museum of Fine Arts, both located on the Fenway. The coordinates (the long decimal numbers between the <coordinates> tags) are expressed as longitude and latitude, the same coordinate system that has been used by global navigators for centuries. Note that KML lists longitude first and latitude second, the reverse of the order in which the pair is usually spoken.
When loaded into Google Earth, the document above accurately locates the two institutions with place markers that can be clicked on for more information. (See Figure 4.)
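A file of this kind does not have to be typed by hand. The following is a minimal sketch, in Python, of how two placemarks might be written out as KML using only the standard library. The coordinates are approximate values chosen for illustration and are not taken from Figure 3, and the function name `build_kml` is my own.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = "http://www.opengis.net/kml/2.2"

# Approximate, illustrative coordinates as (name, latitude, longitude).
# These are assumptions for the sketch, not values from the article's figure.
PLACES = [
    ("Simmons College", 42.3393, -71.1000),
    ("Museum of Fine Arts", 42.3394, -71.0940),
]


def build_kml(places):
    """Return a KML document (as a string) with one Placemark per place."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in places:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude, then altitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    print(build_kml(PLACES))
```

Saving the printed output to a file with a .kml extension should produce something that Google Earth can open, although I have not tested every viewer.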
It is relatively easy to generate KML files from existing data sets — if you happen to have databases that include latitudes and longitudes. But the fact is that most archival databases and digital libraries don’t currently record this type of information. When location data is recorded at all, it is more likely in the form of traditional street addresses, which must somehow be “geocoded” (converted to latitude and longitude) before they can be used for interesting mapping projects.
However, Google has provided alternatives. These include, in increasing order of complexity:
All of these techniques are far more accessible than traditional GIS solutions, and make it possible for smaller, less technical institutions to take advantage of web–based mapping. As a result, there has been an explosion of interactive maps over the past few years, documenting everything from available apartments2 to bike routes3 to significant locations in Beatles history.4 (See Figure 6.) HBO even created a sophisticated Google Maps mashup that linked the fictional storyline of its popular series The Sopranos to real locations in the New York City area5 (Miller, 2006). (See Figure 7.) The same techniques that enabled this explosion of low–cost, high–quality mapping can be used by libraries and archives to improve the presentation and accessibility of their collections, propagating the vision of “geolibraries” pioneered by the Alexandria Digital Library Project in the late 1990s (Goodchild, 2004). Experimentation in this area is ongoing. For example, the Kingston Frontenac Public Library in eastern Ontario developed a conceptual implementation that used Library of Congress subject headings and Google Maps to enable country-based browsing of a library OPAC (Vandenberg, 2008).
While it is easy to create annotated maps using tools like Google Earth and Google Maps, as long as archivists or catalogers must manually encode the geographical metadata needed to support them, the technology will mainly be used for special exhibits or high–end digital libraries. But the ability of this technology to help researchers and the general public understand records in new ways should point to future directions for the archival profession. Spatial data can add extraordinary value to archival records — and future refinements to descriptive practice should take this into account.
Imagine, for example, a collection management system in which an archivist cataloging items or series within a collection could open a pop–up map and make annotations about the physical locations relevant to the records being described. This data would then be stored in the system using latitude and longitude coordinates. As this spatial metadata accumulated at various repositories, it could be harvested into centralized databases (enhanced versions of OAIster, Archives USA or ArchiveGrid, for example), which could in turn use it to populate searchable union maps of archival holdings. This type of metadata might reflect the provenance of the records (where they were found or created), or it might reflect the content of the records (locations that appear in them). In either case, having this metadata available and discoverable across collections would make them far more accessible, and would promote new types of collaboration among archival institutions — for example, the creation of map-based “virtual collections” around holdings with a shared geography.
It may be instructive to imagine the application of this technology to a particular research question. Currently, if I want to find records related to the built environment of Scollay Square in Boston, I must first work backward and consider the provenance and custody of the records I might be seeking. What organizations or individuals might have created records about this area? What archives might now hold these records? This can be bewildering, especially since Scollay Square as a distinct neighborhood met its demise during the urban renewal craze of the mid–20th century. (Controlled vocabulary and name authorities can help with this sort of thing, but ultimately geographic coordinates are a less ambiguous way of identifying a particular place.)
If spatial data from archival collections were routinely collected and aggregated, I could conceivably pan across a map of Boston and see groups of potentially relevant records by simply finding the location formerly known as Scollay Square. The geographic access points would also alleviate the problems caused by shifting names — I would equally be able to find records whose main access point was Government Center, the present name of this area.
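To make the idea concrete, here is a small sketch of coordinate-based retrieval: given a point on the map, it returns every record whose stored coordinates fall within a chosen radius, whatever place name the record happens to carry. The `Record` structure, the sample records, and the coordinates are all hypothetical and approximate. The distance calculation uses the standard haversine formula.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Record:
    """A hypothetical catalog entry carrying a single point location."""
    title: str
    place_name: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def records_near(records, lat, lon, radius_km):
    """Return the records located within radius_km of the given point."""
    return [r for r in records
            if haversine_km(lat, lon, r.lat, r.lon) <= radius_km]


# Invented sample data with approximate coordinates.
RECORDS = [
    Record("Theater playbills", "Scollay Square", 42.3594, -71.0595),
    Record("Urban renewal plans", "Government Center", 42.3597, -71.0592),
    Record("Ballpark photographs", "Fenway", 42.3467, -71.0972),
]

if __name__ == "__main__":
    # Search within half a kilometer of a point in present-day Government Center.
    for record in records_near(RECORDS, 42.3595, -71.0594, 0.5):
        print(record.place_name, "-", record.title)
```

In this sketch the records labeled “Scollay Square” and “Government Center” are both returned by the same search, because the match is made on location rather than on the name. A production system would more likely rely on a spatial index than on a linear scan.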
Creating this type of advanced geographic data would clearly increase the time needed to process or digitize collections. But we should not assume that the spatial metadata needed to drive such an advanced information retrieval system must necessarily be hand–entered by archivists. In fact, several factors point to the likelihood that this type of data will become more organically accessible in the future.
The first is the increasing percentage of “born digital” records being accessioned by archival repositories. Because they were originally designed to be read by automated systems, these records are far more likely to have pre–existing computer–readable metadata that can be repurposed by archivists to aid in information retrieval. Some of this information corresponds directly to traditional access points for archival information — for example, the modification time of a file relates to the creation date of records, and size is just another form of extent. But technology is also enabling new forms of metadata — with the most relevant to our discussion being the automatic incorporation of latitude and longitude.
The satellite–based global positioning system has made accurate locational coordinates ubiquitous — virtually all new cellphones have built–in location sensors, and cameras are also coming on the market that support automatic tagging of photos with GPS location coordinates.6 In fact, as early as 1999, researchers at MIT were experimenting with GPS–enabled cameras to allow students to take pictures in their community and then retrieve historical images of the same location (Smith et al., 1999). Even if equipment doesn’t automatically add GPS coordinates, the growth of mapping services has made it far easier for creators of all types to create their own georeferenced records. For example, Yahoo’s Flickr photo sharing service allows users to place map markers where individual photos were taken, and to view these “geotags” placed by other users.7 (See Figure 8.)
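Coordinates embedded by cameras are commonly stored as degrees, minutes, and seconds together with a hemisphere letter, whereas mapping formats such as KML expect signed decimal degrees. The conversion is simple arithmetic, sketched below. The function name `dms_to_decimal` is my own, and the sample values are invented rather than read from a real photograph.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    ref is "N" or "S" for latitude, "E" or "W" for longitude.
    Southern and western values come out negative.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


if __name__ == "__main__":
    # Invented example: 42 deg 21' 36" N, 71 deg 3' 36" W (downtown Boston, roughly).
    lat = dms_to_decimal(42, 21, 36, "N")
    lon = dms_to_decimal(71, 3, 36, "W")
    print(round(lat, 4), round(lon, 4))
```

Reading the raw values out of an image file requires an image metadata library, which I have left out of the sketch to keep it self-contained.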
As a result of these new technological capabilities, archivists should begin to see more records coming through the door with spatial coordinates already attached. It is important that they are prepared to respond with information retrieval systems that understand this data and make use of it in an intelligent way.
But what of the masses of archival collections that have entered repositories without this type of enhancement? Technology may again provide some alternatives to manual keying. As more and more collections are digitized, it becomes possible to develop computer algorithms to analyze the digitized works and automatically pick out meaningful data for information retrieval. For example, the Perseus Digital Library project at Tufts University has successfully used natural language processing techniques to extract locational information from a wide variety of digitized sources (Crane, 2004). Similarly, Lesbegueries, Gaio, and Loustau (2006) describe a system that is able to pick out “geographic features” from a diverse collection of archival documents (including maps, text, and graphics), and then use linguistic and mathematical techniques to analyze their relationship and develop relevant geographical metadata. The authors observe that traditional library databases do not generally offer the indexing or searching sophistication needed for effective use of geographical limiters. Their solution, which attempts to build this capability programmatically, is perhaps futuristic, but it points toward the possibilities of mining newly digitized collections to extract data that would otherwise have been unavailable in a purely analog world.
While coordinate–based georeferences are the lingua franca of GIS systems, they don’t replace the need to consider traditional place names. As Janée, Frew, and Hill (2004) observe, “...human spatial cognition relies on relationships to and among known features, and to the extent that those features are named, to place–names.” Thus a fully featured information retrieval system must have the ability to understand place–names and map them onto the coordinate system used by GIS applications like Google Earth. A number of researchers, including those working on the Alexandria Digital Library Project (Hill, Frew, & Zheng, 1999) and the Electronic Cultural Atlas Initiative (Buckland & Lancaster, 2004), have explored ways of using electronic gazetteers to help enhance the cataloging and discovery of records. These databases of geographical information allow computers to cross–reference traditional place–names found in records with other related places, and with the underlying coordinate system, resulting in improved resource discovery.
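The cross-referencing role of a gazetteer can be illustrated with a toy example. The sketch below is not the method used by any of the projects cited above. It uses plain string matching rather than natural language processing, and the gazetteer entries and coordinates are hypothetical and approximate. It shows only the core idea: variant names for the same place resolve to a single set of coordinates.

```python
import re

# Hypothetical toy gazetteer: each variant name maps to
# (preferred label, latitude, longitude). Coordinates are approximate.
GOVERNMENT_CENTER = ("Government Center (formerly Scollay Square)", 42.3594, -71.0594)
GAZETTEER = {
    "scollay square": GOVERNMENT_CENTER,
    "government center": GOVERNMENT_CENTER,
    "the fenway": ("The Fenway", 42.3420, -71.0950),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return {preferred label: (lat, lon)} for gazetteer names found in text."""
    found = {}
    lowered = text.lower()
    for variant, (label, lat, lon) in gazetteer.items():
        # Match whole words only, so short names do not fire inside longer ones.
        if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
            found[label] = (lat, lon)
    return found


if __name__ == "__main__":
    description = ("Photographs of burlesque theaters in Scollay Square, "
                   "later cleared for Government Center.")
    print(find_places(description))
```

Both names in the sample description resolve to the same entry, so a record described under either name would appear at the same point on a map. Real gazetteers must also handle ambiguous names, such as the many towns called Springfield, which this sketch ignores.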
In addition to the automated solutions outlined above, archives might also consider another Web 2.0–oriented solution: giving users the ability to “tag” existing digital collections with enhanced metadata. For example, in a talk at Simmons College8, Casey Bisson described the use of his Scriblio software to run Beyond Brown Paper, a digital library created by Plymouth State University. The site documents the activities of a now-closed paper mill in Berlin, New Hampshire. The archive has found that former employees and others with a detailed knowledge of the company have eagerly logged on and begun annotating the under–documented photographic collection. Beyond their informational value, the comments allow researchers to observe how others have interacted with the records. There is no reason why a similar approach could not be used for geographic metadata, allowing site visitors to pinpoint relevant locations on a map, which would then aid other visitors in locating relevant materials for a given location. This type of collaborative knowledge sharing is a new and potentially fruitful avenue for enhancing the vitality of archival collections in a digital world.
Just as in other professions, digital technologies are rapidly permeating every aspect of the archival field. To some degree, technology simply extends activities that archivists have always undertaken, allowing them, for example, to make “flat” finding aids available online or to answer reference queries by e–mail. But to truly take advantage of new technologies, archivists must consider the implications of new capabilities and weigh the changing expectations of users.
The online availability of archival collections means that they are no longer the exclusive domain of specialists who can spend days poring over finding aids. Online archives allow the general public to interact with history in new ways, and can attract nontraditional users to existing collections. To serve these new users, archivists must help them discover the real connections between the records and their own lives. Highlighting the places documented by their records is one particularly effective way of demonstrating this relevance. David Glassberg (1998) has observed that cultural resource managers “...can help residents and visitors alike see what ordinarily cannot be seen: both the memories attached to places and the larger social and economic processes that shaped how the places were made.” I would argue that this assertion is as true for those charged with historical records as with historical sites. Web 2.0 GIS tools like Google Maps are an accessible way for archivists to better present the spatial aspects of their collections, making it easier for communities of users to discover and utilize records of a place in new and different ways. These developments point toward a future where archival users can browse historical documentation as easily as they can seek out a new apartment.
1 Roy Rosenzweig and Dan Cohen of the Center for History and New Media at George Mason University presented on this and other projects at a fascinating SAA 2006 session. My detailed notes from the session are available online at http://gslis.simmons.edu/blogs/dispatches/2006/08/04/possibilities-and-problems-of-digital-history-and-digital-collections/
7 See, for example, http://www.flickr.com/photos/dwig/map/
8 Bisson, Casey and Rancourt, Lichen. (2007). “What is Scriblio?” A talk presented by the Simmons College Chapter of ASIS&T. Podcast and notes available online at http://gslis.simmons.edu/podcasts/index.php?id=23
Buckland, M., & Lancaster, L. (2004). Combining place, time, and topic. The Electronic Cultural Atlas Initiative. D-Lib Magazine 10(5). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/may04/buckland/05buckland.html
Crane, G. (2004). Georeferencing in historical collections. D-Lib Magazine 10(5). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/may04/crane/05crane.html
Goodchild, M. (2004). The Alexandria Digital Library Project: Review, assessment, and prospects. D-Lib Magazine 10(5). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/may04/goodchild/05goodchild.html
Gregory, I., Kemp, K., & Mostern, R. (2003). Geographical information and historical research: Current progress and future directions. Humanities and Computing 13:7-22. Retrieved on 4/28/2007 from http://www.institute.redlands.edu/kemp/review/humcomputing.pdf
Hill, L. (2004). Guest editorial: Georeferencing in digital libraries. D-Lib Magazine 10(5). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/may04/hill/05hill.html
Hill, L., Frew, J., & Zheng, Q. (1999). Geographic names: The implementation of a gazetteer in a georeferenced digital library. D-Lib Magazine 5(1). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/january99/hill/01hill.html
Janée, G., Frew, J., & Hill, L. (2004). Issues in georeferenced digital libraries. D-Lib Magazine 10(5). Retrieved on 5/24/2007 from http://www.dlib.org/dlib/may04/janee/05janee.html
Meyer, E., Grussenmeyer, P., Perri, J., Durand, A., & Drap, P. (2006). Intra-site level cultural heritage documentation: Combination of survey, modeling and imagery data in a Web information system. The 7th International Symposium on Virtual Reality, Archaeology and Cultural Heritage VAST.
Miller, L. (2006, February 20). Can't remember who whacked whom? Just check the map on the Web site. The New York Times. Retrieved from http://www.nytimes.com/2006/02/20/technology/20google.html?_r=1
O’Reilly, T. (2005, Sept. 30). What is Web 2.0: Design patterns and business models for the next generation of software. Retrieved on 4/29/2007 from http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?page=1
Prats, J.J. (2007). History happened here. Retrieved on 4/27/2007 from http://www.hmdb.org
USGS (2007). Geographic Information Systems. Retrieved on 4/27/2007 from http://erg.usgs.gov/isb/pubs/gis_poster/
David Dwiggins received an MLIS with a concentration in Archives Management from Simmons College in 2008, and also holds an MS in Technology Management from the University of Maryland University College. He is currently pursuing an MA in History from Simmons, and works as Systems Librarian/Archivist at Historic New England.
Copyright, 2013 Library Student Journal