About

This website is a demonstration of a digital public portal for archaeological information in Virginia from the Virginia Department of Historic Resources. The goal of this site is to provide an open portal to archaeology in Virginia to both researchers and to the general public. This space is growing and changing. Check back frequently.

Archaeological reports and digital media visible here are hosted in a KORA digital repository. Site mapping links to records from the Digital Index of North American Archaeology.

It is currently created and maintained by Jolene Smith as a product of participation in the Institute on Digital Archaeology Method and Practice. To read more about project development see the Institute blog posts or read the project documentation below.

The archaeological sites selected for this project are considered “high interest and low risk.” In other words, the images and reports are interesting and the sites are either destroyed or well known.

Full project documentation

Jolene L.U. Smith

Virginia Department of Historic Resources

Archaeology for Everyone: A Public Digital Repository

White Paper for the Institute on Digital Archaeology Method and Practice

August 29, 2016


Virginia Archaeology for Everyone is a portal to openly accessible archaeology information, geared toward the general public. It is also a proof-of-concept digital repository, with digital objects and full archival metadata for preservation and research.

Background

The Virginia Department of Historic Resources is the State Historic Preservation Office, and the archive for records of around 44,000 archaeological sites and accompanying collections. While our agency launched a secure web database for our site documentation in 2013, managing and preserving accompanying digital media remains a pressing need. As a state agency curating records created with public funds or produced as a result of federal undertakings, we have both ethical and policy-based obligations to properly archive digital archaeology materials. This project, in part, lays the groundwork for paths forward in digital archiving.

Another major goal of this project was to repackage and re-present often technical archaeology information in a form that is interesting and accessible to the general public. Our agency has some public archaeology resources on the web, but they are hard to find and beginning to show their age through design and technology. Anecdotally, when participating in public events, I’ve found that many people outside of the historic preservation field aren’t aware that our agency exists or that there is so much incredible and active archaeology throughout the state. It’s a surprise to them, although they are almost always fascinated by what they learn and want to know more. By increasing public awareness, we encourage stewardship for the Commonwealth’s irreplaceable archaeological resources as well as public support for our agency’s mission.

It was clear to me from the beginning that completely satisfying these objectives was beyond the scope and timeframe of the Institute, so I chose to define this as a proof-of-concept project. Framing it as such also allowed me to experiment with technologies and applications that might be challenging to implement officially within the stringent information technology and security environment of a state agency. Essentially, this project is a demonstration of what the agency could do and, ideally, adapt into something more permanent and extensive in the future.

Workflow

Work began in earnest with high-level planning by presenting my “pitch” for a public web portal to agency colleagues in September 2015. Initial attention was on planning for the repository backend and identifying the kinds of metadata necessary to collect based on the existing database structure while also integrating research on digital library best practices. I supplemented the wealth of knowledge gained through prerequisite preparation and the 2015 Institute meeting with online courses and self-directed learning. Content focused around ~10 archaeological reports and their accompanying digital media to begin. This allowed for some geographic and content variety, while being realistic for this small-scale project.

Developing a metadata schema was a much more challenging process than I had anticipated. As I ventured down the planning path, I began to realize how many important questions about access, intellectual property, and ethics of withholding versus releasing archaeological information still remained. From a practical standpoint, I had already decided to focus my project on “high interest, low risk” sites, meaning they are either publicly known or have been destroyed, but that didn’t speak to IP or traditional knowledge issues.

I found myself working backwards to learn some of these skills. While I knew the kinds of metadata fields I wanted to have generally, I had to take some time to learn some digital library science basics in order to understand how to implement Dublin Core standards in combination with our agency’s legacy metadata. Input from staff at the Library of Virginia was invaluable.
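As an illustration of the kind of crosswalk described above, the following is a minimal Python sketch, not the mapping actually used in KORA. The legacy field names (such as site_number and report_title) and the sample record are hypothetical placeholders; the target names are elements from the Dublin Core Metadata Element Set.

```python
# Hypothetical crosswalk from legacy database fields to Dublin Core elements.
# The legacy field names are illustrative assumptions, not the agency's schema.
CROSSWALK = {
    "site_number": "identifier",
    "report_title": "title",
    "report_author": "creator",
    "report_year": "date",
    "county": "coverage",
    "copyright_note": "rights",
}


def to_dublin_core(legacy_record):
    """Return (mapped, unmapped): Dublin Core values and leftover legacy fields."""
    mapped = {}
    unmapped = {}
    for field, value in legacy_record.items():
        if value is None or str(value).strip() == "":
            continue  # skip empty legacy values
        element = CROSSWALK.get(field)
        if element is None:
            unmapped[field] = value  # keep for review rather than dropping
        else:
            # Dublin Core elements are repeatable, so collect values in lists.
            mapped.setdefault(element, []).append(str(value).strip())
    return mapped, unmapped


# Made-up record for demonstration only.
example = {
    "site_number": "EXAMPLE-0001",
    "report_title": "Example Excavation Report",
    "report_author": "A. Archaeologist",
    "report_year": 1985,
    "county": "Example County",
    "survey_notes": "legacy-only field",
}
dc, leftover = to_dublin_core(example)
```

Returning the unmapped fields separately makes it easy to see which legacy attributes have no obvious Dublin Core home and need a local decision.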

In addition to perceived low-sensitivity status as described above, “showcase” projects for the site were chosen because a) the copyright/distribution rights were clear enough, b) the sites themselves were interesting beyond a professional level, meaning there was a lot of material culture to look at or compelling histories in the research, c) the sites were distributed to some degree across Virginia, and d) there was variety in cultural and time period representation.

An unexpected and very important issue arose surrounding cultural representation and sensitivity. One of the biggest challenges of this project, as well as broader online archaeological outreach efforts in general, concerns the tension between openness and representation on one hand and cultural sensitivity and security on the other. In Virginia, records of Virginia Indian human remains (and the sites that contain them) are treated with extra care and sensitivity. Since individuals were often buried near or within dwelling areas, this means that information about pre-Colonial towns, villages, and settlements might be too sensitive to release. But this creates a paradox. Withholding information about these vibrant people and places perpetuates colonial erasure. If the “safe” sites are European, what about other cultures in Virginia? I was able to find two good candidates to feature in the collection that do not seem to trigger these issues. But we won’t be able to avoid confronting these ideas for much longer. I plan to devote significant consideration to this important topic in the future, as it extends well beyond the scope of the Institute.

With collections reasonably settled, I began to gather digital objects and ingest records into KORA. Although we had many of these materials scanned already, I found as I began processing files that a significant portion needed to be re-scanned. Text was not properly recognized, sections were missing from reports, and images were not scanned at adequate resolution. This was a very important discovery, especially as my agency moves forward with larger digital archives initiatives. Our existing digital collection will need to be quality checked and possibly re-processed in the future.

I also used Tabula and OpenRefine to extract and clean tables of artifact data from PDF reports, in order to allow for machine readability and data reuse in the future. This very small element of my Institute project is one of the most important facets with broad implications for Virginia archaeology data in the future. Tables were uploaded to the repository as comma separated value (CSV) files along with reports and other documents.
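The cleanup itself was done interactively in Tabula and OpenRefine. Purely as an illustration of the kind of tidying involved, here is a minimal Python sketch using only the standard library. The sample rows and the clean_table function are made up for the example; it assumes the extracted table has a header row that may repeat on later pages.

```python
import csv
import io


def clean_table(raw_csv_text):
    """Tidy an extracted table: trim whitespace, drop blank rows and repeated headers."""
    rows = list(csv.reader(io.StringIO(raw_csv_text)))
    cleaned = []
    header = None
    for row in rows:
        # Collapse runs of whitespace and strip the ends of every cell.
        cells = [" ".join(cell.split()) for cell in row]
        if not any(cells):
            continue  # blank line left over from page layout
        if header is None:
            header = cells  # first non-blank row is treated as the header
            cleaned.append(cells)
        elif cells == header:
            continue  # header repeated at the top of a later page
        else:
            cleaned.append(cells)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(cleaned)
    return out.getvalue()


# Made-up extraction output for demonstration only.
raw = (
    "Artifact Type , Count\n"
    "  Wrought  nail,12\n"
    ",\n"
    "Artifact Type , Count\n"
    "Creamware sherd, 4 \n"
)
result = clean_table(raw)
print(result)
```

The output is a plain CSV string that could be saved alongside the report it came from.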

Institute mentor Catherine Foley was exceptionally helpful in reviewing my metadata and providing feedback, as well as answering some basic ground-level KORA and digital archives questions. I ultimately settled on one KORA metadata scheme for this project in order to keep things as simple as possible. I uploaded materials for several sites, but decided to pause until I could see how the front-end KORA plugin interacted with the database and whether I needed to make adjustments to my scheme attributes.

A delay in the release of the KORA plugin for WordPress hindered progress, but I used that time to wireframe the main pages for the site: the landing page with a gallery and interactive map, an advanced search page for researchers, individual collection pages, and a page for general information about archaeology in Virginia.

After the domain for the WordPress site was set up through MSU, I worked to troubleshoot the plugin installation as the first person to test it beyond MSU MATRIX staff. I connected items and galleries to the site and began to build it out. The front page includes an interactive map made with Leaflet.js that allows visitors to navigate between the different galleries geographically. Additionally, each site gallery includes another Leaflet map that shows the generalized site boundary, using the Open Context API to display the geospatial information for each. PDF reports were uploaded into Voyant Tools, an online text analysis and visualization application. While some of these visualizations are sophisticated and may not be readily understandable to the end user, visitors can manipulate the text and find their own relationships and patterns on their own terms.
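To make the idea of a generalized site boundary concrete, here is a minimal Python sketch. It does not call the Open Context API and is not the method used on this site; it only shows one possible way to snap a point to a coarse grid cell and express that cell as a GeoJSON polygon, which Leaflet can draw with L.geoJSON. The function name, the 0.25 degree cell size, and the sample coordinates are all assumptions chosen for illustration and do not describe a real site.

```python
import json
import math


def generalized_cell(lat, lon, cell_deg=0.25):
    """Return a GeoJSON Feature for the grid cell that contains (lat, lon)."""
    # Snap down to the south-west corner of the enclosing grid cell.
    south = math.floor(lat / cell_deg) * cell_deg
    west = math.floor(lon / cell_deg) * cell_deg
    north = south + cell_deg
    east = west + cell_deg
    # GeoJSON positions are [longitude, latitude]; the ring must be closed.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {
        "type": "Feature",
        "properties": {"note": "generalized location, not the true site boundary"},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Arbitrary sample point, not an actual archaeological site.
feature = generalized_cell(37.54, -77.43)
print(json.dumps(feature, indent=2))
```

Because only the enclosing cell is published, the map can show roughly where a site is without exposing its precise location.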

Unfortunately, bugs in the WordPress plugin interfered with layout and functionality, so I decided to halt site development after consultation with Ethan Watrall and Catherine Foley pending updates on fixes for the plugin.

Challenges

Adapting to technical issues beyond my control has been by far the biggest challenge of this project. In order to maximize my own time and effort, I hoped to get a feel for how the front end website interacted with the back end database before I invested a great deal of energy in creating metadata that might need to be adjusted significantly in the end. I also found that the KORA plugin interfered with the appearance and layout of WordPress to a severe degree, which prevented me from choosing an attractive theme before launch date. The cascading effect of this is that I’m unable to complete the styling of Leaflet maps and site graphics. I also learned that there’s no way to search for KORA objects due to the way the information is pulled into the site display via PHP and not imported into WordPress itself. This makes advanced search impossible.

At present, the site landing page displays an update about technical difficulties, although all pages are active. KORA/MATRIX staff members continue to work on the plugin. When an update or schedule is available, I will decide between three possibilities for continuing work on this project. In the best case scenario, the plugin layout issues will be fixed and I can proceed once again with finalizing the look, feel, and functionality of the site. I’ve identified potential workarounds for the lack of search capabilities that I can build into this site as well. Another option will be to connect a separate instance of WordPress and essentially “quarantine” the KORA galleries on their own pages to allow for more theme flexibility on other parts of the site. The third option is to abandon KORA for this particular project and shift my material directly to the WordPress site itself instead of using the KORA backend. I continue to weigh these options. Ultimately, however, this project is very scalable and there are many possible routes to reach a positive outcome.

Successes

While the technical obstacles have been frustrating, successes abound. I have a strong foundation for digital archives management to bring back to my agency as we continue to develop a larger plan. Between general Institute activities and development of my project, I acquired new skills and knowledge in a way that would have been impossible outside of a formal academic setting. A list of skills acquired and topics studied follows. This list doesn’t cover everything I learned through the Institute directly or indirectly; these are simply the most impactful highlights that I can use as I continue work on this project and beyond. These skills are portable far beyond this project to meet larger agency digital archives, outreach, information analysis, and disaster planning efforts.

Technical skills

  • HTML/CSS
  • JavaScript
  • GitHub/version control
  • Leaflet.js web mapping
  • RegEx (regular expressions)
  • Data analysis (multiple tools)
  • Accessing data through APIs
  • WordPress theming and functionality

Topics

  • Digital archives management
  • Copyright, access, intellectual property issues
  • Data visualization
  • Crowdsourcing data collection
  • Linked Open Data
  • Digital exhibits/outreach
  • Mobile data collection

In my project pitch at the end of the 2015 Institute working session, I defined several goals for this project, not knowing how technical pieces of the puzzle would come together over the coming year. These included laying the groundwork for a digital repository to archival standards, incorporating Linked Open Data, and presenting this technical material to the public in an engaging way. The project has met or will soon meet all three of these goals.

One of the most valuable elements of the Institute has been the community it has created. As participants, we forged professional and personal bonds during the 2015 session as we worked together to wrap our minds around a torrent of new concepts and skills. We continued to work together and with our mentors on the Digital Archaeology Commons through the intervening year. Participants chronicled successes and challenges in monthly blogs, and others chipped in to help, commiserate, or celebrate often, even if we were thousands of miles away from each other. Meeting again in 2016 allowed us to help each other with the final push toward finished products. From the perspective of an Institute participant who hasn’t taken a project like this from start to finish before, this kind of engagement and encouragement has been invaluable. It has allowed me to cultivate powerful skills and the confidence to push them further.

Next Steps

At the time of this report, site completion is delayed. Once a permanent layout is in place, I will incorporate styles, functionality, and design elements into all existing pages and set up spaces for new galleries. When this substructure is complete, it will be straightforward to add sites and content as time allows.

Moving further forward, I plan to continue learning about intellectual property and access rights. I hope to do more work to enhance Linked Open Data in this repository and to set the stage for future linkages. I’ve successfully pulled information from a linked data source (the Digital Index of North American Archaeology/Open Context), and I plan to connect objects within the repository with other linked datasets.

I am also interested in integrating Traditional Knowledge Labels developed by Mukurtu and implementing a way for visitors to “flag” objects or sites that they might find sensitive. As our agency undertakes larger digital archives access projects, collaboration with Virginia Indian tribes and other descendant groups is essential to find a careful and considered way to present this valuable information to as wide a public as possible.

As I created this portal, the centrality of reflexive public engagement came to the forefront. An important enhancement I would like to include in the future is a public submission component where visitors can easily share information about sites or artifacts on their property. An anonymous gallery of user submitted objects (that are or aren’t cultural artifacts) could be engaging and also save time on the part of our agency archaeologists, who respond to many public queries to identify materials.

After this proof-of-concept site is launched and demonstrated, ideally the project will move from a largely independent effort to an agency collaborative initiative. The infrastructure I have created here may be dramatically reworked. Software applications used will likely vary significantly from what is currently in place due to policy and budget. But regardless of what the long-term future holds, this Institute project has planted the seeds for openness, engagement, and sustainable preservation of irreplaceable archaeological heritage.

Works Cited and Consulted

Archaeology Data Service. 2015. “Guides to Good Practice: Main.” Accessed December 31. http://guides.archaeologydataservice.ac.uk/.

Dublin Core Metadata Initiative. 2016. “DCMI Metadata Basics.” Accessed August 27. http://dublincore.org/metadata-basics/.

The Library of Virginia. 2014. “Virginia Public Records Management Manual.” http://www.lva.virginia.gov/agencies/records/manuals/vprmm.pdf.

Open Context. 2016. “Digital Index of North American Archaeology (DINAA).” http://ux.opencontext.org/archaeology-site-data/.

Society for American Archaeology. 1996. “Principles of Archaeological Ethics.” http://www.saa.org/AbouttheSociety/PrinciplesofArchaeologicalEthics/tabid/203/Default.aspx.

tDAR. 2016. “Compliance.” https://www.tdar.org/why-tdar/compliance/.

Virginia Department of Historic Resources. 2016. “About DHR.” http://dhr.virginia.gov/homepage_features/AboutDHR.htm.