Identifying lithic microdebitage in soil samples
Analyzing lithic microdebitage (debris smaller than 4mm produced in the creation of stone tools) can reveal ancient stone tool manufacturing sites and provide insight into past cultural activity. In this work, we build models to identify microdebitage particles in soil samples collected from a Maya village. These models are then used to characterize the soil composition and reveal ancient stoneknapping sites in Nacimiento, Guatemala.
Background
Scope
Data
Models
Timeline
Repo structure
Logistics
Resources
Contact Info
The analysis of lithic microdebitage can illuminate ancient stone tool manufacturing practices and provide insight into past cultural activity. In his dissertation, Dr. Markus Eberl studied the small Maya site of Nacimiento in the Petexbatun region of Guatemala to investigate how a substantial change in regional political power affected the inhabitants and the local community. One aspect of investigating the cultural dynamics in the region during the 7th and 8th centuries is to understand the public structures and daily community interactions.
The purpose of this project is to identify the location of ancient stone tool manufacturing areas within this village. The locations of these manufacturing areas may be uncovered by analyzing the soil composition. Although ancient stoneknappers (people who made stone tools) often cleared large and sharp debris from their area of work, lithic microdebitage (that is, particles smaller than 4mm, or roughly 1/6th of an inch) would have been very difficult to remove from their workspace. Thus, evidence of lithic microdebitage in the soil can indicate the location of these ancient stoneknapping sites and the methods used to create the stone tools. Soil samples from 50 locations within the village were collected for analysis.
Samples of microdebitage obtained from current stoneknappers provide a set of exemplars to compare with the particles in the soil samples. To characterize these samples, a particle analyzer can be used, which provides 40 measurements for each particle contained within the sample. From these measurements, the profiles of the soil samples can be computed to identify likely stoneknapping sites.
Johnson, Eberl, et al. provide the current work towards these efforts, using dynamic image analysis to discern microdebitage in soil samples. More about the motivating work studying Nacimiento can be found here.
The motivating goal of this work is to identify the location(s) of ancient stone tool manufacturing sites given 50 soil samples from the Maya site of Nacimiento and 30-50 stoneknapping microdebitage exemplar samples. To achieve this result, this project will focus on generating a model or algorithm to characterize the composition of soil samples by identifying particles which are likely microdebitage. The model may then be used to calculate the percentage of each soil sample which is microdebitage, making it possible to identify which households likely hosted stoneknapping work. A secondary objective is to establish a workflow for the management of data, which will proceed in parallel with the modeling and characterization efforts.
Data management is expected to be a challenge, and a workflow should be established to enable expeditious use of the data. Currently, there are 50 soil samples from the Maya village of interest. The goal is to identify the composition of these soil samples; the particles they contain will be characterized by the PartAn 3D particle analyzer. The particle analyzer measures 40 variables for every particle, and there are approximately 500,000 particles in every sample. The training data contains 30-50 labelled microdebitage samples collected from modern stoneknappers, and represents the stages of specific work.
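To illustrate why data management needs attention, the figures above can be turned into a rough size estimate. The sketch below assumes each measurement is stored as an 8-byte floating point value; that storage width and all names in the code are assumptions for illustration, not documented properties of the particle analyzer's output.

```python
# Rough, illustrative size estimate for the full soil-sample dataset.
# The counts come from the project description; BYTES_PER_VALUE is an
# assumption (64-bit floats), not a documented property of the analyzer.
N_SAMPLES = 50
PARTICLES_PER_SAMPLE = 500_000  # approximate
VARIABLES_PER_PARTICLE = 40
BYTES_PER_VALUE = 8  # assumed: one 64-bit float per measurement


def estimate_dataset_size(n_samples, particles_per_sample, n_variables, bytes_per_value):
    """Return (total number of measured values, total size in gigabytes)."""
    n_values = n_samples * particles_per_sample * n_variables
    size_gb = n_values * bytes_per_value / 1e9
    return n_values, size_gb


n_values, size_gb = estimate_dataset_size(
    N_SAMPLES, PARTICLES_PER_SAMPLE, VARIABLES_PER_PARTICLE, BYTES_PER_VALUE
)
print(f"{n_values:,} values, about {size_gb:.0f} GB uncompressed")
# prints "1,000,000,000 values, about 8 GB uncompressed"
```

Under these assumptions the soil samples alone amount to about a billion values, which may be too large to load comfortably on a typical workstation in one piece and suggests processing one sample at a time.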
No issues
This is currently TBD for this project, and will reflect the expected distribution of microdebitage within soil.
The generated model will produce probabilities of particles being lithic microdebitage given their descriptors from the particle analyzer.
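As a minimal sketch of how such probabilities could feed the sample-level characterization described in the scope, the example below summarizes one soil sample from its per-particle probabilities. The function name, the 0.5 threshold, and the example probabilities are hypothetical, and the percentages are by particle count, which is an assumption; the actual model and decision rule are not yet defined.

```python
def summarize_sample(probabilities, threshold=0.5):
    """Summarize one soil sample from per-particle microdebitage probabilities.

    Returns a tuple of:
      - the expected percentage of microdebitage particles (mean probability), and
      - the percentage of particles whose probability is at or above `threshold`.
    Both percentages are by particle count, not by mass or volume.
    """
    if not probabilities:
        raise ValueError("sample contains no particles")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    n = len(probabilities)
    expected_pct = 100.0 * sum(probabilities) / n
    flagged_pct = 100.0 * sum(1 for p in probabilities if p >= threshold) / n
    return expected_pct, flagged_pct


# Hypothetical probabilities for a four-particle sample.
example_probs = [0.75, 0.25, 0.5, 0.0]
expected_pct, flagged_pct = summarize_sample(example_probs)
print(f"expected: {expected_pct}%, flagged: {flagged_pct}%")
# prints "expected: 37.5%, flagged: 50.0%"
```

Reporting both numbers keeps the summary honest: the mean probability does not depend on a cutoff, while the thresholded count is easier to compare across samples once a cutoff has been agreed on.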
The particle analyzer is expected to arrive, and the workstation to be set up, by the end of February. Based on the ability to obtain particle descriptors from the analyzer around this time, the targeted end date for the project is the end of the semester, before May 10th. Some days of flexibility may be necessary to accommodate particularly rigorous student exam schedules.
The deliverables of the project are models, a code repository, and sample predictions. The expectation is that the model will be ready for comparative validation on new samples.
The repo is structured as follows: All *0- files (e.g., 10-, 20-, 30-) define the work required towards a particular thrust (e.g., "load-data"). Subfiles related to the task (e.g., 11-, 12-) should be created in order to explore and document necessary, relevant, or interesting subtasks.
All files which appear in the repo should be able to run, and not generate errors, even if they are relatively midway in development of the proposed task. The only time code which generates errors should be committed to the repo is if asynchronous feedback is desired about resolving these errors. All notebooks relating to the analysis should have a numerical prefix (e.g., 31-) followed by the exploration (e.g. 31-student-features). Any utility notebooks should not be numbered, but be named according to their purpose. All notebooks should have lowercase and hyphenated titles (e.g., 10-load-data not 10-Load-Data). All notebooks should adhere to literate programming practices (i.e., markdown writing to describe problems, assumptions, conclusions) and provide adequate although not superfluous code comments.
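The naming rules above could be checked automatically. The following is a minimal sketch, not part of the repo: it assumes numeric prefixes are always two digits (as in the examples given) and that names are passed without a file extension. The function and pattern names are hypothetical.

```python
import re

# Numbered notebooks: assumed two-digit prefix, then lowercase hyphenated words.
NUMBERED = re.compile(r"[0-9]{2}-[a-z0-9]+(?:-[a-z0-9]+)*")
# Utility notebooks: no numeric prefix, lowercase hyphenated words.
UTILITY = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def classify_notebook_name(name):
    """Classify a notebook name (without extension).

    Returns 'thrust' for *0- names, 'subtask' for other numbered names,
    'utility' for unnumbered lowercase hyphenated names, else 'invalid'.
    """
    if NUMBERED.fullmatch(name):
        # The second digit distinguishes a thrust (10-, 20-) from a subtask (11-, 12-).
        return "thrust" if name[1] == "0" else "subtask"
    if UTILITY.fullmatch(name):
        return "utility"
    return "invalid"


for candidate in ["10-load-data", "31-student-features", "plot-helpers", "10-Load-Data"]:
    print(candidate, "->", classify_notebook_name(candidate))
```

A check like this could run before each commit so that misnamed notebooks are caught early.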
Sprint planning: Mondays from 1-2pm at Zoom link here.
Coder's meeting: Fridays from 1-2pm at Zoom link here.
Demos: Fridays from 2-3pm at Zoom link here.
Data location: Initial data contained within repo, and subsequent data projected to be stored on Box.
Slack channel: ancient-artifacts on Data Science TIP Slack Workspace
- R Basics: RStudio Primers
- Data Science with R and Tidyverse: R for Data Science, Hadley Wickham
- Git tutorials: Simple Guide, Learn Git Branching
Markus Eberl, Ph.D. Associate Professor of Anthropology
markus.eberl@vanderbilt.edu
Jesse Spencer-Smith, Ph.D. Chief Data Scientist, DSI
jesse.spencer-smith@vanderbilt.edu
Charreau Bell, Ph.D. Senior Data Scientist, DSI
charreau.s.bell@vanderbilt.edu
Umang Chaudhry Data Scientist, Vanderbilt DSI
umang.chaudhry@vanderbilt.edu
Phyllis Johnson, M.A. Graduate student
phyllis.s.johnson@vanderbilt.edu
Amy Rieth, Undergraduate student
amy.e.rieth@Vanderbilt.Edu
Mark Raj, Undergraduate student
mark.j.raj@vanderbilt.edu
Amanda Sarubbi, Undergraduate student
amanda.sarubbi@vanderbilt.edu