
Virtual Fossil Fragmenter

Mike Caprio edited this page Mar 23, 2018 · 66 revisions

Use Computer Vision and 3D Modeling to Identify Fossil Fragments of Mollusk Shells and Trilobites

Hackathon Findings

Hackathon Projects

Background

You can read more about trilobites in the Trilobite Bits challenge

Large collections of marine fossils from ancient seabeds can be found in deposits of limestone, and often the specimens within have been broken into fragments. If the rock deposit is unlithified or only partially lithified (that is, the grains are not strongly cemented together), the fossils can be sieved out of the deposit. In other cases, it is sometimes possible to use an acidic preparation to dissolve the rock surrounding the fossils, leaving behind a variety of fossil pieces from a dozen or more different taxa. These collections are a great "snapshot" of the ancient marine ecosystem, showing which taxa could be found together in the same locality.

Examples of small fossils in limestone.

But with fragments, it can be difficult to determine whether each one came from a unique individual or whether a single individual is represented by many parts. It then becomes a painstaking process to pick through all the fragments, not only to identify which taxa are present in the sample but also to determine how many individual specimens of a particular taxon there are. Luckily, because of unique features in the form and structure (morphology) of some of these taxa, we can estimate the abundances of some taxa from fragmented collections. For example, each Lirobittium rugatum has one aperture and each Cyclocardia occidentalis has one umbo. Therefore, if we count only the specimens with apertures or umbos, we know that the count includes only unique individuals.

(Note that each Cyclocardia individual had two shells that were mirror images of each other, so each individual technically could have contributed two specimens to the collection. Because of this, researchers sometimes consider dividing the number of specimens with umbos by 2 when comparing abundances, especially if there are many specimens of the same size. But either way we still need to know the number of specimens with umbos!)
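The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment records, the feature names "spire" and "margin", and the choice to round up when halving the umbo count are all hypothetical and not part of the challenge data.

```python
from collections import Counter

# Diagnostic feature per taxon, following the examples in the text.
DIAGNOSTIC_FEATURE = {
    "Lirobittium rugatum": "aperture",
    "Cyclocardia occidentalis": "umbo",
}
# How many diagnostic features one individual can contribute
# (each Cyclocardia had two shells, each with its own umbo).
FEATURES_PER_INDIVIDUAL = {
    "Lirobittium rugatum": 1,
    "Cyclocardia occidentalis": 2,
}

def count_specimens(fragments):
    """Count fragments that preserve their taxon's diagnostic feature."""
    counts = Counter()
    for taxon, features in fragments:
        if DIAGNOSTIC_FEATURE.get(taxon) in features:
            counts[taxon] += 1
    return counts

def estimate_individuals(specimen_counts):
    """Divide by features per individual, rounding up (an assumption)."""
    return {
        taxon: -(-n // FEATURES_PER_INDIVIDUAL[taxon])
        for taxon, n in specimen_counts.items()
    }

# Hypothetical, hand-labelled fragments: (taxon, set of visible features).
fragments = [
    ("Lirobittium rugatum", {"aperture"}),
    ("Lirobittium rugatum", {"spire"}),
    ("Cyclocardia occidentalis", {"umbo"}),
    ("Cyclocardia occidentalis", {"umbo", "margin"}),
    ("Cyclocardia occidentalis", {"umbo"}),
    ("Cyclocardia occidentalis", {"margin"}),
]
specimens = count_specimens(fragments)
individuals = estimate_individuals(specimens)
print(dict(specimens))
print(individuals)
```

In practice the per-fragment feature labels would come from a detector rather than from hand labelling; the arithmetic after detection stays the same.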

Another idea for solving the "how many individuals vs how many fragments?" problem might be to ask "what percentage of the complete specimen does this fragment represent?" Then we could count the number of individuals in the collection as the number of fragments that are at least (say) 75% complete.
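A minimal sketch of this completeness-based count follows. It assumes each fragment already has an estimated completeness between 0 and 1 (how to obtain that estimate is the hard part and is not shown), and it treats the 75% figure as a minimum. The function name and the example values are hypothetical. Any threshold above 50% guarantees that no individual is counted twice, because two fragments of the same specimen cannot both be more than half of it.

```python
def count_individuals_by_completeness(completeness, threshold=0.75):
    """Count fragments whose estimated completeness meets the threshold."""
    if not 0.5 < threshold <= 1.0:
        # At 50% or below, two fragments of one individual could both qualify.
        raise ValueError("threshold must be above 0.5 and at most 1.0")
    return sum(1 for c in completeness if c >= threshold)

# Hypothetical completeness estimates for six fragments of one taxon.
estimates = [0.9, 0.8, 0.4, 0.3, 0.75, 0.1]
n_individuals = count_individuals_by_completeness(estimates)
print(n_individuals)
```

The result is a lower bound on abundance: individuals represented only by small fragments are not counted at all.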

An AMNH collection of fossil fragments from a single rock sample.


Solutions

We want to identify these fragmented pieces of mollusk shells and trilobite exoskeleton. The theory is that we could take a 3D model of an organism, programmatically/algorithmically "shatter" the model into pieces, then render the shattered pieces in random orientations to generate a set of images for training a computer vision model to identify fragments.
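As a rough illustration of the "shatter and rotate" idea, the sketch below groups the triangles of a mesh by which side of a few random planes they fall on, records what fraction of the total surface area each fragment keeps, and applies a random rotation to each fragment. It is a simplification under stated assumptions: triangles are assigned whole rather than cut (a real pipeline would use a Boolean or fracture library), an octahedron stands in for a fossil model, rendering is left out, and all function names are hypothetical.

```python
import math
import random

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mesh_area(triangles):
    """Total surface area of a list of triangles."""
    total = 0.0
    for a, b, c in triangles:
        n = cross(sub(b, a), sub(c, a))
        total += 0.5 * math.sqrt(dot(n, n))
    return total

def centroid(tri):
    return tuple(sum(v[i] for v in tri) / 3.0 for i in range(3))

def shatter(triangles, n_planes, rng):
    """Group triangles by the side of each random plane their centroid is on."""
    cents = [centroid(t) for t in triangles]
    center = tuple(sum(c[i] for c in cents) / len(cents) for i in range(3))
    # An unnormalized Gaussian vector is enough: only the sign of the dot
    # product is used below.
    normals = [tuple(rng.gauss(0, 1) for _ in range(3)) for _ in range(n_planes)]
    groups = {}
    for tri, c in zip(triangles, cents):
        key = tuple(dot(sub(c, center), n) >= 0 for n in normals)
        groups.setdefault(key, []).append(tri)
    return list(groups.values())

def random_rotation(rng):
    """Rotation matrix built from a random unit quaternion."""
    q = [rng.gauss(0, 1) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)),
        (2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)),
        (2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)),
    )

def rotate(fragment, R):
    return [tuple(tuple(dot(row, v) for row in R) for v in tri) for tri in fragment]

# Stand-in model: an octahedron with one triangle per octant.
octahedron = [((sx, 0, 0), (0, sy, 0), (0, 0, sz))
              for sx in (1, -1) for sy in (1, -1) for sz in (1, -1)]

rng = random.Random(0)
fragments = shatter(octahedron, n_planes=2, rng=rng)
total_area = mesh_area(octahedron)
# Each sample pairs a randomly oriented fragment with its completeness label.
samples = [(rotate(f, random_rotation(rng)), mesh_area(f) / total_area)
           for f in fragments]
for rotated, fraction in samples:
    print(len(rotated), "triangles,", round(100 * fraction), "% of surface")
```

The area fraction is one possible ground-truth label for the "percentage of the complete specimen" question; volume or landmark coverage would be alternatives.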

Complete fossil exoskeleton of the trilobite Cryptolithus (here Cryptolithus lorettensis), order Asaphida

We see these as the solutions to this challenge, in order of increasing difficulty:

  • A solution that recognizes that every fragment on a control image belongs to one taxon

  • A solution that recognizes that there are different fragments on a group image with several taxa

  • A solution that correctly recognizes ALL the different taxa on a group image

  • A solution that can identify which region of an individual of a given taxon a fragment comes from

  • A solution that can determine what percentage of the complete specimen a particular fragment represents (count the number of individuals in a collection as the number of fragments that are, for example, at least 75% complete)

  • A solution that can detect unique characteristics of individual fragments (umbo, aperture) and return a count of those individuals

Subproject 1: 400-million-year-old trilobites

Fragments of cephala of the trilobite Cryptolithus (here Cryptolithus tesselatus) in two control images on a 1 cm grid

What is provided:

  • 3D models of the Cryptolithus tesselatus (1) cephalon (head), (2) thorax, and (3) pygidium (tail). See schematic of trilobite exoskeleton here

  • Set of images of just one of the three parts, different orientations, different degrees of fragmentation

  • Set of images of a known mix of specimens including all Cryptolithus tesselatus parts, different orientations, different degrees of fragmentation

WHAT PROPORTION OF FRAGMENTS CAN BE IDENTIFIED AS CRYPTOLITHUS?

Subproject 2: 2-million-year-old mollusk communities

What is provided:

HOW MANY OF THE FRAGMENTS ARE IDENTIFIABLE AS LIROBITTIUM OR CYCLOCARDIA?

HOW MANY OF THE FRAGMENTS INCLUDE THE APERTURE (FOR LIROBITTIUM) OR THE UMBO (FOR CYCLOCARDIA)?

All images are copyrighted by the American Museum of Natural History and are available for use during the Hackathon only. Permission for other (non-commercial) uses of the images may be available on request.


Resources

Libraries using booleans to split things into fragments

  • CSG.js: Constructive Solid Geometry (CSG) JavaScript library
  • Cork: Cork is designed to support Boolean operations between triangle meshes (C++)

Volumetric manipulation

  • OpenVDB: OpenVDB is an Academy Award-winning C++ library comprising a hierarchical data structure and a suite of tools for the efficient manipulation of sparse, time-varying, volumetric data discretized on three-dimensional grids.

Image segmentation

Check out Online Resources and Data Sets for more general purpose software and utilities.


Challenge owner: Melanie Hopkins