Issues Encountered when Metadata Scraping

Irina Velsko edited this page Aug 27, 2020 · 24 revisions

Introduction

Many papers report metadata in a heterogeneous way. This makes rapid retrieval and compilation of useful data difficult when generating comparative datasets.

This page acts as a 'whiteboard' where people can note down common problems they encounter when adding studies to AncientMetagenomeDir. This will help the ancient metagenomics community begin to define reporting standards for the field in a way that is easy to access and understand.

Issues

Sample Ages

  • In many cases radiocarbon dates are reported inconsistently
    • No uncalibrated date
    • No radiocarbon lab code (like OxA-0000 or MAMS-0000)
    • Only a range is provided (no median or midpoint)
    • Dates are reported in either AD or BP, depending on the paper
    • No calibration curve reported
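
As a rough illustration of the AD/BP and range-only problems listed above, the sketch below converts calendar years AD to years BP (before 1950) and takes the midpoint of a reported range. The function names `ad_to_bp` and `range_midpoint` and the example range are hypothetical, chosen only for this illustration. The midpoint of a range is only an approximation and is not the same as the median of a calibrated probability distribution.

```python
def ad_to_bp(year_ad: int) -> int:
    """Convert a calendar year AD to years BP (before 1950).

    Assumes a positive AD year; BC years are not handled here.
    """
    return 1950 - year_ad


def range_midpoint(older_bp: int, younger_bp: int) -> float:
    """Arithmetic midpoint of a date range given in years BP.

    This is a simple approximation, not a calibrated median.
    """
    return (older_bp + younger_bp) / 2


# Hypothetical example: a paper reports only the range 1000-1200 AD.
older = ad_to_bp(1000)    # 950 BP
younger = ad_to_bp(1200)  # 750 BP
midpoint = range_midpoint(older, younger)
print(older, younger, midpoint)
```

A conversion like this only puts dates on a common scale. It cannot recover a missing uncalibrated date, lab code, or calibration curve.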

Sample Codes

  • In some cases, the sample codes reported in the manuscript differ from the codes used in the data upload to ENA/SRA
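
One way to spot this kind of mismatch is to compare the two sets of codes directly. The sketch below is a minimal example under assumed inputs: the function name `unmatched_codes` and the sample codes are made up for illustration, and in practice the two lists would have to be collected by hand from the manuscript and from the archive record.

```python
def unmatched_codes(manuscript_codes, archive_codes):
    """Return the codes found only in the manuscript and only in the archive."""
    manuscript = set(manuscript_codes)
    archive = set(archive_codes)
    only_in_manuscript = sorted(manuscript - archive)
    only_in_archive = sorted(archive - manuscript)
    return only_in_manuscript, only_in_archive


# Hypothetical codes, for illustration only.
paper = ["S1", "S2", "S3"]
archive = ["S1", "S2", "LIB_003"]

only_paper, only_archive = unmatched_codes(paper, archive)
print(only_paper)    # codes with no exact match in the archive
print(only_archive)  # codes with no exact match in the manuscript
```

An exact string comparison only flags candidates. Deciding whether two differing codes refer to the same sample still requires checking the paper or contacting the authors.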

Sample Collection Date

  • Sample collection date is often not reported in manuscripts