This repository holds the supplementary data files and source code for the following paper.
Venue: Historical Methods: A Journal of Quantitative and Interdisciplinary History
Title: Linking Scottish Vital Event Records Using Family Groups
Authors: Özgür Akgün (1), Alan Dearle (1), Graham Kirby (1), Eilidh Garrett (2), Tom Dalton (1), Peter Christen (3), Chris Dibben (4), Lee Williamson (4)
- (1) School of Computer Science, University of St Andrews
- (2) Department of Geography, University of Cambridge
- (3) Research School of Computer Science, The Australian National University
- (4) School of Geosciences, University of Edinburgh
Our algorithms make heavy use of the M-tree data structure. The following paper is a good reference for it.
- Ciaccia, P., M. Patella, and P. Zezula. 1997. M-Tree: an Efficient Access Method for Similarity Search in Metric Spaces. 23rd VLDB Conference, Athens, Greece. Morgan Kaufmann, 426–35.
The Wikipedia page for M-tree also contains useful information and some further references.
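To give a feel for why a metric index such as the M-tree helps with similarity search, here is a minimal Python sketch of the triangle-inequality pruning idea that the M-tree builds on. It is not the code from this repository and not a full M-tree: it uses a single pivot rather than a tree of routing objects. The function names (`levenshtein`, `range_query`) and the choice of edit distance as the metric are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (a metric, so the triangle inequality holds)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def range_query(query, pivot, items, radius, dist=levenshtein):
    """Return (matches, skipped): items within `radius` of `query`, and how many
    query-to-item distance computations were avoided by pruning.

    Hypothetical helper: a real M-tree stores the pivot-to-item distances when
    the index is built; they are recomputed here to keep the sketch self-contained.
    """
    pivot_to_item = {item: dist(pivot, item) for item in items}
    d_query_pivot = dist(query, pivot)
    matches, skipped = [], 0
    for item in items:
        # Triangle inequality: dist(query, item) >= |d(query, pivot) - d(pivot, item)|.
        # If that lower bound already exceeds the radius, the item cannot match.
        if abs(d_query_pivot - pivot_to_item[item]) > radius:
            skipped += 1
            continue
        if dist(query, item) <= radius:
            matches.append(item)
    return matches, skipped


if __name__ == "__main__":
    # Illustrative surnames only; not data from the paper.
    names = ["macdonald", "mcdonald", "macdonell", "smith", "smyth", "robertson"]
    print(range_query("mcdonnald", "macdonald", names, radius=2))
```

Pruning never changes the result, only the number of distance computations, so the output can be checked against a brute-force scan over all items.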
In the code-snapshot directory we provide the source code that we used in our experiments for this paper.
The script src/main/scripts/experiments/family_grouping.sh is the starting point of our experiments.