4. Remove near-metadata (nearby) duplicates
Two profiles are near-metadata duplicates if their latitudes and longitudes, truncated to 1 decimal digit, are the same and their time difference is smaller than 1 day (this differs from the 2019v1 update, in which time was truncated to the day). Uses the function box_meta_neardup.m.
Each near-metadata pair is then checked to see whether it is also a content duplicate (> 95% threshold). See Figure 12 of the MOCCA report.
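The near-metadata test described above can be sketched as follows. This is a minimal Python illustration, not the code of box_meta_neardup.m: the dictionary keys `lat`, `lon`, and `time`, the helper names, and the choice to truncate toward zero are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta


def truncate_1dp(value):
    """Truncate (not round) a coordinate to 1 decimal digit.

    Returns the truncated value scaled by 10 as an integer, so that
    comparisons do not depend on floating-point representation.
    Truncation is toward zero, which is an assumption of this sketch.
    """
    return math.trunc(value * 10)


def is_near_metadata_duplicate(a, b, max_dt=timedelta(days=1)):
    """True if two profiles share the truncated position and are < 1 day apart."""
    same_position = (
        truncate_1dp(a["lat"]) == truncate_1dp(b["lat"])
        and truncate_1dp(a["lon"]) == truncate_1dp(b["lon"])
    )
    # Compare the actual time difference, not the calendar day.
    close_in_time = abs(a["time"] - b["time"]) < max_dt
    return same_position and close_in_time


# Hypothetical profiles: same 0.1-degree box, 17 hours apart, different calendar days.
p1 = {"lat": -35.27, "lon": 18.43, "time": datetime(2019, 3, 1, 10, 0)}
p2 = {"lat": -35.21, "lon": 18.49, "time": datetime(2019, 3, 2, 3, 0)}
print(is_near_metadata_duplicate(p1, p2))
```

Because the time difference is used rather than the truncated date, two profiles taken shortly before and after midnight are still paired, which the day-truncation approach of 2019v1 would miss.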
Decision
- If it is a content duplicate: delete the worst profile (the worst profile is chosen exactly as for the exact metadata duplicates).
- If it is not a content duplicate: keep both profiles.
The same script can be used to thin the database (e.g. the Argo reference database) in regions with many profiles, by comparing the content of nearby profiles in a certain depth range (e.g. below 900 dbar, the range used for the DMQC). Uses profcompcont_deep(upper depth limit, coincidence %).
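One way to picture the deep-content comparison is the Python sketch below. It is not the code of profcompcont_deep: the keys `pres`, `temp`, and `psal`, the rounding to 2 decimals, and the definition of the coincidence percentage (share of deep levels of the first profile whose temperature and salinity pair also appears in the second) are assumptions chosen for illustration.

```python
def deep_values(profile, upper_limit_dbar, ndigits=2):
    """Rounded (temperature, salinity) pairs for levels deeper than the limit."""
    return [
        (round(t, ndigits), round(s, ndigits))
        for p, t, s in zip(profile["pres"], profile["temp"], profile["psal"])
        if p > upper_limit_dbar
    ]


def deep_coincidence_pct(a, b, upper_limit_dbar=900.0):
    """Percentage of deep levels of `a` whose values also occur in `b`."""
    values_a = deep_values(a, upper_limit_dbar)
    values_b = set(deep_values(b, upper_limit_dbar))
    if not values_a or not values_b:
        return 0.0  # nothing to compare below the limit
    matches = sum(1 for v in values_a if v in values_b)
    return 100.0 * matches / len(values_a)


def is_deep_content_duplicate(a, b, upper_limit_dbar=900.0, threshold_pct=95.0):
    """True if the deep coincidence exceeds the threshold (assumed strict >)."""
    return deep_coincidence_pct(a, b, upper_limit_dbar) > threshold_pct


# Hypothetical profiles that differ near the surface but agree below 900 dbar.
prof_a = {"pres": [500, 1000, 1200, 1500],
          "temp": [10.0, 4.5, 3.8, 3.1],
          "psal": [35.0, 34.6, 34.7, 34.8]}
prof_b = {"pres": [400, 1000, 1200, 1500],
          "temp": [12.0, 4.5, 3.8, 3.1],
          "psal": [35.2, 34.6, 34.7, 34.8]}
print(is_deep_content_duplicate(prof_a, prof_b))
```

Note that the percentage in this sketch is measured relative to the first profile, so it is not symmetric when the two profiles have different numbers of deep levels.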
Note: the content duplicates found in this step are those that were not found by the method used to find possible content duplicates in step 3 (Figure 13 of the MOCCA report).