Publishing Microdata
This page covers the functions included with NADAR to create and publish Microdata studies in the catalog and set various options.
For publishing Microdata studies, there are two options:
- Import DDI codebook 2.5 XML
- Create study from scratch using JSON schemas
Download the demo popstan project from https://github.com/ihsn/ddi-examples/tree/main/demo-popstan-2006
For microdata studies, we need the study DDI, the RDF, and the documentation and data files. You can organize the files in any way you like, but we recommend creating a separate folder for each study and using the study IDNo as the folder name. This makes it easier to write scripts that automate data import with NADAR or any other tool. The study folder should include:
- Study DDI codebook (XML) file
- External resources file (RDF)
- Microdata and all other documentation files described in the RDF
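The recommended layout can be checked with a few lines of base R before any import is attempted. The sketch below assumes each study folder is named after the study IDNo and that the DDI and RDF files are named after the IDNo with .xml and .rdf extensions. That naming convention and the helper name study_files are assumptions made for illustration; they are not part of NADAR.

```r
# Hypothetical helper (not part of NADAR): build the expected paths for one
# study, assuming <root>/<idno>/<idno>.xml and <root>/<idno>/<idno>.rdf.
study_files <- function(root, idno) {
  folder <- file.path(root, idno)
  paths <- c(
    ddi = file.path(folder, paste0(idno, ".xml")),
    rdf = file.path(folder, paste0(idno, ".rdf"))
  )
  found <- file.exists(paths)
  names(found) <- names(paths)  # label the results as ddi / rdf
  list(folder = folder, paths = paths, exists = found)
}

# Demonstration with a throwaway folder that contains only the DDI file
root <- tempfile("studies")
idno <- "ihsn-popstan-mics-2000"
dir.create(file.path(root, idno), recursive = TRUE)
file.create(file.path(root, idno, paste0(idno, ".xml")))

info <- study_files(root, idno)
missing <- names(info$exists)[!info$exists]
if (length(missing) > 0) {
  message("Missing files: ", paste(missing, collapse = ", "))
}
```

Running a check like this over every study folder first means a missing codebook is caught locally rather than part-way through a scripted import.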
Uploading a DDI file is the first step. It creates a new study in draft mode (unpublished state).
To upload a DDI file, use the NADAR function import_ddi:
Parameters
- xml_file: Path to the DDI2 codebook XML file
- rdf_file: (optional) Dublin Core RDF file for importing external resources
- repositoryid: Study collection ID
- access_policy: Data access type. Options are {direct, public, licensed, remote, data_na}
- data_remote_url: (optional) Link to the repository where data is available for download. Only required if access_policy is set to remote
- published: Set the publish status for the study. Allowed values are {0=draft, 1=publish}
- overwrite: Set it to yes if you want to replace the study if it already exists. Allowed values are {yes, no}
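The constraints in the parameter list can be checked locally before anything is sent to the catalog. The following is a minimal sketch in base R. The helper name check_import_options and its default values are assumptions for illustration; NADAR itself may validate these values differently or rely on the API response.

```r
# Hypothetical helper (not part of NADAR): validate import options locally.
# The defaults used here are placeholders, not documented NADAR defaults.
check_import_options <- function(access_policy,
                                 data_remote_url = NULL,
                                 published = 0,
                                 overwrite = "no") {
  policies <- c("direct", "public", "licensed", "remote", "data_na")
  if (!is.character(access_policy) || length(access_policy) != 1 ||
      !(access_policy %in% policies)) {
    stop("access_policy must be one of: ", paste(policies, collapse = ", "))
  }
  # A remote policy only makes sense together with a download link
  if (access_policy == "remote" &&
      (is.null(data_remote_url) || !nzchar(data_remote_url))) {
    stop("data_remote_url is required when access_policy is 'remote'")
  }
  if (!(published %in% c(0, 1))) {
    stop("published must be 0 (draft) or 1 (publish)")
  }
  if (!(overwrite %in% c("yes", "no"))) {
    stop("overwrite must be 'yes' or 'no'")
  }
  invisible(TRUE)
}

check_import_options("direct", published = 1, overwrite = "yes")

# A remote policy without a URL is rejected before any API call is made
outcome <- tryCatch(
  check_import_options("remote"),
  error = function(e) conditionMessage(e)
)
print(outcome)
```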
Example: Import DDI
xml_file_path <- "popstan/ihsn-popstan-mics-2000.xml"

# Import the DDI, publish the study, and replace it if it already exists
result <- nadar::import_ddi(
  xml_file = xml_file_path,
  published = 1,
  overwrite = "yes",
  access_policy = "direct"
)

# Check the status code: any value other than 200 means an error
if (result$status_code != 200) {
  stop(paste0("DDI import failed: ", result$message))
}

# Success: show the response message from the API
print(result$message)
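With one folder per study, the same import can be repeated over a whole directory. The sketch below is one possible way to do that. It assumes each folder is named after the study IDNo and holds a DDI file named after the IDNo with an .xml extension, and that the import function returns a list with status_code and message as in the example above. The helper import_all_ddi and the stand-in fake_import are hypothetical and only serve to make the sketch runnable without a catalog.

```r
# Hypothetical helper (not part of NADAR): import one DDI per study folder
# and collect the outcome of each call in a data frame.
import_all_ddi <- function(root, import_fn, ...) {
  extra <- list(...)  # options passed through unchanged to import_fn
  idnos <- basename(list.dirs(root, recursive = FALSE))
  rows <- lapply(idnos, function(idno) {
    xml_file <- file.path(root, idno, paste0(idno, ".xml"))
    if (!file.exists(xml_file)) {
      return(data.frame(idno = idno, status_code = NA_real_,
                        message = "DDI file not found",
                        stringsAsFactors = FALSE))
    }
    res <- do.call(import_fn, c(list(xml_file = xml_file), extra))
    msg <- if (is.null(res$message)) {
      NA_character_
    } else {
      paste(unlist(res$message), collapse = "; ")
    }
    data.frame(idno = idno, status_code = res$status_code,
               message = msg, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for nadar::import_ddi so the sketch runs without a catalog
fake_import <- function(xml_file, ...) list(status_code = 200, message = "ok")

root <- tempfile("studies")
dir.create(file.path(root, "study-a"), recursive = TRUE)
dir.create(file.path(root, "study-b"))
file.create(file.path(root, "study-a", "study-a.xml"))

report <- import_all_ddi(root, fake_import, published = 0, overwrite = "no")
print(report)
```

Against a real catalog, nadar::import_ddi would be passed in place of fake_import, and rows whose status_code is missing or not 200 point to the studies that need attention.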
The RDF file contains the description of all study files (questionnaire, reports, microdata files, etc.). NADAR can create the resource descriptions and also upload the actual files, provided they are placed in the same folder as the RDF file or in one of its subfolders.
For the popstan study, the RDF file contains relative links to the resource files, so running the R code below creates the resources and uploads the files in one go. If you don't have an RDF file, or your files are not organized in a folder structure, you can still create and upload resources; see the section below.
Use the NADAR function external_resources_import:
Parameters
- dataset_idno: Study IDNo
- rdf_file: Dublin Core RDF file
- skip_uploads: Set to TRUE to skip file uploads
- overwrite: Set it to yes if you want to replace existing resources. Allowed values are {yes, no}
Example: Import External resources (RDF)
This example uses the popstan study, whose RDF file includes the relative paths to the resource files. The external_resources_import function imports the descriptions from the RDF, finds the files in the study folder, and uploads them to the catalog.
With the skip_uploads parameter set to FALSE, the function throws warnings/errors if a resource file cannot be found in the study folder.
resource_file_path='popstan/ihsn-popstan-mics-2000.rdf'
nadar::external_resources_import(
dataset_idno="ihsn-popstan-mics-2000",
rdf_file=resource_file_path,
skip_uploads = FALSE,
overwrite="yes"
)
The function does not return anything. Any errors appear in the R console, where they are reported as warnings. To see them, call the R function warnings(), which shows output similar to this:
Warning messages:
1: In nadar::external_resources_import(dataset_idno = "ihsn-popstan-mics-2000", :
Resource file not found: /Users/m2/Downloads/ihsn-popstan-mics-2000/ihsn-popstan-mics-2000-stata.zip
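Because problems surface only as warnings, a script may want to capture them rather than rely on calling warnings() afterwards. The following is a minimal sketch using base R's withCallingHandlers. The helper collect_warnings and the stand-in fake_import are hypothetical, and the sketch assumes the warning text begins with "Resource file not found:" as in the sample output above.

```r
# Hypothetical helper (not part of NADAR): run an expression and return the
# text of every warning it raised instead of leaving them for warnings().
collect_warnings <- function(expr) {
  messages <- character(0)
  withCallingHandlers(
    expr,
    warning = function(w) {
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")  # record it and let the call continue
    }
  )
  messages
}

# Stand-in for an import that cannot find two resource files
fake_import <- function() {
  warning("Resource file not found: a.zip")
  warning("Resource file not found: b.pdf")
  invisible(NULL)
}

problems <- collect_warnings(fake_import())

# Keep only the paths of the files that could not be found
prefix <- "^Resource file not found: "
not_found <- sub(prefix, "", grep(prefix, problems, value = TRUE))
print(not_found)
```

In a real script the call to fake_import() would be replaced by the nadar::external_resources_import(...) call shown earlier, and not_found could be written to a log for follow-up.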
If you don't have RDF files for your studies as in step 2, or your folder structure does not allow importing and uploading files in one step, you can use other NADAR functions to build your own workflow.
For external resources, you have two options:
- Import RDF file
- Create resource file from scratch
Import RDF file
Use the external_resources_import function and set the parameter skip_uploads to TRUE.
resource_file_path='popstan/ihsn-popstan-mics-2000.rdf'
nadar::external_resources_import(
dataset_idno="ihsn-popstan-mics-2000",
rdf_file=resource_file_path,
skip_uploads = TRUE,
overwrite="yes"
)
Create resource file from scratch
You can create external resources without importing RDF files. To create a resource, use the NADAR function external_resources_add.
Parameters
- idno: Unique ID for the study
- dctype: Resource document type (see the API documentation for valid options)
- title: Resource title
- dcformat: Resource file format (see the API documentation for valid options)
- author: Author name
- dcdate: Date using YYYY-MM-DD format
- country: Country name
- language: Language or language code
- contributor: Contributor name
- publisher: Publisher name
- rights: Rights
- description: Detailed resource description
- abstract: Resource abstract
- toc: Table of contents
- file_path: Path of the file to upload
- overwrite: Overwrite if the resource already exists. Accepted values are "yes", "no"
Example:
nadar::external_resources_add(
  idno = "ihsn-popstan-mics-2000",
  dctype = "Administrative document [doc/adm]",
  title = "Resource title",
  dcformat = "Application/pdf",
  author = "Author name",
  dcdate = "2020-01-01",  # YYYY-MM-DD, as required by the dcdate parameter
  country = "USA",
  language = "english",
  contributor = "contributor name",
  publisher = "publisher name",
  rights = "rights",
  description = "resource description",
  abstract = "abstract",
  toc = "table of contents",
  file_path = "path/to/file/publication.pdf",
  overwrite = "no"
)
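When a study has several resources, the call above can be driven from a list instead of being repeated by hand. The sketch below shows one way to do that in base R. The helper add_resources is hypothetical, as is the stand-in fake_add, which only records its arguments so the example runs without a catalog. The sketch assumes the function passed in accepts the same named arguments as external_resources_add in the example above.

```r
# Hypothetical helper (not part of NADAR): create one resource per list entry.
add_resources <- function(idno, resources, add_fn, overwrite = "no") {
  for (res in resources) {
    # dcdate is documented as YYYY-MM-DD; reject anything else early
    if (!is.null(res$dcdate) &&
        !grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", res$dcdate)) {
      stop("dcdate must use YYYY-MM-DD: ", res$dcdate)
    }
    if (!is.null(res$file_path) && !file.exists(res$file_path)) {
      stop("File not found: ", res$file_path)
    }
    do.call(add_fn, c(list(idno = idno, overwrite = overwrite), res))
  }
  invisible(NULL)
}

# Stand-in for external_resources_add: records the arguments of each call
calls <- list()
fake_add <- function(...) {
  calls[[length(calls) + 1]] <<- list(...)
  invisible(NULL)
}

doc <- tempfile(fileext = ".pdf")
file.create(doc)
resources <- list(
  list(title = "Questionnaire", dcdate = "2020-01-01", file_path = doc),
  list(title = "Final report", dcdate = "2020-06-30", file_path = doc)
)
add_resources("ihsn-popstan-mics-2000", resources, fake_add)

# A date in the wrong format is rejected before anything is sent
rejected <- tryCatch(
  add_resources("ihsn-popstan-mics-2000",
                list(list(title = "Bad date", dcdate = "2020/01/01")),
                fake_add),
  error = function(e) conditionMessage(e)
)
print(rejected)
```

Checking the date format and file path before each call keeps malformed entries from reaching the catalog, at the cost of stopping the loop at the first bad entry.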
4 - Set data access, and other options
TODO
5 - Upload a thumbnail
TODO