3D terms Proposal: new dc:type term and new ac terms #243

Open
magpiedin opened this issue Jun 3, 2022 · 1 comment

magpiedin commented Jun 3, 2022

The 3D Task Group is proposing the addition of a new dc:type term, "3D Digital Resource", and two AC terms, ac:3DResourceType and ac:captureModality (definitions linked below), with the following rationale:

Proposed vocabulary and term additions.

We propose a vocabulary addition to dc:type. Specifically, a term such as "3D Resource" or "3D Digital Resource" is needed because 3D resources are a class unto themselves and do not fit strictly under the definition of any existing term: e.g., "data" or "image" are not sufficient, as justified below.

In addition, we propose two Audubon Core terms with controlled vocabularies to qualify and categorize the nature of a given 3D resource: ac:3DResourceType and ac:captureModality.

Why not an existing dc or ac term?

Existing terms dc:type and ac:subtype use controlled vocabularies that do not allow adequate categorization and description of 3D resources. The proposed terms are also contrasted with definitions for dc:format and ac:resourceCreationTechnique, which are similarly limited. Our justification for adding new fields to more easily distinguish among the variety of 3D resource types is as follows:

dc:type
Two of the DCMI Type Vocabulary terms used in dc:type, “Image” and “Dataset”, fit some, but not all, 3D resource types.

  • The definition of “Image” (http://purl.org/dc/dcmitype/Image) allows for a broad range of “visual representations” currently, but this can misrepresent 3D resources. For example, raw CT scan data or constructed 3D models and other 3D resource types can be visually rendered, but they also consist of spatial and structural information that is not strictly visual.
  • The definition of “Dataset” (http://purl.org/dc/dcmitype/Dataset) would not technically misrepresent most 3D resources as “data in a defined structure”, but for many scientific communities it could misrepresent a 3D reconstruction as raw “reality capture data” when it is instead something more interpreted or created. “Dataset” may also be too broad or unconventional for resource-users and providers to find or label 3D-specific datasets in useful ways.

ac:subtype

  • Alternatively, a new term for 3D resources (e.g., "3D Resource" or "3D Digital Resource") could be added to the ac:subtype controlled vocabulary (https://ac.tdwg.org/subtype/). The issue here is that 3D resources do not all correspond to a single dc:type, so 3D Resource subtypes would end up disaggregated under different type classes. It would be better to keep them all at the same level.
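
The disaggregation concern can be illustrated with a minimal Python sketch. The records, identifiers, and the "3D Resource" subtype value are hypothetical and chosen only for illustration; the sketch assumes different providers pick different dc:type values for their 3D resources.

```python
from collections import defaultdict

# Hypothetical records: each provider chose a different dc:type for 3D material.
records = [
    {"id": "r1", "dc:type": "Image", "ac:subtype": "3D Resource"},
    {"id": "r2", "dc:type": "Dataset", "ac:subtype": "3D Resource"},
    {"id": "r3", "dc:type": "Image", "ac:subtype": "Photograph"},
    {"id": "r4", "dc:type": "InteractiveResource", "ac:subtype": "3D Resource"},
]


def group_subtype_by_type(records, subtype):
    """Map each dc:type to the ids of records carrying the given ac:subtype."""
    grouped = defaultdict(list)
    for record in records:
        if record["ac:subtype"] == subtype:
            grouped[record["dc:type"]].append(record["id"])
    return dict(grouped)


by_type = group_subtype_by_type(records, "3D Resource")
print(by_type)
```

In this sketch the three 3D records land under three different type classes, so a search restricted to any single dc:type would miss the others.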

dc:format

  • The definition for dc:format (http://purl.org/dc/elements/1.1/format) and the controlled vocabulary within Audubon Core (https://ac.tdwg.org/format/) capture the file extension, which does not always reflect the encoding of a file’s contents in a technical or more qualitative sense. For example, a ZIP file may contain a CT dataset or a photogrammetry image file set. The issue is common to video and audio file formats as well: for example, the video content in an MP4 file needs to be encoded/decoded using one of a variety of codecs, such as H.264, MPEG-4, or Apple ProRes 422.
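
A minimal Python sketch of the ZIP example, using only the standard library. The file names and archive contents are hypothetical; the point is only that a format value derived from the file extension is identical for both archives even though their contents differ.

```python
import io
import mimetypes
import zipfile


def make_zip(member_names):
    """Build an in-memory ZIP archive containing empty files with the given names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in member_names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


def member_extensions(zip_buffer):
    """Return the sorted, de-duplicated lowercase extensions of the archive members."""
    with zipfile.ZipFile(zip_buffer) as archive:
        return sorted({name.rsplit(".", 1)[-1].lower() for name in archive.namelist()})


# Hypothetical contents: a CT image stack and a photogrammetry photo set.
ct_archive = make_zip(["slice_0001.tif", "slice_0002.tif", "scan_settings.xml"])
photo_archive = make_zip(["IMG_0001.jpg", "IMG_0002.jpg"])

# A format guessed from the extension alone is the same for both files.
ct_format = mimetypes.guess_type("specimen_ct.zip")[0]
photo_format = mimetypes.guess_type("specimen_photos.zip")[0]

print(ct_format == photo_format)
print(member_extensions(ct_archive))
print(member_extensions(photo_archive))
```

Telling the two archives apart requires looking inside them, or recording the distinction in a separate metadata field.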

ac:resourceCreationTechnique

  • The purpose of this term is to allow a more flexible or verbose description of the steps in the resource’s creation process, rather than to provide a controlled vocabulary describing how the resource’s contents are encoded.

New Term Details

ac:3DResourceType

  • The type of file or the way the 3D information is encoded. It is not the same as format. An example of a type would be “mesh”, which could be saved in a variety of formats (.ply, .obj, .glb, etc.). Neither is “type” redundant with “modality”: mesh files can be produced by a variety of modalities. Uses SKOS concepts in RDF (see linked proposed ontology above).
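
The independence of type, format, and modality can be sketched in Python as follows. The record identifiers, the "image stack" type, and the modality values are hypothetical placeholders rather than entries from the proposed vocabularies; only "mesh" and the .ply/.obj/.glb extensions come from the definition above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource3D:
    identifier: str
    dc_format: str          # file extension, as in dc:format
    resource_type_3d: str   # proposed ac:3DResourceType
    capture_modality: str   # proposed ac:captureModality


# Hypothetical records for illustration only.
records = [
    Resource3D("skull-a", "ply", "mesh", "CT"),
    Resource3D("skull-b", "obj", "mesh", "photogrammetry"),
    Resource3D("skull-c", "glb", "mesh", "CT"),
    Resource3D("skull-d", "zip", "image stack", "CT"),
]


def values_for_type(records, resource_type, attribute):
    """Collect the distinct values of one attribute across records of a given 3D type."""
    return sorted(
        {getattr(r, attribute) for r in records if r.resource_type_3d == resource_type}
    )


mesh_formats = values_for_type(records, "mesh", "dc_format")
mesh_modalities = values_for_type(records, "mesh", "capture_modality")
print(mesh_formats)
print(mesh_modalities)
```

Here a single type ("mesh") spans several formats and more than one modality, which is why neither of the other two fields can stand in for it.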

ac:captureModality

  • The imaging or sensor technology used to generate a 3D resource. Uses SKOS concepts in RDF (see linked proposed ontology above).
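
To show what a hierarchical SKOS-style vocabulary would allow, here is a minimal Python sketch that models skos:broader links with a plain dictionary. The concept names and their arrangement are invented for illustration and are not the task group's proposed ontology; a real implementation would presumably load the RDF with an RDF library instead.

```python
# Hypothetical hierarchy: each concept maps to its skos:broader parent
# (None marks a top concept).
BROADER = {
    "tomography": None,
    "CT": "tomography",
    "micro-CT": "CT",
    "MRI": "tomography",
    "photogrammetry": None,
}


def ancestors(concept, broader=BROADER):
    """Follow skos:broader links upward and return the chain of broader concepts."""
    chain = []
    parent = broader[concept]
    while parent is not None:
        chain.append(parent)
        parent = broader[parent]
    return chain


def narrower_transitive(concept, broader=BROADER):
    """Return every concept that has the given concept somewhere among its ancestors."""
    return sorted(c for c in broader if concept in ancestors(c, broader))


print(ancestors("micro-CT"))
print(narrower_transitive("tomography"))
```

With a hierarchy like this, a search for a broad modality could also match resources tagged with any of its narrower concepts.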

Term Notes:

  • New AC Terms Proposal-v2.docx - proposed ac:3DResourceType definition in the Terms communicating file type and imaging modality section:

    • Currently, ac:subtype has a controlled vocabulary specifying a variety of resources. We propose the addition of “Digital 3D Resource” to the list of terms. There are a variety of Digital 3D Resources created in a variety of ways, as discussed above. We propose two terms with controlled vocabularies to specify this information: “3D Resource Type” and “Image Capture Modality”.
    • Specifically we are planning to handle information on “3D Resource Type” and 3D data “Image Capture Modality” with hierarchical SKOS concepts in RDF: https://www.w3.org/2004/02/skos/intro. Proposed ontologies are here.
  • Task Group Notes from 2022-1-27

    • In Audubon Core, the format can be given as the MIME type or the file extension, which is problematic because the file extension is not enough to define the file type. This might not be this group’s problem, but there is a conflict: Audubon Core says the file extension defines the file type, whereas in the 3D work we are explicitly saying that this is not the case (there is a separate format and file extension field) - See page 5 in New AC terms proposal-v2.docx - Google Docs - I. Terms communicating file type and imaging modality
  • Original draft of this proposal

  • Related issues:

@magpiedin changed the title from "Proposal for ac:3DResourceType - Justification & Rationale" to "3D terms Proposal: new dc:type term and new ac terms" on Jun 10, 2022
@AdamRountrey

There are indeed some problems associated with the current dc:type options for 3D data. Users with 3D data to characterize with Dublin Core have no clear guidance on dc:type, and 3D data is a category that end users may be seeking exclusively. Currently, a metadata creator could more or less justifiably characterize a 3D mesh as an "image", a "dataset", or an "interactive resource", and the multiple potential characterizations could make it difficult for end users to find appropriate 3D resources.

I think that most general-purpose repositories use "dataset" for 3D data. To reduce the potential use of "image" or "interactive resource", perhaps we should also see if DCMI would be willing to explicitly mention 3D data as an example for "dataset"?
