bia_mifa_models.yaml
---
id: https://w3id.org/BioImage-Archive/bia-mifa-models
name: bia-mifa-models
title: bia-mifa-models
description: |-
Schemas and models for working with the BioImage Archive's MIFA (Metadata, Incentives, Formats and Accessibility) metadata implementation
license: CC0
see_also:
- https://BioImage-Archive.github.io/bia-mifa-models
prefixes:
bia_mifa_models: https://w3id.org/BioImage-Archive/bia-mifa-models/
linkml: https://w3id.org/linkml/
schema: http://schema.org/
ORCID: https://orcid.org/
ROR: https://ror.org/
DOI: https://doi.org/
PMID: https://pubmed.ncbi.nlm.nih.gov/
WIKIDATA_PROPERTY: https://www.wikidata.org/prop/
pav: http://purl.org/pav/
example: https://example.org/
default_prefix: bia_mifa_models
default_range: string
imports:
- linkml:types
classes:
Study:
description: >-
General information about the study
mixins:
- PublicationsCollection
- AuthorCollection
- Links
- GrantReferenceCollection
slots:
- title
- description
- keywords
- license
- ai_models_trained
- acknowledgements
- funding_statement
Publications:
description: >-
Information about any publications associated with the dataset
slots:
- publication_title
- publication_authors
- publication_doi
- publication_year
- pubmed_id
id_prefixes:
- DOI
PublicationsCollection:
mixin: true
description: >-
A holder for Publications objects
attributes:
publications:
description: a collection of publications associated with a study
range: Publications
multivalued: true
inlined: true
Author:
description: >-
Information about the authors
mixins:
- OrganisationInfoCollection
slots:
- author_first_name
- author_last_name
- email
- orcid_id
- role
AuthorCollection:
mixin: true
description: >-
A holder for Author objects
attributes:
authors:
description: a collection of the authors of a study
range: Author
multivalued: true
inlined_as_list: true
OrganisationInfo:
description: Information about the organisation the author is affiliated with
slots:
- organisation_name
- address
- ror_id
OrganisationInfoCollection:
mixin: true
description: >-
A holder for OrganisationInfo objects
attributes:
organisation:
description: a collection of the name and address of organisations authors are affiliated with
range: OrganisationInfo
multivalued: true
inlined_as_list: true
GrantReference:
description: Information about grant ID and funding body that funded the study
slots:
- grant_id
- funder
GrantReferenceCollection:
mixin: true
description: >-
A holder for GrantReference objects
attributes:
grants:
description: a collection of grant ids and funder names associated with the study
range: GrantReference
multivalued: true
inlined_as_list: true
Links:
description: Links related to the study
mixin: true
slots:
- link_url
- link_description
FileLevelMetadata:
description: metadata attributes that must be detailed at the file level
slots:
- annotation_id
- annotation_type
- source_image_id
- transformations
- spatial_information
- annotation_creation_time
FileLevelMetadataCollection:
mixin: true
description: >-
A holder for FileLevelMetadata objects
attributes:
file_metadata:
description: a collection of the file level metadata for each annotation
range: FileLevelMetadata
multivalued: true
inlined_as_list: true
Annotations:
tree_root: true
description: >-
A set of annotations for an AI-ready dataset
mixins:
- AuthorCollection
- FileLevelMetadataCollection
slots:
- annotation_overview
- annotation_type
- annotation_method
- annotation_criteria
- annotation_coverage
- annotation_confidence_level
Version:
description: >-
Information about the dataset version
slots:
- version
- timestamp
- changes
- previous_version
slots:
title:
slot_uri: schema:title
description: The title for your dataset. This will be displayed when search results including your data are shown. Often this will be the same as the title of an associated publication.
required: true
examples:
- value: An annotated fluorescence image dataset for training nuclear segmentation methods
- value: Embryonic mice ultrasound volumes with body and brain volume segmentation masks
description:
slot_uri: schema:description
description: Brief description of the dataset. Must contain the biological application of the dataset
required: true
examples:
- value: >-
Accurate quantification of nuclear features is critical for the characterization of tumours.
This dataset contains annotated fluorescent images of cell nuclei from a normal human skin cell line, and from different tumour samples from cancer patients.
This dataset can be used to train nuclear image segmentation algorithms.
- value: >-
The number of cells in an embryo at early stages of development can be used to understand its viability, however counting cells on DIC images is a challenging task.
This dataset contains annotated DIC images of mouse embryos to train machine learning models.
keywords:
slot_uri: schema:keywords
description: Keywords or tags used to describe the dataset
required: true
multivalued: true
examples:
- value:
- segmentation
- nuclei
- cancer
- value:
- image segmentation
- high-frequency ultrasound
- mouse embryo
- volumetric deep learning
license:
slot_uri: schema:license
description: The license under which the data are available
required: true
range: LicenseType
examples:
- value: CC0
- value: CC_BY
ai_models_trained:
description: Link to the models that have been trained with this dataset (if any). This could be links to GitHub or repositories like the BioImage Zoo.
range: uriorcurie
multivalued: true
examples:
- value: https://github.com/BIIFSweden/MariaUlvmar2020-1
- value:
- https://github.com/perlfloccri/NuclearSegmentationPipeline
- https://bioimage.io/#/?id=10.5281%2Fzenodo.6028097
funding_statement:
description: Description of how the study was funded
required: true
examples:
- value: >-
T.G. was supported by an EMBO Long-Term Fellowship (ALTF 1234-5678), and the European Union Horizon 2020 research and innovation programme
under the Marie Sklodowska-Curie grant agreement No 12345. This work was supported by a Cancer Research UK Programme Grant (DRCNPG-123\123456) to J.C.
- value: >-
This work was facilitated by an EraSME grant (project TisQuant) under the grant no. 844198 and by a COIN grant (project VISIOMICS)
under the grant no. 861750, both grants kindly provided by the Austrian Research Promotion Agency (FFG), and the St. Anna Kinderkrebsforschung.
Partial funding was further provided by BMK, BMDW, and the Province of Upper Austria in the frame of the COMET Programme managed by FFG.
grant_id:
description: The identifier for the grant
identifier: true
required: true
slot_uri: schema:identifier
examples:
- value: ALTF 1234-567
- value: "840767"
funder:
slot_uri: schema:funder
description: The funding body providing support
required: true
examples:
- value: EMBO Long-Term Fellowship
- value: Marie Sklodowska-Curie Individual Fellowship
publication_title:
slot_uri: schema:title
description: The title of a publication associated with the dataset
required: true
examples:
- value: An annotated fluorescence image dataset for training nuclear segmentation methods
- value: "Deep Mouse: An End-to-End Auto-Context Refinement Framework for Brain Ventricle & Body Segmentation in Embryonic Mice Ultrasound Volumes"
publication_authors:
slot_uri: schema:author
description: Authors of associated publication
required: true
examples:
- value: Truman Grayson, Josiah Carberry
- value: J. Perez, F. Fulan, J. Doe, W. Wu and J. Novak
publication_doi:
identifier: true
slot_uri: WIKIDATA_PROPERTY:P356
range: uriorcurie
description: Digital Object Identifier (DOI)
examples:
- value: DOI:10.1242/jcs.033340
- value: DOI:10.1242/jcs.174946
publication_year:
description: Year of publication
examples:
- value: "2021"
- value: "1986"
pubmed_id:
slot_uri: WIKIDATA_PROPERTY:P698
range: uriorcurie
description: Identifier for journal articles/abstracts in PubMed
examples:
- value: PMID:18492790
- value: PMID:26136366
link_url:
description: URL of relevant link
range: uriorcurie
multivalued: true
required: true
examples:
- value:
- https://github.com/BioImage-Archive
- https://www.ebi.ac.uk/empiar/EMPIAR-10459/
link_description:
description: The description of the linked content
multivalued: true
examples:
- value:
- Image analysis code
- FIB SEM dataset matching the light microscopy data
acknowledgements:
description: Any people or groups that should be acknowledged as part of the dataset
examples:
- value: >-
We thank the Institute microscopy facility for support with image acquisition.
Flybase provided important information for this study.
- value: >-
We are grateful to Y. Papadopoulos for providing reagents and access to her confocal microscope.
We thank J. Dupont and E. Mustermann for comments on the manuscript, and G. Hong for useful discussions.
orcid_id:
slot_uri: WIKIDATA_PROPERTY:P496
range: uriorcurie
description: A unique identifier for an author
examples:
- value: ORCID:0000-0002-1825-0097
- value: ORCID:0000-0002-9079-593X
author_first_name:
slot_uri: schema:name
description: First name for an author
required: true
examples:
- value: Josiah
- value: Marie
author_last_name:
slot_uri: schema:name
description: Last name for an author
required: true
examples:
- value: Carberry
- value: "Skłodowska–Curie"
email:
slot_uri: schema:email
description: Email address of a person
pattern: "^\\S+@[\\S+\\.]+\\S+"
examples:
- value: jcarberry@example.com
- value: mariesc@example.com
organisation_name:
slot_uri: schema:name
description: Organisation name
required: true
examples:
- value: EMBL-EBI
- value: University of Toronto
ror_id:
range: uriorcurie
slot_uri: WIKIDATA_PROPERTY:P6782
description: Identifier for the Research Organization Registry
examples:
- value: ROR:02catss52
- value: ROR:03dbr7087
address:
slot_uri: schema:name
description: Organisation address
examples:
- value: Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.
- value: 27 King's College Cir, Toronto, ON M5S, Canada.
role:
slot_uri: schema:roleName
description: Role of the author when creating the dataset
multivalued: true
examples:
- value:
- data annotation
- data acquisition
- conceptualization
- value: data curator
annotation_overview:
description: Short descriptive summary indicating the type of annotation and how it was generated
required: true
examples:
- value: Segmentation masks of human cell nuclei curated by experts from a model prediction
- value: Manual quantification of the number of cells in mouse embryos performed by experts
annotation_type:
required: true
range: AnnotationType
multivalued: true
description: Annotation type, for example class labels, segmentation masks, counts...
examples:
- value:
- segmentation_mask
- class_labels
- value: counts
annotation_method:
required: true
description: >-
Description of how the annotations were created. For example, were the annotations crowdsourced or expertly annotated,
produced by a human or software, what software was used, when were the annotations created,
protocols used for consensus and quality assurance
examples:
- value: >-
A machine learning-based framework was utilized to produce a first coarse annotation of nuclear contours.
Then, a disease expert curated all annotations by editing, adding, or removing polygons.
Finally, an expert pathologist revised all image annotations and a final version of the annotations was curated.
- value: >-
Cell counts were performed manually by expert biologists. For cases difficult to quantify,
a vote including all experts was taken and annotations were corrected according to the majority view.
annotation_criteria:
description: Rules used to generate annotations
examples:
- value: Only nuclei in focus were annotated, if parts of a nucleus were out of focus, only the part of the nucleus being in focus was annotated
- value: >-
During cell counting, cell division was tracked using anillin localization. Mitosis was considered completed when anillin stopped localising
at the contractile ring and re-entered the cell nucleus, at which point the dividing cell was considered two different cells.
annotation_coverage:
description: Which images from the dataset were annotated, and what percentage of the data has been annotated from what is available
examples:
- value: All images were annotated
- value: "Images: image_1, image_2 and image_3 were not annotated due to the abnormal cell death rates compared to other samples."
source_image_id:
required: true
description: identifier of the source image from which the annotation originated
examples:
- value: IM1
- value: "12532806"
annotation_id:
description: The identifier for the annotation
identifier: true
required: true
slot_uri: schema:identifier
examples:
- value: AN1
- value: "12345678"
annotation_confidence_level:
description: Confidence in annotation accuracy
examples:
- value: Curators had 15 and 20 years of experience working in cancer pathology
- value: >-
To characterize the quality of gold standard segmentation annotations,
we calculated the average and standard deviation performance of the three independent annotators
relative to the gold truth that was established by merging the triplets of manual annotations.
For the whole dataset this metric was 0.904±0.081
spatial_information:
description: Spatial information for non-pixel annotations
examples:
- value: The pixels belonging to the mouse embryo related to this cell count are provided in the binary mask with annotation id AN1237946
- value: Each annotated object contains a category_id indicating the annotation class label, and an associated bounding box delineating the corresponding blood vessel
transformations:
description: Any transformations required to link the images to the annotations
examples:
- value: Rotation_angle = 37°
- value: Translation_horizontal_pixel_displacement = 150, Translation_vertical_pixel_displacement = 92
annotation_creation_time:
description: Date and time when the annotation was created
range: datetime
slot_uri: pav:authoredOn
examples:
- value: "2021-04-11T11:00:15-05:00"
- value: "2017-01-23T18:22:04-00:00"
version:
required: true
range: string
slot_uri: pav:version
description: Unique version number
examples:
- value: v1.1
- value: "3.0"
timestamp:
required: true
range: datetime
slot_uri: pav:authoredOn
description: Date and time when the version was created
examples:
- value: "2023-12-10T12:20:00-05:00"
- value: "2021-07-03T13:26:00-00:00"
changes:
range: string
description: Textual description of changes compared to previous version
examples:
- value: The segmentation mask for image 10 (segmentation_10.tif) has been updated with the annotation of two nuclei that were missing in the previous version
- value: New annotations for images 12047635 and 12063916 have been included in the dataset with annotation ids AN12047635 and AN12063916 respectively
previous_version:
range: uriorcurie
slot_uri: pav:previousVersion
description: Pointer to previous version
examples:
- value: example:version1.0
- value: example:version2.9
enums:
AnnotationType:
permissible_values:
class_labels:
description: tags that identify specific features, patterns or classes in images
todos:
- map this to an ontology
bounding_boxes:
description: rectangles completely enclosing a structure of interest within an image
todos:
- map this to an ontology
counts:
description: number of objects, such as cells, found in an image
todos:
- map this to an ontology
derived_annotations:
description: additional analytical data extracted from the images. For example, the image point spread function, the signal to noise ratio, focus information…
todos:
- map this to an ontology
geometrical_annotations:
description: polygons and shapes that outline a region of interest in the image. These can be geometrical primitives, 2D polygons, 3D meshes…
todos:
- map this to an ontology
graphs:
description: graphical representations of the morphology, connectivity, or spatial arrangement of biological structures in an image. Graphs, such as skeletons or connectivity diagrams, typically consist of nodes and edges, where nodes represent individual elements or regions and edges represent the connections or interactions between them
todos:
- map this to an ontology
point_annotations:
description: X, Y, and Z coordinates of a point of interest in an image (for example an object's centroid or landmarks).
todos:
- map this to an ontology
segmentation_mask:
description: an image, the same size as the source image, with the value of each pixel representing some biological identity or background region
todos:
- map this to an ontology
tracks:
description: annotations marking the movement or trajectory of objects within a sequence of bioimages
todos:
- map this to an ontology
weak_annotations:
description: rough imprecise annotations that are fast to generate. These annotations are used, for example, to detect an object without providing accurate boundaries
todos:
- map this to an ontology
other:
description: other types of annotations, please specify in the annotation overview section
todos:
- map this to an ontology
LicenseType:
permissible_values:
CC0:
description: No Copyright. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
meaning: https://creativecommons.org/publicdomain/zero/1.0/
CC_BY:
description: >-
You are free to: Share — copy and redistribute the material in any medium or format. Adapt — remix, transform, and build upon the material
for any purpose, even commercially. You must give appropriate credit, provide a link to the license, and indicate if changes were made.
You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
meaning: https://creativecommons.org/licenses/by/4.0/
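#
# Illustrative sketch only (not part of the schema): a minimal data instance
# for the tree_root class Annotations, assuming the slots and mixins defined
# above. All values are borrowed from the `examples` entries in this file;
# their combination here is hypothetical and has not been validated.
#
#   annotation_overview: Segmentation masks of human cell nuclei curated by experts from a model prediction
#   annotation_type:
#     - segmentation_mask
#   annotation_method: >-
#     A machine learning-based framework was utilized to produce a first coarse annotation of nuclear contours.
#   authors:
#     - author_first_name: Josiah
#       author_last_name: Carberry
#       email: jcarberry@example.com
#       orcid_id: ORCID:0000-0002-1825-0097
#   file_metadata:
#     - annotation_id: AN1
#       annotation_type:
#         - segmentation_mask
#       source_image_id: IM1
#       annotation_creation_time: "2021-04-11T11:00:15-05:00"
#
# Only the slots marked `required: true` above (annotation_overview,
# annotation_type, annotation_method, and, per file entry, annotation_id,
# annotation_type and source_image_id) are expected to be mandatory; the
# author fields other than the two name slots are optional.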