\documentclass[11pt,a4paper]{ivoa}
\input tthdefs
% widen up the display a bit so that 75 column listings still fit on
% the page
\usepackage[width=14cm,left=4cm]{geometry}
\usepackage{listings}
\lstloadlanguages{XML}
\lstset{flexiblecolumns=true,tagstyle=\ttfamily,showstringspaces=False}
\usepackage{amsmath}
\iftth
\newcommand{\tapschema}{TAP\_SCHE\-MA}
\hyphenation{TAP\_SCHEMA}
\hyphenation{\tapschema}
\newcommand{\tapupload}{TAP\_UPLOAD}
\else
\newcommand{\tapschema}{\mbox{%
TAP\discretionary{-}{}{\kern-2pt\_}SCHEMA}}
\newcommand{\tapupload}{%
TAP\discretionary{-}{}{\kern-2pt\_}UPLOAD}
\fi
\title{VODataService: A VOResource Schema Extension for Describing
Collections and Services}
\ivoagroup{Registry}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/RayPlante]{Raymond Plante}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/MarkusDemleitner]{Markus Demleitner}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/AurelienStebe]{Aurélien Stébé}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/KevinBenson]{Kevin Benson}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/PatrickDowler]{Patrick Dowler}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/MatthewGraham]{Matthew Graham}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/GretchenGreene]{Gretchen Greene}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/PaulHarrison]{Paul Harrison}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/GerardLemson]{Gerard Lemson}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/TonyLinde]{Tony Linde}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/GuyRixon]{Guy Rixon}
\editor{Markus Demleitner}
\editor{Ray Plante}
\previousversion[https://www.ivoa.net/documents/VODataService/20211102]{REC-1.2}
\previousversion[https://ivoa.net/documents/VODataService/20190715/PR-VODataService-1.2-20210223.html]{PR-2021-02-23}
\previousversion[https://ivoa.net/documents/VODataService/20190715/PR-VODataService-1.2-20190715.html]{PR-2019-07-15}
\previousversion[https://ivoa.net/documents/VODataService/20181026/]{WD-2018-10-26}
\previousversion[http://www.ivoa.net/Documents/VODataService/20101202]{REC
1.1}
\begin{document}
\begin{abstract}
VODataService refers to an XML encoding standard for a specialized
extension of the IVOA Resource Metadata that is useful for describing
data collections and the services that access them. It is defined as
an extension of the core resource metadata encoding standard known as
VOResource \citep{2018ivoa.spec.0625P} using XML Schema.
The specialized resource types defined by the VODataService schema
allow one to describe how the data underlying the resource cover the
sky and the frequency and time axes.
VODataService also enables detailed
descriptions of tables that include information useful to the
discovery of tabular data. It is intended that the VODataService data
types will be particularly useful in describing services that support
standard IVOA service protocols.
\end{abstract}
\section*{Acknowledgments}
Versions 1.0 and 1.1 of this document have been developed with support from the
National Science Foundation's
Information Technology Research Program under Cooperative Agreement
AST0122449 with The Johns Hopkins University, from the
UK Particle Physics and Astronomy
Research Council (PPARC), from the European Commission's (EC)
Sixth
Framework Programme via the
Optical Infrared Coordination Network (OPTICON), and from EC's
Seventh Framework Programme
via its
eInfrastructure Science Repositories initiative.
Version 1.2 of this document was developed in part with support from the
German federal ministry for research and education's e-inf-astro project (BMBF
FKZ 05A17VH2).
\section*{Conformance-related definitions}
The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.
The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.
\section*{Syntax Notation Using XML Schema}
The eXtensible Markup Language, or XML, is a document syntax for marking
textual information with named tags and is defined by \citet{std:XML}.
The set of XML tag names and the syntax
rules for their use is referred to as the document schema. One way to
formally define a schema for XML documents is using the W3C standard
known as XML Schema \citep{std:XSD}.
The XML Schemas of VODataService as well as VOResource and its other
extensions are
available from the IVOA schema
repository\footnote{\url{http://www.ivoa.net/xml}} at any time.
Parts of the schema appear within the main sections of this document;
however, documentation nodes have been left out for the sake of brevity.
Where the content of the pieces of schema embedded in this text
diverges from the schema document in the IVOA schema
repository, the version in the schema repository is authoritative.
References to specific elements and types defined in the VOResource
schema include the namespace prefix \xmlel{vr} as in
\xmlel{vr:Resource} (a type defined in the VOResource schema).
\section{Introduction}
The VOResource standard \citep{2018ivoa.spec.0625P} provides a means of
encoding resource metadata as defined by DataCite \citep{std:DataCite40}
and the VO-specific IVOA Resource Metadata \citep{2007ivoa.spec.0302H} in XML.
VOResource uses XML Schema \citep{std:XSD} to define
most of the XML syntax rules (while a few of the syntax rules are
outside the scope of Schema). VOResource also describes mechanisms
for creating extensions to the core VOResource metadata. This allows
for the standardization of new metadata for describing specialized
kinds of resources in a modular way without deprecating the core
schema or other extensions.
This document defines one such extension referred to as VODataService.
It provides types to define data services, their underlying tabular
structures, their service interfaces, and the location of the data
served in space, time, and energy.
The remainder of the document introduces the use cases addressed by this
specification, then provides a high-level overview of the concepts and
usage patterns in sect.~\ref{sect:model} and finally discusses the
concrete classes in sect.~\ref{sect:metadata}.
\subsection{The Role in the IVOA Architecture}
\begin{figure}
\centering
\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{Architecture diagram for VODataService}
\label{fig:archdiag}
\end{figure}
Fig.~\ref{fig:archdiag} shows the role VODataService plays within the
IVOA Architecture \citep{2010ivoa.rept.1123A}.
VODataService directly depends on the following other VO standards
(unless specified otherwise, the dependency is on the major version of
the cited standard rather than on the exact version):
\begin{description}
\item[VOResource, v1.1 \citep{2018ivoa.spec.0625P}] VOResource gives
the fundamental types and structures extended here.
\item[STC, v1.33 \citep{2007ivoa.spec.1030R}] The deprecated mechanism
for declaring coverage through STCResourceProfile still uses concepts
from version 1 of the IVOA data model for Space-Time Coordinates. The
updated mechanism has no such dependence any more.
\end{description}
VODataService is closely related to the following other VO standards:
\begin{description}
\item[VOSI, v1.1 \citep{2017ivoa.spec.0524G}] VODataService defines the
schema for the responses on the table metadata endpoint. It also
defines the ParamHTTP interface type currently used in the capabilities of most
standard protocols.
\item[RegTAP, v1.1 \citep{2019ivoa.spec.1011D}] RegTAP maps the concepts
defined here into a relational structure. In that sense it is the
user interface to what is specified here. RegTAP will need an update
to support the space-time constraints added here.
\item[MOC, v1.1 \citep{2019ivoa.spec.1007F}] Multi-Order coverage maps
are used by VODataService to communicate spatial coverage.
\end{description}
\subsection{Purpose}
The purpose of this extension is to define common XML Schema
types -- in particular new resource types -- that are useful for describing
data collections and services that access data. One aspect of such a
description is the resource's \emph{coverage}: the parts of the
sky with which the data is associated and the time and frequency ranges that
were observed or modeled to create the data. Another important aspect
is the detailed metadata for tables underlying the resource, including
names, types, UCDs
\citep{2005ivoa.spec.1231D}, units,
and textual descriptions for the columns making them up.
Resource records using VODataService types are commonly used to register
services that support standard IVOA data access layer protocols such
as Simple Image Access \citep{2015ivoa.spec.1223D} and Simple Cone Search
\citep{2008ivoa.specQ0222P}. As of October 2018, there are more than
20000~resources of type \xmlel{vs:CatalogService} in the VO Registry.
While other VOResource extensions
define the protocol-specific metadata (encapsulated as a standard
\emph{capability} from VOResource), the general service
resource description shares the common data concepts such as
coverage and tabular data. Note, however, that a service described
using the VODataService schema need not support any standard
protocols. With the VODataService extension schema plus the core
VOResource schema, it is possible to describe a custom service
interface that accesses data.
As a legal extension of VOResource, the use
of VODataService is subject to the rules and recommendations for XML
\citep{std:XML}, XML Schema \citep{std:XSD},
and VOResource itself.
\subsection{Additional Use Cases for Version 1.2}
In the following, we collect use cases that guided the development of
VODataService to its version 1.2. We do not formally derive
requirements from them but briefly note which new features enable or
facilitate the specific use case.
A few of the changes in version 1.2 are necessary for consistency with other standards
such as TAP (extendedType interpretation, requirement to use ADQL
delimited identifier literals in names where appropriate) or VOTable
(arraysize interpretation). These were obviously not guided by specific
use cases.
\paragraph{What services have data for the Crab nebula covering the
H$\boldsymbol\alpha$
line taken in the second half of 2015?} In version 1.1, this use case
would have been covered by the \xmlel{stc:STCResourceProfile} type,
which, however, was never properly standardised or widely adopted. In the current
version, the \xmlel{spatial}, \xmlel{spectral}, and \xmlel{temporal}
children of \xmlel{coverage} enable discovery by coverage on the various
axes. It is worth noting that the spectral coverage is for the solar
system barycenter, so this use case does \emph{not} immediately enable
the discovery of, say, H$\alpha$ images of remote galaxies. Redshift
correction has to be applied by the client based on knowledge about the
object(s) investigated. At the time of writing, coverage also does not
directly address non-celestial reference systems, although in particular
planetary surfaces are considered in scope, and the coverage element's
\xmlel{@frame} attribute is defined to ensure non-ICRS coverages can
safely be declared as the need arises.
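
As a rough illustration of the three coverage axes, a record matching
the query above might carry a \xmlel{coverage} element like the
following sketch.  It assumes the serialisations this specification
uses for these elements (an ASCII MOC for \xmlel{spatial}, a pair of
MJD limits for \xmlel{temporal}, and a pair of messenger energies in
Joule for \xmlel{spectral}).  The MOC cells are arbitrary placeholders
and do not actually correspond to the position of the Crab nebula.
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- placeholder MOC cells, for illustration only -->
  <spatial>6/1234 7/5678-5680</spatial>
  <!-- 2015-07-01 through 2016-01-01, given as MJD -->
  <temporal>57204 57388</temporal>
  <!-- a narrow band around H-alpha (roughly 3.027e-19 J) -->
  <spectral>3.02e-19 3.03e-19</spectral>
  <waveband>Optical</waveband>
</coverage>
\end{lstlisting}
A discovery client would then test the position, epoch, and energy it
is interested in against each of the three children in turn.
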
\paragraph{Find all ObsCore services publishing data taken at the
Telescope X.} This use case could be satisfied in version 1.1 through
the use of \xmlel{vs:DataCollection} records and relationships to the
respective TAP services. However, this scheme led to error-prone query
patterns, and few such data collections were actually registered; see
the IVOA Note on discovering data collections \citep{2019ivoa.spec.0520D} for
details. To better support the scheme proposed there, version 1.2 adds
the \xmlel{vs:DataResource} and \xmlel{vs:CatalogResource} types,
which identify a resource as data-like but
permit the addition of various capabilities to the record (which
\xmlel{vs:DataCollection} did not). An analogous use case would be
``Find all TAP services publishing tables from Gaia DR2''.
\paragraph{Find a large-scale survey of sources between 20 and 40 GHz.}
While the spectral constraint is easily satisfied by the new coverage
children, the ``large-scale'' part is much harder to operationalize.
However, the plain table size often is a useful proxy in such discovery
problems. The new \xmlel{nrows} child of \xmlel{vs:Table} communicates
it.
\subsection{Additional Use Case for Version 1.3}
\paragraph{Find services serving time series.} In the previous registry
model, searches for a certain kind of data product were linked to
searches for certain kinds of services. For instance, clients looking
for spectra were querying the registry for SSAP services. This model
became outdated once ObsTAP services also offered, say, spectra,
and SSAP services increasingly served time series. Thus,
when the IVOA product type
vocabulary\footnote{\url{http://www.g-vo.org/rdf/product-type}} became
available, its adoption by VODataService was natural. It is now used
to let data collections and services explicitly declare which sort of
data they contain or serve.
\subsection{Additional Use Cases for Future Versions}
The following use cases were originally envisioned for VODataService
1.2, but were postponed because building multiply implemented solutions
for them seemed likely to unnecessarily delay the standardisation of, in
particular, the STC part of the present document. They will likely be
addressed by future versions of VODataService.
\paragraph{Find a resource that has sources in M51 down to 27 mag in V.}
The constraint about finding a resource that has V magnitudes for M51 is
expressible using spatial coverage and the column's UCDs. To express
something like ``down to $27^{\rm m}$'' one would at least need
VOTable-style \xmlel{VALUES} children for columns; however, metadata
rich enough to address the next use case would certainly suffice
here as well.
\paragraph{Plan a cross-service query.} Systems like OGSA-DAI
\citep{2011ASPC..442..579H} perform orchestration of SQL-like queries
between multiple services automatically, in particular cross-service
JOINs. In order to work efficiently, such services need column
statistics like histograms and the percentage of NULL values.
\paragraph{Facilitate discovery of full DALI services.} The issue here
is that DALI foresees synchronous and asynchronous endpoints as the
standard case for many protocols -- it already is standard for TAP.
Also, several auxiliary endpoints (mostly defined in VOSI) are declared as
separate capabilities and need to be matched with the functional
endpoints. This matching is becoming a problem when multiple
authentication schemes or mirror sites necessitate multiple sync/async
pairs. A longer treatment of this problem has been published while this
document was in WD \citep{note:caproles}, but at the time of writing the
process of finding consensus has just begun, so again a normative
solution has to be deferred to a later version of this standard.
\section{The VODataService Data Model}
\label{sect:model}
The VODataService extension in general enables the description of two
types of resources: Services that access data on the one side, and data
being accessed through services on the other.
For simple services just publishing a simple resource -- still a fairly
common pattern -- the metadata of the data published can be folded into
the service record.
Here is an example of such a record (abbreviated for the
purposes of illustration) that describes a service from the NASA
Extragalactic Database (NED) that provides measured redshifts for a
given object.
\lstinputlisting[language=XML,basicstyle=\footnotesize,numbers=left]{ipac-resource.xml}
This example illustrates some of the features of the VODataService
extension:
\begin{enumerate}
\item The specific type of resource indicated by
the value of the \xmlel{xsi:type} attribute; in this case
\xmlel{vs:CatalogService} indicates that this is
describing a service that accesses tabular data (line 1).
\item The extra namespace declaration for
VODataService metadata with the canonical prefix (line 5).
\item The location of the VOResource-related schema
documents used by this description (line 7ff.)
\item The core VOResource metadata (line 12ff.)
\item An interface described by the
VODataService type \xmlel{vs:ParamHTTP}; this
type can indicate input arguments the service supports (line
40ff.)
\item A description of the
coverage, including quantitative coverage
plus waveband keywords (line 62ff.)
\item A description of the table that is returned
by the service (line 73ff.)
\end{enumerate}
\subsection{The Schema Namespace and Location}
The namespace associated with VODataService extensions is
$$\mbox{\texttt{http://www.ivoa.net/xml/VODataService/v1.1}}.$$
As required by the IVOA schema versioning policies
\citep{2018ivoa.spec.0529H}, this namespace is identical to the one
associated with version 1.1 of this document. It is regrettable that a
misleading minor version is present in the namespace URI, but dropping
it would break existing software for creating and processing
VODataService instance documents. Hence, the namespace URI ending in
\verb|1.1| is also used for schema versions 1.2, 1.3, and so forth.
Resolving the VODataService namespace URI will redirect to a schema
document having the actual version number (for the schema associated
with this document version, this will end in VODataService-1.3.xsd).
Following the schema versioning policies, the minor version will be
declared in the \xmlel{version} attribute of this file's root element.
This information should not in general be used in production software;
all versions with the above schema URI are compatible with each
other in the sense defined in the IVOA schema versioning policies.
Authors of VOResource instance documents may choose to
provide a location for the VOResource XML Schema document and its
extensions using the
\xmlel{xsi:schemaLocation} attribute. While authors are free to
choose a location (as long as it resolves to the schema document), this
specification
recommends using the VODataService namespace URI as its location URL
(as illustrated in the example above), as in
\begin{verbatim}
xsi:schemaLocation="http://www.ivoa.net/xml/VODataService/v1.1
http://www.ivoa.net/xml/VODataService/v1.1"
\end{verbatim}
The canonical prefix for VODataService is \xmlel{vs}; this means, in
particular, that in non-XML contexts (e.g., relational mappings
like RegTAP) the VODataService types \emph{must} be qualified with
vs:.
As recommended by the VOResource standard, the
VODataService schema sets \xmlel{element\-Form\-Default} to \emph{unqualified}.
This means that element names defined
in this schema may not be used with a namespace prefix.
The only place the namespace prefix must be used is the
type names given as the value of an
\xmlel{xsi:type} attribute.
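
The following fragment sketches how these conventions combine in an
instance document.  The enclosing \xmlel{ri:Resource} element and the
\xmlel{ri} and \xmlel{vr} namespace declarations are assumptions
following common Registry practice, the title and identifier are
hypothetical, and required attributes and children are elided.
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<ri:Resource
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
    xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
    xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="vs:CatalogService"
    xsi:schemaLocation="http://www.ivoa.net/xml/VODataService/v1.1
      http://www.ivoa.net/xml/VODataService/v1.1">
  <!-- elements defined in the schema carry no namespace prefix -->
  <title>Example Catalogue Service</title>
  <identifier>ivo://example/catsvc</identifier>
  ...
</ri:Resource>
\end{lstlisting}
Note that the \xmlel{vs} prefix appears only in the namespace
declaration and in the value of \xmlel{xsi:type}, never on an element
name.
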
\subsection{Summary of Metadata Concepts}
\label{sect:summ}
VODataService defines several resource types and some auxiliary classes
necessary to describe resources of these new types.
\subsubsection{Auxiliary Classes}
The VODataService type \xmlel{vs:Coverage} allows the declaration of
the physical coverage of a resource on the sky (or on spherical bodies),
in time, and in the energy of the messenger particle. In addition, the
element should contain a rough indication of the messenger type
(e.g., ``Optical''), and it can contain a link to a footprint endpoint and an
indication of the spatial resolution within a service.
VODataService has several classes for the declaration of the tabular
structure underlying a service; this can be the tables in a
TAP-accessible resource,
or it can relate to the data structure returned by the service. This
metadata is held in
\xmlel{vs:TableSet}-typed elements consisting
of one or more (possibly anonymous)
\xmlel{vs:TableSchema} instances. These have some very basic metadata
(name, title, description, utype) and otherwise contain \xmlel{vs:Table}
instances. These in turn have simple metadata, but mainly give column
metadata (including UCDs and units) in \xmlel{vs:TableParam}-typed
children. These are the basis for enabling discovery queries like
``find all resources having redshifts and far infrared fluxes''.
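
To make this nesting concrete, here is a minimal sketch of a tableset
that would match the discovery query just quoted.  The schema, table,
and column names are hypothetical, and the UCDs and units are merely
plausible choices for such columns; the \xmlel{vs:VOTableType} used for
the column types is introduced further below.
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<tableset>
  <schema>
    <name>example</name>
    <table>
      <name>example.galaxies</name>
      <column>
        <name>z</name>
        <description>Measured redshift</description>
        <ucd>src.redshift</ucd>
        <dataType xsi:type="vs:VOTableType">double</dataType>
      </column>
      <column>
        <name>f_fir</name>
        <description>Far infrared flux density</description>
        <unit>Jy</unit>
        <ucd>phot.flux.density;em.IR.FIR</ucd>
        <dataType xsi:type="vs:VOTableType">float</dataType>
      </column>
    </table>
  </schema>
</tableset>
\end{lstlisting}
A Registry query would then look for resources with one column having a
redshift UCD and another having a far infrared flux UCD.
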
Note that table and schema metadata is deliberately shallow. If the
main resource metadata is not enough to discover the table -- as is
fairly typical when a TAP service contains multiple unrelated tables --,
data providers should define separate records for them as described in
sect.~\ref{sect:discoverdata}.
VODataService further defines a specialized interface type
(inheriting from \xmlel{vr:Interface}) called
\xmlel{vs:ParamHTTP}. This type is used to describe
straightforward HTTP interfaces directly operating on
arguments encoded as
\emph{name=value} pairs. Such interface declarations can
enumerate a service's input arguments, which enables clients
to generate simple
user interfaces from VOSI capabilities responses.
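
As an illustration, an interface declaration for a hypothetical name
lookup service might look like the sketch below.  The access URL and
the parameter are invented for this example; a client could render the
single declared parameter as a text input field.
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<interface xsi:type="vs:ParamHTTP">
  <accessURL use="base"
    >http://example.org/svcs/lookup?</accessURL>
  <queryType>GET</queryType>
  <resultType>application/x-votable+xml</resultType>
  <!-- a custom (non-standard) input argument -->
  <param std="false">
    <name>objname</name>
    <description>Name of the object to look up</description>
    <dataType>string</dataType>
  </param>
</interface>
\end{lstlisting}
Under these assumptions, a request would be formed by appending
\emph{name=value} pairs such as \verb|objname=M51| to the access URL.
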
To be able to express the types of table columns and service arguments
alike, VODataService defines several type systems. All of these are
basically just enumerations of type names, possibly with some additional
metadata like VOTable-style array sizes. In new resource records, only
\xmlel{vs:SimpleDataType} (for ParamHTTP parameters on non-DALI
interfaces) and
\xmlel{vs:VOTableType} (for table columns) should be used.
\subsubsection{VODataService Resource Classes}
\label{sect:rescls}
\begin{figure}
\includegraphics{resclasses.pdf}
\caption{The four major resource classes in VODataService and their
derivation tree}
\label{fig:rescls}
\end{figure}
While no common VO service discovery pattern relies on the XSD type of the
resource,\footnote{Of course, in non-service discovery (e.g., authorities
or standards), resource types are important.} resource
record authors should
nevertheless choose appropriate types for their resources. At the
very least, this helps Registry maintenance.
VODataService provides four major resource classes; their derivation
tree is shown in Fig.~\ref{fig:rescls}. The vertical distinction in
that figure reflects whether a resource has an associated tableset
(DataX vs.~CatalogX); you would typically use the DataX types when a
resource does not naturally admit a relational structure. The classical
example for this would be a collection of files on an FTP server. CatalogX,
on the other hand, has an associated tableset. That does not mean that
it is limited to what is conventionally thought of as a ``catalogue'',
i.e., a table of data on astronomical objects. On the
contrary, CatalogX should also be used for collections of images,
spectra, time series, etc., as long as their metadata is sufficiently
structured. That a data collection is published through the standard
IVOA discovery protocols (ObsCore,
SIAP, SSAP) certainly is a strong indication that this requirement is
satisfied.
The horizontal distinction (XResource vs.~XService) is somewhat more
subtle and will be discussed in sect.~\ref{sect:discoverdata}.
Two further resource classes defined in VODataService 1.1,
\xmlel{vs:DataCollection} and \xmlel{vs:StandardSTC},
are deprecated in version 1.2.
Resource record authors are requested to migrate or discard resource
records using these deprecated types. If all such records have
disappeared from the VO by version 1.3 of this specification, their
type declarations may be removed from the schema.
\subsubsection{Discovering Data Within Other Services}
\label{sect:discoverdata}
The content models of CatalogResource and CatalogService are identical.
To understand why the two classes are present nevertheless, a short
historical excursion is in order. A full treatment of this is found in
the IVOA Endorsed Note on discovering data collections within services
\citep{2019ivoa.spec.0520D}, hereafter referred to as DDC.
In the early VO, there was almost always a $1:1$ relationship between
data collections and services. For instance, a catalogue of variable
stars had a single cone search interface, and the spectra from a given
spectrograph had a single SSA interface. This is the model reflected by
the CatalogService class, and it was, by and large, sufficient for
IVOA's ``simple'' protocols (SCS, SIAP, SSAP, SLAP), although even these
protocols have facilities for multi-collection services.
With the advent of services very naturally publishing what clearly
are multiple different resources -- TAP, with its multiple tables, and
ObsCore building on top of it are the prime examples --, this model
proved inadequate; furthermore, publishers increasingly offered data
through multiple interfaces: An object catalogue now quite typically is
published as both a cone search service and within a TAP service; an
image collection could have both SIAP and ObsCore interfaces.
Thus, the relationship between data collections and services gradually
became $n:m$. With this came the realisation that two classes of
discovery need to be supported; for all-VO queries, validation, and the
like, an enumeration of all \emph{services} (``give me all SSAP
services\dots'') needs to be performed. For data discovery (``Where are
images from instrument X of object Y?''), a selection over all
\emph{collections} needs to be made.
The solution eventually adopted for these problems is auxiliary
capabilities as introduced in DDC. They provide stubs merely
identifying an access protocol -- at the same time identifying the
capability as an auxiliary one -- and access URLs, delegating all
further service metadata to a separate record, linked to the original
resource through an \emph{isServedBy} relationship.
Thus, CatalogService-typed records should be used
\begin{itemize}
\item when a service serves a single data collection, in which case the
metadata on the record describes the data collection (e.g., ``These are
observations of\dots'').
\item for a service serving multiple data collections, in which case the
metadata on the record describes the service (e.g., ``This is the TAP
service of\dots'').
\end{itemize}
CatalogResource-typed records should be used for resources that only
have auxiliary capabilities.
Here is a sketch of a resource record of a table within a TAP service:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<ri:Resource
xsi:type="vs:CatalogResource">
<title>Sample Table</title>
<identifier>ivo://example/sample</identifier>
<curation>...</curation>
<content> ...
<relationship>
<relationshipType>IsServedBy</relationshipType>
<relatedResource ivo-id="ivo://example/tap"
>Example TAP service</relatedResource>
</relationship>
</content>
<capability standardID="ivo://ivoa.net/std/TAP#aux">
<interface role="std" xsi:type="vs:ParamHTTP">
<accessURL use="base"
>http://example.org/svcs/tap</accessURL>
</interface>
</capability>
<coverage>...</coverage>
<tableset>
<schema>
<name>someschema</name>
<table>
<name>someschema.sample</name>...
...
</table>
</schema>
</tableset>
</ri:Resource>
\end{lstlisting}
A complete example can be found in DDC.
Note that it is legal to add auxiliary capabilities to CatalogService
records as well. The classical example could be a cone search service
serving a single catalogue that is also queriable within a larger TAP
service.
Analogous considerations would apply to DataResource versus DataService,
although at this point no obvious use cases have been identified;
DataResource was included mainly for symmetry with CatalogResource.
\section{The VODataService Metadata}
\label{sect:metadata}
This section enumerates the types and elements defined in the
VODataService extension schema and describes their meaning.
\subsection{VODataService Resource Types}
\label{sect:resext}
For an overview of the systematics of the following resource types,
please see Sect.~\ref{sect:rescls}.
\subsubsection{DataResource}
\label{sect:DataResource}
The \xmlel{vs:DataResource} resource type is used to describe a
resource containing generic astronomical data without a dominating
tabular structure. An example might be a largely unstructured archive
of various observations. Resources that have structured metadata tables
(like most VO services) or are tabular in nature (like usual
astronomical catalogues) should use \xmlel{vs:CatalogResource}.
The type is derived from \xmlel{vr:Service}, which means that instances
can have
capabilities. For \xmlel{vs:DataResource}, only auxiliary capabilities
(e.g., DataServices serving multiple DataResources) or plain capabilities
with \xmlel{vr:WebBrowser} interfaces should be given. Resources with a
primary access mode dedicated to the resource's data content should use
\xmlel{vs:DataService}-typed resources.
In addition to \xmlel{vr:Service}'s content, DataResource adds
elements for declaring the observing facilities and/or instruments used
to obtain the data, and it supports the declaration of
the physical coverage of data via the \xmlel{coverage}
element.
% GENERATED: !schemadoc VODataService-v1.3.xsd DataResource
\begin{generated}
\begingroup
\renewcommand*\descriptionlabel[1]{%
\hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vs:DataResource} Type Schema Documentation}
\noindent{\small
A resource publishing astronomical data.
\par}
\noindent{\small
This resource type should only be used if the resource has no
common underlying tabular schema (e.g., an inhomogeneous archive).
Use CatalogResource otherwise.
\par}
\vspace{1ex}\noindent\textbf{\xmlel{vs:DataResource} Type Schema Definition}
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<xs:complexType name="DataResource" >
<xs:complexContent >
<xs:extension base="vr:Service" >
<xs:sequence >
<xs:element name="facility" type="vr:ResourceName" minOccurs="0"
maxOccurs="unbounded" />
<xs:element name="instrument" type="vr:ResourceName" minOccurs="0"
maxOccurs="unbounded" />
<xs:element name="coverage" type="vs:Coverage" minOccurs="0" />
<xs:element name="productTypeServed" type="xs:token" minOccurs="0"
maxOccurs="unbounded" />
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
\end{lstlisting}
\vspace{0.5ex}\noindent\textbf{\xmlel{vs:DataResource} Extension Metadata Elements}
\begingroup\small\begin{bigdescription}\item[Element \xmlel{facility}]
\begin{description}
\item[Type] string with ID attribute: vr:ResourceName
\item[Meaning]
The observatory or facility used to collect the data
contained or managed by this resource.
\item[Occurrence] optional; multiple occurrences allowed.
\end{description}
\item[Element \xmlel{instrument}]
\begin{description}
\item[Type] string with ID attribute: vr:ResourceName
\item[Meaning]
The instrument used to collect the data contained or
managed by a resource.
\item[Occurrence] optional; multiple occurrences allowed.
\end{description}
\item[Element \xmlel{coverage}]
\begin{description}
\item[Type] composite: \xmlel{vs:Coverage}
\item[Meaning]
Extent of the content of the resource over space, time,
and frequency.
\item[Occurrence] optional
\end{description}
\item[Element \xmlel{productTypeServed}]
\begin{description}
\item[Type] string: \xmlel{xs:token}
\item[Meaning]
Collections of data products (e.g., images or spectra)
or services serving them define the type of products
they contain or serve here. This information is intended
to enable global product discovery. Hence, services
failing to define their product type(s) may be skipped
by clients in global discovery. Product type names must
be taken from the IVOA vocabulary
http://www.ivoa.net/rdf/product-type.
\item[Occurrence] optional; multiple occurrences allowed.
\end{description}
\end{bigdescription}\endgroup
\endgroup
\end{generated}
% /GENERATED
\subsubsection{DataService}
The \xmlel{vs:DataService} resource type has the same content model as
\xmlel{vs:DataResource}. It should be used instead of
\xmlel{vs:DataResource} when the resource's capabilities give
access to (essentially) only the resource's data, as for instance
``ftp service for the XY instrument'', or for a service giving access
to multiple resources; in this latter case, the resource-level
metadata should pertain to the service itself rather than any specific
data collection served by it.
As with \xmlel{vs:CatalogResource}, instances SHOULD
declare the metadata of the table(s) underlying the service or
delivered by it in a \xmlel{tableset} element.
Whenever the service operates on clearly definable tabular
data, the resource should use the \xmlel{vs:CatalogService} resource type
in preference to \xmlel{vs:DataService}, and that tabular structure
should be given in a table set.
DataService's content model is exactly the one of
\xmlel{vs:DataResource}; please refer to Sect.~\ref{sect:DataResource}
for the motivation of this duplication.
\subsubsection{CatalogResource}
\label{sect:CatalogResource}
The \xmlel{vs:CatalogResource} resource type is used to describe a
resource containing astronomical data or metadata in a set of one or
more tables. It extends \xmlel{vs:DataResource} and thus has metadata
on coverage as well as the facilities and instruments that produced the
resource. Additionally, \xmlel{vs:CatalogResource} instances SHOULD
declare the metadata of the table(s) making up the data in a
\xmlel{tableset} element.
As with \xmlel{vs:DataResource}, this type should only be used if all
capabilities declared in the resource are either auxiliary or
nonstandard. This is typically the case for catalogues or data
collections within larger TAP, ObsTAP, or perhaps SIAP services. When
a service only publishes a single resource, use the
\xmlel{vs:CatalogService} type.
% GENERATED: !schemadoc VODataService-v1.3.xsd CatalogResource
\begin{generated}
\begingroup
\renewcommand*\descriptionlabel[1]{%
\hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vs:CatalogResource} Type Schema Documentation}
\noindent{\small
A resource giving astronomical data in tabular form.
\par}
\noindent{\small
While this includes classical astronomical catalogues,
this resource is also appropriate for collections of observations
or simulation results provided their metadata are available
in a sufficiently structured form (e.g., Obscore, SSAP, etc).
\par}
\vspace{1ex}\noindent\textbf{\xmlel{vs:CatalogResource} Type Schema Definition}
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<xs:complexType name="CatalogResource" >
<xs:complexContent >
<xs:extension base="vs:DataResource" >
<xs:sequence >
<xs:element name="tableset" type="vs:TableSet" minOccurs="0" >
<xs:unique name="CatalogService-schemaName" >
<xs:selector xpath="schema" />
<xs:field xpath="name" />
</xs:unique>
<xs:unique name="CatalogService-tableName" >
<xs:selector xpath="schema/table" />
<xs:field xpath="name" />
</xs:unique>
</xs:element>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
\end{lstlisting}
\vspace{0.5ex}\noindent\textbf{\xmlel{vs:CatalogResource} Extension Metadata Elements}
\begingroup\small\begin{bigdescription}\item[Element \xmlel{tableset}]
\begin{description}
\item[Type] composite: \xmlel{vs:TableSet}
\item[Meaning]
A description of the tables that are accessible
through this service.
\item[Occurrence] optional
\item[Comment]
Each schema name must be unique within a tableset.
\end{description}
\end{bigdescription}\endgroup
\endgroup
\end{generated}
% /GENERATED
\subsubsection{CatalogService}
The \xmlel{vs:CatalogService} resource type is used to describe a
service publishing astronomical data or metadata based on tabular
representations. Its relationship to \xmlel{vs:CatalogResource}
is as between \xmlel{vs:DataResource}
and \xmlel{vs:DataService}: Use \xmlel{vs:CatalogService} when either
the resource's capabilities are exclusive to the resource described by
the resource-level metadata (``SSAP service for the XY instrument'') or
if the service publishes multiple other CatalogResources; in that latter
case, again, the resource-level metadata should not refer to any
specific data collection contained.
This is the type that should be used for one-resource VO
services.
CatalogService's content model is exactly the one of
\xmlel{vs:CatalogResource}; please refer to Sect.~\ref{sect:CatalogResource}
for the motivation of this duplication.
\subsubsection{DataCollection}
\label{sect:datacollection}
The \xmlel{vs:DataCollection} type is deprecated and should no longer be
used in new resource records. It was used in version 1.1 to define
simple collections of data, much like \xmlel{vs:CatalogResource} in the
current version. It did not admit capabilities, though, and since the
addition of capabilities would essentially have created a CatalogService
clone with a different child sequence, it was decided to abandon rather
than repair it.
Existing \xmlel{vs:DataCollection}-typed
records should be migrated to records of type \xmlel{vs:DataResource} or
\xmlel{vs:CatalogResource} as appropriate. If \xmlel{accessURL}
children are present in the
\xmlel{vs:DataCollection}\/s,
they can be replaced with a plain capability with a
\xmlel{vr:WebBrowser}-typed interface.
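A minimal sketch of such a replacement capability could look like this;
the URL is a made-up placeholder standing in for whatever the former
\xmlel{accessURL} pointed to:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<capability>
  <!-- plain capability: no standardID, browser-based access only -->
  <interface xsi:type="vr:WebBrowser">
    <accessURL use="full"
      >http://example.org/data/sample/browse</accessURL>
  </interface>
</capability>
\end{lstlisting}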
This type may be removed from the schema when all resource records using
it have vanished from the VO Registry.
\subsubsection{StandardSTC}
The \xmlel{vs:StandardSTC} type is deprecated and should no longer be
used in new resource records. Since the XML serialisation of the STC 1
data model was never promoted to an IVOA recommendation, there also is
no properly standardised way of creating such records. Since no such
records ever existed in the Registry, this type will probably be removed
from the schema in version 1.3 of this specification.
\subsection{Coverage in Space, Time, and Spectrum}
\label{sect:cover}
The \xmlel{vs:Coverage} type summarily describes how the data served is
distributed on the
sky, in energy, and in time.
In addition, there is the \xmlel{waveband} element that originally
contained a qualitative indication of the location of the resource's
coverage on the electromagnetic spectrum. As more resources cover
non-electromagnetic messengers, the element's meaning has somewhat
shifted, and it now admits values from an extendable vocabulary of
messengers\footnote{\url{http://www.ivoa.net/rdf/messenger}} that, for
instance, includes Neutrinos.
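For illustration, and assuming that \emph{Optical} and \emph{Neutrino}
are among the terms of that vocabulary, a resource combining optical and
neutrino data might declare:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- one waveband element per messenger term -->
  <waveband>Optical</waveband>
  <waveband>Neutrino</waveband>
</coverage>
\end{lstlisting}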
Historically, the quantitative footprints were expected to be given in
the element of type \xmlel{stc:STCResourceProfile}\footnote{Defined in
\url{http://www.ivoa.net/xml/STC/stc-v1.30.xsd}}. As discussed in
\citet{note:regstc}, this expectation turned out to be erroneous,
and the underlying standard, STC-X \citep{note:stcx}, never proceeded to become
a recommendation. Hence, this version of VODataService deprecates the
use of \xmlel{STCResourceProfile}.
Instead, resource record authors are strongly encouraged to provide
coverage information in the \xmlel{spatial}, \xmlel{spectral}, and
\xmlel{temporal} children of \xmlel{coverage}.
Spatial coverage is conveyed as a MOC. To enable easy embedding into
resource records written in VOResource (i.e., XML),
VODataService uses the string serialisation defined in section 2.3.2 of
\citet{2019ivoa.spec.1007F}.
By default, the MOCs are to be interpreted in the ICRS. Future
extensions to non-celestial frames (e.g., on planet
surfaces) will use the \xmlel{frame} attribute.
However, the only celestial reference system allowed is
ICRS. If a resource's native coordinates are given for another frame (e.g.,
Galactic or FK4 for some equinox), it is the resource record author's
responsibility to convert the spatial coverage into the ICRS.
An important characteristic of MOCs is the order of the smallest scale
(the ``MOC resolution''). Higher orders yield more faithful
representations of the actual coverage, but also lead to a possibly
significant increase of the size of the serialized MOC. We suggest a
``typical resolution'' of the Registry of about a degree (i.e., MOC
order 6), but resources are free to choose a higher maximum order if
appropriate and the resource record size remains reasonable.
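As an illustration, a spatial coverage consisting of three cells at MOC
order 6 could be declared as sketched below. The cell indices are
arbitrary and do not describe any real data collection:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- order 6, cells 100 through 102, interpreted in the ICRS -->
  <spatial>6/100-102</spatial>
</coverage>
\end{lstlisting}
In this string serialisation, the number before the slash is the MOC
order, and what follows are cell indices or ranges of cell indices at
that order.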
Resources that need to communicate high-resolution spatial coverage,
perhaps for some non-discovery use case, can use the \xmlel{footprint}
element with its \xmlel{ivo-id} attribute set to
\nolinkurl{ivo://ivoa.net/std/moc} to declare a URL giving a
footprint in one of the approved MOC serialisations and of arbitrary
level and size.
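A sketch of such a declaration follows; the footprint URL is a
hypothetical placeholder:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- points to a high-resolution MOC kept outside the record -->
  <footprint ivo-id="ivo://ivoa.net/std/moc"
    >http://example.org/sample/coverage.fits</footprint>
</coverage>
\end{lstlisting}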
Time and spectral coverage are modeled as unions of simple
intervals over real numbers; the serialisation here is a space-separated
pair of floating point numbers as governed by the DALI \emph{interval}
xtype.
Times are given in Barycentric Dynamical Time (TDB) at the solar system
barycenter. They must be specified as Modified Julian Dates. Since
discovery use cases in which high-precision times are required are not
foreseen, resource record authors are encouraged to pad their actual
temporal coverage such that differences in time scales (of the order of
tens of seconds) or reference positions (of the order of minutes between
ground-based observatories and the barycenter) do not matter. In other
words, the temporal resolution of the Registry at this point should be
assumed to be of order 10 minutes.
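To illustrate with made-up numbers, data taken between 2000-01-01 and
2010-01-01 span MJD 51544 to MJD 55197. Padding both ends by 0.01\,d
(about 14 minutes) gives a declaration like the following sketch:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- MJD interval, padded outwards by 0.01 d on both ends -->
  <temporal>51543.99 55197.01</temporal>
</coverage>
\end{lstlisting}
Assuming the schema admits repeated \xmlel{temporal} elements, as the
``union of intervals'' model suggests, disjoint observing periods would
each get an element of their own.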
Deviating from common VO practice (which currently fairly consistently
uses wavelengths of electromagnetic waves in vacuum), spectral limits are
given in Joules of messenger energy. This is intended to allow seamless
integration of non-electromagnetic messengers. The reference position
for the spectral axis is the solar system barycenter. Again, discovery
use cases on a level where the difference between reference frames of
ground-based observatories versus the solar system barycenter matters
are not foreseen, and resource record authors are advised to pad their
intervals on that level.
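As a worked example for photons, wavelengths convert to energies through
\[
E = \frac{hc}{\lambda}, \qquad hc \approx 1.986\times 10^{-25}\,\mathrm{J\,m}.
\]
For a hypothetical optical resource covering 400\,nm to 700\,nm, the
longer wavelength yields the lower energy limit:
\[
E_{\min} = \frac{hc}{700\,\mathrm{nm}} \approx 2.84\times 10^{-19}\,\mathrm{J},
\qquad
E_{\max} = \frac{hc}{400\,\mathrm{nm}} \approx 4.97\times 10^{-19}\,\mathrm{J}.
\]
Rounding outwards so that the declared interval is slightly padded, the
record could then contain:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<coverage>
  <!-- messenger energy in Joules; lower limit first -->
  <spectral>2.8e-19 5.0e-19</spectral>
</coverage>
\end{lstlisting}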
% GENERATED: !schemadoc VODataService-v1.3.xsd Coverage
\begin{generated}
\begingroup
\renewcommand*\descriptionlabel[1]{%
\hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vs:Coverage} Type Schema Documentation}
\noindent{\small
A description of how a resource's contents or behavior maps