11 Sep 2006
NVO Summer School 2006
Managing data in the VO
Matthew J. Graham, CACR/Caltech
THE US NATIONAL VIRTUAL OBSERVATORY
The importance of data
• Data is the raison d’être of the VO
• LSST is the data source nonpareil
  – data rates of 540 MB/s, ~16 TB in 8 hrs
  – final archive > 3 PB of data
VO Wheel™
• Well-established ways of handling distributed data:
  – SRB
  – PVFS
  – OGSA-DAI
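The LSST figures quoted above can be checked with a back-of-envelope calculation. The sketch below assumes decimal units (1 TB = 10^6 MB); the function name is illustrative only.

```python
# Back-of-envelope check of the LSST data-rate figures (decimal units assumed).
RATE_MB_PER_S = 540   # data rate quoted on the slide
NIGHT_HOURS = 8       # observing period quoted on the slide


def nightly_volume_tb(rate_mb_per_s: float, hours: float) -> float:
    """Return the data volume in terabytes, taking 1 TB = 1e6 MB."""
    seconds = hours * 3600
    return rate_mb_per_s * seconds / 1e6


if __name__ == "__main__":
    volume = nightly_volume_tb(RATE_MB_PER_S, NIGHT_HOURS)
    print(f"{volume:.1f} TB in {NIGHT_HOURS} hrs")
```

The result is about 15.6 TB, which matches the rounded ~16 TB figure.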
Requirements
• A distributed storage mechanism that allows easy reference to data without concerns about physical location.
• Primary use cases:
  – User wants to easily publish and share own data
  – Data need to reside close to computation nodes
• Data use cases:
  – Client has data:
    • stored locally: transfers it to service
    • stored locally: service retrieves it
    • stored elsewhere: service retrieves it
  – Service generates data:
    • stores it locally: notifies client of location
    • transfers it to the client’s local store
    • transfers it to a client-designated store
Logical architecture
• User view
• Logical namespace
• Physical storage
VOSpace
• Provides a uniform interface to existing or new data storage locations (Facade pattern)
• Structured and unstructured data are both first level
• A peer network of VOSpace servers
Data structures - I
• Each data object is represented as a node:
    <node/>
• Nodes are identified by a vos://[service]/[name] identifier:
    <node uri="vos://nvo.caltech!vospace/mydata1"/>
  – Why not ivo://nvo.caltech/vospace/mydata1?
  – RFC 2396 - hierarchy
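A minimal sketch of how a client might split a vos://[service]/[name] identifier into its two parts. The helper names are hypothetical, and the mapping of "!" in the service part to "/" in the corresponding ivo:// identifier is an assumption inferred from the two URIs compared above, not something the slides state.

```python
from typing import NamedTuple


class VosUri(NamedTuple):
    service: str  # e.g. "nvo.caltech!vospace"
    name: str     # e.g. "mydata1"


def parse_vos_uri(uri: str) -> VosUri:
    """Split a vos://[service]/[name] identifier into service and name."""
    prefix = "vos://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a vos:// identifier: {uri!r}")
    # Everything up to the first "/" is the service; the rest is the name.
    service, sep, name = uri[len(prefix):].partition("/")
    if not service or not sep or not name:
        raise ValueError(f"expected vos://[service]/[name], got {uri!r}")
    return VosUri(service, name)


def service_ivo_id(vos: VosUri) -> str:
    """Assumed mapping: '!' in the service part stands in for '/'."""
    return "ivo://" + vos.service.replace("!", "/")


if __name__ == "__main__":
    parsed = parse_vos_uri("vos://nvo.caltech!vospace/mydata1")
    print(parsed, service_ivo_id(parsed))
```

Keeping the service part free of "/" means the first "/" unambiguously separates the service from the node name.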
Data structures - II
• Each node contains a map of key:value properties:
    <node uri="vos://nvo.caltech!vospace/mydata1">
      <properties>
        <property uri="ivo://net.ivoa.vospace/properties/create.date">2006-09-11T13:35:51Z</property>
      </properties>
    </node>
• There are currently four types of node:
    <node/>
    <node xsi:type="vos:DataNode"/>
    <node xsi:type="vos:UnstructuredDataNode"/>
    <node xsi:type="vos:StructuredDataNode"/>
[Figure: node types Node, DataNode, UnstructuredDataNode, StructuredDataNode; callout: readonly="true"]
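As an illustration of the key:value property map, the sketch below reads the properties of a node into a Python dictionary using the standard library. It assumes the un-namespaced element names shown on the slide; a real service response would probably carry XML namespaces, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Node document as shown on the slide (no namespaces assumed).
NODE_XML = """
<node uri="vos://nvo.caltech!vospace/mydata1">
  <properties>
    <property uri="ivo://net.ivoa.vospace/properties/create.date">2006-09-11T13:35:51Z</property>
  </properties>
</node>
"""


def node_properties(xml_text: str) -> Dict[str, str]:
    """Return the node's properties as a {property URI: value} map."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.get("uri"): (prop.text or "").strip()
        for prop in root.findall("./properties/property")
    }


if __name__ == "__main__":
    for key, value in node_properties(NODE_XML).items():
        print(key, "=", value)
```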
Data structures - III
• Data nodes contain a list of data views (formats) that the node can accept and provide:
    <node xsi:type="vos:UnstructuredDataNode" uri="vos://nvo.caltech!vospace/mydata1">
      …
      <views>
        <accepts>
          <view uri="ivo://net.ivoa.vospace/views/any"/>
        </accepts>
        <provides>
          <view uri="ivo://net.ivoa.vospace/views/votable-1.1"/>
        </provides>
      </views>
    </node>
Data structures - IV
    <node xsi:type="vos:StructuredDataNode" uri="vos://nvo.caltech!vospace/mydata1">
      …
      <views>
        <accepts>
          <view uri="ivo://net.ivoa.vospace/views/votable-1.1"/>
        </accepts>
        <provides>
          <view uri="ivo://net.ivoa.vospace/views/votable-1.1" original="true"/>
          <view uri="ivo://net.ivoa.vospace/views/votable-1.0"/>
        </provides>
      </views>
    </node>
  – Why not use MIME type?
    • Easier to define new astronomy-specific data types
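One way a client could use the views list is to pick a format the node provides, or to find the view flagged as the original. The sketch below works on the views fragment alone, with un-namespaced element names as on the slide. The function names and the "views/fits" URI in the usage lines are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# The <views> fragment of the StructuredDataNode example.
VIEWS_XML = """
<views>
  <accepts>
    <view uri="ivo://net.ivoa.vospace/views/votable-1.1"/>
  </accepts>
  <provides>
    <view uri="ivo://net.ivoa.vospace/views/votable-1.1" original="true"/>
    <view uri="ivo://net.ivoa.vospace/views/votable-1.0"/>
  </provides>
</views>
"""


def choose_view(views_xml: str, wanted: List[str]) -> Optional[str]:
    """Return the first URI in `wanted` (client preference order) that is provided."""
    root = ET.fromstring(views_xml.strip())
    provided = {view.get("uri") for view in root.findall("./provides/view")}
    for uri in wanted:
        if uri in provided:
            return uri
    return None


def original_view(views_xml: str) -> Optional[str]:
    """Return the URI of the provided view marked original="true", if any."""
    root = ET.fromstring(views_xml.strip())
    for view in root.findall("./provides/view"):
        if view.get("original") == "true":
            return view.get("uri")
    return None


if __name__ == "__main__":
    prefs = [
        "ivo://net.ivoa.vospace/views/fits",         # hypothetical, not provided
        "ivo://net.ivoa.vospace/views/votable-1.0",  # provided by the node
    ]
    print(choose_view(VIEWS_XML, prefs))
    print(original_view(VIEWS_XML))
```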
Data structures - V
• Data transfers are represented by transfers:
    <transfer/>
• The format of the data transfer is specified by a view:
    <transfer>
      <view uri="ivo://net.ivoa.vospace/views/votable-1.1"/>
    </transfer>
• The protocol of the data transfer is specified by a protocol:
    <transfer>
      …
      <protocols>
        <protocol uri="ivo://net.ivoa.vospace/protocols/http-get">
          <endpoint>http://192.168.1.33:7007/vospace</endpoint>
        </protocol>
      </protocols>
    </transfer>
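A transfer document of this shape can be assembled programmatically. The sketch below is a minimal illustration that assumes un-namespaced element names and that the endpoint is carried as the text of an <endpoint> element. The builder function is hypothetical and not part of any VOSpace client library.

```python
import xml.etree.ElementTree as ET


def build_transfer(view_uri: str, protocol_uri: str, endpoint: str) -> ET.Element:
    """Build a <transfer> element with one view and one protocol endpoint."""
    transfer = ET.Element("transfer")
    # The view states the data format of the transfer.
    ET.SubElement(transfer, "view", uri=view_uri)
    # The protocol states how the bytes move and where to connect.
    protocols = ET.SubElement(transfer, "protocols")
    protocol = ET.SubElement(protocols, "protocol", uri=protocol_uri)
    ET.SubElement(protocol, "endpoint").text = endpoint
    return transfer


if __name__ == "__main__":
    element = build_transfer(
        "ivo://net.ivoa.vospace/views/votable-1.1",
        "ivo://net.ivoa.vospace/protocols/http-get",
        "http://192.168.1.33:7007/vospace",
    )
    print(ET.tostring(element, encoding="unicode"))
```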
Data structures - VI
• The space has a list of which protocols the service can accept to fetch data and what protocol endpoints it provides:
    <protocols>
      <accepts>
        <protocol uri="ivo://net.ivoa.vospace/protocols/ftp-get"/>
        <protocol uri="ivo://net.ivoa.vospace/protocols/ftp-put"/>
        <protocol uri="ivo://net.ivoa.vospace/protocols/http-get"/>
        <protocol uri="ivo://net.ivoa.vospace/protocols/http-put"/>
      </accepts>
      <provides>
        <protocol uri="ivo://net.ivoa.vospace/protocols/http-get"/>
        <protocol uri="ivo://net.ivoa.vospace/protocols/http-get"/>
      </provides>
    </protocols>
• Why not use protocol schemes?
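Given such a list, a client can work out which of its own protocols the service will accept. The sketch below is a simple intersection that preserves the client's preference order. The function name is hypothetical, and the accepted list is copied from the example above.

```python
from typing import Iterable, List

# Protocols the example service accepts, as listed on the slide.
SERVICE_ACCEPTS = [
    "ivo://net.ivoa.vospace/protocols/ftp-get",
    "ivo://net.ivoa.vospace/protocols/ftp-put",
    "ivo://net.ivoa.vospace/protocols/http-get",
    "ivo://net.ivoa.vospace/protocols/http-put",
]


def negotiate(client_prefs: Iterable[str], service_accepts: Iterable[str]) -> List[str]:
    """Return the client's protocols that the service accepts, in client order."""
    accepted = set(service_accepts)
    return [uri for uri in client_prefs if uri in accepted]


if __name__ == "__main__":
    mine = [
        "ivo://net.ivoa.vospace/protocols/http-put",
        "ivo://net.ivoa.vospace/protocols/ftp-put",
    ]
    print(negotiate(mine, SERVICE_ACCEPTS))
```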
Operations - I
• Service metadata:
  – getProtocols(): <protocols>
  – getViews(): <accepts>, <provides>
  – getProperties(): <accepts>, <provides>, <contains>
• Creating and manipulating nodes
  – createNode(<node>): <node>
  – deleteNode(uri): -
  – listNodes(token, limit, detail, <nodes>): token, limit, <nodes>
  – moveNode(uri, <node>): <node>
  – copyNode(uri, <node>): <node>
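The token and limit arguments of listNodes suggest a paged listing. The sketch below shows that pattern against an in-memory list rather than a real service. Treating the token as a stringified offset is purely an assumption for illustration, and the detail and <nodes> arguments are left out.

```python
from typing import List, Optional, Tuple


def list_nodes(all_uris: List[str], token: Optional[str], limit: int
               ) -> Tuple[Optional[str], List[str]]:
    """Return (next_token, page); next_token is None once the listing is done."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    start = int(token) if token else 0  # assumed token format: an offset
    page = all_uris[start:start + limit]
    next_start = start + len(page)
    next_token = str(next_start) if next_start < len(all_uris) else None
    return next_token, page


def list_all(all_uris: List[str], limit: int) -> List[str]:
    """Follow continuation tokens until the whole listing has been read."""
    token: Optional[str] = None
    collected: List[str] = []
    while True:
        token, page = list_nodes(all_uris, token, limit)
        collected.extend(page)
        if token is None:
            return collected


if __name__ == "__main__":
    uris = [f"vos://nvo.caltech!vospace/mydata{i}" for i in range(1, 6)]
    print(list_nodes(uris, None, 2))
    print(len(list_all(uris, 2)))
```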
Operations - II
• Manipulating node metadata
  – getNode(uri): <node>
  – setNode(<node>): <node>
• Transferring data
  – pushToVoSpace(<node>, <transfer>): <node>, <transfer>
  – pullToVoSpace(<node>, <transfer>): <node>
  – pushFromVoSpace(uri, <transfer>): -
  – pullFromVoSpace(uri, <transfer>): <transfer>
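The four transfer operations can be read as the combinations of two choices: whether data moves into or out of the space, and which side opens the data connection. The table below encodes that reading. It is an interpretation based on the operation names and return values, and the class and helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TransferOp:
    direction: str  # "to" = data enters the space, "from" = data leaves it
    initiator: str  # side that opens the data connection: "client" or "service"


# Assumed reading: "push" means the data holder sends, "pull" means the receiver fetches.
TRANSFER_OPS: Dict[str, TransferOp] = {
    "pushToVoSpace": TransferOp("to", "client"),
    "pullToVoSpace": TransferOp("to", "service"),
    "pushFromVoSpace": TransferOp("from", "service"),
    "pullFromVoSpace": TransferOp("from", "client"),
}


def pick_operation(direction: str, initiator: str) -> str:
    """Return the operation name matching a direction and an initiator."""
    wanted = TransferOp(direction, initiator)
    for name, op in TRANSFER_OPS.items():
        if op == wanted:
            return name
    raise ValueError(f"no operation for {direction!r}/{initiator!r}")


if __name__ == "__main__":
    # Client holds data locally and the service should retrieve it.
    print(pick_operation("to", "service"))
```

Under this reading, the two client-initiated operations are the ones that return a <transfer>, since the client needs the endpoint details to connect.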
Authentication and authorization
• WS-Security
• Access policies:
  – No access control
  – No authorization but authentication
  – Clients may not create or change nodes
  – Nodes are considered to be owned by the user who created them.
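The four access policies can be sketched as a single check on whether a caller may create or change a node. This is one possible reading of the list above, not the service's actual authorization logic; the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    OPEN = "no access control"
    AUTHENTICATED = "authentication but no authorization"
    READ_ONLY = "clients may not create or change nodes"
    OWNER = "nodes are owned by the user who created them"


def may_modify(policy: Policy, user: Optional[str], owner: Optional[str]) -> bool:
    """Decide whether `user` (None = anonymous) may change a node owned by `owner`."""
    if policy is Policy.OPEN:
        return True
    if policy is Policy.AUTHENTICATED:
        return user is not None
    if policy is Policy.READ_ONLY:
        return False
    if policy is Policy.OWNER:
        return user is not None and user == owner
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    print(may_modify(Policy.OWNER, "alice", "alice"))
    print(may_modify(Policy.OWNER, "bob", "alice"))
```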
Forthcoming attractions
• Containers
• Links
• Asynchronous transfers
• Querying
• Replicas
Federation by links