![Page 1: An OWL based schema for personal data protection policies Giles Hogben Joint Research Centre, European Commission](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649dd05503460f94ac59f8/html5/thumbnails/1.jpg)
An OWL based schema for personal data protection policies
Giles Hogben
Joint Research Centre, European Commission
Overview
• Introduction – what is P3P and the Base Data Schema
• Why do we need a generic data schema for personal data (outside of P3P)?
• Other schemas available
• Modelling the schema in OWL
  – Model
  – Reasoning
  – Validation
• Further work
Intro
• P3P – Platform for Privacy Preferences
• W3C XML standard for expressing web site privacy policies (2001)
• Statements about data practices by data type
• Example of use of data schema:
<STATEMENT>
  <PURPOSE><develop/></PURPOSE>
  <RECIPIENT><ours/></RECIPIENT>
  <RETENTION><indefinitely/></RETENTION>
  <DATA-GROUP>
    <DATA ref="#dynamic.cookies"/>
  </DATA-GROUP>
</STATEMENT>
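As a rough illustration of how a user agent might read such a statement, the sketch below parses the example with Python's standard `xml.etree.ElementTree` module and pulls out the purpose, recipient, retention and data references. The function name `summarize_statement` is hypothetical, and the XML namespace that real P3P policies carry is left out here for brevity.

```python
import xml.etree.ElementTree as ET

# The example statement from the slide (namespace omitted: an assumption).
STATEMENT_XML = """
<STATEMENT>
  <PURPOSE><develop/></PURPOSE>
  <RECIPIENT><ours/></RECIPIENT>
  <RETENTION><indefinitely/></RETENTION>
  <DATA-GROUP>
    <DATA ref="#dynamic.cookies"/>
  </DATA-GROUP>
</STATEMENT>
"""


def summarize_statement(xml_text):
    """Return the practices and data references declared in one STATEMENT."""
    root = ET.fromstring(xml_text)

    def child_tags(name):
        # Each practice is expressed as an empty child element, e.g. <develop/>.
        parent = root.find(name)
        return [child.tag for child in parent] if parent is not None else []

    return {
        "purpose": child_tags("PURPOSE"),
        "recipient": child_tags("RECIPIENT"),
        "retention": child_tags("RETENTION"),
        # Data types are referenced by fragment identifiers into the data schema.
        "data": [d.get("ref") for d in root.iter("DATA")],
    }


summary = summarize_statement(STATEMENT_XML)
print(summary)
```

The `ref` value is the hook into the data schema: everything the rest of the talk discusses concerns what `#dynamic.cookies` and similar identifiers actually denote.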
Requirements
• The P3P data schema works well within P3P 1.0 and 1.1, but it has many uses outside the scope of P3P:
• EPAL (Enterprise Privacy Authorization Language)
• CC/PP
• PRIME
  – Obligations
  – Credential metadata
  – Data-handling
Requirements
  – Reasoning about credential types (e.g. Driver's licence valid => Over 18)
  – Reasoning about data handling (e.g. purpose marketing, opt-out -> Risk of spam)
  – Obligation management: attach obligations to triples without revealing content
  – Automatic form-filling: implies reasoning about data type equivalences between data store, data request and client preferences
  – Identity management and privacy-enhancing access control rules: reasoning about pseudonyms and linkability related to classes of data revealed
Requirements
• Reusable data structures
• Type validation
• Efficient and extensible definition format
• Metadata on types
• Abstraction layer between privacy rules and enterprise data structures
Existing Schema Formats
• P3P1.0 Schema
  – Quirky syntax only understood by 3 people worldwide
  – Semantics understood by 2 people worldwide
  – Customization format understood by 0 people worldwide
  – But all other versions share the same semantics, as they are required by the use cases (reusable, extensible, non-subclassed data structures)
E.g.
<DATA-DEF name="business.contact-info"
    short-description="Contact Information for the Organization"
    structref="#contact">
  <CATEGORIES><physical/><online/></CATEGORIES>
</DATA-DEF>
Existing Schema Formats
• P3P1.1 Schema
  – Uses XML syntax + informal semantics
E.g.
<datatype>
  <dynamic>
    <cookies>
      <CATEGORIES type="preference"/>
    </cookies>
  </dynamic>
</datatype>
Existing Schema Formats
• RDFS Schema for P3P (http://www.w3.org/TR/p3p-rdfschema)
• Models every single class in the class hierarchy
• Models classes of data as properties
  – Difficult to describe instance data
  – Metadata for properties less natural
• Email can be seen as a property, but what is the Dynamic/Cookies property?
OWL Schema
• Models semantics of P3P 1.0 data schema
• Allows reference from RDF -> reasoning
• Allows type validation
• Simplifies syntax, especially the extensibility syntax
BUT
• Modelling P3P semantics exactly requires modal logic, which makes some reasoning awkward
Structure of Existing Schema
[Diagram: node labels include User, Personname, Bdate, Gender, Thirdparty, Cert, Employer, Address, Name, Given and Prefix. Edge types are labelled "Some Values From Only" and "subClass". An "Entity May Collect DataClassX" relation is also shown.]
• A hierarchy of sorts
• but NOT a subclass hierarchy
• Essentially a semantic and syntactic validation scheme
How to model the existing structure
• Formal set theory definition
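$$\forall l \in L,\ (\exists i \in A)(i \in l)$$
$$\forall a \in A,\ \exists l \in L : a \in l$$
[Diagram: User linked to Personname, Bdate, Gender, Thirdparty and Cert]
For A (User) SVFO L (Cert, Personname…), where SVFO stands for "Some Values From Only".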
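One possible reading of the "Some Values From Only" (SVFO) relation can be sketched in Python with classes modelled as plain sets of instances: every class l in L shares at least one member with A, and every member of A belongs to some class l in L. The function name `svfo` and the sample instance names are illustrative assumptions, not part of the schema.

```python
def svfo(A, L):
    """Check both SVFO conditions for a set A against a list of sets L."""
    # "Some values from": each class in L contributes at least one member of A.
    some_values_from = all(any(i in l for i in A) for l in L)
    # "Only": no member of A falls outside every class in L.
    only = all(any(a in l for l in L) for a in A)
    return some_values_from and only


# Hypothetical instances for three of the classes under User.
cert = {"cert-1"}
personname = {"name-1"}
bdate = {"bdate-1"}
classes = [cert, personname, bdate]

user_ok = {"cert-1", "name-1", "bdate-1"}
user_stray = {"cert-1", "name-1", "bdate-1", "shoe-size-1"}
user_missing = {"cert-1", "name-1"}

print(svfo(user_ok, classes))       # True
print(svfo(user_stray, classes))    # False: "shoe-size-1" is in no class of L
print(svfo(user_missing, classes))  # False: no Bdate value present
```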
Shortcut
<owl:Class rdf:ID="A">
<customNS:SVFO rdf:parseType="Collection">
<B/>
<C/>
</customNS:SVFO>
</owl:Class>
Data handling statements and reasoning use case
[Diagram: an Entity is linked by "May Collect" to DataClassX; class nodes User, Name, Given and Prefix are connected by "subClass" edges.]
A service states that it may collect any values from the class User.
A user agent rule says to block transfer to any service which might collect Given name data.
Note the modal predicate "May Collect", which changes the expected logic.
Data handling statements and reasoning use case
The agent needs to deduce:
If a service may collect values from User data, it may also collect values from Name.
Applying the same rule again, if a service may collect values from Name, it may also collect values from Given.
->
If a service may collect values from User, it may collect them from Given.
For a discussion of how this was achieved using Jena and OWL, see the paper.
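The deduction above amounts to propagating "may collect" down the data hierarchy. The following is a minimal Python sketch of that propagation only; it is not the Jena/OWL approach described in the paper, and the `CHILDREN` table and function names are assumptions made for illustration, using the User, Name, Given and Prefix fragment from the slide.

```python
# Fragment of the data hierarchy from the slide (parent -> children).
CHILDREN = {
    "User": ["Name"],
    "Name": ["Given", "Prefix"],
}


def may_collect_closure(declared):
    """Return every data class a service may collect, given what it declares."""
    reachable = set()
    stack = list(declared)
    while stack:
        cls = stack.pop()
        if cls in reachable:
            continue
        reachable.add(cls)
        # "May collect X" is taken to imply "may collect" each child of X.
        stack.extend(CHILDREN.get(cls, []))
    return reachable


def should_block(declared, sensitive):
    """User agent rule: block if the service might collect the sensitive class."""
    return sensitive in may_collect_closure(declared)


print(sorted(may_collect_closure(["User"])))
print(should_block(["User"], "Given"))    # True
print(should_block(["Prefix"], "Given"))  # False
```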
Quickfix: Using shortcut classes
• Use of shortcut/convenience classes:
<owl:Class rdf:ID="User.Name.Given">
  <rdf:type rdf:resource="#Instantiateable"/>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#User"/>
    <owl:Class rdf:about="#Name"/>
    <owl:Class rdf:about="#Given"/>
  </owl:intersectionOf>
</owl:Class>
Advantage: More compact RDF
<prime-PII:hasData>
  <prime-PII:User.Name.Given>
    <rdf:value>Bob</rdf:value>
  </prime-PII:User.Name.Given>
</prime-PII:hasData>
Instead of
<prime-PII:hasData>
  <prime-PII:User>
    <rdf:value>Bob</rdf:value>
    <rdf:type rdf:resource="Name"/>
    <rdf:type rdf:resource="Given"/>
  </prime-PII:User>
</prime-PII:hasData>
(Important for adoption and acceptance by policy authors)
Advantage 2. Makes reasoning use case trivial
• Practical use cases only require matching concrete classes (described by the shortcut classes) with their ancestors in the hierarchy.
• By using shortcut classes in OWL, this is simply achieved, since a standard OWL reasoner concludes:
<owl:Class rdf:ID="User.Name.Given">
  <rdf:type rdf:resource="#Instantiateable"/>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#User"/>
    <owl:Class rdf:about="#Name"/>
    <owl:Class rdf:about="#Given"/>
  </owl:intersectionOf>
</owl:Class>
-> User.Name.Given rdfs:subClassOf User
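The inference relied on here is that an intersection class is a subclass of each of its operands. The Python sketch below stands in for a real OWL reasoner and implements only that single inference; the `INTERSECTIONS` table and the function names are hypothetical, chosen to show how a user agent rule written against an ancestor class would match a concrete shortcut class.

```python
# Shortcut classes and the operands of their owl:intersectionOf definitions.
INTERSECTIONS = {
    "User.Name.Given": ["User", "Name", "Given"],
    "User.Name.Prefix": ["User", "Name", "Prefix"],
}


def inferred_superclasses(cls):
    """An intersection class is a subclass of every one of its operands."""
    return set(INTERSECTIONS.get(cls, []))


def rule_matches(concrete_cls, rule_cls):
    """True if a rule about rule_cls applies to data typed as concrete_cls."""
    return concrete_cls == rule_cls or rule_cls in inferred_superclasses(concrete_cls)


print(rule_matches("User.Name.Given", "User"))    # True
print(rule_matches("User.Name.Given", "Given"))   # True
print(rule_matches("User.Name.Prefix", "Given"))  # False
```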
Validation
• Structure provides some semantic validation through disjoint classes (e.g. City disjoint from Gender – so if something is typed as both city and gender data, it flags an error)
• OWL supports XSD datatyping for syntactic validation (e.g. string and numeric types) and allows customized types defined through regular expressions, such as email addresses
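Both kinds of validation can be mimicked in a few lines of Python. This is only a sketch of the idea, not how an OWL reasoner or XSD validator is implemented: the `DISJOINT` table, the function names and the deliberately simple email pattern are all assumptions made for illustration.

```python
import re

# Pairs of classes declared disjoint (City/Gender is the example from the slide).
DISJOINT = {frozenset({"City", "Gender"})}

# A deliberately loose email pattern, standing in for a regex-restricted datatype.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def semantic_errors(types):
    """Return the disjoint class pairs that one item has been typed with."""
    types = set(types)
    return [tuple(sorted(pair)) for pair in DISJOINT if pair <= types]


def is_valid_email(value):
    """Syntactic check of a value against the assumed email pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


print(semantic_errors(["City", "Gender"]))  # [('City', 'Gender')]
print(semantic_errors(["City"]))            # []
print(is_valid_email("bob@example.org"))    # True
print(is_valid_email("bob-at-example"))     # False
```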
Summary
• We need an ontological model which satisfies the requirements of the P3P 1.0 data schema
• We can use OWL for this
• OWL satisfies (with difficulty) reasoning requirements
• OWL provides validation features not provided by P3P syntax
Further work
• Rethink structure without trying to be backward compatible?
• Multi-language human-readable (HR) strings
• Support for numerical reasoning
  – e.g. not just Drivers' Licence -> Majority age, but ?x has Drivers' Licence -> [?a >= 18 <- ?x has ?a, ?a isA age], so e.g. Drivers' licence => age > 16.
• Other more complex reasoning
  – e.g. ?x collects User.Name.Prefix -> [?x collects User.CivilStatus <- User.Name.Gender = 'female']
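The numerical reasoning example can be sketched as deriving a lower bound on age from the credentials held, then testing whether that bound entails a requested threshold. The table `CREDENTIAL_MIN_AGE`, the credential name `DriversLicence` and the function names are hypothetical; the value 18 is taken from the rule on the slide.

```python
# Minimum age implied by holding a credential (18 is the slide's example value).
CREDENTIAL_MIN_AGE = {"DriversLicence": 18}


def inferred_min_age(credentials):
    """Return the strongest lower bound on age implied by the credentials."""
    bounds = [CREDENTIAL_MIN_AGE[c] for c in credentials if c in CREDENTIAL_MIN_AGE]
    return max(bounds) if bounds else None


def entails_age_over(credentials, threshold):
    """True if the credentials alone prove that age > threshold."""
    min_age = inferred_min_age(credentials)
    return min_age is not None and min_age > threshold


print(entails_age_over(["DriversLicence"], 16))  # True: age >= 18 implies age > 16
print(entails_age_over(["DriversLicence"], 21))  # False: the bound is too weak
print(entails_age_over([], 16))                  # False: nothing is known
```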
That’s all folks