Performance of Secure and Privacy-preserving Execution Model for Data Services

Get Complete Project Material File(s) Now! »

Security enforcement for mediators

As the distribution and sharing of information over the World Wide Web becomes increasingly important, the needs for eﬃcient yet secure access of data naturally arise. Unfortunately, several approaches of mediation that have been proposed do not take into account the security issues and privacy requirements. Moreover, they suppose that the security and privacy issues are only handled locally at each data source level.
Considerable research eﬀorts have been made to handle the issues related to access control in a mediated environment [16, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]. Several proposed approaches aim to secure the mediation protocol mainly aspects of authentication and communication between clients and datasources [16, 52, 53, 54]. For instance [54] aims to preserve the anonymity of clients and confidentiality of data when transmitting data from datasources to clients via a mediator. Authors of [54] present three approaches that allow a mediator to compute a JOIN operation on encrypted relations, based on specific encryption schema (Database-As-Service, commutative en-cryption and homomorphic encryption/private matching). However, there are few works that investigate the aspect related to the security query evaluation. TIHI [56](Trusted Interoperation of Healthcare Information) es-tablishes security by having a single security mediator that applies rules based pre-processing (processing of query) and post-processing (processing of result) data filter-ing to external client requests. The security mediator is the responsible for the security in the mediation system. It insures that unauthorized data does not exit the mediator system. It filters incoming and out-going data from TIHI’s data sources. The security mediator is under the control of the security oﬃcer who may override the security me-diator’s transaction. The security oﬃcer may process the request manually in the case where the security rules are inadequate to process the request.
CHAOS [51](Configurable Heterogeneous Active Object System) tries to address the limitation of TIHI by integrating objects that are dynamically loaded into a mediator. CHAOS incorporates the security policies into the data objects as active nodes to form active objects. In this case the active objects are represented by a special type of XML objects that encompass data elements as well as active elements. An active object contains one active node by mean of a Java class that must be interpreted by a runtime environment. When a query is passed to one of the active objects, its active node is dynamically loaded and executed by the security mediator. CHAOS does not need to rely on a set of primitive security rules. To establish security policies CHAOS provides an API to set the security policies with an active object. The CHAOS model moves the responsibility of security to the source data provider, rather than through a central authority.
Yang et al. [46, 47] proposed an approach based on RBAC [57] by utilizing the mediation system’s Access Control and View Expander components and the mediation spec database. Requests that are sent by the external clients are processed by the Access Control unit that utilizes RBAC authorization. The View Expander enforces proper client access right to insure that only proper database views are made available to the external client [58].

Security of query evaluation

In this section we made an attempt to identify necessary building blocks for securing mediators, mainly the aspect of security query evaluation. Then we present some existing approaches that are based on these blocks and that could be used and/or adapted for the security of mediation system.
Current mediators rely two main entities: (i) a user query and (ii) data sources. For securing mediators we need a new entity that represents the security policy. Secure mediator consists of three building blocks: Data D indicates data sources that contain the answers which users are looking for. Query Q describes the information that users want. It is expressed in the query language of associated mediator. Security policy P represents the access control rules describing the security policy associated to each user. It represents also the privacy policy preferences in case of private data sources.
Note that D, Q and P are independent components, and thus can be located in-dependently and processed separately. Based on these elements we present diﬀerent ways to enforce the security policy, namely, view-based approach, pre-processing, post-processing.

View-based approach

View based approach aims to process security policy P and data D first (see figure 2.3). The idea of view-based enforcement is to create and maintain a view for each user who is authorized to access a specific portion/block of data. The view is generated by using the set of authorizations granted to the user to filter oﬀ the nodes that the user should not access. The view contains exactly the set of data that the user is authorized to access. During runtime, each user can simply run his queries against his view. Although views can be prepared oﬄine. View-based enforcement has two serious limitations: (1) not scalable in managing and maintaining views when there are a large number of roles (or users), (2) high storage cost [59]. The examples of view-based approaches recently proposed for XML documents include [60, 61, 62, 63, 64, 65]. Depending on the details of the algorithms, the views can be maintained either physically or virtually. In the case of XML, view-based approaches identify accessible XML nodes for each user (role) to create a view and evaluate user queries on the view. Such approaches provide fast access to the authorized data, especially when views are materialized, but need to deal with view maintenance issues. Most proposals are actually based on view materialization. In this case, for each user, the base of XML documents is transformed to extract the sub-part called the authorized view which is compliant with the access control policy. The query is then evaluated on the authorized view without modification. Unfortunately, it is generally considered that the view materialization process creates an intolerable overhead with respect to performance.

Pre-processing approach

This approach aims to handle a user query Q and security policy P prior to data D as illustrated in figure 2.4. It consists in rewriting Q based on the corresponding P to construct a secure query Q0 which will be executed directly over D. The returned result of Q0 is conform to the security policy P .
Pre-processing approaches are also known in the literature as query rewriting, query modification and query transformation. An interesting approach for relational DBMS based on query transformation was suggested by Stonebraker [66]. In this case, the query transformation is specified by adding conditions to the WHERE clause of the original query. [66] assumes that a similar mechanism would apply to both select and update operators, which is not generally true (see chapter 4 for more detail). He does not handle the problem of consistency between data selection and update.
An interesting variant to transform SQL queries was suggested by Oracle with VPD [67] (Virtual Private Database) mechanism. In this case, the security policy is specified through the definition of predicates in PL-SQL that will apply as filters to transform the query. The approach suggested by Oracle requires to know PL-SQL in order to implement the access control policy. This may lead to security policies complex to define and maintain. VPD supports the fine-grained access control (FGAC) through policy functions. When a query is issued, it is dynamically modified by appending predicates, returned by the policy function, to the where clause of the query. Moreover, the policy functions are stored and managed locally.
A more recent approach was proposed by Wang et al. [68] where the objective is to securely maximize the answer provided to the user. [68] proposed a formal notion of correctness for fine-grained access control in relational databases. They presented three correctness criteria (sound, secure and maximum) that should be satisfied by any query processing algorithm in order to be “correct”. Soundness : An algorithm is sound if and only if its rewritten query Qrw returns only correct answers i.e. answers to the initial query. Maximality : An algorithm is maximum if and only if Qrw returns as much informa-tion as possible. Security : An algorithm is secure if and only if the result of Qrw respects the security and privacy policy of the queried system. LeFevre et al. [69] presented an approach that enforces limited disclosure expressed by privacy policy in the case of Hippocratic databases. It is implemented by rewriting queries. When a query Q is issued, it is transformed to Q0 so that the result of Q0 respects the cell-level disclosure policy P . Their approach is based on replacing all the cells that are not allowed to be seen by P with N U LL. After that, Q0 is evaluated as a normal query with evaluation rules “N U LL 6= N U LL” and “N U LL 6= cst” for any constant value cst. [69] does not take into account the privacy requirement accuracy. They only replace unauthorized data with “NULL”. Moreover, [69] does not satisfy the sound property and maximum property [68]. Another issue is that they only mask variables (fields) used in the header of the SQL query. They do not handle the qualification portion. For instance, for the following query SELECT name FROM Patients WHERE age=25, they only mask unauthorized name. However, if a data-owner Alice, who is 25 years old, choses to disclose her name but not her age, anyone looking at the results concludes that Alice is 25 years old. One way to correct this problem is to normalize the SQL query by adding all variables used in the where clause to the corresponding header of the query, e.g. the query above is normalized as follows: SELECT name, age FROM Patients WHERE age=25. Then we rewrite the normalized query and applies filter after transformation so that Alice will not appear in the result.
In the case of XML documents, more recent proposals suggest using query trans-formation, see for instance [70] that shows how to transform XPath queries and QFil-ter [71] which is based on query rewriting. [71] is based on non-deterministic finite au-tomata (NFA) and rewrites user’s queries such that parts violating access control rules are pre-pruned. However, there is a main diﬀerence compared to RDF and SPARQL: XML documents correspond to oriented graphs. As noticed in [65], this may lead to complication to protect some relationships in an XML document. This issue has been addressed using two diﬀerent approaches: In [70], protection of XML relationships is embedded in document transformation whereas [65] suggests specifying access control policies using the concept of blocks in order to break some relationships that must be protected.

Table of contents :

1 Introduction
1.1 Motivation and background
1.2 Contributions
1.3 Outline of the dissertation
2 Preliminaries and State of the Art
2.1 Background
2.1.1 RDF
2.1.2 OWL
2.1.3 SPARQL
2.1.4 SPARQL/Update
2.2 Semantic Mediation
2.3 Security enforcement for mediators
2.4 Security of query evaluation
2.4.1 View-based approach
2.4.2 Pre-processing approach
2.4.3 Post-processing approach
3 SPARQL Select Query Rewriting to Enforce Data Confidentiality
3.1 Introduction
3.2 Rewriting SPARQL Query: Basic Principles
3.3 Notations, Definitions and Theorems
3.4 Security policy
3.4.1 Permission
3.4.2 Prohibition
3.5 fQuery: Our query rewriting model
3.5.1 Case of simple condition !
3.5.2 Case of involved condition !
3.5.3 Case of complex condition !
3.5.4 Composition of simple and involved conditions
3.6 Conclusion and Contribution
4 Rewriting of SPARQL/Update Queries for Securing Data access
4.1 Introduction
4.2 Motivating example
4.3 Principle of our approach
4.3.1 Update access control
4.3.2 Consistency between consultation and modification
4.4 Conclusion and Contribution
5 SPARQL Query Rewriting Instrumented by an Access Control Model
5.1 Introduction
5.2 The OrBAC model
5.2.1 Basic predicates
5.2.2 Role, activity and view definition
5.2.3 Context definition
5.2.4 Hierarchy and inheritance
5.3 Principle of the approach
5.4 Modelling RDF Condition within OrBAC
5.4.1 RDF condition as context
5.4.2 RDF condition as view
5.4.3 RDF condition as view and context
5.5 Rewriting Query Instrumented by OrBAC rules
5.6 Conclusion
6 Privacy policy preferences enforced by SPARQL Query Rewriting
6.1 Introduction
6.2 Approach principle
6.3 Privacy-aware Ontology
6.4 The correctness criteria
6.5 Rewriting Algorithm principle
6.5.1 Normalization of triple patterns
6.5.2 Preferences acquisition
6.5.3 Preferences enforcement
6.5.4 SPARQL query without filter
6.5.5 SPARQL query with filters
6.6 Conclusion
7 Privacy query rewriting algorithm instrumented by a privacy-aware access control model
7.1 Introduction
7.2 The privacy-aware OrBAC model (PrivOrBAC)
7.2.1 Consent
7.2.2 Purpose
7.2.3 Accuracy
7.3 Our approach: PrivOrBAC query rewriting algorithm
7.3.1 PrivOrBAC services
7.3.2 PrivOrBAC SPARQL Service
7.4 Conclusion
8 Secure and Privacy-preserving Execution Model for Data Services
8.1 Introduction
8.1.1 Motivating Scenario
8.1.2 Challenges
8.1.3 Contributions
8.2 A Secure and Privacy-Preserving Execution Model for Data Services
8.2.1 Model Overview
8.2.2 Semantic models for data services and policies
8.2.3 RDF views rewriting to integrate security and privacy constraints
8.2.4 Rewriting the extended view in terms of data services
8.2.5 Enforcing security and privacy constraints
8.3 Conclusion and Perspectives
9 Architectures and Implementations
9.1 Implementation of fQuery
9.1.1 MotOrBAC tool
9.1.2 Implementation of fQuery-AC
9.1.3 Implementation of fQuery-Privacy
9.2 Performance of fQuery-Privacy instrumented by PrivOrBAC
9.2.1 Architecture
9.2.2 Use case
9.2.3 Experimental results
9.3 Performance of Secure and Privacy-preserving Execution Model for Data Services
9.3.1 Implementation
9.3.2 Evaluation
9.4 Use cases
9.4.1 AGGREGO Server
9.4.2 PAIRSE
10 Conclusion and perspectives
A fQuery: réécriture de requêtes SPARQL
A.1 Introduction
A.2 Généralités
A.3 fQuery-AC: Principe de base
A.4 fQuery-Privacy: Principe de base
A.5 Le cas des services de données
A.6 Implémentation
B Proof of theorem 3 chapter 3
C fQuery-AC Aspect
D fQuery-AC Visitor
List of Publications
Bibliography