Blind data aggregation from distributed, protected sources
Author: Ajayi, Oluwafemi; Sinnott, Richard O.; Stell, Anthony; Young, Alan
Source Title: UK e-Science All Hands Meeting
Publisher: National e-Science Centre, University of Glasgow
Affiliation: Computing and Information Systems
Document Type: Conference Paper
Citations: Ajayi, O., Sinnott, R. O., Stell, A., & Young, A. (2008). Blind data aggregation from distributed, protected sources. In UK e-Science All Hands Meeting, Edinburgh, UK.
Access Status: Open Access
This is a pre-print of a paper from UK e-Science All Hands Meeting 2008. http://www.allhands.org.uk/2008/index.html
Successful e-health research depends on access to and use of a wide range of clinical, biomedical, social, geo-spatial, environmental and other data sets. In large-scale, multi-centre clinical studies that cross geographical and organizational divides, the ability to access, link and aggregate data securely is essential. The e-Science community has developed a wide variety of technologies that support authentication and authorization. However, experience of working with organizations such as the National Health Service (NHS), in projects such as the MRC-funded Virtual Organizations for Trials and Epidemiological Studies (VOTES) project, has shown that, irrespective of these technological advances, data providers are typically unwilling to provide direct access to their data sets, for example by allowing access from HE/FE through the NHS firewall. There are many reasons for this, both pragmatic and technological, which we outline in this paper. Ultimately, data providers and other key stakeholders are acutely aware of confidentiality and ethical concerns over data access and usage. They will release their data only if it can be guaranteed that the data cannot be linked with other data sets in ways that could violate patient confidentiality, for example through statistical disclosure. This paper presents a novel approach, and its implementation, that directly addresses these issues: the Virtual Anonymisation Grid for Unified Access to Remote Clinical Data (Vanguard).
Key features of Vanguard are: support for pull models of interaction with data providers such as the NHS, who therefore need not open their firewalls and expose themselves to the risk of attack; support for secure, anonymous data aggregation; and support for novel forms of data release, in which researchers obtain and use data in a secure, disclosure-free environment where third parties cannot access or use any released data. We demonstrate this through case studies that apply the Vanguard system to clinical scenarios and systems working with the NHS in Scotland.
Keywords: e-health research; data sets; data access; clinical studies; confidentiality; ethics; Vanguard system