Presentation at the Workshop on Data Management held at King Saud University, Riyadh, Saudi Arabia, on April 30, 2013, by
Professor Omar Hasan Kasule Sr., MB ChB (MUK), MPH (Harvard), DrPH (Harvard),
Chairman, Institutional Review Board and Department of Bioethics, King Fahad Medical City, Riyadh.
Email: omarkasule@yahoo.com
INTRODUCTION
· Data includes words, numbers, images, and voice, most often in electronic form.
· Modern information technology, which handles large and multiple datasets, has spawned new ethical issues that researchers dealing with a single research database in one institution did not face. These issues are data costs, data ownership, data confidentiality, and patient safety based on data validity.
· These ethical issues arise at the stages of data sourcing / collection, data editing, and data storage and retrieval.
· They also arise in the processes of data sharing and data integration.
USES OF EXISTING DATA SOURCES
· Use of existing data to analyze policy[6]
· Use of existing data for research on specific conditions: heart failure[7], diabetes[8][9][10], HIV[11], asthma[12], knee arthroplasty[13]
· Genome data interpretation using internet and public databases[14]
QUALITY ISSUES OF EXISTING DATA SOURCES
· Assessing quality of existing data sources[17]
· Assessing coverage and completeness of existing data[18]
· Assessing data sources on occupational fatality[19]
· Comprehensiveness of data sources[20]
· Under-registration of deaths in Thailand[21]
ETHICS OF USING EXISTING DATA SOURCES
· Operational data generated by hospitals, health insurance companies, and administrative units is not collected with the care needed to ensure research-quality standards (accuracy, coverage). The defects of these databases can be overcome by instituting quality control programs and using multiple sources for cross validation. Is it ethical to have low quality routine operational data that cannot be used for research or valid policy analysis?
· Operational data lies unused in data banks while researchers apply for and get grants to collect new data for their research purposes. Is it ethical to collect new data when we have a lot of routinely collected data?
· Ethical issues in genetic data collection from critically ill patients[22]
· Using human samples for environmental bio-monitoring[23]
· Data ownership and responsible research[24]. Which researcher has the right to use data in patient charts?
· Trade in data: can we buy data from subjects or from collectors? Can we sell data for research? Sell/trade/purchase?
· Data ownership: who owns data from public publications, data in patient charts (epidemiological and clinical analysis), and case reports?
· The data can be used for research by permission of the legal owner; laws and regulations are not yet clear on this issue because potential owners include the patients, the physicians, and the institutions.
· The issue of ownership leads to another question: whether routinely collected data can be sold to researchers. A corollary is whether researchers can sell data to, or buy data from, other researchers or commercial marketing and advertising agencies.
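The cross validation of operational data against a second source, mentioned in this section as a remedy for quality defects, can be sketched as follows. This is a minimal illustration only, assuming two sources keyed by a shared record identifier; the function name, field names, and sample records are hypothetical and do not come from any real system.

```python
def cross_validate(source_a, source_b, fields):
    """Compare two record sets keyed by a shared record ID.

    Reports coverage gaps (records present in only one source) and
    field-level disagreements among the records both sources hold.
    """
    ids_a, ids_b = set(source_a), set(source_b)
    shared = ids_a & ids_b
    mismatches = {}
    for rec_id in sorted(shared):
        differing = [
            f for f in fields
            if source_a[rec_id].get(f) != source_b[rec_id].get(f)
        ]
        if differing:
            mismatches[rec_id] = differing
    agreement = (len(shared) - len(mismatches)) / len(shared) if shared else None
    return {
        "only_in_a": sorted(ids_a - ids_b),   # coverage gap in source B
        "only_in_b": sorted(ids_b - ids_a),   # coverage gap in source A
        "mismatches": mismatches,             # accuracy disagreements
        "agreement_rate": agreement,
    }


# Hypothetical sample data: hospital records versus insurance claims.
hospital = {
    "P1": {"sex": "F", "diagnosis": "I50"},
    "P2": {"sex": "M", "diagnosis": "E11"},
    "P3": {"sex": "F", "diagnosis": "J45"},
}
insurer = {
    "P1": {"sex": "F", "diagnosis": "I50"},
    "P2": {"sex": "M", "diagnosis": "E10"},
    "P4": {"sex": "M", "diagnosis": "B20"},
}

report = cross_validate(hospital, insurer, ["sex", "diagnosis"])
print(report)
```

A disagreement flagged this way shows only that the sources differ, not which one is correct; resolving it still requires checking the original chart.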
ACCESS TO MORE DATA BY DATA INTEGRATION AND DATA SHARING 1
· Data integration and data sharing, facilitated by modern information technology, enable access to more data in other institutions for analysis and standard setting.
· Information technology provides the algorithms for fast integration and sharing.
· Both integration and sharing are an ethical imperative to advance knowledge that benefits patients.
· Integration and sharing have been used mostly in genomic sequencing, nuclear mapping, imaging, clinical trials, and organ transplantation research.
DATA INTEGRATION
· Three levels of integration (within species and interspecies) in plant science to gain different understanding[25]
· Comparing and integrating data from various sources[26]
· Algorithm for fast integration of data from various sources[27]
· Integration of distributed data sources[28]
· Integration of data from various sources on adverse effects[29]
· Interoperability and data integration[30]
· Methodology[31]
DATA SHARING 1
· Benefits of data sharing: making data sharing count[32], making data sharing work[33], sharing online data among people with epilepsy[34], medical research data sharing: the public good[35]
· Processes of data sharing: data sharing not simple[36], different practices of data sharing among scientists[37], bottlenecks in data sharing[38], data standardization[39]
· Technology of data sharing: electronic data sharing[40], platform for data sharing[41], cryptographic approach to data sharing[42], informatics to enable data sharing[43], secure data sharing portals[44]
· Policies and regulations: need for data sharing policy[45], institute-wide data sharing policy[46]
· Examples of data shared: genetic[47], nuclear mapping[48], mass spectrometry[49], neuroimaging[50], neuroimaging data[51], health care[52], prevention of alcohol-related violence[53], data sharing on lung injury[54], sharing of clinical trials data[55], data sharing with e-health[56], analysis of organ sharing data[57], sharing autism research data[58]
· Analysis: post liver transplant obesity based on analysis of shared data[59], analysis based on data sharing[60]
ETHICAL ISSUES IN DATA SHARING 1
· Data sharing: ethical and practical issues[61]
· The main issues are intellectual property, informed consent, privacy, and confidentiality. Codes, standards, policies and mapping at local and international levels are being developed to address these issues.
· Owners of data collected at great expense are reluctant to share with others without proper acknowledgement of intellectual property.
· Informed consent from patients is needed for data sharing unless the data are fully anonymised.
· Data privacy and confidentiality are assured by use of secure data portals and cryptography.
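One way cryptography can support confidentiality before data leave an institution is keyed hashing of identifiers. The sketch below is an assumption-laden illustration, not a description of any portal cited here: the field names, the `pseudonymise` helper, and the key handling are all hypothetical. Note that this produces pseudonymised rather than fully anonymised data, because whoever holds the key can re-link records, so it does not by itself remove the need for consent.

```python
import hashlib
import hmac

# Hypothetical names of fields that directly identify a patient.
DIRECT_IDENTIFIERS = ("name", "national_id", "phone")


def pseudonymise(record, secret_key, id_field="national_id"):
    """Return a copy of the record with direct identifiers removed and a
    keyed-hash token (HMAC-SHA256) added in their place.

    The same patient always maps to the same token under the same key, so
    records can still be linked across datasets by the key holder.
    """
    token = hmac.new(
        secret_key,
        str(record[id_field]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    shared = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    shared["pseudo_id"] = token
    return shared


# Hypothetical record; in practice the key would be stored securely by the
# data custodian and never distributed with the shared dataset.
record = {
    "name": "A. Patient",
    "national_id": "1234567890",
    "phone": "0500000000",
    "diagnosis": "E11",
    "age": 54,
}
shared_record = pseudonymise(record, secret_key=b"custodian-held-secret")
print(shared_record)
```

Remaining fields such as age and diagnosis can still identify a person when combined, so removing direct identifiers is a first step and not a guarantee of anonymity.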
ETHICAL ISSUES IN DATA SHARING 2
· Sharing HIV data: confidentiality and acceptability[62]
· Genetic data: benefits vs risks[63], code for sharing international genetic data[64], consent for data sharing in genomic research[65], risks in sharing aggregate genetic data[66]
· Intellectual property in data sharing[67]
· Data sharing vs protection of privacy[68], privacy compliance in data sharing[69], preserving privacy with data sharing[70]
· Misuse of data for dual use[74]
DATA PROCESSING
· Data processing within one research project has its own ethical issues. The data manager could introduce biases, random or non-random, in the processes of adjusting for missing data, data transformation, and creation of derived variables.
· Data processing mistakes underlie several types of bias: misclassification, selection bias, and sampling bias. Any mistakes introduced at this stage will affect the final research results and eventually patient safety, because clinical interventions will be based on false research.
· The usual procedures for data privacy and data confidentiality must be respected. As far as possible, personal identifiers should not be accessible except to a few selected members of the research team. The data should be kept locked up or in password-protected computers. If the computers are connected online, the institution should have policies and software to assure data security.
DATA EDITING AND VALIDATION
· Data editing can lead to biases that will eventually impact on patient safety through wrong research data and conclusions.
· Potential biases in dealing with missing data
· Sampling biases
· Biases in data transformation: qualitative vs quantitative, ordinal vs nominal, continuous vs discrete, creating new variables by combining variables or by mathematical transformations
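The bias that handling of missing data can introduce is easy to show with a small worked example. The numbers below are invented for illustration (hypothetical systolic blood pressure readings), and the missingness rule is an assumption chosen to be non-random: the highest readings are the ones that go unrecorded.

```python
from statistics import mean, pstdev

# Hypothetical true values for eight patients.
true_values = [110, 120, 130, 140, 150, 160, 170, 180]

# Assumed non-random missingness: readings of 160 and above were not recorded.
observed = [v if v < 160 else None for v in true_values]

# Approach 1: complete-case analysis (drop records with missing values).
complete_cases = [v for v in observed if v is not None]
complete_case_mean = mean(complete_cases)

# Approach 2: mean imputation (fill gaps with the observed mean).
imputed = [v if v is not None else complete_case_mean for v in observed]

print("true mean:          ", mean(true_values))
print("complete-case mean: ", complete_case_mean)
print("mean after imputing:", mean(imputed))
print("true spread:        ", round(pstdev(true_values), 2))
print("spread after imputing:", round(pstdev(imputed), 2))
```

Both approaches underestimate the true mean here because the missing values differ systematically from the observed ones, and mean imputation also shrinks the apparent variability. A data manager choosing between such methods is therefore making a decision that affects the research conclusions.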
DATA VALIDATION
· Inconsistencies between administrative and clinical data[82] require validation by examination[83],[84]
· Differences between two national databases on mental health[85]
· Capture-recapture using different sources on adverse effects[86]
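Capture-recapture with two sources can be sketched with the standard Lincoln-Petersen estimator and Chapman's small-sample correction. This is offered as a general illustration and not as the method of the cited study; it assumes the two sources report cases independently, the population is closed, and records are matched without error. The function name and the case identifiers are hypothetical.

```python
def capture_recapture(source_a_ids, source_b_ids):
    """Estimate the total number of cases from two overlapping sources.

    n1, n2 = cases found by each source; m = cases found by both.
    Lincoln-Petersen: N = n1 * n2 / m
    Chapman:          N = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    """
    a, b = set(source_a_ids), set(source_b_ids)
    n1, n2, m = len(a), len(b), len(a & b)
    if m == 0:
        raise ValueError("no overlap between sources; Lincoln-Petersen is undefined")
    lincoln_petersen = n1 * n2 / m
    chapman = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    observed = len(a | b)
    return {
        "n1": n1,
        "n2": n2,
        "matched": m,
        "observed": observed,
        "lincoln_petersen": lincoln_petersen,
        "chapman": chapman,
        # Share of the estimated total that the two sources captured together.
        "completeness": observed / lincoln_petersen,
    }


# Hypothetical example: 60 adverse events in a hospital register,
# 50 in a pharmacy register, 30 appearing in both.
hospital_register = range(0, 60)
pharmacy_register = range(30, 80)
estimate = capture_recapture(hospital_register, pharmacy_register)
print(estimate)
```

In this example the two registers together list 80 distinct events, while the estimate suggests about 100 occurred, so roughly one in five would have been missed by both sources.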
CONFIDENTIALITY:
· Disclosure
· Sharing
REFERENCES
[4] Traffic Inj Prev. 2012;13 Suppl 1:57-63.
[62] Implement Sci. 2012 Apr 19;7:34.
[64] Genome Med. 2011 Jul 14;3(7):46.