Motivating the sharing of biomedical data and establishing best practices
for data management and dissemination. Investigate barriers to sharing
biomedical data and software, collect requirements for sharing software
and tools, design a process and workflow for collecting and sharing the
software and tools, and develop a web application for that process.
Qualifications needed: literature review, information synthesis, workflow development
Skills to be learned: synthesizing information into a
concrete plan for developing an environment that encourages sharing,
understanding of the privacy requirements for sharing biomedical data,
applying user-centered design, understanding factors involved in data
HUMAN COMPUTER INTERACTION
Creating an intuitive, motivating, and effective online marketplace of
biomedical data and software for research use. iDASH
seeks to integrate biomedical data and tools in a secure and private
environment for research use. Motivating people to share their data and
software involves creating an environment that benefits not only the
future users of the donated data and software but also those who donate
them. Working with potential users and contributors, design a software
and data ecosystem.
Qualifications needed: training in human-computer interaction and user-centered design.
Skills to be learned: knowledge of existing types of
data and software relevant for biomedical research, knowledge about
successful ecosystems, knowledge about factors involved in motivating
and enabling biomedical data sharing within a private and secure
Implementation of a cryptographic shuffling network.
Why: Institutions are often hesitant to disclose data that reveal
information about the practices of the institution. This is a particular
concern in device and procedure risk assessment. In this context we
assume a central data warehouse or central processing unit (CPU) and a
collection of participating data sources (PDSs). Anonymous submission of
data means that neither the
CPU, nor any other PDS, can tell from which PDS a particular result
What: the intern will design and implement a software toolset that
will allow the participating sources to form a cryptographic cloud in
which queries and data are routed in a random manner before being
presented to the central processing unit.
How: the implementation will be in Python using standard public key
cryptography libraries as well as standard networking protocols.
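The routing idea can be illustrated with a minimal sketch of a single mixing step. It is not the project's design: the `Submission` class, the `mix_batch` function, and the `min_batch` threshold are hypothetical names chosen for illustration, and encryption is deliberately left out. In a real shuffling network each payload would be ciphertext encrypted for the CPU with a public key library.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    source_id: str  # hypothetical label for the PDS that produced the payload
    payload: bytes  # assumed to be ciphertext for the CPU in a real system


def mix_batch(submissions, min_batch=3):
    """Drop the source labels from a batch and return the payloads in random order."""
    if len(submissions) < min_batch:
        # A very small batch would make the sender easy to guess.
        raise ValueError("batch too small to hide the sender")
    payloads = [s.payload for s in submissions]
    # SystemRandom draws from the operating system's secure random source.
    secrets.SystemRandom().shuffle(payloads)
    return payloads
```

A node would collect submissions from several PDSs, call `mix_batch`, and forward only the returned payloads, so the next hop sees neither the source labels nor the arrival order.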
Qualifications needed: Python programming skills, some understanding of (inter)networking and cryptography.
Skills to be learned: networking, practical data privacy, public key cryptography, health care data sharing.
SEMI-AUTOMATED QUANTIFICATION OF PULMONARY EMBOLISM LESIONS
Pulmonary embolism (PE) is a life-threatening condition that occurs when a
thrombus formed in the deep veins of the leg breaks off, travels
through the circulation, and lodges in a pulmonary artery. The current
standard of care for detecting PE is a volumetric CT image of the lung
vasculature. This project is part of a larger project aimed at
developing automated computer detection of the PE lesions in CT images.
We will develop and evaluate image segmentation schemes to capture the
full volumetric extent of PE lesions and then to compute meaningful
summary measurements of these lesions.
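One example of a summary measurement is the volume of each lesion. The sketch below assumes a segmentation has already produced a 3D binary mask stored as nested lists indexed `[z][y][x]`, and that the voxel spacing in millimetres is known. The function name `lesion_volumes`, the input layout, and the choice of 6-connectivity are assumptions made for illustration, not the project's method.

```python
from collections import deque


def lesion_volumes(mask, spacing_mm=(1.0, 1.0, 1.0)):
    """Return the volume in mm^3 of each 6-connected component, largest first."""
    voxel_volume = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    # Collect the coordinates of every foreground voxel.
    remaining = {
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    }
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    volumes = []
    while remaining:
        # Grow one component with a breadth-first search from an arbitrary seed.
        queue = deque([remaining.pop()])
        count = 1
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in neighbours:
                candidate = (z + dz, y + dy, x + dx)
                if candidate in remaining:
                    remaining.remove(candidate)
                    queue.append(candidate)
                    count += 1
        volumes.append(count * voxel_volume)
    return sorted(volumes, reverse=True)
```

For volumes of realistic size, an array library would be a more practical choice than nested lists; the plain-Python version is used here only to keep the example self-contained.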
Qualifications needed: Proficiency in C++ (preferred),
Python, or Java. Familiarity with the Linux operating system is also
desirable. Knowledge of medical imaging is a plus.
Skills to be learned: Volumetric medical image segmentation, basics of computer-aided detection.
NATURAL LANGUAGE PROCESSING
Knowledge representation, extraction algorithms, and visualizations
for clinical reports that enable sharing of clinical data originally
generated in textual format. Several projects in this area are
available, including developing a repository for annotations of clinical
reports using common information models, developing an intelligent
front-end interface for assisting a user in adapting an information
extraction system to a new domain, and creating a repository of
knowledge extracted from clinical reports and an interface for querying
Qualifications needed: database design, computer programming with Java or Python
Skills to be learned: natural language processing
techniques necessary for processing clinical texts, user needs for
applying natural language processing, knowledge representation models
being developed for information described in clinical texts.
Visualization of biomedical data for exploration, retrieval, and
analysis. Structured and unstructured data contained in clinical and
biomedical data repositories are often not used, because they are
difficult to visualize or understand. The data are complex mixtures of
text, numbers, and codes, all of which change over time. Develop
methodologies for visualizing the information described in the data for
clinical researchers to have more efficient access to the data.
Qualifications needed: programming in Java
Skills to be learned: the nature of clinical data,
methodologies for extracting features from data (such as NLP and image
feature extraction), user needs for biomedical research.
INTEGRATING GENOTYPIC, PHENOTYPIC, AND BEHAVIORAL DATA FOR DISCOVERY
To develop a more comprehensive understanding of complex chronic
human diseases like diabetes, we will collect, integrate, and analyze
diverse data from sources including clinical tests, medical devices,
body sensors, genomics, and patient surveys.
Qualifications needed: basic programming skills and statistical knowledge
Skills to be learned: data security and deidentification; data integration and visualization; knowledge about diabetes, wireless health, genomics
Creating an informed consent process that offers subjects choices in
how they participate in research studies involving biospecimen banks and
longitudinal databases. Develop an electronic library of informed
consent forms and a repository of signed consents. Create a broker that
determines if proposed uses of data and biospecimens match the subjects’
Qualifications needed: web application design, iPad
application development, ontologies; or interest and knowledge in human
subjects research ethics, law; or clinical study design
Skills to be learned: Human subjects research regulations and ethics, mobile application design, use of ontologies in designing intelligent systems
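The matching step of the consent broker described above can be sketched as a set comparison. This is a minimal illustration under stated assumptions: the `Consent` class, the `permitted_subjects` function, and the use-category strings are hypothetical, and a real broker would likely draw its categories from an ontology rather than from plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Consent:
    subject_id: str
    # Hypothetical use categories the subject has agreed to.
    permitted_uses: set = field(default_factory=set)


def permitted_subjects(consents, proposed_uses):
    """Return the IDs of subjects whose consent covers every proposed use."""
    needed = set(proposed_uses)
    return [c.subject_id for c in consents if needed <= c.permitted_uses]
```

For example, a study proposing both `"diabetes_research"` and `"genomic_analysis"` would be matched only to subjects whose signed consent lists both categories.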
For more information, please contact:
Hyeon-eui Kim, PhD, RN, MPH
Assistant Professor, Department of Biomedical Informatics