Projects

The short-term projects below are open to interns and those signing up for independent studies.

DATA SHARING

Motivating the sharing of biomedical data, and best practices for data management and dissemination. Investigate barriers to sharing biomedical data and software, collect requirements for sharing software and tools, design a process and workflow for collecting and sharing them, and develop a web application that supports that process.
Qualifications needed: literature review, information synthesis, workflow development

Skills to be learned: synthesizing information into a concrete plan for developing an environment that encourages sharing, understanding of the privacy requirements for sharing biomedical data, applying user-centered design, understanding factors involved in data sharing.

HUMAN COMPUTER INTERACTION

Creating an intuitive, motivating, and effective online marketplace of data and software for biomedical research. iDASH seeks to integrate biomedical data and tools in a secure and private environment for research use. Motivating people with data to share that data, and people with software to share that software, involves creating an environment that benefits not only the future users of the donated software and data but also those donating them. Working with potential users and contributors, design a software and data ecosystem.
Qualifications needed: training in human-computer interaction and user-centered design.

Skills to be learned: knowledge of existing types of data and software relevant for biomedical research, knowledge about successful ecosystems, knowledge about factors involved in motivating and enabling biomedical data sharing within a private and secure environment.

PRIVACY

Implementation of a cryptographic shuffling network.
Why: Institutions are often hesitant to disclose data that reveal information about the practices of the institution. One area where this is a particular issue is device and procedure risk assessment. In this context we assume a central data warehouse or central processing unit (CPU) and a collection of participating data sources (PDSs). Anonymous submission of data means that neither the CPU nor any other PDS can tell from which PDS a particular result originates.

What: the intern will design and implement a software toolset that will allow the participating sources to form a cryptographic cloud in which queries and data are routed in a random manner before being presented to the central processing unit.

How: the implementation will be in Python using standard public key cryptography libraries as well as standard networking protocols.

Qualifications needed: Python programming skills, some understanding of (inter)networking and cryptography.

Skills to be learned: networking, practical data privacy, public key cryptography, health care data sharing.
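
As a rough illustration of the random routing described under "What" above, the following minimal sketch simulates only the shuffling step, in memory. It is not the project's design: the function names `mix_round` and `shuffle_and_submit` and the `rounds` parameter are hypothetical, and both the public key encryption and the networking layer that the project calls for are deliberately omitted. In a real system each payload would be encrypted so that intermediate sources cannot read it.

```python
import secrets
from collections import Counter

# Cryptographically strong random source from the standard library.
_rng = secrets.SystemRandom()


def mix_round(batches):
    """Run one shuffling round.

    `batches[i]` is the list of messages currently held by source i.
    Every source forwards each message it holds to a randomly chosen
    source, and each source then shuffles what it received.
    """
    n = len(batches)
    forwarded = [[] for _ in range(n)]
    for held in batches:
        for message in held:
            forwarded[_rng.randrange(n)].append(message)
    for batch in forwarded:
        _rng.shuffle(batch)
    return forwarded


def shuffle_and_submit(submissions, rounds=3):
    """Route messages through several rounds, then deliver them.

    `submissions[i]` is the list of payloads originating at source i.
    Returns the payloads as the central unit would receive them: a flat
    list with no record of which source each one came from.
    NOTE: payloads are plain values here; encryption is omitted (assumption).
    """
    batches = [list(s) for s in submissions]
    for _ in range(rounds):
        batches = mix_round(batches)
    delivered = [m for batch in batches for m in batch]
    _rng.shuffle(delivered)
    return delivered


if __name__ == "__main__":
    per_source = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]
    received = shuffle_and_submit(per_source)
    # Every payload arrives exactly once; the order is random.
    print(sorted(received))
```

The sketch preserves the multiset of submitted payloads while discarding their origin and order. Resistance to colluding sources, traffic analysis, and timing attacks would need to be addressed separately in an actual implementation.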

SEMI-AUTOMATED QUANTIFICATION OF PULMONARY EMBOLISM LESIONS

Pulmonary embolism (PE) is a life-threatening condition that occurs when a thrombus formed in the deep veins of the leg breaks off, travels through the circulation, and lodges in a pulmonary artery. The current standard of care for detecting PE is a volumetric CT image of the lung vasculature. This project is part of a larger effort aimed at developing automated computer detection of PE lesions in CT images. We will develop and evaluate image segmentation schemes to capture the full volumetric extent of PE lesions and then compute meaningful summary measurements of these lesions.
Qualifications needed: Proficiency in C++ (preferable), Python, or Java. Familiarity with the Linux operating system is also desirable. Knowledge of medical imaging would be nice.

Skills to be learned: Volumetric medical image segmentation, basics of computer-aided detection.
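
One example of a summary measurement that could be computed once a lesion has been segmented is its volume. The sketch below is only an illustration under stated assumptions: the segmentation is assumed to be available as a binary 3D mask (nested lists of 0/1 indexed as slice, row, column), the voxel spacing is assumed to be known in millimeters, and the function name `lesion_volume_mm3` is hypothetical. The project itself may use different measurements and data structures.

```python
def lesion_volume_mm3(mask, spacing_mm):
    """Estimate lesion volume from a binary segmentation mask.

    mask       -- 3D nested list of 0/1 values, indexed [slice][row][col]
    spacing_mm -- (dz, dy, dx) voxel spacing in millimeters (assumed known)

    The volume is the number of lesion voxels times the volume of one voxel.
    """
    dz, dy, dx = spacing_mm
    voxel_count = sum(v for slice_ in mask for row in slice_ for v in row)
    return voxel_count * dz * dy * dx


if __name__ == "__main__":
    # Toy 2 x 2 x 2 mask with three lesion voxels.
    toy_mask = [
        [[1, 0], [0, 1]],
        [[0, 0], [1, 0]],
    ]
    print(lesion_volume_mm3(toy_mask, (2.0, 0.5, 0.5)))
```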

NATURAL LANGUAGE PROCESSING

Knowledge representation, extraction algorithms, and visualizations for clinical reports that enable sharing of clinical data originally generated in textual format. Several projects in this area are available, including developing a repository for annotations of clinical reports using common information models, developing an intelligent front-end interface for assisting a user in adapting an information extraction system to a new domain, and creating a repository of knowledge extracted from clinical reports and an interface for querying that knowledge.
Qualifications needed: database design, computer programming with Java or Python

Skills to be learned: natural language processing techniques necessary for processing clinical texts, user needs for applying natural language processing, knowledge representation models being developed for information described in clinical texts.

DATA VISUALIZATION

Visualization of biomedical data for exploration, retrieval, and analysis. Structured and unstructured data contained in clinical and biomedical data repositories are often not used because they are difficult to visualize or understand. The data are complex mixtures of text, numbers, and codes, all changing over time. Develop methodologies for visualizing the information described in the data so that clinical researchers can access the data more efficiently.
Qualifications needed: programming in Java

Skills to be learned: the nature of clinical data, methodologies for extracting features from data (such as NLP and image feature extraction), user needs for biomedical research.

INTEGRATING GENOTYPIC, PHENOTYPIC, AND BEHAVIORAL DATA FOR DISCOVERY

To develop a more comprehensive understanding of complex chronic human diseases like diabetes, we will collect, integrate, and analyze diverse data from sources including clinical tests, medical devices, body sensors, genomics, and patient surveys.
Qualifications needed: basic programming skills, statistical knowledge, ER modeling, familiarity with healthcare/biology concepts

Skills to be learned: data integration and standardization, data modeling, ontology in intelligent systems, HCI, data visualization

INFORMED CONSENT

Creating an informed consent process that offers subjects choices in how they participate in research studies involving biospecimen banks and longitudinal databases. Develop an electronic library of informed consent forms and a repository of signed consents. Create a broker that determines if proposed uses of data and biospecimens match the subjects’ consents.
Qualifications needed: web application design, iPad application development, and ontologies; or interest in and knowledge of human subjects research ethics and law; or clinical study design

Skills to be learned: Human subjects research regulations and ethics, mobile application design, use of ontologies in designing intelligent systems
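
The broker idea described above can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the `Consent` and `ProposedUse` classes, the `eligible_subjects` function, and the use-category labels are hypothetical, and a real broker would likely draw its categories from an ontology rather than from plain strings. The sketch treats a proposed use as matching a consent when every use it requires is among the uses the subject permitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    """A subject's recorded choices (hypothetical structure)."""
    subject_id: str
    permitted_uses: frozenset


@dataclass(frozen=True)
class ProposedUse:
    """What a study proposes to do with data or biospecimens (hypothetical)."""
    study_id: str
    required_uses: frozenset


def eligible_subjects(consents, proposal):
    """Return IDs of subjects whose consent covers every required use."""
    return [
        c.subject_id
        for c in consents
        if proposal.required_uses <= c.permitted_uses  # subset test
    ]


if __name__ == "__main__":
    consents = [
        Consent("S1", frozenset({"genetic_analysis", "data_sharing"})),
        Consent("S2", frozenset({"data_sharing"})),
        Consent("S3", frozenset()),
    ]
    proposal = ProposedUse("STUDY-A", frozenset({"genetic_analysis"}))
    print(eligible_subjects(consents, proposal))
```

A set-based match like this keeps the decision rule easy to audit. Richer consent choices, such as time limits or conditions on re-contact, would need a more expressive representation.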

For more information, contact:

Hyeon-eui Kim, PhD, MPH, RN
Assistant Professor
Department of Biomedical Informatics
hyk038@ucsd.edu
(858) 822-4368