Seminar Presentations in 2011

Dec 9, Mary Devereaux, PhD - Director of Biomedical Ethics, UC San Diego
The Ethics of Medical Informatics
The secondary use of patient health records raises a host of ethical issues. Clinicians gather personal patient information to prevent, diagnose and treat disease, advancing patient health. Some of this information may be highly confidential, including family history, genetic testing results, a diagnosis of addiction, or illegal immigration status. But all medical information, whether sensitive or not, is governed by legal and ethical requirements for privacy, confidentiality, and security, e.g., HIPAA. With the move to electronic health records (EHRs) and the growing capacity to gather and process terabytes of medical information, researchers understandably wish to access aggregated data to analyze patient outcomes, health demographics, and the economics of health care (Safran et al. 2007). Assuming that it's possible to de-identify information and protect patient privacy – and that patients give their consent – are there any other ethical issues researchers need to consider?

Dec 2, Jason Young, PhD - Assistant Professor, UC San Diego Center for AIDS Research
Open-source software solutions for clinical research: Applications for HIV
Modern clinical research requires flexible software tools for the management of complex data types, rapid testing of hypotheses, and secure access to and sharing of information across institutions via the web. For this purpose, we have developed an extensible, secure, and free Open source Clinical Content Analysis and Management System (OCCAMS). At its core, OCCAMS provides methods for visit schedule and form creation, form and data versioning, integrated quality control logic and workflows, real-time web reporting of data accrual, data export, and granular permission controls. Most importantly, though, the modular design and open-source nature of OCCAMS encourage community development of plugins that provide new functionality addressing the specific needs of any particular area of research. As examples, the HIV-specific OCCAMS plugins under development for mobile data collection, specimen storage and tracking, sequence analysis, and transmission network visualization will be discussed.

Nov 18, Kamalika Chaudhuri, PhD - Assistant Professor of Computer Science, UC San Diego
The Price of Privacy: Sample Complexity Bounds for Differentially Private Classification

A natural question to ask is: what is the sample requirement of a learning algorithm that guarantees a certain level of privacy and accuracy? Previous work studied this question in the context of discrete data distributions and provided an upper bound, along with lower bounds for some specific classes. Dr. Chaudhuri considers this question in the context of learning infinite hypothesis classes on continuous data distributions and shows a new lower bound that holds for general classification problems that obey certain fairly loose conditions. This talk is based on joint work with Dr. Daniel Hsu.

Nov 4, Nik Schork, PhD - Professor, The Scripps Research Institute
Functional Human DNA Sequence Variation and Global Diversity
Many studies have considered genomic differences between individuals from different geoethnic groups as a way of shedding light on human origins and possible explanations for differences in intra-individual disease susceptibility. However, most of these studies have focused on coding variations in a subset of all genes in the human genome. We consider data from 52 whole human genomes obtained from individuals from 10 different global populations. We assess the likely functional significance of both coding and non-coding variants across these genomes using a suite of bioinformatics techniques. Our findings showcase the utility of whole genome sequencing in population genetic studies and, importantly, emphasize the marked influence that genomic background diversity can have on the anticipated clinical interpretation of whole genome sequencing.

Oct 28, Xiaoqian Jiang, PhD - Postdoctoral Fellow in Biomedical Informatics, UC San Diego
New Directions of Privacy Preserving Data Dissemination
There is increasing concern about disclosing sensitive information when clinical data are disseminated, given the potential for breach of individual privacy. Data sharing has become critical in the acceleration of biomedical research and healthcare quality improvement. However, it is not always possible to share critical information "as is". Dr. Jiang will discuss new directions in privacy protection in the context of data release/sharing (i.e., type, size, and usage) to reduce information loss and make disseminated data more meaningful.

Oct 21, Bjoern Peters, PhD - Assistant Professor, La Jolla Institute for Allergy and Immunology
The Immune Epitope Database - Representing Experiments Using the Ontology for Biomedical Investigations
The Immune Epitope Database catalogs experiments curated from journal articles. This presentation will give an overview of the IEDB as a whole, and focus on two areas of interest for medical informatics. First, we have established an efficient document categorization pipeline to identify journal articles of interest in PubMed, and assign them to a subject category. This is achieved through a cost-sensitive document classification system consisting of a hierarchical set of support vector machines. Second, we are continuously increasing our use of formal ontologies for knowledge representation, and are primarily using the Ontology for Biomedical Investigations. This presentation will demonstrate how replacing a database controlled vocabulary with OBI classes can be used to increase consistency in data curation, avoid duplicates, improve documentation to external users and enhance search capabilities.

Oct 14, Teresa Helsten, MD - Assistant Clinical Professor, UC San Diego, Moores Cancer Center
EMRs: An Oncologist's Perspective
The Moores UCSD Cancer Center (MCC) is at the forefront of oncology practice with regard to implementation of an electronic health record in a large, academic medical center. In particular, MCC was among the earliest to adopt and adapt Epic’s oncology application, Beacon, for the management and administration of chemotherapy. Dr. Helsten will describe how the UCSD Health Information Service customized Beacon for use at MCC as well as the advantages, pitfalls, and future needs of electronic oncology practice.

Oct 7, Aziz Boxwala, MD, PhD - Associate Professor of Biomedical Informatics, UC San Diego
Clinical Research Informatics at UCSD
Dr. Boxwala will describe clinical research activities at UCSD. In the first part of the presentation, he will describe clinical research informatics tools and services available at UCSD via the Clinical and Translational Research Institute. These include tools for data management, for clinical trials management, and for querying clinical data. In the second part of the presentation, he will survey research projects in the Division of Biomedical Informatics that are advancing the field of clinical research informatics. The research spans a wide array of topics including multi-site collaborative studies, the informed consent process, privacy of human subjects, data sharing, and data analysis.

Sep 30, Josh Peterson, MD, MPH - Assistant Professor of Biomedical Informatics, Vanderbilt University
Personalizing Evidence-Based Medicine with Advanced Clinical Decision Support
Personalized medicine offers the promise of improving drug efficacy and safety, enabling a tailored approach to evidence-based therapies for patients. To bring genomic data into routine practice, many HIT platforms, including electronic health records, e-prescribing systems, and clinical decision support (CDS), need to be extended to integrate this new class of data with existing physiologic determinants of drug response. The presentation will review examples of successful CDS for personalizing drug prescribing that have been implemented and evaluated over the last decade at Vanderbilt University Medical Center, and report on our early experience with PREDICT (Pharmacogenomics Resource for Enhanced Decisions in Care and Treatment).

Sep 23, Wendy Chapman, PhD - Associate Professor of Biomedical Informatics, UC San Diego
Natural Language Processing for Analysis of Clinical Text: Challenges and Current Directions
After 50 years of research, natural language processing is still not widely applied in the clinical context. Researchers in clinical NLP are working towards making NLP more accessible through common data models, shared datasets, and web services. Dr. Chapman will provide a short tutorial on NLP, describe challenges in applying NLP to clinical text, describe current directions addressing those challenges, and summarize the vision for an NLP ecosystem hosted at UCSD. The goal of the ecosystem is to provide a cyber-environment for easier development, application, and benchmarking of clinical NLP tools.

Jun 3, Olivier Harismendy, PhD
Targeted Sequencing in Cancer Genomics
The identification of cancer somatic mutations via targeted sequencing of exons offers great opportunities for biomarker research and advanced clinical care. We will present both whole exome sequencing and candidate genes deep sequencing approaches to identify somatic mutations in clinical samples and will discuss their implications for basic as well as translational research.

May 27, Amarnath Gupta, PhD
Using Ontologies for Sharing Biomedical Research Data Effectively
“(Biomedical) Data sharing is essential for expedited translation of research results into knowledge, products and procedures to improve human health”. However, the process of sharing data for the benefit of others is often unclear to a scientist who is willing to share data. In many cases, although a scientist has made her research findings available to others through the web or through the supplementary materials of a journal, the data cannot be readily found or cannot be readily used. We believe that data sharing can be effectively performed through semantic annotation, a process by which elements of the data are formally associated with known ontologies. In this talk, we present the principles of semantic annotation to facilitate data sharing and show how ontologies can be used to make the shared data discoverable and utilizable. We use concrete examples to illustrate how different kinds of ontologies play a role in the process of semantic annotation for data sharing.

May 20, Robert El-Kareh, MD, MS, MPH
Clinical Informatics Research at UCSD
Clinical decision support systems have the potential to lead to improved physician performance and delivery of higher quality and more cost-effective care. However, these interventions have yielded mixed results when implemented in real clinical settings. The barriers for these systems are both technical and non-technical. This presentation will provide a brief overview of some of the issues related to the effective design and implementation of clinical decision support targeting physicians. Three example interventions will be discussed to highlight effective and ineffective approaches.

May 6, Dallas Thornton, MEng, MBA
HIPAA and FISMA-Compliant Research Cyberinfrastructure Services at SDSC
Late this summer, working with the Division of Biomedical Informatics, SDSC will release a new secure private cloud environment to support research with HIPAA and/or FISMA compliance requirements at UCSD. This exciting new environment will allow researchers and organizations to safely store, manage, analyze, and deliver at-scale content that must meet these important security standards, including PHI and human sequencing data. The environment leverages the experience, design expertise, high network bandwidth, and resources of SDSC, the Division, and UCSD in a way that provides individual researchers capabilities, security, and economies of scale. This presentation will provide an overview of the services available, architecture, management processes, and ways to collaborate in this environment.

Apr 29, Ingolf Krueger, PhD
Healthcare Cyber-infrastructures
Our ability to rapidly obtain, correlate, mine and create new data from various traditional and non-traditional sources, and to make derived information available to patients, practitioners and researchers under stringent observance of policy constraints has become a key driver for quality healthcare. Converting and integrating scattered health records into an electronic medical record (EMR) via health information exchanges (HIEs) is one of many important steps towards realizing this goal. Others range from integration of wireless devices into novel health monitoring, alerting, and intervention workflows that empower patients, healthcare providers and researchers alike, to utilization of cloud-based networking, computation and storage resources.
This talk will outline central Software Engineering challenges in developing CyberInfrastructures for healthcare. From critical requirements for scalability, fault-tolerance and information assurance we will derive a policy-enabled architecture blueprint for rapidly building integrated healthcare solutions, and show how this blueprint maps to traditional and cloud-based deployment architectures. We demonstrate viability of the blueprint and deployment via three case studies: the Physical Activity Level Monitoring System (PALMS), the Cyberinfrastructure for Comparative Effectiveness Research (CyCore) and CitiSense - Adaptive Services for Community-Driven Behavioral and Environmental Monitoring to Induce Change.

Apr 22, Omar Bouhaddou, PhD
The Nationwide Health Information Network and San Diego Initiatives
There is currently an unprecedented opportunity for healthcare IT and informatics to help transform the healthcare system into a paperless industry. Everywhere in the world, there is an increased awareness associated with public-private investments and provider incentives to adopt and meaningfully use interoperable Electronic Health Records (EHRs). This presentation provides an overview of the United States Nationwide Health Information Network (NwHIN) initiative, a secure, standards-based, non-centralized infrastructure for health information exchange over the Internet. Providers, consumers, hospitals, pharmacies, laboratories, and others can set up a gateway conformant to the NwHIN specifications and use a common set of protocols and messages and a common trust agreement for data sharing, including identifying shared patients, representing and enforcing patient preferences, and finding and retrieving health information from other participants. The first bi-directional production implementation took place in San Diego. The San Diego initiatives demonstrate that these standards can be implemented consistently and provide lessons learned that are informing a more scalable, plug-and-play network.

Apr 15, Marcio von Muhlen, PhD
Social Media, Medicine, and the Physician
What exactly is social media? Why are blogs and Facebook more popular than forums and wikis? I will review the underlying forces driving the spread of social media, aiming to inform the physician seeking to both understand its power and leverage it to improve healthcare. I will also introduce a new social media service under development at DBMI aimed at physician communities.

Apr 8, Gary Siuzdak, PhD
A Bioinformatics Platform for Metabolomics
Metabolomics, the quantitative unbiased analysis of endogenous metabolites from cells, tissues, fluids or whole organisms, has become an integral part of functional genomics efforts, lending itself as a tool for clinical diagnostics, toxicology, agriculture, biofuels, and understanding fundamental biochemistry. Where the genome and proteome represent upstream biochemical events, the metabolites correlate with the most downstream biochemistry and therefore most closely represent the phenotype. This has been proven by the broad success of metabolite analysis in clinical diagnostics. One of our aims is to obtain a comprehensive quantitative view of the metabolome to expand our understanding of what pathways are altered in specific diseases. To address this problem, we have developed multiple novel mass spectrometry-based informatics and technology platforms for metabolomics, including both solution-based approaches and surface-based mass spectrometry, such as nanostructure-initiator mass spectrometry (NIMS) for tissue imaging. Our novel open source bioinformatics software, XCMS (with over 30,000 downloads), allows for alignment, statistical evaluation, and metabolite characterization from LC/MS data. In addition, our group has constructed Metlin, currently the largest online metabolite MS/MS database (with over 300,000 hits). At this meeting I will introduce a new, more comprehensive informatics platform for metabolomics.

Apr 1, Alan Calvitti, PhD
EMR Usability and Cognitive Task Loading
Electronic medical records (EMRs) confer advantages over paper records. But EMR usability must be distinguished from utility. In the context of delivery of quality care in ambulatory settings, specifically in the time-constrained environment of the office visit, EMRs impose administrative and cognitive costs on physicians who, in an ideal patient-centric encounter, would focus primarily on communicating with the patient. We develop a systems-level approach to study EMR usability and physician cognitive task load in the context of patient-provider communication during office visits. The model takes into account the level of the physician’s EMR activity, the degree of task switching between EMR and patient, and the complexity of EMR activity. Task analysis is developed by considering both hierarchical and sequential task analysis. Multimodal, time-domain measurements of activity, such as EMR mouse click activity, physician’s gaze, and patient and provider vocalizations, are compared along the common time axis of the visit. System-level process outcome measures that may help guide future EMR improvements include the distribution of EMR mouse click activity for the whole visit across EMR functional categories (e.g., notes, meds, orders) and transition graphs for quantitative profiling and comparison of EMR tasks between visits, or conditional on gaze or vocalization codes within a given visit.

March 11, Naveen Ashish, PhD
Mediation Technology for Neuroinformatics Data Integration: An (F)BIRN Perspective
In this talk I will present the work done in providing integrated data access to multiple neuroscience data sources, residing at different institutions and of different kinds, using the "information mediation" technology and approach. This work is part of the "BIRN" (Biomedical Informatics Research Network) project, as a collaboration between USC/ISI and UC Irvine. We have provided integrated data access to sources such as the Human Imaging Database "HID", the eXtensible Neuroscience Archive Toolkit (XNAT) and others using mediation technology from USC/ISI. I will present the design and implementation details in developing this integrated application, as well as lessons learned and directions for (ongoing) further research.

March 4, Daniella Meeker, PhD
Mining Online Social Network Data: Health Risk Behavior in Adolescents and Young Adults
Peer effects have long been identified as key factors in health risk behavior, predicting outcomes such as smoking, drinking and obesity. However, results of health outcomes research based on analysis of network data are challenging to interpret conclusively. Online social networking sites have generated massive quantities of data that have been extensively mined for marketing purposes, development of recommender systems, and search optimization. The highest adoption and use of social media is among young people who are also in critical phases of developing health habits that will have lifelong impact. Can these data sources also be mined to better predict health risk behavior or identify potential targets and methods for intervention? We are conducting two longitudinal studies that link online social graphs to semantic analysis and survey data in teens and young adults. The first, a survey of high school students linked to their Facebook and MySpace graphs, investigates how self-reported smoking and drinking are associated with different modalities and contexts of peer influence, such as romantic interest or perceived popularity in addition to reported friendship. The second is an analysis of 300 randomly selected international egocentric MySpace networks beginning in 2006. Semantic analysis of profile content was conducted to investigate how publicly reported smoking and underage drinking were related to behavior of peers, music, and film preferences. I will describe analytic approaches we have applied from classical statistics and machine learning and the implications of our early findings from the standpoint of methodology and public health.

February 25, Craig Morioka, PhD & Frank Meng, PhD
Sentence Clustering & AAA Screening at GLA
Part I: Text sequence patterns can be used to capture the variations in syntactic structure of similar sentences or phrases. The resulting patterns can be used as the basis for a pattern-based classifier, as a starting point to bootstrap the pattern building process for regular expression-based classifiers, to reveal the variation characteristics of sentences and phrases within a particular domain, or to expedite tagging large data sets. We present a methodology for generating such patterns from medical documents along with preliminary results and possible applications.
Part II: Our work improves Veteran patient care by facilitating the coding of radiology results for abdominal aortic aneurysm (AAA) screening patients. The main research objective is to assist the co-management of patients by structuring imaging results, classifying pertinent positive and negative AAA findings, and discovering trigger words that will assist in the automatic coding of numerical results.

February 18, Yang Huang, PhD
Developing NLP Applications in Clinical Settings - Kaiser Experience
There has been tremendous progress in improving natural language processing (NLP) technologies in the biomedical domain over the past decade; yet relatively few NLP applications have been successfully deployed in clinical settings, compared to numerous research applications. I will examine the challenges specific to developing NLP applications for clinical environments, and share the solutions and lessons learned at Kaiser Permanente Southern California.

February 11, Philip Payne, PhD
Knowledge Synthesis for In-Silico Science and Personalized Healthcare
The modern healthcare and life sciences environments are characterized by rapidly expanding collections of heterogeneous and large-scale data, information, and knowledge. Such resources represent significant opportunities for hypothesis generation, in-silico science, and the facilitation of personalized healthcare delivery paradigms. However, realizing such benefits requires the application of integrative informatics methods and technologies in order to synthesize knowledge from a full spectrum of data and information types and sources. In this presentation, a set of examples, drawn from the experiences of The Ohio State University Medical Center, will be used to illustrate the challenges and opportunities associated with such informatics-enabled knowledge synthesis. These examples will include case studies related to: 1) the creation of distributed data and information sharing “fabrics” spanning the clinical, research, and educational domains and crossing traditional organizational boundaries; 2) the application of knowledge-based agents to leverage domain knowledge derived from publicly available databases, ontologies, and literature extracts in order to enable hypothesis generation and systems-level in-silico science; and 3) the derivation and testing of marker-complexes that link bio-molecular and phenotypic factors to support personalized healthcare delivery.

February 4, Jane Burns, MD
Unraveling the Mysteries of Kawasaki Disease with Genetic/Genomic Tools

January 28, Lisa Madlensky, PhD, CGC
What Do Patients Do with All That Data? Lessons from Cancer Genetics and Genomics
This seminar will include a discussion of the patient perspective of genetic information, using data from studies in the oncology and cancer risk assessment settings. Topics that will be covered include uptake of genetic/genomic testing; patients' understanding of genetic test results; and whether patients are likely to change their health behaviors or medical management as the result of genetic information. The discussion will also examine the current ethical and psychosocial issues that are increasingly important as large-scale genomic information becomes more accessible.

January 21, Eugene Yeo, PhD
RNA binding networks in mammalian neurons
RNA binding proteins govern the fate of genes in the cell by interacting directly with sequence and structural motifs embedded within pre- and mature messenger RNAs. This mode of post-transcriptional regulation is most prevalent in the nervous system, and dysregulation underlies many human disorders such as ALS and Spinal Muscular Atrophy. I will present our work on identifying the targets of a cohort of RNA binding proteins expressed in the brain using genome-wide biochemical and molecular approaches. We integrate RNA-seq, splicing-arrays and CLIP-seq data to reveal unexpected molecular mechanisms by which these RBPs regulate gene expression in mammalian neurons.

January 14, Todd Stout
Understanding Emergency Medical Service (EMS) Data and Its Use
EMS data has been used by epidemiologists for more than 10 years to monitor in real time for symptoms of the use of weapons of mass destruction (WMD). Despite this history, due in large part to the local, ‘fractured’ nature of EMS in North America, as well as evolving technologies, the use of EMS data for public health surveillance and situational awareness has only become more commonly accepted within the last few years. Todd Stout, a former paramedic and EMS manager turned software developer and EMS data evangelist, will explain the current state of data and technology in EMS and provide an overview of existing research (including work done by UCSD Super Computer Center researchers), current research objectives, and future needs and directions. He’ll show current operational and public health usage of EMS data, and examples of EMS records and types. Finally, he will discuss the potential for EMS data availability for meaningful research projects, using the 40+ million public safety records in databases right here in San Diego County.

January 7, Carl Stepnowsky, PhD
Sleep Apnea and Telemedicine: An Overview
This talk has several goals. First, some background on sleep apnea and its treatment will be provided. Then, given that background, we will discuss the role of telemedicine in both diagnostic and treatment methods related to sleep apnea. Finally, our own interventional approach will be discussed and demonstrated.