Seminar Presentations in 2010

December 3, Kelly Frazer, PhD
Human Genome Variation

November 19, Chanchun Xiao, PhD
MicroRNA Control in the Immune System
MicroRNAs (miRNAs) have emerged as a major class of trans-regulators that control gene expression at the messenger RNA level. Hundreds of miRNAs have been identified and bioinformatic studies predict that more are present in the human and mouse genomes. We focus on studying the identity, function, and molecular mechanisms of miRNAs in the immune system. We use the Illumina deep sequencing technology to examine miRNA expression profiles in the mouse immune system and in lymphoma specimens from a large cohort of patients with diffuse large B cell lymphoma, aiming to establish a comprehensive view of small RNAs (including miRNAs) expressed in lymphocytes under health and disease conditions, and to discover small RNAs with diagnostic and prognostic values. We employ mouse genetics and other experimental approaches to study the functions of individual miRNAs in lymphocyte development, immune responses, autoimmune diseases, and lymphomagenesis. Our studies have generated insights into the roles of individual miRNAs in the immune system and the molecular mechanisms underlying miRNA functions. I will discuss our findings in the context of general miRNA biology.

November 12, Claudio Joazeiro, PhD
A Ribosome-associated E3 Ubiquitin Ligase Implicated in Protein Quality Control
Our laboratory is interested in assigning function to, and elucidating mechanisms of, E3 ubiquitin ligases, the components of the ubiquitin-proteasome system which confer specificity to protein ubiquitylation. I will present our work leading to the annotation of over 600 putative E3s encoded in the human genome; this has led to the realization that more than half of human E3s have not been studied at any level, so I will also describe functional genomic tools that we have developed in order to accelerate discovery in the field (1,2). In the second part of the talk, I will present our recent findings on the role of a ribosome-associated E3 ubiquitin ligase in protein quality control (3).
1. Li et al. 2008. PLoS One 3(1):e1487
2. Deshaies and Joazeiro 2009. Annu Rev Biochem. 78:399-434.
3. Bengtson and Joazeiro 2010. Nature 467:470-3.

November 5, William Hsu, PhD
Using Disease Models to Support Context-Sensitive Visualization and Evidence-Based Medicine
Given the large quantity of diverse, heterogeneous data in a typical patient record, clinicians spend much of their time and effort finding relevant information to help accomplish their tasks. One of the challenges in today’s healthcare environment is matching the increased capability of gathering patient data with a comparable ability to understand, analyze, and act rationally upon this information. Our group is developing informatics tools to facilitate the extraction and structuring of observational data collected during clinical practice. My research focuses on utilizing this data to build disease models and end-user applications that leverage these models to answer questions related to the diagnosis and treatment of individual patient cases. In this talk, I will describe two ongoing projects: 1) a tool for abstracting and visualizing results of clinical trials reported in literature to support disease model creation, and 2) a model-driven application that visually summarizes trends in the patient data.
* contact site manager for presentation slide

October 29, Staal Vinterbo, PhD
Count Queries, Their Use, Challenges, and Opportunities
Count queries ask for the number of records in a database or data warehouse that match a given predicate. Such queries are useful for business intelligence applications, inventory control, and study cohort finding. As count queries do not have any inherent privacy-preserving properties, care must be taken when answering such queries on patient data. The UCSD Criquet tool is designed to efficiently serve the needs of health sciences researchers and simultaneously protect patient privacy. We will present this tool and discuss design choices in the context of privacy. Count queries can also be used as a general data access mechanism. We will present such a count-based query mechanism, as well as an example of how count queries can be used to parallelize sequential algorithms that can be formulated in terms of computations on histograms.
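As a rough illustration of the idea, a count query can be sketched as a function that takes a predicate and returns how many records satisfy it, and a histogram as a batch of such counts. The records, field names, and function names below are hypothetical; this is not the Criquet tool and it makes no attempt at privacy protection.

```python
from collections import Counter

# Hypothetical toy records; the field names are illustrative assumptions.
RECORDS = [
    {"age": 34, "diagnosis": "asthma"},
    {"age": 71, "diagnosis": "diabetes"},
    {"age": 68, "diagnosis": "diabetes"},
    {"age": 52, "diagnosis": "asthma"},
    {"age": 80, "diagnosis": "hypertension"},
]


def count_query(records, predicate):
    """Return the number of records for which predicate(record) is true."""
    return sum(1 for record in records if predicate(record))


def histogram(records, key):
    """Return one count per distinct value of `key` (a batch of count queries)."""
    return Counter(record[key] for record in records)


# Cohort-finding style question: how many patients aged 65+ have diabetes?
older_diabetics = count_query(
    RECORDS, lambda r: r["age"] >= 65 and r["diagnosis"] == "diabetes"
)
by_diagnosis = histogram(RECORDS, "diagnosis")
```

Because each answer is only a number, the caller never sees individual records, although, as the abstract notes, that alone does not guarantee privacy.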

October 22, Richard Belew, PhD, Department of Cognitive Science, UCSD
HIVortal: Building a Shared Resource with Deep Semantics for Multi-disciplinary Knowledge of a Shared Pathogen
HIV of course has many particular biological features that distinguish it and its infected host. But HIV is also special: it has captured enormous human attention, during a time of unprecedented growth in our collective biological knowledge. So while the biological knowledge concerning HIV is unique in many respects, the long-term, deep, multi-disciplinary process by which it has been acquired represents an important example of knowledge regarding a larger class of modern pathogens with which the human species must become familiar. The rapid growth of biological knowledge has also generated many conceptions as to what biological knowledge looks like. "Systems biology" is a brave effort to find similarity which gives biological theory the same smooth shapes we associate with our knowledge of the physical universe; a “General Systems Theory of Biology” (GSTB) would be its goal. The HIVortal – HIV vertical portal – counter-balances the genericity GSTB approaches often impose, with efforts to integrate discipline-specific particulars of the HIV system towards a tool that facilitates collaboration across disciplines.

October 15, Charles Elkan, PhD, Department of Computer Science and Engineering, UCSD
Adding Latent Features to Log-linear Models to Make Predictions in Social Networks
Given a network, we often want to predict labels for nodes. In a social network of bloggers, a node label might be whether a blogger is leftwing or rightwing. We also often want to make predictions for edges. For example, in a protein network we might predict whether or not two proteins interact. From a machine-learning point of view, an unusual aspect of these tasks is that nodes possess unique identifiers, but often few features describing nodes are available. In this talk, I will present a new method for learning to make predictions concerning nodes and edges. The technical novelty of the method is that it adds hidden features to log-linear models. The method is called LFL for "latent feature log linear." The new method is the first to satisfy several important desiderata: (a) labels may be multiclass, (b) both feature values and unique identity are exploited, (c) latent features are inferred for nodes, (d) bias in selecting training data is unimportant, (e) outputs are well-calibrated probabilities, and (f) the method scales to large datasets. Experiments show that the new method achieves state-of-the-art accuracy for several collaborative filtering, link prediction, and label prediction tasks.

October 8, Jonathan Mack PhD, RN, NP, West Wireless Health Institute
Wireless Health Technology: Current Technology and Emerging Trends
Our healthcare system is in a state of crisis. This is true not just in the United States, but also around the world, in countries struggling to care for aging populations as well as in regions where doctors, clinics, and basic health care access are scarce or non-existent. There is no single solution to this crisis, but a convergence is taking place between medicine and wireless technology, making it possible to change healthcare delivery as we know it. By harnessing innovation in the wireless space, along with pervasive technologies such as ubiquitous sensing, cloud computing, and social networks, we can fundamentally shift the paradigm in health care delivery. Through application of this emerging technology, we have the opportunity to create an infrastructure-independent model of health care that enables remote diagnosis, treatment, and monitoring. This presentation will provide an overview of current technology, research, and application of wireless technology as it is applied to disease management. Although the focus is on creating wireless devices, a more immediate need, in order to ensure clinician utilization and insurance reimbursement, is to define the disease management ecosystem where the devices will be applied. Our discussion will include current challenges of implementing wireless technology, specifically device system architecture, clinical decision support, and device application.

October 6, Rich Wilson, MS, RHIA, Major, U.S. Army
Biomedical Informatics Issues in the Army Medical Department
The U.S. Department of Defense is the largest health care provider in the world. Both because of its size and the variety of environments in which medical services are provided - ranging from battlefields in Iraq and Afghanistan to outpatient clinics - a variety of interesting informatics challenges arise. To ensure that future medical care is optimized, the Army Medical Department is collecting large masses of clinical data from which new knowledge can be derived. In this presentation I will first provide an overview of the Army Medical Department, including current operations, the state of health information technology, and the Army's biostatistical data repository. Second, I will present a project in which natural language processing of history and physical reports was used to automatically identify patients with personal and/or family history of mesothelioma. Finally, I will outline my future research directions in clinical information retrieval, as well as potential areas for collaboration with the Army Medical Department.

October 1, Ricky Taira, PhD
Modeling and Visualization of Neuro-Oncology Cases
We describe the development of a prototype tool for the abstraction of patient information for the purpose of formally documenting the course of disease for a given patient. The test domain is neuro-oncology. The features of the tool include: 1) the use of natural language processing tools to assist a clinical case abstractor in structuring report information; 2) specification of a target neuro-oncology situational ontology; 3) an interface for intra- and inter-coreference tagging of findings/problems; 4) an environment for storing, retrieving, and editing cases that have been previously structured or are currently being structured. The results of this abstraction process are used in an application (to be described in a later seminar by Dr. Hsu) for visualizing both patient and population trends. Insightful anecdotal user comments and discussions of future work and scope are presented.
* contact site manager for presentation slide

September 24, John Fontanesi, PhD
Implications of a theory of healthcare quality for computational medicine
There is a rich history of Service Science that has improved the quality of IT, Call Center and Marketing Industries. What Service Science is, how it can be translated to health care quality and the implications for various aspects of computational medicine will be explored in this presentation.

Special Seminars, Summer 2010

September 16, Anupam Goel, MD
Implementing a Health Information Exchange across San Diego County – Informatics, Clinical and Political Hurdles
San Diego was selected as one of the 17 sites across the country to serve as a Beacon community to see how exchanging clinical information can improve health care quality and reduce health care costs. Over the next three years, several UCSD faculty will be partnering with clinical organizations across San Diego to share clinical information in real-time and to generate a reporting strategy to measure the program’s effectiveness. The presenter will speak about the various challenges that the project team has faced so far and their attempts to overcome those challenges.

8/27/2010, Genevieve Melton-Meaux, MD
Leveraging Empiric Unstructured Data to Inform Biomedical Standards
Biomedical standards along with associated structured data are vital for advancing clinical care and secondary functions including research, decision support, healthcare quality measurement, and other automated tasks. Despite the continued advancement and importance of biomedical standards, a large amount of standard development continues to occur in a top-down fashion through input and curation by expert stakeholders with select “use cases”. Unstructured data in the form of text provides an opportunity to inform standards in a bottom-up manner through empiric observational experiments with narrative in practice. I will describe a complementary set of studies leveraging unstructured data to inform standards in three settings: HL7/LOINC Document Ontology for the exchange of clinical documents; HL7 Clinical Statement and HL7 Clinical Genomic Family History Models for family history information; and Omaha System to document clinical care as an interface standard in community practice.

7/6/2010, Matt Williams, MD, PhD
Arguing about Cancer
The rate of accumulation of medical knowledge makes it difficult to keep up to date with the literature, which is also often contradictory and incomplete. Current solutions, such as the development of reviews, meta-analyses, and guidelines, provide aggregated evidence, but are time-consuming to construct, prone to becoming outdated, and do not allow users to interact with the evidence. Argumentation is a relatively new approach to defeasible reasoning. We have developed a system, OAF, that integrates a Description-logic ontology with a Defeasible Logic argumentation formalism to allow us to construct arguments about treatment choices based on the results of clinical trials. Patient classes, treatments and outcomes are all represented in the ontology, and rules use only terms from the ontology. Subsequent work has focused on extending this to deliver a richer argument-based KR formalism to allow reasoning with and about clinical trials. Our eventual aim is to develop the idea of a 'Logical Review' of the evidence, as opposed to current statistical approaches. This talk will cover some of the clinical motivations, discuss argumentation theory in general and then discuss its use in our domain, including a summary of our current research topics.

6/4/2010, Ravindra Mehta, MD
Decision making in Acute Kidney Injury: From Risk Assessment to Therapeutic Intervention
Acute kidney injury (AKI) is a common event in hospitalized patients, contributing to adverse events including mortality rates > 50%. Several studies have identified risk factors for AKI, and risk scores have been developed for specific settings, e.g., contrast nephropathy. Despite the availability of these risk scores, their utilization in clinical practice is variable. This seminar will describe the development of a risk assessment model for AKI in ICU patients and discuss the potential applications of this score for risk assessment, surveillance and guided therapeutic interventions. The potential incorporation of these scores in decision support systems will be defined.

5/28/2010, Lilia Iakoucheva Sebat, PhD
“-omics” Approaches for Studying Psychiatric Diseases
Genes play an important role in the etiology of psychiatric disorders. Recent genome-wide association studies and studies of copy number variants in psychiatric diseases have implicated a large number of different genes, including common and rare variants. Our main goal is to understand how these seemingly unrelated candidate genes contribute to psychiatric diseases, such as autism and schizophrenia. We hypothesize that defects in multiple genes that are diverse in their individual functions, but interact within the context of common cellular pathways/networks/functional modules, are important for disease pathogenesis. To discover these common pathways, we are starting to build disease-focused protein-protein interaction networks using experimental and computational approaches. Our final goals are: (1) to construct comprehensive maps of protein-protein interaction networks for autism and schizophrenia candidate genes; (2) to define the splicing repertoire of the autism and schizophrenia candidate genes; (3) to integrate the splicing interactomes with the traditional interactomes; (4) to investigate the perturbations of the disease networks and functional modules by mutations identified from the patients.

5/21/2010, David Perkins, MD, PhD
Omics in Medicine

5/14/2010, Vineet Bafna, PhD
Algorithmic problems in genotype-phenotype correlations
The availability of inexpensive genotyping and sequencing techniques allows us the opportunity to sample the genomes of large cohorts of individuals. The association of these genotypes with phenotypes offers novel computational challenges due to the large volumes of data and the complex inheritance patterns of the phenotypes. In this talk, I will describe recent, mostly unpublished, algorithms for correlating genotypes with phenotypes. Specifically, I will discuss the discovery of rare variants, locus-locus interactions, and tests of selection.
contact the presenter at vbafnaATucsdDOTedu for the presentation materials

5/7/2010, David Chang, PhD
Under-triage of Elderly Trauma Patients
OBJECTIVE: To determine whether age bias is a factor in triage errors. DESIGN: Retrospective analysis of 10 years (1995-2004) of prospectively collected data in the statewide Maryland Ambulance Information System followed by surveys of emergency medical services (EMS) and trauma center personnel at regional EMS conferences and level I trauma centers, respectively. PATIENTS: Trauma patients were defined as those who met American College of Surgeons physiology, injury, and/or mechanism criteria and were subjectively declared priority I status by EMS personnel. MAIN OUTCOME MEASURE: Undertriage, defined as when trauma patients were not transported to a state-designated trauma center. RESULTS: The registry analysis identified 26 565 trauma patients. The undertriage rate was significantly higher in patients aged 65 years or older than in younger patients (49.9% vs 17.8%, P < .001). On multivariate analysis, this decrease in trauma center transports was found to start at age 50 years (odds ratio, 0.67; 95% confidence interval, 0.57-0.77), with another decrease at age 70 years (odds ratio, 0.45; 95% confidence interval, 0.39-0.53) compared with patients younger than 50 years. A total of 166 respondents participated in the follow-up surveys and ranked the top 3 causal factors for this undertriage as inadequate training, unfamiliarity with protocol, and possible age bias. CONCLUSIONS: Even when trauma is recognized and acknowledged by EMS, providers are consistently less likely to consider transporting elderly patients to a trauma center. Unconscious age bias, in both EMS in the field and receiving trauma center personnel, was identified as a possible cause.

4/30/2010, Yunan Chen, PhD
Documenting Transitional Information in EMR
An observational study was conducted to examine EMR-based documentation in an Emergency Department (ED), with an emphasis on computerized documentation activities in the complex flow of clinical processes. This study revealed a huge gap between formal EMR documentation and actual clinical workflow, which forces ED staff to rely on intermediate, transitional artifacts to facilitate their work. The analysis of these transitional artifacts in four different clinical workflows shows that the EMR system’s inability to document procedural information, capture key information, and present information according to actual clinical workflow leads to the use of transitional artifacts. The findings of this study call for designing EMR systems not only for keeping formal patient records, but also for documenting transitional information in the chart-writing process.

4/23/2010, Anand Sarwate, PhD
Protecting privacy in informatics: are we there yet?
Privacy is a loaded term in the English language; if you ask 5 different people what they think it means, they will give you 5 different answers. In information processing systems, guaranteeing privacy most often means "protecting against re-identification." In medical informatics this is spelled out in the HIPAA Privacy Rule. In this talk I will give a survey of different ways in which sensitive information may be handled and describe approaches towards developing a quantitative definition of privacy. I will describe "state of the art" methods that guarantee statistical privacy and will show their limitations for clinical and research applications. Privacy in these settings must be addressed by systems incorporating statistical methods, access control, regulation, and repercussions for misuse.

4/16/2010, Grace Kuo, PharmD, MPH
Effects of EMR on Medication Safety in Primary Care Practices
Context: Evidence for effective strategies of medication reconciliation in the ambulatory care setting is lacking. Study Objectives: To compare medication reconciliation practices between primary care clinics using electronic medical record (EMR) and clinics using paper medical record (PMR). Study Design: Cross-sectional observational and time motion workflow study. Setting: Five primary care clinics in San Diego. Participants: 150 adult patients taking at least two medications completed the study (with a total of 1,238 medications). Essential feature of study: Outpatient medication reconciliation accuracy. Outcome measures: Frequency and type of medication reconciliation. Results: Of the 1,238 medications, the frequency of medication reconciliation by names only was similar between EMR and PMR clinics (96% vs. 97%, p=0.209) but the frequency of medication reconciliation by names plus directions was higher in PMR clinics (42% vs. 65%, p<0.001). However, the accuracy of medication review was better in the EMR group. In the study, 57% of medications were reviewed and names reconciled (64% EMR vs. 47% PMR, p<0.001), 25% were medications reportedly used by patients but not recorded in the chart (19% EMR vs. 35% PMR, p<0.001), and 18% were medications recorded in the chart but not used by patients (17% EMR vs. 18% PMR, p=0.580). Medication names were reviewed by 90% of nurses and 85% of providers (or 96% by any). Conclusions: Medication reconciliation occurred almost always by names in both EMR and PMR clinics but only half of the time by names plus directions – more frequently in PMR clinics. Medication reconciliation was accurate in half of medications; the use of EMR appears to be significantly better compared with the use of PMR. Times spent by nurses and providers in reviewing medication names only and medication names plus directions were significantly different between groups but not significantly different within groups.

4/9/2010, Lola Ogunyemi, PhD
Clinical Informatics and Telemedicine in Urban Safety Net Clinics
Urban safety net clinics work primarily with uninsured and underinsured patients who often have chronic conditions that need to be addressed. In this talk, I will discuss the role that informatics can play in enhancing the quality of care at such clinics, focusing on two projects that the Center for Biomedical Informatics at Charles Drew University has with seven urban safety net clinics in South Los Angeles. One involves computerized decision support for diabetes care in a Los Angeles County health center and the other involves teleretinal screening for diabetic retinopathy at six federally qualified health centers.

4/2/2010, Brian Chapman, PhD
Eratosthenes and Medical Imaging Informatics
In vivo imaging technologies have transformed the practice of medicine. However, the full potential of in vivo imaging has not been realized, in large part because medical imaging remains primarily a qualitative rather than a quantitative discipline. Using illustrations drawn primarily from liver, intracranial and pulmonary imaging, I will address the role informatics can play in transforming medical imaging into a quantitative medical discipline.

4/2/2010, Wendy Chapman, PhD
Natural Language Processing of Clinical Reports
Natural language processing is critical to successful decision support, surveillance, and research from clinical records. I will discuss the three steps we have addressed in attempting to extract information from clinical reports, including concept identification, contextual property identification, and discourse processing. I will describe how we are addressing each of the challenges in the NLP applications we are developing in our research lab in order to apply NLP technology to biosurveillance from emergency department reports and automated charting of live dental exams.

3/23/2010, Peter Szolovits, PhD
Contemporary Clinical Research: An Informatics Approach
Clinical trials are very expensive, slow, and tend to be limited to a modest number of patients. Inspired by our field’s growing ability to perform high-throughput measurements of genes and their expression, we explore the possibility of using ordinary clinical data sets to stand as a proxy for high-throughput determination of phenotype and environment. If successful, this promises to speed up translational research, make better use of existing data, and reduce the costs of studies.
I describe our experience in applying this model to an exploratory study of rheumatoid arthritis. In addition to describing our process, I will also introduce the machinery developed by the Partners Healthcare i2b2 project to support such efforts, describe a clever and privacy-protecting mechanism developed at the Brigham and Women’s Hospital for using discarded blood samples to speed study accrual, and then elaborate on the medical natural language approaches we have taken to extracting meaningful data from the narrative text that forms much of clinical records.

Mar 12, Vibha Bhatnagar, MD, MPH
The Electronic Medical Record in the Post-genomic Era
The electronic medical record (EMR) can be used to create very large scale databases for genetic studies and “virtual cohorts” for the study of complex chronic diseases in a setting representative of clinical practice. Increasingly, genetic studies are expected to leverage the EMR for accurate patient phenotyping. Hypertension and renal failure are examples of complex traits, determined by the interplay of multiple genes, lifestyle, and medical comorbid factors, that may be best studied in the primary care (or clinical cohort) arena rather than smaller experimental (clinical trial) settings. One of the key features of our work was to take advantage of the large-scale, comprehensive, and integrated EMR in the Veterans Administration Healthcare System, known as VistA (Veterans Health Information Systems and Technology Architecture), and explore the predictive value of genetic markers for antihypertensive drug response and hypertension-related comorbid disease. The VA EMR can be accessed via a graphical user interface known as the Computerized Patient Record System (CPRS), an intuitive single interface for health care providers to review and update a patient’s EMR. VASDHS is a part of the Veterans Integrated Service Network 22 (VISN-22), spanning Southern California and Southern Nevada. VISN-22 provides research support by abstracting electronic medical data from individual VA healthcare facilities and making the data accessible to investigators for research purposes in a “data warehouse”.

Mar 05, Brian Clay, MD
Indication-Based Ordering in CPOE: Examples from the UCSD Inpatient Experience
Computerized provider order entry (CPOE) has the potential to improve clinicians' ordering of tests and treatments. The UC San Diego Medical Center Clinical Information Systems team has used the electronic medical record and computerized provider order entry system in place at UCSD to improve ease of ordering for clinicians, as well as to assist providers with adherence to best practices and quality measures. In addition to constructing order sets, which group related orders together for convenience, we have also used a strategy of indication-based ordering in selected instances to improve patient care and compliance with best practices. In this session, examples of inpatient indication-based ordering at UCSD will be reviewed, with emphasis on strategies to effectively drive provider practice, as well as demonstration of the clinical outcome improvements resultant from these interventions.

Feb 26, Aziz Boxwala, MD, PhD
The Clinical Data Warehouse for Research at UCSD
The Division of Biomedical Informatics is creating a repository of clinical data, known as the Clinical Data Warehouse for Research (CDWR), to serve the needs of clinical and translational researchers at UCSD. In this presentation, Dr. Boxwala will outline the design of the CDWR and the implementation approach. He will describe some of the challenges in the design of such a tool, in particular around security, privacy, performance, and usability, and the proposed solutions.

Feb 19, Karen Messer, PhD
Validation of Significant Gene Sets via Capture-Recapture across Independent Studies
Public repositories of large genomic data sets, such as Gene Expression Omnibus (Edgar and others, 2002) and the Oncomine repository for cancer studies (Rhodes and others, 2007), have enabled researchers to combine data from multiple genomic studies in order to gain power. In a common informal approach, data from independent studies are obtained by the researcher, and lists of the top-ranked genes selected within each study are compared across studies. Genes selected as highly ranked multiple times are considered significant. A successful example of this approach was given by Tomlins et al. (Tomlins and others, 2005), who re-analyzed data from 132 independent studies of solid tumors in Oncomine and listed the 10 highest-ranked genes in each study. They discovered 42 genes which appeared on two or more lists, and subsequently validated two of these as important fusion genes in prostate cancer. However, conduct of such a “capture-recapture” study remains ad hoc, in that power and false discovery rate are usually uncontrolled. In this paper, we define a capture-recapture hypothesis test, and present a simple test statistic which has an approximate Poisson distribution. We give criteria for identification of statistically significant sets of genes and give practical guidance on how to design a capture-recapture study for adequate power while controlling Type I error on the set of genes and the false discovery rate per gene. (Joint work with Loki Natarajan, Minya Pu, and Roman Sasik.)
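The informal counting step described above (take the top-ranked genes from each study and keep those that recur across lists) might be sketched as follows. This illustrates only the ad hoc comparison, not the hypothesis test or the Poisson-approximated statistic from the talk; the gene labels, list contents, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical top-ranked gene lists, one per independent study.
STUDY_TOP_LISTS = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_D", "GENE_B"],
    ["GENE_E", "GENE_A", "GENE_F"],
]


def recaptured_genes(top_lists, min_lists=2):
    """Return {gene: number of lists containing it} for genes on at least min_lists lists."""
    counts = Counter()
    for genes in top_lists:
        counts.update(set(genes))  # count each gene at most once per study
    return {gene: n for gene, n in counts.items() if n >= min_lists}


recaptured = recaptured_genes(STUDY_TOP_LISTS)
```

With `min_lists=2` this mirrors the "two or more lists" criterion mentioned in the abstract; how many recurrences are needed for statistical significance is exactly what the formal test is meant to settle.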

Feb 12, Nuno Bandeira, PhD
Next-generation Proteomics: The Computational Engine of Mass Spectrometry Discoveries
The ongoing quest for understanding the cellular role of proteins is riddled with Bioinformatics computational puzzles whose solutions could dramatically affect health care, bioenergy and biotechnology in general. The main generator of these protein puzzles is mass spectrometry – a technology whose development merited a Nobel Prize in Chemistry (2002) and continues to expand and improve at an impressive rate. In mass spectrometry, proteins are represented by ‘spectrum fingerprints’ whose identification is key to understanding cellular proteomics states – a task typically addressed by database search techniques. This talk will focus on a novel computational framework for the analysis of mass spectrometry data (Spectral Networks) and discuss the underlying algorithmic and machine learning problems. Consequent applications of these algorithms include protein identification in cataractous lens, characterization of monoclonal antibody drugs, sequencing snake venoms and high-throughput elucidation of bioactive cyclic peptides.

Feb 05, Kun Zhang, PhD
Targeted Sequencing and Applications
The rapid advances in next-generation DNA sequencing have dramatically accelerated the pace of collecting genomic, epigenomic and transcriptomic information, and relating such information to biological mechanisms or human diseases. Efficient parsing, interpreting and managing of such information has become increasingly challenging. Targeted sequencing represents a concept of achieving high efficiency by selectively gathering the information that is most relevant to the biology. In this talk I will present a few applications of targeted sequencing with an emphasis on stem cells.

Jan 29, Trey Ideker, PhD
Network-based Biomarkers for Human Health and Disease
Biomarkers are typically thought of as individual genes and proteins. However, complex phenotypes observed during development and disease are rarely due to single proteins. Recently, we have shown that protein networks are a source of powerful biomarkers, and that in many cases these biomarker networks are more predictive than any individual gene. We are investigating protein-network-based biomarkers in three areas: (1) Cancer stratification and improved diagnosis. Our approach is to project gene and/or protein expression profiles of each patient or tissue sample onto the known human protein network map to identify pathways that are predictive of cancer state. We have used this “network-based” biomarker approach to show improved accuracy in diagnosis of breast cancer and leukemia. (2) Tissue type specification. Biomarkers also play an important role in the process of tissue differentiation. We have recently used the network-based biomarker approach to identify a subnetwork of 15 interacting homeobox transcription factors that can predict the developmental origin of tissues. (3) Improved power and interpretation of genome-wide association studies (GWAS). Finally, protein networks may be the key to mining GWAS to understand complex diseases for which not one but many genetic loci play a role. We have recently used protein networks to translate GWAS into maps of functional interactions among protein complexes and pathways. To distribute our network-based technologies professionally, we help develop the Cytoscape platform, an open-source software environment for visualization and analysis of biological networks and models.

Jan 22, Yu-Tsueng Liu, MD, PhD
The Biomedical Informatic Challenges of Future Molecular Diagnostics
Advanced technologies alone cannot solve clinical problems. One must be able to identify and connect the key elements in a patient’s complex history, physical examination and laboratory tests. A clinician is likely to be overwhelmed by the enormous amounts of data generated by current and future genomic technologies. For example, the whole genomic sequence may soon become part of a person’s medical record. In this seminar, I will present our non-conventional approaches using comprehensive microarrays for molecular diagnostics in the areas of infectious diseases and cancer. The potential challenges of molecular target design, data analysis and interpretation with microarray and next-gen sequencing technologies for clinical diagnosis will be discussed.

Jan 15, Philip Bourne, PhD
The Changing Landscape of Scholarly Communication as it Relates to the Biosciences
The means by which science is disseminated and comprehended is changing. These changes are particularly prevalent in the biosciences. Databases are becoming more like journals and journals are becoming more like databases. Open access provides new opportunities for semantic enrichment and new ways to present data and knowledge. This includes the use of rich media (video and podcasts). I will discuss some of these changes with illustrations from our own work with respect to the RCSB Protein Data Bank, the Public Library of Science and SciVee. If successful, I will also have conveyed some of our latest findings in network pharmacology and evolution studied through protein structure.

Jan 08, Faramarz Valafar, PhD
The Gene Wiki: Community Intelligence Applied to Human Gene Annotation
Annotating the function of all human genes is a critical, yet formidable, challenge. Current gene annotation efforts focus on centralized curation resources, but it is increasingly clear that this approach does not scale with the rapid growth of the biomedical literature. The Gene Wiki utilizes an alternative and complementary model based on the principle of community intelligence. Directly integrated within the online encyclopedia, Wikipedia, this effort aims to build a gene-specific review article for every gene in the human genome, where each article is collaboratively written, continuously updated and community reviewed. Previously, we described the creation of Gene Wiki ‘stubs’ for approximately 9000 human genes. Here, we describe ongoing systematic improvements to these articles to increase their utility. Moreover, we retrospectively examine the community usage and improvement of the Gene Wiki, providing evidence of a critical mass of users and editors. Gene Wiki articles are freely accessible within the Wikipedia web site, and additional links and information are available here. This talk will present a retrospective look at the Gene Wiki project a year after its first announcement. Consequences of the collected data and emerging results on gene function and gene regulatory networks will be briefly discussed. If time permits, other related projects currently under way at the Biomedical Informatics Research Center (BMIRC) at SDSU will be briefly presented.