SBC seminars 2008

Wed Jan 9 16:00 Olof Emanuelsson, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Exploring potential chromatin-level regulation of gene clusters in Arabidopsis thaliana

DNA in multicellular organisms is densely packed around histones. The DNA-histone complex is called chromatin. For a gene to be expressed, the chromatin must be opened so that the transcription machinery is able to access and transcribe the genes. It has previously been shown that genes that are close to each other on the chromosome may be co-regulated through local changes in the chromatin structure induced by a chromatin-modifying protein. Here, the aim is to study whether a particular chromatin-modifying protein is involved in the regulation of certain clusters whose genes exhibit a tissue-specific activity in the Arabidopsis plant.

Wed Jan 23 16:00 Bengt Sennblad, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Extending models for gene duplication and loss to occur inside hybrid species networks
Abstract: We have previously developed the gene evolution model, a probabilistic model for gene duplication and loss that describes how a gene phylogeny evolves inside a species tree that describes the relationship of the species genomes from which the genes are sampled. This allowed us to compute the probability of a specific gene tree given a species tree as well as the probabilities of specific reconciled trees and orthology probabilities.
Species relationships resulting from evolution by speciations and extinctions are described as a binary tree. However, so-called lateral events, such as hybridizations between species, also occur in organism evolution and may introduce reticulate relationships among species.
I will present work on a new probabilistic model for species evolution including allopatric hybridization, i.e., yielding a hybrid species network, and describe preliminary work on an extension of the gene evolution model to occur inside such a hybrid species network. Combined in a Bayesian framework, these models allow the estimation of hybrid species networks from a set of gene trees, and I will present some preliminary analysis results illustrating this.

Wed Feb 6 16:00 Andrey Alexeyenko, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
FunCoup: networks of functional coupling in eukaryotes

FunCoup is a statistical framework of data integration for finding functional coupling (FC) between proteins. It is capable of transferring information from model organisms via orthologs. Data of different sources and various natures (contacts of whole proteins and individual domains, mRNA and protein co-expression, protein co-occurrence in the cell, miRNA and TF targeting, phylogenetic profiles etc.) are collected and probabilistically evaluated in a Bayesian network, trained on sets of known FC cases vs. the general population of protein pairs as background reference. FunCoup was optimized to address known drawbacks of Bayesian estimators. The number of simultaneously used model organisms (7) and individual datasets (65-70) has achieved the practical maximum.

FunCoup is a self-consistent framework that can incorporate nearly any kind of data from various data sources. It has thus been possible to generate networks for several organisms (human, mouse, rat, zebrafish, worm, fly, Arabidopsis, and yeast) in respect of different types of functional coupling. A network for Ciona intestinalis, which had neither training sets nor sources of its own data, was created as well. The networks are available at the FunCoup website.

Wed Feb 13 16:00 Rickard Sandberg, CMB, Karolinska Institutet
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Analysis of transcript isoforms across human tissues using Solexa sequencing of mRNAs

Several mechanisms can produce alternative mRNA isoforms from a single gene locus. Identification of these isoforms and their functions is critical to our understanding of biology and our ability to address problems in human health and disease. We have performed ultra-high throughput sequencing of polyA+ mRNA isolated from a panel of 9 normal human tissues and 5 cell lines using the Solexa/Illumina platform and analyzed the more than 300 million 32-base-pair reads obtained. The reads were initially mapped to the genome and transcriptome. Analyses of these data indicated that the vast majority of human genes are alternatively spliced and revealed high levels of tissue bias in the expression of mRNA isoforms. We also evaluated the data in terms of intrinsic biases and theoretical and empirical limitations of mRNA isoform detection, establishing the Solexa/Illumina platform as a powerful technology capable of detecting alternative mRNA isoforms with unprecedented coverage and resolution.

Wed Feb 20 16:00 Ali Tofigh, SBC/CSC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
A mathematical model for cancer progression

Cancer is a complex disease in which cells are transformed, through acquisition of different aberrations, to malignant types that are able to divide, invade, and metastasize. During the course of the disease, malignant tumors are able to acquire traits enabling the growth, sustenance, and spread of tumors within the body. To properly understand a certain cancer form, we need a representation of its patterns of possible progression.

A simple descriptive path model of colon cancer was suggested by Vogelstein in 1988. In recent years, several mathematical models for cancer progression have been suggested, often using directed trees as a basis. I will present HOTs, hidden-variable oncogenetic trees, that can be used to infer models of cancer progression from high-throughput data.

Wed Feb 27 16:00 Arne Elofsson, SBC/CBR
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
New challenges and methods for prediction of membrane protein topology

Since high-resolution structural data are still scarce, different kinds of theoretical structure prediction algorithms are of major importance in membrane protein biochemistry. But how well do the current prediction methods perform? Which structural features can be predicted and which cannot? And what can we expect in the next few years? Here, we will try to answer these questions, with a particular focus on the new types of substructures found in recently solved membrane protein structures, including reentrant regions.

Wed Mar 12 16:00 Åsa Björklund, SBC/CBR
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Nebulin - a case study of repeat evolution
Nebulin is a very large actin-binding protein that is involved in the formation of thin muscle filaments. It is believed that nebulin acts as a thin filament "ruler" that regulates filament length. The nebulin protein consists of repeats of up to 190 nebulin domains, and it is the intriguing evolution of these repeats that we are interested in. There are three other protein families that contain nebulin repeats: Nebulette, N-RAP and LASP. Interestingly, we find that a cassette of seven domains seems to have been duplicated in tandem several times in some regions of Nebulin and N-RAP. On the other hand, other parts of the proteins have evolved through duplication of one or two domains. Now the task at hand is understanding the forces that govern these tandem duplications. Therefore, we have compared the domain composition of nebulin-containing proteins in a wide range of animals. With a good view of some of the duplications that occur, we are now trying to identify the genomic regions where the repeating units have been duplicated. The final aim is to determine if there are any conserved patterns that facilitate duplications.

Wed Mar 19 16:00 Jens Lagergren, SBC/CSC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Probabilistic analysis of orthology relations and duplication and loss rates: two case studies
I have, together with collaborators at SBC, developed probabilistic models of, and tools for analyzing, the evolution of gene families with respect to duplications and losses. Recently, there have been a number of articles focusing on similar tools for allele sorting or applying less sophisticated tools in studies of duplications among primates. I will discuss our model and tools but focus on a recent analysis of the major histocompatibility complex class 1 (MHC) gene family and an ATP-binding cassette transporter gene family, namely, subfamily A (ABCA). This is based on joint work with Bengt Sennblad.

Wed Apr 2 16:00 Diana Ekman, SBC/CBR
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Species-specific sequences in Fungi
Duplication, domain shuffling and de novo creation are some of the mechanisms involved in the evolution of new proteins. It is believed that de novo creation is comparatively rare. However, only about three quarters of the residues in a typical proteome can be matched to a domain in the PfamA and PfamB databases, although this is one of the most sensitive methods available for identifying homology between proteins. Hence, there are parts of sequences and whole proteins where no domains are detected. Is it possible that these sequences are novel, species-specific domains?

To answer the question we have estimated the amount of novel material in the proteome of Saccharomyces cerevisiae. Our results show that at least two thirds of the residues are aligned to non-fungal homologs, whereas only a small fraction is species-specific. Further, many of the species-specific sequences in S. cerevisiae are homologous and associated with transposable elements. Finally, newer sequences are often short, disordered sequences located at the termini. Therefore, we conclude that domain innovation is rare in yeast; however, shorter sequences may be created. Other species, for example multicellular eukaryotes, have additional mechanisms for de novo creation, such as exonization of intronic sequences, and may contain more novel sequences.

Wed Apr 23 16:00 Hans Ellegren, Evolutionary Biology Centre, Uppsala University
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
The Genomics of Phenotypic Diversity in Natural Populations
[no abstract yet]

Wed May 7 16:00 Mike Hallett, McGill Centre for Bioinformatics, McGill University, Montreal, Canada
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Towards a systems approach to understanding breast cancer
It is increasingly evident that breast cancer outcome is strongly influenced by signals emanating from tumor-associated stroma. However, little is known about how gene expression changes in this tissue affect tumor progression. In this talk, we compare gene expression profiles from laser capture-microdissected tumor-associated versus matched normal stroma, and derive transcriptional profiles strongly associated with clinical outcome. We present a stroma-derived predictor that generates new information to stratify disease endpoint, independent of standard clinical prognostic factors and previously published predictors. Our predictor selects poor-outcome patients from multiple clinical subtypes, including node-negative patients, and predicts outcome in multiple published expression datasets generated from whole tumor tissue. Our predictor has increased accuracy compared to previously published predictors, and prognostic accuracy increases when these predictors are integrated using graphical models. Genes represented in the stroma-derived predictor reveal the strong prognostic capacity of differential immune responses as well as angiogenic and hypoxic responses.
The computational and statistical aspects underpinning this work are built upon a new approach to analyzing gene expression data that in some sense is "orthogonal" to traditional clustering-based tools, and is general in the sense that a wide range of data types can be easily integrated into the system.

Wed May 14 16:00 Ola Spjuth, Department of Pharmaceutical Biosciences, Uppsala University
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Bioclipse - an open source workbench for chemo- and bioinformatics in the eScience era
Bioclipse (http://www.bioclipse.net) is a workbench for chemo- and bioinformatics (e.g. small molecules, sequences, proteins, spectra) and provides 2D-editing, 3D-visualization, file format conversion, calculation of properties and much more - all fully integrated into a desktop application. Bioclipse is equipped with a state-of-the-art plugin architecture, which means it can easily be extended for adding functionality in any direction whilst hiding technical solutions behind easy to use graphical interfaces for end users.

Bioclipse takes full advantage of many of the promises of eScience, like collaborative work, utilizing local and remote (Web) services, integrating various databases and software tools, and sophisticated data analysis prepared for high performance computing. The focus is data and software interoperability, in order to reduce manual work when performing bioinformatics tasks.

Bioclipse is an international collaboration with over 28 contributors, and has been awarded 3 international innovation awards. The development of Bioclipse version 2 can be followed on http://bioclipse.blogspot.com/ and http://wiki.bioclipse.net/index.php?title=Bioclipse2.

Wed May 21 16:00 Aron Hennerdal, SBC/CBR
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Contact Prediction for Membrane Proteins
[no abstract yet]

Wed May 28 16:00 Erik Sjölund, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Advice regarding CGI programming
Poorly written CGI scripts often contain security holes. This seminar gives some advice on how to make your Perl CGI scripts more secure, e.g. the need to validate input and the benefits of avoiding temporary files.
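The two points named in the abstract, validating input and avoiding temporary files, can be illustrated with a minimal sketch. It is written in Python rather than Perl, and everything in it is an assumption made for illustration: the parameter name `seq_id`, the allowed character set, and the stand-in child process are hypothetical and are not taken from the seminar.

```python
import re
import subprocess
import sys

# Allow-list for a hypothetical "seq_id" form parameter: accept only a short
# run of characters known to be harmless, and reject everything else.
SAFE_ID = re.compile(r"[A-Za-z0-9_.-]{1,40}")


def validate_seq_id(raw):
    """Return raw unchanged if it matches the allow-list, else raise ValueError."""
    if not isinstance(raw, str) or SAFE_ID.fullmatch(raw) is None:
        raise ValueError("invalid sequence identifier")
    return raw


def run_filter(text):
    """Send text to a child process over stdin instead of via a temporary file.

    The child is a stand-in (a Python one-liner that upper-cases its input).
    Passing the command as a list means no shell ever parses user data.
    """
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    result = subprocess.run(child, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

An allow-list is used instead of trying to strip dangerous characters, since it fails closed when an unexpected input shows up. Piping data through stdin sidesteps the predictable-file-name and clean-up problems that temporary files can introduce.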
Wed Jun 4 16:00 Mathieu Blanchette, School of Computer Science, McGill University. McGill Center for Bioinformatics
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Whole-genome comparative and regulatory genomics
This talk will describe how a whole-genome computational prediction and analysis of human regulatory regions can yield important insights into gene regulation, and how genome evolution, and in particular computationally reconstructed ancestral DNA sequences, can help in this process. I will first describe an approach to the detection of cis-regulatory modules that exploits both inter-species comparison and binding site clustering. The analysis of the ~120,000 modules identified by this algorithm reveals a number of interesting observations regarding the overall distribution properties of the modules, but also regarding the properties of the individual transcription factors predicted to bind them. These properties include association to particular expression patterns or function, co-occurrences of binding sites for pairs of transcription factors, and broad regulatory network properties. In the second part of the talk, I will briefly introduce a joint project with Dr. David Haussler and Dr. Webb Miller, aiming to reconstruct the complete genome of ancestral mammals. I will focus on how this ancestral sequence information can help our study of the evolution of regulatory mechanisms in mammals, and how these sequences can be used to predict human regulatory regions more accurately.

Wed Jun 11 16:00 Lukasz Huminiecki, KI
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Genomic signatures for the emergence, development and diversification of the TGF-beta/BMP signaling pathway within the animal kingdom
The profound question of how genomic processes, such as gene duplications and losses, give rise to co-ordinated organismal properties, such as the emergence of new body plans, organs and lifestyles, is of paramount importance in developmental and evolutionary biology. I will focus on the diversification of the transforming growth factor-β (TGF-β) pathway, one of the most fundamental and versatile metazoan signal transduction engines. Our results challenge the view of well-conserved developmental pathways. The TGF-β signal transduction engine has expanded through gene duplication, continually accomplishing new functions, as animals grew in anatomical complexity, colonized new environments, and developed an active immune system.

Fri Aug 15 11:00 David Bryant, McGill Centre for Bioinformatics
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Exact and efficient algorithms for the probability of a marker under incomplete lineage sorting
Incomplete lineage sorting is known to complicate phylogenetic analysis of species radiations. Lineages from the same species can coalesce before the time of species divergence, leading to gene trees that are in conflict with the species tree. The standard models for the evolution of markers on a gene tree and for gene trees coalescing within species trees are computationally demanding since one has to integrate over all possible gene trees at each unlinked locus.

We have developed algorithms that avoid this integration over gene trees by using a variant of Felsenstein's pruning algorithm for the likelihood of a phylogeny. Given a species tree (with divergence dates and population sizes) we can compute the probability of a single binary marker, exactly and efficiently. Both finite site and infinite site models of mutation are handled. Thus, if the data consist of a collection of unlinked binary markers (such as SNP data) we can compute the likelihood of the species tree directly, bypassing the need to consider the gene tree histories. These likelihoods can then be used for Bayesian or ML inference on the species tree and its parameters.

Wed Sep 10 16:00 Kristoffer Forslund, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Conservation of domain architecture and protein sequence in orthologs

Among features relevant for protein function assignment, orthology relations and domain architecture are both important. A systematic, large-scale study of domain architecture conservation in orthologous versus paralogous proteins has so far been lacking. We perform an analysis of this type, using the Inparanoid framework as a basis. As a spin-off from this project, we've also begun a comparison of different strategies for handling false positive homology assignments as a result of sequences having very biased residue composition.

Wed Sep 17 16:00 Patrik Björkholm, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts

We propose a novel hidden Markov model based method for predicting residue-residue contacts from protein sequences, using homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structures) as training data. The library consists of recurring structural entities incorporating short-, medium-, and long-range interactions and is general enough to reassemble the cores of most proteins in PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top L/5 predictions (L = sequence length), our hidden Markov models obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium-, and short-range contacts. This is a significant performance increase over state-of-the-art methods.
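The "top L/5" accuracy quoted above can be made concrete with a small sketch. The function name, the input layout (scored residue pairs plus a set of true contacts) and the handling of very short sequences are assumptions for illustration; the abstract does not describe how the evaluation was implemented.

```python
def top_l5_accuracy(scored_pairs, true_contacts, seq_len):
    """Fraction of the L/5 highest-scoring predicted pairs that are true contacts.

    scored_pairs:  list of (i, j, score) tuples, one per predicted residue pair
    true_contacts: set of frozenset({i, j}) for the contacts seen in the structure
    seq_len:       sequence length L
    """
    k = max(1, seq_len // 5)  # assumption: evaluate at least one prediction
    ranked = sorted(scored_pairs, key=lambda p: p[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for i, j, _ in ranked if frozenset((i, j)) in true_contacts)
    return hits / len(ranked)


# Toy usage: L = 10, so the two best-scoring pairs are evaluated.
pairs = [(1, 8, 0.9), (2, 9, 0.8), (3, 7, 0.1)]
contacts = {frozenset((1, 8)), frozenset((3, 7))}
accuracy = top_l5_accuracy(pairs, contacts, seq_len=10)
```

To report long-, medium- and short-range figures separately, `scored_pairs` would be filtered by sequence separation before calling the function; the thresholds for those classes are not given in the abstract and are left out here.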

Wed Oct 1 16:00 Lars Arvestad, SBC/CSC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Probabilistic analysis of gene family evolution -- gene duplications and sequence evolution

Probabilistic and Bayesian methods have gained popularity in phylogenetics in recent years. We present a probabilistic gene evolution model, PrIME-GSR, based on a birth-death process in which a gene tree evolves "inside" a species tree. The model is the basis for MCMC-based algorithms for probabilistic approaches to orthology analysis, tree reconciliation studies, and gene tree inference.

We believe this progress represents the "next generation" of phylogenetic analysis. It allows us to pose the question: what is the most probable gene tree explaining a set of sequences and respecting a known species tree? To date, model development in phylogenetics has concentrated on sequence evolution, leaving other types of data to be analyzed later in separate steps. We argue that joint analysis of data is desirable and a model integrating a species tree in the phylogenetic analysis is an important step forward.

Based on our model, we have implemented a Bayesian analysis tool. Our implementation is sound and we demonstrate its utility for genome-wide gene-family analysis by applying it to recently presented yeast data. We validate PrIME-GSR by comparing to previous analyses of these data that take advantage of gene order information. The results demonstrate the value of a relaxed molecular clock and also suggest that synteny prediction can mislead gene tree estimation.

Wed Oct 15 14:00 Stefan Hohmann, Department of Cell and Molecular Biology/Microbiology, Göteborg University
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Understanding signal transduction at a system level

We are studying specific MAPK signalling pathways as well as the AMPK pathway using yeast as a model organism. Our aim is to understand the dynamic regulation of these pathways; in addition to classical genetics and molecular biology, we employ time-course data and mathematical modelling. I will compare the yeast HOG and AMPK/Snf1 pathways with regard to their regulation, highlight similarities and differences, and discuss what we might be able to learn from those.

Wed Oct 22 16:00 Leif Andersson, Department of Medical Biochemistry and Microbiology, Uppsala University
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Cis-acting regulatory mutations play a prominent role for shaping phenotypic diversity in domestic animals

Domestic animals provide unique opportunities to study genotype-phenotype relationships due to their long history (thousands of years) of strong phenotypic selection. Phenotypic differences between and within populations of domestic animals are primarily caused by alleles with no or only weak deleterious effects. Thus, alleles underlying phenotypic traits in domestic animals provide a valuable complement to the rich collection of loss-of-function mutations established in model organisms. Several of the phenotypes for which we have recently identified the underlying genes are caused by cis-acting regulatory mutations. These include genes controlling muscle growth in pigs, white spotting in dogs, yellow skin in chickens and greying with age in horses. The strategies used for the identification of the causal mutations and our attempts to reveal the mechanism of action will be discussed.

Wed Nov 5 16:00 Gabriel Östlund, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Network-based identification of novel cancer genes

Genes involved in cancer susceptibility and progression can serve as templates for searching protein networks for novel cancer genes. We here introduce a new method to rank cancer gene candidates by their connectivity to known cancer genes.

By using a comprehensive protein network, we searched for genes connected to a set of well-known cancer genes. The candidate list was refined by selecting genes with unexpectedly high levels of connectivity to cancer genes and without previous association to cancer. This produced a list of new cancer candidates, with up to 45 connections to known cancer genes.
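As a rough illustration of ranking candidates by "unexpectedly high" connectivity, the sketch below counts each gene's links to a known set and scores the count with a hypergeometric tail probability. The choice of a hypergeometric test, the dictionary-of-sets network layout, and the function names are all assumptions; the abstract does not say which statistic the method actually uses.

```python
from math import comb


def hypergeom_tail(k, n_total, n_known, degree):
    """P(X >= k) when `degree` neighbours are drawn without replacement
    from n_total genes, n_known of which are known cancer genes."""
    upper = min(degree, n_known)
    denom = comb(n_total, degree)
    return sum(comb(n_known, x) * comb(n_total - n_known, degree - x)
               for x in range(k, upper + 1)) / denom


def rank_candidates(network, known):
    """Rank genes outside `known` by how surprising their links to `known` are.

    network: dict mapping each gene to the set of its neighbours
             (undirected, no self-loops, every neighbour is also a key)
    known:   set of known cancer genes, all present in the network
    """
    n_total = len(network) - 1  # every gene except the candidate itself
    ranked = []
    for gene, neighbours in network.items():
        if gene in known:
            continue
        links = len(neighbours & known)
        if links == 0:
            continue
        p = hypergeom_tail(links, n_total, len(known), len(neighbours))
        ranked.append((gene, links, p))
    # Smallest tail probability first; break ties by the larger link count.
    return sorted(ranked, key=lambda r: (r[2], -r[1]))


# Toy network: A, B, C are "known"; X touches all three, Y only one.
toy = {
    "A": {"B", "C", "X"}, "B": {"A", "X"}, "C": {"A", "X", "Y"},
    "X": {"A", "B", "C"}, "Y": {"C"}, "Z": set(),
}
ranking = rank_candidates(toy, {"A", "B", "C"})
```

Scoring against the gene's own degree matters because a highly connected hub will touch many known cancer genes by chance alone; a raw count would rank hubs first regardless of specificity.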

We validated our method by cross-validation, GO term bias, and differential expression in cancer vs. normal tissue. Some examples are presented with detailed pathway positioning of the new candidates. Our study provides a ranked list of high-priority targets for further studies in cancer research.

Wed Nov 12 13:30 David Messina, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
DAS, the distributed annotation system, and how to aggregate shared data with it

DAS is a simple protocol designed for easy sharing of biological data. In this talk I will introduce you to DAS, show some examples of data sources that offer their information via DAS, and describe my work on DASher, a viewer application that collects DAS-format annotations and displays them along a protein sequence.

In the time since I last spoke about this work, DASher's capabilities have expanded considerably, so I will probably focus on what's new.

Wed Nov 19 16:00 Thomas Bürglin, KI
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Detection of Embryonic Gene Expression Dynamics

Gene expression is extensively studied in C. elegans at different levels. While protein and RNA detection methods reveal localization and changes in expression on a rather rough temporal scale, fluorescent reporter transgenes allow expression studies in vivo. Even though those reporters are unable to represent the final protein expression, they give detailed information suitable for constructing models of regulatory networks. In addition to the recently published genome-wide promoterome screens (Hunt-Newbury et al., 2007; Dupuy et al., 2007; Reece-Hoyes et al., 2007), we propose a system to automatically obtain, map and compare embryonic promoterome data. In addition to the localization, we emphasize the detection and quantification of rapid changes in transcription during cell differentiation. We have focused on genes known or expected to be expressed during embryogenesis, such as homeodomain-containing transcription factors. We find that several homeobox genes show highly dynamic expression patterns that are easily overlooked using established methods. We have screened over 80 genes in over 200 individual transgenic animals. The expression patterns are normalized to a standard spatiotemporal (4D) coordinate system and can be compared. The temporal expression patterns obtained from different transgenic reporter individuals for a particular promoter usually correlate with r > 0.98. The success rate with our system to obtain a complete 4D expression pattern from a single transgenic embryo is greater than 95%. In conclusion, we have developed a generally applicable method and workflow for obtaining high-resolution 4D transcriptome and proteome data.

Wed Nov 26 16:00 Karin Julenius, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Prediction of glycosylation sites in proteins

Estimates show that over half of all mammalian proteins are glycosylated. Of the estimated 22,216 human gene products, only 11,935 are well characterised as SwissProt entries. Of these, only 387 (3.2%) have experimentally verified glycosylation site information. To bridge this gap, prediction methods are needed.

There are many types of protein glycosylation, each defined by a) the nature of the glycan attached and b) the nature of the protein-glycan linkage. All but one take place in extracellular proteins or extracellular parts of membrane proteins; the remaining type is intracellular. They are often classified according to the nature of the attachment atom on the protein: N-glycosylation is glycan linkage to the side-chain nitrogen of asparagine residues, O-glycosylation to oxygen atoms of serines or threonines, and C-mannosylation to one of the carbons of the pyrrole part of tryptophans. This classification is somewhat misleading, since a large number of different types of O-glycosylation have been identified. Each type of glycosylation is catalyzed by one or more distinct glycosyltransferases, and differences in recognition sequences between enzymes are often large. Therefore, we choose to develop glycosylation site predictors one glycosylation type at a time.

We have previously developed predictors for mucin-type O-glycosylation sites, NetOGlyc 3.0 (2005, 161 citations), and for C-mannosylation sites, NetCGlyc (2007, 4 citations). We are currently finishing the development of two proteoglycan site predictors. One is trained specifically on mammalian sequences only; since the recognition sequences seem to be surprisingly evolutionarily conserved, we have also developed a general predictor by adding data from C. elegans and chicken to the training set. We have also made an effort to develop a predictor of N-glycosylation sites that will outperform any simple pattern rule. In this process we have gathered a data set consisting of 1825 experimentally verified positive and 18572 negative sites. Among the negative sites, there are 205 that follow the PROSITE N-glycosylation pattern N{P}S/T{P}. Using this data set, we have been able to verify previous findings by von Heijne et al. that N-glycosylation is less likely to take place close to a transmembrane sequence (<20 aa). On the other hand, our data does not support that N-glycosylation is less likely to take place close to the C-terminus, a result of a similar study by von Heijne et al.
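The "simple pattern rule" mentioned above can be written as a regular expression. The sketch below is only that baseline rule, not any of the predictors described in the talk; the function name and the example peptide are made up for illustration.

```python
import re

# PROSITE-style N-glycosylation pattern N-{P}-[ST]-{P}:
# Asn, anything but Pro, Ser or Thr, anything but Pro.
# The lookahead keeps matches zero-width after N, so overlapping sites are found.
SEQUON = re.compile(r"N(?=[^P][ST][^P])")


def find_sequons(seq):
    """Return 1-based positions of asparagines matching N-{P}-[ST]-{P}."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Made-up peptide: N2 matches (N-G-T-A); N7 is followed by P; N11 ends in ...S-P.
sites = find_sequons("MNGTAANPSANKSP")
```

A match only says that the local sequence fits the pattern. As the abstract notes, 205 experimentally negative sites also fit it, which is why a trained predictor is expected to do better than the rule alone.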

All our predictors are or will be available at www.cbs.dtu.dk/services

Wed Dec 3 16:00 Kristoffer Illergård, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Changing the traditional picture of alpha-helical membrane proteins

With the increasing number of available alpha-helical transmembrane (TM) protein structures, the traditional picture of membrane proteins has been challenged. First, this group of proteins has been viewed as consisting of long alpha-helices connected by short loops that together form a bundle in the membrane. Second, it has been suggested that membrane proteins have a hydrophilic interior and a hydrophobic exterior and thus should be inside-out relative to globular proteins. We have performed three studies showing that this old view is incorrect. A) 7% of the residues in the membrane core are coils. The coils are functionally conserved and frequently found within channels and transporters, where they introduce the flexibility and polarity required for transport across the membrane. B) The charged and strongly polar residues are buried and highly conserved in the membrane. These residues are frequently found within channels and transporters, where the polar groups often line the cavities. C) The amino acid distribution of solvent-inaccessible sites in the membrane region is similar to that in the globular region, while it is very different for the solvent-accessible sites. This led us to develop the first accessibility predictor that has reasonable performance in both membrane and globular regions.

Wed Dec 10 16:00 Erik Sonnhammer, SBC
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Employing conservation of co-expression to improve functional inference

Observing co-expression between genes suggests that they are functionally coupled. Co-expression of orthologous gene pairs across species may improve function prediction beyond the level achieved in a single species. We used orthology between genes of the three different species S. cerevisiae, D. melanogaster, and C. elegans to combine co-expression across two species at a time. This led to increased function prediction accuracy when we incorporated expression data from either of the other two species, and accuracy increased even further when conservation across both of the other two species was considered at the same time. To be able to employ the most suitable co-expression distance measure for our analysis, we evaluated the ability of four popular gene co-expression distance measures to detect biologically relevant interactions between pairs of genes. While the differences between distance measures were small, Spearman correlation gave the most robust results. See PMID: 18808668.
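For readers unfamiliar with the measure, Spearman correlation between two expression profiles is the Pearson correlation of their ranks. The following is a minimal pure-Python sketch with hypothetical function names and toy data; it is not the code used in the cited study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for a constant profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy profiles: monotonically related but far from linear.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 400.0]
rho = spearman(gene_a, gene_b)
```

Because only the ordering of the values enters the calculation, a single extreme measurement such as the 400.0 above does not distort the result, which is one plausible reason a rank-based measure behaves robustly on expression data.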

Wed Dec 17 16:00 Anders Karlström, KTH
Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Neuroeconomics

No abstract yet


Kristoffer Forslund
Last modified: Feb 18 2008