2004

Date Time Name Affiliation Title
Wed 25 Feb 15.00 Håkan Viklund SBC
Wed 3 Mar 15.00 Lars Arvestad SBC
Wed 10 Mar 15.00 Erik Aurell KTH Physics
Wed 17 Mar 15.00 Olivia Eriksson SBC
Wed 24 Mar 15.00 Gunnar von Heijne SBC
Wed 31 Mar 15.00 Bob MacCallum SBC
Predicting the Nuclear Proteome
Wed 7 Apr 15.00 Björn Ursing Center for Genomics and Bioinformatics, KI
EXProt/FetchProt - Finding proteins with an experimentally verified function
Wed 21 Apr 15.00 Sara Light SBC
Wed 28 Apr 15.00 Markus Brameier SBC
Wed 12 May 10.00 Alexander Schliep Max Planck Institute for Molecular Genetics, Berlin
Gene expression over time: identification of groups
Wed 19 May 15.00 Nick Braun (NOTE! Change of speaker!) SBC
Wed 26 May 15.00 Björn Wallner SBC
Wed 9 Jun 15.00 Maria Werner (NOTE! Change of speaker!) SBC
Numerical solution to the master equation using the linear noise approximation
Wed 16 Jun 15.00 Samuel Andersson SBC
Wed 23 Jun 15.00 Erik Granseth SBC
Wed Sept 8 15.15 Erik Lindahl SBC
Protein Folding: Probing Multi-Microsecond Timescales with Distributed Computing
Wed Sept 15 15.15 Karin Julenius SBC and KI
Prediction, conservation analysis and structural characterization of mammalian mucin-type O-glycosylation sites
Fri Sept 17 14.00 Olivia Eriksson SBC
Halftime seminar
Wed 22 Sept 15.15 Anthony Poole Mol. Biol. & Funct. Genomics, SU
The Nature of the Last Universal Common Ancestor
Wed 29 Sept 15.15 Peter Kundrotas CSB, Novum KI
Relations between some properties of unfolded proteins and their sequences
Wed Oct 6 15.15 Jens Lagergren SBC
Probabilistic analysis of gene families from multiple species
Comparative genomics in general and orthology analysis in particular are becoming increasingly important parts of gene function prediction. Previously, orthology analysis and reconciliation have been performed only with respect to the parsimony model. This discards many plausible solutions and sometimes precludes finding the correct one. In many areas of bioinformatics, probabilistic models have proven to be both more realistic and more powerful than parsimony models. We introduce a probabilistic gene evolution model based on a birth-death process in which a gene tree evolves "inside" a species tree. Based on this model, we develop a tool with the capacity to perform practical orthology analysis, based on Fitch's original definition, and more generally to reconcile pairs of gene and species trees with respect to duplications and losses. We develop a Bayesian analysis based on MCMC which facilitates approximation of the posterior distribution for reconciliations. This also gives a way to estimate the probability that a pair of genes are orthologs. The algorithm performs very well on synthetic as well as biological data. Using standard correspondences, our results carry over to allele trees as well as biogeography. When lateral transfers are also considered, reconciliation is much harder. We give a combinatorial model and parsimony algorithms for gene duplications and lateral gene transfers. These algorithms detect lateral gene transfers with very low error rates.
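To give a feel for the birth-death process named in the abstract, here is a minimal sketch that simulates the number of gene lineages along a single species-tree branch. It is not the speaker's tool or model: the function name `simulate_gene_copies`, the per-lineage rates `birth_rate` (duplication) and `death_rate` (loss), and the choice of a linear birth-death process simulated event by event are all assumptions made for illustration.

```python
import random


def simulate_gene_copies(birth_rate, death_rate, branch_length, rng, start_copies=1):
    """Simulate gene lineage counts along one branch (hypothetical helper).

    Each lineage duplicates at rate birth_rate and is lost at rate death_rate.
    Returns the number of lineages present at the end of the branch.
    """
    copies = start_copies
    elapsed = 0.0
    while copies > 0:
        total_rate = copies * (birth_rate + death_rate)
        if total_rate == 0.0:
            break  # no events can happen
        # Waiting time to the next event is exponential in the total rate.
        elapsed += rng.expovariate(total_rate)
        if elapsed >= branch_length:
            break  # the next event would fall beyond the end of the branch
        if rng.random() < birth_rate / (birth_rate + death_rate):
            copies += 1  # duplication
        else:
            copies -= 1  # loss
    return copies


if __name__ == "__main__":
    rng = random.Random(2004)
    counts = [simulate_gene_copies(0.5, 0.3, 2.0, rng) for _ in range(10)]
    print(counts)
```

For a linear birth-death process started from one lineage, the expected count after time t is exp((birth_rate - death_rate) * t), which gives a quick sanity check when averaging many runs. A full reconciliation model would run such a process on every branch of the species tree and track the resulting gene tree, which this sketch does not attempt.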
Wed Oct 13 15.15 Ann-Charlotte Berglund SBC
Myostatin sequence evolution in ruminants
Myostatin (GDF-8) is a negative regulator of skeletal muscle development. This gene has previously been implicated in the double muscling phenotype in mice and cattle.

Analysis of nonsynonymous to synonymous nucleotide substitution rate ratios (Ka/Ks) indicates that positive selection may have been operating on this gene during the divergence of Bovinae and Antilopinae, starting approximately 23 million years ago, a period that appears to account for most of the sequence difference between myostatin in these two groups.

Sites evolving under positive selection pressures were found both in the propeptide region and the C-terminal region of the gene.

Wed Oct 20 15.15 Arne Elofsson SBC
Gene function predictions: a personal overview and future perspective
The prediction of the function of a gene is one of the fundamental goals of bioinformatics today. Besides expensive experimental approaches, there are three fundamentally different methods to obtain functional information: (i) function is often inferred by the detection of homology to a functionally classified protein, (ii) some types of functions can be assessed by the prediction of local features of proteins, and (iii) recently a number of network-centered methods to predict functions have been developed. Here, I will present some of our methods used for all three types of gene functional classification. This research will focus on three categories of method development (Homology Detection, Structure Predictions and Network analysis) and the analysis of three biological problems (Membrane proteins, Comparative Genomics and Proteome Evolution).
Wed Oct 27 13.00 Erik Aurell Theo. Bio. Phys.
Simple and not so simple heuristic search on 3-SAT
Random 3-SAT is the problem of determining whether a set of M propositions in N Boolean variables, all of the type "X OR Y OR Z", can be satisfied simultaneously. While the problem is hard in the worst case, it is easy for most instances unless the ratio M/N is close to 4.27. These statements hold as N and M tend to infinity with their ratio fixed, given proper definitions of hard and easy. "Close to 4.27" has meant to within about 10%.

3-SAT in the hard region is a paradigmatic combinatorial optimization problem. I will present a study of a well-known heuristic search called walksat (Selman, Kautz & Cohen) that in fact shows median computation time linear in N up to M/N = 4.14. Numerical concentration-of-measure results will be presented in further support of this statement. These results on walksat are apparently new.

I will then compare walksat to a somewhat complex algorithm called "survey propagation", which I will derive as a variant of Belief Propagation in a fairly odd system of beliefs. The original derivation was framed in a 1-step replica symmetry breaking scenario in an equivalent diluted spin glass (Mezard, Parisi & Zecchina) and will not be presented here. Survey propagation is polynomial in N in median computation time up to M/N above 4.20.

This is joint work with Scott Kirkpatrick, to be presented at NIPS 2004.
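As a rough illustration of the kind of heuristic search discussed in this abstract, the sketch below generates a random 3-SAT instance and runs a simplified walksat-style local search on it. It is not the implementation studied in the talk: the function names, the `noise` parameter, and the greedy rule (flip the variable that leaves the fewest clauses unsatisfied) are assumptions chosen for brevity, and each flip rescans all clauses, so this version does not show the linear-in-N behaviour the abstract reports.

```python
import random


def random_3sat(n_vars, n_clauses, rng):
    """Random 3-SAT instance; literal v means variable v is true, -v means false."""
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), 3)  # three distinct variables
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses


def is_satisfied(clause, assignment):
    """True if at least one literal of the clause holds under the assignment."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def count_unsat_after_flip(var, clauses, assignment):
    """Number of unsatisfied clauses if var were flipped (assignment is restored)."""
    assignment[var] = not assignment[var]
    count = sum(not is_satisfied(c, assignment) for c in clauses)
    assignment[var] = not assignment[var]
    return count


def walksat(clauses, n_vars, rng, max_flips=100_000, noise=0.5):
    """Simplified walksat-style search; returns (assignment or None, flips used)."""
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for flips in range(max_flips):
        unsat = [c for c in clauses if not is_satisfied(c, assignment)]
        if not unsat:
            return assignment, flips
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk move
        else:
            # Greedy move: flip the variable that leaves the fewest unsatisfied clauses.
            var = min((abs(lit) for lit in clause),
                      key=lambda v: count_unsat_after_flip(v, clauses, assignment))
        assignment[var] = not assignment[var]
    if all(is_satisfied(c, assignment) for c in clauses):
        return assignment, max_flips
    return None, max_flips


if __name__ == "__main__":
    rng = random.Random(0)
    n = 50
    instance = random_3sat(n, int(3.0 * n), rng)  # ratio M/N = 3.0, below the hard region
    solution, flips = walksat(instance, n, rng)
    print("solved" if solution is not None else "gave up", "after", flips, "flips")
```

A return value of `None` only means the search gave up within `max_flips`; it does not prove the instance is unsatisfiable.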

Wed Nov 10 15.15 Ingemar Ernberg MTC, KI
Tumour Biology today and KICancer
Thanks to advances in molecular cell biology, we today basically understand what causes cancer. It is a disease of cells and genes. And it is complex. High-throughput technologies provide large amounts of data, which require computational analysis. But the means to advance knowledge in cell biology with computer simulations are still very limited. An example of a tractable problem will be presented: a viral genetic switch with a distant resemblance to the lambda switch, but which induces cancer development.

Finally, a brief summary of KICancer, the new network for all cancer researchers within KI, will be presented.

Wed Nov 17 15.15 Johannes Frey-Skött SBC
In silico analysis of the effect on the proteome of alternative splicing
Alternative splicing is the phenomenon that explains the huge difference between the number of proteins in a particular proteome and the number of genes in the corresponding genome. The process occurs post-transcriptionally and alters the pre-mRNA by excising the exons and combining them into mature mRNA. When the exons are spliced together they may be arranged in a different order or some may be excluded, thus giving rise to different forms of mRNA. We study what effects alternative splicing might have on the proteome, both in which patterns the exons are combined and what features the added or exchanged sequences have.
Wed Nov 24 15.15 Erik Sandelin SBC
Extracting multiple alignments from pairwise alignments: A combinatorial optimization problem
Multiple Structural Alignments (MSTAs) provide position-specific information on the sequence and structural variability allowed by protein 'folds'. This information can be exploited to better understand the evolution of proteins and the physical chemistry of polypeptide folding. Most MSTA methods rely on a pre-computed library of pairwise alignments. This library will in general contain conflicting residue equivalences, not all of which can be realized in the final MSTA. Hence, to build a consistent MSTA, these methods have to select a conflict-free subset of equivalences.

Using a dataset with 327 families from SCOP 1.63, we compare the ability of two different methods to select an optimal conflict-free subset of equivalences. One is an implementation of the integer linear programming (ILP) formulation of the maximum weight trace problem (Reinert et al., 1997). This ILP formulation is a rigorous approach, but its complexity is difficult to predict. The other method is T-Coffee (Notredame et al., 2000), which uses a heuristic enhancement of the equivalence weights that allows it to keep the speed and simplicity of the progressive alignment approach while still incorporating information from all alignments in each step of building the MSTA. We find that although the ILP formulation consistently selects a more optimal set of conflict-free equivalences, the differences are small and the quality of the resulting MSTAs is essentially the same for both methods.

Wed Dec 1 15.15 David Fredman CGB, KI
Copy Number Polymorphism in the Human Genome
Copy number variation in phenotypically normal human genomes is a recently described major new form of polymorphism. It is becoming evident that perhaps as much as 5-10% of the human genome is assembled of long segments (kb to Mb in length) that vary in copy number between individuals. These regions contain genes, common repeats, SNPs, and other common features. Duplicated genes may alter gene expression, and are also subject to induced structural rearrangements with potential functional consequences.

In a recently published study, we extended this revelation to show that reported SNPs in such domains were often not SNPs at all, but false interpretations of multi-site variability (MSV), reflecting the underlying copy-number differences. Most worryingly of all, MSVs masquerade as SNPs when genotyped with most common genotyping methods in use today. These issues must now be properly addressed in disease association studies and haplotype map construction, in order to avoid missing true signals or drawing invalid conclusions.

References:
Fredman, D. et al. Complex SNP-related sequence variation in segmental genome duplications. Nature Genetics (2004).
Iafrate, A.J. et al. Detection of large-scale variation in the human genome. Nature Genetics (2004).
Sebat, J. et al. Large-scale copy number polymorphism in the human genome. Science (2004).

Wed Dec 8 15.15 Gunnar O Klein Dept of Medicine, Karolinska Institutet
Chairman of the global eHealth Standardization Co-ordination Group
Medical (clinical) informatics and Bioinformatics: Possible synergy
Wed Dec 15 15.15 Thomas Helleday GMT, SU
Recombination Repair and a new treatment for BRCA2 tumours?
A hallmark of cancer treatment is the killing of growing cells. The most successful anti-cancer drugs cause DNA damage, which is converted into toxic lesions at replication forks. Although this approach to treating cancer is highly successful, we still know very little about the lesions formed at the damaged replication forks or how these lesions are repaired. Here, novel data will be presented on the role of recombination in the repair of different replication lesions and on the signalling pathways activating this repair pathway. Furthermore, data suggesting a role for poly(ADP-ribose) polymerase (PARP) in replication repair will be presented. These data suggest that inhibition of PARP alone efficiently and specifically leads to killing of BRCA2 defective tumours.