Degree | Type | Year |
---|---|---|
4313473 Bioinformatics | OB | 0 |
Teaching groups and their languages are listed in the table at the end of this document.
Level B2 of English or equivalent is recommended.
This module focuses on the development of diverse bioinformatic tools and resources commonly used in omics research. It covers several aspects of bioinformatics in a series of brief topics, presented as "tastings". It is therefore not a cumulative module but a cross-cutting one, intended to convey, through the hands of experts, the wide range of ideas and approaches that bioinformatics offers. The main objective is to give students the foundation they need to apply bioinformatics to different areas of scientific research. Over time, each student will be able to go as deep as they wish into whichever of these topics ends up framing their own research.
BLOCK 1. STATISTICS
Statistical Inference
Professor Antonio Barbadilla
- Statistics: bridge between data and models
- Data Types
- Population and sample
- Experimental design
- Data Quality
- Exploration of Data
- Sampling distribution and law of large numbers
- Statistical inference
- Central Limit Theorem
- Point estimation
- Estimation of confidence interval
- Hypothesis
- Elements of a test: H0, H1, statistical test, p value, significance level, type I and II errors, power
- Z test, t test, chi-square test, correlation test, regression, analysis of variance
- Interpretation of statistical significance
- Parametric versus nonparametric tests
- Selecting the appropriate statistical test (decision tree)
- Multivariate Testing
- Resampling
Statistics and Stochastic Processes for Sequence Analysis
Professor Pere Puig
a. Probability basics
Sets and events. Properties. Conditional probability. Independence. Alphabet and sequences. Probabilistic models.
b. The multinomial model
Simulating a multinomial sequence. Estimating probabilities.
c. The seqinr package
d. Markov chain models
Concept and examples. Classification of states. R code. Simulating a Markov chain sequence. Estimating the probabilities of transition. The probability of a sequence. Using Markov chain for discrimination.
e. Higher order Markov chain models
Concept and examples. Estimating the probabilities of transition. Comparison of higher order Markov chains.
f. Hidden Markov chain models
Concept and examples. Parameter estimation. Hidden states estimation.
g. An introduction to Generalized Linear Models
GLM basics. The Logistic model. The Poisson model.
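As an illustration of item d above (simulating a Markov chain sequence, estimating the transition probabilities, and computing the probability of a sequence), here is a minimal sketch. The course itself works in R; Python is used here only for illustration. The transition matrix, the uniform initial distribution, and all function names are assumptions made up for this example, not course material.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"

# Hypothetical first-order transition matrix (each row sums to 1).
# The values are invented for illustration only.
TRANSITIONS = {
    "A": {"A": 0.6, "C": 0.1, "G": 0.2, "T": 0.1},
    "C": {"A": 0.1, "C": 0.5, "G": 0.1, "T": 0.3},
    "G": {"A": 0.2, "C": 0.2, "G": 0.5, "T": 0.1},
    "T": {"A": 0.1, "C": 0.3, "G": 0.1, "T": 0.5},
}


def simulate(length, transitions, rng, start="A"):
    """Simulate a first-order Markov chain sequence of the given length."""
    seq = [start]
    for _ in range(length - 1):
        row = transitions[seq[-1]]
        weights = [row[b] for b in ALPHABET]
        seq.append(rng.choices(ALPHABET, weights=weights)[0])
    return "".join(seq)


def estimate_transitions(seq):
    """Estimate transition probabilities from counts of adjacent pairs."""
    pair_counts = Counter(zip(seq, seq[1:]))
    estimates = {}
    for a in ALPHABET:
        total = sum(pair_counts[(a, b)] for b in ALPHABET)
        estimates[a] = {
            b: (pair_counts[(a, b)] / total if total else 0.0) for b in ALPHABET
        }
    return estimates


def log_probability(seq, transitions, initial=None):
    """Log-probability of a sequence under the chain.

    Assumes a uniform initial distribution unless one is given, and that
    every transition used by the sequence has non-zero probability.
    """
    initial = initial or {b: 1 / len(ALPHABET) for b in ALPHABET}
    logp = math.log(initial[seq[0]])
    for a, b in zip(seq, seq[1:]):
        logp += math.log(transitions[a][b])
    return logp
```

With a long simulated sequence, the estimated matrix should approach the one used to generate it. Comparing `log_probability` under two different matrices is one simple way to discriminate between two candidate models for the same sequence.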
Bayesian Inference
Professor Emmanuele Raineri
1. Curve fitting.
- Estimation of parameters of probability distributions: binomial, Poisson and Gaussian.
- Example: fitting a noisy dataset.
- Cross validation, overfitting and regularization.
2. Dimensionality reduction.
- Principal component analysis, multidimensional scaling.
- Example: distinguishing cell types using methylation profiles.
3. Lasso regression.
- Variable selection in linear models.
- Penalized regression: Lasso and Elastic Net.
- Example: Lasso regression in R.
BLOCK 2. BASIC UTILITIES
The Human Genome
Professor Marta Puig
a. Introduction to genomes
Sequenced genomes. Organization and size of eukaryotic genomes. Building a genome: NGS methods for genomics and transcriptomics.
b. The human genome: where are we now?
Current assembly of the human genome. The ENCODE project: functional elements in the human genome. Repetitive content of the human genome.
Databases and Sequence Formats
Professor Oscar Conchillo
a. Sequence formats
Nomenclature. Text editors. FASTA format and its variants. Raw/Plain format. Genbank sequence format. EMBL sequence format. GCG, NBRF/PIR, MSA, PHYLIP, NEXUS. Format conversion.
b. Databases
Concept. Boolean searches. Wildcards and regular expressions. Identifiers and accession numbers. Classification. NAR databases compilation. GenBank and other NCBI databases. EMBL. DDBJ. Integrated Meta-Databases. Main nucleotide, protein, structure, taxonomy, etc. databases.
Software Engineering
Professor Miquel Àngel Senar
a. Version control system with Git and GitHub
b. Parallelization strategies and HPC
c. Cloud computing with Amazon Web Services
BLOCK 3. STRUCTURAL BIOINFORMATICS
Protein structure
Professors Leonardo Pardo and Óscar Conchillo
a. Introduction
Amino acids, proteins, and peptide bonds. Four levels of protein structure. Protein folding and stability. Molecular interactions. Experimental methods for structure determination.
b. Motifs and domains
c. Analysis
UNIPROT, PDB, PFAM, CATH, and SCOP databases. Protein alignment, morphing, molecular surfaces, molecular electrostatic potential.
d. Cell membrane
Membrane proteins, transmembrane segments.
Molecular modeling
Professors Leonardo Pardo and Jean-Didier Maréchal
a. Homology modeling
b. Molecular modeling
Atomic models. Potential energy. Quantum and molecular mechanics. Conformational exploration techniques.
BLOCK 4. GENOMICS
Introduction: Genome and omic data
Professor Jaime Martínez Urtaza
a. Main milestones in the genome sequencing project: sequencing, assembly and annotation.
b. Sequencing. Classic sequencing by the Sanger method. Next-generation sequencing (NGS) techniques. Second-generation techniques: 454/Roche (pyrosequencing), Illumina (reversible termination), SOLiD (sequencing by ligation), Ion Torrent (proton detection). Third-generation techniques. Challenges and differences with respect to second-generation techniques. Pacific Biosciences (PacBio; Single Molecule Real Time, SMRT). Oxford Nanopore (MinION).
c. Assembly. De novo assembly versus mapping against reference. Reads and contigs. Measurement of the quality of an assembly: quality of a base or Phred score (Q), redundancy (coverage), N50 and L50. Paired-end reads and scaffolds.
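The assembly quality measures named in item c can be made concrete with a short sketch. It uses the standard definitions: a Phred score Q corresponds to an error probability of 10^(-Q/10), mean coverage is total sequenced bases divided by genome size, N50 is the length of the contig at which the running total of contig lengths (longest first) reaches half the assembly size, and L50 is the number of contigs needed to get there. The function names and the example values are hypothetical.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q to the base-calling error probability."""
    return 10 ** (-q / 10)


def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base is sequenced (redundancy)."""
    return n_reads * read_length / genome_size


def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' contigs were needed, so it is the L50.
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Hypothetical assembly with seven contigs totalling 300 bases.
n50, l50 = n50_l50([80, 70, 50, 40, 30, 20, 10])
```

In this toy assembly the two longest contigs (80 + 70 = 150) already cover half of the 300 bases, so N50 is 70 and L50 is 2.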
Population Genomics
Professor Isaac Salazar
a. Population genomics under neutrality in a finite population
Introduction. Genetic drift. Effective population size. Probability of fixation of neutral mutants.
b. Population genomics under selection
Natural selection. Probability of fixation of selected mutants. Fitness distribution of new mutants. Rate of evolution.
c. Adaptive evolution and population size
Phylogeny and Molecular Evolution
Professor Sebastián Ramos
a. Models of sequence evolution
DNA sequence. Jukes and Cantor model. More realistic models. Model selection.
b. Phylogeny
Concept. Species trees versus gene trees. Tree-reconstruction methods: distance methods, maximum parsimony, maximum likelihood, Bayesian inference. Support. Phylogenomics. Building trees with R.
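As a small worked example linking the Jukes and Cantor model in item a to the distance methods in item b: under that model, the observed proportion p of differing sites between two aligned sequences is corrected for multiple substitutions as d = -(3/4) ln(1 - 4p/3). The sketch below assumes two pre-aligned, gap-free sequences of equal length. The course builds trees in R; Python and the function names here are illustrative choices only.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def jukes_cantor_distance(seq1, seq2):
    """Jukes-Cantor corrected distance (expected substitutions per site)."""
    p = p_distance(seq1, seq2)
    if p >= 0.75:
        # The correction is undefined at saturation (p >= 3/4).
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)
```

The corrected distance is always at least as large as p, and the gap widens as sequences diverge. A matrix of such pairwise distances is the usual input to distance-based tree reconstruction.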
Systems Biology
Professor Isaac Salazar
a. Classical and Genomic age Systems Biology
The systems biology paradigm in light of technological developments over the last 100 years. Data integration bottlenecks.
b. Mathematical modeling of molecular circuits.
Conceptual models. From conceptual models to mathematical models. Mathematical formalisms. Data driven models.
c. Design and organization principles in molecular circuits.
Concept of design principle. Mathematically controlled comparisons. Feasibility analysis. Design Spaces. Synthetic Biology.
Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | |||
Solving problems in class and work in the biocomputing lab | 39 | 1.56 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
Theoretical classes | 39 | 1.56 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
Type: Supervised | |||
Individual and team work | 40 | 1.6 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
Type: Autonomous | |||
Regular study | 178 | 7.12 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
The methodology will combine lectures, solving practical problems and real cases, work in the computing lab, individual and team work, reading articles related to the thematic blocks, and independent self-study. The virtual platform will be used.
Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.
Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Individual theoretical and practical test | 35% | 4 | 0.16 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
Soft skills | 10% | 0 | 0 | 1, 3, 6, 7, 9 |
Student's portfolio | 55% | 0 | 0 | 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 |
The evaluation system is organized in three main activities. There will be, in addition, a retake exam. The details of the activities are:
Main evaluation activities
Retake exam
To be eligible for the retake process, students must have been previously evaluated in a set of activities accounting for at least two thirds of the final score of the module. The teacher will announce the procedure and deadlines for the retake process. Please note that soft skills cannot be retaken.
Not assessable
The student will be graded as "Not assessable" if the evaluation activities they have completed account for less than 67% of the final score.
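The weighting table and the two thresholds above can be read as a small calculation. The sketch below is one possible interpretation, not the official grading procedure. The weights (35%, 10%, 55%), the two-thirds retake threshold, and the 67% threshold come from this guide. The 0 to 10 grading scale, the treatment of missing activities as contributing zero, and all names are assumptions.

```python
# Weights taken from the evaluation table in this guide.
WEIGHTS = {"test": 0.35, "soft_skills": 0.10, "portfolio": 0.55}


def evaluated_weight(scores):
    """Total weight of the activities in which the student was evaluated."""
    return sum(WEIGHTS[name] for name in scores)


def eligible_for_retake(scores):
    """Retake requires prior evaluation in at least two thirds of the score."""
    return evaluated_weight(scores) >= 2 / 3


def final_grade(scores):
    """Weighted grade, or None when evaluated weight is below 67%.

    Assumptions: grades are on a 0-10 scale and an activity the student
    did not take contributes zero to the weighted sum.
    """
    if evaluated_weight(scores) < 0.67:
        return None
    return sum(WEIGHTS[name] * grade for name, grade in scores.items())


# Hypothetical student who took the test and handed in the portfolio.
grade = final_grade({"test": 8.0, "portfolio": 7.0})
```

Here the evaluated weight is 90%, so the student receives a grade of 0.35 × 8 + 0.55 × 7 = 6.65. A student with only the portfolio and soft skills (65%) would fall below both thresholds.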
Single assessment
This module does not offer the single assessment system.
Updated bibliography will be recommended in each session of this module by the professor, and links will be made available on the Student's Area of the MSc Bioinformatics official website.
Updated software will be recommended in each session of this module by the professor, and links will be made available on the Student's Area of the MSc Bioinformatics official website.
Name | Group | Language | Semester | Shift |
---|---|---|---|---|
(PLABm) Practical laboratories (master) | 1 | English | first semester | morning-mixed |
(TEm) Theory (master) | 1 | English | first semester | morning-mixed |