Total length of all episodes: 3 days 1 hour 46 minutes
Does a given bacterial gene live on a plasmid or the chromosome? What other genes live on the same plasmid?
In this episode, we hear from Sergio Arredondo-Alonso and Anita Schürch, whose projects mlplasmids and gplas answer these types of questions.
In this episode, Benjamin Callahan talks about some of the issues faced by microbiologists when conducting amplicon sequencing and metagenomic studies. The two main themes are:
In this episode, Luke Anderson-Trocmé talks about his findings from the 1000 Genomes Project. Namely, some of the early sequenced genomes contain specific mutational signatures that have not been replicated in other sources and that can be detected via their association with lower base quality scores. Listen to Luke tell the story of how he stumbled upon and investigated these fake variants and what their impact is...
In this episode, I talk with Irineo Cabreros about causality. We discuss why causality matters, what does and does not imply causality, and two different mathematical formalizations of causality: potential outcomes and directed acyclic graphs (DAGs). Causal models are usually considered external to and separate from statistical models, whereas Irineo’s new paper shows how causality can be viewed as a relationship between particularly chosen random variables (potential outcomes)...
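The potential-outcomes view can be made concrete with a toy example. The sketch below is not taken from Irineo's paper; the data and the names `units`, `true_ate`, and `naive_difference` are hypothetical. It assumes we can see both potential outcomes for every unit, which is never the case in real data, to show how a naive comparison of treated and untreated groups can differ from the true average treatment effect when treatment assignment is confounded.

```python
from statistics import mean

# Hypothetical toy data: (treated, y0, y1) for each unit.
# y0 is the outcome the unit would have without treatment, y1 with treatment.
# In real data only one of the two is ever observed per unit.
units = [
    (1, 5.0, 7.0),
    (1, 6.0, 8.0),
    (1, 7.0, 9.0),
    (0, 1.0, 3.0),
    (0, 2.0, 4.0),
    (0, 3.0, 5.0),
]


def observed_outcome(treated, y0, y1):
    """The observed outcome is the potential outcome selected by the treatment."""
    return y1 if treated else y0


def true_ate(units):
    """Average treatment effect, computable only because the toy data has both outcomes."""
    return mean(y1 - y0 for _, y0, y1 in units)


def naive_difference(units):
    """Difference of observed group means, which ignores how treatment was assigned."""
    treated = [observed_outcome(t, y0, y1) for t, y0, y1 in units if t]
    control = [observed_outcome(t, y0, y1) for t, y0, y1 in units if not t]
    return mean(treated) - mean(control)


print(true_ate(units))          # 2.0
print(naive_difference(units))  # 6.0
```

Here the treated units have higher baseline outcomes than the untreated ones, so the naive difference (6.0) overstates the true effect (2.0).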
In this episode, we hear from Romain Lopez and Gabriel Misrachi about scVI—Single-cell Variational Inference. scVI is a probabilistic model for single-cell gene expression data that combines a hierarchical Bayesian model with deep neural networks encoding the conditional distributions. scVI scales to over one million cells and can be used for scRNA-seq normalization and batch effect removal, dimensionality reduction, visualization, and differential expression...
Even though double-stranded DNA has the famous regular helical shape, there are small variations in the geometry of the helix depending on which exact nucleotides it is made of at that position.
In this episode of the bioinformatics chat, Hassan Samee talks about the role that DNA shape plays in the recognition of DNA by DNA-binding proteins, such as transcription factors...
An αβ T-cell receptor is composed of two highly variable protein chains, the α chain and the β chain. However, based only on bulk DNA or RNA sequencing it is impossible to determine which of the α chain and β chain sequences were paired in the same receptor.
In this episode, Kristina Grigaityte talks about her analysis of 200,000 paired αβ sequences, which have been obtained by targeted single-cell RNA sequencing...
Modern genome assembly projects are often based on long reads in an attempt to bridge longer repeats. However, due to the higher error rate of the current long read sequencers, assemblers based on de Bruijn graphs do not work well in this setting, and the approaches that do work are slower...
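As a rough illustration of why a higher error rate is a problem for de Bruijn graphs, the sketch below builds a tiny graph from two reads that differ by one substitution. It is a simplified toy, not the method discussed in the episode; the function names and the example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def count_edges(graph):
    """Number of distinct k-mers (edges) in the graph."""
    return sum(len(successors) for successors in graph.values())


clean_read = "ACGTTGCA"
noisy_read = "ACGTAGCA"  # same read with one substitution (T -> A at index 4)

clean_graph = de_bruijn_edges([clean_read], k=4)
mixed_graph = de_bruijn_edges([clean_read, noisy_read], k=4)

print(count_edges(clean_graph))  # 5
print(count_edges(mixed_graph))  # 9
```

A single substitution adds four spurious edges here, one for every k-mer that overlaps the erroneous base. With error-prone long reads, such spurious branches come to dominate the graph.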
In this episode, we hear from Jacob Schreiber about his algorithm, Avocado.
Avocado uses deep tensor factorization to break a three-dimensional tensor of epigenomic data into three orthogonal dimensions corresponding to cell types, assay types, and genomic loci. Avocado can extract a low-dimensional, information-rich latent representation from the wealth of experimental data from projects like the Roadmap Epigenomics Consortium and ENCODE...
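To give a feel for factorizing a cell type × assay × locus tensor, here is a minimal sketch. It swaps in a plain multilinear factorization trained by stochastic gradient descent, not Avocado's deep model. The toy tensor, the function names, and all hyperparameters are assumptions made for illustration.

```python
import random


def predict(C, A, L, c, a, l):
    """Approximate one tensor entry from the three latent factor vectors."""
    return sum(C[c][k] * A[a][k] * L[l][k] for k in range(len(C[c])))


def mse(C, A, L, tensor):
    """Mean squared error over all observed tensor entries."""
    errs = [(predict(C, A, L, c, a, l) - y) ** 2 for (c, a, l), y in tensor.items()]
    return sum(errs) / len(errs)


def fit(tensor, n_cell, n_assay, n_locus, rank=2, lr=0.01, epochs=500, seed=0):
    """Learn one latent vector per cell type, assay, and locus by SGD."""
    rng = random.Random(seed)

    def init(n):
        return [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n)]

    C, A, L = init(n_cell), init(n_assay), init(n_locus)
    start = mse(C, A, L, tensor)
    for _ in range(epochs):
        for (c, a, l), y in tensor.items():
            err = predict(C, A, L, c, a, l) - y
            for k in range(rank):
                ck, ak, lk = C[c][k], A[a][k], L[l][k]
                # Gradient of the squared error with respect to each factor.
                C[c][k] -= lr * 2 * err * ak * lk
                A[a][k] -= lr * 2 * err * ck * lk
                L[l][k] -= lr * 2 * err * ck * ak
    return C, A, L, start, mse(C, A, L, tensor)


# Hypothetical toy signal: 2 cell types x 2 assays x 3 loci.
cell_effect = [1.0, 2.0]
assay_effect = [0.5, 1.5]
locus_effect = [1.0, 0.0, 2.0]
tensor = {
    (c, a, l): cell_effect[c] * assay_effect[a] * locus_effect[l]
    for c in range(2) for a in range(2) for l in range(3)
}

C, A, L, before, after = fit(tensor, n_cell=2, n_assay=2, n_locus=3)
print(f"MSE before training: {before:.3f}, after: {after:.3f}")
```

Each row of `L` is then a low-dimensional representation of one locus. Learned latent representations of this kind are what the description above refers to, although Avocado itself learns them with a deep model.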
The third Bioinformatics Contest took place in February 2019.
Alexey Sergushichev, one of the organizers of the contest, and Gennady Korotkevich, the 1st prize winner, join me to discuss this year’s problems...