Linear Digressions

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.

http://lineardigressions.com




 
 

      Open Data and Open Science


      One interesting trend we've noted recently is the proliferation of papers, articles and blog posts about data science that don't just tell the result--they include data and code that allow anyone to repeat the analysis. It's far from universal (for a timely counterpoint, read this article), but we seem to be moving toward a new normal where data science conclusions are expected to be shown, not just told. Relevant...






      16m
       

      Defining the quality of a machine learning production system


      Building a machine learning system and maintaining it in production are two very different things. Some folks over at Google wrote a paper that shares their thoughts on all the items you might want to test or check in your production ML system. Relevant links: https://research.google.com/pubs/pub45742.html






      20m
       

      Auto-generating websites with deep learning


      We've already talked about neural nets in some detail (links below), and in particular we've been blown away by the way that image recognition from convolutional neural nets can be fed into recurrent neural nets that generate descriptions and captions of the images. Our episode today tells a similar tale, except today we're talking about a blog post where the author fed in wireframes of a website design and asked the neural net to generate the HTML and CSS that would actually build a website...






      19m
       

      The Case for Learned Index Structures, Part 2: Hash Maps and Bloom Filters


      Last week we started the story of how you could use a machine learning model in place of a data structure, and this week we wrap up with an exploration of Bloom Filters and Hash Maps. Just like last week, when we covered B-trees, we'll walk through both the "classic" implementation of these data structures and how a machine learning model could create the same functionality.
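For listeners who want something concrete, here is a minimal sketch of the "classic" side of that comparison: a Bloom filter built from a bit array and a few hash functions. The class name, the sizes, and the trick of salting SHA-256 with an index are all assumptions made for illustration; they are not taken from the episode or the paper.

```python
import hashlib


class BloomFilter:
    """Classic Bloom filter: a bit array plus k hash functions (illustrative sizes)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Assumption: derive k positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True only means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
for word in ["b-tree", "hash map", "bloom filter"]:
    bloom.add(word)

print(bloom.might_contain("hash map"))
```

The one-sided guarantee is the important part: an item that was added is never reported absent, while an item that was never added can occasionally be reported as possibly present.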






      20m
       

      The Case for Learned Index Structures, Part 1: B-Trees


      Jeff Dean and his collaborators at Google are turning the machine learning world upside down (again) with a recent paper about how machine learning models can be used as surprisingly effective substitutes for classic data structures. In this first part of a two-part series, we'll go through a data structure called B-trees. The structural form of B-trees makes them efficient for searching, but if you squint at a B-tree and look at it a little bit sideways then the search functionality starts...
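One way to picture the idea of a model standing in for an index is the toy sketch below. It fits a straight line from key to position in a sorted list, remembers the worst prediction error, and then searches only within that error window. This is a deliberately simple stand-in, not the model from the paper; the class name, the linear fit, and the sample keys are all assumptions.

```python
import bisect


class LearnedIndex:
    """Toy learned index over a non-empty sorted list of distinct numeric keys."""

    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(sorted_keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in sorted_keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(sorted_keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The worst prediction error over stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(sorted_keys))

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return the position of `key`, or None if it is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search only inside the window around the model's guess.
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < n and self.keys[pos] == key:
            return pos
        return None


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
index = LearnedIndex(keys)
print(index.lookup(13), index.lookup(4))
```

The better the model fits the distribution of keys, the smaller the error window and the less searching is left to do after the prediction.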






      18m
       2018-01-22

      Challenges with Using Machine Learning to Classify Chest X-Rays


      Another installment in our "machine learning might not be a silver bullet for solving medical problems" series. This week, we have a high-profile blog post that has been making the rounds for the last few weeks, in which a neural network trained to visually recognize various diseases in chest X-rays is called into question by a radiologist with machine learning expertise. As it seemingly always does, it comes down to the dataset that's used for training--medical records assume a lot of...






      18m
       2018-01-15

      The Fourier Transform


      The Fourier transform is one of the handiest tools in signal processing for dealing with periodic time series data. Using a Fourier transform, you can break apart a complex periodic function into a bunch of sine and cosine waves, and figure out what the amplitude, frequency and phase offset of those component waves are. It's a really handy way of re-expressing periodic data--you'll never look at a time series graph the same way again.
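To make the decomposition concrete, here is a small sketch using a naive discrete Fourier transform written with only the standard library. The test signal (3 cycles at amplitude 2.0 plus 5 cycles at amplitude 0.5 over 64 samples) is made up for illustration, and a real application would more likely use an FFT routine from a numerical library.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# Hypothetical signal: two component waves sampled over one window.
n = 64
signal = [
    2.0 * math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
    for t in range(n)
]

spectrum = dft(signal)

# For a real signal, the component with k cycles per window (0 < k < n/2)
# has amplitude 2 * |X[k]| / n and a phase given by the angle of X[k].
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]
phases = [cmath.phase(c) for c in spectrum[: n // 2]]

for k in (3, 5):
    print(k, round(amplitudes[k], 6), round(phases[k], 6))
```

The index k plays the role of frequency, and the two non-zero entries in `amplitudes` recover exactly the two waves that were mixed together. A pure sine shows up with a phase of -pi/2 relative to a cosine.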






      15m
       2018-01-08

      Statistics of Beer


      What better way to kick off a new year than with an episode on the statistics of brewing beer?






      15m
       2018-01-02

      Re-Release: Random Kanye


      We have a throwback episode for you today as we take the week off to enjoy the holidays. This week: what happens when you have a Markov chain that generates mashup Kanye West lyrics with Bible verses? Exactly what you think.
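For the curious, a word-level Markov chain text generator can be sketched in a few lines. The two tiny corpora below are placeholders invented for this example, not real lyrics or verses, and the function names are hypothetical; the episode does not say how the original generator was built.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, max_words=12, seed=None):
    """Walk the chain from `start`, picking each next word at random."""
    rng = random.Random(seed)
    output = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(output) < max_words and chain.get(output[-1]):
        output.append(rng.choice(chain[output[-1]]))
    return " ".join(output)


# Placeholder corpora; training on two mixed sources is what produces the mashup.
corpus_a = "the light shines in the dark and the dark fades"
corpus_b = "the beat drops in the night and the crowd moves"
chain = build_chain(corpus_a + " " + corpus_b)
print(generate(chain, "the", seed=0))
```

Words shared by both sources, such as "the" and "in" here, are where the generator can hop from one corpus to the other mid-sentence.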






      9m
       2017-12-24

      Debiasing Word Embeddings


      When we covered the Word2Vec algorithm for embedding words, we mentioned parenthetically that the word embeddings it produces can sometimes be a little bit less than ideal--in particular, gender bias from our society can creep into the embeddings and give results that are sexist. For example, occupational words like "doctor" and "nurse" are more highly aligned with "man" or "woman," which can create problems because these word embeddings are used in algorithms that help people find...






      18m
       2017-12-18