Data Skeptic

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

https://dataskeptic.com




 



Recurrent Relational Networks


One of the most challenging NLP tasks is natural language understanding and reasoning. How can we construct algorithms that achieve human-level understanding of text and can answer general questions about it? This is truly an open...







   19m
 
 

Text World and Word Embedding Lower Bounds


In the first half of this episode, Kyle speaks with Marc-Alexandre Côté and Wendy Tay about Text World. Text World is an engine that simulates text adventure games. Developers are encouraged to try out their reinforcement learning skills...







   39m
 
 

word2vec


Word2vec is an unsupervised machine learning model which is able to capture semantic information from the text it is trained on. The model is based on neural networks. Several large organizations like Google and Facebook have trained word embeddings...







   31m
 
 

Authorship Attribution


In a recent paper, Leveraging Discourse Information Effectively for Authorship Attribution, authors Su Wang, Elisa Ferracane, and Raymond J. Mooney describe a deep learning methodology for predicting which of a collection of authors was the author of a...







   50m
 
 

Very Large Corpora and Zipf's Law


The earliest efforts to apply machine learning to natural language tended to convert every token (every word, more or less) into a unique feature. While techniques like stemming may have cut the number of unique tokens down, researchers always had to...







 2019-01-18  24m
 
 

Semantic search at GitHub


GitHub is many things besides source control. It's a social network, even though not everyone realizes it. It's a vast repository of code. It's a ticketing and project management system. And of course, it has search as well. In this episode, Kyle...







 2019-01-11  34m
 
 

Let's Talk About Natural Language Processing


This episode reboots our podcast with the theme of Natural Language Processing for the next few months. We begin with introductions of Yoshi and Linh Da and then get into a broad discussion about natural language processing: what it is, what some of...







 2019-01-04  36m
 
 

Data Science Hiring Processes


Kyle shares a few thoughts on mistakes he has observed job applicants make, along with a few procedural insights that listeners at early stages in their careers might find valuable.







 2018-12-28  33m
 
 

Holiday Reading - Epicac


Epicac by Kurt Vonnegut.







 2018-12-25  21m
 
 

Drug Discovery with Machine Learning


In today's episode, Kyle chats with Alexander Zhebrak, CTO of Insilico Medicine, Inc. Insilico describes itself as artificial intelligence for drug discovery, biomarker development, and aging research. The conversation in this episode explores the ways...







 2018-12-21  28m