Data Skeptic

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

https://dataskeptic.com

An average episode of this podcast lasts 31m. 530 episode(s) have been released so far. A new episode of this podcast is released every week.

Total length of all episodes: 11 days 3 hours 4 minutes

Simultaneous Translation at Baidu


While at NeurIPS 2018, Kyle chatted with Liang Huang about his work with Baidu research on simultaneous translation, which was demoed at the conference.


 March 15, 2019  24m
 
 

Human vs Machine Transcription


Machine transcription (the process of converting audio recordings of speech into text) has come a long way in recent years. But how do the errors made during machine transcription compare to those made by a human transcriber? Find out in this...
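Transcription errors, whether made by a human or a machine, are conventionally scored with word error rate (WER): the word-level edit distance between a hypothesis transcript and a reference, divided by the reference length. A minimal, dependency-free sketch (the function name and example strings are illustrative, not from the episode):

```python
# Word error rate: Levenshtein distance over words, normalized by reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)

# Two deleted words ("on", "the") against a six-word reference: WER = 2/6
print(wer("the cat sat on the mat", "the cat sat mat"))
```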


 March 8, 2019  32m
 
 

seq2seq


A sequence-to-sequence (or seq2seq) model is a neural architecture used for translation (and other tasks) which consists of an encoder and a decoder. The encoder/decoder architecture has obvious promise for machine translation, and has been successfully...
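The essential shape of the architecture can be shown without any neural network library: an encoder folds the input sequence into a fixed-size state, and a decoder unrolls that state into an output sequence one token at a time. The arithmetic below is a toy stand-in for real recurrent (or attention-based) layers; only the encode-then-decode control flow is the point:

```python
def encode(tokens):
    # Toy encoder: fold the whole input sequence into one fixed-size state.
    state = 0
    for tok in tokens:
        # Stand-in for an RNN cell update combining state and next token.
        state = (31 * state + len(tok)) % 1000
    return state

def decode(state, length):
    # Toy decoder: update the state and emit one token per step.
    out = []
    for step in range(length):
        state = (31 * state + step + 1) % 1000
        out.append(f"tok{state % 50}")
    return out

summary = encode(["hello", "world"])   # entire input compressed to one state
print(decode(summary, 3))              # output generated token by token
```

The fixed-size bottleneck between the two halves is exactly what attention mechanisms were later introduced to relax.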


 March 1, 2019  21m
 
 

Text Mining in R


Kyle interviews his guest about her path into data science, her book, and some of the ways in which she's used natural language processing in projects both personal and professional.


 February 22, 2019  20m
 
 

Recurrent Relational Networks


One of the most challenging NLP tasks is natural language understanding and reasoning. How can we construct algorithms that are able to achieve human level understanding of text and be able to answer general questions about it? This is truly an open...


 February 15, 2019  19m
 
 

Text World and Word Embedding Lower Bounds


In the first half of this episode, Kyle speaks with Marc-Alexandre Côté and Wendy Tay about Text World.  Text World is an engine that simulates text adventure games.  Developers are encouraged to try out their reinforcement learning skills...


 February 8, 2019  39m
 
 

word2vec


Word2vec is an unsupervised machine learning model which is able to capture semantic information from the text it is trained on. The model is based on neural networks. Several large organizations like Google and Facebook have trained word embeddings...
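The model itself is neural, but the signal it learns from is co-occurrence: words that appear in similar contexts end up with similar vectors. A minimal, dependency-free sketch of that distributional idea (not the word2vec training algorithm itself; the toy corpus is illustrative):

```python
from collections import defaultdict

def cooccurrence(tokens, window=2):
    """Count how often each word appears within `window` positions of another."""
    counts = defaultdict(int)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1
    return counts

tokens = "king rules the kingdom queen rules the kingdom".split()
counts = cooccurrence(tokens)
# "king" and "queen" occur in near-identical contexts; this is the regularity
# word2vec compresses into dense vectors where similar words land nearby.
print(counts[("king", "rules")], counts[("queen", "rules")])
```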


 February 1, 2019  31m
 
 

Authorship Attribution


In a recent paper, Leveraging Discourse Information Effectively for Authorship Attribution, authors Su Wang, Elisa Ferracane, and Raymond J. Mooney describe a deep learning methodology for predicting which of a collection of authors was the author of a...


 January 25, 2019  50m
 
 

Very Large Corpora and Zipf's Law


The earliest efforts to apply machine learning to natural language tended to convert every token (every word, more or less) into a unique feature. While techniques like stemming may have cut the number of unique tokens down, researchers always had to...
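Zipf's law is why the one-feature-per-token approach blows up: word frequency falls off roughly as the inverse of frequency rank, so any corpus has a handful of very common words and an ever-growing tail of rare ones. A quick dependency-free illustration (the toy text is illustrative, not from the episode):

```python
from collections import Counter

text = ("the cat sat on the mat and the dog sat on the log "
        "while the cat watched the dog")
counts = Counter(text.split())

# Rank words by frequency; under Zipf's law, rank * frequency stays roughly
# constant, and most of the vocabulary sits in the long tail of count-1 words.
ranked = counts.most_common()
for rank, (word, freq) in enumerate(ranked, start=1):
    print(rank, word, freq)
```

Even in this 19-token text, half the vocabulary appears exactly once; at corpus scale that tail is what forces vocabulary cutoffs or subword methods.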


 January 18, 2019  24m
 
 

Semantic search at Github


GitHub is many things besides source control. It's a social network, even though not everyone realizes it. It's a vast repository of code. It's a ticketing and project management system. And of course, it has search as well. In this episode, Kyle...


 January 11, 2019  34m