Data Skeptic

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

https://dataskeptic.com

An average episode of this podcast runs 31m. So far, 533 episode(s) have been published. This podcast is released weekly.

Total length of all episodes: 11 days 5 hours 14 minutes


[MINI] Dropout


Deep learning can be prone to overfitting a given problem. This is especially frustrating given how much time and computational resources are often required to converge. One technique for fighting overfitting is to use dropout. Dropout is the method of...
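
The blurb above is cut off before it defines dropout, so the following is only a minimal sketch of one common formulation ("inverted" dropout), not necessarily the one discussed in the episode. The function name `dropout` and its parameters are hypothetical, chosen for illustration: each activation is zeroed with probability `rate` during training, and the survivors are scaled up so the expected value is unchanged.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout (illustrative sketch, names are assumptions).

    During training, each activation is set to 0.0 with probability
    `rate`; surviving activations are divided by (1 - rate) so the
    expected output matches the input. At inference time the input
    is returned unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    hidden = [0.2, 1.5, 0.7, 0.9]
    print(dropout(hidden, rate=0.5, rng=random.Random(42)))  # some units zeroed
    print(dropout(hidden, training=False))                   # unchanged at inference
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which is the usual intuition for why this reduces overfitting.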


 January 13, 2017  15m
 
 

The Police Data and the Data Driven Justice Initiatives


In this episode I speak with Clarence Wardell and Kelly Jin about their mutual service as part of the White House's Police Data Initiative and Data Driven Justice Initiative respectively. The Police Data Initiative was organized to use open data to increase transparency...


 January 6, 2017  49m
 
 

The Library Problem


We close out 2016 with a discussion of a basic interview question which might get asked when applying for a data science job. Specifically, how a library might build a model to predict if a book will be returned late or not.  


 December 30, 2016  35m
 
 

2016 Holiday Special


Today's episode is a reading of Isaac Asimov's .  As mentioned on the show, this is just a work of fiction to be enjoyed and not in any way some obfuscated political statement.  Enjoy, and happy holidays!


 December 23, 2016  39m
 
 

[MINI] Entropy


Classically, entropy is a measure of disorder in a system. From a statistical perspective, it is more useful to say it's a measure of the unpredictability of the system. In this episode we discuss how information reduces the entropy in deciding...
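
As a rough illustration of entropy as unpredictability, here is a small sketch of Shannon entropy (in bits) computed from observed outcomes. The function name `shannon_entropy` is an assumption made for this example and does not come from the episode.

```python
import math
from collections import Counter


def shannon_entropy(outcomes):
    """Shannon entropy, in bits, of the empirical distribution of `outcomes`."""
    total = len(outcomes)
    if total == 0:
        return 0.0
    counts = Counter(outcomes)
    # H = -sum(p * log2(p)) over the observed outcome frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(["H", "T"]))             # fair coin: maximally unpredictable
    print(shannon_entropy(["H", "H", "H", "T"]))   # biased coin: lower entropy
    print(shannon_entropy(["H", "H", "H", "H"]))   # certain outcome: no uncertainty
```

The more predictable the outcomes, the lower the value, which matches the description of information reducing entropy.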


 December 16, 2016  16m
 
 

MS Connect Conference


Cloud services are now ubiquitous in data science and in technology more broadly. This week, I speak to , , and about various aspects of data at scale. We discuss the embedding of R into SQL Server, SQL Server on Linux, open source, and a few...


 December 9, 2016  42m
 
 

Causal Impact


Today's episode is all about Causal Impact, a technique for estimating the impact of a particular event on a time series. We talk to about his research into the impact releases have on app and we also chat with about a project she helped us build to...


 December 2, 2016  34m
 
 

[MINI] The Bootstrap


The Bootstrap is a method of resampling a dataset to possibly refine its accuracy and produce useful metrics on the result. The bootstrap is a useful statistical technique and is leveraged in Bagging (bootstrap aggregation) algorithms such as Random...
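
The resampling idea can be sketched in a few lines. This is a minimal example under stated assumptions: the helper names `bootstrap_means` and `percentile_interval` and the sample data are hypothetical, and the percentile interval is just one of several ways to summarize bootstrap estimates.

```python
import random


def bootstrap_means(data, n_resamples=1000, rng=None):
    """Resample `data` with replacement and return the mean of each resample."""
    rng = rng or random.Random()
    n = len(data)
    # random.Random.choices draws k items with replacement.
    return [sum(rng.choices(data, k=n)) / n for _ in range(n_resamples)]


def percentile_interval(estimates, alpha=0.05):
    """Approximate (1 - alpha) interval from the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    lower_idx = int((alpha / 2) * len(ordered))
    upper_idx = max(lower_idx, int((1 - alpha / 2) * len(ordered)) - 1)
    return ordered[lower_idx], ordered[upper_idx]


if __name__ == "__main__":
    sample = [2, 4, 4, 5, 7, 9]  # made-up numbers, for illustration only
    means = bootstrap_means(sample, n_resamples=2000, rng=random.Random(7))
    print(percentile_interval(means))  # rough 95% interval for the mean
```

Bagging applies the same resampling step to training data, fitting one model per resample and combining their predictions.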


 November 25, 2016  10m
 
 

[MINI] Gini Coefficients


The Gini Coefficient (as it relates to decision trees) is one approach to choosing the optimal decision on which to split your dataset as part of a decision tree. To pick the right feature to split on, it considers the frequency of the...
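
In the decision-tree setting this quantity is often called Gini impurity. The sketch below is an assumption-laden illustration rather than the episode's own formulation: the names `gini_impurity` and `weighted_gini` are hypothetical, and the score for a candidate split is taken to be the size-weighted impurity of the two resulting groups, with lower being better.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_k^2) over the class frequencies p_k found in `labels`."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def weighted_gini(left, right):
    """Size-weighted Gini impurity of a binary split (lower is purer)."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


if __name__ == "__main__":
    # Two candidate splits of the same four labels.
    print(weighted_gini(["late", "late"], ["on_time", "on_time"]))  # perfectly pure split
    print(weighted_gini(["late", "on_time"], ["late", "on_time"]))  # uninformative split
```

A tree builder would evaluate this score for each candidate feature and threshold and keep the split with the lowest value.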


 November 18, 2016  15m
 
 

Unstructured Data for Finance


Financial analysis techniques for studying numeric, well-structured data are very mature. While using unstructured data in finance is not necessarily a new idea, the area is still very greenfield. On this episode, shares her thoughts on the potential...


 November 11, 2016  33m