From the 2014 Strata Conference + Hadoop World in New York City.

There are two essential skills for the data scientist: engineering and statistics. A great many data scientists are very strong engineers but feel like impostors when it comes to statistics. In this talk John will argue that the ability to program a computer gives you special access to the deepest and most fundamental ideas in statistics. John’s goal is to convince the non-statistician engineers in the audience that the road to statistical fluency is much, much shorter than they think.

About John Rauser:

John has been extracting value from large datasets for over 20 years at hedge funds, small data-driven startups, Amazon, and now Pinterest. He has deep experience in machine learning, data visualization, on-line experimentation, website performance and real-time fault analysis. An empiricist at heart, “Just do the experiment!” is his favorite call to arms.

Instead of consulting Wikipedia, you could just spend an hour learning about t-tests at Khan Academy.

How do you get the mosquitoes to drink the beer?

Hi all. Can anyone tell me the name of the electronics tinkering toy the girl has at 11:17?

I just watched your "Statistics Without the Agonizing Pain." Simply awesome.

You are probably aware of them already, but just in case: the books Think Stats and Think Bayes by Allen B. Downey use Python programming to get stats ideas across. They are available for free at GreenTeaPress.Com.

There is something that is not said clearly here. You don't need to remember the formula for the t-test; that is what a computer is for. What is really important are the concepts of the density function and the distribution function. In ten minutes or so one can explain those concepts, and they apply to many types of tests. Understanding the gist of what a statistical test is and how to interpret its outcome is not difficult, and you don't have to remember difficult formulas. That is why what you see here is a straw man argument. If you only want to know the essence of a statistical test, you can grasp it easily, and one way to understand it is that formulas are like simulations. You should ask any statistician if you really want to use the formula that he shows here; the answer is no way, you just use your computer. But when you know the fundamental concepts of the density function, the distribution function, and the sample mean, everything is clearer. Anyway, I understand that simulation is a good way of understanding problems and an easy way of testing difficult ones, but the essential concepts involved in a test for the mean are in no way difficult.
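To make the "formulas are like simulations" remark concrete, here is a minimal sketch rather than anything taken from the talk. It assumes the simplest possible setup: samples of size n drawn from a standard normal, where the distribution function of the sample mean has a known closed form. The function name, sample size, and threshold are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist


def simulated_cdf_of_sample_mean(x, n, n_sims=20_000, seed=0):
    """Estimate P(sample mean <= x) for n draws from a standard normal.

    Hypothetical helper for illustration: it answers by brute force the
    same question the distribution function answers with a formula.
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(n_sims):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        if sample_mean <= x:
            below += 1
    return below / n_sims


n = 10   # assumed sample size
x = 0.5  # assumed threshold for the sample mean

simulated = simulated_cdf_of_sample_mean(x, n)
# For a standard normal, the sample mean is normal with sd 1 / sqrt(n).
analytic = NormalDist(mu=0.0, sigma=1.0 / n ** 0.5).cdf(x)

print(f"simulated: {simulated:.3f}  analytic: {analytic:.3f}")
```

The two numbers should agree to within simulation noise, which is the sense in which the formula and the simulation are two routes to the same distribution function.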

I love this talk – spot on. Indeed, a solid understanding of the real stats behind things is highly desirable, but in terms of getting a better sense of the problem under consideration, this is a great approach for those with decent programming skills.

Good presentation, but it also shows that simulation without statistical understanding may be dangerous. Contrary to what is suggested in the video, the null hypotheses of the t-test and a random permutation test are not equivalent. In the former test, the null hypothesis is "the average treatment effect is zero". In the latter case, the null hypothesis is "the effect is zero for every individual".
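For readers who want to see why the permutation test carries the stronger null, here is a minimal sketch of a label-shuffling test. It is not the code from the talk, and the counts below are invented purely for illustration. Shuffling the group labels is only justified if each individual's outcome would have been the same in either group, which is exactly the "zero effect for every individual" hypothesis described above.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """One-sided permutation p-value for mean(treated) - mean(control).

    Hypothetical helper: returns the observed difference and the share
    of label shufflings that give a difference at least as large.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Under the sharp null the labels are arbitrary, so reassign them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles


# Made-up counts for illustration only, not the data shown in the talk.
beer = [27, 20, 21, 26, 27, 31, 24, 21]
water = [21, 22, 15, 12, 21, 16, 19, 15]

observed, p = permutation_p_value(beer, water)
print(f"observed difference: {observed:.2f}, p-value: {p:.4f}")
```

A t-test on the same numbers would instead ask whether the two group means differ, which can hold even when individual effects are nonzero but cancel out on average.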

@Bill Venables To clarify, I think mathematical statistics is beautiful and useful, it's just a terrible way to introduce statistical thinking (given modern computational options). I had only 10 minutes, and so the talk had to stay totally on the rails, otherwise I'd have expanded on the origins of the analytical approach, why it was invented, and why it is still useful. For more on this topic see George Cobb's lovely paper, The Introductory Statistics Course: A Ptolemaic Curriculum: https://escholarship.org/uc/item/6hb3k0nz.

I've been saying for years that, when I was in CS, they changed the curriculum to make Statistics an elective, and I felt at the time I was dodging a bullet, but now I feel I shot myself in the foot. This video makes me think I'm more in the dodge category again.