Rated 3 out of 5 by from Annoying error that breaks reasoning chain In lecture 3 (page 47 of the guide), the lecturer's words do not match the mathematical statement for independent events. She says, "if A and B are independent, the probability of their union is the product of their probabilities," which is wrong. The expression shown correctly reads "the probability of the intersection of A and B equals the product of their probabilities," i.e., P(A ∩ B) = P(A)P(B). I like the course a lot as a review. It may be difficult as a standalone course for a complete beginner due to errors like this.
Date published: 2020-11-22
Rated 5 out of 5 by from The title of the course is well thought out. Great course, well presented by the professor. She knows her subject, and I have enjoyed listening to her lectures.
Date published: 2020-11-08
Rated 4 out of 5 by from A good tour of a challenging subject As a retired engineer who has dodged around statistical inference in many applications, but has had no formal training in the subject, I found this to be a good survey of population parameter estimators, basic regression, analysis of variance/covariance, and time series analysis. Prof. Williams also provides a quick introduction to a few topics that I have never encountered: regression trees and spatial statistics. Altogether, it is a fine tour of the subject, but not meant as a substitute for a textbook. Back in 1965, I was a 17-year-old research assistant to a clinical psychologist. I recall working up experimental data for an Analysis of Variance (ANOVA) without any real understanding of what I was doing. I remember long nights calculating sums of squares on an electromechanical Monroe “Monroematic” Model 88N-213 calculator, which made satisfying clunk & whir sounds when dividing. Years later, I became involved in sensitivity test statistics, e.g., the 1948 Dixon and Mood Bruceton protocol, for electro-explosive devices (EEDs), and the calculation of statistical tolerance intervals for population parameter estimates. I am still bothered by some of the shortcuts that the automotive inflatable restraints industry has taken with this particular method. Prof. Williams' course provides a nice supplement to the collection of books on statistical methods that I have assembled over the years and am now wondering what to do with. It was good that she cited the important statistical methods developed by Sir Ronald Aylmer Fisher FRS (1890-1962), who is being maligned these days for his studies of racial differences. HWF, Mesa AZ.
Date published: 2020-09-26
Date published: 2020-08-31
Rated 4 out of 5 by from Engaging Lectures, Great Graphics, R Not Helpful Professor Williams' "Learning Statistics" is a nice supplement to Michael Starbird's "Meaning from Data: Statistics Made Clear" (also from The Great Courses) in that it is more technical, covers a wider range of statistical techniques, and very ably shows the analyst how to progress through the sequence of statistical output to improve the analysis and the results; in contrast, Professor Starbird's course does a marvelous job in capturing the insights of statistics in an intuitive and easily understood manner. It also should be noted that Professor Williams' course is about 25% programming and 75% statistical analysis, which shouldn't surprise anyone since the title of the course mentions "Concepts and Applications in R" which is a free programming language that is said to be "transforming how statistics is done around the world." Professor Williams is an amiable lecturer who provides personal anecdotes to enliven her lectures. As mentioned, the course is pitched at a higher level than Starbird's course; she provides much of the underlying mathematics and shows the statistical output in all its complexity and ambiguity. I found the course particularly helpful in her analysis of the stages of output, with its p values, F-statistics and R-squareds, and how we use this output to interpret our interim analyses and then improve upon our models by dropping variables, changing the analysis from linear to non-linear, etc. I particularly enjoyed her lectures on multiple regression and logistic regression. Logistic regression brought back memories from many years ago when I was involved in an economics research project where our "binary" dependent variable was whether a bank was a member of the Federal Reserve System or not. I think I now understand more how logistic regression works than I did back then when we had our programmers make the calculations.
I also thought her lecture on "Bayesian Inference" was very insightful, as she used a baseball player's batting average to show how we can continually update our beliefs, e.g., the player's batting average, by observing new data. On the critical side, while I'm familiar with decision trees, her discussion of related regression trees was not at all clear to me, and although I downloaded the R programming language, I didn't use it very much since at this stage of my life I don't have the time to overcome what Professor Williams herself says is a "steep learning curve." Finally, although she mentions the issue of "over-fitting" once or twice, I believe it deserves more emphasis in a statistics course. In this day of big data and easy computation, some computer scientists encourage "data mining" where we just explore the data for quantitative relationships with no guiding theory or hypothesis. I would venture to say that the social science journals are full of papers where the supporting research ran 300 or so regressions and published the 10 best with a post hoc theory.
Date published: 2020-08-17
Rated 4 out of 5 by from Excellent presentation, intrusive camera movement This is a very rewarding and informative course. Professor Williams adopts a friendly informality that never interferes with clarity and precision. Sadly, her excellent presentation is sabotaged by poor video direction. The angle changes every twenty seconds, which is both intrusive and distracting. A good lecture doesn't need such gimmicks.
Date published: 2020-06-18
Rated 5 out of 5 by from Excellent I took statistics many years ago and wanted to take a refresher on the latest concepts and techniques. Professor Williams is an excellent teacher who delivers and also challenges the student. I like the use of R to support my learning efforts. The course leaves me with the skills to perform my own engineering studies as well as to evaluate studies presented to me. I highly endorse this course.
Date published: 2020-06-04
Rated 5 out of 5 by from Enjoyable Excellent teacher who adapted the subject to the real world.
Date published: 2020-05-26

##### Course Trailer

##### 1: How to Summarize Data with Statistics

Confront how ALL data has uncertainty, and why statistics is a powerful tool for reaching insights and solving problems. Begin by describing and summarizing data with the help of concepts such as the mean, median, variance, and standard deviation. Learn common statistical notation and graphing techniques, and get a preview of the programming language R, which will be used throughout the course....

30 min

##### 2: Exploratory Data Visualization in R

Dip into R, which is a popular open-source programming language for use in statistics and data science. Consider the advantages of R over spreadsheets. Walk through the installation of R, installation of a companion IDE (integrated development environment) RStudio, and how to download specialized data packages from within RStudio. Then, try out simple operations, learning how to import data, save ...

26 min

##### 3: Sampling and Probability

Study sampling and probability, which are key aspects of how statistics handles the uncertainty inherent in all data. See how sampling aims for genuine randomness in the gathering of data, and probability provides the tools for calculating the likelihood of a given event based on that data. Solve a range of problems in probability, including a case of medical diagnosis that involves the application...
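The diagnosis problem described here is a standard application of Bayes' theorem, and it is easy to work through in base R. A minimal sketch; the prevalence, sensitivity, and specificity numbers below are illustrative assumptions, not data from the course:

```r
# Hypothetical screening test (all three numbers are made up for illustration)
prevalence  <- 0.01   # P(disease)
sensitivity <- 0.95   # P(positive | disease)
specificity <- 0.90   # P(negative | no disease)

# Law of total probability: overall chance of a positive result
p_positive <- sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: probability of disease given a positive test
p_disease_given_pos <- sensitivity * prevalence / p_positive
p_disease_given_pos   # about 0.088: most positives are false positives
```

The counterintuitive punch line, that a positive result from an accurate test can still mean the disease is unlikely, is exactly why this example is a classic.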

25 min

##### 4: Discrete Distributions

There's more than one way to be truly random! Delve deeper into probability by surveying several discrete probability distributions: those defined by discrete variables. Examples include Bernoulli, binomial, geometric, negative binomial, and Poisson distributions, each tailored to answer a specific question. Get your feet wet by analyzing several sets of data using these tools....

30 min

##### 5: Continuous and Normal Distributions

Focus on the normal distribution, which is the most celebrated type of continuous probability distribution. Characterized by a bell-shaped curve that is symmetrical around the mean, the normal distribution shows up in a wide range of phenomena. Use R to find percentiles, probabilities, and other properties connected with this ubiquitous data pattern....
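In base R, the kinds of percentile and probability calculations described here are handled by the normal-distribution functions `pnorm` and `qnorm`. A small sketch; the height example and its numbers are illustrative, not from the course:

```r
# Probability that a standard normal value falls below 1.96
pnorm(1.96)                        # ~ 0.975

# 97.5th percentile of the standard normal
qnorm(0.975)                       # ~ 1.96

# The same questions for a non-standard normal,
# e.g. heights assumed ~ N(mean = 170, sd = 10)
pnorm(180, mean = 170, sd = 10)    # P(height < 180)
qnorm(0.5,  mean = 170, sd = 10)   # median height = 170
```

`dnorm` (density) and `rnorm` (random draws) round out the same family, a naming pattern R reuses for every distribution it supports.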

30 min

##### 6: Covariance and Correlation

When are two variables correlated? Learn how to measure covariance, which is the association between two random variables. Then use covariance to obtain a dimensionless number called the correlation coefficient. Using an R data set, plot correlation values for several variables, including the physical measurements of a sample population....
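The step from covariance to the dimensionless correlation coefficient can be checked directly in base R. A sketch using simulated data (the variables and their relationship are made up for illustration):

```r
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)   # y is linearly related to x, plus noise

cov(x, y)                 # covariance: its units depend on x and y
cor(x, y)                 # correlation: dimensionless, always in [-1, 1]

# Correlation is just covariance rescaled by the standard deviations
all.equal(cor(x, y), cov(x, y) / (sd(x) * sd(y)))   # TRUE
```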

26 min

##### 7: Validating Statistical Assumptions

Graphical data analysis was once cumbersome and time-consuming, but that has changed with programming tools such as R. Analyze the classic Iris Flower Data Set, the standard for testing statistical classification techniques. See if you can detect a pattern in sepal and petal dimensions for different species of irises by using scatterplots, histograms, box plots, and other graphical tools....

27 min

##### 8: Sample Size and Sampling Distributions

It's rarely possible to collect all the data from a population. Learn how to get a lot from a little by "bootstrapping," a technique that lets you improve an estimate by resampling the same data set over and over. It sounds like magic, but it works! Test tools such as the Q-Q plot and the Shapiro-Wilk test, and learn how to apply the central limit theorem....
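The resampling idea behind bootstrapping needs only a few lines of base R. A minimal sketch using a simulated sample (the data and the choice of the mean as the statistic are illustrative assumptions):

```r
set.seed(42)
data <- rexp(30, rate = 1)   # a small illustrative sample

# Bootstrap: resample the data WITH replacement many times,
# recomputing the statistic of interest (here, the mean) each time
boot_means <- replicate(10000, mean(sample(data, replace = TRUE)))

mean(boot_means)                        # bootstrap estimate of the mean
quantile(boot_means, c(0.025, 0.975))   # a 95% bootstrap confidence interval
```

The same pattern works for any statistic, medians and correlations included, which is what makes the technique feel like magic: one sample stands in for many.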

31 min

##### 9: Point Estimates and Standard Error

Take your understanding of descriptive techniques to the next level, as you begin your study of statistical inference, learning how to extract information from sample data. In this lecture, focus on the point estimate: a single number that provides a sensible value for a given parameter. Consider how to obtain an unbiased estimator, and discover how to calculate the standard error for this estimate...

23 min

##### 10: Interval Estimates and Confidence Intervals

Move beyond point estimates to consider the confidence interval, which provides a range of possible values. See how this tool gives an accurate estimate for a large population by sampling a relatively small subset of individuals. Then learn about the choice of confidence level, which is often specified as 95%. Investigate what happens when you adjust the confidence level up or down....

29 min

##### 11: Hypothesis Testing: 1 Sample

Having learned to estimate a given population parameter from sample data, now go the other direction, starting with a hypothesized parameter for a population and determining whether we think a given sample could have come from that population. Practice this important technique, called hypothesis testing, with a single parameter, such as whether a lifestyle change reduces cholesterol. Discover the ...

28 min

##### 12: Hypothesis Testing: 2 Samples, Paired Test

Extend the method of hypothesis testing to see whether data from two different samples could have come from the same population, for example, chickens on different feed types or an ice skater's speed in two contrasting maneuvers. Using R, learn how to choose the right tool to differentiate between independent and dependent samples. One such tool is the matched pairs t-test....
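In base R both cases run through the same `t.test` function, with the `paired` argument making the distinction between dependent and independent samples. A sketch with simulated before/after measurements (the scenario and numbers are made up for illustration):

```r
# Hypothetical before/after measurements on the same ten subjects
set.seed(7)
before <- rnorm(10, mean = 200, sd = 15)
after  <- before - rnorm(10, mean = 5, sd = 8)  # treatment lowers values on average

# Dependent samples: matched pairs t-test on the within-subject differences
t.test(before, after, paired = TRUE)

# Independent samples would instead omit paired =, giving
t.test(before, after)   # Welch two-sample t-test by default
```

Treating paired data as independent throws away the pairing and usually weakens the test, which is why choosing the right variant matters.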

27 min

##### 13: Linear Regression Models and Assumptions

Step into fully modeling relationships in data with the most common technique for this purpose: linear regression. Using R and data on the growth of wheat under differing amounts of rainfall, test different models against criteria for determining their validity. Cover common pitfalls when fitting a linear model to data....
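Fitting a model like the rainfall-and-growth example is a one-liner with R's `lm` function. A minimal sketch; the simulated data below stands in for the course's own data set, and the variable names are assumptions:

```r
# Illustrative rainfall/growth data (simulated, not the course's data set)
set.seed(3)
rainfall <- runif(40, min = 10, max = 60)
growth   <- 5 + 0.8 * rainfall + rnorm(40, sd = 4)

fit <- lm(growth ~ rainfall)   # fit a simple linear regression
summary(fit)                   # coefficients, R-squared, p-values

# Standard diagnostic plots for checking the model's assumptions
plot(fit)
```

The `summary` output is the same "stages of output" a later reviewer praises: estimates, standard errors, t-statistics, and an overall F-test, all in one table.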

27 min

##### 14: Regression Predictions, Confidence Intervals

What do you do if your data doesn't follow linear model assumptions? Learn how to transform the data to eliminate increasing or decreasing variance (called heteroscedasticity), thereby satisfying the assumptions of normality, independence, and linearity. One of your test cases uses the R data set for miles per gallon versus weight in 1973-74 model automobiles....

27 min

##### 15: Multiple Linear Regression

Multiple linear regression lets you deal with data that has multiple predictors. Begin with an R data set on diabetes in Pima Indian women that has an array of potential predictors. Evaluate these predictors for significance. Then turn to data where you fit a multiple regression model by adding explanatory variables one by one. Learn to avoid overfitting, which happens when too many explanatory variables...

34 min

##### 16: Analysis of Variance: Comparing 3 Means

Delve into ANOVA, short for analysis of variance, which is used for comparing three or more group means for statistical significance. ANOVA answers three questions: Do categories have an effect? How is the effect different across categories? Is this significant? Learn to apply the F-test and Tukey's honest significant difference (HSD) test....

30 min

##### 17: Analysis of Covariance and Multiple ANOVA

You can combine features of regression and ANOVA to perform what is called analysis of covariance, or ANCOVA. And that's not all: Just as you can extend simple linear regression to multiple linear regression, you can also extend ANOVA to multiple ANOVA, known as MANOVA, or multivariate analysis of variance. Learn when to apply each of these techniques....

32 min

##### 18: Statistical Design of Experiments

While a creative statistical analysis can sometimes salvage a poorly designed experiment, gain an understanding of how experiments can be designed from the outset to collect far more reliable statistical data. Consider the role of randomization, replication, blocking, and other criteria, along with the use of ANOVA to analyze the results. Work several examples in R....

29 min

##### 19: Regression Trees and Classification Trees

Delve into decision trees, which are graphs that use a branching method to determine all possible outcomes of a decision. Trees for continuous outcomes are called regression trees, while those for categorical outcomes are called classification trees. Learn how and when to use each, producing inferences that are easily understood by non-statisticians....

28 min

##### 20: Polynomial and Logistic Regression

What can be done with data when transformations and tree algorithms don't work? One approach is polynomial regression, a form of regression analysis in which the relationship between the independent and dependent variables is modeled as an nth-degree polynomial. Step functions fit smaller, local models instead of one global model. Or, if we have binary data, there is logistic regression, in which...

34 min

##### 21: Spatial Statistics

Spatial analysis is a set of statistical tools used to find additional order and patterns in spatial phenomena. Drawing on libraries for spatial analysis in R, use a type of graph called a semivariogram to plot the spatial autocorrelation of the measured sample points. Try your hand at data sets involving the geographic incidence of various medical conditions....

35 min

##### 22: Time Series Analysis

Time series analysis provides a way to model response data that is correlated with itself, from one point in time to the next, such as daily stock prices or weather history. After disentangling seasonal changes from longer-term patterns, consider methods that can model a dependency on time, collectively known as ARIMA (autoregressive integrated moving average) models....

34 min

##### 23: Prior Information and Bayesian Inference

Turn to an entirely different approach for doing statistical inference: Bayesian statistics, which assumes a known prior probability and updates the probability based on the accumulation of additional data. Unlike the frequentist approach, the Bayesian method does not depend on an infinite number of hypothetical repetitions. Explore the flexibility of Bayesian analysis.
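One reviewer above mentions that this lecture uses a batting average as its running example. The standard way to set that up is beta-binomial updating, which needs only base R; a sketch with made-up illustrative numbers (the prior counts and new data are assumptions, not the lecture's figures):

```r
# Beta-binomial updating for a batting average (numbers are illustrative)
prior_hits <- 80; prior_outs <- 220   # Beta(80, 220) prior: prior mean ~ .267
new_hits   <- 30; new_atbats <- 100   # newly observed at-bats

# Conjugate update: just add observed successes and failures to the prior
post_a <- prior_hits + new_hits
post_b <- prior_outs + (new_atbats - new_hits)

post_a / (post_a + post_b)              # posterior mean batting average
qbeta(c(0.025, 0.975), post_a, post_b)  # 95% credible interval
```

Each new game's data can be folded in the same way, which is the "continually update our beliefs" idea in concrete form.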

35 min

##### 24: Statistics Your Way with Custom Functions

Close the course by learning how to write custom functions for your R programs, streamlining operations, enhancing graphics, and putting R to work in a host of other ways. Professor Williams also supplies tips on downloading and exporting data, and making use of the rich resources for R, a truly powerful tool for understanding and interpreting data in whatever way you see fit....

34 min
Talithia Williams

To truly appreciate statistical information, we have to understand the language of statistics, the assumptions of statistics, and how we reason in the face of uncertainty.

ALMA MATER

Rice University

INSTITUTION

Harvey Mudd College