Learning objectives
V Preliminaries
* At a general/broad overview level:
* Recall the 6 steps of the statistical method of investigation and explain why the method is needed in science and other fields
* Recognize that variation is pervasive
* Explain why probability can be used to measure randomness
* Recognize different ways of representing and summarizing data.
* Be able to identify observational units and variables in a dataset.
* Terminology: see Preliminaries glossary
V 6 Jan 2014. Section P.1
* Understand why anecdotal evidence is unreliable
* Follow an example of the 6 step method of statistical investigation.
* Distinguish types of variables.
* Distinguish observational units and variables
* Know the organization and expectations of the class.
V 8 Jan 2014. Section P.2
* Explain why statistics is needed to interpret data
* Describe distributions in terms of shape and basic statistics.
* Use graphs to describe and answer questions about data.
V 10 Jan 2014. Section P.3
* Distinguish different sources of variability
* Define probability in terms of long-run frequency.
* Practice simulation as a way to model processes
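The long-run frequency idea can be explored with a short simulation. The sketch below is illustrative only: the function name `running_proportion`, the fixed seed, and the choice of 10,000 trials are assumptions, not part of the course materials.

```python
import random

def running_proportion(prob, n_trials, seed=0):
    """Simulate n_trials of a binary chance process with success
    probability `prob` and return the proportion of successes
    observed after each trial."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = 0
    proportions = []
    for trial in range(1, n_trials + 1):
        if rng.random() < prob:
            successes += 1
        proportions.append(successes / trial)
    return proportions

# Hypothetical example: a fair coin flipped 10,000 times.
props = running_proportion(0.5, 10_000)
print(props[9], props[99], props[-1])  # proportion after 10, 100, 10,000 flips
```

The early proportions can wander far from 0.5, while the later ones settle near it, which is the sense in which probability is a long-run frequency.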
V Chapter 1
* Be able to measure the strength of evidence in the case of a single binary variable.
* Justify a conclusion about data using p-values and standardized statistics
* Terminology: see Chapter 1 glossary
V 13 Jan 2014. Section 1.1
* Distinguish between a statistic and a parameter
* Explain each step of the 3S method for measuring the strength of evidence (statistic, simulate, strength)
* Explain why a chance model can be used to evaluate a statement about real data
* Justify using coin flipping and simulation using the one-proportion applet to simulate a 50/50 binary random process
* Qualitatively compare real data to the outcome of a random process.
V 15 Jan 2014. Section 1.2
* Simulate a non-50/50 binary random process.
* Use a random process to simulate the outcomes of a null hypothesis
* Associate non-random processes with alternative hypotheses
* Relate 'null and alternative' hypotheses to the 3S method and the 6 step method of statistical investigation.
* Relate null hypothesis, null distribution, and random process
* Define p-value
* Graphically interpret a p-value using a distribution.
* Use a p-value to make a statement about the strength of evidence
* Become more comfortable with the difference between a statistic and a parameter
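The link between a null hypothesis, its null distribution, and a p-value can be sketched as a simulation. This is a minimal illustration and not the one-proportion applet; the function name `simulated_p_value`, the data (14 successes in 30 trials), and the null probability of 1/3 are all hypothetical.

```python
import random

def simulated_p_value(observed, n, null_prob, reps=10_000, seed=1):
    """One-sided (greater-than) p-value: the proportion of simulated
    samples, generated under the null hypothesis, whose success count
    is at least as large as the observed count."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # One simulated sample of n trials under the null hypothesis.
        successes = sum(rng.random() < null_prob for _ in range(n))
        if successes >= observed:
            extreme += 1
    return extreme / reps

# Hypothetical data: 14 successes in 30 trials, null probability 1/3.
p = simulated_p_value(observed=14, n=30, null_prob=1 / 3)
print(p)
```

A small p-value means that results at least as extreme as the observed one are rare when the null hypothesis is true.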
V 17 Jan 2014. Section 1.3
* Describe what the standard deviation is supposed to measure
* Calculate the standard deviation of a set of data
* Standardize a statistic using a null distribution
* Use a standardized statistic to make a statement about the strength of evidence
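Standardizing a statistic against a null distribution can be written out in a few lines. The sketch below assumes the null distribution is available as a list of simulated statistics; the name `standardized_statistic` and the numbers are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption about convention.

```python
import statistics

def standardized_statistic(observed, null_values):
    """Return (observed - mean of null distribution) / SD of null distribution."""
    center = statistics.mean(null_values)
    spread = statistics.stdev(null_values)  # sample SD, n - 1 denominator
    return (observed - center) / spread

# Hypothetical simulated null proportions and a hypothetical observed proportion.
null_props = [0.40, 0.45, 0.50, 0.50, 0.55, 0.60]
z = standardized_statistic(0.70, null_props)
print(round(z, 2))
```

The result says how many standard deviations the observed statistic lies from the center of the null distribution.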
V 17-22 Jan 2014. Section 1.4 (homework)
* Identify three factors that affect the strength of evidence
* Explain how and why they affect it (i.e. whether they would make the p-value and standardized statistic larger or smaller)
* Be able to decide whether to use a one- or two-sided test based on the research question and prior knowledge.
* Link the decision to do a one-sided or two-sided test to how an alternative hypothesis is formulated
V 22 Jan 2014. Section 1.5
* Relate the theory-based alternative to the simulation part of the 3S method.
* Use the normal distribution to evaluate the strength of evidence.
* Memorize the formula for calculating the appropriate standard deviation of the null distribution for sample proportions.
* Know when the theory-based approach is invalid and understand that the simulation-based approach is valid more often.
* Use the one-proportion applet to experiment and build your intuition about the central limit theorem
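The theory-based approach for one proportion can be sketched with the standard formula for the standard deviation of the null distribution, sqrt(pi * (1 - pi) / n), and a normal curve. The function name `one_proportion_z` and the data (60 successes in 100 trials, null value 0.5) are hypothetical; the one-sided "greater than" alternative is an assumption made for the example.

```python
import math
from statistics import NormalDist

def one_proportion_z(successes, n, null_prob):
    """Return the standardized statistic and a one-sided (greater-than)
    p-value from the normal approximation."""
    p_hat = successes / n
    sd_null = math.sqrt(null_prob * (1 - null_prob) / n)
    z = (p_hat - null_prob) / sd_null
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, null probability 0.5.
z, p = one_proportion_z(60, 100, 0.5)
print(round(z, 2), round(p, 4))
```

The normal approximation is only trustworthy when the sample is large enough; with small counts of successes or failures, the simulation-based approach is the safer choice.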
V Chapter 2
* Appreciate the properties of a simple random sample and be able to generate one
* Recognize biased sampling methods
* Be able to critique the validity of a study based on its sampling scheme
* Apply tests of significance to random samples from populations
* Be able to measure the strength of evidence for a single quantitative variable
* Distinguish type I and type II errors
* Terminology: see the Chapter 2 section summaries
V 24 Jan 2014. Section 2.1
* Distinguish a sample from a population.
* Describe the relationship between samples, populations, statistics, and parameters
* Distinguish between quantitative and categorical variables
* Interpret a histogram
* Identify the population in a description of a study design.
* Decide between different possible populations that a sample may represent
* Describe how to use a sampling frame to generate a simple random sample
* Relate bias to sampling methodology
* Distinguish simple random samples, convenience samples, and proportional samples.
* Connect the sampling scheme to the generalizability of a study
V 27 Jan 2014. Section 2.2
* Understand how the property of resistance to outliers can be used to choose a statistic to use
* Justify using the median vs. the mean as a statistic to summarize a set of data
* Use the 3S strategy to draw inferences about a quantitative variable
* Understand the relationship between the formulas for the z-statistic and the t-statistic
* Use a theory-based method (one-sample t-test) to draw inferences about a population mean
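The one-sample t-statistic has the same shape as the z-statistic: (statistic - hypothesized value) / standard error, with the standard error estimated from the sample as s / sqrt(n). The sketch below is illustrative; the name `one_sample_t`, the data, and the null mean of 4 are hypothetical.

```python
import math
import statistics

def one_sample_t(data, null_mean):
    """t = (sample mean - null mean) / (sample SD / sqrt(n))."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - null_mean) / standard_error

# Hypothetical sample and hypothesized population mean.
data = [2, 4, 4, 4, 5, 5, 7, 9]
t = one_sample_t(data, null_mean=4)
print(round(t, 3))

# Resistance to outliers: replacing 9 with 90 moves the mean but not the median.
with_outlier = [2, 4, 4, 4, 5, 5, 7, 90]
print(statistics.median(data), statistics.median(with_outlier))
print(statistics.mean(data), statistics.mean(with_outlier))
```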
V 29 Jan 2014. Section 2.3
* Relate p-values to significance levels
* Use a significance level to draw a conclusion about the strength of evidence.
* Construct a "truth table" to define type I and type II errors.
V Chapter 3
* Estimate the size of the effect of a non-random process on the data
* Justify a conclusion about data using a confidence interval
* Master 4 different ways of constructing a confidence interval and relate them to each other
* Explain how biases in sampling methods affect confidence intervals
* Terminology: see the Chapter 3 section summaries
V 31 Jan 2014. Section 3.1
* Make the conceptual link between significance testing and whether a value is a plausible value for a parameter
* Define a confidence interval
* Relate a confidence level to a significance level.
* Explain the connection between confidence level and the range of plausible parameter values
V 3 Feb 2014. Section 3.2
* Relate the term margin-of-error to confidence intervals.
* Use the 2SD method (and its generalization "the Empirical Rule") to generate confidence intervals for population proportions and know when this method is valid.
* Use the theory-based method to generate confidence intervals and to explain where the Empirical Rule comes from
* Memorize the formula for the theory-based confidence interval for a proportion
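A confidence interval for a proportion of the form "statistic plus or minus a multiplier times a standard error" can be sketched as follows. This version uses the theory-based standard error sqrt(p_hat * (1 - p_hat) / n) with a multiplier of 2; the course's 2SD method may instead take the SD from a simulated distribution, so treat the function `proportion_interval` and its data as assumptions for illustration.

```python
import math

def proportion_interval(successes, n, multiplier=2.0):
    """Return (lower, upper) = p_hat -/+ multiplier * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = multiplier * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Hypothetical data: 60 successes in a sample of 100.
lower, upper = proportion_interval(60, 100)
print(round(lower, 3), round(upper, 3))
```

For a 95% confidence level the theory-based multiplier is about 1.96, which is why a multiplier of 2 gives nearly the same interval.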
V 5 Feb 2014. Section 3.3
* Use the Empirical Rule to generate a confidence interval for a population mean and know when this method is valid
* Use the t-distribution (theory-based) to generate a confidence interval for a population mean and know when this is valid
V 5 Feb 2014 also. Section 3.4
* Explain how and why sample size and confidence level affect the width of confidence intervals and be able to justify this using the theory-based confidence interval formulas and simulations
V 7 Feb 2014. Bootstrap confidence intervals [not in textbook]
* Understand how the relationship between sample and population is analogous to the relationship between statistic and parameter
* Justify sampling with replacement from a sample to represent sampling from a population based on the above relationship
* Connect the idea of sampling to the idea of bootstrapping
* Use percentiles to construct a bootstrapped confidence interval
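A percentile bootstrap interval can be sketched by resampling the sample with replacement and reading off percentiles of the resampled statistics. The function name `bootstrap_percentile_ci`, the data, the seed, and the number of resamples are all assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_percentile_ci(sample, stat=statistics.mean, level=0.95,
                            reps=5000, seed=2):
    """Percentile bootstrap confidence interval for `stat`."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values from the sample WITH replacement,
    # treating the sample as a stand-in for the population.
    boot_stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(reps))
    alpha = 1 - level
    lower_index = round((alpha / 2) * reps)
    upper_index = round((1 - alpha / 2) * reps) - 1
    return boot_stats[lower_index], boot_stats[upper_index]

# Hypothetical sample of 10 measurements.
sample = [12, 15, 9, 20, 17, 11, 14, 18, 13, 16]
low, high = bootstrap_percentile_ci(sample)
print(low, high)
```

Passing `stat=statistics.median` gives an interval for the median without changing anything else, which is one reason the bootstrap is convenient.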
V 10 Feb 2014. Section 3.5B
* Understand the distinction between statistical and practical significance
* Define power of a test
* Identify factors that affect the power of a test
* Describe how power relates to the null hypothesis
V Chapter 4
* Evaluate whether the design of a study allows causal conclusions to be drawn
* Terminology: see Chapter 4 glossary
V 12 Feb 2014. Section 4.1
* Distinguish and identify explanatory and response variables in a study
* Define and identify confounding variables.
* Explain how confounding variables affect the ability to draw causal conclusions.
V 19 Feb 2014. Section 4.2
* Identify the two places where randomization can come into study design and distinguish between them (random sampling and random assignment)
* Explain why the processes of unit selection and explanatory group formation affect the scope of conclusions
* Distinguish observational from experimental studies
* Explain why causal conclusions cannot be drawn from observational studies (and any circumstances when they can).
V 19,20 Feb 2014. Section 4.3 (homework)
* Identify and calculate the statistic in a paired design study, and explain the logic of a paired design.
* Use a paired design to draw a conclusion
* Identify where randomization comes into paired design studies.
V Chapter 5
* Apply the 6 step statistical investigation method to the case of data on two groups with one binary variable
* Terminology: see Chapter 5 glossary; also sensitivity, specificity, positive predictive value, negative predictive value
V 21 Feb 2014. Section 5.1
* Construct a contingency table from a data table
* Make a segmented bar graph from a data table
* Identify observational units and variables in a two-variable dataset.
* Calculate conditional proportions from a contingency table
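Conditional proportions are the counts in each group divided by that group's total. The sketch below shows the calculation on a hypothetical 2x2 table; the group and outcome labels, the counts, and the function name `conditional_proportions` are invented for illustration.

```python
def conditional_proportions(table):
    """`table` maps group -> {outcome: count}. Returns, for each group,
    the proportion of that group falling in each outcome."""
    result = {}
    for group, counts in table.items():
        group_total = sum(counts.values())
        result[group] = {outcome: count / group_total
                         for outcome, count in counts.items()}
    return result

# Hypothetical contingency table (explanatory variable: group).
table = {
    "treatment": {"success": 30, "failure": 20},
    "control": {"success": 20, "failure": 30},
}
cond = conditional_proportions(table)
print(cond["treatment"]["success"], cond["control"]["success"])
```

These within-group proportions are the heights of the segments in a segmented bar graph.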
V 21 Feb 2014. Section 5.2 (inference)
* Apply the 3S method to a dataset that involves comparing two sample proportions (two groups, one binary variable)
* Explain the logic of the scrambling strategy of simulation. Compare shuffling cards to the applet
* Relate the scrambling strategy of simulation to previous strategies
* Use the two-way table inference applet to draw conclusions about two proportions
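The scrambling strategy can be sketched as a shuffle of the pooled responses followed by re-dealing them into two groups of the original sizes, just as with cards. This is a minimal illustration, not the applet; the function name `scramble_p_value` and the data are hypothetical, and a two-sided alternative is assumed.

```python
import random

def scramble_p_value(group_a, group_b, reps=5000, seed=3):
    """Two-sided p-value for a difference in proportions.
    Each group is a list of 0/1 responses (1 = success)."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # break any link between group and response
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / reps

# Hypothetical data: 30/50 successes in one group, 20/50 in the other.
group_a = [1] * 30 + [0] * 20
group_b = [1] * 20 + [0] * 30
observed_diff, p_value = scramble_p_value(group_a, group_b)
print(observed_diff, p_value)
```

Shuffling simulates the null hypothesis of no association: if group membership does not matter, any re-dealing of the responses was equally likely.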
V 24 Feb 2014. Section 5.2 (estimation)
* Use the 2SD method to produce confidence intervals for the difference between two proportions
* Use bootstrapping to produce confidence intervals for the difference between two proportions
* Identify factors that affect p-values and confidence intervals for the two groups, one binary variable situation and explain why they have their effects
* Explain how confidence levels and significance levels affect the process of drawing conclusions
V 24 Feb 2014. Section 5.3
* Use the normal distribution to test for and estimate differences between two proportions
* Relate the theory-based approach to the simulation-based approach
* Explain the boxes in the two-proportion theory-based inference applet and use it to interpret the data in this chapter
* Memorize and interpret the formulas for the two-proportion theory-based approach
V 26 Feb 2014. Sensitivity, specificity, positive predictive value, negative predictive value
* Understand how to interpret medical test results in the context of a 2x2 contingency table
* Contrast the usefulness and intuitiveness of counts vs. proportions in interpreting 2x2 tables
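The four measures can be computed directly from the counts in a 2x2 table of test result against true condition. The sketch below uses invented counts for a hypothetical screening test; the function name `diagnostic_measures` is also an assumption.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """tp/fp/fn/tn = true positives, false positives,
    false negatives, true negatives (counts)."""
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | condition present)
        "specificity": tn / (tn + fp),  # P(test negative | condition absent)
        "ppv": tp / (tp + fp),          # P(condition present | test positive)
        "npv": tn / (tn + fn),          # P(condition absent | test negative)
    }

# Hypothetical counts: 1000 people, 10 with the condition.
# The test catches 9 of the 10 and flags 99 of the 990 without it.
measures = diagnostic_measures(tp=9, fp=99, fn=1, tn=891)
for name, value in measures.items():
    print(name, round(value, 3))
```

Working with counts makes the low positive predictive value easy to see: of the 108 positive results in this made-up example, only 9 are true positives, even though sensitivity and specificity are both 0.9.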
V Chapter 6
* Apply the 6 step method to data with two groups, one quantitative variable
* Terminology: see Chapter 6 glossary
V 28 Feb 2014. Section 6.1
* Use dotplots and summary statistics to display, summarize, and compare distributions of quantitative data.
* Develop intuition about how variance within vs. between groups affects the ability to draw conclusions about differences between the groups
V 28 Feb 2014. Section 6.2
* Apply the 3S method to a dataset that involves comparing two quantitative sample statistics (two groups, one quantitative variable)
* Use the 2SD method to produce confidence intervals for the difference between two statistics
* Use bootstrapping to produce confidence intervals for the difference between two statistics
* Identify some factors that affect p-values and confidence intervals for the two groups, one quantitative variable situation and explain why they have their effects
* Explain the options in the randomization test with quantitative response applet and use it to interpret data in this chapter
V 3 Mar 2014. Section 6.3
* Compare and contrast the t-distribution with the normal distribution
* Explain why a t-distribution is used to compare means of two groups and not a normal distribution
* Use the t-distribution to test for and estimate differences between two means
* Know the formula for the t statistic
* Relate the theory-based approach to the simulation-based approach
* Use the theory-based inference applet to interpret data in this chapter
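A two-sample t-statistic divides the difference in sample means by its estimated standard error. The sketch below uses the unpooled form, sqrt(s1^2/n1 + s2^2/n2); whether the course uses the unpooled or pooled form is an assumption here, and the function name `two_sample_t` and the data are hypothetical.

```python
import math
import statistics

def two_sample_t(group_1, group_2):
    """Unpooled t-statistic:
    (mean1 - mean2) / sqrt(s1^2 / n1 + s2^2 / n2)."""
    n1, n2 = len(group_1), len(group_2)
    var1 = statistics.variance(group_1)  # sample variance, n - 1 denominator
    var2 = statistics.variance(group_2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (statistics.mean(group_1) - statistics.mean(group_2)) / standard_error

# Hypothetical data for two small groups.
t_stat = two_sample_t([5, 7, 9], [1, 2, 3])
print(round(t_stat, 3))
```

Because the standard deviations are estimated from the samples, the statistic is compared to a t-distribution, which has heavier tails than the normal distribution when samples are small.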
V Chapter 9
* Develop statistics for comparing means across multiple groups
* Utilize simulation-based approaches to compare several means
* Understand roles of within-group and between-group variability in assessing significance
* Apply and interpret results of theory-based approach (ANOVA F test)
* Understand the link between the F distribution, F test, and F statistic
* Realize that either simulation or theory-based approaches can be used to assess the significance of an F statistic
* Consider follow-up multiple comparison analyses
* Terminology: see Chapter 9 glossary
V 3 and 5 March 2014. Section 9.1
* Justify making a new statistical test vs. multiple pairwise tests
* Invent a statistic for comparing means across multiple groups
* Present a statistic comparing within vs. between group variation (F statistic)
* Use the 3S method to draw inferences about group differences using the F statistic
* Motivate the MAD statistic in the case of quantitative response variables
* Use the 3S method to draw inferences about group differences using the MAD statistic
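One way to compare several group means with a single number is to average the absolute differences between every pair of group means. The sketch below assumes that this is what the MAD statistic refers to here; the function name `mad_statistic` and the data are hypothetical.

```python
import statistics
from itertools import combinations

def mad_statistic(groups):
    """Mean of the absolute differences between all pairs of group means."""
    means = [statistics.mean(g) for g in groups]
    pairwise = [abs(a - b) for a, b in combinations(means, 2)]
    return statistics.mean(pairwise)

# Hypothetical data: three groups with means 2, 3, and 6.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
mad = mad_statistic(groups)
print(round(mad, 3))
```

A value of 0 would mean all group means are identical; larger values indicate group means that are further apart, so the statistic can be used in the 3S method with scrambled group labels.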
V 7 March 2014. Section 9.1-2 continued
* Understand the motivation behind the F statistic and how to calculate it
* Explain and calculate "degrees of freedom" and know how it figures into theory-based approaches
* Explain what the mean-square column in an ANOVA table means.
* Interpret what a significant result in an ANOVA means.
* Use bootstrapping to calculate confidence intervals for the statistics that compare means across multiple groups
* Construct post-hoc confidence intervals for differences between means
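The F statistic is the ratio of the between-group mean square to the within-group mean square, each obtained by dividing a sum of squares by its degrees of freedom. The sketch below follows the layout of a one-way ANOVA table; the function name `f_statistic` and the data are hypothetical.

```python
import statistics

def f_statistic(groups):
    """One-way ANOVA F = (SS_between / (k - 1)) / (SS_within / (N - k))."""
    k = len(groups)                           # number of groups
    n_total = sum(len(g) for g in groups)     # total sample size N
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                    for g in groups)
    ms_between = ss_between / (k - 1)         # mean square, df = k - 1
    ms_within = ss_within / (n_total - k)     # mean square, df = N - k
    return ms_between / ms_within

# Hypothetical data: three groups of three observations.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = f_statistic(groups)
print(round(f_value, 3))
```

In this made-up example the between-group mean square is 13 and the within-group mean square is 1, so variation between groups is large relative to variation within them.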
V Chapter 10
* Plot data with two quantitative variables
* Interpret the form of an association in a scatterplot, and interpret, measure, and test its direction and strength
* Terminology: see Chapter 10 glossary (terms relevant to sections 10.1-4)
V 10 March 2014. Section 10.1
* Identify observational units and variables in the case of paired data with two different variables
* Distinguish this case from the matched-pairs case of 4.3
* Construct a scatterplot from data
* Qualitatively describe the form, strength, and direction of an association from a scatterplot
* Identify any unusual observations in a scatterplot
* Construct the correlation coefficient statistic based on desired properties
* Infer the strength and direction of an association from a correlation coefficient
* Demonstrate why a correlation coefficient is agnostic about the form of an association
V 12 March 2014. Section 10.2-3
* Calculate a correlation coefficient from data and see how it summarizes strength and direction of an association and when it is valid
* Estimate correlations visually - practice with the correlation guessing game applet
* Use the 3S method to test hypotheses about the correlation coefficient
* Use bootstrapping to compute confidence intervals for the correlation coefficient
* Define the regression line in terms of how it summarizes the data
* Interpret the slope and intercept of such a line and relate the slope to the correlation coefficient
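The correlation coefficient and the least-squares regression line can both be computed from the same sums of squares and cross-products, which also makes the relationship slope = r * (SD of y / SD of x) visible. The function name `correlation_and_line` and the data below are hypothetical.

```python
import math

def correlation_and_line(x, y):
    """Return (r, slope, intercept) for paired quantitative data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx                   # same as r * sqrt(syy / sxx)
    intercept = y_bar - slope * x_bar   # line passes through (x_bar, y_bar)
    return r, slope, intercept

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, slope, intercept = correlation_and_line(x, y)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

The slope and the correlation always share a sign, but the slope depends on the units of the variables while the correlation does not.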
V 14 March 2014. Section 10.4
* Use the 3S method to test hypotheses about the regression slope
* Use bootstrapping to compute confidence intervals for the regression slope
V 14 March 2014. Conclusion
* Catalog the different types of simulation strategies we have used
* Integrate the different situations and strategies into a common framework using the 6-step method and the 3S strategy.
* Make a flowchart for statistical investigation based on data type and data shape (best to do as the course progresses)
* Distinguish between estimation and inference and explain the roles of each