Learning objectives

- Preliminaries
  - At a general/broad overview level:
    - Recall the 6 steps of the statistical method of investigation and why it is needed in science and other fields
    - Recognize that variation is pervasive
    - Explain why probability can be used to measure randomness
    - Recognize different ways of representing and summarizing data
    - Be able to identify observational units and variables in a dataset
  - Terminology: see Preliminaries glossary
  - 6 Jan 2014. Section P.1
    - Understand why anecdotal evidence is unreliable
    - Follow an example of the 6-step method of statistical investigation
    - Distinguish types of variables
    - Distinguish observational units and variables
    - Know the organization and expectations of the class
  - 8 Jan 2014. Section P.2
    - Explain why statistics is needed to interpret data
    - Describe distributions in terms of shape and basic statistics
    - Use graphs to describe and answer questions about data
  - 10 Jan 2014. Section P.3
    - Distinguish different sources of variability
    - Define probability in terms of long-run frequency
    - Practice simulation as a way to model processes
- Chapter 1
  - Be able to measure the strength of evidence in the case of a single binary variable
  - Justify a conclusion about data using p-values and standardized statistics
  - Terminology: see Chapter 1 glossary
  - 13 Jan 2014. Section 1.1
    - Distinguish between a statistic and a parameter
    - Explain each step of the 3S method for measuring the strength of evidence (statistic, simulate, strength)
    - Explain why a chance model can be used to evaluate a statement about real data
    - Justify using coin flipping, or simulation with the one-proportion applet, to model a 50/50 binary random process
    - Qualitatively compare real data to the outcome of a random process
  - 15 Jan 2014. Section 1.2
    - Simulate a non-50/50 binary random process
    - Use a random process to simulate the outcomes of a null hypothesis (a simulation sketch follows the Chapter 1 list)
    - Associate non-random processes with alternative hypotheses
    - Relate null and alternative hypotheses to the 3S method and the 6-step method of statistical investigation
    - Relate null hypothesis, null distribution, and random process
    - Define p-value
    - Graphically interpret a p-value using the null distribution
    - Use a p-value to make a statement about the strength of evidence
    - Become more comfortable with the difference between a statistic and a parameter
  - 17 Jan 2014. Section 1.3
    - Describe what the standard deviation is supposed to measure
    - Calculate the standard deviation of a set of data
    - Standardize a statistic using a null distribution
    - Use a standardized statistic to make a statement about the strength of evidence
  - 17-22 Jan 2014. Section 1.4 (homework)
    - Identify three factors that affect the strength of evidence
    - Explain how and why they affect it (i.e., whether they make the p-value and standardized statistic larger or smaller)
    - Be able to decide whether to use a one- or two-sided test based on the research question and prior knowledge
    - Link the decision to do a one-sided or two-sided test to how an alternative hypothesis is formulated
  - 22 Jan 2014. Section 1.5
    - Relate the theory-based alternative to the simulation part of the 3S method
    - Use the normal distribution to evaluate the strength of evidence
    - Memorize the formula for the standard deviation of the null distribution for sample proportions
    - Know when the theory-based approach is invalid and understand that the simulation-based approach is valid more often
    - Use the one-proportion applet to experiment and build your intuition about the central limit theorem
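The following is a minimal sketch of the 3S strategy for a single proportion, written in Python rather than with the course applets; the sample size and count are made up for illustration. It simulates the null distribution of a sample proportion by coin flipping, then reports a one-sided p-value, a standardized statistic, and the theory-based standard deviation from Section 1.5 for comparison.

```python
# Hypothetical example: 16 "successes" in 20 trials, null hypothesis pi = 0.5.
import numpy as np

rng = np.random.default_rng(1)
n, observed = 20, 16                 # made-up sample size and observed count
observed_phat = observed / n         # Statistic
null_pi = 0.5

# Simulate: null distribution of the sample proportion from coin flipping
sims = rng.binomial(n, null_pi, size=10_000) / n

# Strength of evidence: one-sided p-value and standardized statistic
p_value = np.mean(sims >= observed_phat)
z = (observed_phat - np.mean(sims)) / np.std(sims)

# The simulated SD should be close to the theory-based sqrt(pi(1-pi)/n) from Section 1.5
print(p_value, z, np.std(sims), np.sqrt(null_pi * (1 - null_pi) / n))
```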
- Chapter 2
  - Appreciate the properties of, and be able to generate, a simple random sample
  - Recognize biased sampling methods
  - Be able to critique the validity of a study based on its sampling scheme
  - Apply tests of significance to random samples from populations
  - Be able to measure the strength of evidence for a single quantitative variable
  - Distinguish type I and type II errors
  - Terminology: see the Chapter 2 section summaries
  - 24 Jan 2014. Section 2.1
    - Distinguish a sample from a population
    - Describe the relationship between samples, populations, statistics, and parameters
    - Distinguish between quantitative and categorical variables
    - Interpret a histogram
    - Identify the population in a description of a study design
    - Decide between different possible populations that a sample may represent
    - Describe how to use a sampling frame to generate a simple random sample
    - Relate bias to sampling methodology
    - Distinguish simple random samples, convenience samples, and proportional samples
    - Connect the sampling scheme to the generalizability of a study
  - 27 Jan 2014. Section 2.2
    - Understand how resistance to outliers can guide the choice of a summary statistic
    - Justify using the median vs. the mean as a statistic to summarize a set of data
    - Use the 3S strategy to draw inferences about a quantitative variable
    - Understand the relationship between the formulas for the z-statistic and the t-statistic
    - Use a theory-based method (one-sample t-test) to draw inferences about a population mean
  - 29 Jan 2014. Section 2.3
    - Relate p-values to significance levels
    - Use a significance level to draw a conclusion about the strength of evidence
    - Construct a "truth table" to define type I and type II errors
- Chapter 3
  - Estimate the size of the effect of a non-random process on the data
  - Justify a conclusion about data using a confidence interval
  - Master 4 different ways of constructing a confidence interval and relate them to each other
  - Explain how biases in sampling methods affect confidence intervals
  - Terminology: see the Chapter 3 section summaries
  - 31 Jan 2014. Section 3.1
    - Make the conceptual link between significance testing and whether a value is a plausible value for a parameter
    - Define a confidence interval
    - Relate a confidence level to a significance level
    - Explain the connection between confidence level and the range of plausible parameter values
  - 3 Feb 2014. Section 3.2
    - Relate the term margin of error to confidence intervals
    - Use the 2SD method (and its generalization, the Empirical Rule) to generate confidence intervals for population proportions, and know when this method is valid
    - Use the theory-based method to generate confidence intervals and to explain where the Empirical Rule comes from
    - Memorize the formula for the theory-based confidence interval for a proportion
  - 5 Feb 2014. Section 3.3
    - Use the Empirical Rule to generate a confidence interval for a population mean and know when this method is valid
    - Use the t-distribution (theory-based) to generate a confidence interval for a population mean and know when this is valid
  - 5 Feb 2014 (also). Section 3.4
    - Explain how and why sample size and confidence level affect the width of confidence intervals, and be able to justify this using the theory-based confidence interval formulas and simulations (a worked sketch follows this list)
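As a companion to Sections 3.2-3.4, here is a minimal sketch (with made-up counts, in Python rather than the course applets) of the 2SD and theory-based confidence intervals for one proportion, and of how quadrupling the sample size roughly halves the margin of error.

```python
# Hypothetical sample: 84 successes out of 120.
import numpy as np

n, successes = 120, 84
phat = successes / n
se = np.sqrt(phat * (1 - phat) / n)

# 2SD method: statistic +/- 2 standard deviations (from the Empirical Rule)
ci_2sd = (phat - 2 * se, phat + 2 * se)

# Theory-based 95% interval uses the normal multiplier 1.96 instead of 2
ci_theory = (phat - 1.96 * se, phat + 1.96 * se)

# Quadrupling n halves the standard error, so the margin of error shrinks by half (Section 3.4)
se_4n = np.sqrt(phat * (1 - phat) / (4 * n))
print(ci_2sd, ci_theory, 2 * se, 2 * se_4n)
```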
  - 7 Feb 2014. Bootstrap confidence intervals [not in textbook]
    - Understand how the relationship between sample and population is analogous to the relationship between statistic and parameter
    - Justify sampling with replacement from a sample to represent sampling from a population, based on this analogy
    - Connect the idea of sampling to the idea of bootstrapping
    - Use percentiles to construct a bootstrapped confidence interval
  - 10 Feb 2014. Section 3.5B
    - Understand the distinction between statistical and practical significance
    - Define the power of a test
    - Identify factors that affect the power of a test
    - Describe how power relates to the null hypothesis
- Chapter 4
  - Evaluate whether the design of a study allows causal conclusions to be drawn
  - Terminology: see Chapter 4 glossary
  - 12 Feb 2014. Section 4.1
    - Distinguish and identify explanatory and response variables in a study
    - Define and identify confounding variables
    - Explain how confounding variables affect the ability to draw causal conclusions
  - 19 Feb 2014. Section 4.2
    - Identify the two places where randomization can come into study design, and distinguish between them (random sampling and random assignment)
    - Explain why the processes of unit selection and explanatory group formation affect the scope of conclusions
    - Distinguish observational from experimental studies
    - Explain why causal conclusions cannot be drawn from observational studies (and any circumstances when they can)
  - 19-20 Feb 2014. Section 4.3 (homework)
    - Identify and calculate the statistic in a paired design study, and explain the logic of a paired design
    - Use a paired design to draw a conclusion
    - Identify where randomization comes into paired design studies
- Chapter 5
  - Apply the 6-step statistical investigation method to the case of data on two groups with one binary variable
  - Terminology: see Chapter 5 glossary; also sensitivity, specificity, positive predictive value, negative predictive value
  - 21 Feb 2014. Section 5.1
    - Construct a contingency table from a data table
    - Make a segmented bar graph from a data table
    - Identify observational units and variables in a two-variable dataset
    - Calculate conditional proportions from a contingency table
  - 21 Feb 2014. Section 5.2 (inference)
    - Apply the 3S method to a dataset that involves comparing two sample proportions (two groups, one binary variable)
    - Explain the logic of the scrambling strategy of simulation, and compare shuffling cards to the applet (a scrambling sketch follows the Section 5.3 list)
    - Relate the scrambling strategy of simulation to previous strategies
    - Use the two-way table inference applet to draw conclusions about two proportions
  - 24 Feb 2014. Section 5.2 (estimation)
    - Use the 2SD method to produce confidence intervals for the difference between two proportions
    - Use bootstrapping to produce confidence intervals for the difference between two proportions
    - Identify factors that affect p-values and confidence intervals in the two groups, one binary variable situation, and explain why they have their effects
    - Explain how confidence levels and significance levels affect the process of drawing conclusions
  - 24 Feb 2014. Section 5.3
    - Use the normal distribution to test for and estimate differences between two proportions
    - Relate the theory-based approach to the simulation-based approach
    - Explain the boxes in the two-proportion theory-based inference applet and use it to interpret the data in this chapter
    - Memorize and interpret the formulas for the two-proportion theory-based approach
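The scrambling strategy of Section 5.2 can be written out directly. The sketch below uses made-up group counts rather than the chapter's data and is not the two-way table applet itself; shuffling the group labels simulates the null hypothesis of no association between group and outcome.

```python
# Hypothetical binary outcomes: group A (n=30, 18 successes), group B (n=30, 10 successes).
import numpy as np

rng = np.random.default_rng(2)
outcomes = np.array([1] * 18 + [0] * 12 + [1] * 10 + [0] * 20)
groups = np.array(["A"] * 30 + ["B"] * 30)

def diff_in_proportions(labels):
    return outcomes[labels == "A"].mean() - outcomes[labels == "B"].mean()

observed = diff_in_proportions(groups)

# Scramble: reshuffling labels mimics "no association" between group and outcome
null_diffs = np.array([
    diff_in_proportions(rng.permutation(groups)) for _ in range(5_000)
])

# Two-sided p-value: how often a scrambled difference is as extreme as the observed one
p_value = np.mean(np.abs(null_diffs) >= abs(observed))
print(observed, p_value)
```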
  - 26 Feb 2014. Sensitivity, specificity, positive predictive value, negative predictive value
    - Understand how to interpret medical test results in the context of a 2x2 contingency table
    - Contrast the usefulness and intuitiveness of counts vs. proportions in interpreting 2x2 tables
- Chapter 6
  - Apply the 6-step method to data with two groups, one quantitative variable
  - Terminology: see Chapter 6 glossary
  - 28 Feb 2014. Section 6.1
    - Use dotplots and summary statistics to display, summarize, and compare distributions of quantitative data
    - Develop intuition about how variance within vs. between groups affects the ability to draw conclusions about differences between the groups
  - 28 Feb 2014. Section 6.2
    - Apply the 3S method to a dataset that involves comparing two quantitative sample statistics (two groups, one quantitative variable)
    - Use the 2SD method to produce confidence intervals for the difference between two statistics
    - Use bootstrapping to produce confidence intervals for the difference between two statistics
    - Identify some factors that affect p-values and confidence intervals in the two groups, one quantitative variable situation, and explain why they have their effects
    - Explain the options in the randomization test with quantitative response applet and use it to interpret data in this chapter
  - 3 Mar 2014. Section 6.3
    - Compare and contrast the t-distribution with the normal distribution
    - Explain why a t-distribution, and not a normal distribution, is used to compare the means of two groups
    - Use the t-distribution to test for and estimate differences between two means
    - Know the formula for the t-statistic
    - Relate the theory-based approach to the simulation-based approach
    - Use the theory-based inference applet to interpret data in this chapter
- Chapter 9
  - Develop statistics for comparing means across multiple groups
  - Utilize simulation-based approaches to compare several means
  - Understand the roles of within-group and between-group variability in assessing significance
  - Apply and interpret the results of the theory-based approach (the ANOVA F test)
  - Understand the link between the F distribution, F test, and F statistic
  - Realize that either simulation-based or theory-based approaches can be used to assess the significance of an F statistic
  - Consider follow-up multiple comparison analyses
  - Terminology: see Chapter 9 glossary
  - 3 and 5 March 2014. Section 9.1
    - Justify making a new statistical test vs. multiple pairwise tests
    - Invent a statistic for comparing means across multiple groups
    - Present a statistic comparing within- vs. between-group variation (the F statistic)
    - Use the 3S method to draw inferences about group differences using the F statistic
    - Motivate the MAD statistic in the case of quantitative response variables
    - Use the 3S method to draw inferences about group differences using the MAD statistic (a MAD sketch follows the Chapter 9 list)
  - 7 March 2014. Sections 9.1-9.2 continued
    - Understand the motivation behind the F statistic and how to calculate it
    - Explain and calculate "degrees of freedom" and know how they figure into theory-based approaches
    - Explain what the mean-square column in an ANOVA table means
    - Interpret what a significant result in an ANOVA means
    - Use bootstrapping to calculate confidence intervals for the statistics that compare means across multiple groups
    - Construct post-hoc confidence intervals for differences between means
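For Chapter 9, here is a minimal sketch of the MAD statistic (the mean of the pairwise absolute differences between group means) with a shuffling-based p-value. The data are invented for illustration and this is not one of the course applets.

```python
# Hypothetical quantitative responses in three groups of four observations each.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
values = np.array([4.1, 5.0, 3.8, 4.6, 6.2, 5.9, 6.5, 5.4, 7.8, 7.1, 8.0, 6.9])
groups = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

def mad_statistic(labels):
    # Mean of the absolute differences between all pairs of group means
    means = [values[labels == g].mean() for g in np.unique(labels)]
    return np.mean([abs(a - b) for a, b in combinations(means, 2)])

observed = mad_statistic(groups)

# Shuffle labels to simulate the null hypothesis of no group differences
null_mads = np.array([mad_statistic(rng.permutation(groups)) for _ in range(5_000)])
p_value = np.mean(null_mads >= observed)   # MAD is one-sided by construction
print(observed, p_value)
```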
- Chapter 10
  - Plot data with two quantitative variables
  - Interpret the form of an association in a scatterplot, and measure and test its direction and strength
  - Terminology: see Chapter 10 glossary (terms relevant to Sections 10.1-10.4)
  - 10 March 2014. Section 10.1
    - Identify observational units and variables in the case of paired data with two different variables
    - Distinguish this case from the matched-pairs case of Section 4.3
    - Construct a scatterplot from data
    - Qualitatively describe the form, strength, and direction of an association from a scatterplot
    - Identify any unusual observations in a scatterplot
    - Construct the correlation coefficient statistic based on desired properties
    - Infer the strength and direction of an association from a correlation coefficient
    - Demonstrate why a correlation coefficient is agnostic about the form of an association
  - 12 March 2014. Sections 10.2-10.3
    - Calculate a correlation coefficient from data, see how it summarizes the strength and direction of an association, and know when it is valid
    - Estimate correlations visually (practice with the correlation guessing game applet)
    - Use the 3S method to test hypotheses about the correlation coefficient
    - Use bootstrapping to compute confidence intervals for the correlation coefficient (a bootstrap sketch appears at the end of this outline)
    - Define the regression line in terms of how it summarizes the data
    - Interpret the slope and intercept of such a line, and relate the slope to the correlation coefficient
  - 14 March 2014. Section 10.4
    - Use the 3S method to test hypotheses about the regression slope
    - Use bootstrapping to compute confidence intervals for the regression slope
- 14 March 2014. Conclusion
  - Catalog the different types of simulation strategies we have used
  - Integrate the different situations and strategies into a common framework using the 6-step method and the 3S strategy
  - Make a flowchart for statistical investigation based on data type and data shape (best to do as the course progresses)
  - Distinguish between estimation and inference and explain the roles of each
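Finally, a minimal sketch of the percentile bootstrap interval for a correlation coefficient (Sections 10.2-10.3): resample (x, y) pairs with replacement and take the middle 95% of the resulting correlations. The data are invented; the analogous bootstrap for the regression slope would replace the correlation with the fitted slope.

```python
# Hypothetical paired quantitative data.
import numpy as np

rng = np.random.default_rng(4)
x = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 6.0, 7.2, 8.1])
y = np.array([2.3, 2.9, 4.1, 4.0, 5.8, 6.1, 7.5, 7.9])

def correlation(xs, ys):
    return np.corrcoef(xs, ys)[0, 1]

# Resample (x, y) pairs with replacement and recompute the correlation each time
n = len(x)
boot_rs = np.array([
    correlation(x[idx], y[idx])
    for idx in (rng.integers(0, n, size=n) for _ in range(5_000))
])

# Percentile method: middle 95% of the bootstrap distribution
ci = np.percentile(boot_rs, [2.5, 97.5])
print(correlation(x, y), ci)
```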