SREE Research Design Short Course
This workshop focuses on the design of experimental and quasi-experimental evaluation studies. The session emphasizes the Campbell-Stanley-Cook-Shadish validity framework and the use of threats to validity as a strategy to evaluate potential research designs.
The most widely used experimental and quasi-experimental designs, including nonequivalent control group designs and regression discontinuity designs, are examined. Modern methods of multivariate matching, such as propensity score methods, are investigated as well. Throughout the workshop, examples of evaluation research problems submitted by the participants will be used to illustrate the design process. Alternative designs for each problem are assessed to illustrate how design decisions can be made in light of the practical considerations that affect research programs.
Improving Generalization from Randomized Trials for Policy Purposes
Randomized trials provide the gold standard of internal validity for making causal inferences about the effects of interventions. However, randomized trials are seldom conducted using probability samples that might provide the same gold standard of generalizability (external validity).
This lecture discusses methods to quantify and improve the generalizability of findings from randomized trials conducted to inform policy, and illustrates these ideas with the FIRST trial. It begins by formalizing some subjective notions of generalizability in terms of estimating average treatment effects in well-defined inference populations. The problem is to use a study sample to estimate parameters of the distribution of treatment effects (e.g., the average treatment effect) in an inference population. When study samples are not probability samples, the inference process relies on matching the study sample to the inference population on a potentially large number of covariates that are related to variation in treatment effects. Then, the lecture outlines methods that can, under definable assumptions, yield estimates of the population average treatment effect that are unbiased (or nearly so), with a standard error that depends largely on how well the study sample matches the inference population. If the standard error is reasonably small, the study sample yields generalizable effects, but if it is large (or even infinite, as it can be), the evidence in the study sample has little or no generalizability to the inference population. Finally, the Flexibility In duty hour Requirements for Surgical Trainees (FIRST) trial is used to illustrate the use of these ideas.
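The abstract does not name a specific estimator, so the following is only a minimal sketch of one common approach, post-stratification, under stated assumptions. It assumes the study sample and the inference population have already been divided into the same strata (for example, by a propensity score for sample membership), and it reweights stratum-level effect estimates by each stratum's share of the population. The function names, stratum labels, and outcome values are hypothetical and chosen purely for illustration; they are not data from the FIRST trial.

```python
import math
from statistics import mean, variance


def stratum_effect(treated, control):
    """Difference in mean outcomes within one stratum, with its estimated variance.

    Returns (nan, inf) when the stratum has too few units to estimate a variance.
    """
    if len(treated) < 2 or len(control) < 2:
        return math.nan, math.inf
    effect = mean(treated) - mean(control)
    var = variance(treated) / len(treated) + variance(control) / len(control)
    return effect, var


def poststratified_ate(strata, population_shares):
    """Post-stratified estimate of the population average treatment effect.

    strata: dict mapping stratum label -> (treated outcomes, control outcomes)
    population_shares: dict mapping stratum label -> share of the inference
        population in that stratum (shares are assumed to sum to 1)
    Returns (estimate, standard_error).
    """
    estimate = 0.0
    var = 0.0
    for label, share in population_shares.items():
        if share == 0:
            continue  # stratum absent from the population contributes nothing
        treated, control = strata.get(label, ([], []))
        effect, v = stratum_effect(treated, control)
        if math.isinf(v):
            # Part of the population has no usable counterpart in the sample,
            # so the sample cannot speak to that part of the population.
            return math.nan, math.inf
        estimate += share * effect
        var += share ** 2 * v
    return estimate, math.sqrt(var)


# Hypothetical trial outcomes, grouped into two strata.
strata = {
    "low": ([12.0, 14.0, 13.0, 15.0], [10.0, 11.0, 9.0, 10.0]),
    "high": ([20.0, 22.0, 21.0, 23.0], [19.0, 20.0, 18.0, 19.0]),
}

# Case 1: the population is covered by the sample's strata.
estimate, se = poststratified_ate(strata, {"low": 0.7, "high": 0.3})

# Case 2: 20% of the population falls in a stratum the sample never reached.
estimate_gap, se_gap = poststratified_ate(
    strata, {"low": 0.5, "high": 0.3, "unsampled": 0.2}
)
```

In the first case the stratum effects (3.5 and 2.5) are weighted by the population shares rather than by the sample composition, and the standard error is finite. In the second case the sketch reports an infinite standard error, which corresponds to the situation the abstract describes where the study sample has little or no generalizability to the inference population.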
Improving Generalizations from Experiments:
This course is aimed at researchers of all levels who are interested in either making generalizations from large-scale experiments that have been completed or planning to conduct large-scale experiments.
It focuses on studies using a cluster randomized or multi-site design including many schools or school districts, and the instructors introduce methods for improving the external validity of these experiments.
The course begins with an overview of the larger issues of generalization, and then provides participants with tools to implement new methods for improving generalizations in their own experimental work. This includes tools for developing a strategic recruitment plan and for improving estimates of the average treatment effect. Participants should be familiar with large-scale experiments.
Topics covered in this course include:
- Validity and Overview: Review of validity types and approaches to increasing validity.
- Retrospective Generalizations: Illustrate methods for improving generalization from experiments that have already been conducted.
- Prospective Methods: Illustrate methods for planning experiments to increase their generalizability to policy-relevant populations.