Disease Detectives
2026 season
DESCRIPTION
Participants apply investigative and analytical skills to the scientific study of disease, injury, health, and disability in populations or groups of people. Unlike clinical medicine, which focuses on the health of individuals, Disease Detectives emphasizes population-level thinking: identifying patterns, comparing risks across groups, evaluating evidence, and recommending control and prevention strategies. Questions are designed to be process-oriented and assess how you evaluate, interpret, and synthesize information rather than how many pathogen facts you can recall.
This event rewards clear reasoning grounded in epidemiologic methods. You will define and classify cases, orient data by person–place–time, propose hypotheses consistent with the evidence, and select appropriate analytic approaches to test those hypotheses. You may be provided signs/symptoms, causative agents, or incubation periods as supportive background, but the task is to use such inputs to reason through the scenario—e.g., to frame a plausible transmission pathway, differentiate point-source from propagated outbreaks, or choose the most appropriate study design.
Overview
Applied epidemiology: outbreak investigations, surveillance, study designs, measures of disease frequency and association, screening, bias/confounding, and interpretation of public health data. Emphasis on data tables, epi curves, case definitions, and concise reasoning.
Background & Surveillance
(1) Clinical Approach vs Public Health Approach
At a glance, clinical medicine and public health share the same end goal—better health—but they operate at different levels, use different tools, and answer different questions. Disease Detectives is rooted in the public health approach: population‑level prevention and control based on data, causal inference, and program evaluation.
Clinical approach (health of individuals)
- Unit of action: an individual patient. Chief concerns are diagnosing, treating, and counseling the person in front of you.
- Data sources: patient history, physical exam, labs/imaging, differential diagnoses, clinical trials applicable to the patient’s condition.
- Outcomes: symptom relief, functional recovery, survival for the patient; risk–benefit at the bedside.
- Time horizon: immediate to short‑term care decisions, with longitudinal follow‑up per patient.
Public health approach (health of populations)
- Unit of action: groups and communities. The focus is patterns and causes of health states in populations, and which interventions reduce risk or harm across groups.
- Data sources: surveillance systems, registries, surveys, environmental monitoring, administrative datasets, field investigations, program data.
- Outcomes: incidence, prevalence, mortality, DALYs/QALYs, vaccination coverage, equity measures, and cost‑effectiveness at scale.
- Time horizon: anticipatory (prevention), rapid response (control), and sustained (policy/programs), with ongoing evaluation.
Key implications for Disease Detectives
- Emphasis on person–place–time, measures (risk, rate, prevalence), and measures of association (RR, OR, AR) rather than exhaustive pathogen trivia. If etiology or incubation periods are given, use them as clues to frame hypotheses and control measures.
- Choose designs that answer the public health question: e.g., a cohort or case–control study nested in an outbreak to quantify associations; surveillance to detect an aberration; program evaluation to assess impact.
- Translate findings into action: recommend feasible, proportionate control and prevention measures; consider timeliness, resources, and equity.
Concrete example
- Clinical lens: A physician evaluates a patient with acute gastroenteritis, orders stool testing, hydrates, and considers antibiotics by guidelines.
- Public health lens: Multiple similar cases emerge after a banquet. You define a case, build a line list, draw an epi curve, compute meal‑specific attack rates, identify the likely vehicle (e.g., cream pastries), and recommend immediate controls (discard leftovers, notify caterer, reinforce refrigeration and hygiene) while labs confirm the agent.
Another example: injury epidemiology
- Clinical: treat an individual’s fall‑related fracture.
- Public health: quantify fall incidence in older adults, identify risk factors (medications, hazards), and implement population‑level interventions (home safety assessments, strength/balance programs), then evaluate hospitalization and fracture trends post‑intervention.
Comparison summary
| Dimension | Clinical (Individual) | Public Health (Population) |
|---|---|---|
| Unit of analysis | Patient | Group/community |
| Primary goal | Diagnosis and treatment | Prevention, control, and health promotion |
| Typical data | H&P, labs, imaging | Surveillance, surveys, registries, field data |
| Measures | Patient outcomes | Incidence, prevalence, RR/OR/AR, mortality, VE |
| Action horizon | Immediate/short‑term | Short‑ to long‑term; program/policy cycles |
| Decisions driven by | Clinical judgment, guidelines | Evidence synthesis, causal inference, feasibility |
How to use this distinction on test day
- When presented with background clinical details (signs/symptoms, incubation), convert them into population‑level hypotheses (agent–host–environment triad, chain of transmission) and operational steps (case definition, orientation by person–place–time, analytic design).
- Keep conclusions population‑focused (e.g., “Food A has RR=6.2 suggesting strong association; recommend removing Food A and reinforcing temperature control while confirming with lab”) rather than patient‑specific advice.
- Frame answers around data: define denominators correctly, compute measures, and justify conclusions with the results.
(2) History and Development of Epidemiology
Epidemiology has roots in careful observation and comparative reasoning long before the field was named. Key milestones reveal recurring themes you’ll apply in competition: describing person–place–time, testing hypotheses with data, and acting to control risk.
Early foundations
- 17th century: John Graunt analyzed London Bills of Mortality, producing the first life tables and recognizing regularities in births/deaths—showing population data can reveal predictable patterns.
- 18th–19th centuries: James Lind’s scurvy trial (citrus) illustrated controlled comparisons; Pierre Louis’ numerical method emphasized data over anecdotes.
Cholera and natural experiments
- John Snow (1854) mapped cholera cases in Soho, London, and compared areas served by different water companies—an elegant natural experiment. Removing the Broad Street pump handle symbolized acting on evidence even before complete etiologic certainty. You’ll mirror Snow’s logic when you align epi curves, spot maps, and attack rates to support a source hypothesis.
Germ theory, sanitation, and vaccination
- Pasteur and Koch established microbial causes of disease; public health integrated laboratory confirmation with sanitation (water, sewage) and immunization programs. Today’s outbreak steps still emphasize verifying diagnosis/lab while implementing immediate controls when warranted.
20th century chronic disease epidemiology
- Framingham Heart Study (cohort) identified risk factors (hypertension, cholesterol, smoking); case–control studies (e.g., Doll & Hill) linked smoking to lung cancer. Modern epidemiology addresses infectious and non‑infectious causes alike—relevant in Disease Detectives when scenarios involve toxins, injuries, or environmental exposures.
Modern developments
- Causal criteria (Bradford Hill), confounding/effect modification, randomized trials, meta‑analysis, and implementation science refined how we infer causes and evaluate interventions. Surveillance expanded with electronic reporting and syndromic signals.
Competition takeaway
- Describe person–place–time; hypothesize plausible causes; choose appropriate designs; and conclude with actionable, feasible controls—echoing the field’s evolution from description to intervention.
(3) Roles of Epidemiology in Public Health and Steps in Solving Health Problems
Epidemiology supports public health through four core functions: assessment, policy development, assurance, and communication. In practice, you will cycle through a structured problem‑solving approach: define the problem, measure it, identify determinants, test hypotheses, implement controls, and evaluate impact.
Roles of epidemiology
- Assessment: quantify disease burden (incidence, prevalence, mortality), detect aberrations via surveillance, and characterize distribution by person–place–time.
- Causal inference: identify determinants and quantify associations (RR/OR/AR), accounting for bias/confounding; apply causal frameworks (Bradford Hill, sufficient‑component cause, DAGs—as scoped).
- Program/policy: translate evidence into interventions, prioritize by impact and feasibility, and recommend proportionate control measures.
- Evaluation: monitor effectiveness, timeliness, sensitivity/specificity of surveillance, and program outcomes; iterate based on data.
Steps in solving health problems (operational playbook)
- Define the problem precisely
- Establish case definition (clinical, lab, time and place criteria; confirmed/probable/suspected); verify diagnosis.
- Establish the existence of a problem relative to expected (baseline) data.
- Describe by person, place, time
- Construct a line list; produce tables/graphs (epi curve, spot/cluster maps); identify high‑risk groups and exposure windows.
- Develop hypotheses
- Use agent–host–environment triad and chain of transmission; consult background details (incubation, mode, reservoir) provided as context.
- Consider non‑infectious causes (toxic exposures, injuries) with appropriate analogs (dose, route, temporal relation).
- Evaluate hypotheses analytically
- Choose designs suited to the setting: cohort (risk/RR/AR) for known populations/meal cohorts; case–control (OR) for rare/undefined populations; cross‑sectional for prevalence snapshots.
- Compute and compare measures; stratify to assess confounding/effect modification; interpret confidence intervals when provided.
- Implement control and prevention measures
- Do not wait to implement urgent controls (remove source, isolation, hygiene, environmental remediation) when evidence is strong; balance timeliness with uncertainty.
- Address feasibility, acceptability, and equity; communicate risks clearly.
- Communicate and evaluate
- Report methods, key measures, and recommendations succinctly; maintain surveillance to assess impact; refine actions based on observed outcomes and stakeholder feedback.
Applied example
- School gastroenteritis cluster: define a case (onset of ≥3 GI symptoms within 24 h of the school lunch), build a line list of 80 students, epi curve peaks 2–4 h after lunch; a cohort study by menu item shows AR for egg salad 62% vs 8% (RR≈7.8); implement immediate controls (remove egg salad, temperature control, staff hygiene) while confirming Staph aureus enterotoxin; communicate findings and evaluate the decline in cases.
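To make the arithmetic concrete, here is a minimal sketch of the meal‑cohort calculation; the counts are hypothetical, chosen only to roughly reproduce the attack rates quoted above (62% vs 8%), and the function name is illustrative.

```python
# Minimal sketch of the school-cluster cohort calculation; counts are hypothetical,
# chosen to roughly match AR_exposed ~= 62%, AR_unexposed ~= 8%, RR ~= 7.8.

def attack_rate(cases: int, at_risk: int) -> float:
    """Incidence proportion: new cases / persons at risk in the group."""
    return cases / at_risk

exposed = {"cases": 34, "at_risk": 55}      # ate egg salad (hypothetical counts)
unexposed = {"cases": 2, "at_risk": 25}     # did not eat egg salad (hypothetical counts)

ar_e = attack_rate(**exposed)               # ~0.62
ar_u = attack_rate(**unexposed)             # 0.08
rr = ar_e / ar_u                            # ~7.7, in line with the RR ~= 7.8 above
risk_difference = ar_e - ar_u               # excess risk among the exposed

print(f"AR_exposed={ar_e:.2f}, AR_unexposed={ar_u:.2f}, RR={rr:.1f}, RD={risk_difference:.2f}")
```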
(4) Natural History and Spectrum of Disease; Infectious and Noninfectious Causes
Natural history of disease describes how a condition progresses in the absence of intervention. Understanding the timeline clarifies which prevention strategy (primary/secondary/tertiary) is feasible and when.
Core stages (general framework)
- Stage of susceptibility: risk factors present but no disease yet (e.g., low vaccination, unsafe water, unguarded machinery). Primary prevention acts here (vaccination, engineering controls, PPE, policy).
- Stage of preclinical/subclinical disease: after exposure/initial pathologic changes but before diagnosis; for infections, this includes incubation (exposure→symptom onset). For chronic disease, latent/asymptomatic periods can be long. Secondary prevention (screening, early detection) acts here.
- Stage of clinical disease: symptoms/signs lead to diagnosis and treatment. Tertiary prevention mitigates complications/disability.
- Outcome: recovery, disability, or death; consider rehabilitation and long‑term follow‑up.
Alignment of prevention levels with the timeline
| Timeline stage | Examples of prevention strategies |
|---|---|
| Susceptibility (before disease) | Vaccination; sanitation; safe food temperatures; guardrails; masking; ventilation; behavioral education; regulation |
| Preclinical/subclinical | Screening (e.g., BP, HbA1c); contact tracing/testing; targeted prophylaxis; surveillance triggers |
| Clinical | Timely treatment; isolation/cohorting; harm‑reduction; complication prevention |
| Outcome | Rehab, disability support, secondary prevention to avert recurrence |
Key time concepts
- Incubation period (infections): time from exposure to symptom onset. Helps identify exposure windows on epi curves.
- Latency (chronic/noninfectious and some infections): time from exposure to disease detectability/clinical diagnosis.
- Infectious period: when cases can transmit; may precede symptom onset in some pathogens.
- Iceberg phenomenon: identified cases are the visible tip; subclinical/asymptomatic infections increase the true burden.
Infectious causes (breadth and transmission)
- Agents: bacteria, viruses, fungi, protists, prions—each with differing reservoirs, survival, and control measures.
- Reservoirs: humans, animals, environment (soil, water).
- Transmission: direct (droplet, contact), indirect (fomites, vehicles—food/water), vector‑borne (mosquitoes, ticks), airborne (aerosols), vertical (mother→child).
- Chain of transmission: agent → reservoir → portal of exit → mode → portal of entry → susceptible host. Controls break links (e.g., source removal, hand hygiene, barriers, vaccination, vector control).
Noninfectious causes (accidents, exposures, toxicities)
- Injuries: mechanical, thermal, chemical, radiation. Prevention focuses on hazards, exposure opportunities, and human factors (engineering, enforcement, education).
- Chemical/toxic exposures: dose, duration, route (inhalation, ingestion, dermal), and timing drive risk; look for dose–response patterns and exposure opportunities (e.g., food item, workplace task, spill).
- Environmental events: heat waves, air pollution episodes; surveillance may use syndromic signals and environmental metrics.
- Cluster investigations: similar logic as infectious outbreaks—define case, person–place–time, test hypotheses linking exposure to outcomes; consider latency and competing explanations.
Applying the framework in scenarios
- If the epi curve suggests a tight point‑source exposure with an incubation compatible with Staph enterotoxin (short, typically 0.5–8 h), prioritize food vehicle removal and temperature control; tertiary care is supportive only.
- For suspected chemical intoxication (e.g., nitrite contamination): many cases with very rapid onset after a particular item; emphasize product traceback, immediate stop‑sale/recall, and medical toxicology guidance.
- For chronic clusters (e.g., suspected environmental carcinogen), emphasize latency, small‑number instability, and careful control selection; outline long‑term surveillance and exposure assessment rather than over‑interpreting a single snapshot.
(5) Core Epidemiologic Terms (CDC Principles Glossary – Practical Guide)
Know these terms well enough to use them precisely in answers.
Foundational distribution terms
- Endemic: constant presence of a disease/health condition within a geographic area or population group.
- Epidemic: occurrence in excess of expected baseline in a community/region (aka outbreak when more localized).
- Pandemic: epidemic spread over several countries/continents, usually affecting a large number of people.
Investigation and measurement
- Outbreak: increase in cases above baseline; often localized in time/place/population; triggers the 10 steps.
- Case: person meeting case definition (confirmed, probable, suspected).
- Attack rate (incidence proportion): new cases among a group ÷ total in that group during a specified period (common in meal cohorts).
- Risk vs Rate: risk is proportion over a period; rate uses person‑time in denominator.
- Incidence vs Prevalence: incidence is new cases; prevalence is existing cases (point or period).
- Case fatality ratio (CFR): deaths among cases ÷ total cases.
- Secondary attack rate: new cases among contacts ÷ susceptible contacts (excludes index cases and immune individuals if known).
Transmission and exposure
- Reservoir: habitat where an agent normally lives/multiplies (human, animal, environment).
- Vector: living intermediary (e.g., mosquito, tick) conveying an agent to a susceptible host.
- Vehicle: inanimate intermediary (food, water, air) carrying an agent.
- Fomite: contaminated object (doorknob, utensil) that transmits via contact.
- Index case (primary case): first case identified in an investigation (may not be the true source).
Temporal concepts
- Incubation period: exposure → symptom onset (infections).
- Latency period: exposure → disease detectability/diagnosis (often chronic).
- Serial interval: symptom onset in case A → symptom onset in case B infected by A.
- Infectious period: time during which a case can transmit to others.
Surveillance & test characteristics
- Surveillance: ongoing, systematic collection, analysis, interpretation, and dissemination of health data for action.
- Sensitivity/Specificity: test performance; sensitivity = proportion of people with the condition who test positive; specificity = proportion without the condition who test negative.
- Predictive values (PPV/NPV): depend on prevalence; as prevalence rises, PPV rises, NPV falls (holding Se/Sp fixed).
- Validity vs Reliability: validity = accuracy; reliability = precision/repeatability.
Bias and confounding (brief)
- Bias: systematic error (selection, information/misclassification).
- Confounding: a third variable distorts the exposure–outcome association (associated with exposure and outcome, not on the causal pathway). Address via design (randomization, restriction, matching) or analysis (stratification, adjustment).
Tip for answers: define terms once, then use them consistently (e.g., “incidence proportion (attack rate) among exposed = 28/45 = 0.62”).
(6) Surveillance: Role, 5‑Step Process, Types, and Attributes
Surveillance identifies and monitors health problems, detects aberrations, guides control measures, and evaluates interventions. For competition, you should be able to describe the cycle, distinguish types, and discuss attributes with concrete examples.
Role in identifying health problems
- Detect unusual increases (e.g., GI complaints exceeding baseline trigger an investigation).
- Characterize affected populations and trends (who/where/when).
- Inform resource allocation and immediate controls; evaluate program impact over time.
Surveillance cycle (5‑step framing)
- Case definition & data collection: define what counts; capture data via reports, labs, syndromic feeds.
- Data management & quality: clean, de‑duplicate, ensure completeness/validity.
- Analysis & interpretation: summarize by person–place–time; compute indicators and thresholds; compare to expected.
- Dissemination: timely alerts and routine reports to stakeholders (clinicians, health departments, public).
- Evaluation & improvement: assess attributes; refine processes, definitions, and data flows.
Types of surveillance
- Passive: routine reporting by providers/labs. Lower cost, broader coverage; may be less timely/complete.
- Active: public health actively solicits reports (contacting facilities). Higher completeness/timeliness; resource‑intensive.
- Sentinel: selected sites/providers monitor trends for specific conditions. Efficient signals; may not be fully representative.
- Syndromic: near‑real‑time symptoms/indicators (ED chief complaints, OTC sales, school absenteeism). Early signals; lower specificity.
- Laboratory‑based: culture/PCR reports; supports pathogen‑specific tracking, subtyping (e.g., PulseNet).
- Event‑based: media/social reports curated for unusual events; rapid but noisy.
Key attributes (be ready to discuss trade‑offs)
- Sensitivity: ability to capture cases/outbreaks; higher detection may raise workload and false alarms.
- Specificity/PPV: proportion of true events among detected; improves efficiency but may miss early signals.
- Timeliness: speed from event to action; crucial for fast‑moving outbreaks.
- Data quality: completeness, validity, and consistency.
- Simplicity: ease of operation; affects training and adoption.
- Flexibility: adaptability to new conditions, case definitions, or data sources.
- Acceptability: willingness of reporters/partners to participate; influenced by burden, feedback, and perceived value.
- Representativeness: extent to which data accurately describe occurrence across person–place–time.
- Stability: reliability and uptime of systems.
Aberration detection and thresholds (concept)
- Compare observed counts to expected baselines using historical means/SDs or control charts; simple rules (e.g., 2 SD above mean) can flag signals for further review.
- Subtyping and molecular surveillance (PFGE, SNP, WGS) link dispersed cases into clusters that would be invisible in aggregate counts.
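To illustrate the "2 SD above the historical mean" rule described above, here is a minimal sketch; the baseline counts and the current count are illustrative assumptions.

```python
# Minimal sketch: flag a weekly count exceeding the historical mean + 2 SD.
# Baseline and current counts are hypothetical.
from statistics import mean, stdev

baseline_weekly_counts = [12, 9, 14, 11, 10, 13, 12, 8]   # historical weeks (hypothetical)
current_week = 22                                          # observed count (hypothetical)

threshold = mean(baseline_weekly_counts) + 2 * stdev(baseline_weekly_counts)
if current_week > threshold:
    print(f"Signal: {current_week} exceeds threshold {threshold:.1f}; review line-level data.")
else:
    print(f"No signal: {current_week} is within the expected range (threshold {threshold:.1f}).")
```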
Test‑day guidance
- If asked to recommend surveillance enhancements, argue from attributes (e.g., “To improve timeliness and sensitivity for GI outbreaks, add syndromic feeds from ED triage and set near‑real‑time aberration thresholds; pair with active outreach when signals cross threshold”).
- When interpreting a surveillance table, specify numerator/denominator, timeframe, and baseline; comment on representativeness and possible biases (changes in testing, care‑seeking).
Outbreak Investigation
(5) Interpreting Epi Curves, Line Listings, Cluster Maps, Subdivided Tables; PFGE/SNP & PulseNet Concept
Line listing
- Tabular tool listing each case (rows) and key variables (columns): ID, onset date/time, age/sex, exposure(s), lab status, outcome. A good line list enables rapid filtering by PPT and supports constructing epi curves and 2×2 tables.
Epi curves
- Histogram of case onsets over time. Identify pattern: point source (sharp peak), continuous common source (plateau), propagated (successive waves). Use min/max incubation to back‑calculate likely exposure windows for point sources; look for secondary peaks indicating person‑to‑person.
Cluster/spot maps
- Plot cases by location to detect spatial clustering (e.g., same classroom, water line, food station). Combine with time to see spread patterns.
Subdivided (stratified) tables
- Present measures within strata (e.g., AR by exposure within age groups) to check for confounding/effect modification. If a crude association disappears or reverses after stratification, consider confounding; if associations differ markedly by stratum, consider effect modification.
PFGE/SNP/WGS and PulseNet
- Molecular subtyping (PFGE historically; SNP/WGS now) links cases with indistinguishable pathogen fingerprints, revealing clusters across jurisdictions. PulseNet is the US network that connects labs for outbreak detection; conceptually, it raises sensitivity for dispersed, low‑level outbreaks.
(6) Agent–Host–Environment Triad and Chain of Transmission
Triad
- Agent: infectious/toxic hazard. Host: susceptibility (age, immunity, behaviors). Environment: conditions enabling exposure/transmission (crowding, food handling, water systems). Use the triad to organize hypotheses and controls.
Chain of transmission (infections)
- Agent → reservoir → portal of exit → mode of transmission → portal of entry → susceptible host. Controls break links: source removal, hygiene, PPE, ventilation, vaccination, prophylaxis.
(7) Evaluate Data: AR, RR, OR (calculate, compare, interpret)
Attack rate (AR)
- AR_exposed = cases_exposed / total_exposed; AR_unexposed = cases_unexposed / total_unexposed.
- AR difference = AR_exp − AR_unexp; AR ratio (RR) = AR_exp / AR_unexp.
Risk Ratio (RR)
- Interpret as multiplicative change in risk. RR>1 suggests positive association; RR<1 suggests protective; magnitude and CI (if given) inform strength/precision.
Odds Ratio (OR)
- In case–control studies, OR approximates RR when disease is rare. Interpret similarly with CIs (if given). Avoid interpreting OR as risk unless it’s clearly rare‑disease context.
Decision notes
- Choose RR in cohorts; OR in case–control. Always check denominators and ensure mutually exclusive exposed/unexposed groups.
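A minimal sketch of these calculations from a standard 2×2 table follows; the cell counts reuse the banquet numbers worked later in this guide, and the function name is illustrative.

```python
# Minimal sketch: AR, risk difference, RR, and OR from a 2x2 table
# (a = exposed cases, b = exposed non-cases, c = unexposed cases, d = unexposed non-cases).

def two_by_two_measures(a: int, b: int, c: int, d: int) -> dict:
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "AR_exposed": risk_exposed,
        "AR_unexposed": risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "RR": risk_exposed / risk_unexposed,   # cohort measure
        "OR": (a * d) / (b * c),               # case-control measure
    }

# Banquet example used later in this guide: 94 exposed (58 ill), 146 unexposed (12 ill).
print(two_by_two_measures(a=58, b=36, c=12, d=134))
```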
(8) Causality: Bradford Hill vs Koch/Evans; Modern Models (DAGs, Sufficient/Component Cause, GRADE)
Bradford Hill criteria (guideposts, not a checklist)
- Strength: larger associations are less likely due to confounding.
- Consistency: observed by different people, places, circumstances.
- Specificity: single cause → single effect (limited in complex diseases).
- Temporality: exposure precedes outcome (the only essential criterion).
- Biological gradient: dose–response.
- Plausibility: consistency with biological knowledge (evolves over time).
- Coherence: does not contradict what is known.
- Experiment: reduction in risk when exposure removed/added.
- Analogy: similar exposures have similar effects.
Koch’s postulates (microbial causation) and Evans’ postulates extended these for infectious disease, but many pathogens/carriers do not obey strict specificity or culture rules. Modern practice integrates multiple lines of evidence.
Modern frameworks
- Sufficient/component cause model: multiple component causes complete a sufficient cause; explains multifactorial etiology.
- Directed acyclic graphs (DAGs): graphical causal models clarifying confounding, mediation, and colliders; justify adjustment sets (conceptual level for this event).
- GRADE: framework for rating evidence quality and recommendation strength; useful when weighing public health actions under uncertainty.
Test‑day use
- Emphasize temporality; point to dose–response if present; note consistency across subgroups or data streams (epi + lab + environment). State limits and residual uncertainty.
(9) Herd Immunity; R0, Re, Thresholds
Basic reproduction number R0
- Expected secondary cases from a primary case in a fully susceptible population (context‑specific).
- Herd immunity threshold (HIT): HIT ≈ 1 − 1/R0. If vaccine effectiveness (VE) < 100%, required coverage ≈ HIT / VE = (1 − 1/R0) / VE.
Effective reproduction number Re
- Re = R0 × S, where S is the susceptible fraction after immunity/interventions. Control aims for Re < 1.
- Interventions reduce R0 (e.g., contact reduction) and/or S (e.g., vaccination).
Interpretation in scenarios
- If R0≈4, HIT≈0.75. With VE=80%, coverage ≈ 0.75/0.8 ≈ 94% to achieve herd protection (simplified).
- Epi curves with successive, diminishing peaks may reflect Re crossing below 1 after controls.
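Under the simplified assumptions above (homogeneous mixing, all‑or‑nothing immunity), a minimal sketch of the threshold arithmetic; the function names are illustrative.

```python
# Minimal sketch of HIT, required coverage, and Re under simplified assumptions.

def herd_immunity_threshold(r0: float) -> float:
    """HIT = 1 - 1/R0: immune fraction at which Re = R0 * S drops to 1."""
    return 1 - 1 / r0

def required_coverage(r0: float, ve: float) -> float:
    """Approximate coverage needed when VE < 100%: HIT / VE (may exceed 1 if unattainable)."""
    return herd_immunity_threshold(r0) / ve

def effective_r(r0: float, susceptible_fraction: float) -> float:
    """Re = R0 * S; control aims for Re < 1."""
    return r0 * susceptible_fraction

print(herd_immunity_threshold(4.0))   # 0.75
print(required_coverage(4.0, 0.80))   # ~0.94, matching the interpretation above
print(effective_r(4.0, 0.20))         # 0.8 (below 1)
```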
(10) Study Design/Bias/Confounding; Control Groups; Confidence Intervals; Stratification & Adjusted Rates; Vaccine Effectiveness/Efficacy
Bias catalog (examples)
- Selection: non‑representative controls in case–control; differential loss to follow‑up in cohorts.
- Information: recall bias (cases remember exposures differently); interviewer bias; misclassification (non‑differential biases toward the null; differential can bias either way).
- Surveillance/testing changes: increased case finding after an alert inflates apparent incidence.
Confounding
- A confounder is associated with exposure and outcome, not on the causal path. Control with design (randomization, restriction, matching) or analysis (stratification, multivariable adjustment conceptually).
- Stratification: if crude RR=2.0 but stratum‑specific RRs≈1.1, confounding likely. If stratum RRs differ greatly (e.g., 1.1 vs 3.6), report effect modification rather than a pooled estimate.
Control/comparison groups
- Cohort: exposed vs unexposed from the same source population.
- Case–control: controls sampled from the population that produced the cases; comparable opportunity for exposure measurement.
Confidence intervals (interpretation)
- You are generally not expected to compute CIs unless the needed values are provided; interpret width (precision) and whether the null (1.0 for RR/OR) is included. Narrow CIs with large N suggest stable estimates; wide CIs warrant caution.
Adjusted rates (concept)
- Use age‑adjusted or stratum‑adjusted comparisons when distributions differ; prevents confounding by age when comparing communities/program effects.
Vaccine effectiveness vs efficacy
- Cohort (field): VE ≈ 1 − RR (risk among vaccinated / risk among unvaccinated).
- Case–control: VE ≈ 1 − OR (odds of vaccination among cases vs controls).
- Efficacy refers to ideal settings (e.g., trials); effectiveness reflects real‑world performance.
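A minimal sketch of both VE estimates; the counts are hypothetical (the cohort numbers match the worked VE example later in this guide) and the function names are illustrative.

```python
# Minimal sketch: VE from a cohort (1 - RR) and from a case-control study (1 - OR).

def ve_cohort(cases_vax: int, n_vax: int, cases_unvax: int, n_unvax: int) -> float:
    rr = (cases_vax / n_vax) / (cases_unvax / n_unvax)
    return 1 - rr

def ve_case_control(vax_cases: int, unvax_cases: int, vax_controls: int, unvax_controls: int) -> float:
    odds_ratio = (vax_cases / unvax_cases) / (vax_controls / unvax_controls)
    return 1 - odds_ratio

print(ve_cohort(5, 200, 20, 200))        # 0.75 -> VE ~ 75%
print(ve_case_control(30, 70, 60, 40))   # hypothetical counts -> VE ~ 0.71
```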
(11) Nationals Only: Control & Prevention Measures
Be prepared to suggest proportionate, feasible control and prevention strategies and to justify them with data and the agent–host–environment framework.
Examples by transmission
- Food/water vehicles: immediate source removal/recall; temperature control; cross‑contamination prevention; staff exclusion pending stool results where appropriate; facility inspection.
- Airborne/droplet/contact: isolation/cohorting, masking, ventilation improvements, hand hygiene, surface disinfection; targeted prophylaxis/vaccination when indicated.
- Vector‑borne: vector control (breeding site removal, larvicides), repellents, protective clothing, window screens, travel advisories.
- Environmental/chemical: cease exposure, containment/remediation, medical toxicology guidance, product traceback, regulatory notifications.
Programmatic considerations
- Timeliness and equity (ensure hard‑to‑reach populations are covered); resource constraints; communication strategies; monitoring for unintended consequences.
- Evaluate pre/post‑intervention indicators (incidence, AR among risk groups, Re proxies) to judge effectiveness; refine iteratively.
Patterns, Control, and Prevention
(1) Identify patterns and trends of epidemiologic data in charts, tables, and graphs
Reading principles
- Start with axes, units, and denominators. Label who/where/when. Check whether scales are linear vs logarithmic and whether multiple axes are used.
- Determine the baseline/expected and identify deviations (peaks, trends, seasonal cycles). Use moving averages to smooth noise if applicable.
- Compare strata (age, sex, site) side‑by‑side or stacked; look for cross‑overs suggestive of effect modification or programmatic shifts.
- In maps, note classification scheme (quantiles vs equal intervals) and whether rates (adjusted) or counts are shown; small‑number areas may show unstable rates.
Common patterns
- Secular trend: long‑term increase/decrease (e.g., multi‑year decline in smoking prevalence).
- Seasonal: periodic spikes (e.g., winter respiratory infections, summer heat illness).
- Cyclic: multi‑year oscillations (e.g., vector‑borne disease linked to climate cycles).
- Outbreak signals: abrupt point‑source peaks or sharp deviations from baseline; propagated versus continuous common‑source patterns are distinguishable on epi curves.
Actionable interpretation
- Link anomalies to plausible causes (policy change, intervention, behavior, environment).
- Recommend next analytical step (e.g., stratify by site; compute AR by exposure; inspect time windows against incubation periods).
- State uncertainty and data limits (missingness, changes in testing/reporting).
Worked micro‑example
- A monthly line chart shows diarrheal ED visits with clear summer peaks; a heat map shows highest rates in neighborhoods A and C. Hypothesis: warm‑weather water/food exposures, possibly recreational water. Suggest enhanced surveillance in July–September, targeted messaging, and inspection of high‑risk facilities in A and C.
(2) Calculate disease risk and frequency metrics from given data
Key formulas (use correct denominators/timeframes)
- Risk (cumulative incidence) = new cases / population at risk over a defined period.
- Proportion = part / whole (specify what the whole is).
- Incidence proportion (attack rate) = cases in outbreak group / total in that group during the exposure period.
- Incidence rate = new cases / person‑time at risk.
- Prevalence (point) = existing cases at time t / population at time t (period prevalence over interval as applicable).
- Mortality (death) rate = deaths in a population / population (often per 1,000 or 100,000) over a specified period.
- Cause‑specific mortality rate = deaths from cause X / population.
- Case fatality ratio (CFR) = deaths among cases / total cases of the disease.
- Frequency ratio (generic) = rate (or risk) in group A / rate (or risk) in group B (specify which measures used).
Illustrative calculation
Banquet attendees: 240 total; egg‑salad exposed = 94 (58 ill), unexposed = 146 (12 ill)
AR_exposed = 58/94 ≈ 0.617
AR_unexposed= 12/146 ≈ 0.082
RR = 0.617 / 0.082 ≈ 7.5 (strong association)
AR difference = 0.617 − 0.082 = 0.535 (excess risk among exposed)
Incidence rate example (person‑time)
10 incident cases over 50,000 person‑days → 10 / 50,000 = 2.0 × 10^-4 per person‑day (≈ 73 per 1,000 person‑years, i.e., about 7,300 per 100,000 person‑years)
Mortality metric example
Deaths from disease X in 2025: 120; population: 1,000,000 → cause‑specific mortality = 120 / 1,000,000 = 12 per 100,000
CFR example: 120 deaths / 3,000 cases = 4%
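A minimal sketch reproducing the rate and mortality arithmetic above; the numbers come from the examples, and the unit conversion assumes 365 days per person‑year.

```python
# Minimal sketch of the person-time, mortality, and CFR examples above.

def incidence_rate(new_cases: int, person_time: float) -> float:
    """New cases per unit of person-time."""
    return new_cases / person_time

rate_per_person_day = incidence_rate(10, 50_000)                  # 2.0e-4 per person-day
rate_per_100k_person_years = rate_per_person_day * 365 * 100_000  # ~7,300 per 100,000 person-years

cause_specific_mortality = 120 / 1_000_000 * 100_000              # 12 per 100,000
cfr = 120 / 3_000                                                 # 0.04 -> 4%

print(rate_per_person_day, round(rate_per_100k_person_years), cause_specific_mortality, cfr)
```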
Checklist for computations
- Define the population at risk precisely; exclude those not at risk when appropriate.
- Use consistent time windows; do not mix person‑time with simple counts.
- State units (per 100,000; per 1,000); round sensibly; show work.
(3) Strategies of Disease Control (apply to scenarios)
Frameworks to select controls
- Break the chain: source control (remove/recall, isolate), block transmission (hand hygiene, PPE, ventilation, safe water/food handling, vector control), protect susceptibles (vaccination, prophylaxis, shielding).
- Hierarchy of controls (for injuries/toxicants): elimination, substitution, engineering controls, administrative controls, PPE.
- Primary, secondary, tertiary prevention mapping onto the natural history timeline.
Scenario‑driven examples
- Foodborne point source: remove implicated item; temperature control; staff exclusion per policy; environmental inspection; reinforce cross‑contamination prevention.
- Droplet/airborne outbreak in school: early isolation/sent‑home policy, mask/ventilation improvements, cohorting; targeted vaccination if applicable; communicate symptoms and when to return.
- Vector‑borne cluster: larval habitat reduction, targeted adulticiding, repellents, protective clothing, community education; surveillance for vector indices.
- Chemical exposure: immediate cessation of exposure; product traceback/recall; environmental remediation; health advisories; medical toxicology consultation.
Evaluation of controls
- Specify measurable outcomes (incidence decline, AR in high‑risk subgroup, absenteeism trends).
- Monitor for unintended consequences (risk compensation, inequities).
- Iterate: maintain surveillance and refine measures based on observed effectiveness.
(4) Strategies for Prevention: Scope and Levels of Prevention
Scope of prevention
- Individual‑level: counseling, vaccination, prophylaxis, PPE use, clinical screening and early treatment.
- Community‑level: school/workplace policies, engineering controls (ventilation, food safety equipment), environmental remediation, vector control, safe road design.
- Population/policy‑level: regulations (food temps, seatbelts, smoke‑free laws), taxation/subsidies, surveillance mandates, communication campaigns, and equitable access initiatives.
Levels of prevention
- Primordial: prevent emergence of risk factors (e.g., safe built environments encouraging activity).
- Primary: prevent disease onset (vaccination, sanitation, masking, safe water/food, machine guards).
- Secondary: early detection and prompt treatment to halt progression (screening programs; contact tracing/testing).
- Tertiary: reduce complications/disability (rehab, secondary prophylaxis for recurrent events).
- Quaternary (sometimes cited): avoid over‑medicalization and harms from unnecessary interventions.
Mapping to the natural history
- Place strategies along the susceptibility → subclinical → clinical → outcome timeline. For fast outbreaks, primary prevention and immediate controls dominate; for chronic disease, primordial/primary measures often have greatest long‑term impact.
Choosing strategies in context
- Effectiveness and feasibility: prefer high‑impact, fast, feasible measures first (source removal, isolation) while planning durable changes (policy/engineering).
- Equity: ensure access to prevention across subgroups; tailor communication and delivery.
- Acceptability and unintended effects: anticipate barriers and risk compensation; design mitigations.
- Evaluation planning: define success metrics (incidence, AR in high‑risk groups, program coverage, timeliness) before implementation.
(5) Propose prevention strategies after cause determination
Structured proposal template
- Restate the determined cause succinctly (vehicle/vector/pathway, setting, high‑risk groups).
- Immediate controls (hours–days): remove source, isolation/cohorting, PPE, disinfection, advisories, targeted prophylaxis.
- Near‑term (days–weeks): training refreshers, checklists, supply fixes, engineering tweaks, targeted vaccination clinics, enhanced surveillance triggers.
- Long‑term (weeks–months+): policy updates, infrastructure upgrades, supplier standards/audits, routine monitoring, public communication plans.
- Equity/acceptability: address language, access, stigma; leverage trusted messengers.
- Evaluation: specify indicators, data sources, frequency, and responsible parties.
Example (foodborne at a banquet facility)
- Cause: cross‑contamination and temperature abuse of egg products.
- Immediate: discard implicated products; hold service; deep clean; staff exclusion pending evaluation; notify health department.
- Near‑term: retrain on time/temperature controls and cross‑contamination prevention; calibrate thermometers; require logs.
- Long‑term: revise SOPs; install blast chiller; supplier verification; quarterly audits; public scorecard.
- Evaluation: track facility‑linked complaints, inspection critical violations, temperature log adherence, and outbreak recurrence (target zero).
(6) Nationals Only: Assess strategies’ strengths/weaknesses; analyze pre/post data for effectiveness
Assessment framework
- Strengths: expected impact magnitude, timeliness, scalability, sustainability, equity improvements, cost‑effectiveness.
- Weaknesses/risks: feasibility constraints, acceptability barriers, resource demands, potential inequities, risk compensation, maintenance burden.
- Mitigations: phased rollout, targeted communication, subsidies/support, monitoring and rapid iteration.
Analyzing pre/post data (concepts)
- Simple pre/post: compare incidence/AR before vs after; beware confounding by secular trends/seasonality.
- Stratified pre/post: check key subgroups; ensure benefits are equitably distributed.
- Interrupted time series (conceptual): visualize level and slope changes after intervention; comment on robustness if trends persist.
- Difference‑in‑differences (conceptual): compare changes in an intervention group vs a comparable control group; state assumptions (parallel trends).
Reporting effectiveness succinctly
- “Following implementation of temperature logs and retraining at Facility X, weekly GI complaints declined from 14 to 3 (−79%) over 6 weeks, while matched facilities without intervention declined from 10 to 8 (−20%), suggesting a specific benefit of the intervention. Continued monitoring planned to confirm durability and seasonality control.”
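A minimal sketch of the pre/post comparison and a simple difference‑in‑differences using the illustrative Facility X numbers above; this assumes roughly parallel trends and ignores seasonality.

```python
# Minimal sketch: percent change and difference-in-differences for the Facility X example.

def pct_change(before: float, after: float) -> float:
    return (after - before) / before * 100

intervention = {"before": 14, "after": 3}   # weekly GI complaints at Facility X
comparison = {"before": 10, "after": 8}     # matched facilities without the intervention

change_int = pct_change(**intervention)     # ~ -79%
change_cmp = pct_change(**comparison)       # -20%

# Difference-in-differences on the absolute scale (assumes parallel trends).
did = (intervention["after"] - intervention["before"]) - (comparison["after"] - comparison["before"])
print(f"Intervention {change_int:.0f}%, comparison {change_cmp:.0f}%, DiD = {did} complaints/week")
```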
Key Definitions
- Case: individual meeting the case definition (confirmed/probable/suspected).
- Exposure: suspected risk factor preceding disease.
- Outcome: disease or health state of interest.
- Population at risk: individuals susceptible and under observation.
- Incubation period: time from exposure to symptom onset.
- Infectious period: time when a case can transmit the pathogen.
Measures and Core Calculations
- Risk (cumulative incidence) = cases / population-at-risk over specified period
- Incidence rate = new cases / person-time at risk
- Prevalence = existing cases / population (point or period)
- Odds = p / (1 − p)
- Attack rate (AR) = cases among exposed / total exposed (often outbreak meals/venues)
- Secondary attack rate = new cases among contacts / susceptible contacts
- Risk Ratio (RR) = [a/(a+b)] / [c/(c+d)]
- Odds Ratio (OR) = (a·d) / (b·c) (commonly in case–control)
- Attributable Risk (AR difference) = Risk_exposed − Risk_unexposed
- Attributable fraction among exposed (etiologic fraction) = (Risk_exposed − Risk_unexposed) / Risk_exposed
- Population Attributable Risk (PAR) = Risk_population − Risk_unexposed
- PAR% = (Risk_pop − Risk_unexp) / Risk_pop × 100
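A minimal sketch of the attributable‑risk family above; the risks are hypothetical (chosen to resemble the banquet example: ~62% in exposed, ~8% in unexposed, ~29% overall) and the function name is illustrative.

```python
# Minimal sketch: risk difference, attributable fraction among exposed, PAR, and PAR%.

def attributable_risk_measures(risk_exposed: float, risk_unexposed: float, risk_population: float) -> dict:
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "attributable_fraction_exposed": (risk_exposed - risk_unexposed) / risk_exposed,
        "PAR": risk_population - risk_unexposed,
        "PAR_percent": (risk_population - risk_unexposed) / risk_population * 100,
    }

print(attributable_risk_measures(risk_exposed=0.62, risk_unexposed=0.08, risk_population=0.29))
```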
2×2 Table (exposure vs. disease):
| | Disease + | Disease − | Total |
|---|---|---|---|
| Exposed | a | b | a+b |
| Unexposed | c | d | c+d |
| Total | a+c | b+d | N |
Screening metrics:
- Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP)
- PPV = TP / (TP + FP); NPV = TN / (TN + FN)
- As prevalence ↑, PPV ↑ and NPV ↓ (holding test characteristics fixed)
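A minimal sketch of the screening formulas, computed from a hypothetical test‑versus‑disease 2×2; the counts and function name are illustrative.

```python
# Minimal sketch: Se, Sp, PPV, NPV from TP/FP/FN/TN counts (hypothetical).

def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

# 100 diseased, 900 non-diseased; Se = 0.90, Sp = 0.95 -> PPV ~ 0.67, NPV ~ 0.99.
print(screening_metrics(tp=90, fp=45, fn=10, tn=855))
```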
Outbreak Investigation: 10 Steps
- Prepare for field work (confirm diagnosis, gather supplies, liaise)
- Establish the existence of an outbreak (observe vs. expected)
- Verify diagnosis (clinical, lab, rule out artifacts)
- Define and identify cases (case definition; confirmed/probable/suspected)
- Describe and orient data by person, place, time (line list; epi curve; spot maps)
- Develop hypotheses (source, transmission, risk factors)
- Evaluate hypotheses (analytic studies, e.g., case–control/cohort in outbreaks)
- Implement control and prevention measures (don’t wait to finish analysis)
- Communicate findings (briefs, stakeholders, press as needed)
- Maintain surveillance and evaluate effectiveness of interventions
Line list essentials: ID, onset date/time, demographics, exposures, outcomes; supports epi curve and stratified analysis.
Epi curve patterns: point source (sharp rise and fall), continuous common source (plateaued), propagated (successive waves).
Study Designs and When to Use
- Descriptive: person–place–time summaries; hypothesis generation.
- Cross-sectional: prevalence snapshot; association not temporality.
- Cohort: start with exposure; compute risks/RR/AR; prospective or retrospective.
- Case–control: start with outcome; sample controls; compute OR; efficient for rare diseases/long latency.
- RCTs: randomization; control confounding; ethical/feasibility constraints (rare in outbreak response).
Confounding vs. effect modification:
- Confounder: associated with exposure and outcome, not on causal path; distorts association. Control by randomization, restriction, matching, stratification.
- Effect modification (interaction): true difference in effect across strata; report stratum-specific estimates.
Bias types:
- Selection bias (non-representative sampling), information/misclassification bias (recall, interviewer, measurement), surveillance bias.
Stratification and Simpson’s paradox: a crude (combined) association can reverse or disappear when analyzed within strata—always check key strata (e.g., age, sex, site) and use a Mantel–Haenszel OR conceptually when needed.
Screening and Public Health Programs
- Purposes: early detection, reduce morbidity/mortality when treatment effective.
- Tradeoffs: sensitivity vs. specificity; false positives vs. false negatives depend on cutoffs.
- Parallel vs. serial testing: parallel increases sensitivity; serial increases specificity.
- Lead-time and length biases: can inflate apparent survival benefits without real mortality reduction.
Surveillance Systems
- Passive (routine reporting), active (field outreach), sentinel (selected sites), syndromic (symptom-based, near-real-time).
- Notifiable conditions: report per jurisdiction; timeliness, completeness matter.
- Use surveillance to detect aberrations, trigger investigations, and evaluate interventions.
Infectious Disease Dynamics
- Basic reproduction number R0: expected secondary cases in fully susceptible population.
- Effective reproduction number Re: R0 × S (fraction susceptible) with interventions/immunity.
- Herd immunity threshold (HIT) ≈ 1 − 1/R0; Vaccine effectiveness (VE) ≈ 1 − RR among vaccinated vs. unvaccinated (for disease outcomes).
- Serial interval vs. incubation period: serial is case-to-case onset gap; incubation is exposure-to-onset.
Strategy
- Build a compact formula sheet; include unit checks and typical denominators.
- Practice rapid 2×2, attack rate, and screening computations; annotate tables.
- Interpret epi curves and spot maps; write 1–2 sentence conclusions with caveats.
Practice Prompts
- Construct a case definition and classify 10 sample patients from a line list (confirmed/probable/suspected).
- Meal cohort: compute attack rates by item, AR difference/ratio, and identify the likely vehicle.
- Case–control: compute OR and interpret; check for confounding by age via stratified ORs.
- Screening: given sensitivity/specificity and prevalence, compute PPV/NPV and discuss tradeoffs.
- Epi curve: identify exposure pattern (point, continuous, propagated) and estimate likely exposure window.
Advanced topics and deep dives
Stratification, confounding, and effect modification (worked)
- Suppose crude OR = 2.0 for smoking→MI. Stratify by age: OR_young = 1.1, OR_old = 3.6, with very different smoking prevalence. The crude OR may be confounded by age. Report stratum-specific ORs if effect modification suspected; otherwise provide an adjusted estimate (Mantel–Haenszel) and justify.
Mantel–Haenszel OR (conceptual)
- For K strata: OR_MH ≈ Σ (a_k·d_k / n_k) / Σ (b_k·c_k / n_k). Use provided tables only if event scope includes it; otherwise, reason qualitatively that stratification changed the association.
Cochran–Mantel–Haenszel (CMH) test (stratified 2×2×K)
- Purpose: test the null of no common association between exposure and disease across K strata (e.g., age groups), controlling for stratification.
- Inputs per stratum k: 2×2 counts a_k, b_k, c_k, d_k; n_k = a_k+b_k+c_k+d_k.
- Expected count of a_k under H0: E[a_k] = ((a_k+b_k)(a_k+c_k)) / n_k.
- Variance: Var(a_k) = ((a_k+b_k)(c_k+d_k)(a_k+c_k)(b_k+d_k)) / (n_k^2 (n_k−1)).
- CMH statistic (with 0.5 continuity correction when appropriate): X^2_CMH = (|Σ (a_k − E[a_k])| − 0.5)^2 / Σ Var(a_k).
- Decision: compare X^2_CMH to a χ^2 distribution with 1 d.f. (or use the p-value). If any cell is small (<5), Fisher’s exact or an exact CMH test may be preferred; state the limitation if applicable.
- Interpretation: If significant and stratum-specific effects are similar (no strong interaction), conclude an overall association controlling for strata. If effects differ greatly, report effect modification, not a pooled effect.
Confidence intervals (if in scope): compute OR_MH, then use a provided variance formula for ln(OR_MH) to form 95% CI = exp( ln(OR_MH) ± 1.96·SE ). If variance not provided, state qualitative strength and consistency across strata.
Fisher’s exact (2×2) and small cells
- Use Fisher’s exact when any expected cell < 5 (common in case–control with rare outcomes). Report exact p-value if provided, or state that small counts limit chi-square validity.
Confidence intervals for OR and RR (log method)
- For OR = (a·d)/(b·c): SE[ln(OR)] ≈ √(1/a + 1/b + 1/c + 1/d); 95% CI = exp( ln(OR) ± 1.96·SE ).
- For RR = [a/(a+b)] / [c/(c+d)]: SE[ln(RR)] ≈ √( b/(a(a+b)) + d/(c(c+d)) ); 95% CI analogously.
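A minimal sketch of the log‑method intervals above; the counts reuse the banquet 2×2 and the function names are illustrative.

```python
# Minimal sketch: 95% CIs for OR and RR via the log method.
import math

def ci_or(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)

def ci_rr(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(b / (a * (a + b)) + d / (c * (c + d)))
    return rr, math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)

print(ci_or(58, 36, 12, 134))   # (point estimate, lower, upper)
print(ci_rr(58, 36, 12, 134))
```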
Bayes theorem and ROC (screening depth)
- Bayes: PPV = (Se·Prev) / (Se·Prev + (1−Sp)·(1−Prev)); NPV = (Sp·(1−Prev)) / ((1−Se)·Prev + Sp·(1−Prev)).
- ROC curve: tradeoff of Se vs 1−Sp as cutoff varies; AUC summarizes discriminative ability. For competition, describe qualitatively unless data are provided.
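A minimal sketch of the Bayes formulas, sweeping prevalence to show the PPV/NPV dependence noted earlier; the Se, Sp, and prevalence values are illustrative.

```python
# Minimal sketch: PPV and NPV from Se, Sp, and prevalence, across several prevalences.

def ppv(se: float, sp: float, prev: float) -> float:
    return (se * prev) / (se * prev + (1 - sp) * (1 - prev))

def npv(se: float, sp: float, prev: float) -> float:
    return (sp * (1 - prev)) / ((1 - se) * prev + sp * (1 - prev))

for prev in (0.01, 0.10, 0.30):
    print(f"prev={prev:.2f}: PPV={ppv(0.90, 0.95, prev):.2f}, NPV={npv(0.90, 0.95, prev):.2f}")
```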
Vaccine effectiveness (VE)
- Cohort: VE ≈ 1 − RR (risk among vaccinated / risk among unvaccinated).
- Case–control: VE ≈ 1 − OR (odds of vaccination among cases vs controls).
- Example (cohort): attack rate vaccinated 5/200 = 0.025; unvaccinated 20/200 = 0.10 → RR=0.25; VE≈75%.
Worked CMH numeric example
Two age strata:
Stratum Young (n1=60):
- Exposed: cases a1=18, non-cases b1=12
- Unexposed: cases c1=10, non-cases d1=20
- OR1 = (18·20)/(12·10) = 360/120 = 3.0
Stratum Old (n2=100):
- Exposed: a2=22, b2=18
- Unexposed: c2=25, d2=35
- OR2 = (22·35)/(18·25) = 770/450 ≈ 1.71
Pooled OR_MH ≈ Σ(a_k d_k/n_k) / Σ(b_k c_k/n_k) = (18·20/60 + 22·35/100) / (12·10/60 + 18·25/100) = (6 + 7.7) / (2 + 4.5) = 13.7/6.5 ≈ 2.11.
CMH test:
- E[a1] = ((a1+b1)(a1+c1))/n1 = (30·28)/60 = 14; Var(a1) ≈ ((30·30·28·32)/(60^2·59)) ≈ 3.80.
- E[a2] = (40·47)/100 = 18.8; Var(a2) ≈ ((40·60·47·53)/(100^2·99)) ≈ 6.04.
- Σ(a_k − E[a_k]) = (18−14) + (22−18.8) = 7.2; ΣVar ≈ 9.84.
- With 0.5 continuity correction: X^2 ≈ (|7.2|−0.5)^2 / 9.84 = 6.7^2 / 9.84 ≈ 4.56 → p≈0.03 (df=1) → significant pooled association controlling for age.
Heterogeneity note: OR1 (3.0) vs OR2 (1.71) differ but not dramatically; if strata ORs were very different or opposite, report effect modification instead of pooling.
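A minimal sketch that reproduces the worked numbers above (OR_MH ≈ 2.11, X^2_CMH ≈ 4.56); the function name and data layout are illustrative.

```python
# Minimal sketch: Mantel-Haenszel OR and CMH statistic (with 0.5 continuity correction)
# for a list of (a, b, c, d) strata.

def cmh(strata):
    num_or = den_or = diff = var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num_or += a * d / n
        den_or += b * c / n
        expected_a = (a + b) * (a + c) / n
        diff += a - expected_a
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    or_mh = num_or / den_or
    chi2 = (abs(diff) - 0.5) ** 2 / var
    return or_mh, chi2

or_mh, chi2 = cmh([(18, 12, 10, 20), (22, 18, 25, 35)])   # Young, Old strata from above
print(f"OR_MH ~= {or_mh:.2f}, X^2_CMH ~= {chi2:.2f}")      # ~2.11 and ~4.56
```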
Interpreting epi curves
- Point source: sharp rise and fall; narrow incubation distribution. Continuous: plateaued cases; exposures persist. Propagated: successive peaks ~1 incubation apart; consider person-to-person spread.
- Estimating the likely exposure window (point source): from (latest case onset − maximum incubation period) to (earliest case onset − minimum incubation period).
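A minimal sketch of this back‑calculation; the onset times and incubation bounds are hypothetical.

```python
# Minimal sketch: likely point-source exposure window from onsets and incubation bounds.
from datetime import datetime, timedelta

onsets = [datetime(2026, 3, 4, 18), datetime(2026, 3, 4, 22), datetime(2026, 3, 5, 6)]
min_incubation = timedelta(hours=6)    # hypothetical
max_incubation = timedelta(hours=24)   # hypothetical

window_start = max(onsets) - max_incubation   # latest onset minus maximum incubation
window_end = min(onsets) - min_incubation     # earliest onset minus minimum incubation
print(f"Likely exposure window: {window_start} to {window_end}")
```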
Bias catalog (examples and mitigations)
- Selection: volunteer bias, loss to follow-up → ensure comparable follow-up, use intention-to-treat where applicable.
- Information: recall bias (case–control), interviewer bias → blinding, standardized questionnaires.
- Misclassification: nondifferential tends to bias toward the null; differential can bias either way → validate measures.
- Surveillance: increased case finding in exposed group → harmonize ascertainment across groups.
Screening caveats
- Lead-time bias: earlier diagnosis inflates survival time without real mortality benefit.
- Length bias: screening overrepresents slow-progressing disease.
- Overdiagnosis: detection of indolent disease → weigh harms from false positives/over-treatment.
Infectious disease metrics
- Generation time vs. serial interval: generation time (infection→infection) unobserved; serial interval (onset→onset) observable proxy.
- Interventions lower Re below 1 to halt spread; interpret HIT with caution in populations with heterogeneous mixing.
Reporting templates
- One-paragraph abstract (background, methods, results with key metrics, recommendation). Bullet list of immediate controls (remove source, hygiene, isolation) and longer-term steps (policy, vaccination, surveillance changes).
Case study (end-to-end)
- A wedding outbreak: 240 attendees; line list constructed. Symptoms: onset peaked 2–4 h post-dinner; recovery within 24–36 h → suggests toxin-mediated illness. Meal cohort shows highest AR for “cream-filled pastries”: AR_e=0.62 (n=58), AR_u=0.08 (n=182). RR≈7.8 (95% CI calculation optional if scope allows). Control: discard leftovers, notify caterer, inspect kitchen, advise attendees. Lab: test for Staph aureus enterotoxin. Communicate with local health department; issue recommendations.
Study checklist (self-audit)
- Definitions memorized; formula sheet prepared; blank 2×2 and line list templates ready
- Practice with epi curves and meal cohort problems; screening PPV/NPV with varying prevalence
- Short-answer structures memorized (design choice, bias identification, confounding control)
References
- SciOly Wiki: https://scioly.org/wiki/index.php/Disease_Detectives
- CDC Principles of Epidemiology: https://www.cdc.gov/csels/dsepd/ss1978/index.html
- CDC Field Epidemiology Manual (selected chapters)
The Competition (Process Orientation Deepening)
Expect process‑oriented tasks that reward how you think: defining problems precisely, selecting correct denominators, computing and interpreting measures, and recommending proportionate controls. Content knowledge supports but does not replace structured reasoning.
Process checklist
- Define → Describe → Hypothesize → Analyze → Act → Evaluate → Communicate.
- State assumptions and data limits explicitly.
- Keep person–place–time top‑of‑mind; ensure your denominators match the populations being described.
Answer style
- Use concise sentences anchored in computed numbers: “AR_e=0.62 vs AR_u=0.08 (RR≈7.5) suggests egg salad as likely vehicle; remove item, audit temperature logs, and reinforce cross‑contamination prevention; confirm with lab.”
- When uncertain, outline next steps: “If WGS links cases across counties, expand traceback; if not, consider alternative exposures with similar timing.”