Updated: 5/22/2020

# Statistic Definitions

Measure of Central Tendency
• Mode
• defined as the value that occurs most often
• best for data which is allocated into distinct categories (nominal data)
• Median
• defined as the value that occurs at the middle of all values of the variable (half are greater, half are less)
• not affected by extreme values
• good for all levels of measurement except nominal data
• especially good for skewed distributions
• Mean
• defined as arithmetic average
• the most frequently used measure of central tendency
• uses all values of data
• highly sensitive to extreme values (especially skewed distributions)
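As a minimal sketch of how the three measures behave on skewed data, the snippet below uses Python's standard `statistics` module. The sample values are hypothetical and chosen only so that one extreme value pulls the mean away from the median.

```python
from statistics import mean, median, mode

# Hypothetical skewed sample: one extreme value (90) pulls the mean upward.
values = [2, 3, 3, 4, 5, 6, 90]

sample_mean = mean(values)      # uses every value, so it is sensitive to the outlier
sample_median = median(values)  # middle value, unaffected by the outlier
sample_mode = mode(values)      # most frequently occurring value

print(sample_mean, sample_median, sample_mode)
```

Here the median (4) stays close to the bulk of the data while the mean (about 16.1) does not, which is why the median is preferred for skewed distributions.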
Sensitivity
• Definition
• proportion of patients with the disease who have a positive test result
• Equation
• sensitivity = a / (a + c) or
• sensitivity = TP / (TP + FN)
• Relevance
• sensitive tests are useful for screening since they are unlikely to miss a patient with disease
• Example
• a new test is developed to quickly diagnose HIV.  There are 10 patients in the study group with the disease.  Upon testing of all 10 patients, only 6 results return positive.  What is the sensitivity of the new test?
• solution
• sensitivity = a / (a + c)
• sensitivity = 6 / 10
• sensitivity = 60%

| | disease pos | disease neg |
|---|---|---|
| test pos | true positive a (6) | false positive b |
| test neg | false negative c (4) | true negative d |
| TOTAL | 10 | b + d |

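The HIV example can be checked with a short sketch. The function name `sensitivity` and its parameter names are illustrative choices, not part of the original notes.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity = TP / (TP + FN), i.e. a / (a + c)."""
    return true_pos / (true_pos + false_neg)

# Values from the HIV example: 6 true positives, 4 false negatives.
result = sensitivity(true_pos=6, false_neg=4)
print(f"{result:.0%}")  # 60%
```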
Specificity
False Positive Rate
• Definition
• proportion of patients without the disease who have a positive test result
• Equation
• false positive rate  = b / (b + d)

| | disease pos | disease neg |
|---|---|---|
| test pos | true positive a | false positive b |
| test neg | false negative c | true negative d |

False Negative Rate
• Definition
• proportion of patients with the disease who have a negative test result
• Equation
• false negative rate = c / (a + c)

| | disease pos | disease neg |
|---|---|---|
| test pos | true positive a | false positive b |
| test neg | false negative c | true negative d |

Positive Predictive Value
Negative Predictive Value
• probability patient with a negative test actually has no disease
• dependent on prevalence of disease
• Equation
• NPV = d / (c + d) or
• NPV = TN / (FN + TN)
• Example
• 200 patients are enrolled in a study to evaluate the accuracy of an ELISA-based test for the diagnosis of influenza.  100 patients were diagnosed with influenza by the gold-standard method.  80 of the patients with influenza had a positive ELISA-based test, as did 5 of the patients without influenza.  What is the negative predictive value of this test?
• solution
• NPV = TN / (FN + TN)
• NPV = 95 / (20 + 95)
• NPV = 83%

| | disease pos | disease neg |
|---|---|---|
| test pos | true positive a (80) | false positive b (5) |
| test neg | false negative c (20) | true negative d (95) |
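The false positive rate, false negative rate, and negative predictive value can all be read off the same 2x2 table. The sketch below is one way to organize that, using the cell labels a, b, c, d from the tables above and the influenza counts. The class name `TwoByTwo` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # true positives  (test pos, disease pos)
    b: int  # false positives (test pos, disease neg)
    c: int  # false negatives (test neg, disease pos)
    d: int  # true negatives  (test neg, disease neg)

    def false_positive_rate(self) -> float:
        return self.b / (self.b + self.d)

    def false_negative_rate(self) -> float:
        return self.c / (self.a + self.c)

    def negative_predictive_value(self) -> float:
        return self.d / (self.c + self.d)


# Counts from the influenza example.
table = TwoByTwo(a=80, b=5, c=20, d=95)
fpr = table.false_positive_rate()        # 5 / 100
fnr = table.false_negative_rate()        # 20 / 100
npv = table.negative_predictive_value()  # 95 / 115
print(f"FPR={fpr:.0%} FNR={fnr:.0%} NPV={npv:.0%}")
```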

Likelihood Ratio
• Definition
• likelihood that a given test result would be expected in a patient with the target disorder compared to the likelihood that that same result would be expected in a patient without the target disorder
• Classification
• positive likelihood ratio
• definition
• describes how the likelihood of a disease is changed by a positive test result
• equation
• positive likelihood ratio = sensitivity / (1 - specificity)
• negative likelihood ratio
• definition
• describes how the likelihood of a disease is changed by a negative test result
• equation
• negative likelihood ratio = (1 - sensitivity) / specificity
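A short sketch of both likelihood ratios follows. The sensitivity (0.80) and specificity (0.95) are hypothetical values picked for illustration, and the function names are not from the original notes.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


# Hypothetical test characteristics.
lr_pos = positive_likelihood_ratio(sensitivity=0.80, specificity=0.95)
lr_neg = negative_likelihood_ratio(sensitivity=0.80, specificity=0.95)
print(round(lr_pos, 2), round(lr_neg, 2))
```

With these assumed inputs, a positive result multiplies the odds of disease by about 16, and a negative result multiplies them by about 0.21.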
Incidence
Prevalence
• Definition
• the total number of cases of a disease present in a population at a given point in time
Relative Risk
• Definition
• Equation
• incidence risk of YES = a / (a + b)
• incidence risk of NO = c / (c + d)
• relative risk = [a / (a + b)] / [c / (c + d)]

| Risk | Disease present | Disease absent |
|---|---|---|
| Yes | a | b |
| No | c | d |

• Example
• a study is performed concerning the relationship between blood transfusions and the risk of developing hepatitis C. A group of patients is studied for three years.

| Transfused | Hepatitis C | Healthy |
|---|---|---|
| Yes | 75 | 595 |
| No | 16 | 712 |

• solution
• disease incidence in transfused patients
• "YES" = 75 / (75 + 595) = 0.112
• disease incidence in patients not transfused
• "NO" = 16 / (16 + 712) = 0.022
• relative risk (RR) = 0.112 / 0.022 = 5.09
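The transfusion example can be reproduced with a small sketch. The function name `relative_risk` is an illustrative choice, and the arguments follow the a, b, c, d cell labels from the table.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = [a / (a + b)] / [c / (c + d)]."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Counts from the hepatitis C example.
rr = relative_risk(a=75, b=595, c=16, d=712)
print(round(rr, 2))  # 5.09
```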
Odds Ratio
• Definition
• represents the odds that an outcome will occur given a particular exposure, compared to the odds that the outcome will occur without the exposure
• obtained from case-control studies (retrospective)
• also obtained from the output of logistic regression models
• odds ratios approximate RR when the outcome is rare (usually defined as <10%)
• Equation
• OR = (a x d) / (b x c)

| Risk | Disease present | Disease absent |
|---|---|---|
| Yes | a | b |
| No | c | d |

• Example
• a study is performed concerning the relationship between blood transfusions and the risk of developing hepatitis C. A group of patients is studied for three years.

| Transfused | Hepatitis C | Healthy |
|---|---|---|
| Yes | 75 | 595 |
| No | 16 | 712 |

• Solution:
• OR = (75 x 712) / (595 x 16) = 5.61
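The same calculation as a minimal sketch, with a hypothetical function name `odds_ratio` and the cell labels from the table:

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a x d) / (b x c)."""
    return (a * d) / (b * c)


# Counts from the hepatitis C example.
or_value = odds_ratio(a=75, b=595, c=16, d=712)
print(round(or_value, 2))  # 5.61
```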
Number Needed to Treat
• Definition
• number of patients that must be treated in order to achieve one additional favorable outcome
• Equation
• number needed to treat = (1 / absolute risk reduction)
• Example
• you learn the number-needed-to-screen with FOBT is nearly 1000 to prevent colon cancer.  What is the absolute risk reduction associated with FOBT?
• solution
• absolute risk reduction (ARR) = 1 / number needed to treat
• ARR = 1 / 1000
• ARR = 0.1%
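Because the number needed to treat and the absolute risk reduction are reciprocals, either can be recovered from the other. The sketch below illustrates this with the FOBT figure. The function names are assumptions made for illustration.

```python
def number_needed_to_treat(arr: float) -> float:
    """NNT = 1 / absolute risk reduction (ARR given as a proportion)."""
    return 1 / arr


def absolute_risk_reduction(nnt: float) -> float:
    """ARR = 1 / number needed to treat."""
    return 1 / nnt


# FOBT example: roughly 1000 patients screened per colon cancer prevented.
arr = absolute_risk_reduction(1000)
print(f"{arr:.1%}")  # 0.1%
```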
Post-test Odds of Disease
• Equations
• post-test odds = (pre-test odds) x (likelihood ratio)
• likelihood ratio = sensitivity / (1 - specificity)
• pre-test odds = pre-test probability / (1 - pre-test probability)
• post-test probability = post-test odds / (post-test odds + 1)
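The conversion runs in three steps: probability to odds, odds multiplied by the likelihood ratio, then odds back to probability. The sketch below chains those steps. The pre-test probability (0.25) and likelihood ratio (16) are hypothetical inputs, and the function name is illustrative.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    # Step 1: convert the pre-test probability to odds.
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    # Step 2: multiply the pre-test odds by the likelihood ratio.
    post_test_odds = pre_test_odds * likelihood_ratio
    # Step 3: convert the post-test odds back to a probability.
    return post_test_odds / (post_test_odds + 1)


# Hypothetical inputs: 25% pre-test probability, likelihood ratio of 16.
probability = post_test_probability(pre_test_probability=0.25, likelihood_ratio=16)
print(round(probability, 3))
```

With these assumed inputs the probability of disease rises from 25% to roughly 84% after a positive result.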
Power
Effect size
Variance
• Definition
• an estimate of the variability of each individual data point from the mean
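As a rough illustration, variance can be computed with Python's standard `statistics` module. The data are hypothetical. Note the assumption about which variant applies: `pvariance` divides by n (population), while `variance` divides by n - 1 (sample).

```python
from statistics import pvariance, variance

# Hypothetical data with a mean of 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = pvariance(values)  # sum of squared deviations / n
sample_variance = variance(values)       # sum of squared deviations / (n - 1)

print(population_variance, round(sample_variance, 3))
```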
Type II Error (beta)
Type I Error (alpha)
Confidence Interval
Statistical Inference
• Definition
• used to test specific hypotheses about associations or differences among groups of subjects/sample data
• Classification
• Study types
• when comparing two means
• Student's t-test
• Mann-Whitney or Wilcoxon rank sum test
• used for non-parametric data
• when comparing proportions
• when comparing three or more groups
• Analysis of variance (ANOVA)

Choosing the Right Test

| Data type | Comparison | Parametric | Nonparametric |
|---|---|---|---|
| Continuous data | Two groups, paired | Dependent (paired) t-test | Wilcoxon signed-rank test |
| Continuous data | Two groups, unpaired | Independent t-test | Mann-Whitney U test |
| Continuous data | Three or more groups | Analysis of variance (ANOVA) | Kruskal-Wallis test |
| Categorical data | Two or more variables | Chi-square | Chi-square |
| Categorical data | Two or more variables (when sample size is small) | Fisher exact test | Fisher exact test |

Funnel Plot
• Definition
• a simple scatter plot of the intervention effect estimates from individual studies against some measure of each study’s size or precision, used to detect publication bias in meta-analyses
• Clinical Significance
• this method is based on the fact that larger studies have smaller variability, whereas small studies, which are more numerous, have larger variability
• thus the plot of a sample of studies without publication bias will produce a symmetrical, inverted-funnel-shaped scatter, whereas a biased sample will result in a skewed plot
Receiver Operating Characteristic (ROC) Curve
• Definition
• Variables
• False positive rate (1 - specificity)
• is plotted on the x-axis
• True positive rate (sensitivity)
• is plotted on the y-axis
• Interpretation
• Area under the ROC curve (C-statistic)
• used to compare different tests, higher C-statistics mean better diagnostic ability of test
• an area under the ROC curve of 0.5 indicates a test that performs no better than chance
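One common way to estimate the area under an ROC curve is the trapezoidal rule applied to (false positive rate, true positive rate) points. The sketch below assumes that approach. The function name `auc_trapezoid` and the example points are hypothetical.

```python
def auc_trapezoid(points):
    """Area under a curve given (false_positive_rate, true_positive_rate) points."""
    pts = sorted(points)  # order the points by false positive rate (x-axis)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighbouring points.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Diagonal line: a test that performs no better than chance.
chance_auc = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])

# Hypothetical ROC points for a more useful test.
test_auc = auc_trapezoid([(0.0, 0.0), (0.1, 0.6), (0.3, 0.8), (1.0, 1.0)])

print(chance_auc, round(test_auc, 2))
```

The diagonal yields an area of 0.5, while the hypothetical curve that bows toward the upper left yields about 0.8.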
Survivorship Analysis
• Overview
• often used to measure success of joint replacements
• analyzes data from patients with different lengths of follow-up
• for analysis, it is assumed that all patients had their operation simultaneously
• chance of implant surviving for a particular length of time is calculated as the survival rate
• calculation method is either life table or product limit method
• may be analyzed with the Kaplan-Meier method
• Life table method
• number of joints being followed and the number of failures are determined for each year after operation
• for each year of follow-up, failure rate is calculated from the number of failures and the ‘number at risk’
• annual success rate, determined from the failure rate, is cumulated to give a survival rate for each successive year
• this can change only once per year
• Product limit method
• same as life table method, but the survival rate is recalculated each time a failure occurs
Minimal Clinically Important Difference (MCID)
• the difference in outcome measures that will have clinical relevance
• difficult to study and measure; very few outcome tools have an established and universally accepted MCID
• helps to reconcile the statistical significance and clinical relevance of study results that use outcome tools