Think of a 400m track race, once around a standard outdoor track, and split it into 3 phases.
PHASE 1 - the start line
At the start line, runners are placed in lanes 1 through 8, each with its own staggered starting line that ‘balances’ the distance so every runner covers exactly 400m. If a runner starts ahead of or behind their designated line, they will run shorter or further than 400m. There could also be an error in placement or in identifying a false start. Equipment advantages may also be present (not generally meaningful in running). Athletes using performance enhancing drugs are also excluded before the competition. However, it is possible that some athletes are competing with performance enhancing drugs unknown to the race officials or sport governing body at the time. In other words, there are several known and unknown factors that may create an unfair advantage. These factors are managed primarily through ‘design’.
If, at the start of a randomized clinical trial (RCT), patients are not on average similar between treatment groups in their known and unknown characteristics, other than the assigned treatments, then the comparison is not fair from the beginning. The best design tactic to balance the groups at the start of the trial is to randomize patients in a truly random manner and ensure the allocation process is concealed. If randomization breaks down or the process is predictable, then patients may be selected IN or OUT of the trial, which may lead to results that do not reflect the truth (i.e. bias).
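To make the balancing idea concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the baseline ages are simulated, the function names `randomize` and `group_means` are made up, and the "predictable" scenario is a deliberately crude rule in which older patients are steered to control. It is not how any particular trial allocates patients.

```python
import random
from statistics import mean

def randomize(patient_ids, seed=None):
    """Simple randomization: each patient gets an arm by a fair coin flip."""
    rng = random.Random(seed)
    return {pid: rng.choice(["treatment", "control"]) for pid in patient_ids}

def group_means(values, allocation):
    """Mean of a baseline characteristic within each arm."""
    arms = {"treatment": [], "control": []}
    for pid, arm in allocation.items():
        arms[arm].append(values[pid])
    return {arm: mean(vals) for arm, vals in arms.items()}

# Hypothetical baseline ages for 2000 patients (simulated, not real data).
sim = random.Random(0)
ages = {pid: sim.gauss(60, 10) for pid in range(2000)}

# Scenario 1: truly random allocation.
random_allocation = randomize(ages.keys(), seed=1)
random_means = group_means(ages, random_allocation)

# Scenario 2: allocation is predictable, so older patients (assumed cutoff
# of 65) are steered into the control arm.
steered_allocation = {
    pid: "control" if age > 65 else "treatment" for pid, age in ages.items()
}
steered_means = group_means(ages, steered_allocation)

random_gap = abs(random_means["treatment"] - random_means["control"])
steered_gap = abs(steered_means["treatment"] - steered_means["control"])

print(f"age gap between arms, randomized: {random_gap:.2f} years")
print(f"age gap between arms, steered:    {steered_gap:.2f} years")
```

With random allocation the two arms end up with nearly the same average age, and the same would hold for characteristics nobody measured. In the steered scenario the arms differ before the race has even started.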
PHASE 2 - during the race
After the race has started (i.e. during the race), athletes are to remain in their lane. If they cut into another lane, they will run either shorter or further than 400m. The lanes must remain clear and unobstructed, and coaches, fans, or the timing crew must not interfere! Thankfully, interference is not common during a race, but one could imagine a 400m in which some runners face hurdles and others do not, and who the advantage would go to. If a runner is injured during the race, this will also impact the results.
Similarly, if a patient stops their treatment or starts another treatment during the conduct of a trial, this may affect their chances of experiencing the endpoint of interest. Moreover, patients cannot get extra help. For instance, if the health care provider knows which treatment a patient is receiving during the trial, they may be influenced to withdraw or add therapies. The injured runner is analogous to an adverse event, which does not necessarily cause bias but must be strictly monitored throughout the trial.
PHASE 3 - the finish line
At the finish line, the timing system must be accurate. A photo finish system is a more objective measure than hand timing! Moreover, if an athlete dropped out or did not finish the race, this affects both their time (it is missing) and the overall denominator of the rankings.
The measurement of the outcomes at the end of a trial must be done accurately and fairly. When the trial is complete, outcome data should be available for all patients enrolled in the trial. If patients drop out of the study and their data are missing, this may also introduce bias.
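A rough sketch of how missing outcome data can mislead, under loudly stated assumptions: the outcome scores are simulated, and the dropout rule (patients with a score below -0.5 have a 50% chance of leaving before measurement) is invented purely for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical outcome: change in a symptom score for 5000 patients in one
# arm (higher is better). Simulated values, not from any real trial.
outcomes = [rng.gauss(0, 1) for _ in range(5000)]

# Assumed dropout rule: patients doing poorly (score below -0.5) have a 50%
# chance of leaving the study before their outcome is measured.
observed = [y for y in outcomes if not (y < -0.5 and rng.random() < 0.5)]

full_mean = mean(outcomes)            # what we would see if everyone finished
complete_case_mean = mean(observed)   # what we see using only the finishers
dropout_rate = 1 - len(observed) / len(outcomes)

print(f"dropout rate:                {dropout_rate:.1%}")
print(f"mean if everyone finished:   {full_mean:.2f}")
print(f"mean among finishers only:   {complete_case_mean:.2f}")
```

Because the runners who "did not finish" were mostly the ones doing poorly, the finishers-only average looks better than the truth. If dropout had nothing to do with the outcome, the two averages would be close.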
The analogy is not perfect but I find it helpful and fun!
Stay tuned for a four-part series on Bias in RCTs.
Although this post is about RCTs, non-randomized (aka observational) drug effect studies are prone to many types of bias and lack the randomization mechanism to combat selection bias at baseline. Scientific methods posts will address several major threats to validity commonly seen in observational drug effect studies, including confounding by indication, collider bias, and immortal time bias.