Overcoming the Limitations of VAERS: A 5-Point Guide [Whitepaper]

Date: 2025-10-04

Jessica Rose, PhD & Liz Willner (OpenVAERS)

A plot generated using VAERS and Our World in Data (OWID) data to demonstrate overlap of neoplasm reports with daily COVID administration statistics. Shown is a 3-month offset: the neoplasm reports were entered with an approximate 3-month delay.

Abstract

The Vaccine Adverse Event Reporting System (VAERS) requires a comprehensive overhaul to enhance its functionality and reliability. This undertaking is complex and demands the expertise of skilled interface designers and data analysts intimately familiar with the system’s intricacies. As a critical repository of adverse event (AE) data spanning over three decades, VAERS is an invaluable resource, yet its outdated infrastructure necessitates a fundamental redesign. This whitepaper outlines the current limitations of VAERS and proposes a robust 5-point framework to modernize it, creating an automated system that seamlessly integrates with the existing platform while aligning with the FDA’s Adverse Event Reporting System (FAERS). Unlike VAERS, which relies on less standardized data collected primarily from healthcare providers and the public through web-based forms or manual entry, FAERS employs the ICH E2B(R3) standard for electronic submission of individual case safety reports (ICSRs), ensuring consistent and structured datasets. Our 5-point guide leverages FAERS’ data collection and processing methodologies, while incorporating cutting-edge AI tools to significantly enhance the quality, usability, and accessibility of VAERS data thus transforming it into a more effective and reliable public health tool.

Background

What is VAERS?

VAERS is a pharmacovigilance database created and implemented by the U.S. FDA and Centers for Disease Control and Prevention (CDC) in 1990 to receive reports of adverse events (AEs) that may be associated with biological products such as vaccines. (1) Pharmacovigilance is the process of collecting, monitoring, and evaluating AEs for safety signals to reduce harm and promote safety to the public in the context of pharmaceutical and biological agents. (2,3,4) Most vaccine AE reports in VAERS concern relatively minor events, such as injection site pain. (1,5,6,7) Other reports describe serious adverse events (SAEs), such as hospitalizations, life-threatening illnesses, or deaths. (1,5) The reports of SAEs are of greatest concern and are meant to receive the most scrutiny by VAERS staff and healthcare professionals. (8)

What is an AE/SAE?

An Adverse Event (AE) is defined as any untoward or unfavorable medical occurrence in a human study participant, including any abnormal physical exam or laboratory finding, symptom, or disease temporally associated with the participant’s involvement in the research, whether or not considered related to participation in the research. Based on the Code of Federal Regulations (9), an SAE is defined as any AE that results in death, is life threatening, or places the individual at immediate risk of death from the AE as it occurred, requires or prolongs hospitalization, causes persistent or significant disability or incapacity, results in congenital anomalies or birth defects, or is another condition which investigators judge to represent significant hazards. (5,8) The VAERS handbook states that approximately 15% of reported AEs are classified as serious (i.e., SAEs). (1,5)

What is VAERS for?

The primary purpose of the VAERS database is as a pharmacovigilance tool - to serve as an early warning or signaling system for AEs not detected during pre-market testing. The National Childhood Vaccine Injury Act of 1986 (NCVIA) (10) requires health care providers and vaccine manufacturers to report AEs to the DHHS following the administration of vaccines outlined in the Act. (5,6,7,11) Reported AEs, as part of the VAERS system, represent a fraction of the actual number of AE incidents, so the numbers reported are likely far lower than actual numbers. (12)

Who files AE reports?

VAERS reports can be made by nurse practitioners, general practitioners (~67%), (13) or family members, which can result in duplicate reports being made. As part of the VAERS Standard Operating Procedures for COVID-19 (SOP) published on January 29th, 2021, the CDC and the FDA are meant to perform routine VAERS surveillance to identify potential emergent safety concerns in the context of COVID-19 injectable products and to remove duplicate and false entries. (11) Accordingly, VAERS reports are received, processed, and managed by trained CDC contractors.

How are AE reports filed?

VAERS reports are primarily received online via web-based forms for subsequent review. Symptoms and diagnoses are assigned MedDRA standard codes, of which there are approximately 25,000. (1,14) Additional information, including hospital records and autopsy reports, is intended to be requested by trained staff when appropriate, as outlined in the SOP. The web-based system is sluggish, and completing a single adverse event report takes approximately 30 minutes. Once a report is filed, it is assigned a temporary VAERS ID until it is vetted by VAERS employees. Once vetted, it might receive a permanent VAERS ID and be uploaded to the front-end VAERS database. (1) Reports are often changed or deleted, which is highly problematic and, in some cases, unexplained. (2) It is unknown why some VAERS IDs are deleted, as these removals are not tracked within the system. When a person successfully files a report using the VAERS system, obtains a permanent VAERS ID, and subsequently dies, they are, in some cases, assigned a new VAERS ID if “re-reported”, effectively unlinking their reported AEs from their death record. Alternatively, the death report is filed as an adjunct to the main report; while it remains linked to the main report, the death is neither included in counts nor searchable.

It is important to note that there are two sets of VAERS ‘books’, which also contributes to data non-transparency and non-standardization. (15) Considering the amount of non-required demographic data collected, it is probable that the non-front-end VAERS database is far more comprehensive than the front-end database available for download and analysis.

Limitations of VAERS

Underreporting of AEs

Being a passive surveillance system, VAERS relies on voluntary submissions, capturing only a small fraction of actual incidents, which can lead to an incomplete picture of vaccine safety and potentially not capture emergent AEs/SAEs. (1,5, 12) Underreporting is the result of a multitude of factors including:

Psychological factors: Only a proportion of people will ever report an AE to a pharmacovigilance database (or have one reported for them). People often do not suspect that an emergent AE is associated with a vaccine that they received or administered.

Incentivization/de-incentivization to report an AE within the health care industry: If a hospital is sponsored/paid by a pharmaceutical company, for example, that hospital is contractually obligated to make sponsored products ‘look good’, and reporting SAEs in the context of promoted vaccines would run counter to this. It is also important to consider that government/Medicaid incentives could be bolstered.

Inherent bias: Many health care practitioners simply do not believe that vaccines can cause SAEs.

Time: Because filing a VAERS report takes time (~30 minutes online), many general practitioners will simply forgo this duty. Reimbursement for their time might provide a positive incentive for filing a VAERS report.

Inability to Establish Causality

AEs in VAERS do not prove that a vaccine caused the reported event, making it challenging to differentiate true vaccine reactions from unrelated health issues without further investigation. However, although VAERS is not designed to confirm causation, secondary assessments can and should be done using VAERS data to determine the probability of causal relationships.

Incomplete or Inaccurate Data

Submissions often lack full details, contain errors, or are unverifiable, which complicates analysis and can introduce biases into the dataset.

Mutability

As previously stated, VAERS data is subject to vetting and can be altered or even deleted once a VAERS report is filed. These changes can be significant, affecting MedDRA coding or removing key SYMPTOM_TEXT information.
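One way to make such changes visible is to compare successive downloads of the public dataset. The following is a minimal sketch, assuming each snapshot has already been loaded into a dictionary keyed by VAERS ID; the field name MEDDRA_TERM and all sample values are hypothetical and used only for illustration.

```python
from typing import Dict, List

# A snapshot maps a VAERS ID to that report's fields (field name -> value).
Snapshot = Dict[int, Dict[str, str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, object]:
    """List the IDs that were deleted, added, or changed between two snapshots."""
    deleted = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed: Dict[int, List[str]] = {}
    for vaers_id in sorted(set(old) & set(new)):
        before, after = old[vaers_id], new[vaers_id]
        # Record every field whose value differs or that exists on one side only.
        fields = sorted(
            name
            for name in set(before) | set(after)
            if before.get(name) != after.get(name)
        )
        if fields:
            changed[vaers_id] = fields
    return {"deleted": deleted, "added": added, "changed": changed}


# Hypothetical sample data: ID 102 disappears and the coding of ID 101 changes.
earlier = {
    101: {"SYMPTOM_TEXT": "Tongue swelling, hives", "MEDDRA_TERM": "Anaphylactic reaction"},
    102: {"SYMPTOM_TEXT": "Chest pain", "MEDDRA_TERM": "Chest pain"},
}
later = {
    101: {"SYMPTOM_TEXT": "Tongue swelling, hives", "MEDDRA_TERM": "Tongue disorder"},
    103: {"SYMPTOM_TEXT": "Headache", "MEDDRA_TERM": "Headache"},
}
print(diff_snapshots(earlier, later))
```

A comparison like this only shows that a record changed, not why; the reason would still have to come from the system itself.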

Passive Reporting System

Allowing submissions from the public, including non-experts, enables broad detection but also invites potential misuse, such as false or exaggerated claims driven by misinformation campaigns. The fact that VAERS is a passive reporting system also contributes to underreporting.

Reporting Bias and Stimulation

Media attention, litigation, or public awareness can inflate or deflate reports for certain vaccines, skewing perceptions of risk without reflecting true incidence rates, akin to a feedback loop amplifying noise over signal.

Lack of Denominator Data

Without comprehensive data on total vaccinations administered, it’s impossible to calculate accurate AE rates, limiting VAERS to signal detection rather than quantitative risk assessment.

Challenges with Multiple Vaccines

When several vaccines are given simultaneously, attributing an AE to a specific one becomes speculative, clouding insights into individual product safety profiles.

Overcoming the Limitations of VAERS – a 5-Point Guide

Revamping VAERS demands innovative protocols for reporting and educating both the public and healthcare professionals, with AI tools proving essential to this transformation. Key limitations - such as underreporting, incomplete or inaccurate data, the inherent reporting bias of VAERS’ passive system, and complexities in handling multi-vaccine data - can be effectively addressed through a strategic, comprehensive protocol. Moreover, establishing causal relationships from associations can be streamlined using robust methodologies like the Bradford Hill Criteria, proportional reporting ratios (PRRs), and Bayesian analyses when appropriate. Data mutability challenges can be fully resolved by leveraging AI-driven solutions and implementing a decentralized, encrypted data exchange system, which would also tackle underreporting and enhance the sharing and maintenance of electronic health records (EHRs) and datasets, breaking down barriers created by siloed data. Additionally, the issue of missing denominator data can be resolved by developing a comprehensive dataset of total vaccinations administered per state, seamlessly aligned with state-level VAERS reports, thereby significantly improving the system’s accuracy and reliability.

1. The online VAERS data collection system, which comprises 5 time-sensitive e-pages and 28 items, needs fundamental revision from both data and user-interface perspectives:

a. A user account setup should precede user/reporter reporting sessions, as is the case for FAERS. See the Safety Reporting Portal (SRP).

b. Reporting sessions should be savable, so that the user/reporter can update existing reports. This would solve the problem of multiple VAERS IDs being created for one individual in the case of death.

i. Sessions should be saved throughout the reporting process to prevent data loss during submission.

ii. Timeouts should be either lengthened or removed entirely.

iii. Users (who now have an account) can log back into a saved report to create updates.

c. Communication between VAERS employees (the vetting team) and the user/reporter should be improved.

d. Pre-filled/loaded data collection fields for drug names, etc. should be available to facilitate data searches and prevent typos. Pull-down menus are essential to this idea.

e. Pre-filled/loaded data collection fields should also be used for lot numbers and manufacturer names, with entries checked against the manufacturer’s pre-designated lot format. For example, in the case of Pfizer/BioNTech/Comirnaty COVID-19 shots, the format for a vaccine lot is 6 characters: two alphanumeric characters followed by 4 digits. This would also solve the problem of PAA inputs (vial allocation assignment numbers) being written in lieu of the vaccine lot number.

f. Dates should be entered in a single date structure (dd/mm/yyyy) using a date selector.

g. Consideration should be given to adding additional fields during the conversation between the reporter and the VAERS employee during vetting. For example, pregnancy, medications, allergies, other illnesses, chronic conditions, other vaccines given, and previous AE reports filed should be “required” data.
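The lot-format and date rules in items e and f could be enforced at entry time. The sketch below is illustrative only: the pattern encodes the 6-character example given above (two alphanumeric characters followed by four digits), the function names are hypothetical, and the sample lot values are made up rather than taken from real lots.

```python
import re
from datetime import date, datetime

# Assumed pattern for the example in item e: two alphanumeric characters
# followed by four digits, six characters in total.
LOT_PATTERN = re.compile(r"[A-Z0-9]{2}[0-9]{4}")


def is_valid_lot(raw: str) -> bool:
    """Return True if the entry matches the assumed 6-character lot format."""
    return LOT_PATTERN.fullmatch(raw.strip().upper()) is not None


def parse_report_date(raw: str) -> date:
    """Parse a day/month/year string; raise ValueError if it does not parse."""
    return datetime.strptime(raw.strip(), "%d/%m/%Y").date()


# Made-up entries: the first fits the format, the second is too long.
for entry in ("ab1234", "PAA123456"):
    print(entry, is_valid_lot(entry))
print(parse_report_date("04/10/2025"))
```

In a real form each manufacturer would need its own pattern, and a date selector would make the parsing step largely unnecessary; the check is shown here only to make the rule concrete.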

2. Incorporation of Modern Technologies and Data Manipulation

A critical component of the comprehensive overhaul of VAERS is the integration of an AI-assisted free-text analyzer. This advanced tool would meticulously analyze all data entered into the free-text box during adverse event (AE) reporting, extracting vital details such as mis-entered age data, pre-existing conditions, and other relevant information. These extracted data points would populate a standardized spreadsheet for each VAERS report, uniquely identified by a VAERS ID. The AI tool would further enhance accuracy by cross-referencing the submitted VAERS report with the free-text input to generate a precise and reliable final report. Additionally, it would streamline the identification of duplicate or false reports, alleviating the manual burden on VAERS staff currently tasked with this labor-intensive vetting process, thereby improving efficiency and data integrity.
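To illustrate the cross-referencing step, the sketch below compares an age found in the free text with the age entered in the structured form. It is a deliberately simple stand-in: a regular expression takes the place of the AI model, and the function names and status labels are hypothetical.

```python
import re
from typing import Optional

# Matches phrases such as "45-year-old", "45 year old", "72 yo", or "45 y/o".
AGE_PATTERN = re.compile(r"\b(\d{1,3})[\s-]*(?:years?|yrs?|y/?o)\b", re.IGNORECASE)


def extract_age(symptom_text: str) -> Optional[int]:
    """Return the first plausible age mentioned in the free text, if any."""
    match = AGE_PATTERN.search(symptom_text)
    if match is None:
        return None
    age = int(match.group(1))
    return age if age <= 120 else None


def reconcile_age(structured_age: Optional[float], symptom_text: str) -> dict:
    """Cross-reference the structured age field with the free-text narrative."""
    text_age = extract_age(symptom_text)
    if text_age is None:
        status = "no_text_age"
    elif structured_age is None:
        status = "filled_from_text"
    elif abs(structured_age - text_age) <= 1:
        status = "consistent"
    else:
        status = "conflict"  # flag for human review rather than overwrite
    return {"structured_age": structured_age, "text_age": text_age, "status": status}


print(reconcile_age(4.0, "A 45-year-old woman developed chest pain."))
```

A pattern this simple would misread phrases such as "3 years ago" as an age, which is exactly the kind of ambiguity a language model with context would be expected to handle better.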

2.1 The use of AI tools to recode existing reports

The application of MedDRA terms is haphazard and biased. For example, reports of tongue swelling and other symptoms of anaphylaxis have been coded as “Asthma, Hypersensitivity, Dysphonia, Tongue disorder, Paraesthesia oral”. AI tools could be used upon submission to accurately MedDRA code reports. It would also be possible to add ICD-10 codes to the system simultaneously. Without these important terms being accurately assigned, records “disappear” from data searches.

AI tools offer transformative potential for refining existing data within VAERS. As proposed, these tools can extract information from free-text fields (e.g., SYMPTOM_TEXT) to populate missing fields, such as manufacturer data or drug details, generating a standardized spreadsheet with a comprehensive variable list. This process ensures all fields are accurately completed, free of spelling, formatting, or date structure errors, enhancing data consistency and reliability. Beyond this, AI applications extend to numerous other functions, such as analyzing the foreign dataset to pinpoint the global location of a vaccine adverse event using the first two characters of the SPLTTYPE variable. These capabilities represent just a fraction of the many innovative ways AI can optimize VAERS data processing and usability.

2.2 Data storage and publishing

a. Create an electronic connection to VAERS to facilitate the reporting from large medical systems so that bulk uploads are possible. This would alleviate siloed data.

b. Create an AI tool to parse dumped data (data dump from free text) and move this data into distinct fields or require that the reporting system matches schemas.

c. Decentralize VAERS reports and reporting to prevent data mutability. VAERS IDs are often re-used or removed and this should never occur. Each record should have a distinct number that autoincrements. If a report is deemed non-releasable, then that VAERS ID should be unpublished with a written note; not removed and reused.

d. The full dataset should be made available to the public with all fields that are not identifying. This includes fields collected for race, pregnancy and type of reporting party.
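The ID rule in item c (distinct autoincrementing numbers, unpublishing with a written note instead of removal and reuse) can be sketched as a small in-memory registry. The class and method names are hypothetical, and a production system would sit on a database rather than a Python dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ReportRecord:
    report_id: int
    payload: dict
    published: bool = True
    unpublish_note: Optional[str] = None


@dataclass
class ReportRegistry:
    """IDs only ever increase; records are unpublished, never removed or reused."""

    _records: Dict[int, ReportRecord] = field(default_factory=dict)
    _next_id: int = 1

    def add(self, payload: dict) -> int:
        report_id = self._next_id
        self._next_id += 1
        self._records[report_id] = ReportRecord(report_id, payload)
        return report_id

    def unpublish(self, report_id: int, note: str) -> None:
        if not note.strip():
            raise ValueError("a written note is required to unpublish a report")
        record = self._records[report_id]
        record.published = False
        record.unpublish_note = note

    def public_view(self) -> List[dict]:
        """Every ID stays visible; unpublished ones show the note, not the payload."""
        view = []
        for record in self._records.values():
            if record.published:
                view.append({"id": record.report_id, "payload": record.payload})
            else:
                view.append({"id": record.report_id, "note": record.unpublish_note})
        return view


registry = ReportRegistry()
first = registry.add({"symptom": "headache"})
second = registry.add({"symptom": "fever"})
registry.unpublish(second, "Duplicate of an earlier report")
third = registry.add({"symptom": "rash"})
print(first, second, third)
```

Because the counter never moves backwards, an unpublished ID leaves a visible gap with its note attached, which is the audit trail the current system lacks.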

3. Causality assessment from VAERS Data

Establishing causal relationships from VAERS data is streamlined when associations are identified, making VAERS a powerful tool for detecting unusual patterns or clusters that may indicate vaccine safety concerns. These insights enable the CDC and FDA to collect critical data, triggering further investigation, such as causality assessments, when potential issues arise. In this context, “cause” or “causality” refers to a specific antecedent event, condition, or characteristic necessary for the occurrence of a disease at a given moment, assuming other conditions remain constant, while sufficient cause encompasses the minimal conditions and events that inevitably lead to disease. In medical evaluations, causality is typically assessed using the Bradford Hill criteria or adapted methodologies tailored to the causal question. (17) These criteria and reasoned approaches include evaluating the strength of association, consistency, specificity, temporality, dose-response, plausibility, coherence, experimental evidence, analogy, and reversibility, ensuring a robust framework for determining causal links with precision and reliability. Below is a short-list of some of the Bradford Hill criteria and what is required to satisfy each condition:

a. Consistency - Consistent findings observed by different persons in different places with different samples strengthen the likelihood of an effect.

b. Specificity - Causation is likely if there is a very specific population at a specific site and disease with no other likely explanation. The more specific an association between a factor and an effect is, the greater the probability of a causal relationship.

c. Temporality - The effect has to occur after the cause (and if there is an expected delay between the cause and expected effect, then the effect must occur after that delay).

d. Dose-response - Greater exposure should generally lead to greater incidence of the effect. However, in some cases, the mere presence of the factor can trigger the effect. In other cases, an inverse proportion is observed: greater exposure leads to lower incidence.

e. Plausibility - A plausible mechanism between cause and effect is helpful (but Hill noted that knowledge of the mechanism is limited by current knowledge).

f. Reversibility - If the cause is removed, then the effect should disappear as well.

VAERS data has historically proven instrumental in identifying safety signals, notably prompting the withdrawal of the rotavirus vaccine due to a detected intussusception signal in children, later confirmed as causally linked to the vaccine. (18) Beyond such examples, VAERS employs methods like the proportional reporting ratio (PRR) (19) and Bayesian criteria to assess associations and detect potential safety signals. As outlined in section 2.3.1 of the CDC’s Standard Operating Procedures, the PRR is a key metric for identifying safety signals in VAERS, calculated as

PRR = [a/(a+b)]/[c/(c+d)]

where a is the number of reports of the specific adverse event (AE) for a given vaccine; b is the number of reports of all other AEs for that vaccine; c is the number of reports of the specific AE for all other vaccines; and d is the number of reports of all other AEs for all other vaccines.

This straightforward calculation yields a PRR value, where a result greater than 1 (or 2, depending on the threshold) indicates that the AE is reported proportionally more often for the vaccine than for the comparison vaccines, suggesting a potential side effect warranting further investigation. When PRR exceeds 1, a causality assessment using the Bradford Hill criteria is mandated to rigorously evaluate the potential link, ensuring a robust and systematic approach to vaccine safety monitoring.

N.B. CDC is charged with finding causal reports in VAERS. Rochelle Walensky testified before the House Select Subcommittee on the Coronavirus Pandemic on October 2, 2023 stating the following: “We at CDC have a responsibility to comb through every single one of them, to review the medical charts, and see if they are related.” If so, then a VAERS report with a validated causal link to a vaccine should be submitted along with the report to Congress via the Public Health Service Act or via the Advisory Committee on Immunization Practices (ACIP), detailing which reports have been found to be causal within VAERS with their associated causal analysis.
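As a worked example of the PRR formula, the sketch below computes the ratio from the four counts a, b, c, and d defined above. The counts are invented purely to show the arithmetic and do not come from VAERS.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR = [a/(a+b)] / [c/(c+d)], using the a, b, c, d defined in the text."""
    if a + b == 0 or c + d == 0:
        raise ValueError("each group needs at least one report")
    if c == 0:
        raise ValueError("PRR is undefined when the comparison group has no reports of the AE")
    return (a / (a + b)) / (c / (c + d))


# Invented counts: the AE makes up 30 of 1,000 reports for the vaccine of
# interest (3%) and 60 of 12,000 reports for all other vaccines (0.5%).
prr = proportional_reporting_ratio(a=30, b=970, c=60, d=11940)
print(round(prr, 2))
```

Here the AE accounts for a six times larger share of reports for the vaccine of interest than for the comparison vaccines, so the PRR is about 6 and would exceed either threshold mentioned above.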

4. Addressing Mutability via Blockchain

Data mutability issues in VAERS can be addressed by implementing OP_RETURN within a decentralized and encrypted data exchange system, ensuring secure and immutable data handling. In the U.S., health data is collected through the National Health Interview Survey (NHIS), VAERS, and the Vaccine Safety Datalink (VSD), utilizing surveys, electronic health records (EHRs), and adverse event (AE) reports. While NHIS and VAERS are centrally managed by federal agencies, VSD relies on a distributed network of servers operated by participating healthcare organizations, such as Kaiser Permanente, resulting in centralized coordination but fragmented infrastructure. This structure is highly inefficient, as sharing and maintaining EHRs and datasets across siloed entities is cumbersome, and patients lack access to their own data. This lack of transparency and accessibility is a critical flaw that must be addressed. A decentralized system would empower patients with control over their data, enhance interoperability, and streamline data sharing, fundamentally transforming the efficiency and equity of health data management.

Empowering individuals with ownership of their clinical datasets on decentralized platforms would revolutionize data analytics and research by enabling seamless collaboration across diverse institutions. This approach would facilitate the identification of novel therapeutic targets, robust validation of hypotheses, and accelerated translation of scientific discoveries into clinically viable therapies and technologies, fostering innovation and enhancing patient outcomes. (20) With decentralized storage of data, health data would be stored off-chain on the InterPlanetary File System (IPFS) for scalability, with its metadata (e.g., the IPFS content identifier (CID) or hash) anchored on a blockchain via OP_RETURN (lightweight metadata embedded in a blockchain transaction) to ensure immutability. IPFS is a decentralized, peer-to-peer protocol and network designed for storing and sharing data in a distributed file system. This decentralized system would completely ameliorate the mutability issue. Due to system transparency, reporting would likely become incentivized, and under-reporting would be less of an issue, since the system would be more community-driven (more awareness, less fear of ‘liability’). In a word, decentralized reporting would enable rapid, user-driven drug/prodrug safety assessments. (21) A use-case: If an individual suffers an AE post-vaccination, the event is reported through a DApp (decentralized application), encrypted, and stored on IPFS. The OP_RETURN field in a blockchain transaction records the event’s hash and metadata, ensuring immutability. If a secondary AE occurs, such as death, it is similarly recorded, maintaining a complete and transparent record.
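The anchoring step in this use-case can be sketched as follows. This is an illustration under stated assumptions, not a working DApp: encryption and IPFS storage are out of scope, a plain SHA-256 digest stands in for the IPFS content identifier, the 4-byte tag is a hypothetical protocol marker, and chaining each event to the previous digest is one possible way to keep a later event (such as a death) linked to the original report.

```python
import hashlib
from typing import Optional

OP_RETURN = 0x6A          # Bitcoin script opcode marking a data-carrying output
MAX_DIRECT_PUSH = 75      # a single-byte push opcode covers payloads of 1 to 75 bytes
PROTOCOL_TAG = b"AER1"    # hypothetical 4-byte tag identifying these records


def report_digest(encrypted_report: bytes, previous_digest: Optional[bytes] = None) -> bytes:
    """SHA-256 of the encrypted report, chained to the previous event if given."""
    return hashlib.sha256((previous_digest or b"") + encrypted_report).digest()


def op_return_script(digest: bytes) -> bytes:
    """Build the output script: OP_RETURN, a push-length byte, then the payload."""
    payload = PROTOCOL_TAG + digest
    if len(payload) > MAX_DIRECT_PUSH:
        raise ValueError("payload too large for a single-byte push")
    return bytes([OP_RETURN, len(payload)]) + payload


# Placeholder byte strings stand in for encrypted report blobs.
initial_event = report_digest(b"<encrypted AE report>")
follow_up_event = report_digest(b"<encrypted death report>", previous_digest=initial_event)
print(op_return_script(initial_event).hex())
print(op_return_script(follow_up_event).hex())
```

Any later alteration of either stored report would produce a different digest than the one anchored on-chain, which is what makes changes detectable.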

Decentralized, secure, seamless, scalable, and sustainable health-related data storage with patient control is the way of the future, and deploying it is imperative to remain current and revolutionary.

5. Connection with State Systems

Addressing the lack of denominator data in VAERS can be effectively resolved by developing a comprehensive, age-stratified dataset of total vaccinations administered per state. As VAERS reports already allow users to specify the state where a vaccine was administered, this information can be seamlessly matched with the state-level vaccination dataset. To ensure accuracy and consistency, this data should align with CDC state-level records. By leveraging decentralized platforms and AI tools, both vaccination and adverse event (AE) data can be made available in real time, enabling dynamic, transparent, and precise monitoring of vaccine safety across populations.
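Once report counts and doses administered share the same keys, a reporting rate follows directly. The sketch below assumes both datasets are keyed by state and age group; the function name, the age bands, and every number are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (state, age group)


def reporting_rates(
    reports: Iterable[dict], doses: Dict[Key, int], per: int = 100_000
) -> Dict[Key, float]:
    """AE reports per `per` doses administered, for each state and age group."""
    counts = Counter((r["state"], r["age_group"]) for r in reports)
    rates = {}
    for key, administered in doses.items():
        if administered <= 0:
            continue  # no denominator, so no rate can be calculated
        rates[key] = counts.get(key, 0) * per / administered
    return rates


# Invented sample data.
sample_reports = [
    {"state": "CA", "age_group": "18-49"},
    {"state": "CA", "age_group": "18-49"},
    {"state": "CA", "age_group": "18-49"},
    {"state": "TX", "age_group": "50-64"},
]
sample_doses = {("CA", "18-49"): 150_000, ("TX", "50-64"): 50_000, ("TX", "18-49"): 80_000}
print(reporting_rates(sample_reports, sample_doses))
```

Because of underreporting, a figure like this is a reporting rate rather than an incidence rate, but it does allow states and age groups to be compared on a common scale.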

Conclusion

Revamping VAERS requires integration of existing infrastructure with new AI-driven solutions, decentralized data systems to prevent siloed data, and comprehensive vaccination datasets to address underreporting, data mutability, and missing denominator data. These innovations will enhance reporting accuracy, streamline causality assessments, and empower individuals with data control. Ultimately, a modernized VAERS ensures robust vaccine safety monitoring and fosters transparent, efficient health data management.

