Artificial intelligence is rapidly modernising pharmacovigilance (PV). It now offers powerful capabilities for processing Individual Case Safety Reports (ICSRs), detecting safety signals, and predicting adverse drug reactions.

However, as AI systems become more sophisticated, many rely on complex “black-box” models. These models can produce highly accurate results, but their internal reasoning is often difficult for humans to interpret. This raises an important question:

How can PV professionals and regulators trust decisions they cannot clearly understand?


Confidence Scores as a First Layer of Explainability

In practice, many organisations developing AI tools for PV (such as case intake systems, literature monitoring tools, or signal detection platforms) provide a confidence score alongside each piece of information extracted from structured and unstructured data sources.

Confidence scores are widely recognised as a component of an explainable AI (xAI) framework. In fact, CIOMS Working Group XIV explicitly lists confidence scores and trust scores as examples of explainability techniques.

In real-world PV workflows, organisations often use confidence score thresholds (for example, requiring a score of ≥ 0.8) as a key validation and control mechanism for automated processes.

This approach is extremely valuable because it allows systems to automatically flag predictions or extractions that fall below an acceptable certainty level. Those cases can then be routed for human review.

By providing a transparent, case-by-case indication of how confident the model is in its output, confidence scores help enable a Human-in-the-Loop (HITL) governance model. They guide reviewers on when they can rely on the AI and when closer human oversight is needed.
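The threshold-based routing described above can be sketched in a few lines of Python. This is a conceptual illustration only: the 0.8 cut-off and the dictionary fields are assumptions for this example, not a standard.

```python
# Hypothetical sketch of a Human-in-the-Loop confidence gate.
# The 0.8 threshold and the dictionary fields are illustrative assumptions.

THRESHOLD = 0.8  # minimum confidence for fully automated handling

def route_extraction(extraction: dict) -> str:
    """Route an AI-extracted data point based on the model's confidence."""
    if extraction["confidence"] >= THRESHOLD:
        return "auto_accept"   # sufficient certainty: proceed automatically
    return "human_review"      # below threshold: escalate to a reviewer

cases = [
    {"field": "adverse_event", "value": "nausea", "confidence": 0.93},
    {"field": "suspect_drug", "value": "ibuprofen", "confidence": 0.71},
]

for case in cases:
    print(case["field"], "->", route_extraction(case))
```

The low-confidence drug extraction is routed to a reviewer while the high-confidence event extraction passes through, which is exactly the control mechanism the validation threshold is meant to provide.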


Why Explainable AI Matters for Regulatory Compliance

To meet the expectations of global regulators such as the EMA and FDA, organisations should design AI systems that are transparent, interpretable, and auditable.

Two widely used explainable AI frameworks that support this goal are:

  • SHAP (SHapley Additive exPlanations)
  • LIME (Local Interpretable Model-agnostic Explanations)

These techniques help uncover the reasoning behind AI predictions, making them especially valuable in regulated environments like pharmacovigilance.

The following sections provide a practical overview of how these tools work and how they can be implemented responsibly.


Understanding the Tools: SHAP and LIME

When an AI model’s internal logic becomes too complex for humans to interpret directly, explainable AI techniques help generate plausible explanations of how the model reached a decision.

Both SHAP and LIME aim to answer one key question:

“Why did the AI make this particular prediction?”


SHAP (SHapley Additive exPlanations)

SHAP is a model-agnostic framework that provides local, per-prediction explanations for tabular, image, and text data. These local explanations can also be aggregated into global insights about overall model behaviour.

In Simple Terms

SHAP is based on cooperative game theory.

Imagine the AI’s prediction is the final score of a game, and each data feature—such as patient age, symptoms, or laboratory results—is a player contributing to that score.

SHAP calculates exactly how much each feature contributed to the prediction. A feature can either increase or decrease the likelihood of the final outcome.

Example in Pharmacovigilance

In pharmacovigilance, SHAP has been used to explain supervised machine learning models for signal validation classification.

When an AI system evaluates a potential safety signal, SHAP can generate a feature contribution report. This report shows how much weight the algorithm assigned to different variables when determining whether a signal should be considered valid.

For example, the model might reveal that:

  • Time-to-onset contributed strongly to signal detection
  • Patient age had moderate influence
  • Co-reported medications slightly reduced the signal probability

This transparency helps human reviewers understand—and trust—the model’s reasoning.
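To make the game-theoretic intuition concrete, here is a self-contained sketch that computes exact Shapley values for an invented signal-scoring function built around the three factors above. The model and its numbers are illustrative assumptions only; in practice the `shap` library computes these contributions efficiently for real ML models.

```python
from itertools import combinations
from math import factorial

# Toy "signal score" driven by three case features. The function and its
# weights are invented for illustration, not taken from any real model.
def toy_model(features: dict) -> float:
    score = 0.1  # baseline score with no features present
    if features.get("time_to_onset_short"):
        score += 0.4  # strong positive contribution
    if features.get("elderly_patient"):
        score += 0.2  # moderate positive contribution
    if features.get("many_comeds"):
        score -= 0.1  # slight negative contribution
    return score

def shapley_value(feature: str, all_features: dict) -> float:
    """Average marginal contribution of `feature` over all coalitions."""
    others = [f for f in all_features if f != feature]
    n = len(all_features)
    total = 0.0
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            with_f = {f: all_features[f] for f in subset + (feature,)}
            without_f = {f: all_features[f] for f in subset}
            total += weight * (toy_model(with_f) - toy_model(without_f))
    return total

case = {"time_to_onset_short": True, "elderly_patient": True, "many_comeds": True}
for f in case:
    print(f, round(shapley_value(f, case), 3))
```

Because the toy model is additive, each Shapley value recovers exactly the weight the model assigned to that feature (+0.4, +0.2, -0.1), mirroring the kind of feature contribution report described above.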


LIME (Local Interpretable Model-agnostic Explanations)

LIME is designed to explain individual predictions from complex models by approximating them with simpler, interpretable models.

In Simple Terms

Instead of explaining the entire AI system, LIME focuses on one specific prediction.

It zooms in on a single case and builds a simple explanation showing which parts of the data influenced that decision the most.

Classic Example

A well-known experiment by Ribeiro et al. (2016) used LIME to analyse an image classifier trained to distinguish between dogs and wolves.

The dataset contained a hidden bias: wolves were mostly shown in snowy environments.

When the model incorrectly classified a husky in the snow as a wolf, LIME highlighted the areas of the image driving the prediction.

The result was surprising.

The model wasn’t looking at the animal—it was looking at the snowy background.

This insight exposed a spurious correlation in the dataset.

Application in Pharmacovigilance

In pharmacovigilance, LIME can be applied to clinical narratives.

For example, an AI system analysing a case report might classify an event as non-serious. Using LIME, the system can highlight the words that influenced the decision.

If the explanation reveals that the AI focused heavily on the drug name (for example, an over-the-counter medication) rather than the medical outcome, developers can identify and correct the bias.
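LIME's core mechanic can be sketched without the `lime` library: perturb the narrative by randomly masking words, query the model on each perturbation, and compare scores with and without each word. The "classifier" below is an invented stand-in whose drug-name bias is planted deliberately so the analysis can expose it; the real library additionally fits a locally weighted linear surrogate over the perturbations.

```python
import random

# Minimal sketch of LIME's perturb-and-observe idea on a clinical narrative.
# `toy_seriousness_score` is an invented stand-in for a real classifier, with a
# spurious reliance on the drug name planted on purpose so we can expose it.

def toy_seriousness_score(kept_words: set) -> float:
    """Pretend probability that a case narrative is 'non-serious'."""
    score = 0.5
    if "paracetamol" in kept_words:   # drug name: the spurious cue
        score += 0.3
    if "hospitalised" in kept_words:  # the actual medical outcome
        score -= 0.2
    return score

def word_influence(narrative: str, n_samples: int = 600, seed: int = 0) -> dict:
    """Estimate each word's influence by randomly masking words."""
    rng = random.Random(seed)
    words = narrative.lower().split()
    present = {w: [] for w in words}  # scores from samples keeping the word
    absent = {w: [] for w in words}   # scores from samples masking it
    for _ in range(n_samples):
        kept = {w for w in words if rng.random() < 0.5}
        score = toy_seriousness_score(kept)
        for w in words:
            (present[w] if w in kept else absent[w]).append(score)
    return {
        w: sum(present[w]) / len(present[w]) - sum(absent[w]) / len(absent[w])
        for w in words
    }

influence = word_influence("patient hospitalised after taking paracetamol")
```

The large positive influence attributed to "paracetamol" (and the negative influence of "hospitalised") is precisely the kind of evidence that would alert developers to a drug-name bias like the one described above.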


Step-by-Step Implementation Guide

Implementing SHAP or LIME in pharmacovigilance requires careful planning to ensure both technical reliability and regulatory compliance.


Step 1: Validate the Explainability Tool

Before using SHAP or LIME in a regulated environment, the explainability method itself should be treated as a component of a computerised system.

This means it must undergo appropriate validation procedures to demonstrate that it is fit for purpose and compliant with GxP requirements.

Explainable AI does not eliminate the need for validation. Instead, organisations should establish processes to ensure that explainability tools remain accurate and reliable over time.


Step 2: Use xAI During Model Development

During development, data scientists and PV experts can use SHAP and LIME to inspect model behaviour and identify potential biases.

These tools reveal which data features are influencing predictions, making them extremely useful for debugging and model improvement.

Practical Example

Imagine an AI triage system that frequently misclassifies serious adverse events as non-serious.

An xAI analysis might reveal that the model is incorrectly using the drug name as a proxy for seriousness—for example, assuming that events involving an over-the-counter medication are less likely to be serious.

Once identified, developers can adjust the model or rebalance the training data to remove the bias.


Step 3: Provide Clear Explanations for End Users

During deployment, regulators such as the FDA and EMA encourage organisations to provide explainability outputs for individual predictions.

This helps support the Human-in-the-Loop model.

Structured Data

For structured datasets, SHAP or LIME can generate feature contribution reports showing which variables most influenced the prediction.

For instance, when a model flags a potential safety signal, the system could display factors such as:

  • Patient age
  • Concomitant medications
  • Time-to-onset
  • Medical history

Unstructured Text

For narrative ICSRs, visual explanations can be particularly effective.

One example is the FDA’s Information Visualization Platform (InfoViP), which uses NLP to highlight important sections of case narratives. Colour-coded text helps reviewers quickly see why the AI flagged a case.

This approach improves transparency and allows reviewers to validate AI outputs more efficiently.
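One simple way to render such highlights (not InfoViP's actual implementation; the scores and colour scheme here are assumptions for illustration) is to tint each word of the narrative according to its influence on the prediction:

```python
# Hypothetical sketch: colour-coding words by influence score, in the spirit of
# narrative-highlighting tools. Scores and colours are illustrative assumptions.

def highlight(words_with_scores: list) -> str:
    """Wrap each word in an HTML span tinted by its influence."""
    spans = []
    for word, score in words_with_scores:
        colour = "255,0,0" if score > 0 else "0,0,255"  # red raises, blue lowers
        alpha = min(abs(score), 1.0)                    # stronger score, deeper tint
        spans.append(
            f'<span style="background: rgba({colour},{alpha:.2f})">{word}</span>'
        )
    return " ".join(spans)

html = highlight([("hospitalised", -0.6), ("after", 0.0), ("paracetamol", 0.8)])
```

Dropping the resulting HTML into a review screen gives reviewers an at-a-glance view of which words drove the classification, without requiring them to read raw model outputs.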


Step 4: Align Explainability with Data Privacy

Transparency should never compromise patient privacy.

Processing large health datasets requires strict adherence to regulations such as GDPR and HIPAA, along with principles like data minimisation and anonymisation.

Practical Considerations

High-parameter AI models may unintentionally memorise sensitive patient data.

When implementing explainability tools:

  • Ensure training data is properly de-identified
  • Restrict access to xAI logs and explanation outputs
  • Avoid explanations that could reveal protected health information

It is also important to guard against model inversion attacks, where attackers attempt to reconstruct patient information from the model.
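To give a flavour of what de-identification involves, here is a toy sketch that masks obvious identifiers before narratives reach an explainability tool. Real pipelines rely on validated de-identification tooling with far broader rule sets; these three regex patterns are illustrative assumptions, not a compliant implementation.

```python
import re

# Toy de-identification sketch: masking a few obvious identifier formats.
# Illustrative only; real de-identification requires validated tooling.

PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with typed placeholders like [DATE]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Patient seen on 03/04/2021, contact 555-123-4567."))
```

Typed placeholders such as `[DATE]` preserve enough context for a model or a reviewer to follow the narrative while keeping the underlying identifiers out of training data and explanation logs.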


Step 5: Train Users to Avoid Automation Bias

Finally, organisations should train PV professionals on the limitations of explainable AI.

SHAP and LIME provide approximations of model behaviour, not perfect explanations.

One major risk is automation bias—the tendency for humans to trust AI outputs too readily, especially when accompanied by convincing explanations.

A plausible explanation attached to an incorrect prediction can still lead reviewers to accept the result without sufficient scrutiny.

For this reason, PV teams should treat explainability outputs as decision support tools, not definitive answers.

Human judgement should always remain the final authority.


Explainable AI is becoming an essential part of responsible AI adoption in pharmacovigilance.

Tools like SHAP, LIME, and confidence scoring mechanisms can significantly improve transparency and trust in AI systems. However, successful implementation requires more than just technical integration.

Organisations should combine:

  • robust validation
  • strong governance
  • privacy safeguards
  • proper user training

When implemented thoughtfully, explainable AI can strengthen Human-in-the-Loop pharmacovigilance systems, ensuring that innovation enhances—rather than replaces—expert human oversight.


😃

That’s all for now!