Shift Happens: A Fairness-Oriented Framework for Medical Classification under Hidden Bias
Minh Nguyen Nhat To, Diane Kim*, Mohamed Harmanani*, Paul F.R. Wilson,
Fahimeh Fooladgar, more authors, Rahul G. Krishnan, Parvin Mousavi, and Purang Abolmaesumi
Information Processing in Computer Assisted Interventions (IPCAI), 2026
Many medical AI models perform unevenly across patient groups because they learn shortcuts from biased data. These hidden biases make models less reliable and less fair in real-world use.
We introduce DPE-Former, a model that combines prototype-based learning with transformer attention. The system trains several complementary classifiers on balanced subsets of data, each capturing different aspects of the population. A transformer module then learns to combine their outputs adaptively, helping the model make more balanced decisions across unseen or minority groups.
DPE-Former achieved higher accuracy on underrepresented groups and more consistent performance overall compared to standard training methods.
ProstNFound+: A Prospective Study using Medical Foundation
Models for Prostate Cancer Detection
Paul F. R. Wilson, Mohamed Harmanani, Minh Nguyen
Nhat To, Amoon Jamzad, Tarek Elghareb, Zhuoxin Guo, Adam
Kinnaird, Brian Wodlinger, Purang Abolmaesumi, and Parvin
Mousavi
International Journal of Computer Assisted Radiology and Surgery, 2026
Medical foundation models (FMs) offer a path to
build high-performance diagnostic systems. However,
their application to prostate cancer detection from
micro-ultrasound (μUS) remains untested in clinical
settings. We present ProstNFound+, an adaptation of FMs
for μUS-based PCa detection, and report the first
prospective validation of such a system. The model
demonstrates strong agreement with biopsy-confirmed
findings and surpasses conventional scoring protocols.
@article{wilson2025_prostnfound,
title = {ProstNFound+: A Prospective Study using Medical Foundation Models for Prostate Cancer Detection},
author = {Wilson, Paul F. R. and Harmanani, Mohamed and Nguyen Nhat To, Minh and Jamzad, Amoon and Elghareb, Tarek and Guo, Zhuoxin and Kinnaird, Adam and Wodlinger, Brian and Abolmaesumi, Purang and Mousavi, Parvin},
journal = {arXiv preprint},
year = {2025},
month = {Oct},
doi = {10.48550/arXiv.2510.26703},
}
Deep learning holds significant promise for enhancing
real-time ultrasound-based prostate biopsy guidance
through precise and effective tissue characterization.
Despite recent advancements, prostate cancer (PCa)
detection using ultrasound imaging still faces two
critical challenges: (i) limited sensitivity to subtle
tissue variations essential for detecting clinically
significant disease, and (ii) weak and noisy labeling
resulting from reliance on coarse annotations in
histopathological reports. To address these issues, we
introduce ProTeUS, an innovative spatio-temporal
framework that integrates clinical metadata with
comprehensive spatial and temporal ultrasound features
extracted by a foundation model. Our method includes a
novel hybrid, cancer involvement-aware loss function
designed to enhance resilience against label noise and
effectively learn distinct PCa signatures. Furthermore,
we employ a progressive training strategy that initially
prioritizes high-involvement cases and gradually
incorporates lower-involvement samples. These
advancements significantly improve the model’s
robustness to noise and mitigate the limitations posed
by weak labels, achieving state-of-the-art PCa detection
performance with an AUROC of 86.9%.
@incollection{elghareb2025_proteus,
title = {ProTeUS: A Spatio-Temporal Enhanced Ultrasound-Based Framework for Prostate Cancer Detection},
author = {Elghareb, Tarek and Harmanani, Mohamed and Nguyen Nhat To, Minh and Wilson, Paul and Jamzad, Amoon and Fooladgar, Fahimeh and Abdelsamad, Baraa and Dzikunu, Obed and Sojoudi, Samira and Reznik, Gabrielle and Leveridge, Michael and Siemens, Robert and Chang, Silvia and Black, Peter and Mousavi, Parvin and Abolmaesumi, Purang},
booktitle = {Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2025},
year = {2025},
month = {September},
publisher = {Springer},
doi = {10.1007/978-3-032-04984-1_39},
}
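The progressive training strategy mentioned in the ProTeUS abstract can be pictured as a curriculum over cancer involvement. The following is a minimal sketch, not the authors' implementation: the linear threshold schedule, the names `involvement_threshold` and `select_cores`, the default start and end values, and the choice to always keep benign cores are all assumptions made for illustration.

```python
def involvement_threshold(epoch, total_epochs, start=0.8, end=0.0):
    """Hypothetical linear schedule: minimum involvement admitted at `epoch`."""
    if total_epochs <= 1:
        return end
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * frac


def select_cores(involvements, labels, epoch, total_epochs):
    """Return indices of biopsy cores used for training at this epoch.

    Benign cores (label 0) are always kept (an assumption). Cancerous
    cores (label 1) are kept only if their reported involvement meets
    the current threshold, so low-involvement cores enter later.
    """
    thr = involvement_threshold(epoch, total_epochs)
    return [
        i
        for i, (inv, y) in enumerate(zip(involvements, labels))
        if y == 0 or inv >= thr
    ]


if __name__ == "__main__":
    involvements = [0.0, 0.9, 0.5, 0.1]  # fraction of cancer in each core
    labels = [0, 1, 1, 1]
    for epoch in range(5):
        print(epoch, select_cores(involvements, labels, epoch, 5))
```

Early epochs see only benign and high-involvement cores, where the label is least ambiguous; by the final epoch every core is included.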
Diverse Prototypical Ensembles Improve Robustness to
Subpopulation Shift
Minh Nguyen Nhat To, Paul F. R. Wilson, Viet Nguyen,
Mohamed Harmanani, Michael Cooper, Fahimeh
Fooladgar, Purang Abolmaesumi, Parvin Mousavi, Rahul G.
Krishnan
International Conference on Machine Learning (ICML), 2025
Subpopulation shift, a mismatch in subgroup
distribution between training and deployment, can
significantly degrade performance. We propose Diverse
Prototypical Ensembles, which encourage classifier
diversity to improve worst-group accuracy. Evaluated on
nine datasets, our method outperforms prior
state-of-the-art approaches.
@inproceedings{to2025_dpe,
title = {Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift},
author = {Nguyen Nhat To, Minh and Wilson, Paul F. R. and Nguyen, Viet and Harmanani, Mohamed and Cooper, Michael and Fooladgar, Fahimeh and Abolmaesumi, Purang and Mousavi, Parvin and Krishnan, Rahul G.},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning (ICML)},
year = {2025},
doi = {10.48550/arXiv.2505.23027},
}
We introduce Cinepro, a novel framework that strengthens
foundation models' ability to localize PCa in ultrasound
cineloops. Cinepro adapts robust training by integrating
the proportion of cancer tissue reported by pathology in
a biopsy core into its loss function to address label
noise. It leverages temporal data across multiple frames
to apply robust augmentations. Cinepro achieves an AUROC
of 77.1% and balanced accuracy of 71.9%, surpassing
current benchmarks.
@inproceedings{harmanani2025cinepro,
title = {Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops},
author = {Harmanani, M. and Jamzad, A.* and To, MNN.* and Wilson, PFR.* and Guo, Z. and Fooladgar, F. and Sojoudi, S. and Gilany, M. and Chang, S. and Black, P. and Leveridge, M. and Siemens, R. and Abolmaesumi, P. and Mousavi, P.},
booktitle = {IEEE International Symposium on Biomedical Imaging (ISBI)},
year = {2025}
}
While deep learning methods have shown great promise in
improving the effectiveness of prostate cancer (PCa)
diagnosis by detecting suspicious lesions from
trans-rectal ultrasound (TRUS), they must overcome
multiple simultaneous challenges. There is high
heterogeneity in tissue appearance, significant class
imbalance in favor of benign examples, and scarcity in
the number and quality of ground truth annotations
available to train models. Failure to address even a
single one of these problems can result in unacceptable
clinical outcomes. We propose TRUSWorthy, a carefully
designed, tuned, and integrated system for reliable PCa
detection. Our pipeline integrates self-supervised
learning, multiple-instance learning aggregation using
transformers, random-undersampled boosting and
ensembling: these address label scarcity, weak labels,
class imbalance, and overconfidence, respectively. We
train and rigorously evaluate our method using a large,
multi-center dataset of micro-ultrasound data. Our
method outperforms previous state-of-the-art deep
learning methods in terms of accuracy and uncertainty
calibration, with AUROC and balanced accuracy scores of
79.9% and 71.5%, respectively. On the top 20% of
predictions with the highest confidence, we can achieve
a balanced accuracy of up to 91%. The success of
TRUSWorthy demonstrates the potential of integrated deep
learning solutions to meet clinical needs in a highly
challenging deployment setting, and is a significant
step towards creating a trustworthy system for
computer-assisted PCa diagnosis.
@article{harmanani2025trusworthy,
title={TRUSWorthy: toward clinically applicable deep learning for confident detection of prostate cancer in micro-ultrasound},
author={Harmanani, Mohamed and Wilson, Paul FR and To, Minh Nguyen Nhat and Gilany, Mahdi and Jamzad, Amoon and Fooladgar, Fahimeh and Wodlinger, Brian and Abolmaesumi, Purang and Mousavi, Parvin},
journal={International Journal of Computer Assisted Radiology and Surgery},
pages={1--9},
year={2025},
publisher={Springer}
}
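The TRUSWorthy abstract reports balanced accuracy on the top 20% most confident predictions. A minimal sketch of how such a selective metric could be computed is shown below. It is an illustration under stated assumptions, not the paper's evaluation code: confidence is taken to be max(p, 1 - p) for a binary probability p, the decision threshold is fixed at 0.5, and the function names are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def top_confidence_balanced_accuracy(y_true, probs, keep_fraction=0.2):
    """Balanced accuracy over the most confident `keep_fraction` of cases.

    Confidence is assumed to be max(p, 1 - p), where p is the predicted
    probability of the positive class.
    """
    n_keep = max(1, int(round(keep_fraction * len(probs))))
    order = sorted(
        range(len(probs)),
        key=lambda i: max(probs[i], 1.0 - probs[i]),
        reverse=True,
    )
    kept = order[:n_keep]
    y_kept = [y_true[i] for i in kept]
    pred_kept = [1 if probs[i] >= 0.5 else 0 for i in kept]
    return balanced_accuracy(y_kept, pred_kept)
```

Sweeping `keep_fraction` from 1.0 down to small values shows how performance changes as less confident predictions are rejected; a well-calibrated model is expected to improve as the fraction shrinks.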
@mastersthesis{harmanani2024towards,
title={Towards a Reliable Deep Learning Framework for Prostate Cancer Diagnosis using Ultrasound},
author={Harmanani, Mohamed},
year={2024},
school={Queen's University (Canada)}
}
We propose Diverse Ensemble Entropy Minimization
(DEnEM), showing that existing test-time adaptation (TTA)
methods are suboptimal on ultrasound data due to reliance on
uncalibrated output probabilities and poorly defined
augmentations. Our approach improves AUROC by 5–7% over
existing methods and by 3–5% over other TTA approaches.
@inproceedings{gilany2024denem,
title = {Calibrated Diverse Ensemble Entropy Minimization for Robust Test-Time Adaptation in Prostate Cancer Detection},
author = {Gilany, M. and Harmanani, M. and Wilson, PFR. and To, MNN. and Jamzad, A. and Fooladgar, F. and Wodlinger, B. and Abolmaesumi, P. and Mousavi, P.},
booktitle = {MICCAI Workshop on Machine Learning for Medical Imaging (MLMI)},
year = {2024}
}
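Entropy-minimization approaches to test-time adaptation, such as the one named in the DEnEM abstract, adapt a model by reducing the uncertainty of its predictions on unlabeled test data. The sketch below computes only the quantity being minimized, the Shannon entropy of an ensemble's averaged prediction. It is a generic illustration and not the DEnEM objective itself: the `temperature` parameter stands in for a calibration step, and the function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max logit for stability."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_entropy(member_logits, temperature=1.0):
    """Shannon entropy (in nats) of the mean softmax over ensemble members.

    `member_logits` holds one list of class logits per ensemble member,
    all for the same input.
    """
    probs = [softmax(logits, temperature) for logits in member_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in mean if p > 0.0)
```

When members agree confidently the entropy is near zero, and when they disagree it approaches log(number of classes). An adaptation procedure would update model parameters to lower this value, a step the sketch omits.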
ProstNFound integrates domain-specific ultrasound
knowledge and clinical markers into foundation models
using auxiliary networks that provide high-resolution
texture and clinical prompts. It achieves 90%
sensitivity at 40% specificity, competitive with expert
radiologists.
@inproceedings{wilson2024prostnfound,
title = {ProstNFound: Integrating Foundation Models with Ultrasound Domain Knowledge and Clinical Context for Robust Prostate Cancer Detection},
author = {Wilson, PFR. and To, MNN. and Jamzad, A. and Gilany, M. and Harmanani, M. and Elghareb, T. and Fooladgar, F. and Wodlinger, B. and Abolmaesumi, P. and Mousavi, P.},
booktitle = {Medical Image Computing and Computer Assisted Intervention (MICCAI)},
year = {2024}
}
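The ProstNFound abstract reports sensitivity at a fixed specificity (90% at 40%). A minimal sketch of how such an operating point can be read off a set of scores follows. The function name and the rule of predicting positive when the score is at or above the threshold are assumptions for illustration, and this is not the paper's evaluation code.

```python
def sensitivity_at_specificity(y_true, scores, target_specificity=0.4):
    """Highest sensitivity among thresholds with specificity >= target.

    A case is predicted positive when its score is >= the threshold.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    # Try every observed score as a threshold, plus one above all scores.
    for thr in sorted(set(scores)) + [float("inf")]:
        specificity = sum(1 for s in neg if s < thr) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(1 for s in pos if s >= thr) / len(pos)
            best = max(best, sensitivity)
    return best
```

Raising `target_specificity` forces a higher threshold, so the returned sensitivity can only stay the same or fall, which is the trade-off an ROC curve traces.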
LensePro is a unified method that excels in label
efficiency and robustness to label noise and
out-of-distribution (OOD) data.
It uses self-supervised learning to extract high-quality
features from unlabeled TRUS data, followed by a
noise-tolerant prototype-based classification stage.
@article{to2024lensepro,
title = {LensePro: Label Noise-Tolerant Prototype-Based Network for Improving Cancer Detection in Prostate Ultrasound with Limited Annotations},
author = {To, MNN. and Fooladgar, F.* and Wilson, PFR.* and Harmanani, M.* and Gilany, M. and Sojoudi, S. and Jamzad, A. and Chang, S. and Black, P. and Mousavi, P.{\dagger} and Abolmaesumi, P.{\dagger}},
journal = {International Journal of Computer Assisted Radiology and Surgery},
year = {2024}
}
This multi-center study evaluates PCa ultrasound
detection under realistic data heterogeneity. It
highlights the importance of uncertainty calibration for
clinical decision support and establishes strong
benchmarks for future research.
@article{wilson2024confident,
title = {Toward Confident Prostate Cancer Detection using Ultrasound: A Multi-Center Study},
author = {Wilson, PFR. and Harmanani, M. and To, MNN. and Gilany, M. and Jamzad, A. and Fooladgar, F. and Wodlinger, B. and Abolmaesumi, P. and Mousavi, P.},
journal = {International Journal of Computer Assisted Radiology and Surgery},
year = {2024}
}
This study benchmarks vision transformer (ViT)
architectures for region-of-interest (ROI) and
multi-scale prostate ultrasound classification,
comparing them to CNNs. A multi-objective learning
strategy achieves 77.9% AUROC with significant gains in
sensitivity and specificity.
@inproceedings{harmanani2024transformers,
title = {Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data},
author = {Harmanani, M. and Wilson, PFR. and Fooladgar, F. and Jamzad, A. and Gilany, M. and To, MNN. and Wodlinger, B. and Abolmaesumi, P. and Mousavi, P.},
booktitle = {SPIE Medical Imaging},
year = {2024}
}
This work uses probabilistic planning and dynamic graph
analysis to model the spread of COVID-19 indoors. The
planner evaluates and designs non-pharmaceutical
interventions (NPIs) such as masks, vaccines, and
capacity limits, demonstrating
effectiveness in predicting and limiting infections.
@inproceedings{harmanani2023covid,
title = {Modelling the Spread of COVID-19 in Indoor Spaces using Probabilistic Automated Planning},
author = {Harmanani, M.},
booktitle = {ICAPS Scheduling and Planning Applications Workshop (SPARK)},
year = {2023}
}
This work studies RNN and CodeBERT models for
identifying errors in student Python submissions. It
shows that AUC metrics correlate poorly with human
evaluation and that transfer learning benefits only
appear clearly when measured with human assessment.
@inproceedings{fujimori2022errors,
title = {Using Deep Learning to Localize Errors in Student Code Submissions},
author = {Fujimori, S. and Harmanani, M. and Siddiqui, O. and Zhang, L.},
booktitle = {Proceedings of the 53rd ACM Technical Symposium on Computer Science Education (SIGCSE)},
year = {2022},
doi = {10.1145/3478432.3499048}
}