Saghir Alfasly

I am a Research Fellow at Mayo Clinic (Department of AI & Informatics), where I work on Medical Image Analysis and machine learning.

Previously, I've worked in Video Understanding, Human Activity Recognition, Video-to-Video Summarization, Multimodal Action Recognition, and 3D Video Synthesis.

Email  /  CV  /  Bio  /  Google Scholar  /  LinkedIn  /  GitHub  /  YouTube

profile photo
Research

My current interests include computer vision, machine learning, medical image analysis, and computational pathology.

Rotation-Agnostic Image Representation Learning for Digital Pathology
Saghir Alfasly, Abubakr Shafique, Peyman Nejat, Jibran Khan, Areej Alsaafin, Ghazal Alabtah, H.R. Tizhoosh,
CVPR, 2024
[Project webpage] [Demo] [Paper] [Supplementary] [Code] [Data]

In this work, we introduce three components: a fast patch selection method (FPS) that efficiently selects representative patches while preserving their spatial distribution; HistoRotate, a 360° rotation augmentation for training histopathology models that enhances learning without compromising contextual information; and PathDino, a compact histopathology Transformer with five small vision transformer blocks and ≈9 million parameters.
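As a minimal sketch of the full-circle rotation idea behind HistoRotate, assuming square histopathology patches whose orientation carries no semantic meaning: the class name, the white fill value, and the torchvision-based implementation below are illustrative assumptions, not the paper's code.

    # Minimal sketch of a 360-degree rotation augmentation in the spirit of
    # HistoRotate (illustrative only, not the paper's implementation).
    import random
    import torchvision.transforms.functional as TF

    class RandomFullRotation:
        """Rotate a patch by an arbitrary angle sampled from [0, 360)."""
        def __init__(self, fill=255):
            self.fill = fill  # assumed white background, typical for H&E patches

        def __call__(self, img):
            angle = random.uniform(0.0, 360.0)
            # expand=False keeps the patch size; exposed corners get the fill value
            return TF.rotate(img, angle, expand=False, fill=self.fill)

In training, such a transform would simply be composed with the usual patch-level augmentations before feeding patches to the model.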

Foundation Models for Histopathology — Fanfare or Flair
Saghir Alfasly, Peyman Nejat, Sobhan Hemati, Jibran Khan, Isaiah Lahr, Areej Alsaafin, Abubakr Shafique, Nneka Comfere, Dennis Murphree, Chady Meroueh, Saba Yasir, Aaron Mangold, Lisa Boardman, Vijay H. Shah, Joaquin J. Garcia, H.R. Tizhoosh,
Mayo Clinic Proceedings: Digital Health, 2024
[Paper] [Supplementary]

This paper investigates the efficacy of foundation models in histopathology by conducting a detailed comparison between CLIP derivatives (PLIP and BiomedCLIP) and traditional, domain-specific histology models trained on well-curated datasets. The evaluation covers eight diverse datasets: four internal Mayo Clinic datasets and four well-known public datasets (PANDA, BRACS, CAMELYON16, and DigestPath). The findings show that domain-specific models, such as DinoSSLPath and KimiaNet, provide better performance across various metrics, underlining the significance of large, clean datasets for histopathological analysis.

Selection of Distinct Morphologies to Divide & Conquer Gigapixel Pathology Images
Abubakr Shafique, Saghir Alfasly, Areej Alsaafin, Peyman Nejat, Jibran Khan, H.R. Tizhoosh,
Preprint, 2023
[Paper] [Supplementary]

We propose SDM, a novel method for selecting diverse whole-slide image (WSI) patches that minimizes the patch count while capturing all morphological variations. SDM outperforms the state of the art, achieving high representativeness without requiring parameter tuning.

OSRE: Object-to-Spot Rotation Estimation for Bike Parking Assessment
Saghir Alfasly, Zaid Al-Huda, Saifullahi Bello, Ahmed Elazab, Jian Lu, Chen Xu,
IEEE Transactions on Intelligent Transportation Systems, 2023
[Project webpage] [Video] [Paper] [Supplementary] [Code] [Data]

We leverage 3D graphics and computer vision techniques to tackle a real-world problem: object-to-spot rotation estimation, which is of particular significance for intelligent surveillance systems, bike-sharing systems, and smart cities. We introduce a rotation estimator (OSRE) that estimates the rotation of a parked bike with respect to its parking area.

An Effective Video Transformer with Synchronized Spatiotemporal and Spatial Self-Attention for Action Recognition
Saghir Alfasly, Charles K. Chui, Qingtang Jiang, Jian Lu, Chen Xu,
IEEE Transactions on Neural Networks and Learning Systems, 2022
[Video] [Paper]

We propose a new spatiotemporal attention scheme, termed synchronized spatiotemporal and spatial self-attention (SSTSA), which derives spatiotemporal features with temporal and spatial multi-headed self-attention (MSA) modules.
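As a rough illustration only, the sketch below pairs a temporal multi-headed self-attention over frames with a spatial one over patches within each frame; the block structure, dimensions, and residual combination are assumptions for illustration, not the exact SSTSA design.

    # Rough sketch: temporal MSA across frames followed by spatial MSA within
    # each frame (an illustration of combining the two, not the SSTSA module).
    import torch
    import torch.nn as nn

    class TemporalSpatialAttentionBlock(nn.Module):
        def __init__(self, dim=384, heads=6):
            super().__init__()
            self.temporal_msa = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.spatial_msa = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm1 = nn.LayerNorm(dim)
            self.norm2 = nn.LayerNorm(dim)

        def forward(self, x):
            # x: (batch, frames, patches, dim) token embeddings of a video clip
            b, t, p, d = x.shape

            # Temporal attention: each spatial location attends across frames.
            xt = self.norm1(x).permute(0, 2, 1, 3).reshape(b * p, t, d)
            xt, _ = self.temporal_msa(xt, xt, xt)
            x = x + xt.reshape(b, p, t, d).permute(0, 2, 1, 3)

            # Spatial attention: patches within each frame attend to one another.
            xs = self.norm2(x).reshape(b * t, p, d)
            xs, _ = self.spatial_msa(xs, xs, xs)
            return x + xs.reshape(b, t, p, d)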

Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly, Jian Lu, Chen Xu, Yuru Zou,
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [Supp]

Multimodal learning for video understanding (text, audio, RGB, motion). We present a multimodal learning approach that leverages several modalities and several off-the-shelf models for audio and language understanding. We propose Irrelevant Modality Dropout (IMD), which drops irrelevant audio from further processing while fusing relevant audio-visual data for better video understanding.
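A minimal sketch of the general idea of dropping an irrelevant modality before fusion follows; the cosine-similarity relevance score, the threshold, and the concatenation-based fusion are illustrative assumptions rather than the paper's exact IMD formulation.

    # Minimal sketch: gate the audio embedding by an estimated audio-visual
    # relevance score and drop it from fusion when deemed irrelevant
    # (illustrative assumptions, not the paper's exact IMD mechanism).
    import torch
    import torch.nn.functional as F

    def fuse_with_modality_dropout(visual, audio, threshold=0.0):
        # visual, audio: (batch, dim) clip-level embeddings from off-the-shelf encoders
        relevance = F.cosine_similarity(visual, audio, dim=-1)   # (batch,) in [-1, 1]
        keep = (relevance > threshold).float().unsqueeze(-1)     # hard drop per clip
        return torch.cat([visual, keep * audio], dim=-1)         # (batch, 2 * dim)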

FastPicker: Adaptive independent two-stage video-to-video summarization for efficient action recognition
Saghir Alfasly, Jian Lu, Chen Xu, Zaid Al-Huda, Qingtang Jiang, Zhaosong Lu, Charles K. Chui,
Neurocomputing, 2022
[Video] [Paper]

This study addresses the following question: to what extent can a fast, independent, adaptive algorithm select the most discriminative and representative frames to downsize large video datasets while improving action recognition performance?

Weakly supervised pavement crack semantic segmentation based on multi-scale object localization and incremental annotation refinement
Zaid Al-Huda, Bo Peng, Riyadh Nazar Ali Algburi, Saghir Alfasly, Tianrui Li,
Applied Intelligence, 2022
[Paper]

We propose a weakly supervised approach to pavement crack semantic segmentation based on multi-scale object localization and incremental annotation refinement.

Multi-Label-Based Similarity Learning for Vehicle Re-Identification
Saghir Alfasly, Yongjian Hu, Haoliang Li, Tiancai Liang, Xiaofeng Jin, BeiBei Liu, Qingli Zhao,
IEEE Access, 2019
[Paper] [Video]

We propose a multi-label similarity learning framework for vehicle re-identification.

Variational Representation Learning for Vehicle Re-Identification
Saghir Alfasly, Yongjian Hu, Tiancai Liang, Xiaofeng Jin, Qingli Zhao, Beibei Liu,
IEEE International Conference on Image Processing, 2019
[Paper] [Github]

We propose variational representation learning for object re-identification. The proposed method is evaluated on vehicle re-identification, person re-identification, and face recognition.

Auto-Zooming CNN-Based Framework for Real-Time Pedestrian Detection in Outdoor Surveillance Videos
Saghir Alfasly, BeiBei Liu, Yongjian Hu, Yufei Wang, Chang-Tsun Li,
IEEE Access, 2019
[Paper] [Github] [Video]

For detecting small objects such as pedestrians in outdoor surveillance, we propose a fast, lightweight, auto-zooming-based framework for small pedestrian detection.


Service

Reviewer
IEEE Transactions on Image Processing
IEEE Transactions on Neural Networks and Learning Systems
IEEE Transactions on Circuits and Systems for Video Technology
IEEE Transactions on Intelligent Transportation Systems
IEEE Transactions on Artificial Intelligence
IEEE Transactions on Intelligent Vehicles
Pattern Recognition
Pattern Recognition Letters
CVPR 2023, CVPR 2024
ICCV 2023
ECCV 2024
ACM MM
IEEE Access


Source: jonbarron/website