Subhankar Roy, Ph.D.
Postdoctoral researcher, University of Trento, Italy
subhankar.roy[at]unitn[dot]it

Hi there! 👋

I am a postdoctoral researcher at the University of Trento, Italy. My research focuses on equipping deep learning models with better generalization capabilities under data distribution and semantic shifts. My main research areas include multimodal learning, domain adaptation, continual learning, and novel class discovery.

Previously, I was a postdoctoral researcher at Telecom Paris and a lecturer at the University of Aberdeen. I received my Ph.D. degree cum laude in 2022 from the University of Trento, where I was advised by Prof. Elisa Ricci and Prof. Nicu Sebe. During my Ph.D. I spent six wonderful months as an intern at Naver Labs Europe, where I had the pleasure of working with Gabriela Csurka and other brilliant colleagues. I was also fortunate to collaborate with Prof. Arno Solin from Aalto University, with whom I worked on Bayesian neural networks. Prior to the doctoral degree, I received the M.Sc. degree in Telecommunications Engineering from the University of Trento in 2018 with full marks. During my master's I worked with Prof. Begüm Demir on applying deep learning techniques to remote sensing problems.


Interests

  • Domain Adaptation
  • Continual Learning
  • Open World Recognition
  • Weakly Supervised Learning
  • Multi-modal Learning
  • Visual Reasoning

Teaching

  • Advanced Programming (CS551P) at the University of Aberdeen
  • Modelling and Problem Solving for Computing (CS1029) at the University of Aberdeen
  • Deep Learning at the University of Trento

Education

University of Trento, Italy
2018 - 2022
Ph.D. Information and Communication Engineering
graduated cum laude

University of Trento, Italy
2015 - 2018
M.Sc. Telecommunications Engineering
graduated with 110/110

West Bengal University of Technology
2009 - 2013
B.Tech. Electrical Engineering

News

  • I will be serving as an Area Chair for CVPR 2025, November 2024.
  • 1 paper accepted to ICPR 2024 (Oral), October 2024.
  • Co-organizer of Green Foundation Models Workshop at ECCV 2024, October 2024.
  • I will be serving as an Area Chair for BMVC 2024, August 2024.
  • 1 paper accepted to ECCV 2024, July 2024.
  • I will be serving as an Area Chair for ECCV 2024, March 2024.
  • 1 paper accepted to CVPR 2024, February 2024.
  • 1 paper accepted to CVIU, February 2024.
  • 1 paper accepted to ICLR 2024, January 2024.
  • 1 paper accepted to ICCV 2023, October 2023.

Publications


Organizing Unstructured Image Collections using Natural Language
Mingxuan Liu, Zhun Zhong, Jun Li, Gianni Franchi, Subhankar Roy, Elisa Ricci
TL;DR: TeDeSC tackles Semantic Multiple Clustering by automatically discovering interpretable clustering criteria from large image collections using text-driven reasoning, enabling unsupervised organization and analysis without human input.
arXiv 2024, Under review

Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione, Stéphane Lathuilière
TL;DR: To address the stability-plasticity trade-off in continual learning, we propose to perform weight-ensembling of the model parameters of the current and previous tasks.
ECCV 2024, Milan, Italy

One-Shot Unlearning of Personal Identities
Thomas De Min, Subhankar Roy, Stéphane Lathuilière, Elisa Ricci, Massimiliano Mancini
TL;DR: O-UPI evaluates unlearning models without access to training data, proposing a meta-learning method to forget personal identities from a single image.
arXiv 2024, Under review

Large-scale pre-trained models are surprisingly strong in incremental novel class discovery
Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: A simple yet effective baseline for continuous, unsupervised novel class discovery uses a frozen self-supervised pre-trained model with a learnable classifier that outperforms complex state-of-the-art methods.
ICPR 2024, Kolkata, India

Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning
Thomas De Min, Massimiliano Mancini, Stéphane Lathuilière, Subhankar Roy, Elisa Ricci
TL;DR: MULTI-LANE is a prompt-tuning method for multi-label class-incremental learning that eliminates prompt selection by using task-specific pathways and summarized patch token embeddings.
CoLLAs 2024, Pisa, Italy

Simplifying open-set video domain adaptation with contrastive learning
Giacomo Zara, Victor Turrisi da Costa, Subhankar Roy, Paolo Rota, Elisa Ricci
TL;DR: We address open-set video domain adaptation with a unified contrastive learning framework that learns discriminative and well-clustered features. We show that a discriminative feature space simplifies the separation of the unknown classes from the known ones.
CVIU 2024, Journal

Collaborating Foundation models for Domain Generalized Semantic Segmentation
Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière
TL;DR: CLOUDS is a general-purpose domain generalization framework that combines the foundation models CLIP, Stable Diffusion, an LLM, and SAM.
CVPR 2024, Seattle, USA

Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu, Subhankar Roy, Wenjing Li, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: FineR leverages large language models to identify fine-grained image categories without expert annotations, by interpreting visual attributes as text. This allows it to reason about subtle differences between species or objects, outperforming current FGVR methods.
ICLR 2024, Vienna, Austria

The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation
Giacomo Zara, Alessandro Conti, Subhankar Roy, Stéphane Lathuilière, Paolo Rota, Elisa Ricci
TL;DR: In this work we exploit “web-supervision” from Large Language-Vision Models (LLVMs) to address Source-Free Video Unsupervised Domain Adaptation, driven by the rationale that LLVMs contain a rich world prior that is surprisingly robust to domain shift. Our parameter-efficient method, which we name Domain Adaptation with Large Language-Vision models (DALLV), distills the world prior and complementary source model information into a student network tailored for the target domain.
ICCV 2023, Paris, France

AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation
Giacomo Zara, Subhankar Roy, Paolo Rota, Elisa Ricci
TL;DR: CLIP's zero-shot protocol requires oracle knowledge about the target-private label names. Since these label names cannot be known in advance, we propose AutoLabel, which automatically discovers and generates object-centric compositional candidate target-private class names.
CVPR 2023, Vancouver, Canada

Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation
Yangsong Zhang, Subhankar Roy, Hongtao Lu, Elisa Ricci, Stéphane Lathuilière
TL;DR: Self-training with cross-domain feature stylization and uncertainty quantification in the model predictions improves multi-target domain adaptation in semantic segmentation.
WACV 2023, Waikoloa, Hawaii, USA

Class-incremental Novel Class Discovery
Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: A self-training strategy that exploits both clustering and old class prototypes to learn a joint classifier for all the base and novel classes. It also uses feature-level knowledge distillation to prevent forgetting.
ECCV 2022, Tel Aviv, Israel

Uncertainty-guided Source-free Domain Adaptation
Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin
TL;DR: We propose quantifying the uncertainty in the source model predictions (using the Laplace approximation) and using it to guide the target adaptation (e.g., maximizing mutual information).
ECCV 2022, Tel Aviv, Israel

Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation
Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: Curriculum Graph Co-Teaching uses a dual classifier head, one of which is a graph convolutional network that aggregates features from similar samples across domains, to obtain reliable pseudo-labels in the target domain.
CVPR 2021, Virtual

Neighborhood Contrastive Learning for Novel Class Discovery
Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, Nicu Sebe
TL;DR: NCL learns discriminative representations by enforcing a query to be close to its correlated view (augmented-positive) and its pseudo-positives (neighbors), as well as to be far from the negatives. We also generate hard negatives by mixing labeled and unlabeled features.
CVPR 2021, Virtual

TriGAN: image-to-image translation for multi-source domain adaptation
TL;DR: A generative method for multi-source domain adaptation that learns a universal generator for stylizing the images of any labelled source domain to appear like the target domain. Whitening and Coloring transformations are used in TriGAN's universal generator.
MVA 2021, Journal

Motion-supervised Co-Part Segmentation
TL;DR: We propose a self-supervised method for co-part segmentation that leverages a collection of videos. The network learns to predict part segments together with a representation of the motion between two frames from a given video, which permits reconstruction of the target image.
ICPR 2021, Virtual

Deep Learning for Classification and Localization of COVID-19 Markers in Point-of-Care Lung Ultrasound
Subhankar Roy, Willi Menapace, Sebastiaan Oei, Ben Luijten, Enrico Fini, et al.
TL;DR: Studies the application of deep learning techniques (weakly supervised localization, video-level score aggregation, and pixel-level classification) to lung ultrasound images. Several evaluation metrics are introduced to properly evaluate methods on the proposed lung ultrasound benchmark.
TMI 2020, Journal

Metric-Learning-Based Deep Hashing Network for Content-Based Retrieval of Remote Sensing Images
Subhankar Roy, Enver Sangineto, Begüm Demir, Nicu Sebe
TL;DR: A metric-learning-based hashing network that implicitly uses a large pretrained DNN as an intermediate representation step, without the need for retraining or fine-tuning. Our method learns a semantic-based metric space where the features are optimized for the target retrieval task.
GRSL 2020, Letters

Regularized Evolutionary Algorithm for Dynamic Neural Topology Search
Cristiano Saltori, Subhankar Roy, Nicu Sebe, Giovanni Iacca
TL;DR: We propose a network architecture search method that leverages an evolutionary algorithm to evolve a dynamic image classifier. Our proposed Regularized Evolutionary Algorithm has a lower memory footprint than existing methods while achieving performance competitive with the state of the art.
ICIAP 2019, Trento, Italy

Unsupervised Domain Adaptation Using Full-Feature Whitening and Colouring
Subhankar Roy, Aliaksandr Siarohin, Nicu Sebe
TL;DR: We address unsupervised domain adaptation using embedded full-feature whitening and coloring transformation blocks (\(\text{F}^2\text{WCT}\)). The proposed \(\text{F}^2\text{WCT}\) aligns the feature distributions by ensuring that the source and target features have identical covariance matrices.
ICIAP 2019, Trento, Italy

Unsupervised domain adaptation using feature-whitening and consensus loss
Subhankar Roy, Aliaksandr Siarohin, Enver Sangineto, Samuel Rota Bulo, Nicu Sebe, Elisa Ricci
TL;DR: We propose domain alignment layers that implement feature whitening for the purpose of matching source and target feature distributions. Additionally, we leverage the unlabeled target data by proposing the Min-Entropy Consensus loss, which regularizes training.
CVPR 2019, Long Beach, California, USA

Semantic-Fusion GANs for Semi-Supervised Satellite Image Classification
Subhankar Roy, Enver Sangineto, Begüm Demir, Nicu Sebe
TL;DR: We propose a novel method for semi-supervised classification of satellite images using generative adversarial networks. The visual information is fed to the discriminator through two different channels: the original image and its “semantic” representation, the latter obtained from an external network trained on ImageNet.
ICIP 2018, Athens, Greece