Subhankar Roy, Ph.D
Lecturer (or Assistant Professor), University of Aberdeen, UK
subhankar9.07@gmail.com

Hi there! 👋

I am a Lecturer (or Assistant Professor) at the University of Aberdeen, UK. My research focuses on equipping deep learning models with better generalization capabilities under data distribution and semantic shifts. My main research areas include multimodal learning, domain adaptation, continual learning, and novel class discovery.

I received my Ph.D. degree cum laude in 2022 from the University of Trento, where I was advised by Prof. Elisa Ricci and Prof. Nicu Sebe. During my Ph.D. I spent six wonderful months as an intern at Naver Labs Europe, where I had the pleasure of working with Gabriela Csurka and other brilliant colleagues. I was also fortunate to collaborate with Prof. Arno Solin from Aalto University, with whom I worked on Bayesian neural networks. Prior to the doctoral degree, I received my M.Sc. degree in Telecommunications Engineering from the University of Trento in 2018 with full marks. During my master's I worked with Prof. Begüm Demir on applying deep learning techniques to remote sensing problems. Before becoming a lecturer, I worked as a postdoctoral researcher in the Multimedia Team at Telecom Paris.


Interests

  • Domain Adaptation
  • Continual Learning
  • Open World Recognition
  • Weakly Supervised Learning
  • Multi-modal Learning
  • Visual Reasoning

Teaching

  • Advanced Programming (CS551P) at University of Aberdeen
  • Modelling and Problem Solving for Computing (CS1029) at University of Aberdeen
  • Deep Learning at University of Trento

Education

University of Trento, Italy
2018 - 2022
Ph.D. Information and Communication Engineering
graduated cum laude
University of Trento, Italy
2015 - 2018
M.Sc. Telecommunications Engineering
graduated with 110/110
West Bengal University of Technology
2009 - 2013
B.Tech. Electrical Engineering

News

  • 1 paper accepted to CVPR 2024, February 2024.
  • 1 paper accepted to CVIU, February 2024.
  • 1 paper accepted to ICLR 2024, January 2024.
  • 1 paper accepted to ICCV 2023, October 2023.
  • Joined University of Aberdeen as a Lecturer (or Assistant Professor), September 2023.
  • 1 paper accepted to Conference on Lifelong Learning Agents (CoLLAs) 2023, May 2023.
  • 1 paper accepted to CVPR 2023 (+ 2 workshops), April 2023.
  • I have joined Telecom Paris as a postdoctoral researcher, February 2023.
  • I have joined Fondazione Bruno Kessler as a deep learning researcher, November 2022.
  • I received my Ph.D. (summa cum laude). My thesis is available here, September 2022.

Publications


Simplifying open-set video domain adaptation with contrastive learning
Giacomo Zara, Victor Turrisi da Costa, Subhankar Roy, Paolo Rota, Elisa Ricci
TL;DR: We address open-set video domain adaptation with a unified contrastive learning framework that learns discriminative and well-clustered features. We show that a discriminative feature space simplifies the separation of the unknown classes from the known ones.
CVIU 2024, Journal

Collaborating Foundation models for Domain Generalized Semantic Segmentation
Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière
TL;DR: CLOUDS is a general-purpose domain generalization framework that integrates several foundation models (CLIP, Stable Diffusion, an LLM, and SAM) into a single framework.
CVPR 2024, Seattle, USA

Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu, Subhankar Roy, Wenjing Li, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: FineR leverages large language models to identify fine-grained image categories without expert annotations, by interpreting visual attributes as text. This allows it to reason about subtle differences between species or objects, outperforming current FGVR methods.
ICLR 2024, Vienna, Austria

The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation
Giacomo Zara, Alessandro Conti, Subhankar Roy, Stéphane Lathuilière, Paolo Rota, Elisa Ricci
TL;DR: In this work, we exploit “web-supervision” from Large Language-Vision Models (LLVMs) to address Source-Free Video Unsupervised Domain Adaptation, driven by the rationale that LLVMs contain a rich world prior that is surprisingly robust to domain shift. Our parameter-efficient method, which we name Domain Adaptation with Large Language-Vision models (DALLV), distills the world prior and complementary source model information into a student network tailored for the target domain.
ICCV 2023, Paris, France

AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation
Giacomo Zara, Subhankar Roy, Paolo Rota, Elisa Ricci
TL;DR: CLIP's zero-shot protocol requires oracle knowledge of the target-private label names. Since these names are not available in practice, we propose AutoLabel, which automatically discovers and generates object-centric compositional candidate target-private class names.
CVPR 2023, Vancouver, Canada

Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation
Yangsong Zhang, Subhankar Roy, Hongtao Lu, Elisa Ricci, Stéphane Lathuilière
TL;DR: Self-training with cross-domain feature stylization and uncertainty quantification in the model predictions improves multi-target domain adaptation in semantic segmentation.
WACV 2023, Waikoloa, Hawaii, USA

Class-incremental Novel Class Discovery
Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: A self-training strategy that exploits both clustering and old class prototypes to learn a joint classifier for all the base and novel classes. It also uses feature-level knowledge distillation to prevent forgetting.
ECCV 2022, Tel Aviv, Israel

Uncertainty-guided Source-free Domain Adaptation
Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin
TL;DR: We propose quantifying the uncertainty in the source model predictions (using the Laplace approximation) and utilizing it to guide the target adaptation (e.g., by maximizing mutual information).
ECCV 2022, Tel Aviv, Israel

Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation
Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci
TL;DR: Curriculum Graph Co-Teaching uses a dual classifier head, with one of them being a graph convolutional network for aggregating features from similar samples across the domains, in order to obtain reliable pseudo labels in the target domain.
CVPR 2021, Virtual

Neighborhood Contrastive Learning for Novel Class Discovery
Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, Nicu Sebe
TL;DR: NCL learns discriminative representations by enforcing a query to be close to its correlated view (augmented-positive) and its pseudo-positives (neighbors), as well as to be far from the negatives. We also generate hard negatives by mixing labeled and unlabeled features.
CVPR 2021, Virtual

TriGAN: image-to-image translation for multi-source domain adaptation
TL;DR: A generative method for multi-source domain adaptation that learns a universal generator for stylizing the images of any labelled source domain to appear like the target domain. Whitening and Coloring transformations are used in TriGAN's universal generator.
MVA 2021, Journal

Motion-supervised Co-Part Segmentation
TL;DR: We propose a self-supervised method for co-part segmentation that leverages a collection of videos. The network learns to predict part segments together with a representation of the motion between two frames from a given video, which permits reconstruction of the target image.
ICPR 2021, Virtual

Deep Learning for Classification and Localization of COVID-19 Markers in Point-of-Care Lung Ultrasound
Subhankar Roy, Willi Menapace, Sebastiaan Oei, Ben Luijten, Enrico Fini, et al.
TL;DR: Studies the application of deep learning techniques (weakly supervised localization, video-level score aggregation and pixel-level classification) to lung ultrasound images. Several evaluation metrics are introduced to properly evaluate methods on the proposed lung ultrasound benchmark.
TMI 2020, Journal

Metric-Learning-Based Deep Hashing Network for Content-Based Retrieval of Remote Sensing Images
Subhankar Roy, Enver Sangineto, Begüm Demir, Nicu Sebe
TL;DR: A metric-learning-based hashing network that implicitly uses a large pretrained DNN as an intermediate representation step, without the need for retraining or fine-tuning. Our method learns a semantic-based metric space where the features are optimized for the target retrieval task.
GRSL 2020, Letters

Regularized Evolutionary Algorithm for Dynamic Neural Topology Search
Cristiano Saltori, Subhankar Roy, Nicu Sebe, Giovanni Iacca
TL;DR: We propose a network architecture search method that leverages an evolutionary algorithm to evolve a dynamic image classifier. Our proposed Regularized Evolutionary Algorithm has a lower memory footprint than existing methods and yet achieves competitive performance with respect to the state of the art.
ICIAP 2019, Trento, Italy

Unsupervised Domain Adaptation Using Full-Feature Whitening and Colouring
Subhankar Roy, Aliaksandr Siarohin, Nicu Sebe
TL;DR: Addresses unsupervised domain adaptation by using embedded full-feature whitening and coloring transformation blocks (\(\text{F}^2\text{WCT}\)). The proposed \(\text{F}^2\text{WCT}\) optimally aligns the feature distributions by ensuring that the source and target features have identical covariance matrices.
ICIAP 2019, Trento, Italy

Unsupervised domain adaptation using feature-whitening and consensus loss
Subhankar Roy, Aliaksandr Siarohin, Enver Sangineto, Samuel Rota Bulo, Nicu Sebe, Elisa Ricci
TL;DR: We propose domain alignment layers that implement feature whitening for the purpose of matching source and target feature distributions. Additionally, we leverage the unlabeled target data by proposing the Min-Entropy Consensus loss, which regularizes training.
CVPR 2019, Long Beach, California, USA

Semantic-Fusion GANs for Semi-Supervised Satellite Image Classification
Subhankar Roy, Enver Sangineto, Begüm Demir, Nicu Sebe
TL;DR: We propose a novel method for semi-supervised classification of satellite images using generative adversarial networks. The visual information is fed to the discriminator through two different channels: the original image and its “semantic” representation, the latter being obtained by means of an external network trained on ImageNet.
ICIP 2018, Athens, Greece