Subhankar Roy - Personal Website

Publications

Organizing Unstructured Image Collections using Natural Language

Mingxuan Liu , Zhun Zhong , Jun Li , Gianni Franchi , Subhankar Roy , Elisa Ricci

TL;DR: TeDeSC tackles Semantic Multiple Clustering by automatically discovering interpretable clustering criteria from large image collections using text-driven reasoning, enabling unsupervised organization and analysis without human input.

arXiv 2024 , Under review

arXiv

Weighted Ensemble Models Are Strong Continual Learners

Imad Eddine Marouf , Subhankar Roy , Enzo Tartaglione , Stéphane Lathuilière

TL;DR: To address the stability-plasticity trade-off in continual learning, we propose to perform weight-ensembling of the model parameters of the current and previous tasks.

ECCV 2024 , Milan, Italy

Proceedings arXiv GitHub

One-Shot Unlearning of Personal Identities

Thomas De Min , Subhankar Roy , Stéphane Lathuilière , Elisa Ricci , Massimiliano Mancini

TL;DR: O-UPI evaluates unlearning models without access to training data, proposing a meta-learning method to forget personal identities from a single image.

arXiv 2024 , Under review

arXiv

Large-scale pre-trained models are surprisingly strong in incremental novel class discovery

Mingxuan Liu , Subhankar Roy , Zhun Zhong , Nicu Sebe , Elisa Ricci

TL;DR: A simple yet effective baseline for continuous, unsupervised novel class discovery uses a frozen self-supervised pre-trained model with a learnable classifier that outperforms complex state-of-the-art methods.

ICPR 2024 , Kolkata, India

Proceedings arXiv GitHub

Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning

Thomas De Min , Massimiliano Mancini , Stéphane Lathuilière , Subhankar Roy , Elisa Ricci

TL;DR: MULTI-LANE is a prompt-tuning method for multi-label class-incremental learning that eliminates prompt selection by using task-specific pathways and summarized patch token embeddings.

CoLLAs 2024 , Pisa, Italy

arXiv GitHub

Simplifying open-set video domain adaptation with contrastive learning

Giacomo Zara , Victor Turrisi da Costa , Subhankar Roy , Paolo Rota , Elisa Ricci

TL;DR: We address open-set video domain adaptation with a unified contrastive learning framework that learns discriminative and well-clustered features. We show that a discriminative feature space simplifies the separation of the unknown classes from the known ones.

CVIU 2024 , Journal

Proceedings arXiv GitHub

Collaborating Foundation models for Domain Generalized Semantic Segmentation

Yasser Benigmim , Subhankar Roy , Slim Essid , Vicky Kalogeiton , Stéphane Lathuilière

TL;DR: CLOUDS is a general-purpose domain generalization framework that integrates the foundation models CLIP, Stable Diffusion, LLM and SAM into a single framework.

CVPR 2024 , Seattle, USA

arXiv GitHub

Democratizing Fine-grained Visual Recognition with Large Language Models

Mingxuan Liu , Subhankar Roy , Wenjing Li , Zhun Zhong , Nicu Sebe , Elisa Ricci

TL;DR: FineR leverages large language models to identify fine-grained image categories without expert annotations, by interpreting visual attributes as text. This allows it to reason about subtle differences between species or objects, outperforming current FGVR methods.

ICLR 2024 , Vienna, Austria

Proceedings GitHub

The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation

Giacomo Zara , Alessandro Conti , Subhankar Roy , Stéphane Lathuilière , Paolo Rota , Elisa Ricci

TL;DR: In this work we exploit “web-supervision” from Large Language-Vision Models (LLVMs) to address Source-Free Video Unsupervised Domain Adaptation, driven by the rationale that LLVMs contain a rich world prior that is surprisingly robust to domain shift. Our parameter-efficient method, which we name Domain Adaptation with Large Language-Vision models (DALLV), distills the world prior and complementary source model information into a student network tailored for the target domain.

ICCV 2023 , Paris, France

Proceedings arXiv GitHub

AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation

Giacomo Zara , Subhankar Roy , Paolo Rota , Elisa Ricci

TL;DR: CLIP's zero-shot protocol requires oracle knowledge about the target-private label names. Since these label names cannot be known in advance, we propose AutoLabel, which automatically discovers and generates object-centric compositional candidate target-private class names.

CVPR 2023 , Vancouver, Canada

Proceedings arXiv GitHub

Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation

Yangsong Zhang , Subhankar Roy , Hongtao Lu , Elisa Ricci , Stéphane Lathuilière

TL;DR: Self-training with cross-domain feature stylization and uncertainty quantification in the model predictions improves multi-target domain adaptation in semantic segmentation.

WACV 2023 , Waikoloa, Hawaii, USA

arXiv GitHub

Class-incremental Novel Class Discovery

Subhankar Roy , Mingxuan Liu , Zhun Zhong , Nicu Sebe , Elisa Ricci

TL;DR: A self-training strategy that exploits both clustering and old class prototypes to learn a joint classifier for all the base and novel classes. It also uses feature-level knowledge distillation to prevent forgetting.

ECCV 2022 , Tel Aviv, Israel

Proceedings arXiv GitHub

Uncertainty-guided Source-free Domain Adaptation

Subhankar Roy , Martin Trapp , Andrea Pilzer , Juho Kannala , Nicu Sebe , Elisa Ricci , Arno Solin

TL;DR: We propose quantifying the uncertainty in the source model predictions (using Laplace Approximation) and utilizing it to guide the target adaptation (e.g. maximizing Mutual Information).

ECCV 2022 , Tel Aviv, Israel

Proceedings arXiv GitHub

Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation

Subhankar Roy , Evgeny Krivosheev , Zhun Zhong , Nicu Sebe , Elisa Ricci

TL;DR: Curriculum Graph Co-Teaching uses a dual classifier head, with one of them being a graph convolutional network for aggregating features from similar samples across the domains, in order to obtain reliable pseudo labels in the target domain.

CVPR 2021 , Virtual

Project Page Proceedings arXiv GitHub

Neighborhood Contrastive Learning for Novel Class Discovery

Zhun Zhong , Enrico Fini , Subhankar Roy , Zhiming Luo , Elisa Ricci , Nicu Sebe

TL;DR: NCL learns discriminative representations by enforcing a query to be close to its correlated view (augmented-positive) and its pseudo-positives (neighbors), as well as to be far from the negatives. We also generate hard negatives by mixing labeled and unlabeled features.

CVPR 2021 , Virtual

Proceedings arXiv GitHub

TriGAN: image-to-image translation for multi-source domain adaptation

Subhankar Roy , Aliaksandr Siarohin , Enver Sangineto , Nicu Sebe , Elisa Ricci

TL;DR: A generative method for multi-source domain adaptation that learns a universal generator for stylizing the images of any labelled source domain to appear like the target domain. Whitening and Coloring transformations are used in TriGAN's universal generator.

MVA 2021 , Journal

Proceedings

Motion-supervised Co-Part Segmentation

Aliaksandr Siarohin , Subhankar Roy , Stéphane Lathuilière , Sergey Tulyakov , Elisa Ricci , Nicu Sebe

TL;DR: We propose a self-supervised method for co-part segmentation that leverages a collection of videos. The network learns to predict part segments together with a representation of the motion between two frames from a given video, which permits reconstruction of the target image.

ICPR 2021 , Virtual

Proceedings GitHub Video

Deep Learning for Classification and Localization of COVID-19 Markers in Point-of-Care Lung Ultrasound

Subhankar Roy , Willi Menapace , Sebastiaan Oei , Ben Luijten , Enrico Fini , et al.

TL;DR: Studies the application of deep learning techniques (weakly supervised localization, video-level score aggregation and pixel-level classification) on lung ultrasound images. Several evaluation metrics are introduced to properly evaluate methods on the proposed lung ultrasound benchmark.

TMI 2020 , Journal

Proceedings GitHub

Metric-Learning-Based Deep Hashing Network for Content-Based Retrieval of Remote Sensing Images

Subhankar Roy , Enver Sangineto , Begüm Demir , Nicu Sebe

TL;DR: A metric-learning-based hashing network, which implicitly uses a big, pretrained DNN as an intermediate representation step without the need for retraining or fine-tuning. Our method learns a semantic-based metric space where the features are optimized for the target retrieval task.

GRSL 2020 , Letters

Proceedings GitHub

Regularized Evolutionary Algorithm for Dynamic Neural Topology Search

Cristiano Saltori , Subhankar Roy , Nicu Sebe , Giovanni Iacca

TL;DR: We propose a network architecture search method that leverages an evolutionary algorithm to evolve a dynamic image classifier. Our proposed Regularized Evolutionary Algorithm has a lower memory footprint than existing methods and yet achieves competitive performance with respect to the state of the art.

ICIAP 2019 , Trento, Italy

Proceedings arXiv

Unsupervised Domain Adaptation Using Full-Feature Whitening and Colouring

Subhankar Roy , Aliaksandr Siarohin , Nicu Sebe

TL;DR: It addresses unsupervised domain adaptation by using embedded full-whitening and coloring transformation blocks \(\text{F}^2\text{WCT}\). The proposed \(\text{F}^2\text{WCT}\) optimally aligns the feature distributions by ensuring that the source and target features have identical covariance matrices.

ICIAP 2019 , Trento, Italy

Proceedings

Unsupervised domain adaptation using feature-whitening and consensus loss

Subhankar Roy , Aliaksandr Siarohin , Enver Sangineto , Samuel Rota Bulo , Nicu Sebe , Elisa Ricci

TL;DR: We propose domain alignment layers that implement feature whitening for the purpose of matching source and target feature distributions. Additionally, we leverage the unlabeled target data by proposing the Min-Entropy Consensus loss, which regularizes training.

CVPR 2019 , Long Beach, California, USA

Proceedings arXiv GitHub

Semantic-Fusion GANs for Semi-Supervised Satellite Image Classification

Subhankar Roy , Enver Sangineto , Begüm Demir , Nicu Sebe

TL;DR: We propose a novel method for semi-supervised classification of satellite images using generative adversarial networks. The visual information is fed to the discriminator through two different channels: the original image and its “semantic” representation, the latter being obtained from an external network trained on ImageNet.

ICIP 2018 , Athens, Greece

Proceedings GitHub
