TWC issue #3

State-of-the-art papers with Github avatars of researchers who released code, models (in most cases), and demo apps (in a few cases) along with their papers. Image created from the papers described below.

Consolidation of daily Twitter posts tracking SOTA changes between 15–21 August 2022

  • Long tail learning
  • Skeleton based action recognition
  • Semantic segmentation - Weakly supervised; Medical Image Segmentation
  • Anomaly detection - supervised and unsupervised
  • Hand pose estimation

Official code releases (with pre-trained models in most cases) are also available for these tasks.


Long tail learning

Image from paper

Paper: Balanced Contrastive Learning for Long-Tailed Visual Recognition

Github code released by Jianggang Zhu (first author in paper) Model link: Pretrained models on Github page

Notes: This paper proposes a novel loss for representation learning on imbalanced data. The loss, called "balanced contrastive learning", helps the model perform competitively on long-tailed benchmark datasets.

Model Name: BCL(ResNet-32)

Score (↓) : 46.1 (Prev: 46.45)

Δ: 0.35 (Metric: Error rate)

Dataset: CIFAR-100

Demo page link? None to date

Google colab link? None to date

Container image? None to date
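
The Score, Prev and Δ fields in each entry are related by a subtraction whose direction depends on the arrow next to Score (↓ means lower is better, ↑ means higher is better). A minimal sketch of that arithmetic follows; the helper name `sota_delta` and its parameters are hypothetical, introduced only for illustration.

```python
def sota_delta(score, prev, lower_is_better=False, ndigits=4):
    """Return how much `score` improves on `prev`, as a positive number
    when the new result is better.

    lower_is_better=True corresponds to the (↓) arrow (e.g. error rate),
    lower_is_better=False to the (↑) arrow (e.g. accuracy).
    """
    diff = (prev - score) if lower_is_better else (score - prev)
    # Round to hide floating-point noise from subtracting decimal values.
    return round(diff, ndigits)


# BCL (ResNet-32) on CIFAR-100: error rate, lower is better.
print(sota_delta(46.1, 46.45, lower_is_better=True))
```

Applied to the entry above, this reproduces the reported Δ of 0.35.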


Skeleton based action recognition

Image from paper

Paper: Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition

Github code  released by Lianyu Hu (first author in paper) Model link: None to date

Notes: This model claims to be a robust feature extractor for the task of skeleton-based action recognition. It achieves this by reducing spatiotemporal feature redundancy.

Model Name: STGAT

Score (↑) : 90.4 (Prev: 89.9)

Δ: 0.5 (Metric: Accuracy)

Dataset: NTU RGB+D 120 (link appears to be broken)

Model links. None to date

Demo page link? None to date

Google colab link? None to date

Container image? None to date


Semantic segmentation

Weakly supervised semantic segmentation

Image from paper

Paper: RecurSeed and EdgePredictMix: Single-stage Learning is Sufficient for Weakly-Supervised Semantic Segmentation

Github code  released by Sanghyun Jo (first author in paper) Model link: Pretrained models on Github page

Notes: This paper proposes reducing false negatives and false positives in semantic segmentation by alternately reducing each and converging on an optimal point that minimizes both errors.

Model Name: RS+EPM

Score (↑) : 74.4 (Prev: 72.2); 46.4 (Prev: 44.7)

Δ: 2.2 (Metric: Mean IoU); 1.7 (Metric: Mean IoU)

Dataset: PASCAL VOC 2012 val, COCO 2014 val

Demo page link? None to date

Google colab link

Container image? None to date


Medical Image Segmentation

Image from paper

Paper: FCN-Transformer Feature Fusion for Polyp Segmentation

Github code released by Edward Sanderson (first author in paper) Model link: Pretrained models on Github page

Notes: This model leverages the strengths of transformers and convolutions for lesion detection and classification in colonoscopy images.

Model Name: FCBFormer

Score (↑) : 0.9385 (Prev: 0.9357)

Δ: 0.0028 (Metric: mean DICE). The DICE score quantifies the pixel-wise similarity between the model's predicted segmentation mask and the ground truth, and ranges from 0 (no similarity) to 1 (identical).

Dataset: Kvasir-SEG

Demo page link? None to date

Google colab link? None to date

Container image? None to date
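
The DICE score used in the polyp segmentation entry above can be sketched in a few lines. This is a minimal illustration for binary masks using NumPy, not the evaluation code from the FCBFormer repository; the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np


def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        # Assumed convention: two empty masks count as identical.
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
print(dice_score(pred, true))  # 2 * 2 / (3 + 3)
```

A "mean DICE" figure would then typically be the average of this value over all test images, although the exact averaging used for the leaderboard number is not stated here.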


Anomaly detection

Unsupervised anomaly detection

Image from paper

Paper: Semi-orthogonal Embedding for Efficient Unsupervised Anomaly Segmentation

Github code  released by Jin-Hwa Kim (first author in paper) Model link: None to date

Notes: Anomaly segmentation tends to be a harder problem than anomaly detection in general. Autoencoder and GAN approaches detect anomalies through reconstruction error, but when the models have sufficient capacity they tend to reconstruct anomalous samples as well. This paper uses the Mahalanobis distance to measure how far an anomalous sample is from the mean of the distribution of normal samples. Semi-orthogonal embeddings are learned as a low-rank approximation for computing the Mahalanobis distance of the input sample from that mean.

Model Name: Semi-orthogonal

Score (↑) : 96

Δ: 96 (Metric: Segmentation AUROC)

Dataset: KolektorSDD (surface defect)

Model links. None to date

Demo page link? None to date

Google colab link? None to date

Container image? None to date
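
The Mahalanobis distance mentioned in the notes for this entry, computed in a low-rank embedding, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the random (rather than learned) semi-orthogonal matrix, the feature dimensions, and the small regularization term are all hypothetical.

```python
import numpy as np


def random_semi_orthogonal(dim, k, rng):
    """Return a (dim, k) matrix W with orthonormal columns (W.T @ W = I_k).

    Assumption: W is drawn at random via QR here purely for illustration.
    """
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    return q


def fit_gaussian(features):
    """Mean and covariance of normal (non-anomalous) feature vectors."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


def mahalanobis(x, mean, cov, reg=1e-6):
    """Distance of x from the mean, scaled by the (regularized) covariance."""
    d = x - mean
    cov_reg = cov + reg * np.eye(cov.shape[0])
    return float(np.sqrt(d @ np.linalg.solve(cov_reg, d)))


rng = np.random.default_rng(0)
normal_features = rng.standard_normal((500, 64))  # synthetic "normal" samples
W = random_semi_orthogonal(64, 16, rng)           # 64-d features -> 16-d embedding

# Fit the Gaussian in the embedded space, so only a 16x16 covariance is needed.
mean, cov = fit_gaussian(normal_features @ W)

inlier = rng.standard_normal(64)
outlier = inlier + 10.0                           # shifted far from the normal data
print(mahalanobis(inlier @ W, mean, cov))
print(mahalanobis(outlier @ W, mean, cov))
```

Working in the embedded space keeps the covariance small, which is the efficiency argument behind a low-rank approximation; the shifted sample should score a much larger distance than the unshifted one.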


Supervised anomaly detection

Image from paper

Paper: Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes

Github code  released by Yuyuan Liu (second author in paper) Model link: None to date

Notes: This paper performs anomaly detection by jointly training an energy-based model (EBM) and an abstention learning (AL) model. The EBM associates high energy with anomalous pixels, while the AL model adaptively ascribes a low penalty for being included in the anomalous class.

Model Name: PEBAL

Score (↑) : 44.17 (Prev: 43.22)

Δ: 0.95 (Metric: AP)

Dataset: Fishyscapes L&F

Model links. Not released to date

Demo page link? None to date

Google colab link? None to date

Container image? None to date
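
To make "high energy for anomalous pixels" in the notes above concrete, here is a minimal sketch of a per-pixel energy score. It assumes the free-energy definition common in energy-based out-of-distribution detection, the negative log-sum-exp of the class logits; that definition, the function name `free_energy`, and the toy logits are assumptions, and neither the PEBAL training loss nor the abstention branch is shown.

```python
import numpy as np


def free_energy(logits, axis=-1):
    """Per-pixel energy E = -log(sum_k exp(logit_k)), computed stably."""
    m = logits.max(axis=axis, keepdims=True)
    lse = np.squeeze(m, axis=axis) + np.log(np.exp(logits - m).sum(axis=axis))
    return -lse


# Toy logits for a 1x2 "image" with 3 classes (shape: H x W x classes).
logits = np.array([[[10.0, 0.0, 0.0],    # confident pixel: one dominant class
                    [0.0, 0.0, 0.0]]])   # uncertain pixel: flat logits
energy = free_energy(logits)
print(energy)  # the flat-logit pixel gets the higher (less negative) energy
```

Thresholding such an energy map is one way to flag candidate anomalous pixels under this assumed definition.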


Hand pose estimation

Paper: Efficient Virtual View Selection for 3D Hand Pose Estimation

Github code released by ME495  Model link: Pretrained models on Github page

Notes: This paper attempts to solve the problem of unsatisfactory pose estimation results of existing methods due to hand occlusion and view variations. It automatically selects multiple viewpoints for pose estimation and fuses the results to obtain a robust pose estimation.

Model Name: Virtual View Selection

Score (↓) : 6.4 (Prev: 7.48)

Δ: 1.08 (Metric: Average 3D error)

Dataset: NYU Hands

Demo page link? None to date

Google colab link? None to date

Container image? None to date