TWC issue #3

Consolidation of the daily Twitter posts tracking SOTA changes between 15–21 August 2022
- Long tail learning
- Skeleton based action recognition
- Semantic segmentation - Weakly supervised; Medical Image Segmentation
- Anomaly detection - supervised and unsupervised
- Hand pose estimation
Official code releases (with pre-trained models in most cases) are also available for these tasks.
Long tail learning

Paper: Balanced Contrastive Learning for Long-Tailed Visual Recognition
Github code released by Jianggang Zhu (first author in paper) Model link: Pretrained models on Github page
Notes: This paper proposes a novel loss for representation learning on imbalanced data. The loss, called "balanced contrastive learning", helps the model perform competitively on long-tail benchmark datasets.
Model Name: BCL(ResNet-32)
Score (↓) : 46.1 (Prev: 46.45)
Δ: .35 (Metric: Error rate )
Dataset: CIFAR-100
Demo page link? None to date
Google colab link? None to date
Container image? None to date
Skeleton based action recognition

Paper: Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition
Github code released by Lianyu Hu (first author in paper) Model link: None to date
Notes: The authors claim this model is a robust feature extractor for the task of skeleton-based action recognition. This is accomplished by reducing spatiotemporal feature redundancy.
Model Name: STGAT
Score (↑) : 90.4 (Prev: 89.9)
Δ: .5 (Metric: Accuracy )
Dataset: NTU RGB+D 120 (link appears to be broken)
Model links. None to date
Demo page link? None to date
Google colab link? None to date
Container image? None to date
Semantic segmentation
Weakly supervised semantic segmentation

Github code released by Sanghyun Jo (first author in paper) Model link: Pretrained models on Github page
Notes: This paper proposes a way to reduce both false negatives and false positives in semantic segmentation by alternately reducing each one and converging on an optimal point that minimizes both errors.
Model Name: RS+EPM
Score (↑) : 74.4 (Prev: 72.2); 46.4 (Prev: 44.7)
Δ: 2.2 (Metric: Mean IoU); 1.7 (IoU)
Dataset: PASCAL VOC 2012 val, COCO 2014 val
Demo page link? None to date
Google colab link
Container image? None to date
Medical Image Segmentation

Paper: FCN-Transformer Feature Fusion for Polyp Segmentation
Github link released by Edward Sanderson (first author in paper) Model link: Pretrained models on Github page
Notes: This model combines the strengths of transformers and convolutions to detect and classify lesions in colonoscopy images.
Model Name: FCBFormer
Score (↑) : 0.9385 (Prev: 0.9357)
Δ: 0.0028 (Metric: mean Dice). The Dice score quantifies the pixel-wise similarity between the model's predicted segmentation mask and the ground truth, and ranges from 0 (no similarity) to 1 (identical).
Dataset: Kvasir-SEG
Demo page link? None to date
Google colab link? None to date
Container image? None to date
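As a rough illustration of the Dice score described in the FCBFormer entry above, here is a minimal sketch for binary masks. The function name `dice_score`, the flat-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for this example. They are not taken from the paper or its code release.

```python
def dice_score(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 pixel labels
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks count as identical.
        return 1.0
    return 2.0 * intersection / total


identical = dice_score([1, 1, 0, 0], [1, 1, 0, 0])  # masks agree everywhere
disjoint = dice_score([1, 1, 0, 0], [0, 0, 1, 1])   # masks never overlap
partial = dice_score([1, 1, 0, 0], [1, 0, 0, 0])    # one of two predicted pixels is correct
```

A "mean Dice" figure like the one reported is typically the average of this per-image score over the evaluation set, though the exact averaging used for the benchmark is not stated here.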
Anomaly detection
Unsupervised anomaly detection

Paper: Semi-orthogonal Embedding for Efficient Unsupervised Anomaly Segmentation
Github code released by Jin-Hwa Kim (first author in paper) Model link: None to date
Notes: Anomaly segmentation tends to be a harder problem than anomaly detection in general. Autoencoder and GAN approaches, which detect anomalies through reconstruction error, also tend to reconstruct anomalous samples well when the models have sufficient capacity, so the reconstruction error no longer separates them from normal samples. This paper uses the Mahalanobis distance as a measure of how far an anomalous sample is from the mean of the distribution of normal samples. Semi-orthogonal embeddings are learned as a low-rank approximation for computing the Mahalanobis distance of the input sample from that mean.
Model Name: Semi-orthogonal
Score (↑) : 96
Δ: 96 (Metric: Segmentation AUROC)
Dataset: KolektorSDD (surface defect)
Model links. None to date
Demo page link? None to date
Google colab link? None to date
Container image? None to date
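The Mahalanobis-distance idea in the Semi-orthogonal entry above can be sketched on toy data as follows. This is an illustrative sketch, not the authors' implementation. It assumes the low-rank step projects features through a matrix `w` with orthonormal columns, obtained here from a QR decomposition of a random Gaussian matrix. The helper names, feature sizes, and toy data are all hypothetical.

```python
import numpy as np


def semi_orthogonal(d, k, rng):
    """Return a d x k matrix with orthonormal columns (w.T @ w = I_k)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q


def mahalanobis_sq(x, mean, cov, w=None):
    """Squared Mahalanobis distance of x from mean.

    If w is given, the difference vector and the covariance are both
    projected into the k-dimensional subspace spanned by w first.
    """
    diff = x - mean
    if w is not None:
        diff = w.T @ diff
        cov = w.T @ cov @ w
    # Solve cov @ y = diff instead of forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))


rng = np.random.default_rng(0)
normal_feats = rng.standard_normal((500, 16))  # toy "normal" features
mean = normal_feats.mean(axis=0)
cov = np.cov(normal_feats, rowvar=False) + 1e-6 * np.eye(16)  # small ridge for stability

w = semi_orthogonal(16, 4, rng)
typical = mean.copy()            # a sample sitting exactly at the mean
outlier = mean + 10.0 * w[:, 0]  # shifted along a direction kept by w

score_typical = mahalanobis_sq(typical, mean, cov, w)
score_outlier = mahalanobis_sq(outlier, mean, cov, w)
```

Working in the 4-dimensional projected space means only a 4 x 4 system is solved instead of a 16 x 16 one, which is the kind of saving a low-rank approximation is meant to provide. An anomaly shifted only along directions that `w` discards would be missed by a single random projection in this toy setup.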
Supervised anomaly detection

Github code released by Yuyuan Liu (second author in paper) Model link: None to date
Notes: This paper performs anomaly detection by jointly training an energy-based model (EBM) and an abstention learning (AL) model. The EBM associates high energy with anomalous pixels, while the AL model adaptively assigns a low penalty to pixels for being included in the anomalous class.
Model Name: PEBAL
Score (↑) : 44.17 (Prev: 43.22)
Δ: .95 (Metric: AP)
Dataset: Fishyscapes L&F
Model links. Not released to date
Demo page link? None to date
Google colab link? None to date
Container image? None to date
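To make the "high energy for anomalous pixels" idea in the PEBAL entry above concrete, here is a small sketch that assumes the commonly used free-energy score, the negative log-sum-exp of a pixel's class logits. Whether PEBAL uses exactly this form is an assumption here. The function name and the toy logits are hypothetical, and the abstention-learning part is not modelled.

```python
import math


def free_energy(logits):
    """Free energy of one pixel: -log(sum(exp(logit))) over its class logits.

    Computed with the usual max-shift so large logits do not overflow.
    """
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


# Toy per-pixel logits over three known classes.
confident_pixel = [10.0, 0.0, 0.0]  # one class clearly dominates
ambiguous_pixel = [0.0, 0.0, 0.0]   # no class stands out

energy_confident = free_energy(confident_pixel)
energy_ambiguous = free_energy(ambiguous_pixel)
```

Under this definition a pixel with no dominant class logit gets a higher energy than a confidently classified one, so thresholding the energy map gives a simple per-pixel anomaly score.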
Hand pose estimation

Paper: Efficient Virtual View Selection for 3D Hand Pose Estimation
Github code released by ME495 Model link: Pretrained models on Github page
Notes: This paper attempts to solve the problem of unsatisfactory pose estimation results of existing methods due to hand occlusion and view variations. It automatically selects multiple viewpoints for pose estimation and fuses the results to obtain a robust pose estimation.
Model Name: Virtual View Selection
Score (↓) : 6.4 (Prev: 7.48)
Δ: 1.08 (Metric: Average 3D error)
Dataset: NYU Hands
Demo page link? None to date
Google colab link? None to date
Container image? None to date