CVPR 2021 Papers and Open-Source Projects Collection
Posted AI浩
CVPR 2021 Papers and Open-Source Projects Collection (Papers with Code)
Repository: https://github.com/amusi/CVPR2021-Papers-with-Code
- Best Paper
- Backbone
- NAS
- GAN
- VAE
- Visual Transformer
- Regularization
- SLAM
- Long-Tailed
- Data Augmentation
- Unsupervised / Self-Supervised
- Semi-Supervised
- Capsule Network
- Image Classification
- 2D Object Detection
- Single/Multi-Object Tracking
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Medical Image Segmentation
- Video Object Segmentation
- Interactive Video Object Segmentation
- Saliency Detection
- Camouflaged Object Detection
- Co-Salient Object Detection
- Image Matting
- Person Re-identification
- Person Search
- Video Understanding / Action Recognition
- Face Recognition
- Face Detection
- Face Anti-Spoofing
- Deepfake Detection
- Age Estimation
- Facial Expression Recognition
- Deepfakes
- Human Parsing
- 2D/3D Human Pose Estimation
- Animal Pose Estimation
- Hand Pose Estimation
- Human Volumetric Capture
- Scene Text Recognition
- Image Compression
- Model Compression / Pruning / Quantization
- Knowledge Distillation
- Super-Resolution
- Dehazing
- Image Restoration
- Image Inpainting
- Image Editing
- Image Captioning
- Font Generation
- Image Matching
- Image Blending
- Reflection Removal
- 3D Point Cloud Classification
- 3D Object Detection
- 3D Semantic Segmentation
- 3D Panoptic Segmentation
- 3D Object Tracking
- 3D Point Cloud Registration
- 3D Point Cloud Completion
- 3D Reconstruction
- 6D Pose Estimation
- Camera Pose Estimation
- Depth Estimation
- Stereo Matching
- Optical Flow Estimation
- Lane Detection
- Trajectory Prediction
- Crowd Counting
- Adversarial Examples
- Image Retrieval
- Video Retrieval
- Cross-modal Retrieval
- Zero-Shot Learning
- Federated Learning
- Video Frame Interpolation
- Visual Reasoning
- Image Synthesis
- View Synthesis
- Style Transfer
- Layout Generation
- Domain Generalization
- Domain Adaptation
- Open-Set
- Adversarial Attack
- Human-Object Interaction (HOI) Detection
- Shadow Removal
- Virtual Try-On
- Label Noise
- Video Stabilization
- Datasets
- Others
- To Be Added (TODO)
- Acceptance Unconfirmed (Not Sure)
Best Paper
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
- Homepage: https://m-niemeyer.github.io/project-pages/giraffe/index.html
- Paper(Oral): https://arxiv.org/abs/2011.12100
- Code: https://github.com/autonomousvision/giraffe
- Demo: http://www.youtube.com/watch?v=fIaDXC-qRSg&vq=hd1080&autoplay=1
Backbone
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Paper(Oral): https://arxiv.org/abs/2106.06560
- Code: https://github.com/dingmyu/HR-NAS
BCNet: Searching for Network Width with Bilaterally Coupled Network
- Paper: https://arxiv.org/abs/2105.10533
- Code: None
Decoupled Dynamic Filter Networks
- Homepage: https://thefoxofsky.github.io/project_pages/ddf
- Paper: https://arxiv.org/abs/2104.14107
- Code: https://github.com/thefoxofsky/DDF
Lite-HRNet: A Lightweight High-Resolution Network
- Paper: https://arxiv.org/abs/2104.06403
- Code: https://github.com/HRNet/Lite-HRNet
CondenseNet V2: Sparse Feature Reactivation for Deep Networks
- Paper: https://arxiv.org/abs/2104.04382
- Code: https://github.com/jianghaojun/CondenseNetV2
Diverse Branch Block: Building a Convolution as an Inception-like Unit
- Paper: https://arxiv.org/abs/2103.13425
- Code: https://github.com/DingXiaoH/DiverseBranchBlock
Scaling Local Self-Attention For Parameter Efficient Visual Backbones
- Paper(Oral): https://arxiv.org/abs/2103.12731
- Code: None
ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network
- Paper: https://arxiv.org/abs/2007.00992
- Code: https://github.com/clovaai/rexnet
Involution: Inverting the Inherence of Convolution for Visual Recognition
- Paper: https://arxiv.org/abs/2103.06255
- Code: https://github.com/d-li14/involution
Coordinate Attention for Efficient Mobile Network Design
- Paper: https://arxiv.org/abs/2103.02907
- Code: https://github.com/Andrew-Qibin/CoordAttention
Inception Convolution with Efficient Dilation Search
- Paper: https://arxiv.org/abs/2012.13587
- Code: https://github.com/yifan123/IC-Conv
RepVGG: Making VGG-style ConvNets Great Again
- Paper: https://arxiv.org/abs/2101.03697
- Code: https://github.com/DingXiaoH/RepVGG
NAS
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Paper(Oral): https://arxiv.org/abs/2106.06560
- Code: https://github.com/dingmyu/HR-NAS
BCNet: Searching for Network Width with Bilaterally Coupled Network
- Paper: https://arxiv.org/abs/2105.10533
- Code: None
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
- Paper: https://arxiv.org/abs/2105.10154
- Code: None
Combined Depth Space based Architecture Search For Person Re-identification
- Paper: https://arxiv.org/abs/2104.04163
- Code: None
DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation
- Paper(Oral): https://arxiv.org/abs/2103.15954
- Code: None
Neural Architecture Search with Random Labels
- Paper: https://arxiv.org/abs/2101.11834
- Code: None
Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search
- Paper: https://arxiv.org/abs/2101.11342
- Code: None
Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation
- Paper: https://arxiv.org/abs/2105.12971
- Code: None
Prioritized Architecture Sampling with Monto-Carlo Tree Search
- Paper: https://arxiv.org/abs/2103.11922
- Code: https://github.com/xiusu/NAS-Bench-Macro
Contrastive Neural Architecture Search with Neural Architecture Comparators
- Paper: https://arxiv.org/abs/2103.05471
- Code: https://github.com/chenyaofo/CTNAS
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
- Paper: https://arxiv.org/abs/2011.09011
- Code: None
ReNAS: Relativistic Evaluation of Neural Architecture Search
- Paper: https://arxiv.org/abs/1910.01523
- Code: None
HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens
- Paper: https://arxiv.org/abs/2005.14446
- Code: None
Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator
- Paper: https://arxiv.org/abs/2103.07289
- Code: https://github.com/eric8607242/SGNAS
OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
- Paper: https://arxiv.org/abs/2103.04507
- Code: https://github.com/VDIGPKU/OPANAS
Inception Convolution with Efficient Dilation Search
- Paper: https://arxiv.org/abs/2012.13587
- Code: https://github.com/yifan123/IC-Conv
GAN
High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
- Paper: https://arxiv.org/abs/2105.09188
- Code: https://github.com/csjliang/LPTN
- Dataset: https://github.com/csjliang/LPTN
DG-Font: Deformable Generative Networks for Unsupervised Font Generation
- Paper: https://arxiv.org/abs/2104.03064
- Code: https://github.com/ecnuycxie/DG-Font
PD-GAN: Probabilistic Diverse GAN for Image Inpainting
- Paper: https://arxiv.org/abs/2105.02201
- Code: https://github.com/KumapowerLIU/PD-GAN
StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
- Paper: https://arxiv.org/abs/2104.14754
- Code: https://github.com/naver-ai/StyleMapGAN
- Demo Video: https://youtu.be/qCapNyRA_Ng
Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
- Paper: https://arxiv.org/abs/2104.05376
- Code: https://github.com/PaddlePaddle/PaddleGAN/
Regularizing Generative Adversarial Networks under Limited Data
- Homepage: https://hytseng0509.github.io/lecam-gan/
- Paper: https://faculty.ucmerced.edu/mhyang/papers/cvpr2021_gan_limited_data.pdf
- Code: https://github.com/google/lecam-gan
Towards Real-World Blind Face Restoration with Generative Facial Prior
- Paper: https://arxiv.org/abs/2101.04061
- Code: None
TediGAN: Text-Guided Diverse Image Generation and Manipulation
- Homepage: https://xiaweihao.com/projects/tedigan/
- Paper: https://arxiv.org/abs/2012.03308
- Code: https://github.com/weihaox/TediGAN
Generative Hierarchical Features from Synthesizing Images
- Homepage: https://genforce.github.io/ghfeat/
- Paper(Oral): https://arxiv.org/abs/2007.10379
- Code: https://github.com/genforce/ghfeat
Teachers Do More Than Teach: Compressing Image-to-Image Models
- Paper: https://arxiv.org/abs/2103.03467
- Code: https://github.com/snap-research/CAT
HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms
- Paper: https://arxiv.org/abs/2011.11731
- Code: https://github.com/mahmoudnafifi/HistoGAN
pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
- Homepage: https://marcoamonteiro.github.io/pi-GAN-website/
- Paper(Oral): https://arxiv.org/abs/2012.00926
- Code: None
DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
- Paper: https://arxiv.org/abs/2103.07893
- Code: None
Diverse Semantic Image Synthesis via Probability Distribution Modeling
- Paper: https://arxiv.org/abs/2103.06878
- Code: https://github.com/tzt101/INADE.git
LOHO: Latent Optimization of Hairstyles via Orthogonalization
- Paper: https://arxiv.org/abs/2103.03891
- Code: None
PISE: Person Image Synthesis and Editing with Decoupled GAN
- Paper: https://arxiv.org/abs/2103.04023
- Code: https://github.com/Zhangjinso/PISE
DeFLOCNet: Deep Image Editing via Flexible Low-level Controls
- Paper: http://raywzy.com/
- Code: http://raywzy.com/
Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
- Paper: https://www.researchgate.net/publication/349309756_Efficient_Conditional_GAN_Transfer_with_Knowledge_Propagation_across_Classes
- Code: http://github.com/mshahbazi72/cGANTransfer
Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
- Paper: https://arxiv.org/abs/2011.14107
- Code: None
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
- Homepage: https://eladrich.github.io/pixel2style2pixel/
- Paper: https://arxiv.org/abs/2008.00951
- Code: https://github.com/eladrich/pixel2style2pixel
A 3D GAN for Improved Large-pose Facial Recognition
- Paper: https://arxiv.org/abs/2012.10545
- Code: None
HumanGAN: A Generative Model of Humans Images
- Paper: https://arxiv.org/abs/2103.06902
- Code: None
ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
- Paper: https://arxiv.org/abs/2103.02264
- Code: https://github.com/MingyuY/Iterative-view-synthesis
CoMoGAN: continuous model-guided image-to-image translation
- Paper(Oral): https://arxiv.org/abs/2103.06879
- Code: https://github.com/cv-rits/CoMoGAN
Training Generative Adversarial Networks in One Stage
- Paper: https://arxiv.org/abs/2103.00430
- Code: None
Closed-Form Factorization of Latent Semantics in GANs
- Homepage: https://genforce.github.io/sefa/
- Paper(Oral): https://arxiv.org/abs/2007.06600
- Code: https://github.com/genforce/sefa
Anycost GANs for Interactive Image Synthesis and Editing
- Paper: https://arxiv.org/abs/2103.03243
- Code: https://github.com/mit-han-lab/anycost-gan
Image-to-image Translation via Hierarchical Style Disentanglement
- Paper: https://arxiv.org/abs/2103.01456
- Code: https://github.com/imlixinyang/HiSD
VAE
Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders
- Homepage: https://taldatech.github.io/soft-intro-vae-web/
- Paper: https://arxiv.org/abs/2012.13253
- Code: https://github.com/taldatech/soft-intro-vae-pytorch
Visual Transformer
1. End-to-End Human Pose and Mesh Reconstruction with Transformers
- Paper: https://arxiv.org/abs/2012.09760
- Code: https://github.com/microsoft/MeshTransformer
2. Temporal-Relational CrossTransformers for Few-Shot Action Recognition
- Paper: https://arxiv.org/abs/2101.06184
- Code: https://github.com/tobyperrett/trx
3. Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
- Paper: https://arxiv.org/abs/2103.16110
- Code: https://github.com/mczhuge/Kaleido-BERT
4. HOTR: End-to-End Human-Object Interaction Detection with Transformers
- Paper: https://arxiv.org/abs/2104.13682
- Code: https://github.com/kakaobrain/HOTR
5. Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
- Paper: https://arxiv.org/abs/2104.09224
- Code: https://github.com/autonomousvision/transfuser
6. Pose Recognition with Cascade Transformers
- Paper: https://arxiv.org/abs/2104.06976
- Code: https://github.com/mlpc-ucsd/PRTR
7. Variational Transformer Networks for Layout Generation
- Paper: https://arxiv.org/abs/2104.02416
- Code: None
8. LoFTR: Detector-Free Local Feature Matching with Transformers
- Homepage: https://zju3dv.github.io/loftr/
- Paper: https://arxiv.org/abs/2104.00680
- Code: https://github.com/zju3dv/LoFTR
9. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
- Paper: https://arxiv.org/abs/2012.15840
- Code: https://github.com/fudan-zvg/SETR
10. Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers
- Paper: https://arxiv.org/abs/2103.16553
- Code: None
11. Transformer Tracking
- Paper: https://arxiv.org/abs/2103.15436
- Code: https://github.com/chenxin-dlut/TransT
12. HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Paper(Oral): https://arxiv.org/abs/2106.06560
- Code: https://github.com/dingmyu/HR-NAS
13. MIST: Multiple Instance Spatial Transformer
- Paper: https://arxiv.org/abs/1811.10725
- Code: None
14. Multimodal Motion Prediction with Stacked Transformers
- Paper: https://arxiv.org/abs/2103.11624
- Code: https://decisionforce.github.io/mmTransformer
15. Revamping cross-modal recipe retrieval with hierarchical Transformers and self-supervised learning
- Paper: https://www.amazon.science/publications/revamping-cross-modal-recipe-retrieval-with-hierarchical-transformers-and-self-supervised-learning
- Code: https://github.com/amzn/image-to-recipe-transformers
16. Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
- Paper(Oral): https://arxiv.org/abs/2103.11681
- Code: https://github.com/594422814/TransformerTrack
17. Pre-Trained Image Processing Transformer
- Paper: https://arxiv.org/abs/2012.00364
- Code: None
18. End-to-End Video Instance Segmentation with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.14503
- Code: https://github.com/Epiphqny/VisTR
19. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.09094
- Code: https://github.com/dddzg/up-detr
20. End-to-End Human Object Interaction Detection with HOI Transformer
- Paper: https://arxiv.org/abs/2103.04503
- Code: https://github.com/bbepoch/HoiTransformer
21. Transformer Interpretability Beyond Attention Visualization
- Paper: https://arxiv.org/abs/2012.09838
- Code: https://github.com/hila-chefer/Transformer-Explainability
22. Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer
- Paper: None
- Code: None
23. LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity
- Paper: None
- Code: None
24. Line Segment Detection Using Transformers without Edges
- Paper(Oral): https://arxiv.org/abs/2101.01909
- Code: None
25. MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Wang_MaX-DeepLab_End-to-End_Panoptic_Segmentation_With_Mask_Transformers_CVPR_2021_paper.html
- Code: None
26. SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
- Paper(Oral): https://arxiv.org/abs/2101.08833
- Code: https://github.com/dukebw/SSTVOS
27. Facial Action Unit Detection With Transformers
- Paper: None
- Code: None
28. Clusformer: A Transformer Based Clustering Approach to Unsupervised Large-Scale Face and Visual Landmark Recognition
- Paper: None
- Code: None
29. Lesion-Aware Transformers for Diabetic Retinopathy Grading
- Paper: None
- Code: None
30. Topological Planning With Transformers for Vision-and-Language Navigation
- Paper: https://arxiv.org/abs/2012.05292
- Code: None
31. Adaptive Image Transformer for One-Shot Object Detection
- Paper: None
- Code: None
32. Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos
- Paper: None
- Code: None
33. Taming Transformers for High-Resolution Image Synthesis
- Homepage: https://compvis.github.io/taming-transformers/
- Paper(Oral): https://arxiv.org/abs/2012.09841
- Code: https://github.com/CompVis/taming-transformers
34. Self-Supervised Video Hashing via Bidirectional Transformers
- Paper: None
- Code: None
35. Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos
- Paper(Oral): https://hehefan.github.io/pdfs/p4transformer.pdf
- Code: None
36. Gaussian Context Transformer
- Paper: None
- Code: None
37. General Multi-Label Image Classification With Transformers
- Paper: https://arxiv.org/abs/2011.14027
- Code: None
38. Bottleneck Transformers for Visual Recognition
- Paper: https://arxiv.org/abs/2101.11605
- Code: None
39. VLN BERT: A Recurrent Vision-and-Language BERT for Navigation
- Paper(Oral): https://arxiv.org/abs/2011.13922
- Code: https://github.com/YicongHong/Recurrent-VLN-BERT
40. Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
- Paper(Oral): https://arxiv.org/abs/2102.06183
- Code: https://github.com/jayleicn/ClipBERT
41. Self-attention based Text Knowledge Mining for Text Detection
- Paper: None
- Code: https://github.com/CVI-SZU/STKM
42. SSAN: Separable Self-Attention Network for Video Representation Learning
- Paper: None
- Code: None
43. Scaling Local Self-Attention For Parameter Efficient Visual Backbones
- Paper(Oral): https://arxiv.org/abs/2103.12731
- Code: None
Regularization
Regularizing Neural Networks via Adversarial Model Perturbation
- Paper: https://arxiv.org/abs/2010.04925
- Code: https://github.com/hiyouga/AMP-Regularizer
SLAM
Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation
- Paper: https://arxiv.org/abs/2105.07593
- Code: None
Generalizing to the Open World: Deep Visual Odometry with Online Adaptation
- Paper: https://arxiv.org/abs/2103.15279
- Code: https://arxiv.org/abs/2103.15279
Long-Tailed
Adversarial Robustness under Long-Tailed Distribution
- Paper(Oral): https://arxiv.org/abs/2104.02703
- Code: https://github.com/wutong16/Adversarial_Long-Tail
Distribution Alignment: A Unified Framework for Long-tail Visual Recognition
- Paper: https://arxiv.org/abs/2103.16370
- Code: https://github.com/Megvii-BaseDetection/DisAlign
Adaptive Class Suppression Loss for Long-Tail Object Detection
- Paper: https://arxiv.org/abs/2104.00885
- Code: https://github.com/CASIA-IVA-Lab/ACSL
Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification
- Paper: https://arxiv.org/abs/2103.14267
- Code: None
Data Augmentation
Scale-aware Automatic Augmentation for Object Detection
- Paper: https://arxiv.org/abs/2103.17220
- Code: https://github.com/Jia-Research-Lab/SA-AutoAug
Unsupervised / Self-Supervised
Domain-Specific Suppression for Adaptive Object Detection
- Paper: https://arxiv.org/abs/2105.03570
- Code: None
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
- Paper: https://arxiv.org/abs/2104.14558
- Code: https://github.com/facebookresearch/SlowFast
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
- Paper: https://arxiv.org/abs/2104.12961
- Code: None
Self-supervised Video Representation Learning by Context and Motion Decoupling
- Paper: https://arxiv.org/abs/2104.00862
- Code: None
Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
- Homepage: https://fingerrec.github.io/index_files/jinpeng/papers/CVPR2021/project_website.html
- Paper: https://arxiv.org/abs/2009.05769
- Code: https://github.com/FingerRec/BE
Spatially Consistent Representation Learning
- Paper: https://arxiv.org/abs/2103.06122
- Code: None
VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples
- Paper: https://arxiv.org/abs/2103.05905
- Code: https://github.com/tinapan-pt/VideoMoCo
Exploring Simple Siamese Representation Learning
- Paper(Oral): https://arxiv.org/abs/2011.10566
- Code: None
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
- Paper(Oral): https://arxiv.org/abs/2011.09157
- Code: https://github.com/WXinlong/DenseCL
Semi-Supervised
Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
- Affiliation: Alibaba
- Paper: https://arxiv.org/abs/2103.11402
- Code: None
Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
- Paper: https://arxiv.org/abs/2103.02193
- Code: https://github.com/SHI-Labs/Semi-Supervised-Transfer-Learning
Capsule Network
Capsule Network is Not More Robust than Convolutional Network
- Paper: https://arxiv.org/abs/2103.15459
- Code: None
Image Classification
Correlated Input-Dependent Label Noise in Large-Scale Image Classification
- Paper(Oral): https://arxiv.org/abs/2105.10305
- Code: https://github.com/google/uncertainty-baselines/tree/master/baselines/imagenet
2D Object Detection
General 2D Object Detection
1. Scaled-YOLOv4: Scaling Cross Stage Partial Network
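- Affiliation: Academia Sinica, Intel, Providence University
- Paper: https://arxiv.org/abs/2011.08036
- Code: https://github.com/WongKinYiu/ScaledYOLOv4
- Chinese write-up: The official improved version of YOLOv4 is here! 55.8% AP, up to 1774 FPS; Scaled-YOLOv4 is now open source.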
2. You Only Look One-level Feature
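- Affiliation: Chinese Academy of Sciences (CAS), University of Chinese Academy of Sciences (UCAS), Megvii
- Paper: https://arxiv.org/abs/2103.09460
- Code: https://github.com/megvii-model/YOLOF
- Chinese write-up: CVPR 2021 | No FPN! CAS and Megvii propose YOLOF: You Only Look One-level Feature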
3. Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
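- Affiliation: The University of Hong Kong, Tongji University, ByteDance AI Lab, University of California, Berkeley
- Paper: https://arxiv.org/abs/2011.12450
- Code: https://github.com/PeizeSun/SparseR-CNN
- Chinese write-up: A new paradigm for object detection! HKU, Tongji, and Berkeley propose Sparse R-CNN; the code has just been open-sourced.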
4. End-to-End Object Detection with Fully Convolutional Network
- Affiliation: Megvii, Xi'an Jiaotong University
- Paper: https://arxiv.org/abs/2012.03544
- Code: https://github.com/Megvii-BaseDetection/DeFCN
5. Dynamic Head: Unifying Object Detection Heads with Attentions
- Affiliation: Microsoft
- Paper: https://arxiv.org/abs/2106.08322
- Code: https://github.com/microsoft/DynamicHead
- Chinese write-up: 60.6 AP, breaking the COCO record! Microsoft proposes DyHead: unifying object detection heads with attention
6. Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
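- Affiliation: Nanjing University of Science and Technology, Momenta, Nanjing University, Tsinghua University
- Paper: https://arxiv.org/abs/2011.12885
- Code: https://github.com/implus/GFocalV2
- Chinese write-up: CVPR 2021 | GFLV2: a solid technique for object detection that improves accuracy at no extra cost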
7. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
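- Affiliation: South China University of Technology, Tencent WeChat AI
- Paper(Oral): https://arxiv.org/abs/2011.09094
- Code: https://github.com/dddzg/up-detr
- Chinese write-up: CVPR 2021 Oral | Transformers strike again! South China University of Technology and WeChat propose UP-DETR: an unsupervised pre-trained detector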
8. MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
- Affiliation: University of Wisconsin, Google
- Paper: https://openaccess.thecvf.com/content/CVPR2021/papers/Xiong_MobileDets_Searching_for_Object_Detection_Architectures_for_Mobile_Accelerators_CVPR_2021_paper.pdf
- Code: https://github.com/tensorflow/models/tree/master/research/object_detection
9. Tracking Pedestrian Heads in Dense Crowd
- Affiliation: University of Rennes 1
- Homepage: https://project.inria.fr/crowdscience/project/dense-crowd-head-tracking/
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Sundararaman_Tracking_Pedestrian_Heads_in_Dense_Crowd_CVPR_2021_paper.html
- Code1: https://github.com/Sentient07/HeadHunter
- Code2: https://github.com/Sentient07/HeadHunter%E2%80%93T
- Dataset: https://project.inria.fr/crowdscience/project/dense-crowd-head-tracking/
10. Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation
- Affiliation: Hong Kong University of Science and Technology, Huawei Noah's Ark Lab
- Paper: https://arxiv.org/abs/2105.12971
- Code: None
11. PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery
- Affiliation: A*STAR, Sichuan University, Nanyang Technological University
- Paper: https://arxiv.org/abs/2105.12990
- Code: None
12. IQDet: Instance-wise Quality Distribution Sampling for Object Detection
- Affiliation: Megvii
- Paper: https://arxiv.org/abs/2104.06936
- Code: None
13. Multi-Scale Aligned Distillation for Low-Resolution Detection
- Affiliation: The Chinese University of Hong Kong, Adobe Research, SmartMore
- Paper: https://jiaya.me/papers/ms_align_distill_cvpr21.pdf
- Code: https://github.com/Jia-Research-Lab/MSAD
14. Adaptive Class Suppression Loss for Long-Tail Object Detection
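- Affiliation: Chinese Academy of Sciences (CAS), University of Chinese Academy of Sciences (UCAS), ObjectEye, Peking University, Peng Cheng Laboratory, Nexwise
- Paper: https://arxiv.org/abs/2104.00885
- Code: https://github.com/CASIA-IVA-Lab/ACSL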
15. VarifocalNet: An IoU-aware Dense Object Detector
- Affiliation: Queensland University of Technology, The University of Queensland
- Paper(Oral): https://arxiv.org/abs/2008.13367
- Code: https://github.com/hyz-xmaster/VarifocalNet
16. OTA: Optimal Transport Assignment for Object Detection
- Affiliation: Waseda University, Megvii
- Paper: https://arxiv.org/abs/2103.14259
- Code: https://github.com/Megvii-BaseDetection/OTA
17. Distilling Object Detectors via Decoupled Features
- Affiliation: Huawei Noah's Ark Lab, The University of Sydney
- Paper: https://arxiv.org/abs/2103.14475
- Code: https://github.com/ggjy/DeFeat.pytorch
18. Robust and Accurate Object Detection via Adversarial Learning
- Affiliation: Google, UCLA, UCSC
- Paper: https://arxiv.org/abs/2103.13886
- Code: None
19. OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
- Affiliation: Peking University, Anyvision, Stony Brook University
- Paper: https://arxiv.org/abs/2103.04507
- Code: https://github.com/VDIGPKU/OPANAS
20. Multiple Instance Active Learning for Object Detection
- Affiliation: University of Chinese Academy of Sciences (UCAS), Huawei Noah's Ark Lab, Tsinghua University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/papers/Yuan_Multiple_Instance_Active_Learning_for_Object_Detection_CVPR_2021_paper.pdf
- Code: https://github.com/yuantn/MI-AOD
21. Towards Open World Object Detection
- Affiliation: Indian Institute of Technology, MBZUAI, Australian National University, Linköping University
- Paper(Oral): https://arxiv.org/abs/2103.02603
- Code: https://github.com/JosephKJ/OWOD
22. RankDetNet: Delving Into Ranking Constraints for Object Detection
- Affiliation: Xilinx
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Liu_RankDetNet_Delving_Into_Ranking_Constraints_for_Object_Detection_CVPR_2021_paper.html
- Code: None
Rotated Object Detection
23. Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
- Affiliation: Shanghai Jiao Tong University, University of Chinese Academy of Sciences (UCAS)
- Paper: https://arxiv.org/abs/2011.09670
- Code1: https://github.com/Thinklab-SJTU/DCL_RetinaNet_Tensorflow
- Code2: https://github.com/yangxue0827/RotationDetection
24. ReDet: A Rotation-equivariant Detector for Aerial Object Detection
- Affiliation: Wuhan University
- Paper: https://arxiv.org/abs/2103.07733
- Code: https://github.com/csuhan/ReDet
25. Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection
- Affiliation: University of Chinese Academy of Sciences (UCAS), Tsinghua University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Guo_Beyond_Bounding-Box_Convex-Hull_Feature_Adaptation_for_Oriented_and_Densely_Packed_CVPR_2021_paper.html
- Code: https://github.com/SDL-GuoZonghao/BeyondBoundingBox
Few-Shot Object Detection
26. Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss
- Affiliation: Fudan University, Tongji University, Zhejiang University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Zhang_Accurate_Few-Shot_Object_Detection_With_Support-Query_Mutual_Guidance_and_Hybrid_CVPR_2021_paper.html
- Code: None
27. Adaptive Image Transformer for One-Shot Object Detection
- Affiliation: Academia Sinica, Taiwan AI Labs
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Adaptive_Image_Transformer_for_One-Shot_Object_Detection_CVPR_2021_paper.html
- Code: None
28. Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection
- Affiliation: Peking University, Beijing University of Posts and Telecommunications
- Paper: https://arxiv.org/abs/2103.17115
- Code: https://github.com/hzhupku/DCNet
29. Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
- Affiliation: Carnegie Mellon University (CMU)
- Paper: https://arxiv.org/abs/2103.01903
- Code: None
30. FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
- Affiliation: University of Southern California, Megvii
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Sun_FSCE_Few-Shot_Object_Detection_via_Contrastive_Proposal_Encoding_CVPR_2021_paper.html
- Code: https://github.com/MegviiDetection/FSCE
31. Hallucination Improves Few-Shot Object Detection
- Affiliation: University of Illinois Urbana-Champaign
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Zhang_Hallucination_Improves_Few-Shot_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/pppplin/HallucFsDet
32. Few-Shot Object Detection via Classification Refinement and Distractor Retreatment
- Affiliation: National University of Singapore, SIMTech
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Li_Few-Shot_Object_Detection_via_Classification_Refinement_and_Distractor_Retreatment_CVPR_2021_paper.html
- Code: None
33. Generalized Few-Shot Object Detection Without Forgetting
- Affiliation: Megvii
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Fan_Generalized_Few-Shot_Object_Detection_Without_Forgetting_CVPR_2021_paper.html
- Code: None
34. Transformation Invariant Few-Shot Object Detection
- Affiliation: Huawei Noah's Ark Lab
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Li_Transformation_Invariant_Few-Shot_Object_Detection_CVPR_2021_paper.html
- Code: None
35. UniT: Unified Knowledge Transfer for Any-Shot Object Detection and Segmentation
- Affiliation: University of British Columbia, Vector AI, CIFAR AI Chair
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Khandelwal_UniT_Unified_Knowledge_Transfer_for_Any-Shot_Object_Detection_and_Segmentation_CVPR_2021_paper.html
- Code: https://github.com/ubc-vision/UniT
36. Beyond Max-Margin: Class Margin Equilibrium for Few-Shot Object Detection
- Affiliation: University of Chinese Academy of Sciences (UCAS), Xiamen University, Peng Cheng Laboratory
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Li_Beyond_Max-Margin_Class_Margin_Equilibrium_for_Few-Shot_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/Bohao-Lee/CME
Semi-Supervised Object Detection
37. Points As Queries: Weakly Semi-Supervised Object Detection by Points
- Affiliation: Megvii, Fudan University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Points_As_Queries_Weakly_Semi-Supervised_Object_Detection_by_Points_CVPR_2021_paper.html
- Code: None
38. Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection
- Affiliation: Tsinghua University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Data-Uncertainty_Guided_Multi-Phase_Learning_for_Semi-Supervised_Object_Detection_CVPR_2021_paper.html
- Code: None
39. Positive-Unlabeled Data Purification in the Wild for Object Detection
- Affiliation: Huawei Noah's Ark Lab, The University of Sydney, Peking University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Guo_Positive-Unlabeled_Data_Purification_in_the_Wild_for_Object_Detection_CVPR_2021_paper.html
- Code: None
40. Interactive Self-Training With Mean Teachers for Semi-Supervised Object Detection
- Affiliation: Alibaba, The Hong Kong Polytechnic University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Yang_Interactive_Self-Training_With_Mean_Teachers_for_Semi-Supervised_Object_Detection_CVPR_2021_paper.html
- Code: None
41. Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
- Affiliation: Alibaba
- Paper: https://arxiv.org/abs/2103.11402
- Code: None
42. Humble Teachers Teach Better Students for Semi-Supervised Object Detection
- Affiliation: Carnegie Mellon University (CMU), Amazon
- Homepage: https://yihet.com/humble-teacher
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Tang_Humble_Teachers_Teach_Better_Students_for_Semi-Supervised_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/lryta/HumbleTeacher
43. Interpolation-Based Semi-Supervised Learning for Object Detection
- Affiliations: Seoul National University, Aalto University, et al.
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Jeong_Interpolation-Based_Semi-Supervised_Learning_for_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/soo89/ISD-SSD
Domain-Adaptive Object Detection
44. Domain-Specific Suppression for Adaptive Object Detection
- Affiliations: Chinese Academy of Sciences, Cambricon, University of Chinese Academy of Sciences
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Domain-Specific_Suppression_for_Adaptive_Object_Detection_CVPR_2021_paper.html
- Code: None
45. MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
- Affiliations: Johns Hopkins University, Mercedes-Benz
- Paper: https://arxiv.org/abs/2103.04224
- Code: None
46. Unbiased Mean Teacher for Cross-Domain Object Detection
- Affiliations: University of Electronic Science and Technology of China, ETH Zurich
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Deng_Unbiased_Mean_Teacher_for_Cross-Domain_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/kinredon/umt
47. I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors
- Affiliations: The University of Hong Kong, Xiamen University, Deepwise AI Lab
- Paper: https://arxiv.org/abs/2103.13757
- Code: None
Self-Supervised Object Detection
48. There Is More Than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking With Sound by Distilling Multimodal Knowledge
- Affiliations: University of Freiburg
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Valverde_There_Is_More_Than_Meets_the_Eye_Self-Supervised_Multi-Object_Detection_CVPR_2021_paper.html
- Code: http://rl.uni-freiburg.de/research/multimodal-distill
49. Instance Localization for Self-supervised Detection Pretraining
- Affiliations: The Chinese University of Hong Kong, Microsoft Research Asia
- Paper: https://arxiv.org/abs/2102.08318
- Code: https://github.com/limbo0000/InstanceLoc
Weakly Supervised Object Detection
50. Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
- Affiliations: Beihang University, Peng Cheng Laboratory, SenseTime
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Hou_Informative_and_Consistent_Correspondence_Mining_for_Cross-Domain_Weakly_Supervised_Object_CVPR_2021_paper.html
- Code: None
51. DAP: Detection-Aware Pre-training with Weak Supervision
- Affiliations: UIUC, Microsoft
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Zhong_DAP_Detection-Aware_Pre-Training_With_Weak_Supervision_CVPR_2021_paper.html
- Code: None
Others
52. Open-Vocabulary Object Detection Using Captions
- Affiliations: Snap, Columbia University
- Paper (Oral): https://openaccess.thecvf.com/content/CVPR2021/html/Zareian_Open-Vocabulary_Object_Detection_Using_Captions_CVPR_2021_paper.html
- Code: https://github.com/alirezazareian/ovr-cnn
53. Depth From Camera Motion and Object Detection
- Affiliations: University of Michigan, SIAI
- Paper: https://arxiv.org/abs/2103.01468
- Code: https://github.com/griffbr/ODMD
- Dataset: https://github.com/griffbr/ODMD
54. Unsupervised Object Detection With LIDAR Clues
- Affiliations: SenseTime, University of Chinese Academy of Sciences, University of Science and Technology of China
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Tian_Unsupervised_Object_Detection_With_LIDAR_Clues_CVPR_2021_paper.html
- Code: None
55. GAIA: A Transfer Learning System of Object Detection That Fits Your Needs
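- Affiliations: University of Chinese Academy of Sciences, Beijing Institute of Technology, Chinese Academy of Sciences, SenseTime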
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Bu_GAIA_A_Transfer_Learning_System_of_Object_Detection_That_Fits_CVPR_2021_paper.html
- Code: https://github.com/GAIA-vision/GAIA-det
56. General Instance Distillation for Object Detection
- Affiliations: Megvii, Beihang University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Dai_General_Instance_Distillation_for_Object_Detection_CVPR_2021_paper.html
- Code: None
57. AQD: Towards Accurate Quantized Object Detection
- Affiliations: Monash University, University of Adelaide, South China University of Technology
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_AQD_Towards_Accurate_Quantized_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/aim-uofa/model-quantization
58. Scale-Aware Automatic Augmentation for Object Detection
- Affiliations: The Chinese University of Hong Kong, ByteDance AI Lab, SmartMore
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Scale-Aware_Automatic_Augmentation_for_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/Jia-Research-Lab/SA-AutoAug
59. Equalization Loss v2: A New Gradient Balance Approach for Long-Tailed Object Detection
- Affiliations: Tongji University, SenseTime, Tsinghua University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Tan_Equalization_Loss_v2_A_New_Gradient_Balance_Approach_for_Long-Tailed_CVPR_2021_paper.html
- Code: https://github.com/tztztztztz/eqlv2
60. Class-Aware Robust Adversarial Training for Object Detection
- Affiliations: Columbia University, Academia Sinica
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Class-Aware_Robust_Adversarial_Training_for_Object_Detection_CVPR_2021_paper.html
- Code: None
61. Improved Handling of Motion Blur in Online Object Detection
- Affiliations: University College London
- Homepage: http://visual.cs.ucl.ac.uk/pubs/handlingMotionBlur/
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Sayed_Improved_Handling_of_Motion_Blur_in_Online_Object_Detection_CVPR_2021_paper.html
- Code: None
62. Multiple Instance Active Learning for Object Detection
- Affiliations: University of Chinese Academy of Sciences, Huawei Noah's Ark Lab
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Yuan_Multiple_Instance_Active_Learning_for_Object_Detection_CVPR_2021_paper.html
- Code: https://github.com/yuantn/MI-AOD
63. Neural Auto-Exposure for High-Dynamic Range Object Detection
- Affiliations: Algolux, Princeton University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Onzon_Neural_Auto-Exposure_for_High-Dynamic_Range_Object_Detection_CVPR_2021_paper.html
- Code: None
64. Generalizable Pedestrian Detection: The Elephant in the Room
- Affiliations: IIAI, Aalto University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Hasan_Generalizable_Pedestrian_Detection_The_Elephant_in_the_Room_CVPR_2021_paper.html
- Code: https://github.com/hasanirtiza/Pedestron
Single/Multi-Object Tracking
Single-Object Tracking
LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search
- Paper: https://arxiv.org/abs/2104.14545
- Code: https://github.com/researchmm/LightTrack
Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark
- Homepage: https://sites.google.com/view/langtrackbenchmark/
- Paper: https://arxiv.org/abs/2103.16746
- Evaluation Toolkit: https://github.com/wangxiao5791509/TNL2K_evaluation_toolkit
- Demo Video: https://www.youtube.com/watch?v=7lvVDlkkff0&ab_channel=XiaoWang
IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking
- Paper: https://arxiv.org/abs/2103.14938
- Code: https://github.com/VISION-SJTU/IoUattack
Graph Attention Tracking
- Paper: https://arxiv.org/abs/2011.11204
- Code: https://github.com/ohhhyeahhh/SiamGAT
Rotation Equivariant Siamese Networks for Tracking
- Paper: https://arxiv.org/abs/2012.13078
- Code: None
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
- Paper (Oral): https://arxiv.org/abs/2103.11681
- Code: https://github.com/594422814/TransformerTrack
Transformer Tracking
- Paper: https://arxiv.org/abs/2103.15436
- Code: https://github.com/chenxin-dlut/TransT
Multi-Object Tracking
Tracking Pedestrian Heads in Dense Crowd
- Homepage: https://project.inria.fr/crowdscience/project/dense-crowd-head-tracking/
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Sundararaman_Tracking_Pedestrian_Heads_in_Dense_Crowd_CVPR_2021_paper.html
- Code1: https://github.com/Sentient07/HeadHunter
- Code2: https://github.com/Sentient07/HeadHunter--T
- Dataset: https://project.inria.fr/crowdscience/project/dense-crowd-head-tracking/
Multiple Object Tracking with Correlation Learning
- Paper: https://arxiv.org/abs/2104.03541
- Code: None
Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
- Paper: https://arxiv.org/abs/2012.02337
- Code: None
Learning a Proposal Classifier for Multiple Object Tracking
- Paper: https://arxiv.org/abs/2103.07889
- Code: https://github.com/daip13/LPC_MOT.git
Track to Detect and Segment: An Online Multi-Object Tracker
- Homepage: https://jialianwu.com/projects/TraDeS.html
- Paper: https://arxiv.org/abs/2103.08808
- Code: https://github.com/JialianW/TraDeS
Semantic Segmentation
1. HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation
- Affiliations: Facebook AI, Bar-Ilan University, Tel Aviv University
- Homepage: https://nirkin.com/hyperseg/
- Paper: https://openaccess.thecvf.com/content/CVPR2021/papers/Nirkin_HyperSeg_Patch-Wise_Hypernetwork_for_Real-Time_Semantic_Segmentation_CVPR_2021_paper.pdf
- Code: https://github.com/YuvalNirkin/hyperseg
2. Rethinking BiSeNet For Real-time Semantic Segmentation
- Affiliations: Meituan
- Paper: https://arxiv.org/abs/2104.13188
- Code: https://github.com/MichaelFan01/STDC-Seg
3. Progressive Semantic Segmentation
- Affiliations: VinAI Research, VinUniversity, University of Arkansas, Stony Brook University
- Paper: https://arxiv.org/abs/2104.03778
- Code: https://github.com/VinAIResearch/MagNet
4. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
- Affiliations: Fudan University, University of Oxford, University of Surrey, Tencent Youtu Lab, Facebook AI
- Homepage: https://fudan-zvg.github.io/SETR
- Paper: https://arxiv.org/abs/2012.15840
- Code: https://github.com/fudan-zvg/SETR
5. Capturing Omni-Range Context for Omnidirectional Segmentation
- Affiliations: Karlsruhe Institute of Technology, Carl Zeiss, Huawei
- Paper: https://arxiv.org/abs/2103.05687
- Code: None
6. Learning Statistical Texture for Semantic Segmentation
- Affiliations: Beihang University, SenseTime
- Paper: https://arxiv.org/abs/2103.04133
- Code: None
7. InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
- Affiliations: Qualcomm AI Research
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Borse_InverseForm_A_Loss_Function_for_Structured_Boundary-Aware_Segmentation_CVPR_2021_paper.html
- Code: None
8. DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation
- Affiliations: Joyy Inc, Kuaishou, Beihang University, et al.
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Zhang_DCNAS_Densely_Connected_Neural_Architecture_Search_for_Semantic_Image_Segmentation_CVPR_2021_paper.html
- Code: None
Weakly Supervised Semantic Segmentation
9. Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic Segmentation
- Affiliations: Yonsei University, Sungkyunkwan University
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Lee_Railroad_Is_Not_a_Train_Saliency_As_Pseudo-Pixel_Supervision_for_CVPR_2021_paper.html
- Code: https://github.com/halbielee/EPS
10. Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation
- Affiliations: Yonsei University
- Homepage: https://cvlab.yonsei.ac.kr/projects/BANA/
- Paper: https://arxiv.org/abs/2104.00905
- Code: None
11. Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation
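- Affiliations: Nanjing University of Science and Technology, MBZUAI, University of Electronic Science and Technology of China, University of Adelaide, University of Technology Sydney
- Paper: https://arxiv.org/abs/2103.14581
- Code: https://github.com/NUST-Machine-Intelligence-Laboratory/nsrom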
12. Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
- Affiliations: Beijing Institute of Technology, Meituan
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Wu_Embedded_Discriminative_Attention_Mechanism_for_Weakly_Supervised_Semantic_Segmentation_CVPR_2021_paper.html
- Code: https://github.com/allenwu97/EDAM
13. BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
- Affiliations: Seoul National University
- Paper: https://arxiv.org/abs/2103.08907
- Code: https://github.com/jbeomlee93/BBAM
Semi-Supervised Semantic Segmentation
14. Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
- Affiliations: Peking University, Microsoft Research Asia
- Paper: https://arxiv.org/abs/2106.01226
- Code: https://github.com/charlesCXK/TorchSemiSeg
15. Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation
- Affiliations: Huawei, Dalian University of Technology, Peking University
- Paper: https://arxiv.org/abs/2103.04705
- Code: None
16. Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency
- Affiliations: The Chinese University of Hong Kong, SmartMore, University of Oxford
- Paper: https://openaccess.thecvf.com/content/CVPR2021/html/Lai_Semi-Supervised_Semantic_Segmentation_With_Directional_Context-Aware_Consistency_CVPR_2021_paper.html
- Code: None
17. Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
- Affiliations: NVIDIA, University of Toronto, Yale University, MIT, Vector Institute
- Paper: http