ICCV 2021: The Latest 200 ICCV 2021 Papers, Organized by Topic
Posted by 等待破茧
ICCV 2021 results are out! The latest 200 ICCV 2021 papers, organized by topic (continuously updated) - Zhihu
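Acceptance results were recently announced for ICCV 2021, one of the three top computer vision conferences. This year ICCV received 6,236 valid submissions, of which 1,617 were accepted, for an acceptance rate of 25.9%.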
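As a quick sanity check on the figures above, the acceptance rate can be recomputed from the two counts. This is only a minimal sketch; the variable names are illustrative and not part of the original post.

```python
# Figures quoted in the text above.
valid_submissions = 6236
accepted_papers = 1617

# Acceptance rate = accepted papers / valid submissions.
acceptance_rate = accepted_papers / valid_submissions

print(f"Acceptance rate: {acceptance_rate:.1%}")  # prints "Acceptance rate: 25.9%"
```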
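极市平台 has sorted the papers accepted to ICCV 2021 by topic, including detection, segmentation, estimation, tracking, visual localization, low-level image processing, image and video retrieval, 3D vision, and more. All of our ICCV 2021 paper roundups are collected in our GitHub project, which has earned 1,300 stars so far.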
This GitHub project will be updated continuously. Project address:
github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation/edit/master/ICCV2021.md
Papers collected so far (updated August 19):
Detection
2D Object Detection
[12] G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation
paper
[11] Vector-Decomposed Disentanglement for Domain-Invariant Object Detection
paper
[10] Oriented R-CNN for Object Detection
paper | code
[9] Conditional DETR for Fast Training Convergence
paper | code
[8] Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters
paper | code
[7] GraphFPN: Graph Feature Pyramid Network for Object Detection
paper
Explainer: Fudan & HKU propose GraphFPN: boosting object detection performance with a graph feature pyramid!
[6] SimROD: A Simple Adaptation Method for Robust Object Detection
paper
[5] Active Learning for Deep Object Detection via Probabilistic Modeling
paper
[4] Detecting Invisible People
paper | project | video
[3] Conditional Variational Capsule Network for Open Set Recognition
paper | code
[2] MDETR: Modulated Detection for End-to-End Multi-Modal Understanding(Oral)
paper | code | project | colab
Explainer: No detector needed for feature extraction! LeCun's team proposes MDETR: truly end-to-end multi-modal reasoning
[1] DetCo: Unsupervised Contrastive Learning for Object Detection
paper | code
Explainer: Outperforming MoCo v2 from Kaiming He's team, DetCo: contrastive learning tailored to object detection
3D Object Detection
[6] LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
paper
[5] RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
paper
[4] Is Pseudo-Lidar needed for Monocular 3D Object detection?
paper
[3] Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather
paper | code
[2] Geometry Uncertainty Projection Network for Monocular 3D Object Detection
paper
[1] Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency
paper
Salient Object Detection
[2] Specificity-preserving RGB-D Saliency Detection
paper | code
[1] Disentangled High Quality Salient Object Detection
paper
Camouflaged Object Detection
[1] TransForensics: Image Forgery Localization with Dense Self-Attention
paper
Image Anomaly Detection/Surface Defect Detection
[2] DRÆM -- A discriminatively trained reconstruction embedding for surface anomaly detection
paper
[1] Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection
paper
Edge Detection
[2] Pixel Difference Networks for Efficient Edge Detection
paper | code
[1] RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth
paper
Segmentation
Image Segmentation
[2] Labels4Free: Unsupervised Segmentation using StyleGAN
paper | code | project
[1] Mining Latent Classes for Few-shot Segmentation(Oral)
paper | code
Instance Segmentation
[5] Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks
paper | code
[4] SOTR: Segmenting Objects with Transformers
paper | code
[3] Hierarchical Aggregation for 3D Instance Segmentation
paper | code
[2] Crossover Learning for Fast Online Video Instance Segmentation
code
[1] Instances as Queries
paper | code
Semantic Segmentation
[18] Multi-Anchor Active Domain Adaptation for Semantic Segmentation(Oral)
paper
[17] Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation
paper
[16] Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation
paper
[15] LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation(Oral)
paper
[14] Dual Path Learning for Domain Adaptation of Semantic Segmentation
paper | code
[13] Deep Metric Learning for Open World Semantic Segmentation
paper
[12] Complementary Patch for Weakly Supervised Semantic Segmentation
paper
[11] RECALL: Replay-based Continual Learning in Semantic Segmentation
paper
[10] Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer
paper | code
[9] Learning Meta-class Memory for Few-Shot Semantic Segmentation
paper
[8] Personalized Image Semantic Segmentation
paper
[7] VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation
paper | code
[6] Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation
paper
[5] ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation(point cloud semantic segmentation)
paper
[4] Domain Adaptive Video Segmentation via Temporal Consistency Regularization(video semantic segmentation)
paper | code
[3] Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation(Oral)
paper
[2] Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation(Oral)
paper | code
[1] Calibrated Adversarial Refinement for Stochastic Semantic Segmentation
paper | code
Video Object Segmentation
[2] Joint Inductive and Transductive Learning for Video Object Segmentation
paper | code
[1] Full-Duplex Strategy for Video Object Segmentation
paper | project
Referring Image Segmentation
[1] Vision-Language Transformer and Query Generation for Referring Segmentation
paper | code
Dense Prediction
[1] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
paper | code
Face
[1] Learning Facial Representations from the Cycle-consistency of Face
paper
Facial Recognition/Detection
[2] SynFace: Face Recognition with Synthetic Data
paper
[1] PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition
paper
Face Generation/Face Synthesis/Face Reconstruction/Face Editing
[5] FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning
paper
[4] Disentangled Lifespan Face Synthesis
paper | code
[3] MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement(audio-driven facial animation)
paper | video
[2] Focal Frequency Loss for Image Reconstruction and Synthesis
paper | code
[1] HeadGAN: One-shot Neural Head Synthesis and Editing
paper
Face Forgery/Face Anti-Spoofing
[1] Exploring Temporal Coherence for More General Video Face Forgery Detection
paper
3D Vision
[3] Differentiable Surface Rendering via Non-Differentiable Sampling
paper
[2] M3D-VTON: A Monocular-to-3D Virtual Try-On Network(3D try-on)
paper
[1] Score-Based Point Cloud Denoising
paper
Point Cloud
[10] ME-PCN: Point Completion Conditioned on Mask Emptiness(point cloud completion)
paper
[9] Adaptive Graph Convolution for Point Cloud Analysis
paper | code
[8] PICCOLO: Point Cloud-Centric Omnidirectional Localization
paper
[7] AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds
paper | code
[6] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer
paper | code
[5] DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation
paper
[4] Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching
paper | code
[3] (Just) A Spoonful of Refinements Helps the Registration Error Go Down(Oral)
paper
[2] Learning with Noisy Labels for Robust Point Cloud Segmentation(point cloud segmentation)
paper | code
[1] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration
paper | project
3D Reconstruction
[6] Deep Hybrid Self-Prior for Full 3D Mesh Generation
paper | project
[5] PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-rigid Structure-from-Motion
paper
[4] Learning Canonical 3D Object Representation for Fine-Grained Recognition
paper
[3] ELLIPSDF: Joint Object Pose and Shape Optimization with a Bi-level Ellipsoid and Signed Distance Function Description
paper
[2] Discovering 3D Parts from Image Collections
paper | project
[1] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery
paper | code
Neural Network Structure Design & Optimization
[2] Unifying Nonlocal Blocks for Neural Networks
paper
[1] Energy-Based Open-World Uncertainty Modeling for Confidence Calibration(confidence calibration)
paper
CNN
[3] MicroNet: Improving Image Recognition with Extremely Low FLOPs
paper | code1 | code2
[2] Learning to Resize Images for Computer Vision Tasks
paper
[1] Bias Loss for Mobile Neural Networks
paper
Explainer: Beyond MobileNet V3 | SkipNet + Bias Loss explained: a new milestone for lightweight models
Attention
[4] Residual Attention: A Simple but Effective Method for Multi-Label Recognition
paper
[3] Fast Convergence of DETR with Spatially Modulated Co-Attention
paper | code
[2] SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition
paper | code
[1] FcaNet: Frequency Channel Attention Networks
paper | code
Transformer
[10] An Empirical Study of Training Self-Supervised Vision Transformers(Oral)
paper
Explainer: Tackling training instability, new work from Kaiming He's team! Self-supervised learning + Transformer = MoCoV3
[9] LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference
paper | code
Explainer: Facebook proposes LeViT: ResNet50-level accuracy at 0.077 ms per image
[8] Emerging Properties in Self-Supervised Vision Transformers
paper | code
Explainer: When Transformers meet self-supervised learning! Facebook open-sources DINO
[7] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
paper | code
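Explainer: ResNet has been surpassed across the board, and a Transformer did it: 依图科技 (Yitu Technology) open-sources the scalable T2T-ViT, whose lightweight version outperforms MobileNet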
[6] Vision Transformer with Progressive Sampling
paper | code
[5] Rethinking and Improving Relative Position Encoding for Vision Transformer
paper | code
Explainer: Relative position encoding in Vision Transformer
[4] AutoFormer: Searching Transformers for Visual Recognition
paper | code
[3] Rethinking Spatial Dimensions of Vision Transformers
paper | code
[2] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers(Oral)
paper | code
[1] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions(Oral)
paper | code
Explainer: Pyramid Vision Transformer (PVT): a versatile backbone for dense prediction
Neural Architecture Search (NAS)
[3] BN-NAS: Neural Architecture Search with Batch Normalization
paper
[2] NASOA: Towards Faster Task-oriented Online Fine-tuning with a Zoo of Models
paper
[1] AutoFormer: Searching Transformers for Visual Recognition
paper | code
Loss Function
[3] Rank & Sort Loss for Object Detection and Instance Segmentation(Oral)
paper | code
Explainer: No hyperparameter tuning, noticeable gains! RS Loss, a new loss function for detection and segmentation, is open-sourced
[2] Focal Frequency Loss for Image Reconstruction and Synthesis
paper | code
[1] Orthogonal Projection Loss
paper | code
Visualization/Interpretability
[1] Finding Representative Interpretations on Convolutional Neural Networks
paper
Model Training/Generalization
[3] MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach(multi-task learning)
paper
[2] Impact of Aliasing on Generalization in Deep Convolutional Networks
paper
[1] Learning Compatible Embeddings
paper | code
Noisy Label
[1] Learning with Noisy Labels via Sparse Regularization
paper | code
Long-Tailed Distribution
[1] ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot(Oral)
paper | code
Out of Distribution Detection
[2] Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning
paper
[1] CODEs: Chamfer Out-of-Distribution Examples against Overconfidence Issue
paper
Model Compression
Knowledge Distillation
[4] G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation
paper
[3] Online Multi-Granularity Distillation for GAN Compression
paper | code
[2] Distilling Holistic Knowledge with Graph Neural Networks
paper | code
[1] AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning
paper | code
Pruning
Quantization
[2] Distance-aware Quantization
paper
[1] Generalizable Mixed-Precision Quantization via Attribution Rank Preservation
paper | code
Image Generation/Image Synthesis
[7] Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates(gesture generation)
paper | code
[6] Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation
paper | code
[5] ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models(Oral)
paper
[4] Toward Spatially Unbiased Generative Models
paper
[3] A Light Stage on Every Desk
paper | project
[2] Handwriting Transformers
paper
[1] On Generating Transferable Targeted Perturbations
paper | code
View Synthesis
[1] PixelSynth: Generating a 3D-Consistent Experience from a Single Image
paper | project
GAN/Generative/Adversarial
[13] Unsupervised Geodesic-preserved Generative Adversarial Networks for Unconstrained 3D Pose Transfer
paper | code
[12] Online Multi-Granularity Distillation for GAN Compression
paper | code
[11] AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning
paper | code
[10] Meta Gradient Adversarial Attack
paper
[9] Sketch Your Own GAN
paper | code | project
Explainer: Create a GAN model from a single sketch, easy even for beginners: new research from Jun-Yan Zhu's team accepted to ICCV 2021
[8] Feature Importance-aware Transferable Adversarial Attacks
paper | code
[7] From Continuity to Editability: Inverting GANs with Consecutive Images
paper | code
[6] Learnable Boundary Guided Adversarial Training
paper | code
[5] Transporting Causal Mechanisms for Unsupervised Domain Adaptation(Oral)
paper
[4] Robustness via Cross-Domain Ensembles(Oral)
paper | code | model | homepage | video
[3] HeadGAN: One-shot Neural Head Synthesis and Editing
paper
[2] Labels4Free: Unsupervised Segmentation using StyleGAN
paper | code | project
[1] EigenGAN: Layer-Wise Eigen-Learning for GANs
paper | code
Image Processing
[3] Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling
paper | code
[2] Accelerating Atmospheric Turbulence Simulation via Learned Phase-to-Space Transform
paper
[1] Equivariant Imaging: Learning Beyond the Range Space(Oral)
paper
Super Resolution
[2] Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution
paper | code
[1] Learning for Scale-Arbitrary Super-Resolution from Scale-Specific Networks
paper | code
Image Denoising/Deblurring/Deraining/Dehazing
[1] Rethinking Coarse-to-Fine Approach in Single Image Deblurring
paper | code
Image Editing/Image Inpainting
[1] Occlusion-Aware Video Object Inpainting(video inpainting)
paper
Style Transfer
[5] SSH: A Self-Supervised Framework for Image Harmonization(image harmonization)
paper | code
[4] Domain-Aware Universal Style Transfer
paper
[3] AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer
paper | code1 | code2
[2] ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity(style transfer)
paper
[1] Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts(font generation)
paper | code
Image Quality Assessment
[1] MUSIQ: Multi-scale Image Quality Transformer
paper
Estimation
Human Pose Estimation
[8] Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation
paper | code
[7] EventHPE: Event-based 3D Human Pose and Shape Estimation
paper
[6] HandFoldingNet: A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton
paper | code
[5] Online Knowledge Distillation for Efficient Pose Estimation
paper
[4] Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows
paper
[3] Human Pose Regression with Residual Log-likelihood Estimation(Oral)
paper | code
[2] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop(Oral)
paper | code | project
[1] HuMoR: 3D Human Motion Model for Robust Pose Estimation(Oral)
paper | video | project
Depth Estimation
[4] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation
paper
[3] Towards Interpretable Deep Networks for Monocular Depth Estimation
paper | code
[2] Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark
paper
[1] MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments
paper
Image & Video Retrieval/Video Understanding
[5] ASMR: Learning Attribute-Based Person Search with Adaptive Semantic Margin Regularizer
paper
[4] Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
paper | code
[3] DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features
paper
[2] Hand Image Understanding via Deep Multi-Task Learning(hand image understanding)
paper
[1] Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation
paper
Action/Activity Recognition/Detection/Segmentation
[7] Group-aware Contrastive Regression for Action Quality Assessment(action quality assessment)
paper
[6] Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization(action localization)
paper | code
[5] Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization(action localization)
paper | code
[4] Elaborative Rehearsal for Zero-shot Action Recognition
paper | code
[3] Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning
paper
[2] Enriching Local and Global Contexts for Temporal Action Localization
paper
[1] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition
paper | code
Person Re-Identification/Detection
[6] Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences
paper
[5] Towards Discriminative Representation Learning for Unsupervised Person Re-identification
paper
[4] Learning Instance-level Spatial-Temporal Patterns for Person Re-identification
paper | Cleaned database
[3] An Intermediate Domain Module for Domain Adaptive Person Re-ID(Oral)
paper | code
[2] Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
paper
[1] TransReID: Transformer-based Object Re-Identification
paper | code
Explainer: Transformers raise the bar: leading across all ReID tasks, Alibaba & Zhejiang University propose TransReID
Image/Video Captioning
[1] End-to-End Dense Video Captioning with Parallel Decoding
paper | code
Visual Localization
[4] PICCOLO: Point Cloud-Centric Omnidirectional Localization
paper
[3] Normalization Matters in Weakly Supervised Object Localization
paper
[2] TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization
paper | code
[1] Boundary-sensitive Pre-training for Temporal Localization in Videos
paper
Image Matching
[5] Pixel-Perfect Structure-from-Motion with Featuremetric Refinement
paper | code
[4] Progressive Correspondence Pruning by Consensus Learning
paper | code | project
Explainer: CLNet: progressive correspondence pruning based on consensus learning
[3] Multi-scale Matching Networks for Semantic Correspondence
paper
[2] Warp Consistency for Unsupervised Learning of Dense Correspondences(Oral)
paper | code
[1] COTR: Correspondence Transformer for Matching Across Images
paper
3D Vision
[1] MVTN: Multi-View Transformation Network for 3D Shape Recognition
paper
Object Tracking
[8] Learning Spatio-Temporal Transformer for Visual Tracking
paper | code
Explainer: Topping the tracking leaderboards! Dalian University of Technology and MSRA propose STARK: a Transformer-based object tracker
[7] Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds
paper
[6] Video Annotation for Visual Tracking via Selection and Refinement
paper
[5] Saliency-Associated Object Tracking
paper
[4] Learn to Match: Automatic Matching Network Design for Visual Tracking
paper | code
[3] HiFT: Hierarchical Feature Transformer for Aerial Tracking
paper | code
[2] Learning to Adversarially Blur Visual Object Tracking
paper | code
[1] Detecting Invisible People
paper | project | video
Medical Imaging
[2] Recurrent Mask Refinement for Few-Shot Medical Image Segmentation