Skip to content

Latest commit

 

History

History
23 lines (12 loc) · 5.26 KB

File metadata and controls

23 lines (12 loc) · 5.26 KB

2018-10-17

这篇文章介绍两篇 ECCV 2018关于语义分割(Semantic Segmentation)最新的 paper,一篇提出双边分割网络(Bilateral Segmentation Network,BiSeNet)在不牺牲空间分辨率(spatial resolution)的情况下来实现实时inference速度;另一篇提出UDA框架和CBST框架,并引入空间先验(spatial prior)来细化生成的标签。

Semantic Segmentation

《BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation》

Abstract:Semantic segmentation requires both rich spatial information and sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture makes a right balance between the speed and segmentation performance on Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048x1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset with speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than the existing methods with comparable performance.

摘要:语义分割(semantic segmentation)需要丰富的空间信息和相当大的感受野(receptive field)。但是,现代方法通常会牺牲空间分辨率(spatial resolution)来实现实时inference速度,从而导致性能不佳。在本文中,我们通过一种新颖的双边分割网络(Bilateral Segmentation Network,BiSeNet)来解决这一难题。我们首先设计一个小步幅的 Spatial Path,以保留空间信息并生成高分辨率特征。同时,采用具有快速下采样策略的 Context Path 来获得足够的感受野。在这两条 path 的顶部,我们引入了一个新的特征融合模块(Feature Fusion Module),以有效地结合特征。所提出的BiSeNet框架在Cityscapes,CamVid和COCO-Stuff数据集上的速度和分割性能之间取得了适当的平衡。具体来说,对于2048x1024输入,我们在Cityscapes测试数据集上实现了68.4%的Mean IOU,在一块NVIDIA Titan XP卡上的速度为105 FPS,这明显快于当前其它可比的方法。

arXiv:http://arxiv.org/abs/1808.00897

注:源码未放出

《Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training》

Abstract:Recent deep networks achieved state of the art performance on a variety of semantic segmentation tasks. Despite such progress, these models often face challenges in real world “wild tasks” where large difference between labeled training/source data and unseen test/target data exists. In particular, such difference is often referred to as “domain gap”, and could cause significantly decreased performance which cannot be easily remedied by further increasing the representation power. Unsupervised domain adaptation (UDA) seeks to overcome such problem without target domain labels. In this paper, we propose a novel UDA framework based on an iterative self-training (ST) procedure, where the problem is formulated as latent variable loss minimization, and can be solved by alternatively generating pseudo labels on target data and re-training the model with these labels. On top of ST, we also propose a novel classbalanced self-training (CBST) framework to avoid the gradual dominance of large classes on pseudo-label generation, and introduce spatial priors to refine generated labels. Comprehensive experiments show that the proposed methods achieve state of the art semantic segmentation performance under multiple major UDA settings.

摘要:最近的深度网络在各种语义分割任务上实现了最先进的性能。尽管取得了这些进展,但这些模型经常面临现实世界“wild tasks”中的挑战,其中存在标记的训练/源数据与看不见的测试/目标数据之间的巨大差异。特别地,这种差异通常被称为“domain gap”,并且可能导致显著的性能降低。这并不能通过进一步增加表示能力而容易地补救。无监督域适应(Unsupervised Domain Adaptation,UDA)试图在没有目标域标签的情况下克服这种问题。在本文中,我们提出了一种基于迭代自训练(Self-training,ST)过程的新型UDA框架,其中该问题被公式化为潜在变量损失最小化,并且可以通过在目标数据上交替生成伪标签(pseudo labels)并重新训练来解决。带有这些标签的模型。在ST之上,我们还提出了一种新颖的类平衡自我训练(Class Balanced Self-training,CBST)框架,avoid the gradual dominance of large classes on pseudo-label generation,并引入空间先验(spatial prior)来细化生成的标签。综合实验表明,所提出的方法在多个主要UDA设置下实现了最先进的语义分割性能。

paper:http://openaccess.thecvf.com/content_ECCV_2018/papers/Yang_Zou_Unsupervised_Domain_Adaptation_ECCV_2018_paper.pdf