ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Author: Valar_Morghulis | Posted: 2023-03-01 10:45

    ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

    Feb 2023

    Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, Matthias Müller

    [KAUST, Intel]

    https://arxiv.org/abs/2302.12288

    https://github.com/isl-org/ZoeDepth

    This paper tackles the problem of depth estimation from a single image. Existing work either focuses on generalization performance disregarding metric scale, i.e. relative depth estimation, or state-of-the-art results on specific datasets, i.e. metric depth estimation. We propose the first approach that combines both worlds, leading to a model with excellent generalization performance while maintaining metric scale. Our flagship model, ZoeD-M12-NK, is pre-trained on 12 datasets using relative depth and fine-tuned on two datasets using metric depth. We use a lightweight head with a novel bin adjustment design called metric bins module for each domain. During inference, each input image is automatically routed to the appropriate head using a latent classifier. Our framework admits multiple configurations depending on the datasets used for relative depth pre-training and metric fine-tuning. Without pre-training, we can already significantly improve the state of the art (SOTA) on the NYU Depth v2 indoor dataset. Pre-training on twelve datasets and fine-tuning on the NYU Depth v2 indoor dataset, we can further improve SOTA for a total of 21% in terms of relative absolute error (REL). Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains.
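To make the distinction between relative and metric depth concrete, here is a minimal pure-Python sketch. It is not code from the ZoeDepth repository. The function names `abs_rel` and `align_scale_shift` and the toy depth values are hypothetical, and the alignment is done directly in depth space for simplicity (relative-depth models are often aligned in inverse-depth space instead). It computes the REL metric mentioned in the abstract, taken here as the usual mean of |pred - gt| / gt, and shows that a relative prediction only becomes comparable to ground truth after a per-image scale and shift are fitted.

```python
def abs_rel(pred, gt):
    """REL: mean of |pred - gt| / gt over pixels with valid ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]  # gt <= 0 marks missing depth
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def align_scale_shift(rel, gt):
    """Closed-form least-squares fit of s, t so that s * rel + t approximates gt."""
    n = len(rel)
    mean_r = sum(rel) / n
    mean_g = sum(gt) / n
    var_r = sum((r - mean_r) ** 2 for r in rel)
    cov = sum((r - mean_r) * (g - mean_g) for r, g in zip(rel, gt))
    s = cov / var_r
    t = mean_g - s * mean_r
    return s, t


# Toy data (assumed values, in metres).
gt = [1.0, 2.0, 4.0, 8.0]

# A metric prediction can be scored directly against ground truth.
metric_pred = [1.1, 1.8, 4.4, 8.0]
rel_metric = abs_rel(metric_pred, gt)  # about 0.075

# A relative prediction has the right ordering but arbitrary scale and shift
# (constructed here so that gt = 10 * relative_pred + 1).
relative_pred = [0.0, 0.1, 0.3, 0.7]
rel_raw = abs_rel(relative_pred, gt)  # large: the units do not match

# Fitting scale and shift uses ground truth, which is unavailable at test time.
s, t = align_scale_shift(relative_pred, gt)
aligned = [s * r + t for r in relative_pred]
rel_aligned = abs_rel(aligned, gt)  # close to zero for this toy example
```

The alignment step needs ground-truth depth for each image, so a purely relative model cannot report metric distances on its own. That gap is what the metric fine-tuning stage described in the abstract is meant to close.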
