
Reversible Vision Transformers

Author: Valar_Morghulis | Published 2023-02-16 07:42

Reversible Vision Transformers

Feb 2023

Oral at CVPR 2022, updated version

Karttikeya Mangalam, Haoqi Fan, Yanghao Li, Chao-Yuan Wu, Bo Xiong, Christoph Feichtenhofer, Jitendra Malik

[Meta AI, UC Berkeley]

Full open-source release: https://github.com/facebookresearch/slowfast

Simple, easy-to-read version: https://github.com/karttikeya/minREV

We present Reversible Vision Transformers, a memory efficient architecture design for visual recognition. By decoupling the GPU memory requirement from the depth of the model, Reversible Vision Transformers enable scaling up architectures with efficient memory usage. We adapt two popular models, namely Vision Transformer and Multiscale Vision Transformers, to reversible variants and benchmark extensively across both model sizes and tasks of image classification, object detection and video classification. Reversible Vision Transformers achieve a reduced memory footprint of up to 15.5x at roughly identical model complexity, parameters and accuracy, demonstrating the promise of reversible vision transformers as an efficient backbone for hardware resource limited training regimes. Finally, we find that the additional computational burden of recomputing activations is more than overcome for deeper models, where throughput can increase up to 2.3x over their non-reversible counterparts.
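
The key idea behind the memory/depth decoupling is that each block's inputs can be reconstructed exactly from its outputs, so intermediate activations need not be cached during training and are instead recomputed in the backward pass. Below is a minimal PyTorch sketch of such a two-stream reversible block of the form Y1 = X1 + F(X2), Y2 = X2 + G(Y1) (in a reversible ViT, F and G would correspond to the attention and MLP sub-blocks). The simple LayerNorm + Linear stand-ins for F and G and the dimensions here are illustrative placeholders, not the paper's modules; see the minREV repository above for the actual implementation.

```python
import torch
import torch.nn as nn


class ReversibleBlock(nn.Module):
    """Two-stream reversible block: Y1 = X1 + F(X2), Y2 = X2 + G(Y1).

    Because the inputs can be recovered exactly from the outputs via
    `inverse`, activations do not have to be stored for the backward pass.
    """

    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f = f  # e.g. attention sub-block in a reversible ViT
        self.g = g  # e.g. MLP sub-block in a reversible ViT

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Recompute the inputs from the outputs (what a reversible
        # backward pass does instead of reading cached activations).
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2


if __name__ == "__main__":
    dim = 64
    # Hypothetical stand-ins for the attention and MLP sub-blocks.
    f = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim))
    g = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim))
    block = ReversibleBlock(f, g)

    x1 = torch.randn(2, 16, dim)
    x2 = torch.randn(2, 16, dim)
    with torch.no_grad():
        y1, y2 = block(x1, x2)
        r1, r2 = block.inverse(y1, y2)
    # Both should print True: the block is exactly invertible.
    print(torch.allclose(r1, x1, atol=1e-5), torch.allclose(r2, x2, atol=1e-5))
```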
