Efficiently Modeling Long Sequences with Structured State Spaces
ICLR 2022 (Outstanding Paper HM)
A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies. Although conventional models including RNNs, CNNs, and Transformers have specialized variants for capturing long dependencies, they still struggle to scale to very long sequences of 10000 or more steps. A promising recent approach proposed modeling sequences by simulating the fundamental state space model (SSM) x′(t) = Ax(t) + Bu(t), y(t) = Cx(t) + Du(t), and showed that for appropriate choices of the state matrix A, this system could handle long-range dependencies mathematically and empirically. However, this method has prohibitive computation and memory requirements, rendering it infeasible as a general sequence modeling solution. We propose the Structured State Space sequence model (S4) based on a new parameterization for the SSM, and show that it can be computed much more efficiently than prior approaches while preserving their theoretical strengths. Our technique involves conditioning A with a low-rank correction, allowing it to be diagonalized stably and reducing the SSM to the well-studied computation of a Cauchy kernel. S4 achieves strong empirical results across a diverse range of established benchmarks, including (i) 91% accuracy on sequential CIFAR-10 with no data augmentation or auxiliary losses, on par with a larger 2-D ResNet; (ii) substantially closing the gap to Transformers on image and language modeling tasks, while performing generation 60× faster; and (iii) SoTA on every task from the Long Range Arena benchmark, including solving the challenging Path-X task of length 16k that all prior work fails on, while being as efficient as all competitors.
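Because the SSM above is a linear time-invariant system, after discretization it can be computed either step by step as a recurrence or, equivalently, as a convolution with a kernel built from powers of the discretized state matrix; this convolutional view is the starting point for S4's efficiency. The snippet below is a minimal NumPy sketch of that equivalence, assuming a bilinear discretization and random toy matrices A, B, C, D (hypothetical placeholders, not the HiPPO initialization or the diagonal-plus-low-rank parameterization the paper actually uses).

```python
import numpy as np

# Minimal sketch (not the authors' S4 implementation): simulate the SSM
#   x'(t) = A x(t) + B u(t),   y(t) = C x(t) + D u(t)
# in two equivalent ways after discretization: as a step-by-step recurrence,
# and as a convolution with the kernel K = (CB, CAB, CA^2B, ...).
# The bilinear discretization and the random toy matrices below are
# illustrative assumptions, not the HiPPO / low-rank setup S4 actually uses.

def discretize(A, B, step):
    """Bilinear (Tustin) discretization: returns (A_bar, B_bar)."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - (step / 2.0) * A)
    return inv @ (I + (step / 2.0) * A), (step * inv) @ B

def run_recurrence(A_bar, B_bar, C, D, u):
    """Unroll the discretized SSM as an RNN over a scalar input sequence u."""
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar[:, 0] * u_k        # state update
        ys.append(C[0] @ x + D[0, 0] * u_k)      # output read-out
    return np.array(ys)

def conv_kernel(A_bar, B_bar, C, L):
    """Materialize K = (C B_bar, C A_bar B_bar, ..., C A_bar^{L-1} B_bar)."""
    v, K = B_bar[:, 0], []
    for _ in range(L):
        K.append(C[0] @ v)
        v = A_bar @ v
    return np.array(K)

if __name__ == "__main__":
    N, L = 4, 16                                          # toy state size / length
    rng = np.random.default_rng(0)
    A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))    # roughly stable toy A
    B = rng.standard_normal((N, 1))
    C = rng.standard_normal((1, N))
    D = rng.standard_normal((1, 1))
    u = rng.standard_normal(L)

    A_bar, B_bar = discretize(A, B, step=1.0)
    y_rnn = run_recurrence(A_bar, B_bar, C, D, u)
    y_cnn = np.convolve(u, conv_kernel(A_bar, B_bar, C, L))[:L] + D[0, 0] * u
    print(np.allclose(y_rnn, y_cnn))                      # True: same sequence map
```

Materializing the kernel by repeated matrix powers, as above, is exactly the step that is too expensive at long lengths; S4's contribution is computing this kernel efficiently by reducing it to a Cauchy kernel under the structured parameterization of A.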
Reader comments