The Math Behind Selective State Space Models

by The Serialization Publication, December 18th, 2024

Too Long; Didn't Read

This section examines the mechanics of Selective SSMs, detailing the discretization process, the role of learnable biases, and how the zero-order hold (ZOH) formulas shape efficient AI recurrences.

Authors:

(1) Albert Gu, Machine Learning Department, Carnegie Mellon University (equal contribution);

(2) Tri Dao, Department of Computer Science, Princeton University (equal contribution).

Table of Links

Abstract and 1 Introduction

2 State Space Models

3 Selective State Space Models and 3.1 Motivation: Selection as a Means of Compression

3.2 Improving SSMs with Selection

3.3 Efficient Implementation of Selective SSMs

3.4 A Simplified SSM Architecture

3.5 Properties of Selection Mechanisms

3.6 Additional Model Details

4 Empirical Evaluation and 4.1 Synthetic Tasks

4.2 Language Modeling

4.3 DNA Modeling

4.4 Audio Modeling and Generation

4.5 Speed and Memory Benchmarks

4.6 Model Ablations

6 Conclusion and References

A Discussion: Selection Mechanism

C Mechanics of Selective SSMs

D Hardware-aware Algorithm For Selective SSMs

E Experimental Details and Additional Results

C Mechanics of Selective SSMs

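Proof of Theorem 1. Consider a selective SSM (Algorithm 2) with N = 1, A = −1, B = 1, s_Δ = Linear(x), and τ_Δ = softplus. The corresponding continuous-time SSM (1) is

$$h'(t) = -h(t) + x(t),$$

which is also called a leaky integrator. The discretization step size is

$$\begin{aligned}
\Delta_t &= \tau_\Delta(\mathrm{Parameter} + s_\Delta(x_t)) \\
&= \mathrm{softplus}(\mathrm{Parameter} + \mathrm{Linear}(x_t)) \\
&= \mathrm{softplus}(\mathrm{Linear}(x_t)),
\end{aligned}$$

where we observe that the parameter can be viewed as a learnable bias and folded into the linear projection. Now applying the zero-order hold (ZOH) discretization formulas:

$$\begin{aligned}
\bar{A}_t &= \exp(\Delta_t A) = \frac{1}{1 + \exp(\mathrm{Linear}(x_t))} = \sigma(-\mathrm{Linear}(x_t)) = 1 - \sigma(\mathrm{Linear}(x_t)), \\
\bar{B}_t &= (\Delta_t A)^{-1}\left(\exp(\Delta_t A) - I\right) \cdot \Delta_t B = -\left(\exp(\Delta_t A) - I\right) = 1 - \bar{A}_t = \sigma(\mathrm{Linear}(x_t)).
\end{aligned}$$

Thus the final discrete recurrence (2a) is

$$\begin{aligned}
g_t &= \sigma(\mathrm{Linear}(x_t)), \\
h_t &= (1 - g_t)\, h_{t-1} + g_t\, x_t,
\end{aligned}$$

as desired.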
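The identity can also be checked numerically. The following is a minimal sketch, not code from the paper: it assumes the scalar case N = 1, A = −1, B = 1 with τ_Δ = softplus, and it takes the pre-activation Linear(x_t) as an already-computed number `z`. The function names (`softplus`, `sigmoid`, `zoh_scalar`, `gated_scan`, `ssm_scan`) are hypothetical and chosen only for illustration.

```python
import math


def softplus(z: float) -> float:
    """softplus(z) = log(1 + exp(z)), written to avoid overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def zoh_scalar(delta: float, a: float = -1.0, b: float = 1.0) -> tuple:
    """Zero-order hold for the scalar SSM h'(t) = a*h(t) + b*x(t).

    Assumes delta > 0 and a != 0 so the division is well defined.
    """
    a_bar = math.exp(delta * a)
    # (delta*a)^{-1} * (exp(delta*a) - 1) * (delta*b)
    b_bar = math.expm1(delta * a) / (delta * a) * (delta * b)
    return a_bar, b_bar


def gated_scan(zs, xs, h0=0.0):
    """Gated recurrence: h_t = (1 - g_t) * h_{t-1} + g_t * x_t, g_t = sigmoid(z_t)."""
    h, out = h0, []
    for z, x in zip(zs, xs):
        g = sigmoid(z)
        h = (1.0 - g) * h + g * x
        out.append(h)
    return out


def ssm_scan(zs, xs, h0=0.0):
    """Discretized SSM: h_t = A_bar_t * h_{t-1} + B_bar_t * x_t, delta_t = softplus(z_t)."""
    h, out = h0, []
    for z, x in zip(zs, xs):
        a_bar, b_bar = zoh_scalar(softplus(z))
        h = a_bar * h + b_bar * x
        out.append(h)
    return out
```

Under these assumptions, `zoh_scalar(softplus(z))` should agree with `(1 - sigmoid(z), sigmoid(z))` up to floating-point error, and the two scans should produce matching hidden states for any shared sequence of `zs` and `xs`.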

This paper is available on arXiv under the CC BY 4.0 DEED license.
