3 Revisiting Normalization
3.1 Revisiting Euclidean Normalization
4 Riemannian Normalization on Lie Groups
5 LieBN on the Lie Groups of SPD Manifolds and 5.1 Deformed Lie Groups of SPD Manifolds
7 Conclusions, Acknowledgments, and References
APPENDIX CONTENTS
B Basic layers in SPDNet and TSMNet
C Statistical Results of Scaling in the LieBN
D LieBN as a Natural Generalization of Euclidean BN
E Domain-specific Momentum LieBN for EEG Classification
F Backpropagation of Matrix Functions
G Additional Details and Experiments of LieBN on SPD manifolds
H Preliminary Experiments on Rotation Matrices
I Proofs of the Lemmas and Theories in the Main Paper
The centering and biasing in Euclidean BN correspond to the group action of R. From a geometric perspective, the standard Euclidean metric is invariant under this group operation. Consequently, it is not surprising that our LieBN algorithm formulated in Alg. 1 serves as a natural generalization of standard Euclidean batch normalization. We formalize this fact in the following proposition.
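The Euclidean special case can be illustrated numerically. The sketch below is a minimal NumPy example, not the paper's Alg. 1: it writes standard Euclidean batch normalization as centering (translation by the inverse of the batch mean), scaling, and biasing (translation by the bias), and checks that translation leaves pairwise Euclidean distances unchanged. The function names `translate`, `euclidean_bn`, and `pairwise_dist` are hypothetical, and the per-feature variance is the usual choice of Euclidean BN, assumed here for illustration.

```python
import numpy as np

def translate(g, x):
    """Group action of (R^n, +) on itself: x -> g + x."""
    return g + x

def euclidean_bn(x, bias, scale=1.0, eps=1e-5):
    """Euclidean BN written as centering, scaling, and biasing."""
    mean = x.mean(axis=0)
    centered = translate(-mean, x)        # centering: translate by the inverse of the mean
    var = centered.var(axis=0)            # per-feature variance (assumption: standard BN)
    scaled = scale * centered / np.sqrt(var + eps)
    return translate(bias, scaled)        # biasing: translate by the bias

def pairwise_dist(x):
    """Matrix of Euclidean distances between all rows of x."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))           # batch of 8 points in R^3
g = rng.standard_normal(3)                # an arbitrary group element
bias = np.array([1.0, -2.0, 0.5])
y = euclidean_bn(x, bias)
```

After normalization the batch mean of `y` equals `bias`, and `pairwise_dist(translate(g, x))` agrees with `pairwise_dist(x)`, which is the invariance of the Euclidean metric under the group operation mentioned above.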
where i is the index of the domain. We follow the official code of SPDDSMBN [5] to implement our DSMLieBN. In short, DSMLieBN differs from SPDDSMBN only in how the normalization is performed.
Analogous to Thm. 5.3, computations for DSMLieBN under pullback metrics can also be performed by mapping, calculating, and then remapping.
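The map, calculate, remap pattern can be sketched for one concrete case. The example below assumes the Log-Euclidean metric, which is the pullback of the Euclidean metric on symmetric matrices by the matrix logarithm, and performs only domain-wise centering. Momentum updates, scaling, and biasing are omitted, and the helper names (`spd_log`, `sym_exp`, `center_per_domain`, `random_spd`) are hypothetical rather than taken from the TSMNet code base.

```python
import numpy as np

def spd_log(P):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, U = np.linalg.eigh(P)
    return (U * np.log(w)) @ U.T

def sym_exp(S):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, U = np.linalg.eigh(S)
    return (U * np.exp(w)) @ U.T

def center_per_domain(batch, domains):
    """Center each domain's SPD matrices at the identity (Log-Euclidean case)."""
    logs = np.stack([spd_log(P) for P in batch])          # map to symmetric matrices
    for d in np.unique(domains):
        idx = domains == d
        logs[idx] = logs[idx] - logs[idx].mean(axis=0)    # calculate: Euclidean centering
    return np.stack([sym_exp(S) for S in logs])           # remap to the SPD manifold

def random_spd(rng, n):
    """Random well-conditioned SPD matrix (illustrative data only)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

rng = np.random.default_rng(0)
batch = np.stack([random_spd(rng, 3) for _ in range(6)])
domains = np.array([0, 0, 0, 1, 1, 1])    # domain index i of each sample
centered = center_per_domain(batch, domains)
```

Within each domain, the mean of the matrix logarithms of `centered` is the zero matrix, so the Log-Euclidean mean of every domain sits at the identity.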
[5] https://github.com/rkobler/TSMNet
Our implementation of LieBN on SPD manifolds involves several matrix functions. Thus, we employ matrix backpropagation (BP) (Ionescu et al., 2015) for gradient computation. These matrix operations fall into two groups: the Cholesky decomposition and functions based on eigendecomposition.
The differentiation of the Cholesky decomposition can be found in Murray (2016, Eq. 8) or Lin (2019, Prop. 4). Moreover, our own implementation of BP for the Cholesky decomposition yields gradients similar to those produced by the autograd of torch.linalg.cholesky. Therefore, we use torch.linalg.cholesky in the experiments.
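A hand-written backward pass of this kind can be checked against finite differences. The sketch below is not the authors' implementation. It uses the standard reverse-mode rule for S = L Lᵀ, namely that the gradient w.r.t. S is the symmetrized L⁻ᵀ Φ(Lᵀ ∇_L) L⁻¹, where Φ keeps the lower triangle and halves the diagonal. NumPy is used for self-containment, and `phi`, `cholesky_backward`, and the test loss are hypothetical names introduced for illustration.

```python
import numpy as np

def phi(A):
    """Lower triangle of A with the diagonal halved."""
    return np.tril(A) - 0.5 * np.diag(np.diag(A))

def cholesky_backward(S, grad_L):
    """Gradient w.r.t. symmetric S given the gradient w.r.t. its Cholesky factor L."""
    L = np.linalg.cholesky(S)
    P = phi(L.T @ np.tril(grad_L))        # only the lower triangle of grad_L matters
    # G = L^{-T} P L^{-1}, computed with two linear solves
    G = np.linalg.solve(L.T, np.linalg.solve(L.T, P.T).T)
    return 0.5 * (G + G.T)                # symmetrize, since S is symmetric

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = A @ A.T + 4.0 * np.eye(4)             # SPD input
W = rng.standard_normal((4, 4))           # stand-in upstream gradient
V = rng.standard_normal((4, 4))
V = 0.5 * (V + V.T)                       # symmetric perturbation direction

def loss(S_):
    """Toy scalar loss: inner product of W with the Cholesky factor."""
    return np.sum(W * np.linalg.cholesky(S_))

eps = 1e-6
fd = (loss(S + eps * V) - loss(S - eps * V)) / (2 * eps)   # central difference
analytic = np.sum(cholesky_backward(S, W) * V)             # directional derivative
```

The two directional derivatives `fd` and `analytic` agree up to finite-difference error. The same kind of check can be run against an autograd implementation.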
where ∇_X L is the Euclidean gradient of the loss function L w.r.t. X. The matrix K is defined as
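In the form this backward pass usually takes in the matrix backpropagation literature (the Daleckii-Krein formula), a function Y = U f(Σ) Uᵀ of a symmetric matrix X = U Σ Uᵀ has input gradient U (K ∘ (Uᵀ ∇_Y L U)) Uᵀ, where K holds the first divided differences of f at the eigenvalues. That form is assumed here rather than quoted from the paper, and the paper's own symbols may differ. The NumPy sketch below is a finite-difference check of that formula for f = log. The names `loewner`, `sym_fn`, and `sym_fn_backward` are hypothetical.

```python
import numpy as np

def loewner(s, f, df, tol=1e-10):
    """K_ij = (f(s_i) - f(s_j)) / (s_i - s_j), or f'(s_i) when s_i and s_j coincide."""
    diff = s[:, None] - s[None, :]
    same = np.abs(diff) < tol
    quot = (f(s)[:, None] - f(s)[None, :]) / np.where(same, 1.0, diff)
    return np.where(same, df(s)[:, None], quot)

def sym_fn(X, f):
    """Apply f to a symmetric matrix through its eigendecomposition."""
    s, U = np.linalg.eigh(X)
    return (U * f(s)) @ U.T

def sym_fn_backward(X, grad_Y, f, df):
    """Gradient w.r.t. X of a loss whose gradient w.r.t. Y = f(X) is grad_Y."""
    s, U = np.linalg.eigh(X)
    G = 0.5 * (grad_Y + grad_Y.T)         # symmetric part of the upstream gradient
    return U @ (loewner(s, f, df) * (U.T @ G @ U)) @ U.T

f, df = np.log, lambda s: 1.0 / s         # matrix logarithm and its scalar derivative

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4.0 * np.eye(4)             # SPD input
G = rng.standard_normal((4, 4))
G = 0.5 * (G + G.T)                       # stand-in upstream gradient
V = rng.standard_normal((4, 4))
V = 0.5 * (V + V.T)                       # symmetric perturbation direction

eps = 1e-6
fd = (np.sum(G * sym_fn(X + eps * V, f)) - np.sum(G * sym_fn(X - eps * V, f))) / (2 * eps)
analytic = np.sum(sym_fn_backward(X, G, f, df) * V)
```

The `tol` branch in `loewner` covers repeated eigenvalues, where the divided difference is replaced by the derivative. For X = 2I and f = log, the backward pass therefore returns G / 2.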
This paper is available on arXiv under the CC BY-NC-SA 4.0 DEED license.
Authors:
(1) Ziheng Chen, University of Trento;
(2) Yue Song, University of Trento (corresponding author);
(3) Yunmei Liu, University of Louisville;
(4) Nicu Sebe, University of Trento.