neuromancer.slim.linear module
Structured linear maps that are drop-in replacements for torch.nn.Linear.
PyTorch weight initializations used in this module:
torch.nn.init.xavier_normal_(tensor, gain=1.0)
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
torch.nn.init.orthogonal_(tensor, gain=1)
torch.nn.init.sparse_(tensor, sparsity, std=0.01)
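As a minimal sketch of how these initializers are called, each one fills a pre-allocated tensor in place. The 4x4 shape and the sparsity value of 0.5 are arbitrary choices for illustration, not values taken from this module.

```python
import torch

torch.manual_seed(0)

# Each initializer overwrites the given tensor in place and returns it.
w_xavier = torch.nn.init.xavier_normal_(torch.empty(4, 4), gain=1.0)
w_kaiming = torch.nn.init.kaiming_normal_(
    torch.empty(4, 4), a=0, mode="fan_in", nonlinearity="leaky_relu"
)
w_orth = torch.nn.init.orthogonal_(torch.empty(4, 4), gain=1)
# sparsity is the fraction of entries in each column that are set to zero.
w_sparse = torch.nn.init.sparse_(torch.empty(4, 4), sparsity=0.5, std=0.01)

# For a square tensor, orthogonal_ yields W^T W = I up to rounding.
orth_error = (w_orth.T @ w_orth - torch.eye(4)).abs().max().item()
```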
- class neuromancer.slim.linear.BoundedNormLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, p=2, **kwargs)[source]
Bases:
Linear
sigma_min <= ||A||_p <= sigma_max
p = type of the matrix norm
sigma_min = minimum allowed value of eigenvalues
sigma_max = maximum allowed value of eigenvalues
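One way to express such a bound is as a soft penalty that is zero whenever the norm lies inside the allowed interval. The sketch below is an assumption about how this could be done, not the class's confirmed implementation: the function name norm_bound_penalty is hypothetical, and the induced 2-norm is used as an example choice of norm.

```python
import torch

def norm_bound_penalty(W, sigma_min=0.1, sigma_max=1.0):
    """Hypothetical helper: penalty for a matrix norm outside [sigma_min, sigma_max]."""
    norm = torch.linalg.matrix_norm(W, ord=2)  # largest singular value
    # Each relu term is active only when its side of the bound is violated.
    return torch.relu(sigma_min - norm) + torch.relu(norm - sigma_max)

inside = norm_bound_penalty(0.5 * torch.eye(3))     # norm 0.5 is within bounds
too_large = norm_bound_penalty(2.0 * torch.eye(3))  # norm 2.0 exceeds sigma_max
```

Adding such a term to the training loss pushes the norm back toward the interval without enforcing the bound exactly.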
- class neuromancer.slim.linear.ButterflyLinear(insize, outsize, bias=False, complex=False, tied_weight=True, increasing_stride=True, ortho_init=False, **kwargs)[source]
Bases:
LinearBase
Sparse structured linear maps from: https://github.com/HazyResearch/learning-circuits
- class neuromancer.slim.linear.DampedSkewSymmetricLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=0.5, **kwargs)[source]
Bases:
SkewSymmetricLinear
Skew-symmetric linear map with damping.
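To illustrate the effect of damping, the sketch below subtracts a multiple of the identity from a skew-symmetric matrix. It assumes the damping takes the form A = (W - W^T) - gamma I with gamma somewhere between sigma_min and sigma_max. That form, and the names W and gamma, are illustrative assumptions. The eigenvalues of the skew-symmetric part are purely imaginary, so every eigenvalue of A has real part -gamma.

```python
import torch

torch.manual_seed(0)
n, gamma = 4, 0.3  # gamma chosen inside the default range [0.1, 0.5]

W = torch.randn(n, n, dtype=torch.float64)
# Skew-symmetric part minus a damping term on the diagonal.
A = (W - W.T) - gamma * torch.eye(n, dtype=torch.float64)

eigvals = torch.linalg.eigvals(A)  # complex eigenvalues
```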
- class neuromancer.slim.linear.GershgorinLinear(insize, outsize, bias=False, sigma_min=0.0, sigma_max=1.0, real=True, **kwargs)[source]
Bases:
SquareLinear
Uses Gershgorin Disc parametrization to constrain eigenvalues of the matrix. See:
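The underlying idea is the Gershgorin circle theorem: every eigenvalue of a square matrix lies in at least one disc centred on a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. The sketch below only checks this property on a random matrix. It does not reproduce how the class builds its weights.

```python
import torch

torch.manual_seed(0)
A = torch.randn(5, 5, dtype=torch.float64)

centers = torch.diagonal(A)
# Row sums of absolute values, excluding the diagonal entry.
radii = A.abs().sum(dim=1) - centers.abs()

eigvals = torch.linalg.eigvals(A)
# Distance from every eigenvalue to every disc centre: shape [5, 5].
dist = (eigvals.unsqueeze(1) - centers.unsqueeze(0).to(eigvals.dtype)).abs()
in_some_disc = (dist <= radii.unsqueeze(0) + 1e-9).any(dim=1)
```

Constraining the centres and radii therefore constrains where the eigenvalues can lie.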
- neuromancer.slim.linear.Hprod(x, u, k)[source]
Helper function for computing matrix multiply via Householder reflection representation.
:param x: (torch.Tensor shape=[batchsize, dimension])
:param u: (torch.Tensor shape=[dimension])
:param k: (int)
:return: (torch.Tensor shape=[batchsize, dimension])
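For orientation, a Householder reflection can be applied to a batch of row vectors without forming the reflection matrix. The sketch below is a simplified stand-in under stated assumptions: the name householder_apply is hypothetical, and it reflects across the full vector u rather than modelling the role of the k argument, which is not described here.

```python
import torch

def householder_apply(x, u):
    """Reflect each row of x across the hyperplane orthogonal to u."""
    # alpha has shape [batchsize]: 2 (x . u) / (u . u) for every row of x.
    alpha = 2.0 * (x @ u) / (u @ u)
    return x - alpha.unsqueeze(1) * u.unsqueeze(0)

torch.manual_seed(0)
x = torch.randn(3, 4, dtype=torch.float64)
u = torch.randn(4, dtype=torch.float64)
y = householder_apply(x, u)
```

A reflection preserves the norm of each row, and applying it twice recovers the original input.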
- class neuromancer.slim.linear.IdentityGradReLU(*args, **kwargs)[source]
Bases:
Function
We can implement our own custom autograd Functions by subclassing torch.autograd.Function and implementing the forward and backward passes which operate on Tensors.
- static backward(ctx, grad_output)[source]
In the backward pass we receive a Tensor containing the gradient of the loss with respect to the output, and we need to compute the gradient of the loss with respect to the input. Here we are just passing through the previous gradient since we want the gradient for this max operation to be gradient of identity.
- static forward(ctx, input)[source]
In the forward pass we receive a Tensor containing the input and return a Tensor containing the output. ctx is a context object that can be used to stash information for backward computation. You can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method.
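The behaviour described above can be sketched as follows. The class name IdentityGradReLUSketch is hypothetical, and the forward pass is assumed to be an ordinary ReLU (the "max operation" mentioned in the backward description).

```python
import torch

class IdentityGradReLUSketch(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Forward pass: clamp negative values to zero, as a ReLU does.
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        # Backward pass: return the incoming gradient unchanged.
        return grad_output

x = torch.tensor([-2.0, 0.5, 3.0], requires_grad=True)
y = IdentityGradReLUSketch.apply(x)
y.sum().backward()
```

Unlike a standard ReLU, the gradient reaching x is 1 even for the negative entry, so clamped values continue to receive gradient signal.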
- class neuromancer.slim.linear.IdentityInitLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
Linear
Linear map initialized to Identity matrix.
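A comparable effect can be obtained on a plain torch.nn.Linear layer with torch.nn.init.eye_. This is a sketch of the idea, not the class's own code.

```python
import torch

layer = torch.nn.Linear(3, 3, bias=False)
torch.nn.init.eye_(layer.weight)  # in-place identity initialization

x = torch.tensor([[1.0, -2.0, 3.0]])
with torch.no_grad():
    y = layer(x)  # equals x until training changes the weight
```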
- class neuromancer.slim.linear.IdentityLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
IdentityInitLinear
Identity operation compatible with all LinearBase functionality.
- class neuromancer.slim.linear.L0Linear(insize, outsize, bias=True, weight_decay=1.0, droprate_init=0.5, temperature=0.6666666666666666, lamda=1.0)[source]
Bases:
LinearBase
Implementation of L0 regularization for the input units of a fully connected layer
Reference implementation: https://github.com/AMLab-Amsterdam/L0_regularization/blob/master/l0_layers.py
Note
This implementation may need to be adjusted: the same sample is used for every input in the minibatch, which may inhibit convergence. Also, a different sample is drawn on each call during training, so it may cause issues if the layer is used in a recurrent computation (such as fx in a state space model).
- effective_W()[source]
The matrix used in the equivalent matrix multiplication for the parametrization
- Returns:
(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
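The sampling mentioned in the note for L0Linear refers to the stochastic gates of L0 regularization. A minimal sketch of hard-concrete gate sampling is shown below. The function name sample_hard_concrete and the stretch limits -0.1 and 1.1 are assumptions for illustration, and the way log_alpha is derived from droprate_init is likewise an assumed convention.

```python
import math
import torch

def sample_hard_concrete(log_alpha, temperature=2.0 / 3.0,
                         limit_a=-0.1, limit_b=1.1):
    """Sample gates in [0, 1] that can be exactly 0 or exactly 1."""
    # Uniform noise, kept away from 0 and 1 so the logs stay finite.
    u = torch.rand_like(log_alpha).clamp(1e-6, 1 - 1e-6)
    s = torch.sigmoid((torch.log(u) - torch.log(1 - u) + log_alpha) / temperature)
    # Stretch to (limit_a, limit_b), then clip back to [0, 1].
    return (s * (limit_b - limit_a) + limit_a).clamp(0.0, 1.0)

torch.manual_seed(0)
droprate_init = 0.5
log_alpha = torch.full(
    (1000,), math.log(1 - droprate_init) - math.log(droprate_init)
)
z = sample_hard_concrete(log_alpha)  # one gate per input unit
```

Multiplying the input units (or the corresponding rows of the weight) by z switches some of them off entirely. Because each call draws fresh noise, repeated calls within one forward pass see different gates.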
- class neuromancer.slim.linear.LassoLinear(insize, outsize, bias=False, gamma=1.0, **kwargs)[source]
Bases:
LinearBase
From https://leon.bottou.org/publications/pdf/compstat-2010.pdf
- effective_W()[source]
The matrix used in the equivalent matrix multiplication for the parametrization
- Returns:
(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
- class neuromancer.slim.linear.LassoLinearRELU(insize, outsize, bias=False, gamma=1.0, **kwargs)[source]
Bases:
LinearBase
From https://leon.bottou.org/publications/pdf/compstat-2010.pdf
- class neuromancer.slim.linear.LeftStochasticLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
A left stochastic matrix is a real square matrix, with each column summing to 1.
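One common way to obtain such a matrix from unconstrained parameters is a softmax down each column. The sketch below assumes that approach, and the orientation relative to effective_W is not taken from the source.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, dtype=torch.float64)

A = torch.softmax(W, dim=0)  # normalize down each column
col_sums = A.sum(dim=0)

# A left stochastic matrix maps a probability vector to a probability vector.
p = torch.tensor([0.1, 0.2, 0.3, 0.4], dtype=torch.float64)
q = A @ p
```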
- class neuromancer.slim.linear.Linear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
Wrapper for torch.nn.Linear with additional slim methods returning matrix, eigenvectors, eigenvalues and regularization error.
- class neuromancer.slim.linear.LinearBase(insize, outsize, bias=False, provide_weights=True)[source]
Bases:
Module, ABC
Base class defining linear map interface.
- property device
- abstract effective_W()[source]
The matrix used in the equivalent matrix multiplication for the parametrization
- Returns:
(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
- eig()[source]
Returns the eigenvalues (optionally eigenvectors) of the linear map used in matrix multiplication.
- Returns:
(torch.Tensor) Vector of eigenvalues, optionally a tuple including a matrix of eigenvectors.
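For reference, the same quantities can be computed for any square weight matrix with torch.linalg. Whether eig() uses these exact calls internally is not stated here, so treat this as an illustrative sketch on a hand-written matrix.

```python
import torch

W = torch.tensor([[2.0, 0.0],
                  [1.0, 3.0]])

# Eigenvalues only (returned as a complex tensor).
eigenvalues = torch.linalg.eigvals(W)

# Eigenvalues together with a matrix whose columns are eigenvectors.
eigenvalues_2, eigenvectors = torch.linalg.eig(W)
```

Because W is triangular, its eigenvalues are the diagonal entries 2 and 3. The order in which they are returned is not guaranteed.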
- class neuromancer.slim.linear.NonNegativeLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
Positive parametrization of linear map via ReLU.
- class neuromancer.slim.linear.OrthogonalLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
SquareLinear
Orthogonal parametrization via Householder reflection.
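The reason this parametrization works is that each Householder reflection is orthogonal, and so is any product of them. The sketch below builds explicit reflection matrices to check this. The helper name householder_matrix is hypothetical, and a practical implementation would normally avoid forming the matrices.

```python
import torch

def householder_matrix(u):
    """H = I - 2 u u^T / (u^T u), a reflection and therefore orthogonal."""
    eye = torch.eye(u.numel(), dtype=u.dtype)
    return eye - 2.0 * torch.outer(u, u) / (u @ u)

torch.manual_seed(0)
n = 4
Q = torch.eye(n, dtype=torch.float64)
for _ in range(n):
    # Any nonzero vector u defines a valid reflection.
    Q = Q @ householder_matrix(torch.randn(n, dtype=torch.float64))

orth_error = (Q.T @ Q - torch.eye(n, dtype=torch.float64)).abs().max().item()
```

The vectors u can be trained freely, and Q stays orthogonal throughout.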
- class neuromancer.slim.linear.PSDLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
SquareLinear
Symmetric positive semi-definite matrix.
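A standard construction, assumed here for illustration rather than taken from the class, is A = W^T W, which is symmetric with non-negative eigenvalues for any real W.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, dtype=torch.float64)

A = W.T @ W                        # symmetric positive semi-definite
eigvals = torch.linalg.eigvalsh(A)  # real eigenvalues, ascending order
```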
- class neuromancer.slim.linear.PerronFrobeniusLinear(insize, outsize, bias=False, sigma_min=0.8, sigma_max=1.0, **kwargs)[source]
Bases:
LinearBase
- class neuromancer.slim.linear.PowerBoundLinear(insize, outsize, max_p=1, pwr_iters=200, bias=False, **kwargs)[source]
Bases:
LinearBase
Linear map with constrained spectral radius via the power method.
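As a rough sketch of the power method in this setting, the code below estimates the largest singular value of a matrix and rescales the matrix when the estimate exceeds a bound. It assumes the bound is applied to the spectral norm, which is never smaller than the spectral radius. The function name spectral_norm_estimate and the rescaling rule are illustrative, not the class's confirmed behaviour.

```python
import torch

def spectral_norm_estimate(W, iters=200):
    """Power iteration on W^T W to approximate the largest singular value."""
    v = torch.ones(W.shape[1], dtype=W.dtype)
    for _ in range(iters):
        v = W.T @ (W @ v)
        v = v / v.norm()
    return (W @ v).norm()

W = torch.diag(torch.tensor([3.0, 1.0, 0.5], dtype=torch.float64))
max_p = 1.0

sigma = spectral_norm_estimate(W)
# Shrink W only if its estimated norm exceeds the bound.
W_bounded = W * torch.clamp(max_p / sigma, max=1.0)
```

Convergence is fast here because the largest singular value is well separated from the others. Closely spaced singular values need more iterations.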
- class neuromancer.slim.linear.RightStochasticLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
A right stochastic matrix is a real square matrix, with each row summing to 1.
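A softmax along each row is one way to produce such a matrix from unconstrained parameters. This is an assumed construction for illustration. Since every row sums to 1, the all-ones vector is an eigenvector with eigenvalue 1.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, dtype=torch.float64)

A = torch.softmax(W, dim=1)  # normalize along each row
ones = torch.ones(4, dtype=torch.float64)
A_ones = A @ ones            # equals the ones vector up to rounding
```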
- class neuromancer.slim.linear.SVDLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
LinearBase
Linear map with constrained eigenvalues via approximate SVD factorization. Soft SVD based regularization of matrix \(A\). \(A = U \Sigma V\). \(U,V\) are unitary matrices (orthogonal for real matrices \(A\)). \(\Sigma\) is a diagonal matrix of singular values (square roots of the eigenvalues of \(A^T A\)).
The paper below uses the same factorization and orthogonality constraint as implemented here, but enforces a low-rank prior on the map by introducing a sparse prior on the singular values:
A similar regularization on the factors to the one in our implementation is also used in:
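The factorization \(A = U \Sigma V\) with soft orthogonality can be sketched as follows. The penalty form (Frobenius norm of \(I - M^T M\)), the sigmoid squashing of the singular values, and the name orthogonality_penalty are assumptions made for illustration. The sketch uses exactly orthogonal factors from a QR decomposition so that the effect on the singular values is easy to verify.

```python
import torch

torch.manual_seed(0)
n, sigma_min, sigma_max = 4, 0.1, 1.0

def orthogonality_penalty(M):
    """Hypothetical soft constraint: zero exactly when M^T M = I."""
    eye = torch.eye(M.shape[0], dtype=M.dtype)
    return torch.linalg.matrix_norm(eye - M.T @ M)  # Frobenius norm

# Exactly orthogonal factors, obtained here from QR decompositions.
U, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))
V, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))

# Squash free parameters into [sigma_min, sigma_max].
s = torch.randn(n, dtype=torch.float64)
sigma = sigma_min + (sigma_max - sigma_min) * torch.sigmoid(s)

A = U @ torch.diag(sigma) @ V
penalty = orthogonality_penalty(U) + orthogonality_penalty(V)
singular_values = torch.linalg.svdvals(A)
```

When the penalty is zero, the singular values of A are exactly the entries of sigma. During training the factors are only approximately orthogonal, so the bounds hold approximately.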
- class neuromancer.slim.linear.SVDLinearLearnBounds(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
SVDLinear
- class neuromancer.slim.linear.SchurDecompositionLinear(insize, outsize, bias=False, l2=0.01, **kwargs)[source]
Bases:
SquareLinear
- class neuromancer.slim.linear.SkewSymmetricLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
SquareLinear
Skew-symmetric (or antisymmetric) matrix \(A\) (effective_W) is a square matrix whose transpose equals its negative. \(A = -A^T\)
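A skew-symmetric matrix can be built from any square parameter matrix W as W - W^T. The sketch below assumes that construction and checks two consequences: \(A = -A^T\), and the quadratic form x^T A x vanishes for every x.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, dtype=torch.float64)

A = W - W.T  # skew-symmetric by construction

x = torch.randn(4, dtype=torch.float64)
quad = x @ A @ x  # zero up to rounding for any x
```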
- class neuromancer.slim.linear.SpectralLinear(insize, outsize, bias=False, n_U_reflectors=None, n_V_reflectors=None, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
LinearBase
SVD parameterized linear map of form \(U \Sigma V\) via Householder reflection. Singular values can be constrained to a range. Translated from TensorFlow code:
- class neuromancer.slim.linear.SplitLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
\(A = B - C\), with \(B \geq 0\) and \(C \geq 0\).
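As a small worked example, the two non-negative parts can be obtained by passing unconstrained parameters through a ReLU. That choice, and the names B_raw and C_raw, are assumptions for illustration.

```python
import torch

B_raw = torch.tensor([[1.0, -2.0], [0.5, 3.0]])
C_raw = torch.tensor([[-1.0, 4.0], [2.0, -0.5]])

B = torch.relu(B_raw)  # [[1.0, 0.0], [0.5, 3.0]]
C = torch.relu(C_raw)  # [[0.0, 4.0], [2.0, 0.0]]
A = B - C              # entries of A may take either sign
```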
- class neuromancer.slim.linear.SquareLinear(insize, outsize, bias=False, provide_weights=True, **kwargs)[source]
Bases:
LinearBase, ABC
Base class for linear map parametrizations that assume a square matrix.
- class neuromancer.slim.linear.StableSplitLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
LinearBase
\(A = B - C\), with stable \(B\) and stable \(C\).
- class neuromancer.slim.linear.SymmetricLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
SquareLinear
Symmetric matrix \(A\) (effective_W) is a square matrix that is equal to its transpose. \(A = A^T\)
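One simple construction, assumed here for illustration, is to average a square parameter matrix with its transpose. A real symmetric matrix has real eigenvalues, which torch.linalg.eigvalsh returns directly.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, dtype=torch.float64)

A = 0.5 * (W + W.T)                 # symmetric by construction
eigvals = torch.linalg.eigvalsh(A)  # real-valued, ascending order
```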
- class neuromancer.slim.linear.SymmetricSVDLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
SVDLinear
\(U = V\)
- class neuromancer.slim.linear.SymmetricSpectralLinear(insize, outsize, bias=False, n_reflectors=None, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]
Bases:
SpectralLinear
\(U = V\)
- class neuromancer.slim.linear.SymplecticLinear(insize, outsize, bias=False, **kwargs)[source]
Bases:
LinearBase
- class neuromancer.slim.linear.TrivialNullSpaceLinear(insize, outsize, bias=False, rank=None, epsilon=0.1, **kwargs)[source]
Bases:
LinearBase
Matrix with trivial null space as defined via eq. 2 in https://arxiv.org/abs/1808.00924