neuromancer.slim.linear module

Structured linear maps which are drop-in replacements for torch.nn.Linear.

PyTorch weight initializations used in this module:

  • torch.nn.init.xavier_normal_(tensor, gain=1.0)

  • torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

  • torch.nn.init.orthogonal_(tensor, gain=1)

  • torch.nn.init.sparse_(tensor, sparsity, std=0.01)
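As a minimal sketch of how these initializers are called, each function fills a pre-allocated tensor in place and returns it. The tensor shapes and the sparsity value below are arbitrary choices for illustration, not values taken from this module.

```python
import torch

torch.manual_seed(0)

# Arbitrary illustrative shapes; the module's own shapes depend on insize/outsize.
w_xavier = torch.nn.init.xavier_normal_(torch.empty(4, 4), gain=1.0)
w_kaiming = torch.nn.init.kaiming_normal_(
    torch.empty(4, 4), a=0, mode='fan_in', nonlinearity='leaky_relu'
)
w_ortho = torch.nn.init.orthogonal_(torch.empty(4, 4), gain=1)
# sparse_ sets a fraction `sparsity` of the entries in every column to zero.
w_sparse = torch.nn.init.sparse_(torch.empty(10, 4), sparsity=0.5, std=0.01)

# An orthogonally initialized square matrix satisfies W W^T = I up to round-off.
ortho_residual = (w_ortho @ w_ortho.T - torch.eye(4)).abs().max()
zeros_per_column = (w_sparse == 0).sum(dim=0)
```

Here ortho_residual should be close to zero, and zeros_per_column counts the zeroed entries in each column of the sparse tensor.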

class neuromancer.slim.linear.BoundedNormLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, p=2, **kwargs)[source]

Bases: Linear

sigma_min <= ||A||_p <= sigma_max

  • p = type of the matrix norm

  • sigma_min = minimum allowed value of eigenvalues

  • sigma_max = maximum allowed value of eigenvalues

reg_error()[source]

Regularization error associated with linear map parametrization.

Returns:

(torch.float)
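One way to picture a regularization error for the bound sigma_min <= ||A||_p <= sigma_max is a penalty that is zero inside the interval and grows linearly outside it. The sketch below assumes an induced matrix norm and a hinge-style penalty. The helper name norm_bound_penalty is hypothetical, and the class's actual formula may differ.

```python
import torch
import torch.nn.functional as F

def norm_bound_penalty(A, sigma_min=0.1, sigma_max=1.0, p=2):
    """Hypothetical helper: zero when sigma_min <= ||A||_p <= sigma_max."""
    norm = torch.linalg.matrix_norm(A, ord=p)   # induced p-norm (assumption)
    return F.relu(norm - sigma_max) + F.relu(sigma_min - norm)

inside = norm_bound_penalty(0.5 * torch.eye(3))    # norm 0.5 lies within the bounds
too_big = norm_bound_penalty(2.0 * torch.eye(3))   # norm 2.0 exceeds sigma_max by 1.0
```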

class neuromancer.slim.linear.ButterflyLinear(insize, outsize, bias=False, complex=False, tied_weight=True, increasing_stride=True, ortho_init=False, **kwargs)[source]

Bases: LinearBase

Sparse structured linear maps from: https://github.com/HazyResearch/learning-circuits

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])

class neuromancer.slim.linear.DampedSkewSymmetricLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=0.5, **kwargs)[source]

Bases: SkewSymmetricLinear

Skew-symmetric linear map with damping.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
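A common way to add damping to a skew-symmetric matrix S is to subtract a positive multiple of the identity, which shifts every eigenvalue to the left of the imaginary axis. The sketch below illustrates that idea only. The coefficient gamma is hypothetical, and it is an assumption that this class uses the same form.

```python
import torch

torch.manual_seed(0)
n = 4
M = torch.randn(n, n, dtype=torch.float64)
S = M - M.T                                   # skew-symmetric: S = -S^T
gamma = 0.3                                   # hypothetical damping coefficient
A = S - gamma**2 * torch.eye(n, dtype=torch.float64)

# Eigenvalues of S are purely imaginary, so those of A all have real part -gamma^2.
real_parts = torch.linalg.eigvals(A).real
```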

class neuromancer.slim.linear.GershgorinLinear(insize, outsize, bias=False, sigma_min=0.0, sigma_max=1.0, real=True, **kwargs)[source]

Bases: SquareLinear

Uses Gershgorin Disc parametrization to constrain eigenvalues of the matrix. See:

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
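The Gershgorin circle theorem behind this parametrization states that every eigenvalue of a square matrix lies in at least one disc centered at a diagonal entry, with radius equal to the sum of absolute off-diagonal entries in that row. The following sketch checks the theorem numerically on a random matrix. It does not reproduce how the class builds its weights.

```python
import torch

torch.manual_seed(0)
A = torch.randn(5, 5, dtype=torch.float64)

centers = torch.diagonal(A)                      # disc centers: diagonal entries
radii = A.abs().sum(dim=1) - centers.abs()       # disc radii: off-diagonal row sums

eigvals = torch.linalg.eigvals(A)
# Distance from every eigenvalue (rows) to every disc center (columns).
dist = (eigvals.unsqueeze(1) - centers.to(eigvals.dtype).unsqueeze(0)).abs()
in_some_disc = (dist <= radii.unsqueeze(0) + 1e-9).any(dim=1)
```

Constraining the centers and radii therefore constrains where the eigenvalues can lie.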

neuromancer.slim.linear.Hprod(x, u, k)[source]

Helper function for computing matrix multiply via Householder reflection representation.

Parameters:

  • x – (torch.Tensor, shape=[batchsize, dimension])

  • u – (torch.Tensor, shape=[dimension])

  • k – (int)

Returns:

(torch.Tensor, shape=[batchsize, dimension])
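For orientation, a Householder reflection maps x to x - 2 (x·u) u / (u·u), so it can be applied to a batch without ever forming the reflection matrix. The sketch below is a simplified, hypothetical helper that reflects across the full dimension. The role of the argument k is not described here, so the sketch leaves it out.

```python
import torch

def householder_apply(x, u):
    """Apply H = I - 2 u u^T / (u^T u) to each row of x (hypothetical helper)."""
    coeff = 2.0 * (x @ u) / (u @ u)               # shape [batchsize]
    return x - coeff.unsqueeze(1) * u.unsqueeze(0)

torch.manual_seed(0)
x = torch.randn(3, 4, dtype=torch.float64)
u = torch.randn(4, dtype=torch.float64)
y = householder_apply(x, u)
```

A reflection preserves the norm of each row, and applying it twice recovers the input.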

class neuromancer.slim.linear.IdentityGradReLU(*args, **kwargs)[source]

Bases: Function

We can implement our own custom autograd Functions by subclassing torch.autograd.Function and implementing the forward and backward passes which operate on Tensors.

static backward(ctx, grad_output)[source]

In the backward pass we receive a Tensor containing the gradient of the loss with respect to the output, and we need to compute the gradient of the loss with respect to the input. Here we are just passing through the previous gradient since we want the gradient for this max operation to be gradient of identity.

static forward(ctx, input)[source]

In the forward pass we receive a Tensor containing the input and return a Tensor containing the output. ctx is a context object that can be used to stash information for backward computation. You can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method.
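The forward and backward descriptions above can be sketched as a small custom autograd Function. The class name StraightThroughReLU is hypothetical, and this is a minimal re-creation of the described behavior rather than the module's source.

```python
import torch

class StraightThroughReLU(torch.autograd.Function):
    """Hypothetical sketch: ReLU in the forward pass, identity gradient in the backward pass."""

    @staticmethod
    def forward(ctx, input):
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        # Pass the incoming gradient through unchanged, even where input < 0.
        return grad_output

x = torch.tensor([-1.0, 2.0], requires_grad=True)
y = StraightThroughReLU.apply(x)
y.sum().backward()
```

With a standard ReLU the gradient at the negative input would be zero. Here both entries of x.grad receive the upstream gradient.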

class neuromancer.slim.linear.IdentityInitLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: Linear

Linear map initialized to Identity matrix.

class neuromancer.slim.linear.IdentityLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: IdentityInitLinear

Identity operation compatible with all LinearBase functionality.

class neuromancer.slim.linear.L0Linear(insize, outsize, bias=True, weight_decay=1.0, droprate_init=0.5, temperature=0.6666666666666666, lamda=1.0)[source]

Bases: LinearBase

Implementation of L0 regularization for the input units of a fully connected layer

Note

This implementation may need to be adjusted, as the same sampling is used for each input in the minibatch, which may inhibit convergence. Also, a different sampling is drawn on each call during training, so it may cause issues if the layer is included in a recurrent computation (e.g., fx in a state space model).

cdf_qz(x)[source]

Implements the CDF of the ‘stretched’ concrete distribution

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

get_eps(size)[source]

Uniform random numbers for the concrete distribution

quantile_concrete(x)[source]

Implements the quantile, aka inverse CDF, of the ‘stretched’ concrete distribution
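A sketch of how a quantile function of this kind turns uniform noise into gate values, following the hard-concrete construction of Louizos et al. The stretch limits -0.1 and 1.1, the function name, and the zero-valued gate parameters are assumptions for illustration. Only the temperature matches the default in the signature above.

```python
import torch

# Stretch limits are assumptions; the temperature mirrors the constructor default.
LIMIT_A, LIMIT_B, TEMPERATURE = -0.1, 1.1, 2.0 / 3.0

def quantile_stretched_concrete(u, log_alpha):
    """Map uniform noise u in (0, 1) to a sample stretched to (LIMIT_A, LIMIT_B)."""
    s = torch.sigmoid((torch.log(u) - torch.log(1 - u) + log_alpha) / TEMPERATURE)
    return s * (LIMIT_B - LIMIT_A) + LIMIT_A

torch.manual_seed(0)
u = torch.rand(1000).clamp(1e-6, 1 - 1e-6)    # keep the logarithms finite
log_alpha = torch.zeros(1000)                 # hypothetical gate parameters
z = quantile_stretched_concrete(u, log_alpha)
gates = z.clamp(0.0, 1.0)                     # clipping lets gates hit exactly 0 or 1
```

Because the stretched interval extends beyond [0, 1], clipping assigns nonzero probability to gates that are exactly closed or exactly open.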

reg_error()[source]

Expected L0 norm under the stochastic gates; also takes into account and re-weights a potential L2 penalty.

class neuromancer.slim.linear.LassoLinear(insize, outsize, bias=False, gamma=1.0, **kwargs)[source]

Bases: LinearBase

From https://leon.bottou.org/publications/pdf/compstat-2010.pdf

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])

reg_error()[source]

Regularization error associated with linear map parametrization.

Returns:

(torch.float)

class neuromancer.slim.linear.LassoLinearRELU(insize, outsize, bias=False, gamma=1.0, **kwargs)[source]

Bases: LinearBase

From https://leon.bottou.org/publications/pdf/compstat-2010.pdf

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

reg_error()[source]

Regularization error associated with linear map parametrization.

Returns:

(torch.float)

class neuromancer.slim.linear.LeftStochasticLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

A left stochastic matrix is a real square matrix, with each column summing to 1.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
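One straightforward way to obtain columns that sum to 1 from unconstrained parameters is a softmax taken down each column. This is a sketch of that idea, with an arbitrary 4x4 size. Whether the class normalizes along this exact dimension is an assumption.

```python
import torch

torch.manual_seed(0)
weight = torch.randn(4, 4)               # unconstrained parameters
W = torch.softmax(weight, dim=0)         # normalize down each column
column_sums = W.sum(dim=0)
```

Softmax also makes every entry strictly positive, so each column is a valid probability vector.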

class neuromancer.slim.linear.Linear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

Wrapper for torch.nn.Linear with additional slim methods returning matrix, eigenvectors, eigenvalues and regularization error.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])
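The shape [insize, outsize] documented for effective_W is the transpose of how torch.nn.Linear stores its weight. The sketch below uses plain torch.nn.Linear, not this wrapper, to show that right-multiplying the input by the transposed weight reproduces the layer output.

```python
import torch

torch.manual_seed(0)
insize, outsize = 3, 2
linear = torch.nn.Linear(insize, outsize, bias=False)

# torch.nn.Linear stores its weight as [out_features, in_features], so the
# matrix that right-multiplies x has the transposed shape [insize, outsize].
W = linear.weight.T
x = torch.randn(5, insize)
y_matmul = x @ W
y_layer = linear(x)
```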

class neuromancer.slim.linear.LinearBase(insize, outsize, bias=False, provide_weights=True)[source]

Bases: Module, ABC

Base class defining linear map interface.

property device
abstract effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

eig()[source]

Returns the eigenvalues (optionally eigenvectors) of the linear map used in matrix multiplication.

Returns:

(torch.Tensor) Vector of eigenvalues, optionally a tuple including a matrix of eigenvectors.
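For reference, eigenvalues and eigenvectors of a square matrix can be computed directly in PyTorch as sketched below. The 2x2 matrix is an arbitrary example, and it is an assumption that eig() relies on these particular calls.

```python
import torch

W = torch.tensor([[2.0, 0.0],
                  [1.0, 3.0]])                     # triangular, so eigenvalues are 2 and 3

eigenvalues = torch.linalg.eigvals(W)              # eigenvalues only
eigenvalues2, eigenvectors = torch.linalg.eig(W)   # eigenvalues and eigenvectors
```

Both calls return complex tensors, since a real matrix can have complex eigenvalues. An eigendecomposition is only defined for square matrices.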

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])

reg_error()[source]

Regularization error associated with linear map parametrization.

Returns:

(torch.float)

class neuromancer.slim.linear.NonNegativeLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

Nonnegative parametrization of the linear map via ReLU.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
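A minimal sketch of the ReLU parametrization: the stored parameters are unconstrained, and the matrix used in the multiplication is their elementwise ReLU. The shape is arbitrary and the variable names are illustrative.

```python
import torch

torch.manual_seed(0)
weight = torch.randn(3, 4)          # unconstrained parameters, may be negative
W = torch.relu(weight)              # negative entries are clipped to zero
```

The result is nonnegative rather than strictly positive, because clipped entries are exactly zero.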

class neuromancer.slim.linear.OrthogonalLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: SquareLinear

Orthogonal parametrization via Householder reflection.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])
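The Householder parametrization rests on the fact that a product of Householder reflections is orthogonal. The sketch below builds such a product explicitly and checks orthogonality. The helper name and the choice of one reflector per dimension are assumptions, and an efficient implementation would apply the reflections to the input instead of forming the matrix.

```python
import torch

def householder_matrix(u):
    """H = I - 2 u u^T / (u^T u), an orthogonal reflection matrix."""
    n = u.shape[0]
    return torch.eye(n, dtype=u.dtype) - 2.0 * torch.outer(u, u) / (u @ u)

torch.manual_seed(0)
n = 4
reflectors = torch.randn(n, n, dtype=torch.float64)   # one hypothetical vector per reflection

Q = torch.eye(n, dtype=torch.float64)
for u in reflectors:
    Q = Q @ householder_matrix(u)

residual = (Q.T @ Q - torch.eye(n, dtype=torch.float64)).abs().max()
```

The residual stays at round-off level for any choice of nonzero reflector vectors, so orthogonality holds throughout training without a penalty term.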

class neuromancer.slim.linear.PSDLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: SquareLinear

Symmetric Positive semi-definite matrix.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
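A standard construction for a symmetric positive semi-definite matrix from unconstrained parameters M is the product M^T M. The sketch below illustrates that construction. It is an assumption that the class uses this exact form.

```python
import torch

torch.manual_seed(0)
M = torch.randn(4, 4, dtype=torch.float64)   # unconstrained parameters
W = M.T @ M                                   # symmetric PSD by construction

eigenvalues = torch.linalg.eigvalsh(W)        # real eigenvalues of a symmetric matrix
```

For any vector x, x^T W x equals the squared norm of M x, so it cannot be negative.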

class neuromancer.slim.linear.PerronFrobeniusLinear(insize, outsize, bias=False, sigma_min=0.8, sigma_max=1.0, **kwargs)[source]

Bases: LinearBase

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.PowerBoundLinear(insize, outsize, max_p=1, pwr_iters=200, bias=False, **kwargs)[source]

Bases: LinearBase

Linear map with constrained spectral radius via the power method.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

eig_v_estimate()[source]
reg_error()[source]

Regularization error enforces upper bound on spectral radius
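The power method estimates the dominant eigenvalue magnitude by repeatedly multiplying a vector by the matrix and renormalizing. The sketch below pairs that estimate with a hinge penalty on the bound. The helper name, the starting vector, and the penalty form are assumptions, and max_p and the iteration count are borrowed from the constructor signature only as labels.

```python
import torch

def power_iteration(W, iters=200):
    """Estimate the dominant eigenvalue magnitude of a square matrix W."""
    v = torch.ones(W.shape[0], dtype=W.dtype)
    for _ in range(iters):
        v = W @ v
        v = v / v.norm()
    return (W @ v).norm()                      # ||W v|| with ||v|| = 1

W = torch.tensor([[3.0, 1.0],
                  [0.0, 1.0]], dtype=torch.float64)   # eigenvalues 3 and 1
estimate = power_iteration(W)
max_p = 1.0                                    # upper bound to enforce
penalty = torch.relu(estimate - max_p)         # positive only when the bound is violated
```

The estimate is reliable when a single eigenvalue dominates in magnitude. Convergence slows as the gap to the second-largest eigenvalue shrinks.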

class neuromancer.slim.linear.RightStochasticLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

A right stochastic matrix is a real square matrix, with each row summing to 1.

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
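Rows summing to 1 has a useful spectral consequence: the vector of ones is an eigenvector with eigenvalue 1, and no eigenvalue is larger in magnitude. The sketch below builds a row-stochastic matrix with a row-wise softmax, which is an assumed construction, and checks both properties.

```python
import torch

torch.manual_seed(0)
weight = torch.randn(4, 4, dtype=torch.float64)   # unconstrained parameters
W = torch.softmax(weight, dim=1)                   # normalize along each row

ones = torch.ones(4, dtype=torch.float64)
W_ones = W @ ones                                  # equals the ones vector
spectral_radius = torch.linalg.eigvals(W).abs().max()
```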

class neuromancer.slim.linear.SVDLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: LinearBase

Linear map with constrained eigenvalues via approximate SVD factorization. Soft SVD based regularization of matrix \(A\), with \(A = U \Sigma V\). \(U, V\) are unitary matrices (orthogonal for real matrices \(A\)). \(\Sigma\) is a diagonal matrix of singular values (square roots of the eigenvalues of \(A^T A\)).

The paper below uses the same factorization and orthogonality constraint as implemented here, but enforces a low-rank prior on the map by introducing a sparse prior on the singular values:

A regularization on the factors similar to the one in our implementation is also used in:

effective_W()[source]
Returns:

Matrix for linear transformation with dominant eigenvalue between sigma_min and sigma_max

orthogonal_error(weight)[source]
reg_error()[source]

Regularization error enforces orthogonality constraint for matrix factors
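The two ingredients described for this class can be sketched as an orthogonality penalty on a factor and a squashing function that keeps singular values in a range. Both helper names and the sigmoid form of the bound are assumptions for illustration. The factors below are made exactly orthogonal with a QR decomposition, whereas the class keeps them only approximately orthogonal through the penalty.

```python
import torch

def orthogonality_error(Q):
    """Hypothetical penalty: Frobenius distance of Q^T Q from the identity."""
    eye = torch.eye(Q.shape[1], dtype=Q.dtype)
    return torch.linalg.matrix_norm(Q.T @ Q - eye)

def bounded_sigma(raw, sigma_min=0.1, sigma_max=1.0):
    """Squash unconstrained values into (sigma_min, sigma_max) with a sigmoid."""
    return sigma_max - (sigma_max - sigma_min) * torch.sigmoid(raw)

torch.manual_seed(0)
n = 3
U = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64)).Q   # exactly orthogonal factor
V = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64)).Q
sigma = bounded_sigma(torch.randn(n, dtype=torch.float64))

A = U @ torch.diag(sigma) @ V
singular_values = torch.linalg.svdvals(A)
```

When the factors are orthogonal, the singular values of A are exactly the bounded values in sigma. The further the factors drift from orthogonality, the weaker that guarantee becomes, which is what the penalty discourages.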

class neuromancer.slim.linear.SVDLinearLearnBounds(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: SVDLinear

class neuromancer.slim.linear.SchurDecompositionLinear(insize, outsize, bias=False, l2=0.01, **kwargs)[source]

Bases: SquareLinear

build_T(T)[source]
effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

reg_error()[source]

Regularization error associated with linear map parametrization.

Returns:

(torch.float)

class neuromancer.slim.linear.SkewSymmetricLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: SquareLinear

Skew-symmetric (or antisymmetric) matrix \(A\) (effective_W) is a square matrix whose transpose equals its negative. \(A = -A^T\)

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply
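A skew-symmetric matrix can be built from unconstrained parameters by keeping the strict upper triangle and subtracting its transpose. The sketch below shows that construction, which is assumed rather than taken from the class, and verifies the defining property \(A = -A^T\).

```python
import torch

torch.manual_seed(0)
M = torch.randn(4, 4, dtype=torch.float64)    # unconstrained parameters
A = torch.triu(M, diagonal=1)                 # keep the strict upper triangle
A = A - A.T                                   # mirror it with the opposite sign

x = torch.randn(4, dtype=torch.float64)
quadratic_form = x @ A @ x                    # vanishes for any skew-symmetric A
```

The diagonal is zero and the quadratic form vanishes for every x, which is why the eigenvalues of such a matrix are purely imaginary.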

class neuromancer.slim.linear.SpectralLinear(insize, outsize, bias=False, n_U_reflectors=None, n_V_reflectors=None, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: LinearBase

SVD parametrized linear map of the form \(U \Sigma V\) via Householder reflection. Singular values can be constrained to a range. Translated from TensorFlow code:

Sigma()[source]
Umultiply(x)[source]
Vmultiply(x)[source]
effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

forward(x)[source]
Parameters:

x – (torch.Tensor, shape=[batchsize, in_features])

Returns:

(torch.Tensor, shape=[batchsize, out_features])

class neuromancer.slim.linear.SplitLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

\(A = B - C\), with \(B \ge 0\) and \(C \ge 0\).

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.SquareLinear(insize, outsize, bias=False, provide_weights=True, **kwargs)[source]

Bases: LinearBase, ABC

Base class for linear map parametrizations that assume a square matrix.

abstract effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.StableSplitLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: LinearBase

\(A = B - C\), with stable \(B\) and stable \(C\).

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.SymmetricLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: SquareLinear

Symmetric matrix \(A\) (effective_W) is a square matrix that is equal to its transpose. \(A = A^T\)

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.SymmetricSVDLinear(insize, outsize, bias=False, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: SVDLinear

\(U = V\)

class neuromancer.slim.linear.SymmetricSpectralLinear(insize, outsize, bias=False, n_reflectors=None, sigma_min=0.1, sigma_max=1.0, **kwargs)[source]

Bases: SpectralLinear

\(U = V\)

class neuromancer.slim.linear.SymplecticLinear(insize, outsize, bias=False, **kwargs)[source]

Bases: LinearBase

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply

class neuromancer.slim.linear.TrivialNullSpaceLinear(insize, outsize, bias=False, rank=None, epsilon=0.1, **kwargs)[source]

Bases: LinearBase

Matrix with trivial null space as defined via eq. 2 in https://arxiv.org/abs/1808.00924

effective_W()[source]

The matrix used in the equivalent matrix multiplication for the parametrization

Returns:

(torch.Tensor, shape=[insize, outsize]) Matrix used in matrix multiply