util
Python and numpy functions.
-
class util.ExponentialRunningMean(alpha=1.0)
Calculates the running mean of row vectors batchwise, given a sequence of matrices.
Parameters: alpha – (float) A higher alpha discounts older observations faster; the smaller the alpha, the further into the past observations are taken into account.
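The role of alpha can be illustrated with a minimal sketch. The class name, the call interface, and the update rule (alpha weights the newest batch mean, 1 - alpha weights the previous running mean) are assumptions made for illustration; they are not taken from the util source.

```python
import numpy as np


class ExponentialRunningMeanSketch:
    """Hypothetical stand-in, not the util implementation."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.mean = None  # no batches seen yet

    def __call__(self, batch):
        # Mean of the row vectors in this batch (one value per column).
        batch_mean = np.mean(np.asarray(batch, dtype=float), axis=0)
        if self.mean is None:
            self.mean = batch_mean
        else:
            # Assumed update rule: alpha weights the newest batch,
            # (1 - alpha) weights everything seen before it.
            self.mean = self.alpha * batch_mean + (1.0 - self.alpha) * self.mean
        return self.mean
```

Under this assumed rule, alpha=1.0 keeps only the most recent batch mean, while values closer to 0 let earlier batches persist longer.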
-
class util.Parser(prog=None, usage=None, description=None, epilog=None, version=None, parents=[], formatter_class=<class 'argparse.HelpFormatter'>, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True)
Hack to make Sphinx documentation of scripts work correctly.
-
class util.RunningMean(axis=0)
Calculates the batchwise running mean from rows, columns, or values of a matrix.
Parameters: axis – The axis to calculate the running mean over. If axis is None, the running mean is taken over the entire array.
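A batchwise running mean over a chosen axis might look like the following sketch. The class name, the call interface, and the sum-and-count bookkeeping are assumptions for illustration only; the actual util.RunningMean may be implemented differently.

```python
import numpy as np


class RunningMeanSketch:
    """Hypothetical stand-in, not the util implementation."""

    def __init__(self, axis=0):
        self.axis = axis
        self.total = 0.0  # running sum along the chosen axis
        self.count = 0    # number of values folded into each sum entry

    def __call__(self, batch):
        batch = np.asarray(batch, dtype=float)
        self.total = self.total + np.sum(batch, axis=self.axis)
        # With axis=None every element counts; otherwise only the
        # length of the reduced axis does.
        self.count += batch.size if self.axis is None else batch.shape[self.axis]
        return self.total / self.count
```

With axis=0 the result has one entry per column, and with axis=None it is a single scalar covering every value seen so far.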
-
util.get_mask(lens, num_tokens)
Masks the output of lm_rnn for jagged sequences so that the gradient update is correct. A sequence length of 0 produces nan for that row of the mask, so avoid zero-length sequences.
Parameters:
- lens – NumPy vector of sequence lengths.
- num_tokens – (int) Number of predicted tokens in a sentence.
Returns: A NumPy array mask of shape MB x num_tokens. Row i holds lens[i] values of 1/lens[i], followed by num_tokens - lens[i] zeros.
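The mask layout described above can be reproduced with NumPy broadcasting. This is a sketch written from the description only; the function name get_mask_sketch is hypothetical and the real util.get_mask may be built differently.

```python
import numpy as np


def get_mask_sketch(lens, num_tokens):
    """Hypothetical re-creation of the documented mask layout."""
    lens = np.asarray(lens, dtype=float)
    # Column index j is active for row i when j < lens[i].
    positions = np.arange(num_tokens)
    active = positions[None, :] < lens[:, None]
    # Each active entry becomes 1/lens[i]; inactive entries stay 0.
    # A length of 0 divides 0 by 0 and yields nan, as the docs warn.
    return active / lens[:, None]
```

Because each row sums to 1, multiplying per-token losses by this mask and summing along the token axis gives the mean loss over the real (unpadded) tokens of each sequence.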
-
util.get_multivariate_loss_names(loss_spec)
For use in conjunction with tf_ops.multivariate_loss. Gives the names of all contributors (columns) of the loss matrix.
Parameters: loss_spec – A list of 3-tuples of the form (input_name, loss_function, dimension), where input_name is the same as a target in datadict, loss_function takes two parameters (a target and a prediction), and dimension is the dimension of the target.
Returns: loss_names, a list of length concatenated_feature_size containing the names of all loss contributors.
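As a rough illustration of how a loss_spec could expand into per-column names, consider the sketch below. It assumes that concatenated_feature_size equals the sum of the dimension entries and that each target contributes one name per dimension; the function name and the "name_index" naming scheme are hypothetical and are not the convention used by util.get_multivariate_loss_names.

```python
def loss_names_sketch(loss_spec):
    """Hypothetical expansion of a loss_spec into column names."""
    names = []
    for input_name, _loss_function, dimension in loss_spec:
        if dimension == 1:
            # A one-dimensional target contributes a single column.
            names.append(input_name)
        else:
            # Assumed scheme: one indexed name per target dimension.
            names.extend(f"{input_name}_{i}" for i in range(dimension))
    return names
```

For example, a spec with a one-dimensional target "user" and a three-dimensional target "pos" would yield four names under these assumptions.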
-
util.make_feature_spec(dataspec)
Makes lists of all the continuous and categorical features to be used as input features of a neural network.
Parameters: dataspec – (dict) From a json specification of the purpose of fields in the csv input file (see docs for formatting).
Returns: (dict) features {'categorical': [categorical_feature_1, …, categorical_feature_j], 'continuous': [continuous_feature_1, …, continuous_feature_k]}
-
util.make_loss_spec(dataspec, mvn)
Makes a list of tuples, one for each target, to be used in training a multiple-output neural network that models a mixed joint distribution of discrete and continuous variables.
Parameters:
- dataspec – (dict) From a json specification of the purpose of fields in the csv input file (see docs for formatting).
- mvn – TensorFlow function for calculating the type of multivariate loss for continuous target vectors. Can be tf_ops.diag_mvn_loss, tf_ops.full_mvn_loss, or tf_ops.eyed_mvn_loss.
Returns: A list of tuples of the form (target_name, loss_function, dimension), where dimension is the dimension of the target vector (for categorical features this is the number of classes; for continuous targets this is the size of the continuous target vector).