Welcome to NeuralNets's documentation!

class neuralnets.neural_nets.NeuralNet(n_neurons, activations='relu', learning_rate=0.005, n_epochs=200, batch_size=64, dropout=1, lambda_reg=0, init_strat=None, solver='sgd', beta_1=0.9, beta_2=0.999, seed=None, check_gradients=False, verbose=False)
A neural network model.
Parameters:
n_neurons (list) – List of H + 2 integers giving the number of neurons in each layer, including the input and output layers. The first value is the number of features of a training example. The last value is either 1 (2 classes, logistic loss) or C > 1 (C > 2 classes, cross entropy), where C is the number of classes. In between, the H values give the number of neurons in each of the H hidden layers.
activations (str or list of str) – The activation functions to use for each of the H hidden layers. Allowed values are 'sigmoid', 'tanh', 'relu' or 'linear' (i.e. no activation). If a str is given, the same activation is used for every hidden layer. If a list of str is given, it must be of size H. Default is 'relu'. Note: the activation function of the last layer is automatically inferred from the last value of n_neurons: if the output layer size is 1 then a sigmoid is used, else it's a softmax.
learning_rate (float) – The learning rate for gradient descent. Default is 0.005.
n_epochs (int) – The number of iterations of the gradient descent procedure, i.e. the number of times the whole training set is gone through. Default is 200.
batch_size (int) – The batch size. If 0, the full training set is used. Default is 64.
dropout (float) – Probability of keeping a neuron of the hidden layers when applying dropout. Default is 1, i.e. no dropout is applied.
lambda_reg (float) – The regularization constant. Default is 0, i.e. no regularization.
init_strat (str) – Initialization strategy for the weights. Can be 'He' for He initialization, recommended for relu layers. Default is None, which reverts to a centered normal distribution scaled by 0.1.
solver (str) – Solver to use: either 'sgd' or 'adam'. Default is 'sgd'.
beta_1 (float) – Exponential decay rate for the first moment estimate (only used if solver is 'adam'). Default is 0.9.
beta_2 (float) – Exponential decay rate for the second moment estimate (only used if solver is 'adam'). Default is 0.999.
seed (int) – A random seed for the RNG used at weight initialization. Default is None, i.e. no seeding is done.
check_gradients (bool) – Whether to check gradients at each iteration, for each parameter. This is done with np.isclose() with default tolerance values. Default is False.
verbose (int) – If not False or 0, the loss is printed every 'verbose' epochs. Default is False.
Note: The NeuralNet estimator is (roughly) compliant with the scikit-learn API, so the input X of fit and predict is [n_entries, n_features], but internally we use X.T because it seems more convenient.
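A minimal end-to-end sketch (the data here is random and purely illustrative):

    import numpy as np
    from neuralnets.neural_nets import NeuralNet

    X = np.random.randn(100, 5)               # 100 entries, 5 features
    y = np.random.randint(0, 2, size=100)     # binary labels

    # 5 input features, two hidden layers of 10 neurons, 1 output neuron
    # (binary classification, so the output activation is a sigmoid).
    model = NeuralNet(n_neurons=[5, 10, 10, 1], activations='relu',
                      learning_rate=0.005, n_epochs=200, seed=0)
    model.fit(X, y)
    y_pred = model.predict(X)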

adam(dW, db)
Adam step.
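A sketch of the update this step performs on one weight matrix, following the standard Adam rule. The moment buffers m and v, the timestep t and the constant eps are internal state and are assumptions here, not part of the public API:

    import numpy as np

    def adam_step(W, dW, m, v, t, lr=0.005, beta_1=0.9, beta_2=0.999, eps=1e-8):
        m = beta_1 * m + (1 - beta_1) * dW       # first moment estimate
        v = beta_2 * v + (1 - beta_2) * dW**2    # second moment estimate
        m_hat = m / (1 - beta_1**t)              # bias correction (t starts at 1)
        v_hat = v / (1 - beta_2**t)
        W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
        return W, m, v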

backward(X, y, cache)
Backward pass. Returns gradients.
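A sketch of the gradients for a single dense layer, assuming the forward pass cached A_prev (the layer input) and that arrays are stored as [n_features, n_entries] internally (see the note above); the function name and signature are illustrative:

    import numpy as np

    def layer_backward(dZ, A_prev, W):
        m = A_prev.shape[1]                       # entries in the batch
        dW = dZ @ A_prev.T / m                    # gradient w.r.t. weights
        db = dZ.sum(axis=1, keepdims=True) / m    # gradient w.r.t. biases
        dA_prev = W.T @ dZ                        # propagated to the previous layer
        return dW, db, dA_prev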

check_gradients(X, y, dW, db)
Do gradient checking for every single parameter. Raises an exception if the computed gradients and the estimated gradients are not close enough.
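Gradient checking classically compares each analytical gradient with a centered finite-difference estimate; a sketch for a single weight entry, where loss(W) is an assumed helper recomputing the loss for a given weight value:

    import numpy as np

    def numerical_grad(loss, W, i, j, eps=1e-7):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        return (loss(W_plus) - loss(W_minus)) / (2 * eps)

    # The analytical dW[i, j] is then compared with the estimate:
    # np.isclose(dW[i, j], numerical_grad(loss, W, i, j))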

fit(X, y)
Fit the model with input X [n_entries, n_features] and output y.
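A sketch of the loop fit presumably runs, written with the methods documented on this page (net stands for the estimator; only the structure is asserted here):

    def fit_sketch(net, X, y):
        for epoch in range(net.n_epochs):
            for X_b, y_b in net.get_batches(X, y):
                _, cache = net.forward(X_b)
                dW, db = net.backward(X_b, y_b, cache)
                net.sgd(dW, db)   # or net.adam(dW, db) when solver='adam'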

forward(X)
Forward pass. Returns the output layer and intermediate values in a cache, which will be used during backprop.
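A sketch of one hidden layer of the forward pass with a relu activation, assuming the internal [n_features, n_entries] layout; the cache contents are an assumption:

    import numpy as np

    def layer_forward(A_prev, W, b):
        Z = W @ A_prev + b        # linear part
        A = np.maximum(0, Z)      # relu activation
        cache = (A_prev, Z)       # kept for the backward pass
        return A, cache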

get_batches(X, y)
Return a list of batches (X_b, y_b) to train on.
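A sketch of the splitting logic, assuming the internal [n_features, n_entries] layout, a 1-D y and a plain random shuffle:

    import numpy as np

    def get_batches_sketch(X, y, batch_size=64):
        n_entries = X.shape[1]
        perm = np.random.permutation(n_entries)
        X, y = X[:, perm], y[perm]
        if batch_size == 0:       # batch_size=0 means full batch
            batch_size = n_entries
        return [(X[:, i:i + batch_size], y[i:i + batch_size])
                for i in range(0, n_entries, batch_size)]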

init_activations(activations)
Initialize activations.
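Presumably this maps the activation names accepted by the constructor to callables, along the lines of:

    import numpy as np

    ACTIVATIONS = {
        'sigmoid': lambda z: 1 / (1 + np.exp(-z)),
        'tanh': np.tanh,
        'relu': lambda z: np.maximum(0, z),
        'linear': lambda z: z,
    }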

init_adam(beta_1, beta_2)
Initialize Adam parameters.
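The Adam state is classically one pair of zeroed moment buffers per parameter; a sketch for a single layer (names illustrative):

    import numpy as np

    def init_adam_sketch(W, b):
        m_W, v_W = np.zeros_like(W), np.zeros_like(W)
        m_b, v_b = np.zeros_like(b), np.zeros_like(b)
        return m_W, v_W, m_b, v_b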

init_params(seed, init_strat)
Initialize weights and biases.
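A sketch of the two documented schemes for one layer with n_in inputs and n_out outputs; the zero biases are an assumption:

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n_in, n_out = 5, 10
    W_default = rng.standard_normal((n_out, n_in)) * 0.1             # init_strat=None
    W_he = rng.standard_normal((n_out, n_in)) * np.sqrt(2 / n_in)    # init_strat='He'
    b = np.zeros((n_out, 1))                                         # assumed zero biases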

predict(X)
Predict the outputs of the entries in X [n_entries, n_features].
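A sketch of how the output layer is presumably turned into class labels, given the output activations described above:

    import numpy as np

    def labels_from_output(out):
        if out.shape[0] == 1:          # sigmoid output, shape [1, n_entries]
            return (out > 0.5).astype(int).ravel()
        return out.argmax(axis=0)      # softmax output, shape [C, n_entries]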

sgd(dW, db)
SGD step.
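A sketch of the update for one layer's parameters, the plain gradient-descent rule:

    def sgd_step(W, b, dW, db, lr=0.005):
        W = W - lr * dW
        b = b - lr * db
        return W, b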