Adam

class seli.opt.Adam(lr: float = 0.0003, beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-08)

Bases: Optimizer

Adaptive Moment Estimation optimizer.

Combines momentum and RMSProp by maintaining exponential moving averages of the first moment (mean) and the second moment (uncentered variance) of the gradients, and applies bias correction to both estimates.

Adam is one of the most widely used optimizers for deep learning.
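The update described above can be sketched for a single scalar parameter as follows. This is a minimal illustration of the standard Adam rule in plain Python, not the code used by seli.opt.Adam. The function name adam_step and the explicit state arguments m, v and t are assumptions made for the example; the default hyperparameters mirror the class signature.

```python
import math


def adam_step(grad, m, v, t, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for a single scalar parameter (hypothetical helper).

    grad: current gradient, m/v: running first and second moment estimates,
    t: 1-based step counter. Returns (update, new_m, new_v), where the
    update is meant to be subtracted from the parameter.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    update = lr * m_hat / (math.sqrt(v_hat) + eps)
    return update, m, v


# Usage: minimise f(x) = x**2, whose gradient is 2 * x.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    update, m, v = adam_step(2.0 * x, m, v, t, lr=0.01)
    x -= update
```

Because of the bias correction, the very first update has a magnitude close to lr regardless of the gradient scale, which is one reason Adam is fairly insensitive to how the loss is scaled.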

Methods Summary

call_model(grads, **_)

Process the gradients of the whole model.

call_param(key, grad, **_)

Process the gradients of a single parameter.

Methods Documentation

call_model(grads, **_)

Process the gradients of the whole model. The absolute loss value and parameter values are also provided to the optimizer.

This function is useful for implementing custom optimizers that work on the whole model at once.

Parameters:
  • model (NodeType) – The model to process.

  • loss (Float[Array, ""]) – The absolute loss value.

  • grads (dict[str, Float[Array, "..."]]) – The gradients of the model parameters.

  • values (dict[str, Float[Array, "..."]]) – The parameter values of the model.

Returns:

grads – The processed gradients of the model parameters.

Return type:

dict[str, Float[Array, "..."]]
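As an illustration of processing that needs the whole model at once, the sketch below rescales all gradients by their global norm. It is a standalone plain-Python example with scalar gradients, not part of seli; the function name clip_by_global_norm and the max_norm parameter are hypothetical, and only the calling convention (a dict of gradients in, a dict of processed gradients out, extra keyword arguments ignored) follows the description above.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, **_):
    """Rescale all gradients so that their joint L2 norm is at most max_norm.

    Hypothetical whole-model hook: it needs every gradient to compute the
    norm, so it cannot be written as a per-parameter function.
    """
    total = math.sqrt(sum(g * g for g in grads.values()))
    # Leave the gradients untouched when they are already small enough.
    scale = min(1.0, max_norm / total) if total > 0.0 else 1.0
    return {key: g * scale for key, g in grads.items()}


# Usage: the joint norm of (3, 4) is 5, so both entries are scaled by 0.2.
clipped = clip_by_global_norm({"w": 3.0, "b": 4.0}, max_norm=1.0, loss=0.7)
```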

call_param(key: str, grad: Float[Array, '*s'], **_) -> Float[Array, '*s']

Process the gradients of a single parameter. This function is useful for implementing custom optimizers that run the same function for every parameter, which is the case for most well-known optimizers.

Parameters:
  • loss (Float[Array, ""]) – The absolute loss value.

  • key (str) – The key of the parameter.

  • grad (Float[Array]) – The gradients of the parameter.

  • param (Float[Array]) – The parameter values.

Returns:

grad – The processed gradients of the parameter.

Return type:

Float[Array]
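The relationship between the two hooks can be sketched with a small stand-in class. The example assumes that the whole-model hook simply applies the per-parameter hook to every entry of the gradient dict; the source does not confirm that seli's Optimizer works this way. SketchOptimizer and ScaledSign are hypothetical names, and scalar gradients are used instead of arrays to keep the example dependency-free.

```python
class SketchOptimizer:
    """Hypothetical stand-in for an optimizer base class (not seli's Optimizer)."""

    def call_model(self, grads, **kwargs):
        # Assumption: whole-model processing defaults to calling the
        # per-parameter hook once per key, forwarding any extra arguments.
        return {
            key: self.call_param(key=key, grad=grad, **kwargs)
            for key, grad in grads.items()
        }

    def call_param(self, key, grad, **_):
        # Default per-parameter behaviour: pass the gradient through unchanged.
        return grad


class ScaledSign(SketchOptimizer):
    """Custom rule that keeps only the sign of each gradient, scaled by lr."""

    def __init__(self, lr=0.1):
        self.lr = lr

    def call_param(self, key, grad, **_):
        # key is unused here; a stateful rule could use it to look up
        # per-parameter state such as moment estimates.
        sign = (grad > 0) - (grad < 0)
        return self.lr * sign


# Usage: extra keyword arguments such as loss are accepted and ignored.
opt = ScaledSign(lr=0.1)
processed = opt.call_model({"w": 2.5, "b": -0.3}, loss=1.2)
```

Overriding only call_param keeps the custom rule short, since the iteration over parameters lives in one place.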