Optimizers
Optimizers implement the update rule used to modify the optimized parameters at each step of the optimization process. Dr.TVAM uses the Optimizer sub-classes from Mitsuba 3, so any optimizer available in Mitsuba 3 can be used in Dr.TVAM. We additionally provide two variants of the L-BFGS optimizer, described below.
Optimizers in Mitsuba are instantiated as regular objects:
opt = mi.ad.Adam(lr=1e-2)
They maintain a dictionary-like interface to set and access optimization parameters. Marking a parameter as optimizable is done like so:
opt['x'] = mi.TensorXf(1.0)
From that point on, opt['x'] is treated as a differentiable quantity and receives a gradient after backpropagation. Please see the corresponding tutorial in the Mitsuba 3 documentation for more details.
Mitsuba 3 provides the following optimizers:
Additionally, we provide two variants of the L-BFGS optimizer:
L-BFGS (LBFGS)
This optimizer implements the classic limited-memory version of the BFGS algorithm. It is a quasi-Newton method that approximates the (inverse) Hessian of the loss function from a short history of past gradients and parameter updates.
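To make the role of the stored history concrete, the sketch below shows the standard two-loop recursion that L-BFGS implementations commonly use to turn the current gradient into a search direction. It is written with NumPy for illustration only and is not Dr.TVAM's code: the function name `lbfgs_direction` and the `s_hist`/`y_hist` lists (parameter and gradient differences from past iterations) are hypothetical.

```python
import numpy as np

def lbfgs_direction(grad, s_hist, y_hist):
    """Return the search direction -H @ grad using the two-loop recursion.

    s_hist[i] holds a past parameter difference x_{i+1} - x_i and
    y_hist[i] the matching gradient difference g_{i+1} - g_i (oldest first).
    H is the implicit inverse-Hessian approximation; it is never formed.
    """
    q = np.array(grad, dtype=float)  # working copy of the gradient
    coeffs = []

    # First loop: walk the history from newest to oldest pair.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / np.dot(y, s)
        a = rho * np.dot(s, q)
        q -= a * y
        coeffs.append((a, rho))

    # Scale by an initial inverse-Hessian guess taken from the newest pair.
    if s_hist:
        s, y = s_hist[-1], y_hist[-1]
        q *= np.dot(s, y) / np.dot(y, y)

    # Second loop: walk the history from oldest to newest pair.
    for (s, y), (a, rho) in zip(zip(s_hist, y_hist), reversed(coeffs)):
        b = rho * np.dot(y, q)
        q += (a - b) * s

    return -q
```

With an empty history the function returns plain steepest descent (`-grad`). Keeping only the most recent pairs, for instance in a `collections.deque` with a fixed `maxlen`, is what bounds the memory cost of the method.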
The following additional parameters should be provided:
| Key | Type | Description |
|---|---|---|
|  |  | The number of past gradients to store. Typical values are between 5 and 10. A higher value requires more memory, since more gradients are stored, which can be prohibitive for large problems. Defaults to 5. |
|  | function | The L-BFGS update determines the step size with a backtracking line search, which evaluates the loss for varying step sizes until one satisfies the acceptance condition (the Armijo rule or the Wolfe conditions, depending on the setting below). The line search function should take a dictionary of updated parameters as input and return the loss value for those parameters. |
|  |  | Whether to use the Wolfe conditions for the line search or only the simpler Armijo rule. Defaults to False (i.e. Armijo rule). |
|  |  | The maximum number of iterations for the line search. Defaults to 20. |
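As an illustration of what a backtracking line search with the Armijo rule does, here is a minimal NumPy sketch. It assumes a loss callback that takes a dictionary of parameters and returns a scalar, as described in the table above. The function name, its argument names, and the constants `c1` and `shrink` are assumptions made for this example, not Dr.TVAM's actual API or defaults.

```python
import numpy as np

def backtracking_line_search(loss_fn, params, direction, grads,
                             step=1.0, c1=1e-4, shrink=0.5, max_iter=20):
    """Shrink `step` until the Armijo (sufficient decrease) condition holds.

    loss_fn:   maps a dict of parameters to a scalar loss
    params:    dict name -> np.ndarray, current parameters
    direction: dict name -> np.ndarray, search direction
    grads:     dict name -> np.ndarray, gradient of the loss at `params`
    """
    loss0 = loss_fn(params)
    # Directional derivative g^T d; negative for a descent direction.
    slope = sum(float(np.vdot(grads[k], direction[k])) for k in params)

    for _ in range(max_iter):
        trial = {k: params[k] + step * direction[k] for k in params}
        loss = loss_fn(trial)
        if loss <= loss0 + c1 * step * slope:
            return step, trial, loss  # sufficient decrease reached
        step *= shrink                # otherwise try a smaller step

    # No acceptable step within max_iter: keep the current parameters.
    return 0.0, params, loss0
```

Note that every trial step calls `loss_fn` on the full set of updated parameters. The Wolfe conditions add a curvature check on the gradient at the trial point, which this sketch leaves out.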
Linear L-BFGS (Linear LBFGS)
The main use case of Dr.TVAM is to optimize patterns for printing. In that case, the forward model is linear with respect to the patterns, which enables a useful performance optimization. The line search in L-BFGS requires evaluating the forward model at each trial step, which can be expensive. If the operation is linear, we only need to compute it once for the search direction \(d\). Denoting the forward model by \(A\), the current patterns by \(x\), and the step size by \(\alpha\) (symbols chosen here for illustration):

\[A(x + \alpha d) = A x + \alpha A d\]
Then, only the loss function \(\mathcal{L}\) needs to be evaluated at each line search step, which is much faster than evaluating the full forward model.
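The saving can be checked numerically with a small NumPy sketch. Everything in it is a stand-in chosen for illustration: the random matrix `A` plays the role of the linear forward model, `target` the desired dose, and `loss_from_dose` a simple squared-error loss. None of these names come from Dr.TVAM.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))   # stand-in for the linear forward model
target = rng.random(6)   # stand-in for the target dose

def forward(x):
    """The expensive operation in a real setting; linear in x."""
    return A @ x

def loss_from_dose(dose):
    """Cheap loss evaluated directly on the recorded dose."""
    return float(np.sum((dose - target) ** 2))

x = rng.random(4)                        # current patterns
d = -2.0 * A.T @ (forward(x) - target)   # example search direction

# Two forward evaluations in total, however many step sizes are tried.
dose_x = forward(x)
dose_d = forward(d)

def loss_at_step(alpha):
    # By linearity, forward(x + alpha * d) == dose_x + alpha * dose_d.
    return loss_from_dose(dose_x + alpha * dose_d)

for alpha in (1.0, 0.5, 0.1):
    reference = loss_from_dose(forward(x + alpha * d))
    print(alpha, np.isclose(loss_at_step(alpha), reference))
```

Each trial step then costs one vector addition plus a loss evaluation, instead of a full forward pass.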
This optimizer requires the following additional parameters:
| Key | Type | Description |
|---|---|---|
|  |  | The number of past gradients to store. Typical values are between 5 and 10. A higher value requires more memory, since more gradients are stored, which can be prohibitive for large problems. Defaults to 5. |
|  | function | The line search function from L-BFGS is now split in two parts. The |
|  | function | A function that takes the recorded dose in the medium as its argument and returns the loss value. This is the function that is evaluated in the backtracking line search. |
|  |  | The maximum number of iterations for the line search. Defaults to 20. |