.. _optimizer:

Optimizers
==========

Optimizers implement the update rule used to modify the optimized parameters at
each step of the optimization process. Dr.TVAM uses the ``Optimizer`` subclasses
from Mitsuba 3, so any optimizer available in Mitsuba 3 can be used in Dr.TVAM.
We additionally provide two variants of the L-BFGS optimizer, detailed below.

Optimizers in Mitsuba are instantiated as regular objects:

.. code-block:: python

    opt = mi.ad.Adam(lr=1e-2)

They expose a dictionary-like interface to set and access the optimized
parameters. A parameter is marked as optimizable by assigning it to the
optimizer:

.. code-block:: python

    opt['x'] = mi.TensorXf(1.0)

From that point on, ``opt['x']`` is treated as a differentiable quantity and
receives a gradient after backpropagation. Please see the corresponding tutorial
in the Mitsuba 3 documentation for more details.

Mitsuba 3 provides the following optimizers:

* ``SGD``
* ``Adam``

Dr.TVAM additionally provides two variants of the L-BFGS optimizer:

L-BFGS (``LBFGS``)
------------------

This optimizer implements the classic limited-memory version of the BFGS
algorithm. It is a quasi-Newton method that approximates the Hessian matrix of
the loss function using past gradients. The following additional parameters
should be provided:

.. list-table::
   :widths: 10 10 80
   :header-rows: 1

   * - Key
     - Type
     - Description
   * - ``m``
     - ``int``
     - The number of past gradients to store. Typical values are between 5 and
       10. A higher value requires more memory, since more gradients are stored,
       which can be prohibitive for large problems. Defaults to 5.
   * - ``line_search_fn``
     - function
     - The L-BFGS update determines the step size by performing a backtracking
       line search, which evaluates the loss for varying step sizes until one
       satisfying the acceptance criterion is found (the Armijo rule or the
       Wolfe conditions, depending on ``wolfe``). The line search function
       should take a dictionary of updated parameters as input and return the
       loss value for those parameters.
   * - ``wolfe``
     - ``bool``
     - Whether to use the Wolfe conditions for the line search, or only the
       simpler Armijo rule. Defaults to ``False`` (i.e. Armijo rule).
   * - ``search_it``
     - ``int``
     - The maximum number of iterations for the line search. Defaults to 20.

Linear L-BFGS (``Linear LBFGS``)
--------------------------------

The main use case of Dr.TVAM is to optimize patterns for printing. In that case,
the forward model is linear with respect to the patterns, which enables a
useful performance optimization. The line search in L-BFGS normally requires
evaluating the forward model at each trial step, which can be expensive. If the
forward model :math:`f` is linear, it only needs to be evaluated once for the
search direction :math:`d`, since for any step size :math:`\alpha`:

.. math::

    \mathcal{L}(f(x + \alpha d)) = \mathcal{L}(f(x) + \alpha f(d))

Then, only the loss function :math:`\mathcal{L}` needs to be evaluated at each
line search step, which is much faster than evaluating the full forward model.
This optimizer requires the following additional parameters:

.. list-table::
   :widths: 10 10 80
   :header-rows: 1

   * - Key
     - Type
     - Description
   * - ``m``
     - ``int``
     - The number of past gradients to store. Typical values are between 5 and
       10. A higher value requires more memory, since more gradients are stored,
       which can be prohibitive for large problems. Defaults to 5.
   * - ``render_fn``
     - function
     - The line search function from L-BFGS is now split into two parts. The
       ``render_fn`` evaluates the forward model, given a dictionary of
       parameters. It should return the recorded dose in the medium, i.e. the
       output of the rendering operation.
   * - ``loss_fn``
     - function
     - A function that takes the recorded dose in the medium as its argument
       and returns the loss value. This is the function that is evaluated in
       the backtracking line search.
   * - ``search_it``
     - ``int``
     - The maximum number of iterations for the line search. Defaults to 20.
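
The linearity argument can be illustrated with a small NumPy sketch. It is not
Dr.TVAM's implementation: the forward model is replaced by a hypothetical matrix
``A``, the functions take plain arrays rather than parameter dictionaries, and
the helper ``linear_backtracking`` and its arguments (``c1``, ``shrink``) are
names chosen for this example only. It shows that an Armijo backtracking search
needs only two forward evaluations, regardless of how many step sizes are tried.

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 4))   # hypothetical linear forward model
    target = rng.normal(size=6)   # hypothetical target dose

    render_calls = 0

    def render_fn(x):
        """Stand-in for the expensive forward model (linear in x)."""
        global render_calls
        render_calls += 1
        return A @ x

    def loss_fn(dose):
        """Cheap loss evaluated directly on the rendered dose."""
        return 0.5 * np.sum((dose - target) ** 2)

    def linear_backtracking(x, d, grad, alpha=1.0, c1=1e-4, shrink=0.5,
                            search_it=20):
        """Armijo backtracking that renders x and d only once."""
        fx, fd = render_fn(x), render_fn(d)   # the only forward evaluations
        loss0 = loss_fn(fx)
        slope = float(grad @ d)               # directional derivative along d
        for _ in range(search_it):
            # By linearity, f(x + alpha * d) == fx + alpha * fd.
            if loss_fn(fx + alpha * fd) <= loss0 + c1 * alpha * slope:
                break
            alpha *= shrink
        return alpha

    x = np.zeros(4)
    grad = A.T @ (A @ x - target)   # gradient of the quadratic loss at x
    d = -grad                       # steepest descent direction for the demo
    alpha = linear_backtracking(x, d, grad)

In this sketch, ``render_calls`` stays at 2 after the search, while ``loss_fn``
is called once per trial step size. A general line search would instead call the
forward model once per trial.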