esfit

Least-squares fitting of EPR and other data.

Syntax

esfit(data,fcn,p0,vary)
esfit(data,fcn,p0,lb,ub)
esfit(___,FitOpt)
fit = esfit(___)

See also the user guide on least-squares fitting.

Description

esfit fits EPR data simulated with any EasySpin function, e.g. garlic, pepper, chili, etc. to experimental spectral data using least-squares fitting algorithms and calculates uncertainties (error bars) for the fitted parameters. Additionally, it can also be used for fitting of any other type of data, given a user-defined model function. Global fitting of multiple data sets is supported for user-defined simulation and model functions (see user guide for details).

The input parameters are the following:

data is a 1D array containing the experimental spectral data or a cell array containing multiple data sets (in the latter case, a custom simulation function returning a cell array of calculated data needs to be defined for fitting).
fcn is the function handle of a simulation or model function: @garlic, @pepper, @chili, @saffron, @salt, or a user-supplied custom model function that takes an input parameter vector p and returns simulated data, datasim = fcn(p).

For global fitting of multiple data sets, a custom model function is required, which takes input parameters and returns a cell array of simulated data sets with dimensions matching the corresponding experimental data sets. Custom model functions based on EasySpin functions can be used with a spin system and experimental structure as input, which must be adapted internally for a series of different simulations that are then returned in a cell array (see the user guide for an example).
p0 defines the starting values for the fitting parameters.

For EasySpin functions and custom functions based on them, provide the spin system structure containing all the parameters of the spin system, the experimental parameter structure and, if desired, a simulation options structure Opt.
```
esfit(data,@pepper,{Sys0,Exp0},{varySys})
esfit(data,@pepper,{Sys0,Exp0,Opt},{varySys})
esfit(data,@pepper,{Sys0,Exp0,Opt},{varySys},FitOpt)
```
For other functions, p0 is an n-element vector.
```
userfcn = @(p) modelfunction(p);
esfit(data,userfcn,p0,vary)
```
For parameters that are to be varied during the fitting process, the values in p0 represent the starting values of the fit and the centers of the search ranges, if the fitting parameter variation is defined in vary.

For multi-component systems, Sys0 is a cell array of spin system structures, e.g. {Sys1,Sys2} for a two-component system. The weights for the different components are included in each component structure, e.g. Sys1.weight = 0.7 and Sys2.weight = 0.3. The weights are relative (0.7 and 0.3 is the same as 14 and 6) and can be included in the fit. If weight is omitted, it is assumed to be 1. There is no limit on the number of components.
```
esfit(data,@pepper,{{Sys1,Sys2},Exp0},{{varySys1,varySys2}})
```
Exp0 contains the experimental parameters needed for the simulation. For EPR spectral simulations, make sure to specify the field or frequency range and the number of points you used for the experimental data in Exp.Range or Exp.mwRange. If you let the simulation function automatically pick a range, it will likely be different from the experimental one, and esfit cannot do its work. For example, if x is the experimental field range, use Exp.Range = [min(x) max(x)] and Exp.nPoints = numel(x).

Opt contains simulation options that are forwarded directly to the simulation function. Refer to the documentation of chili, garlic, pepper, and salt for details.
vary contains the maximum variations for the parameters allowed during fitting.

For EasySpin-style functions, this should mirror the p0 input style, e.g. {SysVary} or {SysVary,ExpVary} if experimental parameters should also be varied in the fitting. For EasySpin functions, if a spin system or experimental parameter should be included in the fitting, give it a non-zero value in this structure. If it should not be included, set its value in this structure to zero, or don't include it at all.
```
Sys0.lwpp = 5;
SysVary.lwpp = 2;    % the line width is searched over a range of about 3 to 7
SysVary.lwpp = 0;    % the line width is kept constant at the value given in Sys0
```
For multiple components, SysVary should be a cell array containing one structure per component, e.g. {SysVary1,SysVary2} for two components. If none of the parameters of a component are varied in the fit, [] can be specified, e.g. {SysVary1,[]} for a two-component fit where only parameters of the first component are varied.
lb and ub contain the lower and upper bounds for variation of the parameters allowed in the fitting and can be used instead of vary. For EasySpin-style functions, this should mirror the p0 input style, e.g. {lbSys,lbExp} and {ubSys,ubExp}.
FitOpt is a structure containing settings for the optimization algorithms used in esfit. The possible settings are described further down on this page.

If no output is requested, esfit starts an interactive user interface that allows extensive control over the fitting process and also allows the export of the fitting results to the workspace. In the UI, the fitting algorithm can be changed, parameters can be excluded from the fit, and multiple fit sets can be stored.

If esfit is called with one output, it runs without a user interface, and returns a structure fit containing the fit results. A similar structure is also returned if esfit is used with the graphical user interface and a set of parameters is exported.

fit and fitraw

the simulated spectrum scaled to fit the experimental spectrum, and the raw simulated spectrum as returned by the simulation function

mask

logical array used as a mask during the fitting procedure (defined in the script or chosen interactively through the graphical user interface)

residuals

the residuals vector

baseline, baselinetype and scale

the baseline, baseline type and scaling factor used to fit the simulated spectrum to the experimental data

pfit and pnames

array of fitted parameter values and corresponding variable names

p_start, p_fixed and pfit_full

start values for the fitting parameters, logical array specifying fixed parameters and the full array of fitted parameter values, including not actively fitted parameters (if esfit is used from the graphical user interface)

argsfit

EasySpin-style structures containing the fitted parameters (if using EasySpin-style functions)

pstd

the standard deviation for all of the parameters

ci95

the 95% confidence intervals for all of the parameters

cov

the covariance matrix

corr

the correlation matrix (1 on the diagonal, correlation coefficients between -1 and +1 everywhere else)

ssr, rmsd and redchisquare

the sum of squared residuals and the root-mean-square deviation between the data and the fitted model, and the reduced chi square value determined considering a noise estimate

bestfithistory

structure containing a list of progressively improved rmsd values obtained during the fitting procedure and the corresponding list of fitting parameters. For EasySpin model functions, a conversion function returning the EasySpin input structures given a selected set of fitting parameters is also included.

The structure FitOpt contains fitting options. The most important ones select the fitting algorithm and the function to use for computing the residuals:

FitOpt.Method

Indicates the least-squares fitting method to be used, consisting of up to two keywords. One keyword specifies the algorithm, and another one selects the way residuals are computed during the fitting. Some examples:

FitOpt.Method = 'simplex fcn';
FitOpt.Method = 'genetic int';

The keywords for choosing one of the available algorithms are

'simplex': Nelder-Mead simplex algorithm (unconstrained)
'levmar': Levenberg-Marquardt (unconstrained)
'montecarlo': Monte Carlo random search (constrained)
'genetic': Genetic algorithm (constrained)
'grid': Systematic grid search (constrained)
'swarm': Particle swarm algorithm (constrained)
'lsqnonlin': lsqnonlin function from the Matlab Optimization Toolbox

If you don't specify anything, the default 'simplex' is used.

The second keyword specifies what the residuals and consequently the root-mean-square deviation (rmsd) should be computed from during the fitting procedure. Usually the residuals are computed by taking the difference of the experimental and the simulated spectrum. However, the spectra can be transformed before computing the residuals.

'fcn' means take the experimental and simulated spectral data directly.
'int': integrate the two spectra.
'dint': use the double integrals of the two spectra.
'diff': compute the derivatives of the two spectra.
'fft': FFT both spectra before computing the residuals.

The default value is 'fcn'. If there are many resolved lines, 'int' gives better convergence. In all cases, the residuals returned in the output refer to the experimental data as provided.

FitOpt.AutoScale

Possible options are 'none', 'lsq' and 'maxabs'.
If set to 'none', the simulated data is not rescaled. If set to 'lsq', the simulated data are rescaled before comparing to the experimental data, and the scale factor is reported in fit.scale. If set to 'maxabs', the simulated data is scaled to the maximum absolute of the experimental data.
By default, it is set to 'lsq' if using one of the EasySpin spectral simulation functions, otherwise it is set to 'none'. For global fitting of multiple data sets, if selected, FitOpt.AutoScale scales each data set individually.

FitOpt.BaseLine

Polynomial order for the baseline added to the simulated data to match the experimental data (0, 1, 2 or 3), set this to [] if no baseline should be included. For global fitting of multiple data sets, different polynomial order baselines can be used for the different data sets by specifying an array. The baseline used is returned in fit.baseline.

FitOpt.Mask

Array of 1 and 0 with the same size as the data vector, used to exclude parts of the experimental data from the fitting. Values with mask 0 are excluded from the fit. For global fitting of multiple data sets, FitOpt.Mask should be provided as a cell array. In the graphical user interface, the mask can be selected manually and is returned in the output structure.

FitOpt.CalculateUncertainties

1 (default) or 0. Determines whether esfit performs an uncertainty quantification for a converged fit.

FitOpt.weight

Array of weights to use when combining residual vectors of all data sets for global fitting of multiple data sets. Note that these weights are applied to the residuals of different data sets without prior normalization or extrapolation, therefore, depending on the data sets provided as input, an intrinsic weight determined by different amplitudes and number of points for each data set might also contribute.

There are more options which are recognized by all fitting algorithms:

FitOpt.x

Provides an x axis for plotting the data. If none is provided, esfit uses the data point indices (1:n where n is the number of data points) internally and does not display x axis labels.

Example: Assume the data is an EPR spectrum with a field axis of 1000 points between 350 and 370 mT:

Opt.x = linspace(350,370,1000);

Opt.x does not affect the fitting itself, just the plotting of the data.

FitOpt.maxTime

Time, in minutes, after which esfit will terminate even if the fitting has not yet converged.

FitOpt.Verbosity

A number, 0 or 1. If set to 0, the fitting functions don't write anything to the command window. If set to 1 (default), they print information about the progress of the fitting.

FitOpt.OutArg

This is an array with two numbers [nArgOut iArg] that tells esfit how to use the output arguments of the simulation function, which may only be required for advanced usage, e.g. when using a custom fitting function. nArgOut is the number of outputs provided by the function. esfit will call the function asking for this number of outputs. The second number, iArg, tells esfit which of the output arguments actually contains the simulated data.

E.g. Opt.OutArg = [3 2] indicates to call the function asking for three outputs and then to use output number 2 for the fitting.

For standard EasySpin simulation functions, OutArg is chosen automatically. For custom simulation functions that deviate in their output behavior, OutArg might be useful.

Each of the available fitting algorithms can be fine-tuned using a set of parameters.

Parameters for Nelder-Mead downhill simplex:

FitOpt.TolFun

Termination tolerance for error function value change (default = 1e-5).

FitOpt.delta

Edge length of the initial simplex. The default value is 0.1.

FitOpt.TolEdgeLength

Termination tolerance for the length of the parameter step (default is 1e-4).

FitOpt.SimplexPars

An array of four elements [rho chi psi sigma], where rho is the reflection coefficient, chi is the expansion coefficient, psi is the contraction coefficient, and sigma is the shrink coefficient. The default values are [1,2,0.5,0.5] for one- and two-dimensional problems (n=1 and 2), and [1,1+2/n,0.75-1/(2*n),1-1/n for higher dimensions (n>2).

FitOpt.ScaleParams

Rescale fitting parameters to within [-1,1] interval. Default is false.

FitOpt.RandomStart

If set to 1, the starting point in the parameter space is chosen randomly, within the given limits set by Vary. If set to 0, which is the default value, the center of the parameter space is used as the starting point.

Parameters for Levenberg-Marquardt:

FitOpt.Gradient

Termination tolerance for the gradient (default is 1e-5).

FitOpt.TolStep

Termination tolerance for the length of the parameter step (default is 1e-6).

FitOpt.lambda

Starting value of Marquardt parameter λ, default value is 1e-3.

FitOpt.delta

Step size for computing the finite-difference approximation of the Jacobian. Default is 1e-3.

FitOpt.ScaleParams

Rescale fitting parameters to within [-1,1] interval. Default is false. In some cases, better fits are obtained when setting this option to true.

FitOpt.RandomStart

Parameters for Monte Carlo:

FitOpt.nTrials

Number of random trial simulations before termination (default = 20000).

FitOpt.TolFun

Termination tolerance for error function value (default = 1e-5).

Parameters for the genetic algorithm:

FitOpt.PopulationSize

A number giving the size of the population, that is the number of parameter sets and simulations in one generation. The default value is 20, but for fittings with many parameters, this value should be increased.

FitOpt.EliteCount

A number specifying the number of populations (ordered in terms of decreasing score) carried over to the next generation without recombination and mutation. The default value is 2 or 10% of the population size, whichever is larger.

FitOpt.maxGenerations

A number specifying the maximum number of generations the algorithm should run. After this number has been reached, the algorithm terminates, no matter how good or bad the best fit so far is. The default value is 10000. A large value is appropriate for fittings with many parameters, if only very few parameters are fitted, maxGenerations can be decreased.

FitOpt.TolFun

Termination tolerance for error function value (default = 1e-5).

Parameters for the grid search:

FitOpt.GridSize

A number or an array that specifies how many grid points there should be for each parameter (default = 7). If one number is given, it is valid for all parameters. For example, let's assume that one g value and the line width are being fitted:

Vary.g = [0 0.001 0]; Vary.lw = 0.2;

Then GridSize can contain 1 or 2 numbers:

Opt.GridSize = 10;   % 10 points for each parameter, making 100 grid points total
Opt.GridSize = [20 3]; % 10 points along g and 3 along lw, giving a total of 60

FitOpt.randGrid

Randomize order of grid points (default = true).

FitOpt.TolFun

Termination tolerance for error function value (default = 1e-5).

Parameters for the particle swarm:

FitOpt.nParticles

Number of particles in the particle swarm. The default number corresponds to 20 + n*10, where n is the number of parameters.

FitOpt.SwarmParams

A vector of four parameters for the algorithm, [k w c1 c2]. k is the velocity limit and determines how far a particle can move in an iteration (default 0.2). w is the inertial coefficient (default 0.5) and describes the propensity of a particle to move in the same direction as the previous iteration. c1 is the cognitive coefficient (default 2) and determines to what degree a particle moves towards the currently optimal point. c2 is the social coefficient (default 1) and describes how much a particle follows the other particles. Together, k, c1 and c2 determine in which direction a particle will move in a given iteration.

FitOpt.TolStallIter

Maximum number of consecutive iterations over which the best function value doesn't change (default = 6). If the best function value stays unchanged for more than this number, the algorithm terminates.

FitOpt.TolFun

Termination tolerance for error function value (default = 1e-5).

Examples

Here is a very simple example. Assume the experimental data have been loaded into data. Then the following code performs a least-squares fitting using the Nelder-Mead downhill simplex algorithm.

Exp.mwFreq = 9.5; % GHz
Sys0.g = [2.1 2.2]
Sys0.lwpp = 0.1; % mT
SysVary.g = [0.05 0.02];
esfit(data,@pepper,{Sys0,Exp},{SysVary});

Global fitting of multiple data sets can be performed similarly, but requires definition of a custom function that returns the same number of data sets as the provided ones. Assuming data to be a cell array containing data sets recorded at X- and Q-band frequencies, the following code allows global fitting of both with the same spin system structure.

Exp.mwFreqX = 9.5; % GHz
Exp.mwFreqQ = 34; % GHz
Sys0.g = [2.1 2.2]
Sys0.lwpp = 0.1; % mT
SysVary.g = [0.05 0.02];

esfit(expdata,@pepperXQ,{Sys0,Exp},{SysVary});

function data = pepperXQ(Sys,Exp)
Exp.mwFreq = Exp.mwFreqX;
data{1} = pepper(Sys,Exp);
Exp.mwFreq = Exp.mwFreqQ;
data{2} = pepper(Sys,Exp);
end

See the example section for full examples. Also, read the user guide about fitting.