Least-squares fitting of EPR and other data.
esfit(data,fcn,p0,vary) esfit(data,fcn,p0,lb,ub) esfit(___,FitOpt) fit = esfit(___)
See also the user guide on least-squares fitting.
esfit
fits EPR data simulated with any EasySpin function, e.g.
garlic,
pepper,
chili, etc. to experimental spectral data
using least-squares fitting algorithms and calculates uncertainties (error bars) for the fitted parameters. Additionally, it can also be used for fitting of any other type of data, given a user-defined model function. Global fitting of multiple data sets is supported for user-defined simulation and model functions (see user guide for details).
The input parameters are the following:
data
is a 1D array containing the experimental spectral data or a cell array containing multiple data sets (in the latter case, a custom simulation function returning a cell array of calculated data needs to be defined for fitting).
fcn
is the function handle of a simulation or model function: @garlic
, @pepper
,
@chili
, @saffron
, @salt
, or a user-supplied custom model function that takes an input parameter vector p
and returns simulated data, datasim = fcn(p)
.
For global fitting of multiple data sets, a custom model function is required, which takes input parameters and returns a cell array of simulated data sets with dimensions matching the corresponding experimental data sets. Custom model functions based on EasySpin functions can be used with a spin system and experimental structure as input, which must be adapted internally for a series of different simulations that are then returned in a cell array (see the user guide for an example).
p0
defines the starting values for the fitting parameters.
For EasySpin functions and custom functions based on them, provide the spin system structure containing all the parameters of the spin system, the experimental parameter structure and, if desired, a simulation options structure Opt
.
esfit(data,@pepper,{Sys0,Exp0},{varySys}) esfit(data,@pepper,{Sys0,Exp0,Opt},{varySys}) esfit(data,@pepper,{Sys0,Exp0,Opt},{varySys},FitOpt)
For other functions, p0
is an n-element vector.
userfcn = @(p) modelfunction(p); esfit(data,userfcn,p0,vary)
For parameters that are to be varied during the fitting process, the values in p0
represent the starting values of the fit and the centers of the search ranges, if the fitting parameter variation is defined in vary
.
For multi-component systems, Sys0
is a cell array of spin system structures, e.g. {Sys1,Sys2}
for a two-component system. The weights for the different components are included in each component structure, e.g. Sys1.weight = 0.7
and Sys2.weight = 0.3
. The weights are relative (0.7 and 0.3 is the same as 14 and 6) and can be included in the fit. If weight is omitted, it is assumed to be 1. There is no limit on the number of components.
esfit(data,@pepper,{{Sys1,Sys2},Exp0},{{varySys1,varySys2}})
Exp0
contains the experimental parameters needed for the simulation. For EPR spectral simulations, make sure to specify the field or frequency range and the number of points you used for the experimental data in Exp.Range
or Exp.mwRange
. If you let the simulation function automatically pick a range, it will likely be different from the experimental one, and esfit
cannot do its work. For example, if x
is the experimental field range, use Exp.Range = [min(x) max(x)]
and Exp.nPoints = numel(x)
.
Opt
contains simulation options that are forwarded directly to the simulation function. Refer to the documentation of
chili,
garlic,
pepper, and
salt for details.
vary
contains the maximum variations for the parameters allowed during fitting.
For EasySpin-style functions, this should mirror the p0
input style, e.g. {SysVary}
or {SysVary,ExpVary}
if experimental parameters should also be varied in the fitting.
For EasySpin functions, if a spin system or experimental parameter should be included in the fitting, give it a non-zero value in this structure. If it should not be included, set its value in this structure to zero, or don't include it at all.
Sys0.lwpp = 5; SysVary.lwpp = 2; % the line width is searched over a range of about 3 to 7 SysVary.lwpp = 0; % the line width is kept constant at the value given in Sys0
For multiple components, SysVary
should be a cell array containing one structure per component, e.g. {SysVary1,SysVary2}
for two components. If none of the parameters of a component are varied in the fit, []
can be specified, e.g. {SysVary1,[]}
for a two-component fit where only parameters of the first component are varied.
lb
and ub
contain the lower and upper bounds for variation of the parameters allowed in the fitting and can be used instead of vary
. For EasySpin-style functions, this should mirror the p0
input style, e.g. {lbSys,lbExp}
and {ubSys,ubExp}
.
FitOpt
is a structure containing settings for the optimization algorithms used in esfit
. The possible settings are described further down on this page.
If no output is requested, esfit
starts an interactive user interface that allows extensive control over the fitting process and also allows the export of the fitting results to the workspace. In the UI, the fitting algorithm can be changed, parameters can be excluded from the fit, and multiple fit sets can be stored.
If esfit
is called with one output, it runs without a user interface, and returns a structure fit
containing the fit results.
A similar structure is also returned if esfit
is used with the graphical user interface and a set of parameters is exported.
fit
and fitraw
mask
residuals
baseline
, baselinetype
and scale
pfit
and pnames
p_start
, p_fixed
and pfit_full
esfit
is used from the graphical user interface)
argsfit
pstd
ci95
cov
corr
ssr
, rmsd
and redchisquare
bestfithistory
The structure FitOpt
contains fitting options. The most important ones select the fitting algorithm and the function to use for computing the residuals:
FitOpt.Method
Indicates the least-squares fitting method to be used, consisting of up to two keywords. One keyword specifies the algorithm, and another one selects the way residuals are computed during the fitting. Some examples:
FitOpt.Method = 'simplex fcn'; FitOpt.Method = 'genetic int';
The keywords for choosing one of the available algorithms are
'simplex'
: Nelder-Mead simplex algorithm (unconstrained)
'levmar'
: Levenberg-Marquardt (unconstrained)
'montecarlo'
: Monte Carlo random search (constrained)
'genetic'
: Genetic algorithm (constrained)
'grid'
: Systematic grid search (constrained)
'swarm'
: Particle swarm algorithm (constrained)
'lsqnonlin'
: lsqnonlin
function from the Matlab Optimization Toolbox
'simplex'
is used.
The second keyword specifies what the residuals and consequently the root-mean-square deviation (rmsd) should be computed from during the fitting procedure. Usually the residuals are computed by taking the difference of the experimental and the simulated spectrum. However, the spectra can be transformed before computing the residuals.
'fcn'
means take the experimental and simulated spectral data directly.'int'
: integrate the two spectra.'dint'
: use the double integrals of the two spectra.'diff'
: compute the derivatives of the two spectra.'fft'
: FFT both spectra before computing the residuals.
The default value is 'fcn'
. If there are many resolved lines, 'int'
gives better convergence.
In all cases, the residuals returned in the output refer to the experimental data as provided.
FitOpt.AutoScale
Possible options are 'none'
, 'lsq'
and 'maxabs'
.
If set to 'none'
, the simulated data is not rescaled.
If set to 'lsq'
, the simulated data are rescaled before comparing to the experimental data, and the scale factor is reported in fit.scale
.
If set to 'maxabs'
, the simulated data is scaled to the maximum absolute of the experimental data.
By default, it is set to 'lsq'
if using one of the EasySpin spectral simulation functions, otherwise it is set to 'none'
. For global fitting of multiple data sets, if selected, FitOpt.AutoScale
scales each data set individually.
FitOpt.BaseLine
Polynomial order for the baseline added to the simulated data to match the experimental data (0, 1, 2 or 3), set this to []
if no baseline should be included. For global fitting of multiple data sets, different polynomial order baselines can be used for the different data sets by specifying an array. The baseline used is returned in fit.baseline
.
FitOpt.Mask
Array of 1 and 0 with the same size as the data vector, used to exclude parts of the experimental data from the fitting. Values with mask 0 are excluded from the fit. For global fitting of multiple data sets, FitOpt.Mask
should be provided as a cell array.
In the graphical user interface, the mask can be selected manually and is returned in the output structure.
FitOpt.CalculateUncertainties
1
(default) or 0
. Determines whether esfit
performs an uncertainty quantification for a converged fit.
FitOpt.weight
Array of weights to use when combining residual vectors of all data sets for global fitting of multiple data sets. Note that these weights are applied to the residuals of different data sets without prior normalization or extrapolation, therefore, depending on the data sets provided as input, an intrinsic weight determined by different amplitudes and number of points for each data set might also contribute.
There are more options which are recognized by all fitting algorithms:
FitOpt.x
Provides an x axis for plotting the data. If none is provided, esfit
uses the data point indices (1:n
where n
is the number of data points) internally and does not display x axis labels.
Opt.x = linspace(350,370,1000);
Opt.x
does not affect the fitting itself, just the plotting of the data.
FitOpt.maxTime
Time, in minutes, after which esfit
will terminate even if the fitting has not yet converged.
FitOpt.Verbosity
A number, 0 or 1. If set to 0, the fitting functions don't write anything to the command window. If set to 1 (default), they print information about the progress of the fitting.
FitOpt.OutArg
This is an array with two numbers [nArgOut iArg]
that tells esfit
how to use the output arguments of the simulation function, which may only be required for advanced usage, e.g. when using a custom fitting function. nArgOut
is the number of outputs provided by the function. esfit
will call the function asking for this number of outputs. The second number, iArg
, tells esfit
which of the output arguments actually contains the simulated data.
E.g. Opt.OutArg = [3 2]
indicates to call the function asking for three outputs and then to use output number 2 for the fitting.
For standard EasySpin simulation functions, OutArg
is chosen automatically. For custom simulation functions that deviate in their output behavior, OutArg
might be useful.
Each of the available fitting algorithms can be fine-tuned using a set of parameters.
Parameters for Nelder-Mead downhill simplex:
FitOpt.TolFun
FitOpt.delta
FitOpt.TolEdgeLength
FitOpt.SimplexPars
[rho chi psi sigma]
, where rho
is the reflection coefficient, chi
is the expansion coefficient, psi
is the contraction coefficient, and sigma
is the shrink coefficient. The default values are [1,2,0.5,0.5]
for one- and two-dimensional problems (n=1 and 2), and [1,1+2/n,0.75-1/(2*n),1-1/n
for higher dimensions (n>2).
FitOpt.ScaleParams
FitOpt.RandomStart
Vary
. If set to 0, which is the default value, the center of the parameter space is used as the starting point.
Parameters for Levenberg-Marquardt:
FitOpt.Gradient
FitOpt.TolStep
FitOpt.lambda
FitOpt.delta
FitOpt.ScaleParams
FitOpt.RandomStart
Vary
. If set to 0, which is the default value, the center of the parameter space is used as the starting point.
Parameters for Monte Carlo:
FitOpt.nTrials
FitOpt.TolFun
Parameters for the genetic algorithm:
FitOpt.PopulationSize
FitOpt.EliteCount
FitOpt.maxGenerations
maxGenerations
can be decreased.
FitOpt.TolFun
Parameters for the grid search:
FitOpt.GridSize
Vary.g = [0 0.001 0]; Vary.lw = 0.2;Then
GridSize
can contain 1 or 2 numbers:
Opt.GridSize = 10; % 10 points for each parameter, making 100 grid points total Opt.GridSize = [20 3]; % 10 points along g and 3 along lw, giving a total of 60
FitOpt.randGrid
FitOpt.TolFun
Parameters for the particle swarm:
FitOpt.nParticles
20 + n*10
, where n
is the number of parameters.
FitOpt.SwarmParams
[k w c1 c2]
. k
is the velocity limit
and determines how far a particle can move in an iteration (default 0.2). w
is the inertial
coefficient (default 0.5) and describes the propensity of a particle to move in the same direction as the previous
iteration. c1
is the cognitive coefficient (default 2) and determines to what
degree a particle moves towards the currently optimal point. c2
is the social coefficient (default 1)
and describes how much a particle follows the other particles. Together, k
, c1
and c2
determine in which direction a particle will move in a given iteration.
FitOpt.TolStallIter
FitOpt.TolFun
Here is a very simple example. Assume the experimental data have been loaded into data
. Then the following code performs a least-squares fitting using the Nelder-Mead downhill simplex algorithm.
Exp.mwFreq = 9.5; % GHz Sys0.g = [2.1 2.2] Sys0.lwpp = 0.1; % mT SysVary.g = [0.05 0.02]; esfit(data,@pepper,{Sys0,Exp},{SysVary});
Global fitting of multiple data sets can be performed similarly, but requires definition of a custom function that returns the same number of data sets as the provided ones. Assuming data
to be a cell array containing data sets recorded at X- and Q-band frequencies, the following code allows global fitting of both with the same spin system structure.
Exp.mwFreqX = 9.5; % GHz Exp.mwFreqQ = 34; % GHz Sys0.g = [2.1 2.2] Sys0.lwpp = 0.1; % mT SysVary.g = [0.05 0.02]; esfit(expdata,@pepperXQ,{Sys0,Exp},{SysVary}); function data = pepperXQ(Sys,Exp) Exp.mwFreq = Exp.mwFreqX; data{1} = pepper(Sys,Exp); Exp.mwFreq = Exp.mwFreqQ; data{2} = pepper(Sys,Exp); end
See the example section for full examples. Also, read the user guide about fitting.