Combined Fits
A quick guide on how to combined several fits into a single one using the Minuit2
package.
You can also download this example as a Jupyter notebook and a plain Julia source file.
Table of contents
Load the Minuit2
module. We will also use the Distributions
, FHist
and Plots
modules to define cost functions and display results.
using Minuit2
using Distributions # Distributions
using Plots # Plotting
using FHist # Histogramming
using QuadGK # Numerical integration
Define the model
We define a model with two signal distributions and a background distribution. The model is defined as:
$ \text{pdf} = f{\text{bkg}} \times \text{bkg}(x,a0,a1) + (1-f{\text{bkg}}) \times (f{\text{sig1}} \times \text{sig1}(x,m,s1) + (1-f{\text{sig1}}) \times \text{sig2}(x,m,s2)) $
where:
a0
anda1
are the parameters of the background distribution,μ
,σ1
andσ2
are the parameters of the signal distributions,f_sig1
andf_bkg
are the fractions of the signal and background distributions.
We define the background and signal functions as follows:
const a, b = (0., 10.) # range of the x-axis
sig1(x, μ, σ1, f_bkg, f_sig1) = (1-f_bkg) * f_sig1 * pdf(truncated(Normal(μ,σ1),a,b),x)
sig2(x, μ, σ2, f_bkg, f_sig1) = (1-f_bkg) * (1-f_sig1) * pdf(truncated(Normal(μ,σ2),a,b),x)
bkg(x, a0, a1, f_bkg) = f_bkg * pdf(ChebyshevDist([1., a0, a1], a, b), x)
model(x, μ, σ1, σ2, a0, a1, f_sig1, f_bkg) = bkg(x, a0, a1, f_bkg) + sig1(x, μ, σ1,f_bkg,f_sig1) + sig2(x, μ, σ2, f_bkg, f_sig1)
model (generic function with 1 method)
Verify that the functions are normalized
@assert quadgk(x -> sig1(x, 5., 1., 0.0, 1.), a, b)[1] ≈ 1.
@assert quadgk(x -> sig2(x, 5., 1., 0.0, 0.0), a, b)[1] ≈ 1.
@assert quadgk(x -> bkg(x, 0.2, 0.5, 1.), a, b)[1] ≈ 1.
Lets generate some data
f_sig1 = 0.8
f_bkg = 0.5
μ = 5.
σ1 = 0.5
σ2 = 1.
a0 = 0.5
a1 = 0.2
const N = 1000
xsig1 = rand(truncated(Normal(μ,σ1),a,b), Int(round(N*(1-f_bkg)*f_sig1)))
xsig2 = rand(truncated(Normal(μ,σ2),a,b), Int(round(N*(1-f_bkg)*(1-f_sig1))))
xbkg = rand(ChebyshevDist([1., a0, a1], a, b), Int(round(N*f_bkg)))
data = vcat(xsig1, xsig2, xbkg);
Plot the data and the model
h = Hist1D(data, nbins=50)
plot(bincenters(h), bincounts(h), yerr=sqrt.(bincounts(h)), seriestype=:scatter, label="data")
plot!(x -> bkg(x, a0, a1, f_bkg)*N*(b-a)/50, a, b, label="bkg")
plot!(x -> (bkg(x, a0, a1, f_bkg)+sig2(x, μ, σ2, f_bkg, f_sig1))*N*(b-a)/50, a, b, label="sig2+bkg")
plot!(x -> (bkg(x, a0, a1, f_bkg)+sig2(x, μ, σ2, f_bkg, f_sig1)+sig1(x, μ, σ1, f_bkg, f_sig1))*N*(b-a)/50, a, b, label="sig1+sig2+bkg")
Define the unbinned NLL cost function for the model
cost1 = UnbinnedNLL(data, model) # Define the cost function
UnbinnedNLL cost function of "model" with parameters ["μ", "σ1", "σ2", "a0", "a1", "f_sig1", "f_bkg"]
Define the minimizer and perform the fit
m = Minuit(cost1, μ=5., σ1=0.5, σ2=1., a0=0.5, a1=0.2, f_sig1=0.8, f_bkg=0.5) # Define the minimizer
m.limits["σ1", "σ2"] = (0., Inf) # Set limits for the sigma
m.limits["f_sig1", "f_bkg"] = (0., 1.) # Set limits for the fractions
m = migrad!(m) # Perform the fit
┌──────────────┬──────────────┬───────────┬────────────┬──────────────┐
│ FCN │ Method │ Ncalls │ Iterations │ Elapsed time │
│ 3898.63 │ migrad │ 427 │ 22 │ 435.55 ms │
├──────────────┼──────────────┼───────────┼────────────┼──────────────┤
│ Valid Min. │ Valid Param. │ Above EDM │ Call limit │ Edm │
│ true │ true │ false │ false │ 4.58833e-5 │
├──────────────┼──────────────┼───────────┼────────────┼──────────────┤
│ Hesse failed │ Has cov. │ Accurate │ Pos. def. │ Forced │
│ false │ true │ true │ true │ false │
└──────────────┴──────────────┴───────────┴────────────┴──────────────┘
┌───┬────────┬──────────┬─────────────┬────────┬────────┬────────┬────────┬───────┬───────┐
│ │ Name │ Value │ Hesse Error │ Minos- │ Minos+ │ Limit- │ Limit+ │ Fixed │ Const │
├───┼────────┼──────────┼─────────────┼────────┼────────┼────────┼────────┼───────┼───────┤
│ 1 │ μ │ 4.9857 │ 0.0224586 │ │ │ │ │ │ │
│ 2 │ σ1 │ 0.349897 │ 0.113657 │ │ │ 0.0 │ │ │ │
│ 3 │ σ2 │ 0.597401 │ 0.0665994 │ │ │ 0.0 │ │ │ │
│ 4 │ a0 │ 0.525414 │ 0.0502174 │ │ │ │ │ │ │
│ 5 │ a1 │ 0.150649 │ 0.0695147 │ │ │ │ │ │ │
│ 6 │ f_sig1 │ 0.186072 │ 0.220011 │ │ │ 0.0 │ 1.0 │ │ │
│ 7 │ f_bkg │ 0.513445 │ 0.0196454 │ │ │ 0.0 │ 1.0 │ │ │
└───┴────────┴──────────┴─────────────┴────────┴────────┴────────┴────────┴───────┴───────┘
┌────────┬────────────┬────────────┬────────────┬────────────┬───────────┬────────────┬────────────┐
│ │ μ │ σ1 │ σ2 │ a0 │ a1 │ f_sig1 │ f_bkg │
├────────┼────────────┼────────────┼────────────┼────────────┼───────────┼────────────┼────────────┤
│ μ │ 1.0 │ -0.012335 │ -0.0201831 │ -0.0925558 │ 0.0186092 │ -0.0586901 │ -0.0614375 │
│ σ1 │ -0.012335 │ 1.0 │ 0.549535 │ 0.01125 │ 0.0462253 │ 0.812449 │ -0.0629312 │
│ σ2 │ -0.0201831 │ 0.549535 │ 1.0 │ 0.0479728 │ 0.353361 │ 0.856504 │ -0.429376 │
│ a0 │ -0.0925558 │ 0.01125 │ 0.0479728 │ 1.0 │ 0.0991529 │ 0.0278196 │ -0.0611686 │
│ a1 │ 0.0186092 │ 0.0462253 │ 0.353361 │ 0.0991529 │ 1.0 │ 0.138161 │ -0.605262 │
│ f_sig1 │ -0.0586901 │ 0.812449 │ 0.856504 │ 0.0278196 │ 0.138161 │ 1.0 │ -0.177867 │
│ f_bkg │ -0.0614375 │ -0.0629312 │ -0.429376 │ -0.0611686 │ -0.605262 │ -0.177867 │ 1.0 │
└────────┴────────────┴────────────┴────────────┴────────────┴───────────┴────────────┴────────────┘
Visualize the fit
visualize(m)
plot!(x -> model(x, μ, σ1, σ2, a0, a1, f_sig1, f_bkg)* N * (b-a)/50, a, b, label="truth")
Combination of two signal distributions and a background distribution
For each sample we define an ExtendedUnbinnedNLL
cost function since we would like to also fit the number of events for each of the sample, which is equivalent to the fractions for ech sample.
Please note that the ExtendedUnbinnedNLL
cost function is used to fit the number of events for each sample. the model returns two values: the number of events and the value of the pdf at the given point.
sig1_(x, μ, σ1, f_bkg, f_sig1) = N * (1-f_bkg) * f_sig1, N * (1-f_bkg) * f_sig1 * pdf(truncated(Normal(μ,σ1),a,b),x)
sig2_(x, μ, σ2, f_bkg, f_sig1) = N * (1-f_bkg) * (1-f_sig1), N *(1-f_bkg) * (1-f_sig1) * pdf(truncated(Normal(μ,σ2),a,b),x)
bkg_(x, a0, a1, f_bkg) = N * f_bkg, N * f_bkg * pdf(ChebyshevDist([1., a0, a1], a, b), x)
csig1 = ExtendedUnbinnedNLL(xsig1, sig1_)
csig2 = ExtendedUnbinnedNLL(xsig2, sig2_)
cbkg = ExtendedUnbinnedNLL(xbkg, bkg_)
ExtendedUnbinnedNLL cost function of "bkg_" with parameters ["a0", "a1", "f_bkg"]
Combining the fit is simply done by summing the cost functions
cost2 = csig1 + csig2 + cbkg
CostSum cost function of "unknown" with parameters ["μ", "σ1", "f_bkg", "f_sig1", "σ2", "a0", "a1"]
Define the minimizer and perform the fit
m = Minuit(cost2, μ=5., σ1=0.5, σ2=1., a0=0.5, a1=0.2, f_sig1=0.8, f_bkg=0.5)
m.limits["σ1", "σ2"] = (0., Inf)
m.limits["f_sig1", "f_bkg"] = (0., 1.)
m = migrad!(m)
┌──────────────┬──────────────┬───────────┬────────────┬──────────────┐
│ FCN │ Method │ Ncalls │ Iterations │ Elapsed time │
│ -6855.826 │ migrad │ 134 │ 5 │ 514.438 ms │
├──────────────┼──────────────┼───────────┼────────────┼──────────────┤
│ Valid Min. │ Valid Param. │ Above EDM │ Call limit │ Edm │
│ true │ true │ false │ false │ 2.46174e-7 │
├──────────────┼──────────────┼───────────┼────────────┼──────────────┤
│ Hesse failed │ Has cov. │ Accurate │ Pos. def. │ Forced │
│ false │ true │ true │ true │ false │
└──────────────┴──────────────┴───────────┴────────────┴──────────────┘
┌───┬────────┬──────────┬─────────────┬────────┬────────┬────────┬────────┬───────┬───────┐
│ │ Name │ Value │ Hesse Error │ Minos- │ Minos+ │ Limit- │ Limit+ │ Fixed │ Const │
├───┼────────┼──────────┼─────────────┼────────┼────────┼────────┼────────┼───────┼───────┤
│ 1 │ μ │ 4.95893 │ 0.0235943 │ │ │ │ │ │ │
│ 2 │ σ1 │ 0.487523 │ 0.0172362 │ │ │ 0.0 │ │ │ │
│ 3 │ f_bkg │ 0.5 │ 0.0158088 │ │ │ 0.0 │ 1.0 │ │ │
│ 4 │ f_sig1 │ 0.8 │ 0.0178826 │ │ │ 0.0 │ 1.0 │ │ │
│ 5 │ σ2 │ 0.936168 │ 0.0661937 │ │ │ 0.0 │ │ │ │
│ 6 │ a0 │ 0.545693 │ 0.0700288 │ │ │ │ │ │ │
│ 7 │ a1 │ 0.192593 │ 0.0654498 │ │ │ │ │ │ │
└───┴────────┴──────────┴─────────────┴────────┴────────┴────────┴────────┴───────┴───────┘
┌────────┬──────────────┬──────────────┬─────────────┬──────────────┬──────────────┬──────────────┬──────────────┐
│ │ μ │ σ1 │ f_bkg │ f_sig1 │ σ2 │ a0 │ a1 │
├────────┼──────────────┼──────────────┼─────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│ μ │ 1.0 │ 0.0100973 │ 8.37978e-10 │ -5.61867e-10 │ -0.0173501 │ 2.28937e-11 │ 5.56603e-10 │
│ σ1 │ 0.0100973 │ 1.0 │ 8.42306e-10 │ -5.6157e-10 │ -0.000175189 │ -5.56667e-10 │ -2.28977e-11 │
│ f_bkg │ 8.37978e-10 │ 8.42306e-10 │ 1.0 │ 2.78168e-10 │ 2.63391e-10 │ 2.9293e-10 │ 2.92756e-10 │
│ f_sig1 │ -5.61867e-10 │ -5.6157e-10 │ 2.78168e-10 │ 1.0 │ 9.74848e-12 │ 3.7823e-19 │ -2.15447e-19 │
│ σ2 │ -0.0173501 │ -0.000175189 │ 2.63391e-10 │ 9.74848e-12 │ 1.0 │ -3.9721e-13 │ -9.65713e-12 │
│ a0 │ 2.28937e-11 │ -5.56667e-10 │ 2.9293e-10 │ 3.7823e-19 │ -3.9721e-13 │ 1.0 │ 0.0512084 │
│ a1 │ 5.56603e-10 │ -2.28977e-11 │ 2.92756e-10 │ -2.15447e-19 │ -9.65713e-12 │ 0.0512084 │ 1.0 │
└────────┴──────────────┴──────────────┴─────────────┴──────────────┴──────────────┴──────────────┴──────────────┘
Visualize the fit. In this case we visualize the fit for each sample separately.
visualize(m)
This page was generated using Literate.jl.