The first logical record in the SNOCR files doesn't have an EV bank, which
was causing the output file to have an empty list element. This commit fixes
the issue by checking for an empty EV bank before printing the list
delimiter.
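A minimal sketch of that kind of guard (the record type and the output format
below are illustrative only, not the actual code):

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical record type for illustration only. */
    struct record {
        int has_ev_bank; /* nonzero if the logical record contains an EV bank */
    };

    /* Write a YAML list of events, skipping records with no EV bank (such
     * as the first logical record in the SNOCR files) so that no empty
     * element or stray delimiter ends up in the output. */
    static void print_event_list(FILE *out, const struct record *recs, size_t n)
    {
        size_t i;
        int first = 1;

        fprintf(out, "[");
        for (i = 0; i < n; i++) {
            if (!recs[i].has_ev_bank)
                continue;            /* nothing to print for this record */
            if (!first)
                fprintf(out, ", ");  /* delimiter only between real events */
            fprintf(out, "ev%zu", i);
            first = 0;
        }
        fprintf(out, "]\n");
    }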
|
The range and energy loss tables have different maximum values for electrons,
muons, and protons, so we have to dynamically set the maximum energy of the
fit in order to avoid a GSL interpolation error.
This commit adds {electron,muon,proton}_get_max_energy() functions to return
the maximum energy in the tables, and that value is then used to set the
upper bound on the energy in the fit.
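A rough sketch of how the cap might be applied (only the *_get_max_energy()
names come from this commit; the particle-ID constants and the rest of the
interface are assumptions for illustration):

    /* Declarations of the functions added in this commit: they return the
     * last tabulated energy for each particle type. */
    double electron_get_max_energy(void);
    double muon_get_max_energy(void);
    double proton_get_max_energy(void);

    /* Hypothetical particle-ID constants for this sketch. */
    enum { IDP_E_MINUS, IDP_MU_MINUS, IDP_PROTON };

    /* Largest energy we can safely hand to the GSL interpolators for a
     * given particle type, used as the upper bound on the energy
     * parameter of the fit. */
    static double get_fit_max_energy(int id)
    {
        switch (id) {
        case IDP_E_MINUS:  return electron_get_max_energy();
        case IDP_MU_MINUS: return muon_get_max_energy();
        default:           return proton_get_max_energy();
        }
    }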
|
Also write out the data cleaning word to the YAML file.
|
Also increase the maximum kinetic energy to 10^4 GeV, which is approximately
the maximum expected energy for cosmic muons at SNO.
|
This commit adds a data cleaning cut to tag incoming muons by looking for early
OWL hits. It also significantly updates the flasher cut to catch more flashers.
In particular, the flasher cut now does the following:
- loops over *all* paddle cards with at least 4 hits instead of just the paddle
cards with the most hits
- uses QLX to look for charge outliers in the paddle card
- fixes a few bugs (for example, uninitialized values in the charge array)
- adds a check to see if the given slot is early with respect to all PMTs
within 4 meters, to catch the case where the flashing channel is missing from
the event (see the sketch after this list)
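A rough sketch of that last check (all of the types and names below are
hypothetical stand-ins; the real cut works on the calibrated hit times and
positions of the PMTs in the slot in question):

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical hit record for illustration only. */
    struct hit {
        double pos[3]; /* PMT position (cm) */
        double t;      /* calibrated hit time (ns) */
    };

    /* Return 1 if the suspect slot (position slot_pos, representative
     * time slot_t) is early by at least dt_cut ns with respect to every
     * hit PMT within 4 meters. This is the signature used to catch a
     * flasher whose flashing channel is itself missing from the event. */
    static int slot_is_early(const double slot_pos[3], double slot_t,
                             const struct hit *hits, size_t n, double dt_cut)
    {
        size_t i;

        for (i = 0; i < n; i++) {
            double dx = hits[i].pos[0] - slot_pos[0];
            double dy = hits[i].pos[1] - slot_pos[1];
            double dz = hits[i].pos[2] - slot_pos[2];

            if (sqrt(dx*dx + dy*dy + dz*dz) > 400.0)
                continue;            /* only consider PMTs within 4 m */
            if (hits[i].t - slot_t < dt_cut)
                return 0;            /* a nearby hit is not late enough */
        }
        return 1;
    }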
|
This commit updates find_peaks() to only return peaks which are at least a
certain number of degrees apart from each other. I found that for many events
the first few peaks all pointed in essentially the same direction, so the fit
was spending a lot of time minimizing from what were effectively identical
seed points. Since I now only try 3 peaks in order to get my grid jobs to run
in less than a few hours, it's necessary to make sure we aren't just fitting
the same three directions during the "quick" minimization.
I also updated the fit to only use a maximum of 3 seed directions.
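A sketch of that kind of separation filter (illustrative interface only; the
real find_peaks() works directly on the Hough-transform bins):

    #include <math.h>
    #include <stddef.h>

    /* Keep a candidate peak only if it points at least min_sep radians
     * away from every peak accepted so far. Directions are unit vectors,
     * so the opening angle is the acos of the dot product. */
    static size_t filter_peaks(const double (*candidates)[3], size_t n,
                               double (*out)[3], size_t max_out,
                               double min_sep)
    {
        size_t i, j, kept = 0;

        for (i = 0; i < n && kept < max_out; i++) {
            int ok = 1;

            for (j = 0; j < kept; j++) {
                double c = candidates[i][0]*out[j][0] +
                           candidates[i][1]*out[j][1] +
                           candidates[i][2]*out[j][2];

                if (acos(fmin(fmax(c, -1.0), 1.0)) < min_sep) {
                    ok = 0; /* too close to an already accepted peak */
                    break;
                }
            }

            if (ok) {
                out[kept][0] = candidates[i][0];
                out[kept][1] = candidates[i][1];
                out[kept][2] = candidates[i][2];
                kept++;
            }
        }

        return kept;
    }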
|
Also, update the step size for the energy during the final minimization to 10%.
|
This commit changes the format specifier for the values in sprintf_yaml_list()
from %.2g -> %.2f because YAML (at least the python parser) doesn't recognize
values like 1e+03 as floats.
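For example, with plain printf:

    #include <stdio.h>

    int main(void)
    {
        double value = 1000.0;

        printf("%.2g\n", value); /* prints "1e+03", which the Python YAML
                                  * parser treats as a string */
        printf("%.2f\n", value); /* prints "1000.00", which it parses as
                                  * a float */
        return 0;
    }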
|
- set number of shower points to 10 for the main fit
- set step size to 10% of the energy
- set max number of evals during quick minimization phase to 1000
|
I probably need to spend some time to optimize this along with the algorithm
for guessing the peaks, but for now I am just lowering this from 10 -> 5
because with 10 the number of quick minimizations for 3 particles is too big
and so the fits take way too long.
|
When plotting the likelihood function I realized that the fast likelihood
calculation was *very* noisy due to the way I calculated the shower and delta
ray charge. Although it works well for single particles, it is not suitable
for distinguishing which seed is the best when doing multi-particle fits.
Eventually I may be able to fix this, but for now we just do the normal
likelihood calculation.
I also decreased the number of shower points from 100 -> 10 to speed things up.
|
This commit adds a new program called zdab-cat which is kind of like fit, but
just produces the YAML output without actually fitting anything.
|
This commit introduces a new method for integrating over the particle track to
calculate the number of shower and delta ray photons expected at each PMT. The
reason for introducing a new method was that the previous method of just using
the trapezoidal rule was both inaccurate and not stable. By inaccurate I mean
that the trapezoidal rule was not producing a very good estimate of the true
integral and by not stable I mean that small changes in the fit parameters
(like theta and phi) could produce wildly different results. This meant that
the likelihood function was very noisy, which prevented the minimizers from
finding the global minimum.
The new integration method works *much* better than the trapezoidal rule for
the specific functions we are dealing with. The problem is essentially to
integrate the product of two functions over some interval, one of which is very
"peaky", i.e. we want to find:
\int f(x) g(x) dx
where f(x) is peaked around some region and g(x) is relatively smooth. For our
case, f(x) represents the angular distribution of the Cerenkov light and g(x)
represents the factors like solid angle, absorption, etc.
The technique I discovered was that you can approximate this integral with a
discrete sum:
\int f(x) g(x) dx ~= constant*\sum_i g(x_i)
where F(x) is the cumulative integral of f(x), the constant is the total
integral of f divided by the number of points, and the sample points are
chosen to be equally spaced in that cumulative integral, i.e.
x_i = F^(-1)(i*constant)
This new method produces likelihood functions which are *much* smoother and
more accurate than before.
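A self-contained toy version of the scheme (the peaked factor here is just an
exponential with an analytic cumulative integral; in the fitter f(x) is the
Cerenkov angular distribution and g(x) the solid angle/absorption factors,
and the sample points here are taken at the midpoints of each slice):

    #include <stdio.h>
    #include <math.h>

    #define TAU  0.1  /* width of the peak in f */
    #define XMAX 1.0  /* upper limit of the integral */

    /* The peaked factor is f(x) = exp(-x/TAU); F(x) is its cumulative
     * integral \int_0^x f(t) dt and Finv is the inverse of F. */
    static double g(double x)    { return 1.0/(1.0 + x*x); } /* smooth factor */
    static double F(double x)    { return TAU*(1.0 - exp(-x/TAU)); }
    static double Finv(double u) { return -TAU*log(1.0 - u/TAU); }

    int main(void)
    {
        int i, n = 10;
        double ftot = F(XMAX);
        double sum = 0.0;

        for (i = 0; i < n; i++) {
            /* midpoint of the i-th equal slice of the cumulative integral */
            double x = Finv((i + 0.5)*ftot/n);

            sum += g(x);
        }

        printf("integral ~= %g\n", sum*ftot/n);
        return 0;
    }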
In addition, there are a few other fixes in this commit:
- switch from specifying a step size for the shower integration to a number of
points, i.e. dx_shower -> number of shower points
- only integrate to the PSUP
I realized that previously we were integrating to the end of the track even
if the particle left the PSUP, and that there was no code to deal with the
fact that light emitted beyond the PSUP can't make it back to the PMTs.
- only integrate to the Cerenkov threshold
When integrating over the particle track to calculate the expected number
of direct Cerenkov photons, we now only integrate the track up to the point
where the particle's velocity drops to 1/n, where n is the index of
refraction (i.e. the Cerenkov threshold). This should hopefully make the
likelihood smoother because previously the estimate depended on exactly
where the sampled track points fell relative to this threshold.
- add a minimum theta0 value based on the angular width of the PMT
When calculating the expected number of Cerenkov photons we assumed that
the angular distribution was constant over the whole PMT. This is a bad
assumption when the particle is very close to the PMT. Really we should
average the function over all the angles of the PMT, but that would be too
computationally expensive so instead we just calculate a minimum theta0
value which depends on the distance and angle to the PMT. This seems to
make the likelihood much smoother for particles near the PSUP.
- add a factor of sin(theta) when checking if we can skip calculating the
charge in get_expected_charge()
- fix a nan in beta_root() when the momentum is negative
- update PSUP_RADIUS from 800 cm -> 840 cm
|
This commit updates the optics code to calculate the Rayleigh scattering
length using the Einstein-Smoluchowski formula instead of using the effective
Rayleigh scattering lengths from the RSPR bank.
|
Based on some initial testing, it seems that the subplex minimization
algorithm performs *much* better than BOBYQA for multi-particle fits. It is
also a bit slower, so I will probably have to figure out how to speed things
up.
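Assuming the minimization is driven through NLopt (which provides both
BOBYQA and a subplex reimplementation), switching between the two is a
one-line change; a toy example with a quadratic objective:

    #include <stdio.h>
    #include <nlopt.h>

    static double objective(unsigned n, const double *x, double *grad,
                            void *data)
    {
        (void) n; (void) grad; (void) data; /* derivative-free: grad is NULL */
        return (x[0] - 1.0)*(x[0] - 1.0) + (x[1] + 2.0)*(x[1] + 2.0);
    }

    int main(void)
    {
        double x[2] = {0.0, 0.0}, minf;
        /* Swap in NLOPT_LN_BOBYQA here to compare the two algorithms. */
        nlopt_opt opt = nlopt_create(NLOPT_LN_SBPLX, 2);

        nlopt_set_min_objective(opt, objective, NULL);
        nlopt_set_xtol_rel(opt, 1e-6);

        if (nlopt_optimize(opt, x, &minf) > 0)
            printf("minimum %g at (%g, %g)\n", minf, x[0], x[1]);

        nlopt_destroy(opt);
        return 0;
    }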
|
To enable the fitter to run outside of the src directory, I created a new
function open_file() which works exactly like fopen() except that it searches
for the file in both the current working directory and the path specified by an
environment variable.
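A sketch of such a helper; the open_file() name is from this commit, but the
environment-variable name below is just a placeholder:

    #include <stdio.h>
    #include <stdlib.h>

    /* Open `filename` like fopen(), but if that fails also look for it in
     * the directory named by an environment variable ("SDDM_DATA_DIR" is a
     * placeholder; the commit doesn't say what the variable is called). */
    FILE *open_file(const char *filename, const char *mode)
    {
        char path[4096];
        const char *dir;
        FILE *f = fopen(filename, mode);

        if (f)
            return f;

        dir = getenv("SDDM_DATA_DIR");
        if (!dir)
            return NULL;

        snprintf(path, sizeof(path), "%s/%s", dir, filename);
        return fopen(path, mode);
    }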
|
In the processed zdab files (the SNOCR_* files), the first logical record just
has a run header bank and no EV bank.
|
This commit updates the zebra library files zebra.{c,h} so that it's now
possible to traverse the data structure using links! This was originally
motivated by wanting to figure out which MC particles were generated from the
MCGN bank (from which it's only possible to access the tracks and vertices
using structural links).
I've also added a new test to test-zebra which checks the consistency of all of
the next/up/orig, structural, and reference links in a zebra file.
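As a purely illustrative picture of what link traversal enables (the struct
layout and function below are hypothetical stand-ins, not the actual
zebra.{c,h} interface):

    #include <stddef.h>

    /* Hypothetical bank type: each bank carries next/up/orig links plus
     * structural links to its daughter banks, so starting from an MCGN
     * bank one can walk down to the track and vertex banks below it. */
    typedef struct bank {
        struct bank *next;  /* next bank at the same level */
        struct bank *up;    /* parent bank */
        struct bank *orig;  /* bank whose link points here */
        struct bank **down; /* structural links to daughter banks */
        size_t ndown;
        unsigned int name;  /* bank name, e.g. "MCGN" packed into an int */
    } bank;

    /* Visit `b`, its siblings, and everything reachable below them
     * through structural links. */
    static void walk_banks(const bank *b, void (*visit)(const bank *))
    {
        size_t i;

        for (; b; b = b->next) {
            visit(b);
            for (i = 0; i < b->ndown; i++)
                if (b->down[i])
                    walk_banks(b->down[i], visit);
        }
    }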
|
Previously, the algorithm used to find peaks was to search for all peaks in the
Hough transform above some constant fraction of the highest peak. This
algorithm could have issues finding smaller peaks away from the highest peak.
The new algorithm instead finds the highest peak in the Hough transform and
then recomputes the Hough transform ignoring all PMT hits within the Cerenkov
cone of the first peak. The next peak is found from this transform and the
process is iteratively repeated until a certain number of peaks are found.
One disadvantage of this new system is that it will *always* find the same
number of peaks, and this will usually be greater than the actual number of
rings in the event. This is not a problem, though, since when fitting the
event we loop over all possible peaks and do a quick fit to determine the
starting point, so false positives are OK: the real peaks will fit better
during this quick fit.
Another potential issue with this new method is that by rejecting all PMT hits
within the Cerenkov cone of the first peak we could miss a second peak very
close to the first peak. This is partially mitigated by the fact that when we
loop over all possible combinations of the particle ids and directions we allow
each peak to be used more than once. For example, when fitting for the
hypothesis that an event is caused by two electrons and one muon and given two
possible directions 1 and 2, we will fit for the following possible direction
combinations:
1 1 1
1 1 2
1 2 1
1 2 2
2 2 1
2 2 2
Therefore if there is a second ring close to the first it is possible to fit it
correctly since we will seed the quick fit with two particles pointing in the
same direction.
This commit also adds a few tests for new functions and changes the energy step
size during the quick fit to 10% of the starting energy value.
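In sketch form, the iterative search described above looks roughly like the
following (every helper and type here is a hypothetical stand-in; the real
code works on the event's PMT hit list and the Hough-transform arrays):

    #include <stdlib.h>

    struct hit; /* PMT hit (position, time, charge); details omitted */

    /* Hypothetical helpers for this sketch only. */
    void compute_hough(const struct hit *hits, const int *masked, size_t n);
    void find_maximum(double dir[3]);
    int  in_cerenkov_cone(const struct hit *hit, const double dir[3]);

    /* Find the largest Hough peak, mask every hit inside its Cerenkov
     * cone, recompute the transform from the remaining hits, and repeat
     * until npeaks directions have been collected. */
    static void find_peaks_iterative(const struct hit *hits, size_t nhits,
                                     double (*peaks)[3], size_t npeaks)
    {
        size_t i, j;
        int *masked = calloc(nhits, sizeof(int));

        for (i = 0; i < npeaks; i++) {
            compute_hough(hits, masked, nhits);
            find_maximum(peaks[i]);

            for (j = 0; j < nhits; j++)
                if (in_cerenkov_cone(&hits[j], peaks[i]))
                    masked[j] = 1;
        }

        free(masked);
    }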
|
This commit updates the fit to use the fit_event2() function, which can fit
multi-vertex hypotheses. It also uses the QUAD fitter and the Hough transform
of the event to seed the fit, so the results for single-particle fits will be
slightly different than before.
I also fixed a small bug in combinations_with_replacement().
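For reference, a generic combinations-with-replacement enumerator looks like
the following; this is just the textbook non-decreasing-index scheme, not
necessarily the repository's implementation or the bug that was fixed:

    #include <stdio.h>

    /* Advance `idx` (a non-decreasing tuple of length r with entries in
     * {0, ..., n-1}) to the next combination with replacement. Returns 0
     * when all combinations have been produced. */
    static int next_combination(int *idx, int r, int n)
    {
        int i, j;

        for (i = r - 1; i >= 0; i--) {
            if (idx[i] < n - 1) {
                idx[i]++;
                for (j = i + 1; j < r; j++)
                    idx[j] = idx[i]; /* keep the tuple non-decreasing */
                return 1;
            }
        }
        return 0; /* exhausted */
    }

    int main(void)
    {
        int idx[3] = {0, 0, 0};

        /* For n = 2 choices and r = 3 slots this prints
         * 0 0 0, 0 0 1, 0 1 1, 1 1 1. */
        do {
            printf("%d %d %d\n", idx[0], idx[1], idx[2]);
        } while (next_combination(idx, 3, 2));

        return 0;
    }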
|
This commit adds a new function fit_event2() to fit multiple vertices. To seed
the fit, fit_event2() does the following:
- use the QUAD fitter to find the position and initial time of the event
- call find_peaks() to find possible directions for the particles
- loop over all possible unique combinations of the particles and direction
vectors and do a "fast" minimization
The best minimum found from the "fast" minimizations is then used to start the fit.
This commit has a few other updates:
- adds a hit_only parameter to the nll() function. This was necessary since
previously PMTs which weren't hit were always skipped for the fast
minimization, but when fitting for multiple vertices we need to include PMTs
which aren't hit since we float the energy.
- add the function guess_energy() to guess the energy of a particle given a
position and direction. This function estimates the energy by summing up the
QHS for all PMTs hit within the Cerenkov cone and dividing by 6 (see the
sketch after this list).
- fixed a bug which caused the fit to freeze when hitting ctrl-c during the
fast minimization phase.
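A sketch of the energy guess mentioned in the list above (hypothetical types
and interface; only the "sum the QHS inside the cone and divide by 6" rule
comes from this commit):

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical hit record for illustration. */
    struct pmt_hit {
        double pos[3]; /* PMT position (cm) */
        double qhs;    /* QHS charge */
    };

    /* Sum the QHS of every hit PMT inside the Cerenkov cone (~42 degrees
     * in water) around the proposed vertex and direction, then divide by
     * 6 to get a rough energy estimate. */
    static double guess_energy(const double pos[3], const double dir[3],
                               const struct pmt_hit *hits, size_t n)
    {
        size_t i;
        double qsum = 0.0;
        double cos_cone = cos(42.0*M_PI/180.0);

        for (i = 0; i < n; i++) {
            double dx = hits[i].pos[0] - pos[0];
            double dy = hits[i].pos[1] - pos[1];
            double dz = hits[i].pos[2] - pos[2];
            double r = sqrt(dx*dx + dy*dy + dz*dz);

            if (r > 0 && (dx*dir[0] + dy*dir[1] + dz*dir[2])/r > cos_cone)
                qsum += hits[i].qhs;
        }

        return qsum/6.0;
    }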
|