path: root/src/fit.c
2019-06-02  update find_peaks() to only return unique peaks (tlatorre)

This commit updates find_peaks() to only return peaks which are at least a certain number of degrees apart from each other. I found that for many events the first few peaks were all essentially in the same direction, so the fit was spending a lot of time fitting essentially the same seed points. Since I now only try 3 peaks in order to get my grid jobs to run in less than a few hours, it's necessary to make sure we aren't just fitting the same three directions during the "quick" minimization (see the sketch below). I also updated the fit to use a maximum of 3 seed directions.
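A minimal sketch of what such a uniqueness cut might look like, assuming candidate peaks arrive as (theta, phi) directions sorted by Hough peak height; the name filter_unique_peaks() and the signature are illustrative, not the actual fit.c code:

    #include <stddef.h>
    #include <math.h>

    /* Keep a candidate peak only if it is at least min_sep degrees away
     * from every peak already accepted. Candidates are assumed to be
     * sorted by Hough peak height, so the strongest peak in any cluster
     * of nearby directions survives. */
    static size_t filter_unique_peaks(const double *theta, const double *phi,
                                      size_t n, double min_sep,
                                      size_t *keep, size_t max_keep)
    {
        size_t i, j, nkeep = 0;
        double cos_sep = cos(min_sep*M_PI/180.0);

        for (i = 0; i < n && nkeep < max_keep; i++) {
            int unique = 1;
            for (j = 0; j < nkeep; j++) {
                size_t k = keep[j];
                /* cosine of the angle between directions i and k */
                double c = sin(theta[i])*sin(theta[k])*cos(phi[i]-phi[k])
                         + cos(theta[i])*cos(theta[k]);
                if (c > cos_sep) { unique = 0; break; }
            }
            if (unique) keep[nkeep++] = i;
        }
        return nkeep;
    }

Calling this with max_keep = 3 would also enforce the 3 seed direction limit mentioned above.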
2019-05-29  set step size on theta and phi to 0.1 (tlatorre)

Also, update the step size for the energy during the final minimization to 10%.
2019-05-24  add a script to submit jobs to the grid (tlatorre)

2019-05-24  update sprintf_yaml_list() (tlatorre)

This commit changes the format specifier for the values in sprintf_yaml_list() from %.2g -> %.2f because YAML (at least the Python parser) doesn't recognize values like 1e+03 as floats.
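As a rough illustration of the issue: %.2g renders 1000.0 as "1e+03", which PyYAML loads back as a string, while %.2f renders it as "1000.00". The reconstruction below is hypothetical; the real sprintf_yaml_list() in fit.c may have a different signature:

    #include <stdio.h>

    /* Illustrative reconstruction: write n doubles as a YAML flow-style
     * list, e.g. [1000.00, 2.50]. The caller must supply a buffer large
     * enough for the result. */
    static int sprintf_yaml_list(char *str, const double *values, size_t n)
    {
        size_t i;
        int len = sprintf(str, "[");

        for (i = 0; i < n; i++)
            len += sprintf(str + len, i ? ", %.2f" : "%.2f", values[i]);

        return len + sprintf(str + len, "]");
    }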
2019-05-24  several small updates to fit.c (tlatorre)

- set number of shower points to 10 for the main fit
- set step size to 10% of the energy
- set max number of evals during quick minimization phase to 1000
2019-05-24  switch to using BOBYQA since it's faster (tlatorre)

2019-05-24  change MAX_NPEAKS to 5 (tlatorre)

I probably need to spend some time to optimize this along with the algorithm for guessing the peaks, but for now I am just lowering this from 10 -> 5. With three particles the number of seed combinations for the quick minimization grows roughly like the cube of the number of peaks, so with 10 peaks the number of quick minimizations is too big and the fits take way too long.
2019-05-24  don't do fast fit during quick minimization phase (tlatorre)

When plotting the likelihood function I realized that the fast likelihood calculation was *very* noisy due to the way I calculated the shower and delta ray charge. Although it works well for single particles, it is not suitable for distinguishing which seed is the best when doing multi-particle fits. Eventually I may be able to fix this, but for now we just do the normal likelihood calculation. I also decreased the number of shower points from 100 -> 10 to speed things up.
2019-05-23  add zdab-cat (tlatorre)

This commit adds a new program called zdab-cat which is kind of like fit, but just produces the YAML output without actually fitting anything.
2019-05-23  make float formatting consistent in sprintf_yaml_list() (tlatorre)

2019-05-14  add --plot-likelihood option to fit (tlatorre)
2019-05-13  update method for calculating expected number of photons from shower and delta rays (tlatorre)

This commit introduces a new method for integrating over the particle track to calculate the number of shower and delta ray photons expected at each PMT. The reason for introducing a new method was that the previous method of just using the trapezoidal rule was both inaccurate and not stable. By inaccurate I mean that the trapezoidal rule was not producing a very good estimate of the true integral, and by not stable I mean that small changes in the fit parameters (like theta and phi) could produce wildly different results. This meant that the likelihood function was very noisy and the minimizers were unable to find the global minimum.

The new integration method works *much* better than the trapezoidal rule for the specific functions we are dealing with. The problem is essentially to integrate the product of two functions over some interval, one of which is very "peaky", i.e. we want to find:

    \int f(x) g(x) dx

where f(x) is peaked around some region and g(x) is relatively smooth. For our case, f(x) represents the angular distribution of the Cerenkov light and g(x) represents the factors like solid angle, absorption, etc. The technique I discovered was that you can approximate this integral via a discrete sum:

    constant * \sum_i g(x_i)

where the x_i are chosen to have equal spacing in the cumulative integral of f(x), i.e.

    x_i = F^(-1)(i*constant)

with F the cumulative integral of f. This new method produces likelihood functions which are *much* smoother and more accurate than before (see the sketch below).

In addition, there are a few other fixes in this commit:

- switch from specifying a step size for the shower integration to a number of points, i.e. dx_shower -> number of shower points

- only integrate to the PSUP. I realized that previously we were integrating to the end of the track even if the particle left the PSUP, and that there was no code to deal with the fact that light emitted beyond the PSUP can't make it back to the PMTs.

- only integrate to the Cerenkov threshold. When integrating over the particle track to calculate the expected number of direct Cerenkov photons, we now only integrate the track up to the point where the particle's velocity drops to 1/index. This should make the likelihood smoother, because previously the estimate depended on exactly whether the points where we sampled the track were above or below this threshold.

- add a minimum theta0 value based on the angular width of the PMT. When calculating the expected number of Cerenkov photons we assumed that the angular distribution was constant over the whole PMT. This is a bad assumption when the particle is very close to the PMT. Really we should average the function over all the angles subtended by the PMT, but that would be too computationally expensive, so instead we just calculate a minimum theta0 value which depends on the distance and angle to the PMT. This seems to make the likelihood much smoother for particles near the PSUP.

- add a factor of sin(theta) when checking if we can skip calculating the charge in get_expected_charge()

- fix a nan in beta_root() when the momentum is negative

- update PSUP_RADIUS from 800 cm -> 840 cm
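A self-contained sketch of this quadrature, assuming the peaked factor f has already been tabulated on a fine grid; integrate_peaked(), the callback g, and the grid layout are all illustrative, not the actual fit.c code:

    #include <stddef.h>

    /* Approximate \int f(x) g(x) dx as (F_total/n) * sum_i g(x_i), where
     * the x_i are the n-quantiles of f, i.e. x_i ~ F^(-1)((i+0.5)*F_total/n).
     * f is tabulated on the grid x[0..m-1] (m >= 2); g is evaluated at the
     * midpoint of the grid bin that brackets each quantile. */
    static double integrate_peaked(const double *x, const double *f, size_t m,
                                   double (*g)(double), size_t n)
    {
        size_t i, j = 0;
        double ftot = 0.0, sum = 0.0, cdf = 0.0;

        /* total integral of f via the trapezoidal rule */
        for (i = 1; i < m; i++)
            ftot += 0.5*(f[i] + f[i-1])*(x[i] - x[i-1]);

        for (i = 0; i < n; i++) {
            double target = (i + 0.5)*ftot/n;
            /* advance to the bin that contains this quantile; targets
             * increase with i, so j never moves backwards */
            while (j < m - 2) {
                double df = 0.5*(f[j+1] + f[j])*(x[j+1] - x[j]);
                if (cdf + df >= target) break;
                cdf += df;
                j++;
            }
            sum += g(0.5*(x[j] + x[j+1]));
        }

        return sum*ftot/n;
    }

The reason this works is the substitution u = F(x): the sum above is just a midpoint rule applied to \int g(F^(-1)(u)) du over [0, F_total], and since g is smooth in u, a handful of points is enough.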
2019-03-31  switch back to using subplex (tlatorre)

2019-03-26  small fix to fit.c (tlatorre)

2019-03-25  update rayleigh scattering calculation (tlatorre)

This commit updates the optics code to calculate the Rayleigh scattering length using the Einstein-Smoluchowski formula instead of using the effective Rayleigh scattering lengths from the RSPR bank.
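For reference, one commonly quoted form of the Einstein-Smoluchowski result is sketched below; whether the optics code uses exactly this form, including the Lorentz-Lorenz approximation for the density derivative of the dielectric constant, is an assumption on my part:

    #include <math.h>

    #define KB 1.380649e-23 /* Boltzmann constant (J/K) */

    /* 1/l = (8 pi^3 / 3 lambda^4) kT beta_T (rho deps/drho)^2
     *       * (6 + 3 delta)/(6 - 7 delta)
     *
     * with rho deps/drho ~ (n^2 - 1)(n^2 + 2)/3 (Lorentz-Lorenz),
     * beta_T the isothermal compressibility (1/Pa), delta the
     * depolarization ratio, and lambda in meters. Returns the
     * scattering length in meters. This particular form is an
     * assumption, not necessarily what fit.c implements. */
    static double rayleigh_length(double lambda, double n, double T,
                                  double beta_T, double delta)
    {
        double deps = (n*n - 1.0)*(n*n + 2.0)/3.0;
        double fdelta = (6.0 + 3.0*delta)/(6.0 - 7.0*delta);
        double inv_l = 8.0*M_PI*M_PI*M_PI/(3.0*pow(lambda, 4))
                       *KB*T*beta_T*deps*deps*fdelta;
        return 1.0/inv_l;
    }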
2019-03-17  set a relative tolerance of 1e-2 on the optimization parameters in the fast fit (tlatorre)

2019-03-16  add GPLv3 license (tlatorre)

2019-03-16  switch to using SBPLX for the minimization (tlatorre)

Based on some initial testing it seems that the subplex minimization algorithm performs *much* better than BOBYQA for multi-particle fits. It is also a bit slower, so I will probably have to figure out how to speed things up.
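BOBYQA and subplex (SBPLX) are both local derivative-free algorithms in NLopt, which the algorithm names suggest is the library in use (an assumption); if so, switching between them is a one-line change. The wrapper name and user data here are placeholders:

    #include <nlopt.h>

    static nlopt_opt make_minimizer(unsigned n, nlopt_func nll_wrapper, void *data)
    {
        nlopt_opt opt = nlopt_create(NLOPT_LN_SBPLX, n); /* was NLOPT_LN_BOBYQA */
        nlopt_set_min_objective(opt, nll_wrapper, data);
        nlopt_set_xtol_rel(opt, 1e-2); /* cf. the 1e-2 tolerance commit above */
        return opt;
    }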
2019-03-07  update fit to automatically load DQXX file based on run number (tlatorre)

2019-03-07  update code to allow you to run the fit outside of the src directory (tlatorre)

To enable the fitter to run outside of the src directory, I created a new function open_file() which works exactly like fopen() except that it searches for the file in both the current working directory and the path specified by an environment variable.
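A minimal sketch of that idea; the actual environment variable name used by the fitter is an assumption here:

    #include <stdio.h>
    #include <stdlib.h>

    /* Like fopen(), but if the file isn't in the current working
     * directory, fall back to the directory named by an environment
     * variable ("FIT_PATH" here is hypothetical). */
    static FILE *open_file(const char *filename, const char *mode)
    {
        char path[4096];
        const char *dir;
        FILE *f = fopen(filename, mode);

        if (f) return f;

        dir = getenv("FIT_PATH");
        if (!dir) return NULL;

        snprintf(path, sizeof(path), "%s/%s", dir, filename);
        return fopen(path, mode);
    }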
2019-03-07  don't fix the position during the fast fit (tlatorre)

2019-03-05  update quad() to not abort if the matrix is singular (tlatorre)
2019-03-04  add a function to tag flasher events (tlatorre)

2019-03-04  update fit to print gtid and nhit even if we skip the event (tlatorre)

2019-03-04  add a --min-nhit command line argument (tlatorre)

2019-03-04  update get_event() to handle events without a pmt bank link (tlatorre)

2019-03-04  skip logical record if there is no EV bank (tlatorre)

In the processed zdab files (the SNOCR_* files), the first logical record just has a run header bank and no EV bank.
2019-03-04  skip reading in mcgn banks if there is no mc bank (tlatorre)

2019-03-04  check that all links are nonzero (tlatorre)

2019-01-15  fix a bug with getting the first MCTK bank (tlatorre)

2019-01-15  update zebra library to be able to use links (tlatorre)

This commit updates the zebra library files zebra.{c,h} so that it's now possible to traverse the data structure using links! This was originally motivated by wanting to figure out which MC particles were generated from the MCGN bank (from which it's only possible to access the tracks and vertices using structural links). I've also added a new test to test-zebra which checks the consistency of all of the next/up/orig, structural, and reference links in a zebra file.
2019-01-10  update find_peaks algorithm (tlatorre)

Previously, the algorithm used to find peaks was to search for all peaks in the Hough transform above some constant fraction of the highest peak. This algorithm could have issues finding smaller peaks away from the highest peak.

The new algorithm instead finds the highest peak in the Hough transform and then recomputes the Hough transform ignoring all PMT hits within the Cerenkov cone of the first peak. The next peak is found from this transform, and the process is repeated until a certain number of peaks are found (see the sketch below).

One disadvantage of this new system is that it will *always* find the same number of peaks, and this will usually be greater than the actual number of rings in the event. This is not a problem though, since when fitting the event we loop over all possible peaks and do a quick fit to determine the starting point, so false positives are OK because the real peaks will fit better during this quick fit.

Another potential issue with this new method is that by rejecting all PMT hits within the Cerenkov cone of the first peak we could miss a second peak very close to the first peak. This is partially mitigated by the fact that when we loop over all possible combinations of the particle ids and directions we allow each peak to be used more than once. For example, when fitting for the hypothesis that an event is caused by two electrons and one muon, given two possible directions 1 and 2 we will fit the following direction combinations:

    1 1 1
    1 1 2
    1 2 1
    1 2 2
    2 2 1
    2 2 2

Therefore, if there is a second ring close to the first, it is possible to fit it correctly since we will seed the quick fit with two particles pointing in the same direction.

This commit also adds a few tests for new functions and changes the energy step size during the quick fit to 10% of the starting energy value.
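In outline, the iteration looks something like the following, where hough_peak() and in_cerenkov_cone() are hypothetical stand-ins for whatever helpers fit.c actually uses:

    #include <stddef.h>
    #include <string.h>

    #define MAX_PMTS 10000 /* illustrative */

    /* Hypothetical helpers: compute the Hough transform over all unmasked
     * hits and return the direction of its maximum, and test whether a
     * hit PMT lies inside the Cerenkov cone of a given direction. */
    void hough_peak(const int *hit, const int *skip, size_t npmts,
                    double *theta, double *phi);
    int in_cerenkov_cone(size_t ipmt, double theta, double phi);

    /* Find npeaks directions by repeatedly taking the strongest Hough
     * peak and masking every hit inside its Cerenkov cone. */
    void find_peaks_sketch(const int *hit, size_t npmts,
                           double *theta, double *phi, size_t npeaks)
    {
        static int skip[MAX_PMTS];
        size_t i, j;

        memset(skip, 0, sizeof(skip));

        for (i = 0; i < npeaks; i++) {
            hough_peak(hit, skip, npmts, &theta[i], &phi[i]);
            for (j = 0; j < npmts; j++)
                if (hit[j] && in_cerenkov_cone(j, theta[i], phi[i]))
                    skip[j] = 1;
        }
    }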
2018-12-14  switch to using fit_event2() by default (tlatorre)

This commit updates the fit to use the fit_event2() function, which can fit for multi-vertex hypotheses. It also uses the QUAD fitter and the Hough transform of the event to seed the fit, so the results for 1 particle fits will be slightly different than before. I also fixed a small bug in combinations_with_replacement().
2018-12-14  fix help string (tlatorre)

2018-12-13  add some more comments and fix a memory leak (tlatorre)

2018-12-13  add some comments (tlatorre)

2018-12-13  update fit.c to fit multiple vertices (tlatorre)

This commit adds a new function fit_event2() to fit multiple vertices. To seed the fit, fit_event2() does the following:

- use the QUAD fitter to find the position and initial time of the event
- call find_peaks() to find possible directions for the particles
- loop over all possible unique combinations of the particles and direction vectors and do a "fast" minimization

The best minimum found from the "fast" minimizations is then used to start the fit.

This commit has a few other updates:

- add a hit_only parameter to the nll() function. This was necessary since previously PMTs which weren't hit were always skipped for the fast minimization, but when fitting for multiple vertices we need to include PMTs which aren't hit since we float the energy.

- add the function guess_energy() to guess the energy of a particle given a position and direction. This function estimates the energy by summing up the QHS for all PMTs hit within the Cerenkov cone and dividing by 6 (see the sketch below).

- fix a bug which caused the fit to freeze when hitting ctrl-c during the fast minimization phase.
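A sketch of the guess_energy() idea, assuming a ~42 degree Cerenkov cone in water (the cone angle, array layout, and the function signature are all illustrative; the divide-by-6 charge-to-energy factor is from the commit message):

    #include <stddef.h>
    #include <math.h>

    /* Sum the QHS charge over all hit PMTs whose direction from the
     * vertex lies within the Cerenkov cone, then divide by 6, an
     * empirical charge-to-MeV factor. */
    static double guess_energy(const double *pos, const double *dir,
                               const double (*pmt_pos)[3], const double *qhs,
                               const int *hit, size_t npmts)
    {
        size_t i;
        double qsum = 0.0;
        double cos_cone = cos(42.0*M_PI/180.0);

        for (i = 0; i < npmts; i++) {
            double r[3], norm, cosang;
            if (!hit[i]) continue;
            r[0] = pmt_pos[i][0] - pos[0];
            r[1] = pmt_pos[i][1] - pos[1];
            r[2] = pmt_pos[i][2] - pos[2];
            norm = sqrt(r[0]*r[0] + r[1]*r[1] + r[2]*r[2]);
            cosang = (r[0]*dir[0] + r[1]*dir[1] + r[2]*dir[2])/norm;
            if (cosang >= cos_cone) qsum += qhs[i];
        }
        return qsum/6.0;
    }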
2018-12-04  don't quit when maxtime is reached (tlatorre)

2018-12-04  fix bug (tlatorre)

2018-12-04  add a command line parameter to control the maximum time of the fit (tlatorre)

This commit adds a parameter to stop the fit if it takes longer than a certain period of time in seconds. This parameter can be set on the command line. For example, to limit fits to 10 minutes:

    $ ./fit FILENAME --max-time 600.0
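If the minimization runs through NLopt (an assumption based on the algorithms named elsewhere in this log), such a parameter could map onto the library's built-in wall-clock cutoff inside the fitter's setup code:

    #include <nlopt.h>

    /* max_time comes from the --max-time argument; once the elapsed time
     * exceeds it, NLopt stops and returns the best point found so far.
     * A non-positive value disables the cutoff. */
    nlopt_set_maxtime(opt, max_time);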
2018-12-04  set a stopping criterion of 1% for the fit parameters (tlatorre)

2018-12-03  add a goodness of fit parameter psi to the fit (tlatorre)

2018-11-30  sizeof()/sizeof() -> LEN() (tlatorre)

2018-11-30  nll_muon -> nll and nll -> nopt_nll (tlatorre)

2018-11-30  add ability to fit for multiple vertices (tlatorre)
2018-11-28  update sno_charge.c (tlatorre)

This commit adds lots of comments to sno_charge.c and makes a couple of other changes:

- use interp1d() instead of the GSL interpolation routines
- increase MAX_PE to 100

I increased MAX_PE because I determined that it had a rather large impact on the likelihood function for 500 MeV electrons. This unfortunately slows down the initialization by a lot. I think I could speed this up by convolving the single PE charge distribution with a gaussian *before* convolving the charge distributions to compute the charge distributions for multiple PE.
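The expensive step being described is, at bottom, repeated self-convolution of the single-PE charge PDF; a stripped-down sketch, with a uniform charge grid and illustrative sizes (the real sno_charge.c is organized differently):

    #include <string.h>

    #define MAX_PE 100
    #define NBINS 1024

    /* pq[n][i] holds the charge PDF for n photoelectrons, tabulated on
     * a uniform grid of spacing dq. */
    static double pq[MAX_PE + 1][NBINS];

    static void compute_multi_pe(const double *p1, double dq)
    {
        int n, i, j;

        memcpy(pq[1], p1, sizeof(pq[1]));

        for (n = 2; n <= MAX_PE; n++) {
            for (i = 0; i < NBINS; i++) {
                double sum = 0.0;
                /* discrete convolution:
                 * p_n(q) = \int p_{n-1}(q') p_1(q - q') dq' */
                for (j = 0; j <= i; j++)
                    sum += pq[n-1][j]*p1[i-j];
                pq[n][i] = sum*dq;
            }
        }
    }

The O(MAX_PE * NBINS^2) cost of this triple loop is what makes the initialization slow once MAX_PE is raised to 100.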
2018-11-27  add rayleigh scattering (tlatorre)

This commit adds Rayleigh scattering to the likelihood function. The Rayleigh scattering lengths come from rsp_rayleigh.dat from SNOMAN, which only includes photons which scattered +/- 10 ns around the prompt peak. The fraction of light which scatters is treated the same in the likelihood as reflected light, i.e. it is uniform across all the PMTs in the detector and the time PDF is assumed to be a constant for a fixed amount of time after the prompt peak.
2018-11-27  update dx_shower to 10 cm to speed things up (tlatorre)

2018-11-25  add a separate `dx_shower` parameter for the spacing of the shower track integral (tlatorre)

2018-11-17  speed up likelihood function and switch to using fixed dx (tlatorre)

This commit speeds up the likelihood function by about ~20% by using the precomputed track positions, directions, times, etc. instead of interpolating them on the fly. It also switches to computing the number of points to integrate along the track by dividing the track length by a specified distance, currently set to 1 cm. This should hopefully speed things up for lower energies and result in more stable fits at high energies.
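A minimal sketch of that precomputation, with illustrative names; the tables fit.c actually keeps (positions, directions, times, etc.) are richer than this:

    #include <stddef.h>
    #include <stdlib.h>

    /* Tabulate quantities along the track once, at a fixed spacing dx
     * (1 cm in the commit above), so the likelihood can index arrays
     * instead of interpolating on the fly. */
    typedef struct {
        size_t n;
        double *s; /* path length along the track (cm) */
        double *t; /* time at each point (ns) */
    } track_table;

    static track_table *precompute_track(double length, double dx)
    {
        track_table *tab = malloc(sizeof(*tab));
        size_t i;

        tab->n = (size_t)(length/dx) + 1;
        tab->s = malloc(tab->n*sizeof(double));
        tab->t = malloc(tab->n*sizeof(double));

        for (i = 0; i < tab->n; i++) {
            tab->s[i] = i*dx;
            tab->t[i] = 0.0; /* placeholder: filled from the energy loss model */
        }
        return tab;
    }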