path: root/src/fit.c
2019-06-19 add FTP, RSP, and FTK results to the output file (tlatorre)
2019-06-14 fix empty list item at top of YAML file (tlatorre)
The first logical record in the SNOCR files doesn't have an EV bank, which was causing the output file to have an empty list element. This commit fixes the issue by checking for an empty EV bank before printing the list delimiter.
2019-06-14 add trigger word and trigger time in ns to the YAML file (tlatorre)
2019-06-14 set the maximum kinetic energy in the fit dynamically based on particle ID (tlatorre)
The range and energy loss tables have different maximum values for electrons, muons, and protons, so we have to set the maximum energy of the fit dynamically in order to avoid a GSL interpolation error. This commit adds {electron,muon,proton}_get_max_energy() functions that return the maximum energy in each table, which is then used to set the maximum value in the fit.
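A minimal sketch of how the dispatch on particle ID might look (only the three *_get_max_energy() functions are named in the commit message; the particle id codes here are hypothetical stand-ins):

    /* Per-particle table maxima named in the commit message. */
    double electron_get_max_energy(void);
    double muon_get_max_energy(void);
    double proton_get_max_energy(void);

    /* Hypothetical particle id codes for illustration. */
    enum particle_id {IDP_E_MINUS, IDP_MU_MINUS, IDP_PROTON};

    /* Return the maximum kinetic energy we can safely interpolate for
     * this particle, so the fit's upper bound never runs off the end
     * of the range and energy loss tables. */
    static double particle_get_max_energy(enum particle_id id)
    {
        switch (id) {
        case IDP_E_MINUS:  return electron_get_max_energy();
        case IDP_MU_MINUS: return muon_get_max_energy();
        case IDP_PROTON:   return proton_get_max_energy();
        }
        return 0.0; /* unreachable for the ids above */
    }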
2019-06-14 add a function to compute a data cleaning word (tlatorre)
Also write out the data cleaning word to the YAML file.
2019-06-14 set the starting energy to MAX_ENERGY if it's greater (tlatorre)
Also increase the maximum kinetic energy to 10^4 GeV, which is approximately the maximum expected energy for cosmic muons at SNO.
2019-06-13 add a data cleaning cut to tag incoming muons (tlatorre)
This commit adds a data cleaning cut to tag incoming muons by looking for early OWL hits. It also significantly updates the flasher cut to catch more flashers. In particular, the flasher cut now does the following:

- loops over *all* paddle cards with at least 4 hits instead of just the paddle cards with the most hits
- uses QLX to look for charge outliers in the paddle card
- fixes a few bugs (for example, uninitialized values in the charge array)
- adds a check to see if the given slot is early with respect to all PMTs within 4 meters, to catch the case where the flashing channel is missing from the event
2019-06-02 add is_flasher field to output (tlatorre)
2019-06-02 update find_peaks() to only return unique peaks (tlatorre)
This commit updates find_peaks() to only return peaks which are at least a certain number of degrees apart from each other. I found that for many events the first few peaks were all in essentially the same direction, so the fit was spending a lot of time fitting essentially the same seed points. Since I now only try 3 peaks in order to get my grid jobs to run in less than a few hours, it's necessary to make sure we aren't just fitting the same three directions during the "quick" minimization. I also updated the fit to only use a maximum of 3 seed directions.
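A toy version of the separation test described above (the real find_peaks() works directly on the Hough transform; this sketch only shows the angular uniqueness filter):

    #include <math.h>
    #include <string.h>

    /* Keep a candidate direction only if it is at least `min_sep` degrees
     * away from every direction already accepted. Directions are unit
     * vectors, so the angle between two of them is acos of their dot
     * product. Returns the number of unique peaks written to `out`. */
    static size_t unique_peaks(const double (*cand)[3], size_t n,
                               double min_sep, double (*out)[3])
    {
        double cos_sep = cos(min_sep*M_PI/180.0);
        size_t i, j, nout = 0;

        for (i = 0; i < n; i++) {
            int ok = 1;
            for (j = 0; j < nout; j++) {
                double dot = cand[i][0]*out[j][0] +
                             cand[i][1]*out[j][1] +
                             cand[i][2]*out[j][2];
                /* dot > cos(min_sep) means the angle is < min_sep */
                if (dot > cos_sep) {
                    ok = 0;
                    break;
                }
            }
            if (ok) memcpy(out[nout++], cand[i], sizeof out[0]);
        }
        return nout;
    }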
2019-05-29 set step size on theta and phi to 0.1 (tlatorre)
Also, update the step size for the energy during the final minimization to 10%.
2019-05-24 add a script to submit jobs to the grid (tlatorre)
2019-05-24 update sprintf_yaml_list() (tlatorre)
This commit changes the format specifier for the values in sprintf_yaml_list() from %.2g -> %.2f because YAML (at least the python parser) doesn't recognize values like 1e+03 as floats.
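For example (a standalone demonstration of the two format specifiers, not code from fit.c):

    #include <stdio.h>

    int main(void)
    {
        double value = 1000.0;
        printf("%.2g\n", value); /* prints "1e+03": no decimal point, so the
                                    python YAML parser reads it as a string */
        printf("%.2f\n", value); /* prints "1000.00": parsed as a float */
        return 0;
    }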
2019-05-24 several small updates to fit.c (tlatorre)

- set number of shower points to 10 for the main fit
- set step size to 10% of the energy
- set max number of evals during quick minimization phase to 1000
2019-05-24 switch to using BOBYQA since it's faster (tlatorre)
2019-05-24 change MAX_NPEAKS to 5 (tlatorre)
I probably need to spend some time to optimize this along with the algorithm for guessing the peaks, but for now I am just lowering this from 10 -> 5 because with 10 the number of quick minimizations for 3 particles is too big and so the fits take way too long.
2019-05-24 don't do fast fit during quick minimization phase (tlatorre)
When plotting the likelihood function I realized that the fast likelihood calculation was *very* noisy due to the way I calculated the shower and delta ray charge. Although it works well for single particles, it is not suitable for distinguishing which seed is the best when doing multi particle fits. Eventually I may be able to fix this, but for now we just do the normal likelihood calculation. I also decreased the number of shower points from 100 -> 10 to speed things up.
2019-05-23 add zdab-cat (tlatorre)
This commit adds a new program called zdab-cat which is kind of like fit, but just produces the YAML output without actually fitting anything.
2019-05-23 make float formatting consistent in sprintf_yaml_list() (tlatorre)
2019-05-14 add --plot-likelihood option to fit (tlatorre)
2019-05-13 update method for calculating expected number of photons from shower and delta rays (tlatorre)
This commit introduces a new method for integrating over the particle track to calculate the number of shower and delta ray photons expected at each PMT. The reason for introducing a new method was that the previous method of just using the trapezoidal rule was both inaccurate and not stable. By inaccurate I mean that the trapezoidal rule was not producing a very good estimate of the true integral, and by not stable I mean that small changes in the fit parameters (like theta and phi) could produce wildly different results. This meant that the likelihood function was very noisy and was causing the minimizers to not be able to find the global minimum.

The new integration method works *much* better than the trapezoidal rule for the specific functions we are dealing with. The problem is essentially to integrate the product of two functions over some interval, one of which is very "peaky", i.e. we want to find:

    \int f(x) g(x) dx

where f(x) is peaked around some region and g(x) is relatively smooth. For our case, f(x) represents the angular distribution of the Cerenkov light and g(x) represents the factors like solid angle, absorption, etc. The technique I discovered was that you can approximate this integral via a discrete sum:

    constant * \sum_i g(x_i)

where the x_i are chosen to have equal spacing along the range of the integral of f(x), i.e.

    x_i = F^(-1)(i*constant)

where F is the integral of f. This new method produces likelihood functions which are *much* smoother and more accurate than before (a toy illustration of the technique follows this entry).

In addition, there are a few other fixes in this commit:

- switch from specifying a step size for the shower integration to a number of points, i.e. dx_shower -> number of shower points

- only integrate to the PSUP. I realized that previously we were integrating to the end of the track even if the particle left the PSUP, and that there was no code to deal with the fact that light emitted beyond the PSUP can't make it back to the PMTs.

- only integrate to the Cerenkov threshold. When integrating over the particle track to calculate the expected number of direct Cerenkov photons, we now only integrate the track up to the point where the particle's velocity is 1/index. This should hopefully make the likelihood smoother, because previously the estimate would depend on exactly whether the points at which we sampled the track were above or below this point.

- add a minimum theta0 value based on the angular width of the PMT. When calculating the expected number of Cerenkov photons we assumed that the angular distribution was constant over the whole PMT. This is a bad assumption when the particle is very close to the PMT. Really we should average the function over all the angles of the PMT, but that would be too computationally expensive, so instead we just calculate a minimum theta0 value which depends on the distance and angle to the PMT. This seems to make the likelihood much smoother for particles near the PSUP.

- add a factor of sin(theta) when checking if we can skip calculating the charge in get_expected_charge()

- fix a nan in beta_root() when the momentum is negative

- update PSUP_RADIUS from 800 cm -> 840 cm
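As a rough standalone illustration of this quadrature trick (a toy sketch, not code from fit.c: f(x) = exp(-x) is chosen so that F and F^(-1) have closed forms, and g is an arbitrary smooth stand-in for the solid angle and absorption factors):

    #include <stdio.h>
    #include <math.h>

    /* Toy peaked weight f(x) = exp(-x), with CDF F(x) = 1 - exp(-x) and
     * inverse F_inv(u) = -log(1 - u). */
    static double F(double x) { return 1.0 - exp(-x); }
    static double F_inv(double u) { return -log(1.0 - u); }

    /* Arbitrary smooth factor standing in for solid angle, absorption, etc. */
    static double g(double x) { return 1.0/(1.0 + x); }

    /* Approximate \int_a^b f(x) g(x) dx by sampling g at points equally
     * spaced in the integral of f: substituting u = F(x) turns the
     * integral into \int g(F_inv(u)) du, evaluated with the midpoint
     * rule. */
    static double quad_cdf(double a, double b, int n)
    {
        double du = (F(b) - F(a))/n;
        double sum = 0.0;
        int i;

        for (i = 0; i < n; i++)
            sum += g(F_inv(F(a) + (i + 0.5)*du));

        return du*sum; /* the "constant" in the text above is du */
    }

    int main(void)
    {
        /* converges quickly even with few points because f is peaked */
        printf("%.6f\n", quad_cdf(0.0, 10.0, 10));
        return 0;
    }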
2019-03-31 switch back to using subplex (tlatorre)
2019-03-26 small fix to fit.c (tlatorre)
2019-03-25 update rayleigh scattering calculation (tlatorre)
This commit updates the optics code to calculate the rayleigh scattering length using the Einstein-Smoluchowski formula instead of using the effective rayleigh scattering lengths from the RSPR bank.
2019-03-17 set a relative tolerance of 1e-2 on the optimization parameters in the fast fit (tlatorre)
2019-03-16 add GPLv3 license (tlatorre)
2019-03-16 switch to using SBPLX for the minimization (tlatorre)
Based on some initial testing it seems that the subplex minimization algorithm performs *much* better than BOBYQA for multi-particle fits. It is also a bit slower, so I will probably have to figure out how to speed things up.
2019-03-07 update fit to automatically load DQXX file based on run number (tlatorre)
2019-03-07 update code to allow you to run the fit outside of the src directory (tlatorre)
To enable the fitter to run outside of the src directory, I created a new function open_file() which works exactly like fopen() except that it searches for the file in both the current working directory and the path specified by an environment variable.
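A minimal sketch of such a wrapper (the environment variable name FIT_PATH is a hypothetical stand-in; the commit doesn't name the actual variable here):

    #include <stdio.h>
    #include <stdlib.h>

    /* Open `filename` like fopen(), but if it isn't found in the current
     * working directory, also look under the directory named by an
     * environment variable (FIT_PATH is a hypothetical name here). */
    static FILE *open_file(const char *filename, const char *mode)
    {
        char path[4096];
        const char *dir;
        FILE *f = fopen(filename, mode);

        if (f) return f;

        dir = getenv("FIT_PATH");
        if (!dir) return NULL;

        snprintf(path, sizeof path, "%s/%s", dir, filename);
        return fopen(path, mode);
    }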
2019-03-07 don't fix the position during the fast fit (tlatorre)
2019-03-05 update quad() to not abort if the matrix is singular (tlatorre)
2019-03-04 add a function to tag flasher events (tlatorre)
2019-03-04 update fit to print gtid and nhit even if we skip the event (tlatorre)
2019-03-04 add a --min-nhit command line argument (tlatorre)
2019-03-04 update get_event() to handle events without a pmt bank link (tlatorre)
2019-03-04 skip logical record if there is no EV bank (tlatorre)
In the processed zdab files (the SNOCR_* files), the first logical record just has a run header bank and no EV bank.
2019-03-04 skip reading in mcgn banks if there is no mc bank (tlatorre)
2019-03-04 check that all links are nonzero (tlatorre)
2019-01-15 fix a bug with getting the first MCTK bank (tlatorre)
2019-01-15 update zebra library to be able to use links (tlatorre)
This commit updates the zebra library files zebra.{c,h} so that it's now possible to traverse the data structure using links! This was originally motivated by wanting to figure out which MC particles were generated from the MCGN bank (from which it's only possible to access the tracks and vertices using structural links). I've also added a new test to test-zebra which checks the consistency of all of the next/up/orig, structural, and reference links in a zebra file.
2019-01-10 update find_peaks algorithm (tlatorre)
Previously, the algorithm used to find peaks was to search for all peaks in the Hough transform above some constant fraction of the highest peak. This algorithm could have issues finding smaller peaks away from the highest peak.

The new algorithm instead finds the highest peak in the Hough transform and then recomputes the Hough transform ignoring all PMT hits within the Cerenkov cone of the first peak. The next peak is found from this transform, and the process is iteratively repeated until a certain number of peaks are found (see the sketch after this entry).

One disadvantage of this new system is that it will *always* find the same number of peaks, and this will usually be greater than the actual number of rings in the event. This is not a problem though, since when fitting the event we loop over all possible peaks and do a quick fit to determine the starting point, so false positives are OK because the real peaks will fit better during this quick fit.

Another potential issue with this new method is that by rejecting all PMT hits within the Cerenkov cone of the first peak we could miss a second peak very close to the first peak. This is partially mitigated by the fact that when we loop over all possible combinations of the particle ids and directions we allow each peak to be used more than once. For example, when fitting for the hypothesis that an event is caused by two electrons and one muon, and given two possible directions 1 and 2, we will fit for the following possible direction combinations:

    1 1 1
    1 1 2
    1 2 1
    1 2 2
    2 2 1
    2 2 2

Therefore, if there is a second ring close to the first it is possible to fit it correctly, since we will seed the quick fit with two particles pointing in the same direction.

This commit also adds a few tests for new functions and changes the energy step size during the quick fit to 10% of the starting energy value.
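A schematic of that masking loop (all data structures and helper names here are hypothetical stand-ins for the real Hough machinery):

    #include <stddef.h>

    #define MAX_PMTS 10000

    typedef struct { double x, y, z; } dir3;
    typedef struct { size_t nhit; /* PMT hit data ... */ } event;

    /* Hypothetical helpers standing in for the real Hough transform code. */
    dir3 hough_max(const event *ev, const int *skip);
    int in_cerenkov_cone(const event *ev, size_t ihit, dir3 peak);

    /* Find `npeaks` directions: take the strongest direction in the Hough
     * transform, mask all hits inside its Cerenkov cone, and repeat the
     * transform on the remaining hits. */
    void find_peaks_sketch(const event *ev, dir3 *peaks, size_t npeaks)
    {
        int skip[MAX_PMTS] = {0};
        size_t i, n;

        for (n = 0; n < npeaks; n++) {
            peaks[n] = hough_max(ev, skip); /* transform over unmasked hits */

            for (i = 0; i < ev->nhit; i++)
                if (in_cerenkov_cone(ev, i, peaks[n]))
                    skip[i] = 1; /* remove this ring's hits */
        }
    }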
2018-12-14 switch to using fit_event2() by default (tlatorre)
This commit updates the fit to use the fit_event2() function which can fit for multi vertex hypotheses. It also uses the QUAD fitter and the Hough transform of the event to seed the fit so the results for 1 particle fits will be slightly different than before. I also fixed a small bug in combinations_with_replacement().
2018-12-14 fix help string (tlatorre)
2018-12-13 add some more comments and fix a memory leak (tlatorre)
2018-12-13 add some comments (tlatorre)
2018-12-13 update fit.c to fit multiple vertices (tlatorre)
This commit adds a new function fit_event2() to fit multiple vertices. To seed the fit, fit_event2() does the following:

- use the QUAD fitter to find the position and initial time of the event
- call find_peaks() to find possible directions for the particles
- loop over all possible unique combinations of the particles and direction vectors and do a "fast" minimization

The best minimum found from the "fast" minimizations is then used to start the fit.

This commit has a few other updates:

- adds a hit_only parameter to the nll() function. This was necessary since previously PMTs which weren't hit were always skipped for the fast minimization, but when fitting for multiple vertices we need to include PMTs which aren't hit since we float the energy.

- adds the function guess_energy() to guess the energy of a particle given a position and direction. This function estimates the energy by summing up the QHS for all PMTs hit within the Cerenkov cone and dividing by 6 (see the sketch after this entry).

- fixes a bug which caused the fit to freeze when hitting ctrl-c during the fast minimization phase.
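A sketch of that energy guess (the data structures and the 1/1.33 water index are assumptions for illustration; the QHS sum over the Cerenkov cone and the divide-by-6 come from the commit message):

    #include <math.h>
    #include <stddef.h>

    typedef struct {
        double pos[3]; /* PMT position */
        double qhs;    /* high-gain, short-integration charge */
        int hit;
    } pmt_hit;

    /* Estimate the energy for a vertex at `pos` pointing along the unit
     * vector `dir`: sum QHS over all hit PMTs inside the Cerenkov cone
     * and divide by 6. */
    static double guess_energy(const pmt_hit *pmts, size_t n,
                               const double pos[3], const double dir[3])
    {
        /* cos of the Cerenkov angle, assuming an index of ~1.33 (water) */
        const double cos_theta_c = 1.0/1.33;
        double sum = 0.0;
        size_t i;

        for (i = 0; i < n; i++) {
            double v[3], r, cosang;

            if (!pmts[i].hit) continue;

            v[0] = pmts[i].pos[0] - pos[0];
            v[1] = pmts[i].pos[1] - pos[1];
            v[2] = pmts[i].pos[2] - pos[2];
            r = sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
            cosang = (v[0]*dir[0] + v[1]*dir[1] + v[2]*dir[2])/r;

            if (cosang > cos_theta_c) /* inside the cone */
                sum += pmts[i].qhs;
        }

        return sum/6.0;
    }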
2018-12-04 don't quit when maxtime is reached (tlatorre)
2018-12-04 fix bug (tlatorre)
2018-12-04 add a command line parameter to control the maximum time of the fit (tlatorre)
This commit adds a parameter to stop the fit if it takes longer than a certain period of time in seconds. This parameter can be set on the command line. For example, to limit fits to 10 minutes:

    $ ./fit FILENAME --max-time 600.0
2018-12-04 set a stopping criterion of 1% for the fit parameters (tlatorre)
2018-12-03 add a goodness of fit parameter psi to the fit (tlatorre)