The first logical record in the SNOCR files doesn't have an EV bank, which
was causing the output file to have an empty list element. This commit fixes
the issue by checking for an empty EV bank before printing the list
delimiter.
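A minimal sketch of that kind of guard (the record type and the output format
below are illustrative only, not the actual code):

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical record type for illustration only. */
    struct record {
        int has_ev_bank; /* nonzero if the logical record contains an EV bank */
    };

    /* Write a YAML list of events, skipping records with no EV bank (such
     * as the first logical record in the SNOCR files) so that no empty
     * element or stray delimiter ends up in the output. */
    static void print_event_list(FILE *out, const struct record *recs, size_t n)
    {
        size_t i;
        int first = 1;

        fprintf(out, "[");
        for (i = 0; i < n; i++) {
            if (!recs[i].has_ev_bank)
                continue;            /* nothing to print for this record */
            if (!first)
                fprintf(out, ", ");  /* delimiter only between real events */
            fprintf(out, "ev%zu", i);
            first = 0;
        }
        fprintf(out, "]\n");
    }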
|
The range and energy loss tables have different maximum values for electrons,
muons, and protons, so we have to dynamically set the maximum energy of the
fit in order to avoid a GSL interpolation error.
This commit adds {electron,muon,proton}_get_max_energy() functions to return
the maximum energy in the tables, and that value is then used to set the
upper bound on the energy in the fit.
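A rough sketch of how the cap might be applied (only the *_get_max_energy()
names come from this commit; the particle-ID constants and the rest of the
interface are assumptions for illustration):

    /* Declarations of the functions added in this commit: they return the
     * last tabulated energy for each particle type. */
    double electron_get_max_energy(void);
    double muon_get_max_energy(void);
    double proton_get_max_energy(void);

    /* Hypothetical particle-ID constants for this sketch. */
    enum { IDP_E_MINUS, IDP_MU_MINUS, IDP_PROTON };

    /* Largest energy we can safely hand to the GSL interpolators for a
     * given particle type, used as the upper bound on the energy
     * parameter of the fit. */
    static double get_fit_max_energy(int id)
    {
        switch (id) {
        case IDP_E_MINUS:  return electron_get_max_energy();
        case IDP_MU_MINUS: return muon_get_max_energy();
        default:           return proton_get_max_energy();
        }
    }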
|
Also write out the data cleaning word to the YAML file.
|
Also increase the maximum kinetic energy to 10^4 GeV, which is approximately
the maximum expected energy for cosmic muons at SNO.
|
This commit adds a data cleaning cut to tag incoming muons by looking for early
OWL hits. It also significantly updates the flasher cut to catch more flashers.
In particular, the flasher cut now does the following:
- loops over *all* paddle cards with at least 4 hits instead of just the paddle
cards with the most hits
- uses QLX to look for charge outliers in the paddle card
- fixes a few bugs (for example, uninitialized values in the charge array)
- adds a check to see if the given slot is early with respect to all PMTs
within 4 meters, to catch the case where the flashing channel is missing from
the event (see the sketch after this list)
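A rough sketch of that last check (all of the types and names below are
hypothetical stand-ins; the real cut works on the calibrated hit times and
positions of the PMTs in the slot in question):

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical hit record for illustration only. */
    struct hit {
        double pos[3]; /* PMT position (cm) */
        double t;      /* calibrated hit time (ns) */
    };

    /* Return 1 if the suspect slot (position slot_pos, representative
     * time slot_t) is early by at least dt_cut ns with respect to every
     * hit PMT within 4 meters. This is the signature used to catch a
     * flasher whose flashing channel is itself missing from the event. */
    static int slot_is_early(const double slot_pos[3], double slot_t,
                             const struct hit *hits, size_t n, double dt_cut)
    {
        size_t i;

        for (i = 0; i < n; i++) {
            double dx = hits[i].pos[0] - slot_pos[0];
            double dy = hits[i].pos[1] - slot_pos[1];
            double dz = hits[i].pos[2] - slot_pos[2];

            if (sqrt(dx*dx + dy*dy + dz*dz) > 400.0)
                continue;            /* only consider PMTs within 4 m */
            if (hits[i].t - slot_t < dt_cut)
                return 0;            /* a nearby hit is not late enough */
        }
        return 1;
    }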
|
This commit updates find_peaks() to only return peaks which are at least a
certain number of degrees apart from each other. I found that for many events
the first few peaks all pointed in essentially the same direction, so the fit
was spending a lot of time minimizing from what were effectively identical
seed points. Since I now only try 3 peaks in order to get my grid jobs to run
in less than a few hours, it's necessary to make sure we aren't just fitting
the same three directions during the "quick" minimization.
I also updated the fit to only use a maximum of 3 seed directions.
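A sketch of that kind of separation filter (illustrative interface only; the
real find_peaks() works directly on the Hough-transform bins):

    #include <math.h>
    #include <stddef.h>

    /* Keep a candidate peak only if it points at least min_sep radians
     * away from every peak accepted so far. Directions are unit vectors,
     * so the opening angle is the acos of the dot product. */
    static size_t filter_peaks(const double (*candidates)[3], size_t n,
                               double (*out)[3], size_t max_out,
                               double min_sep)
    {
        size_t i, j, kept = 0;

        for (i = 0; i < n && kept < max_out; i++) {
            int ok = 1;

            for (j = 0; j < kept; j++) {
                double c = candidates[i][0]*out[j][0] +
                           candidates[i][1]*out[j][1] +
                           candidates[i][2]*out[j][2];

                if (acos(fmin(fmax(c, -1.0), 1.0)) < min_sep) {
                    ok = 0; /* too close to an already accepted peak */
                    break;
                }
            }

            if (ok) {
                out[kept][0] = candidates[i][0];
                out[kept][1] = candidates[i][1];
                out[kept][2] = candidates[i][2];
                kept++;
            }
        }

        return kept;
    }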
|
Also, update the step size for the energy during the final minimization to 10%.
|
This commit changes the format specifier for the values in sprintf_yaml_list()
from %.2g -> %.2f because YAML (at least the python parser) doesn't recognize
values like 1e+03 as floats.
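For example, with plain printf:

    #include <stdio.h>

    int main(void)
    {
        double value = 1000.0;

        printf("%.2g\n", value); /* prints "1e+03", which the Python YAML
                                  * parser treats as a string */
        printf("%.2f\n", value); /* prints "1000.00", which it parses as
                                  * a float */
        return 0;
    }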
|
- set number of shower points to 10 for the main fit
- set step size to 10% of the energy
- set max number of evals during quick minimization phase to 1000
|
I probably need to spend some time to optimize this along with the algorithm
for guessing the peaks, but for now I am just lowering this from 10 -> 5
because with 10 the number of quick minimizations for 3 particles is too big
and so the fits take way too long.
|
When plotting the likelihood function I realized that the fast likelihood
calculation was *very* noisy due to the way I calculated the shower and delta
ray charge. Although it works well for single particles, it is not suitable
for distinguishing which seed is the best when doing multi-particle fits.
Eventually I may be able to fix this, but for now we just do the normal
likelihood calculation.
I also decreased the number of shower points from 100 -> 10 to speed things up.
|
This commit adds a new program called zdab-cat which is kind of like fit, but
just produces the YAML output without actually fitting anything.
|
This commit introduces a new method for integrating over the particle track to
calculate the number of shower and delta ray photons expected at each PMT. The
reason for introducing a new method was that the previous method of just using
the trapezoidal rule was both inaccurate and not stable. By inaccurate I mean
that the trapezoidal rule was not producing a very good estimate of the true
integral and by not stable I mean that small changes in the fit parameters
(like theta and phi) could produce wildly different results. This meant that
the likelihood function was very noisy, which prevented the minimizers from
finding the global minimum.
The new integration method works *much* better than the trapezoidal rule for
the specific functions we are dealing with. The problem is essentially to
integrate the product of two functions over some interval, one of which is very
"peaky", i.e. we want to find:
\int f(x) g(x) dx
where f(x) is peaked around some region and g(x) is relatively smooth. For our
case, f(x) represents the angular distribution of the Cerenkov light and g(x)
represents the factors like solid angle, absorption, etc.
The technique I discovered was that you can approximate this integral with a
discrete sum:
\int f(x) g(x) dx ~= constant*\sum_i g(x_i)
where F(x) is the cumulative integral of f(x), the constant is the total
integral of f divided by the number of points, and the sample points are
chosen to be equally spaced in that cumulative integral, i.e.
x_i = F^(-1)(i*constant)
This new method produces likelihood functions which are *much* smoother and
more accurate than before.
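A self-contained toy version of the scheme (the peaked factor here is just an
exponential with an analytic cumulative integral; in the fitter f(x) is the
Cerenkov angular distribution and g(x) the solid angle/absorption factors,
and the sample points here are taken at the midpoints of each slice):

    #include <stdio.h>
    #include <math.h>

    #define TAU  0.1  /* width of the peak in f */
    #define XMAX 1.0  /* upper limit of the integral */

    /* The peaked factor is f(x) = exp(-x/TAU); F(x) is its cumulative
     * integral \int_0^x f(t) dt and Finv is the inverse of F. */
    static double g(double x)    { return 1.0/(1.0 + x*x); } /* smooth factor */
    static double F(double x)    { return TAU*(1.0 - exp(-x/TAU)); }
    static double Finv(double u) { return -TAU*log(1.0 - u/TAU); }

    int main(void)
    {
        int i, n = 10;
        double ftot = F(XMAX);
        double sum = 0.0;

        for (i = 0; i < n; i++) {
            /* midpoint of the i-th equal slice of the cumulative integral */
            double x = Finv((i + 0.5)*ftot/n);

            sum += g(x);
        }

        printf("integral ~= %g\n", sum*ftot/n);
        return 0;
    }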
In addition, there are a few other fixes in this commit:
- switch from specifying a step size for the shower integration to a number of
points, i.e. dx_shower -> number of shower points
- only integrate to the PSUP
I realized that previously we were integrating to the end of the track even
if the particle left the PSUP, and that there was no code to deal with the
fact that light emitted beyond the PSUP can't make it back to the PMTs.
- only integrate to the Cerenkov threshold
When integrating over the particle track to calculate the expected number
of direct Cerenkov photons, we now only integrate the track up to the point
where the particle's velocity drops to 1/n, where n is the index of
refraction (i.e. the Cerenkov threshold). This should hopefully make the
likelihood smoother because previously the estimate depended on exactly
where the sampled track points fell relative to this threshold.
- add a minimum theta0 value based on the angular width of the PMT
When calculating the expected number of Cerenkov photons we assumed that
the angular distribution was constant over the whole PMT. This is a bad
assumption when the particle is very close to the PMT. Really we should
average the function over all the angles of the PMT, but that would be too
computationally expensive so instead we just calculate a minimum theta0
value which depends on the distance and angle to the PMT. This seems to
make the likelihood much smoother for particles near the PSUP.
- add a factor of sin(theta) when checking if we can skip calculating the
charge in get_expected_charge()
- fix a nan in beta_root() when the momentum is negative
- update PSUP_RADIUS from 800 cm -> 840 cm
|
This commit updates the optics code to calculate the Rayleigh scattering
length using the Einstein-Smoluchowski formula instead of using the effective
Rayleigh scattering lengths from the RSPR bank.
|
Based on some initial testing, it seems that the subplex minimization
algorithm performs *much* better than BOBYQA for multi-particle fits. It is
also a bit slower, so I will probably have to figure out how to speed things
up.
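Assuming the minimization is driven through NLopt (which provides both
BOBYQA and a subplex reimplementation), switching between the two is a
one-line change; a toy example with a quadratic objective:

    #include <stdio.h>
    #include <nlopt.h>

    static double objective(unsigned n, const double *x, double *grad,
                            void *data)
    {
        (void) n; (void) grad; (void) data; /* derivative-free: grad is NULL */
        return (x[0] - 1.0)*(x[0] - 1.0) + (x[1] + 2.0)*(x[1] + 2.0);
    }

    int main(void)
    {
        double x[2] = {0.0, 0.0}, minf;
        /* Swap in NLOPT_LN_BOBYQA here to compare the two algorithms. */
        nlopt_opt opt = nlopt_create(NLOPT_LN_SBPLX, 2);

        nlopt_set_min_objective(opt, objective, NULL);
        nlopt_set_xtol_rel(opt, 1e-6);

        if (nlopt_optimize(opt, x, &minf) > 0)
            printf("minimum %g at (%g, %g)\n", minf, x[0], x[1]);

        nlopt_destroy(opt);
        return 0;
    }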
|
To enable the fitter to run outside of the src directory, I created a new
function open_file() which works exactly like fopen() except that it searches
for the file in both the current working directory and the path specified by an
environment variable.
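A sketch of such a helper; the open_file() name is from this commit, but the
environment-variable name below is just a placeholder:

    #include <stdio.h>
    #include <stdlib.h>

    /* Open `filename` like fopen(), but if that fails also look for it in
     * the directory named by an environment variable ("SDDM_DATA_DIR" is a
     * placeholder; the commit doesn't say what the variable is called). */
    FILE *open_file(const char *filename, const char *mode)
    {
        char path[4096];
        const char *dir;
        FILE *f = fopen(filename, mode);

        if (f)
            return f;

        dir = getenv("SDDM_DATA_DIR");
        if (!dir)
            return NULL;

        snprintf(path, sizeof(path), "%s/%s", dir, filename);
        return fopen(path, mode);
    }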
|
In the processed zdab files (the SNOCR_* files), the first logical record just
has a run header bank and no EV bank.
|
This commit updates the zebra library files zebra.{c,h} so that it's now
possible to traverse the data structure using links! This was originally
motivated by wanting to figure out which MC particles were generated from the
MCGN bank (from which it's only possible to access the tracks and vertices
using structural links).
I've also added a new test to test-zebra which checks the consistency of all of
the next/up/orig, structural, and reference links in a zebra file.
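As a purely illustrative picture of what link traversal enables (the struct
layout and function below are hypothetical stand-ins, not the actual
zebra.{c,h} interface):

    #include <stddef.h>

    /* Hypothetical bank type: each bank carries next/up/orig links plus
     * structural links to its daughter banks, so starting from an MCGN
     * bank one can walk down to the track and vertex banks below it. */
    typedef struct bank {
        struct bank *next;  /* next bank at the same level */
        struct bank *up;    /* parent bank */
        struct bank *orig;  /* bank whose link points here */
        struct bank **down; /* structural links to daughter banks */
        size_t ndown;
        unsigned int name;  /* bank name, e.g. "MCGN" packed into an int */
    } bank;

    /* Visit `b`, its siblings, and everything reachable below them
     * through structural links. */
    static void walk_banks(const bank *b, void (*visit)(const bank *))
    {
        size_t i;

        for (; b; b = b->next) {
            visit(b);
            for (i = 0; i < b->ndown; i++)
                if (b->down[i])
                    walk_banks(b->down[i], visit);
        }
    }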
|
Previously, the algorithm used to find peaks was to search for all peaks in the
Hough transform above some constant fraction of the highest peak. This
algorithm could have issues finding smaller peaks away from the highest peak.
The new algorithm instead finds the highest peak in the Hough transform and
then recomputes the Hough transform ignoring all PMT hits within the Cerenkov
cone of the first peak. The next peak is found from this transform and the
process is iteratively repeated until a certain number of peaks are found.
One disadvantage of this new system is that it will *always* find the same
number of peaks, and this will usually be greater than the actual number of
rings in the event. This is not a problem, though, since when fitting the
event we loop over all possible peaks and do a quick fit to determine the
starting point, so false positives are OK: the real peaks will fit better
during this quick fit.
Another potential issue with this new method is that by rejecting all PMT hits
within the Cerenkov cone of the first peak we could miss a second peak very
close to the first peak. This is partially mitigated by the fact that when we
loop over all possible combinations of the particle ids and directions we allow
each peak to be used more than once. For example, when fitting for the
hypothesis that an event is caused by two electrons and one muon and given two
possible directions 1 and 2, we will fit for the following possible direction
combinations:
1 1 1
1 1 2
1 2 1
1 2 2
2 2 1
2 2 2
Therefore if there is a second ring close to the first it is possible to fit it
correctly since we will seed the quick fit with two particles pointing in the
same direction.
This commit also adds a few tests for new functions and changes the energy step
size during the quick fit to 10% of the starting energy value.
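In sketch form, the iterative search described above looks roughly like the
following (every helper and type here is a hypothetical stand-in; the real
code works on the event's PMT hit list and the Hough-transform arrays):

    #include <stdlib.h>

    struct hit; /* PMT hit (position, time, charge); details omitted */

    /* Hypothetical helpers for this sketch only. */
    void compute_hough(const struct hit *hits, const int *masked, size_t n);
    void find_maximum(double dir[3]);
    int  in_cerenkov_cone(const struct hit *hit, const double dir[3]);

    /* Find the largest Hough peak, mask every hit inside its Cerenkov
     * cone, recompute the transform from the remaining hits, and repeat
     * until npeaks directions have been collected. */
    static void find_peaks_iterative(const struct hit *hits, size_t nhits,
                                     double (*peaks)[3], size_t npeaks)
    {
        size_t i, j;
        int *masked = calloc(nhits, sizeof(int));

        for (i = 0; i < npeaks; i++) {
            compute_hough(hits, masked, nhits);
            find_maximum(peaks[i]);

            for (j = 0; j < nhits; j++)
                if (in_cerenkov_cone(&hits[j], peaks[i]))
                    masked[j] = 1;
        }

        free(masked);
    }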
|
This commit updates the fit to use the fit_event2() function, which can fit
multi-vertex hypotheses. It also uses the QUAD fitter and the Hough transform
of the event to seed the fit, so the results for single-particle fits will be
slightly different than before.
I also fixed a small bug in combinations_with_replacement().
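For reference, a generic combinations-with-replacement enumerator looks like
the following; this is just the textbook non-decreasing-index scheme, not
necessarily the repository's implementation or the bug that was fixed:

    #include <stdio.h>

    /* Advance `idx` (a non-decreasing tuple of length r with entries in
     * {0, ..., n-1}) to the next combination with replacement. Returns 0
     * when all combinations have been produced. */
    static int next_combination(int *idx, int r, int n)
    {
        int i, j;

        for (i = r - 1; i >= 0; i--) {
            if (idx[i] < n - 1) {
                idx[i]++;
                for (j = i + 1; j < r; j++)
                    idx[j] = idx[i]; /* keep the tuple non-decreasing */
                return 1;
            }
        }
        return 0; /* exhausted */
    }

    int main(void)
    {
        int idx[3] = {0, 0, 0};

        /* For n = 2 choices and r = 3 slots this prints
         * 0 0 0, 0 0 1, 0 1 1, 1 1 1. */
        do {
            printf("%d %d %d\n", idx[0], idx[1], idx[2]);
        } while (next_combination(idx, 3, 2));

        return 0;
    }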
|
This commit adds a new function fit_event2() to fit multiple vertices. To seed
the fit, fit_event2() does the following:
- use the QUAD fitter to find the position and initial time of the event
- call find_peaks() to find possible directions for the particles
- loop over all possible unique combinations of the particles and direction
vectors and do a "fast" minimization
The best minimum found from the "fast" minimizations is then used to start the fit.
This commit has a few other updates:
- adds a hit_only parameter to the nll() function. This was necessary since
previously PMTs which weren't hit were always skipped for the fast
minimization, but when fitting for multiple vertices we need to include PMTs
which aren't hit since we float the energy.
- add the function guess_energy() to guess the energy of a particle given a
position and direction. This function estimates the energy by summing up the
QHS for all PMTs hit within the Cerenkov cone and dividing by 6 (see the
sketch after this list).
- fixed a bug which caused the fit to freeze when hitting ctrl-c during the
fast minimization phase.
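A sketch of the energy guess mentioned in the list above (hypothetical types
and interface; only the "sum the QHS inside the cone and divide by 6" rule
comes from this commit):

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical hit record for illustration. */
    struct pmt_hit {
        double pos[3]; /* PMT position (cm) */
        double qhs;    /* QHS charge */
    };

    /* Sum the QHS of every hit PMT inside the Cerenkov cone (~42 degrees
     * in water) around the proposed vertex and direction, then divide by
     * 6 to get a rough energy estimate. */
    static double guess_energy(const double pos[3], const double dir[3],
                               const struct pmt_hit *hits, size_t n)
    {
        size_t i;
        double qsum = 0.0;
        double cos_cone = cos(42.0*M_PI/180.0);

        for (i = 0; i < n; i++) {
            double dx = hits[i].pos[0] - pos[0];
            double dy = hits[i].pos[1] - pos[1];
            double dz = hits[i].pos[2] - pos[2];
            double r = sqrt(dx*dx + dy*dy + dz*dz);

            if (r > 0 && (dx*dir[0] + dy*dir[1] + dz*dir[2])/r > cos_cone)
                qsum += hits[i].qhs;
        }

        return qsum/6.0;
    }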
|