path: root/utils/dc
Age  Commit message  Author
2021-01-05  print latex table  tlatorre
2021-01-05  update dc flasher and noise code  tlatorre
2021-01-05  hack to get rid of flasher and muon events in breakdown sample  tlatorre
2021-01-04  get rid of nhit_thresh  tlatorre
2021-01-04  update radius cut in dc  tlatorre
2020-11-16  don't apply retrigger cut to MC  tlatorre
2020-11-16  loop over MC filenames  tlatorre
2020-11-01  don't apply nhit trigger cut to MC  tlatorre
2020-10-05  major updates to the chi2 analysis  tlatorre
This commit fixes the chi2 analysis so that it is no longer biased. Previously, the chi2 analysis pull plots showed a consistent bias. At first, I thought this was because the posterior wasn't Gaussian, but even after switching to percentile plots based on the algorithm outlined in "Validating Bayesian Inference Algorithms with Simulation-Based Calibration", I was still seeing a bias. I finally tracked it down to the fact that I was applying the energy scale parameters to the data instead of the Monte Carlo. Therefore, in this commit I update the posterior to apply the energy scale parameters to the Monte Carlo instead of the data. This has the slight disadvantage that the final histograms will be binned in the biased energy, but that's not really a big deal.

In addition, this commit contains several other updates:

- switch to plotting percentile plots based on the algorithm in "Validating Bayesian Inference Algorithms with Simulation-Based Calibration" instead of pull plots
- apply both the energy scale and resolution at the individual particle level, i.e. there is no longer an energy resolution term for electron + muon fits
- separate pull plots and coverage plots. Previously I was making both the p-value coverage plots and the pull plots at the same time. However, the pull plots shouldn't have anything to do with the GENIE weights, whereas the p-value coverage plots should draw samples weighted by the GENIE weights. In addition, for the pull plots we draw new truth parameters on every iteration, whereas for the p-value coverage plots we only draw them once.
- switch to using KDEMove() for the MCMC since I think it samples multimodal distributions a lot better than the default emcee move
- I now correct for the reconstruction energy bias in plot-michel and plot-muons
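The percentile check mentioned above can be illustrated with a minimal sketch of simulation-based calibration. This is not the code from the commit: it assumes a toy conjugate model (standard normal prior, one observation with unit Gaussian noise) so that exact posterior samples are available without an MCMC, and the names `sbc_ranks` and `percentile_histogram` are hypothetical. If the inference is unbiased, the rank of the true parameter among the posterior samples is uniformly distributed, so the histogram should be flat.

```python
import random


def sbc_ranks(n_trials=2000, n_post=99, seed=0):
    """Rank of the true parameter among posterior draws, for a toy model.

    Assumed toy model (not the chi2 analysis itself):
        theta ~ N(0, 1),  y | theta ~ N(theta, 1)
    for which the exact posterior is N(y / 2, 1 / 2).
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_trials):
        theta = rng.gauss(0.0, 1.0)   # truth drawn from the prior
        y = rng.gauss(theta, 1.0)     # simulated data set
        # Exact posterior draws stand in for the MCMC samples.
        post = [rng.gauss(y / 2.0, 0.5 ** 0.5) for _ in range(n_post)]
        ranks.append(sum(s < theta for s in post))
    return ranks


def percentile_histogram(ranks, n_post, n_bins=10):
    """Bin ranks (0..n_post) into n_bins percentile bins."""
    counts = [0] * n_bins
    for r in ranks:
        counts[min(r * n_bins // (n_post + 1), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    print(percentile_histogram(sbc_ranks(), 99))
```

A bias like the one described above would show up as a tilted histogram, with the ranks piling up at one end instead of being spread evenly.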
2020-09-07  update python scripts to not call plt.show() when run with --save  tlatorre
2020-08-31  add estimate_errors to chi2 analysis  tlatorre
This commit updates the estimate_errors() function so that it works without a list of constraints and uses arrays of low and high bounds passed in instead of hardcoded constraints. I can now call this function from the chi2 analysis to get the stepsizes before running the MCMC.
2020-08-30  update dc scripts  tlatorre
- delete instrumental tags in both scripts since we already tag them in get_events()
- apply the same cuts to the MC as to the data
- add a comment about the fitted fraction
- quit if we don't have at least 1 type of each instrumental
2020-07-27  add 20 MeV cut to dc, dc-closure-test, and plot-dc  tlatorre
2020-07-27  loop over runs in dc, dc-closure-test, and plot-dc to prevent using too much memory  tlatorre
2020-06-14  update dc and dc-closure-test to take into account fitted fraction of instrumentals  tlatorre
This commit updates the contamination analysis scripts to take into account the fact that we only fit a fraction of some of the instrumental events. Based on the recent rate at which my jobs have been running on the grid, fitting all the events would take *way* too long. Therefore, I'm now planning to only fit 10% of muon, flasher, and neck events. With this commit the contamination analysis will now correctly take into account the fact that we aren't fitting all the instrumental events.
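One simple way to account for a fitted fraction is to scale the number of fitted events back up to an estimate of the full sample. The sketch below only illustrates that idea and is not taken from the dc script, which may handle the fraction differently (for example inside the likelihood). The table `FITTED_FRACTION` and the function `corrected_counts` are hypothetical names. Only the 10% figure for muon, flasher, and neck events comes from the commit message.

```python
# Fraction of each instrumental type that is actually fitted.
# The 10% values are from the commit message; the table itself is hypothetical.
FITTED_FRACTION = {"muon": 0.1, "flasher": 0.1, "neck": 0.1}


def corrected_counts(fitted_counts):
    """Estimate the full number of events of each type from the fitted subset.

    Types missing from FITTED_FRACTION are assumed to be fitted in full.
    """
    return {
        name: n / FITTED_FRACTION.get(name, 1.0)
        for name, n in fitted_counts.items()
    }


if __name__ == "__main__":
    # 12 fitted muons stand for roughly 120 tagged muons in the full sample.
    print(corrected_counts({"muon": 12, "noise": 7}))
```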
2020-05-31  update contamination analysis step size  tlatorre
This commit updates the step size used for the MCMC in the contamination analysis to 0.5 times the error returned by scanning near the minimum. I ran some tests and this seemed to be pretty efficient compared to either the full error or 0.1 times the error. I also reduced the number of workers to 10.
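A rough sketch of how a step size could be derived from a scan near the minimum follows. It is an illustration under stated assumptions, not the estimate_errors() implementation: it assumes the error is the distance at which the negative log-likelihood rises by 0.5 above its minimum, that the function increases monotonically above the best fit, and that each parameter has an upper bound. The names `scan_error`, `mcmc_stepsizes`, and the `high` argument are hypothetical.

```python
def scan_error(nll, best, i, high, delta=0.5, tol=1e-9):
    """Distance above best[i] at which nll rises by delta, clipped at high[i].

    Assumes nll is non-decreasing in parameter i above the best-fit point.
    """
    target = nll(best) + delta

    def shifted(step):
        x = list(best)
        x[i] = best[i] + step
        return nll(x)

    lo, hi = 0.0, high[i] - best[i]
    if hi <= 0.0:
        return 0.0
    if shifted(hi) < target:
        return hi  # the bound is reached before nll rises by delta
    # Bisect for the crossing point between lo and hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shifted(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mcmc_stepsizes(nll, best, high, scale=0.5):
    """Step size per parameter: scale times the scanned error."""
    return [scale * scan_error(nll, best, i, high) for i in range(len(best))]


if __name__ == "__main__":
    sigma = [2.0, 0.1]
    best = [1.0, 3.0]

    def nll(x):
        return sum((xi - bi) ** 2 / (2 * s ** 2)
                   for xi, bi, s in zip(x, best, sigma))

    print(mcmc_stepsizes(nll, best, high=[100.0, 10.0]))
```

For a Gaussian likelihood the scanned error equals the standard deviation, so with `scale=0.5` the step is half a sigma, matching the factor quoted in the commit message.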
2020-05-25  update contamination analysis stuff  tlatorre
- fix Constraint.renormalize_no_fix() which could enter an infinite loop if the fixed parameter was greater than 1 - EPSILON
- don't divide by psi twice in get_events()
- only use prompt events and cut on nhit_cal < 100
2020-05-12  speed up the contamination analysis script  tlatorre
2020-05-12  add a script to do a closure test on the contamination analysis  tlatorre
2020-05-11  update ockham factor, remove hack, and don't submit all flashers  tlatorre
This commit contains the following updates:

- remove hack to get rid of low energy events in plot-energy, since while writing the unidoc I realized it's not necessary now that we add +100 to multi-particle fits
- update Ockham factor to use an energy resolution of 5%
- update submit-grid-jobs to submit jobs according to the following criteria:
  - always submit prompt events with no data cleaning cuts
  - submit 10% of prompt flasher events
  - submit all other prompt events
  - submit followers only if they have no data cleaning cuts
- update submit-grid-jobs to place the nhit cut of 100 on the calibrated nhit
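The submission criteria listed above can be sketched as a single predicate. This is an illustration only, not the code in submit-grid-jobs. The event layout (a dict with the keys `prompt`, `dc_tags`, and `nhit_cal`), the tag name `"flasher"`, and the function name `should_submit` are all hypothetical. It also assumes that the nhit cut means events need a calibrated nhit of at least 100, and that the 10% of flashers are chosen at random.

```python
import random


def should_submit(event, rng=random):
    """Decide whether to submit an event, following the criteria above.

    event is a hypothetical dict with keys:
        'prompt'   -- True for prompt events, False for followers
        'dc_tags'  -- set of data cleaning cuts the event is tagged with
        'nhit_cal' -- calibrated nhit
    """
    if event["nhit_cal"] < 100:      # assumed direction of the nhit cut
        return False
    tags = event["dc_tags"]
    if not event["prompt"]:
        return not tags              # followers: only if untagged
    if not tags:
        return True                  # untagged prompt events: always
    if "flasher" in tags:
        return rng.random() < 0.1    # prompt flashers: 10%
    return True                      # all other prompt events


if __name__ == "__main__":
    ev = {"prompt": True, "dc_tags": {"neck"}, "nhit_cal": 250}
    print(should_submit(ev))
```

Passing a seeded `random.Random` instance as `rng` makes the flasher subsample reproducible between runs.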
2020-05-11  update utils/ folder to make a python package called sddm  tlatorre
This commit adds an sddm python package to the utils/ folder. This allows me to consolidate code used across all the various scripts. This package is now installed by default to /home/tlatorre/local/lib/python2.7/site-packages so you should add the following to your .bashrc file:

    export PYTHONPATH=$HOME/local/lib/python2.7/site-packages/:$PYTHONPATH

before using the scripts installed to ~/local/bin.
2020-01-13  update script to calculate contamination  tlatorre
This commit updates the dc script to calculate the instrumental contamination to now treat all 4 high level variables as correlated for muons. Previously I had assumed that the reconstructed radius was independent from udotr, z, and psi, but based on the corner plots it seems like the radius is strongly correlated with udotr. I also updated the plotting code when using the save command line argument to be similar to plot-fit-results.
2020-01-06  add script to calculate background contamination  tlatorre
This commit adds a script to calculate the background contamination using a method inspired by the bifurcated analysis method used in SNO. The method works by looking at the distribution of several high level variables (radius, udotr, psi, and reconstructed z position) for events tagged by the different data cleaning cuts and assuming that any background events which sneak past the data cleaning cuts will have a similar distribution (for certain backgrounds this is assumed and for others I will actually test this assumption; for more details see the unidoc). Then, by looking at the distribution of these high level variables for all the untagged events, we can use a maximum likelihood fit to determine the residual contamination.

There are also a few other updates to the plot-energy script:

- add a --dc command line argument to plot corner plots for the high level variables used in the contamination analysis
- add a fudge factor to the Ockham factor of 100 per extra particle
- fix a bug by correctly setting the final kinetic energy to the sum of the individual kinetic energies instead of just the first particle
- fix calculation of prompt events by applying at the run level
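The idea of the maximum likelihood fit can be illustrated with a heavily simplified sketch. It is not the dc script: it assumes a single binned variable, one background type, and known signal and background shapes, and it fits only the contamination fraction f in the mixture (1 - f) * p_sig + f * p_bkg. The function names and inputs are hypothetical. In this picture the background shape `p_bkg` would be taken from the tagged events, as described above, and both shapes must be strictly positive in every bin.

```python
import math


def neg_log_likelihood(f, bins, p_sig, p_bkg):
    """Negative log-likelihood of the untagged events for contamination f."""
    return -sum(math.log((1.0 - f) * p_sig[b] + f * p_bkg[b]) for b in bins)


def fit_contamination(bins, p_sig, p_bkg, tol=1e-7):
    """Maximum likelihood estimate of f in [0, 1].

    The negative log-likelihood is convex in f, so a ternary search
    converges to the minimum.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (neg_log_likelihood(m1, bins, p_sig, p_bkg)
                < neg_log_likelihood(m2, bins, p_sig, p_bkg)):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    p_sig = [0.5, 0.3, 0.2]   # assumed signal shape in three bins
    p_bkg = [0.1, 0.2, 0.7]   # assumed shape of the tagged background
    # Bin index of each untagged event (toy data).
    bins = [0] * 42 + [1] * 28 + [2] * 30
    print(fit_contamination(bins, p_sig, p_bkg))
```

The real analysis uses several correlated variables and several instrumental types, so the mixture has more components, but the fitted quantity plays the same role as f here.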