| Age | Commit message | Author |
|---|---|---|
|  | This commit fixes the chi2 analysis so that it is no longer biased.
Previously, the chi2 analysis pull plots showed a consistent bias. At
first, I thought this was due to the fact that the posterior wasn't
Gaussian, but even after switching to percentile plots based on the
algorithm outlined in "Validating Bayesian Inference Algorithms with
Simulation-Based Calibration", I was still seeing a bias. I finally
tracked it down to the fact that I was applying the energy scale
parameters to the data instead of the Monte Carlo. This commit therefore
updates the posterior to apply the energy scale parameters to the Monte
Carlo instead of the data. This has the slight disadvantage
that the final histograms will be binned in the biased energy, but
that's not really a big deal.
In addition, this commit contains several other updates:
- switch to plotting percentile plots based on the algorithm in
  "Validating Bayesian Inference Algorithms with Simulation-Based
  Calibration" instead of pull plots
- apply both the energy scale and resolution at the individual particle
  level, i.e. there is no longer an energy resolution term for electron
  + muon fits
- separate pull plots and coverage plots. Previously I was making both
  the p-value coverage plots and the pull plots at the same time.
  However, the pull plots shouldn't have anything to do with the GENIE
  weights whereas the p-value coverage plots should draw samples
  weighted by the GENIE weights. In addition, for the pull plots we draw
  new truth parameters on every iteration whereas for the p-value
  coverage plots we only draw them once.
- switch to using KDEMove() for the MCMC since I think it samples
  multimodal distributions a lot better than the default emcee move.
- I now correct for the reconstruction energy bias in plot-michel and
  plot-muons | 
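The percentile check mentioned in this commit can be pictured on a toy model. This is a minimal sketch, not the code in this commit: it assumes a conjugate Gaussian model (prior mu ~ N(0, 1), one observation y ~ N(mu, 1)) so that the exact posterior N(y/2, 1/2) can be sampled directly in place of an MCMC, and the names `sbc_percentiles` and `bias` are hypothetical. If the posterior is correct, the percentile of the true value among the posterior draws should be uniformly distributed; a bias like the one described above shows up as a shifted distribution.

```python
import random


def sbc_percentiles(n_trials=2000, n_samples=100, seed=0, bias=0.0):
    """Return the percentile of the true parameter within posterior draws.

    Toy model (an assumption for illustration): mu ~ N(0, 1), y ~ N(mu, 1),
    so the exact posterior is N(y / 2, 1 / 2). `bias` shifts the posterior
    mean to mimic a biased analysis.
    """
    rng = random.Random(seed)
    post_sigma = 0.5 ** 0.5
    percentiles = []
    for _ in range(n_trials):
        mu_true = rng.gauss(0.0, 1.0)   # draw a new truth parameter each trial
        y = rng.gauss(mu_true, 1.0)     # simulate one observation
        post_mean = y / 2.0 + bias
        draws = [rng.gauss(post_mean, post_sigma) for _ in range(n_samples)]
        # Percentile of the truth among the posterior draws.
        rank = sum(1 for d in draws if d < mu_true)
        percentiles.append(100.0 * rank / n_samples)
    return percentiles


def mean(values):
    return sum(values) / float(len(values))


unbiased = sbc_percentiles()
biased = sbc_percentiles(bias=0.5)
print("mean percentile, unbiased posterior: %.1f" % mean(unbiased))
print("mean percentile, biased posterior:   %.1f" % mean(biased))
```

With a correct posterior the mean percentile sits near 50; shifting the posterior mean pulls it away from 50, which is the signature to look for in the percentile plots.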
|  | This commit updates the estimate_errors() function so that it works
without a list of constraints and uses arrays of low and high bounds
passed in instead of hardcoded constraints.
I can now call this function from the chi2 analysis to get the stepsizes
before running the MCMC. | 
|  | - delete instrumental tags in both scripts since we already tag them in
  get_events()
- apply the same cuts to the MC as to the data
- add a comment about the fitted fraction
- quit if we don't have at least 1 type of each instrumental | 
|  | memory | 
|  | instrumentals
This commit updates the contamination analysis scripts to take into account the
fact that we only fit a fraction of some of the instrumental events.
Based on the recent rate at which my jobs have been running on the grid,
fitting all the events would take *way* too long. Therefore, I'm now planning
to only fit 10% of muon, flasher, and neck events. With this commit the
contamination analysis will now correctly take into account the fact that we
aren't fitting all the instrumental events. | 
|  | This commit updates the step size used for the MCMC in the contamination
analysis to 0.5 times the error returned by scanning near the minimum. I ran
some tests and this seemed to be pretty efficient compared to either the full
error or 0.1 times the error. I also reduced the number of workers to 10. | 
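One way to picture the step-size choice: scan a parameter away from the best fit until the negative log-likelihood rises by 0.5 (one sigma for a Gaussian likelihood), then scale that error by 0.5. The sketch below is an assumption about how such a scan could look, not the `estimate_errors()` function in this repository; `scan_error`, the toy `nll`, and the bounds are all hypothetical.

```python
def scan_error(nll, best, index, high, n_steps=1000):
    """Estimate a 1-sigma error on parameter `index` by scanning upward
    from the best fit until nll rises by 0.5, never going past `high`."""
    nll_min = nll(best)
    step = (high - best[index]) / float(n_steps)
    x = list(best)
    for i in range(1, n_steps + 1):
        x[index] = best[index] + i * step
        if nll(x) - nll_min >= 0.5:
            return x[index] - best[index]
    # nll never rose by 0.5 inside the bound: fall back to the distance
    # between the best fit and the bound.
    return high - best[index]


# Toy negative log-likelihood (hypothetical): two independent Gaussian
# parameters with sigma = 2 and sigma = 0.1.
def nll(x):
    return 0.5 * (x[0] / 2.0) ** 2 + 0.5 * ((x[1] - 1.0) / 0.1) ** 2


best = [0.0, 1.0]
highs = [10.0, 2.0]
errors = [scan_error(nll, best, i, highs[i]) for i in range(len(best))]
stepsizes = [0.5 * e for e in errors]   # 0.5 times the scanned error
print(stepsizes)
```

The factor of 0.5 is the value quoted in the commit message; the scan granularity (`n_steps`) only sets how precisely the error is located.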
|  | - fix Constraint.renormalize_no_fix() which could enter an infinite loop if the
  fixed parameter was greater than 1 - EPSILON
- don't divide by psi twice in get_events()
- only use prompt events and cut on nhit_cal < 100 | 
|  | This commit contains the following updates:
- remove the hack that got rid of low energy events in plot-energy; while
  writing the unidoc I realized it's no longer necessary now that we add +100
  to multi-particle fits
- update Ockham factor to use an energy resolution of 5%
- update submit-grid-jobs to submit jobs according to the following criteria:
    - always submit prompt events with no data cleaning cuts
    - submit 10% of prompt flasher events
    - submit all other prompt events
    - submit followers only if they have no data cleaning cuts
- update submit-grid-jobs to place the nhit cut of 100 on the calibrated nhit | 
|  | This commit adds an sddm python package to the utils/ folder. This allows me to
consolidate code used across all the various scripts. This package is now
installed by default to /home/tlatorre/local/lib/python2.7/site-packages so you
should add the following to your .bashrc file:
    export PYTHONPATH=$HOME/local/lib/python2.7/site-packages/:$PYTHONPATH
before using the scripts installed to ~/local/bin. | 
|  | This commit updates the dc script, which calculates the instrumental
contamination, to treat all 4 high level variables as correlated for muons. Previously I
had assumed that the reconstructed radius was independent from udotr, z, and
psi, but based on the corner plots it seems like the radius is strongly
correlated with udotr.
I also updated the plotting code used with the save command line argument to
be similar to plot-fit-results. | 
|  | This commit adds a script to calculate the background contamination using a
method inspired by the bifurcated analysis method used in SNO. The method works
by looking at the distribution of several high level variables (radius, udotr,
psi, and reconstructed z position) for events tagged by the different data
cleaning cuts and assuming that any background events which sneak past the data
cleaning cuts will have a similar distribution. For certain backgrounds this is
assumed, and for others I will actually test the assumption; see the unidoc for
more details. Then, by looking at the distribution of these high level
variables for all the untagged events we can use a maximum likelihood fit to
determine the residual contamination.
There are also a few other updates to the plot-energy script:
- add a --dc command line argument to plot corner plots for the high level
  variables used in the contamination analysis
- add a fudge factor of 100 per extra particle to the Ockham factor
- fix a bug by correctly setting the final kinetic energy to the sum of the
  individual kinetic energies instead of just the first particle
- fix calculation of prompt events by applying it at the run level |
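
To make the contamination fit described in this commit message concrete, here is a toy version. It is only a sketch under strong assumptions: a single binned variable instead of the four high level variables, one background class, a signal template assumed known, and a simple grid scan in place of a real minimizer. All names (`fit_contamination`, the templates, the counts) are hypothetical and not taken from the script.

```python
import math


def fit_contamination(counts, signal_pdf, background_pdf, n_grid=1000):
    """Return the background fraction that maximizes the binned log-likelihood.

    counts         -- observed untagged events per bin
    signal_pdf     -- assumed per-bin probabilities for signal events
    background_pdf -- per-bin probabilities taken from the tagged events
    """
    best_f, best_logl = 0.0, -float("inf")
    for i in range(n_grid + 1):
        f = i / float(n_grid)
        logl = 0.0
        for n, s, b in zip(counts, signal_pdf, background_pdf):
            # Untagged events are modeled as a signal/background mixture.
            p = (1.0 - f) * s + f * b
            if n > 0:
                if p <= 0.0:
                    logl = -float("inf")
                    break
                logl += n * math.log(p)
        if logl > best_logl:
            best_f, best_logl = f, logl
    return best_f


# Toy templates: background piles up in the last bin, signal in the first.
signal_pdf = [0.4, 0.3, 0.2, 0.1]
background_pdf = [0.1, 0.1, 0.2, 0.6]
# Counts built from a 10% background fraction in 1000 events.
counts = [370, 280, 200, 150]
fraction = fit_contamination(counts, signal_pdf, background_pdf)
print(fraction)
```

The key ingredient carried over from the description is that the background template comes from tagged events, so the fit only has to determine how much of that shape is present in the untagged sample.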