#!/usr/bin/env python
# Copyright (c) 2019, Anthony Latorre <tlatorre at uchicago>
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation, either version 3 of the License, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
# more details.
#
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <https://www.gnu.org/licenses/>.
"""
Script to combine the fit results from jobs submitted to the grid. It's
expected to be run from a cron job:

    PATH=/usr/bin:$HOME/local/bin
    SDDM_DATA=$HOME/sddm/src
    DQXX_DIR=$HOME/dqxx

    0 * * * * module load hdf5; module load py-h5py; module load zlib; cat-grid-jobs --loglevel debug --logfile cat.log --output-dir $HOME/fit_results

The script will loop through all entries in the database and try to combine the
fit results into a single output file.
"""

from __future__ import print_function, division
import os
import sys
import numpy as np
from datetime import datetime
import h5py
from os.path import join, split
from subprocess import check_call

DEBUG = 0
VERBOSE = 1
NOTICE = 2
WARNING = 3

class Logger(object):
    """
    Simple logger class that I wrote for the SNO+ DAQ. Very easy to use:

        log = Logger()
        log.set_logfile("test.log")
        log.notice("blah")
        log.warn("foo")

    The log file format is taken from the Redis log file format which is really
    nice since it shows the exact time and severity of each log message.
    """
    def __init__(self):
        self.logfile = sys.stdout
        # by default, we log everything
        self.verbosity = DEBUG

    def set_verbosity(self, level):
        if isinstance(level, int):
            self.verbosity = level
        elif isinstance(level, str):
            if level == 'debug':
                self.verbosity = DEBUG
            elif level == 'verbose':
                self.verbosity = VERBOSE
            elif level == 'notice':
                self.verbosity = NOTICE
            elif level == 'warning':
                self.verbosity = WARNING
            else:
                raise ValueError("unknown loglevel '%s'" % level)
        else:
            raise TypeError("level must be a string or integer")

    def set_logfile(self, filename):
        self.logfile = open(filename, 'a')

    def debug(self, msg):
        self.log(DEBUG, msg)

    def verbose(self, msg):
        self.log(VERBOSE, msg)

    def notice(self, msg):
        self.log(NOTICE, msg)

    def warn(self, msg):
        self.log(WARNING, msg)

    def log(self, level, msg):
        if level < self.verbosity:
            return

        c = '.-*#'
        pid = os.getpid()
        now = datetime.now()
        buf = now.strftime('%d %b %H:%M:%S.%f')[:-3]

        self.logfile.write('%d:%s %c %s\n' % (pid, buf, c[level], msg))
        self.logfile.flush()

log = Logger()
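
# With the format string above, a call like log.notice("blah") from pid 1234
# would write a line like the following to the log file:
#
#     1234:17 Sep 14:02:07.123 * blah
#
# where the severity character is '.', '-', '*' or '#' for debug, verbose,
# notice and warning respectively.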

# Check that a given file can be accessed with the correct mode.
# Additionally check that `fn` is not a directory, as on Windows
# directories pass the os.access check.
def _access_check(fn, mode):
    return (os.path.exists(fn) and os.access(fn, mode) and not os.path.isdir(fn))

def which(cmd, mode=os.F_OK | os.X_OK, path=None):
    """Given a command, mode, and a PATH string, return the path which
    conforms to the given mode on the PATH, or None if there is no such
    file.
    `mode` defaults to os.F_OK | os.X_OK. `path` defaults to the result
    of os.environ.get("PATH"), or can be overridden with a custom search
    path.
    """
    # If we're given a path with a directory part, look it up directly rather
    # than referring to PATH directories. This includes checking relative to the
    # current directory, e.g. ./script
    if os.path.dirname(cmd):
        if _access_check(cmd, mode):
            return cmd
        return None

    if path is None:
        path = os.environ.get("PATH", None)
        if path is None:
            try:
                path = os.confstr("CS_PATH")
            except (AttributeError, ValueError):
                # os.confstr() or CS_PATH is not available
                path = os.defpath
        # bpo-35755: Don't use os.defpath if the PATH environment variable is
        # set to an empty string

    # PATH='' doesn't match, whereas PATH=':' looks in the current directory
    if not path:
        return None

    path = path.split(os.pathsep)

    if sys.platform == "win32":
        # The current directory takes precedence on Windows.
        curdir = os.curdir
        if curdir not in path:
            path.insert(0, curdir)

        # PATHEXT is necessary to check on Windows.
        pathext = os.environ.get("PATHEXT", "").split(os.pathsep)
        # See if the given file matches any of the expected path extensions.
        # This will allow us to short circuit when given "python.exe".
        # If it does match, only test that one, otherwise we have to try
        # others.
        if any(cmd.lower().endswith(ext.lower()) for ext in pathext):
            files = [cmd]
        else:
            files = [cmd + ext for ext in pathext]
    else:
        # On other platforms you don't have things like PATHEXT to tell you
        # what file suffixes are executable, so just pass on cmd as-is.
        files = [cmd]

    seen = set()
    for dir in path:
        normdir = os.path.normcase(dir)
        if normdir not in seen:
            seen.add(normdir)
            for thefile in files:
                name = os.path.join(dir, thefile)
                if _access_check(name, mode):
                    return name
    return None

def splitext(path):
    """
    Like os.path.splitext() except it returns the full extension if the
    filename has multiple extensions, for example:

        splitext('foo.tar.gz') -> 'foo', '.tar.gz'
    """
    full_root, full_ext = os.path.splitext(path)
    while True:
        root, ext = os.path.splitext(full_root)
        if ext:
            full_ext = ext + full_ext
            full_root = root
        else:
            break

    return full_root, full_ext
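
# The queries in cat_grid_jobs() below assume a `state` table with at least
# the following columns (a sketch inferred from the SELECT statements; the
# actual schema is created by the job submission script and may contain more
# columns):
#
#     CREATE TABLE state (
#         filename    TEXT, -- original zdab filename
#         uuid        TEXT, -- unique id for the submission batch
#         submit_file TEXT, -- submit file for the grid job
#         state       TEXT  -- job state, e.g. 'SUCCESS'
#     );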

def cat_grid_jobs(conn, output_dir):
    zdab_cat = which("zdab-cat")

    if zdab_cat is None:
        log.warn("couldn't find zdab-cat in path!")
        return

    c = conn.cursor()

    results = c.execute('SELECT filename, uuid FROM state').fetchall()

    unique_results = set(results)

    for filename, uuid in unique_results:
        head, tail = split(filename)
        root, ext = splitext(tail)

        # First, find all hdf5 result files
        fit_results = c.execute("SELECT submit_file FROM state WHERE state = 'SUCCESS' AND filename = ? AND uuid = ?", (filename, uuid)).fetchall()
        fit_results = [fit_result_filename[0] for fit_result_filename in fit_results]
        fit_results = ['%s.hdf5' % splitext(fit_result_filename)[0] for fit_result_filename in fit_results]

        if len(fit_results) == 0:
            log.debug("No fit results found for %s (%s)" % (tail, uuid))
            continue

        output = join(output_dir,"%s_%s_fit_results.hdf5" % (root,uuid))

        if os.path.exists(output):
            with h5py.File(output,'r') as fout:
                if 'fits' in fout:
                    total_fits = fout['fits'].shape[0]

                    if total_fits >= len(fit_results):
                        log.debug("skipping %s because there are already %i fit results" % (tail,len(fit_results)))
                        continue

        # Next, we get the full event list along with the data cleaning word,
        # FTP position, FTK, and RSP energy from the original zdab and then
        # add the fit results.
        #
        # Note: We send stderr to /dev/null since there can be a lot of warnings
        # about PMT types and fit results
        with open(os.devnull, 'w') as f:
            log.debug("zdab-cat %s -o %s" % (filename,output))
            check_call([zdab_cat,filename,"-o",output],stderr=f)

        total_events = 0
        events_with_fit = 0
        total_fits = 0

        with h5py.File(output,"a") as fout:
            total_events = fout['ev'].shape[0]
            for fit_result_filename in fit_results:
                head, tail = split(fit_result_filename)
                with h5py.File(fit_result_filename,'r') as f:
                    if 'git_sha1' not in f.attrs:
                        log.warn("No git sha1 found for %s. Skipping..." % tail)
                        continue
                    # Check to see if the git sha1s match
                    if fout.attrs['git_sha1'] != f.attrs['git_sha1']:
                        log.debug("git_sha1 is %s for current version but %s for %s" % (fout.attrs['git_sha1'],f.attrs['git_sha1'],tail))
                    # get fits which match up with the events
                    valid_fits = f['fits'][np.isin(f['fits'][:][['run','gtid']],fout['ev'][:][['run','gtid']])]
                    # Add the fit results
                    fout['fits'].resize((fout['fits'].shape[0]+valid_fits.shape[0],))
                    fout['fits'][-valid_fits.shape[0]:] = valid_fits
                    events_with_fit += len(np.unique(valid_fits[['run','gtid']]))
                    total_fits += len(np.unique(f['fits'][:][['run','gtid']]))

        log.notice("%s_%s: added %i/%i fit results to a total of %i events" % (filename, uuid, events_with_fit, total_fits, total_events))

if __name__ == '__main__':
    import argparse
    import sqlite3

    parser = argparse.ArgumentParser(description="concatenate fit results from grid jobs into a single file")
    parser.add_argument("--db", type=str, help="database file", default=None)
    parser.add_argument('--loglevel',
                        help="logging level (debug, verbose, notice, warning)",
                        default='notice')
    parser.add_argument('--logfile', default=None,
                        help="filename for log file")
    parser.add_argument('--output-dir', default=None,
                        help="output directory for fit results")
    args = parser.parse_args()

    log.set_verbosity(args.loglevel)

    if args.logfile:
        log.set_logfile(args.logfile)

    home = os.path.expanduser("~")

    if args.db is None:
        args.db = join(home,'state.db')

    if args.output_dir is None:
        args.output_dir = home
    else:
        if not os.path.exists(args.output_dir):
            log.debug("mkdir %s" % args.output_dir)
            os.mkdir(args.output_dir)

    conn = sqlite3.connect(args.db)

    cat_grid_jobs(conn, args.output_dir)
    conn.close()