path: root/utils/submit-grid-jobs
Age         Commit message  Author
2020-04-13  speed up submit-grid-jobs a lot by only calling condor_q once  tlatorre
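The speed-up from a single condor_q call can be sketched roughly as follows. This is a minimal illustration and not the script's actual code: it assumes the whole queue is dumped once with `condor_q -long -attributes ClusterId,ProcId,JobStatus` and parsed into a dictionary, so that checking one job becomes a lookup instead of another subprocess call. The function names `parse_condor_q_long` and `query_all_jobs` are hypothetical.

```python
import re
import subprocess

# Subset of the standard HTCondor JobStatus codes.
JOB_STATUS = {1: "IDLE", 2: "RUNNING", 3: "REMOVED", 4: "COMPLETED", 5: "HELD"}

def parse_condor_q_long(text):
    """Parse `condor_q -long` style output into {(cluster, proc): status}.

    Assumes one "Attr = value" pair per line and a blank line between jobs.
    """
    jobs = {}
    for record in re.split(r"\n\s*\n", text.strip()):
        attrs = {}
        for line in record.splitlines():
            key, sep, value = line.partition("=")
            if sep:
                attrs[key.strip()] = value.strip()
        if "ClusterId" in attrs and "ProcId" in attrs:
            job_id = (int(attrs["ClusterId"]), int(attrs["ProcId"]))
            code = int(attrs.get("JobStatus", 0))
            jobs[job_id] = JOB_STATUS.get(code, "UNKNOWN")
    return jobs

def query_all_jobs():
    """Call condor_q once for the whole queue (hypothetical helper)."""
    output = subprocess.check_output(
        ["condor_q", "-long", "-attributes", "ClusterId,ProcId,JobStatus"],
        universal_newlines=True,
    )
    return parse_condor_q_long(output)
```

With the dictionary in hand, the status of every job in the database can be resolved from one query, and a job missing from the dictionary has simply left the queue.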
2020-01-21  fix a few bugs in submit-grid-jobs  tlatorre
2020-01-20  update submit-grid-jobs to not add new jobs for runs which are already in the database  tlatorre
Also add a -r or --reprocess command line option to reprocess runs which are already in the database.
2020-01-20  specify -attributes to speed up condor_q in get_job_status()  tlatorre
2019-12-04  update submit-grid-jobs and cat-grid-jobs  tlatorre
This commit updates submit-grid-jobs so that it keeps a database of jobs. This allows the script to make sure that we only have a certain number of jobs in the job queue at a single time and to automatically resubmit failed jobs. The idea is that it can now be run once to add jobs to the database:

    $ submit-grid-jobs ~/zdabs/SNOCR_0000010000_000_p4_reduced.xzdab.gz

and then be run periodically via crontab:

    PATH=/usr/bin:$HOME/local/bin
    SDDM_DATA=$HOME/sddm/src
    DQXX_DIR=$HOME/dqxx

    0 * * * * submit-grid-jobs --auto --logfile ~/submit.log

Similarly, I updated cat-grid-jobs so that it uses the same database and can also be run via a cron job:

    PATH=/usr/bin:$HOME/local/bin
    SDDM_DATA=$HOME/sddm/src
    DQXX_DIR=$HOME/dqxx

    0 * * * * cat-grid-jobs --logfile cat.log --output-dir $HOME/fit_results

I also updated fit so that it keeps track of the total time elapsed including the initial fits instead of just counting the final fits.
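The job-database idea can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name, the column names, the state labels (`NEW`, `QUEUED`, `FAILED`) and the retry limit are all hypothetical and are not taken from the script itself.

```python
import sqlite3

def create_db(conn):
    # Hypothetical schema: one row per grid job.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'NEW',"
        " nretry INTEGER NOT NULL DEFAULT 0)"
    )

def add_job(conn, filename):
    conn.execute("INSERT INTO jobs (filename) VALUES (?)", (filename,))

def jobs_to_submit(conn, max_queued, max_retries):
    """Return (id, filename) rows to submit without exceeding max_queued."""
    queued = conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE state = 'QUEUED'"
    ).fetchone()[0]
    free_slots = max(0, max_queued - queued)
    return conn.execute(
        "SELECT id, filename FROM jobs"
        " WHERE state = 'NEW' OR (state = 'FAILED' AND nretry < ?)"
        " ORDER BY id LIMIT ?",
        (max_retries, free_slots),
    ).fetchall()

def mark_submitted(conn, job_id):
    conn.execute("UPDATE jobs SET state = 'QUEUED' WHERE id = ?", (job_id,))

def mark_failed(conn, job_id):
    # Count the failure so a job is not resubmitted forever.
    conn.execute(
        "UPDATE jobs SET state = 'FAILED', nretry = nretry + 1 WHERE id = ?",
        (job_id,),
    )
```

A periodic run (for example from cron) would then update job states from the queue, call `jobs_to_submit()`, and submit only the rows it returns, which caps the number of queued jobs and picks failed jobs up again on a later pass.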
2019-11-19  update submit-grid-jobs to hopefully only run jobs on nodes which have modules  tlatorre
I noticed that many of my jobs were failing with the following error:

    module: command not found

My submit description files *should* only be selecting nodes with modules because of this line:

    requirements = (HAS_MODULES =?= true) && (OSGVO_OS_STRING == "RHEL 7") && (OpSys == "LINUX")

which I think I got from https://support.opensciencegrid.org/support/solutions/articles/12000048518-accessing-software-using-distributed-environment-modules. I looked up what the =?= operator does and it's a case sensitive search. I also found another site (https://support.opensciencegrid.org/support/solutions/articles/5000633467-steer-your-jobs-with-htcondor-job-requirements) which uses the normal == operator. Therefore, I'm going to switch to the == operator and hope that fixes the issue.
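For readers comparing the two operators, here is a toy Python model of how `==` and `=?=` differ in ClassAd expressions, as far as the HTCondor documentation describes them: `==` yields UNDEFINED when either operand is undefined and compares strings case-insensitively, whereas `=?=` always yields true or false and compares strings case-sensitively. The function names and the `UNDEFINED` sentinel are made up for this sketch, and mixed-type comparisons are deliberately not modelled.

```python
UNDEFINED = object()  # stand-in for the ClassAd UNDEFINED value

def classad_eq(a, b):
    """Toy model of '==': UNDEFINED propagates, strings ignore case."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    if isinstance(a, str) and isinstance(b, str):
        return a.lower() == b.lower()
    return a == b

def classad_meta_eq(a, b):
    """Toy model of '=?=': never UNDEFINED, strings are case-sensitive."""
    if a is UNDEFINED or b is UNDEFINED:
        # Only UNDEFINED is identical to UNDEFINED.
        return a is b
    return a == b

# A machine that advertises HAS_MODULES = true satisfies both forms.
print(classad_eq(True, True), classad_meta_eq(True, True))

# A machine that does not advertise HAS_MODULES at all:
print(classad_eq(UNDEFINED, True) is UNDEFINED)  # '==' gives UNDEFINED
print(classad_meta_eq(UNDEFINED, True))          # '=?=' gives False
```

In this model the two operators only disagree for machines where the attribute is missing or where string case differs.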
2019-08-28  update submit-grid-jobs to use my version of splitext  tlatorre
This commit updates the submit-grid-jobs script to use my version of splitext() which removes the full extension from the filename. This fixes an issue where the output HDF5 files had xzdab in the name whenever the input file had the file extension .xzdab.gz.
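A splitext() variant of this kind might look like the sketch below. It is an assumption about the behaviour described here, not the author's implementation, and the name `splitext_full` is hypothetical. The standard `os.path.splitext()` only removes the last suffix, which is what leaves `xzdab` in the output name.

```python
import os

def splitext_full(path):
    """Split path into (root, ext), where ext is the *full* extension.

    Everything from the first dot in the file name onwards counts as the
    extension; leading dots (hidden files) stay part of the root.
    """
    head, tail = os.path.split(path)
    stripped = tail.lstrip(".")
    leading = tail[: len(tail) - len(stripped)]
    first_dot = stripped.find(".")
    if first_dot == -1:
        return path, ""
    root = leading + stripped[:first_dot]
    return os.path.join(head, root), stripped[first_dot:]

name = "SNOCR_0000010000_000_p4_reduced.xzdab.gz"
print(os.path.splitext(name))  # standard library: only ".gz" is removed
print(splitext_full(name))     # sketch: ".xzdab.gz" is removed
```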
2019-08-05  add ability to specify a particle combo on the command line  tlatorre
This commit updates the fit program to accept a particle combo from the command line so you can fit for a single particle combination hypothesis. For example, running:

    $ ./fit ~/zdabs/mu_minus_700_1000.hdf5 -p 2020

would just fit for the 2 electron hypothesis. The reason for adding this ability is that my grid jobs were getting evicted when fitting muons in run 10,000, since it takes tens of hours to fit for all the particle hypotheses. With this change, and a small update to the submit-grid-jobs script, we now submit a single grid job per particle combination hypothesis, which should make each grid job run approximately 4 times faster.
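Submitting one job per hypothesis could be sketched as below. The helper names are hypothetical, and the decoding in `split_combo` rests on an assumption inferred from the single example above (2020 meaning two electrons): that a combo code is a concatenation of two-digit particle IDs. Only the `-p` flag and the `./fit` invocation come from the commit message.

```python
def split_combo(combo):
    """Split a combo code into two-digit particle IDs (assumed encoding)."""
    digits = str(combo)
    if len(digits) % 2:
        raise ValueError("expected an even number of digits: %r" % combo)
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]

def fit_commands(filename, combos):
    """Build one fit command line per particle combination hypothesis."""
    return [["./fit", filename, "-p", str(combo)] for combo in combos]

# 2020 is the only combo confirmed above; a real list would hold the
# full set of hypotheses to fit.
for cmd in fit_commands("mu_minus_700_1000.hdf5", [2020]):
    print(" ".join(cmd))
```

Each command can then go into its own submit description file, so an evicted job only loses the work for one hypothesis.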
2019-07-11  switch from YAML output to HDF5 to speed things up  tlatorre
2019-06-20  update zdab-cat to emit multiple YAML documents  tlatorre
This commit updates zdab-cat to output each event as an individual YAML document. The advantage of this is that we can then iterate over these without loading the entire YAML document in submit-grid-jobs which means that we won't use GB of memory on the grid submission node.
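The memory benefit of one YAML document per event can be illustrated with a small generator. This is a deliberately simplified, standard-library-only sketch with a hypothetical function name: it only recognises a bare `---` line as a document separator and yields raw text rather than parsed objects. In practice PyYAML's `yaml.safe_load_all()` provides the same one-document-at-a-time iteration with real parsing.

```python
import io

def iter_yaml_documents(stream):
    """Yield the text of each YAML document in `stream`, one at a time.

    Only the lines of the current document are held in memory.
    """
    lines = []
    for line in stream:
        if line.rstrip("\r\n") == "---":
            if lines:
                yield "".join(lines)
            lines = []
        else:
            lines.append(line)
    if lines:
        yield "".join(lines)

# Made-up two-event stream for illustration.
stream = io.StringIO("---\nev: 1\n---\nev: 2\n")
for doc in iter_yaml_documents(stream):
    print(repr(doc))
```

Because the generator never accumulates more than one event, the memory footprint stays flat no matter how many events the file contains.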
2019-06-20  update submit-grid-jobs to send stderr to /dev/null  tlatorre
2019-06-06  update submit-grid-jobs  tlatorre
This commit updates submit-grid-jobs so that it keeps track of which files it's already submitted grid jobs for.
2019-06-05  try to import CLoader if possible since it's *much* faster  tlatorre
2019-06-02  use yaml.loader.SafeLoader  tlatorre
2019-06-02  fix another bug in submit-grid-jobs  tlatorre
2019-06-02  use full path in submit-grid-jobs  tlatorre
2019-06-02  update submit-grid-jobs to create a new directory  tlatorre
2019-05-24  add a script to concatenate output from grid jobs  tlatorre
2019-05-24  add the template for condor submit files to submit-grid-jobs  tlatorre
2019-05-24  add a script to submit jobs to the grid  tlatorre