Instructions for skimming thumbnails: - Nov 20/2002
---------------------------------------------------
Some updates were made on 11/27 and are marked with the flag 11/27/02.

Here are detailed instructions on how to set up the code to skim thumbnails
and make root-trees (2 separate steps), and run the executables.

We'll first set up everything to run on d0mino. I chose d0mino over clued0
because, once you've practiced skimming a thumbnail residing on disk, I would
like you to submit a test job on SAM. All your code can reside on one of the
B project disks.

Once this is all done, you can follow the instructions in Heidi's document
on how to run on CAB. This will require you to build your code, which is
sitting on the B project disk, on one of the (linux) d0lxbldN machines. You
can then submit the Linux executable directly to CAB. We'll get to this step
later (some info at the end of this file).

ON d0mino:
----------
    setup D0RunII p13.03.00
    setup d0cvs
    cd /prj_root/882/ckm_1        (/prj_root/877/ckm_1 is another project disk)
    mkdir <your_dir>
    cd <your_dir>
    newrel p13.03.00 p1303_latest -> this makes a subdir p1303_latest with the
                                     latest release, also referred to as your
                                     work-area
    cd p1303_latest

for skimming thumbnails do the following:
-----------------------------------------
    addpkg thumbnail      - gets package thumbnail from CVS
    addpkg np_tmb_stream  - gets package np_tmb_stream from CVS
    (11/27/02 - you should get a recent version by doing
                addpkg np_tmb_stream vNNNNNN)
    (11/27/02 - the following instructions are based on
                np_tmb_stream v01-00-09)

You will now see sub-dirs called thumbnail and np_tmb_stream. The thumbnail
package controls the unpacking of the thumbnail chunk, whereas np_tmb_stream
is the package which reads the information in the thumbnail and writes out
events which satisfy our skimming criteria.

For the np_tmb_stream package, you have to get some stuff from my area.

from /prj_root/882/ckm_1/vj/p1303_latest/np_tmb_stream/rcp
    copy the following files:
    tags.rcp, write.rcp, D0TriggerFilter.rcp, dimuonTag.rcp, smuonTag.rcp,
    smuonWrite.rcp, dimuonWrite.rcp
    (these files have parameters set by me. To avoid information overload,
     I can explain them later, or you can ask me questions)

from /prj_root/882/ckm_1/vj/p1303_latest/np_tmb_stream/np_tmb_stream
    get ChPTag.hpp
    (11/27/02 - not necessary, since it is in the recent version of
     np_tmb_stream)

from /prj_root/882/ckm_1/vj/p1303_latest/np_tmb_stream/src
    get ChPTag.cpp, MuonTag.cpp, COMPONENTS, OBJECT_COMPONENTS, RegChPTag.cpp,
    ChPTag_t.cpp
    (11/27/02 - you only need MuonTag.cpp, since everything else is in the
     recent version of np_tmb_stream)

from /prj_root/882/ckm_1/vj/p1303_latest/np_tmb_stream/bin
    get OBJECTS
    (11/27/02 - not necessary, since it is in the recent version of
     np_tmb_stream)

For the thumbnail package, get the following file:
    /prj_root/882/ckm_1/vj/p1303_latest/thumbnail/rcp/UnpThumbNail.rcp
    (I only unpack the muons. This saves a lot of time)

    srt_setup SRT_QUAL=maxopt  --> this will make the optimized build
    d0setwa                    --> sets up your personal rcp database
    gmake np_tmb_stream.all    --> compiles and links this package

We use the default version of the thumbnail library.

If everything has gone well, you will see
    lib/IRIX6.5-KCC_4_0-maxopt/libnp_tmb_stream.a   (along with other files)
    bin/IRIX6.5-KCC_4_0-maxopt/TMBStream_x

With a couple more things, we are ready to run this executable.

Since you are working on d0mino and can see my area, make a file called
file.dat, which will contain the input thumbnail file:
    /prj_root/882/ckm_1/vj/recoT_all_0000166318_mrg_048-049.raw_p13.04.00

Now do,
    setup d0tools   --> (this allows you to do rund0exe)
followed by,
    rund0exe -exe=TMBStream_x -rcp=runTMBStream_extra.rcp \
        -rcppkg=np_tmb_stream -localrcp -localbuild -localfwkrcp \
        -name=vj_nseg -filelist=file.dat -batch

As you can see here, you have specified the executable name. Since you did
srt_setup SRT_QUAL=maxopt earlier, it knows to look in the relevant sub-dir.
You've specified the local executable by saying -localbuild.
Once you've built the library and executable, you can skip the srt_setup
command and use the option -maxopt instead, placed before, say, -batch.

Since you've said -localfwkrcp, it will look for the main steering rcp locally.
-rcp=runTMBStream_extra.rcp tells it which one.
-rcppkg=np_tmb_stream gives it the local area.
-localrcp tells it to use the rcps in the local area (rather than the
    default release versions).
-name is the jobname. All the results of this job will be found in the
    sub-dir of that name.
-filelist=file.dat gives it the input files.
-batch submits it on batch [default=short = 180 mins]. Other options are
    -batch -q=medium (= 300 mins), and -batch -q=large.
    If you skip the -batch flag, it will run interactively. Don't run
    interactively on anything more than 10 events or so.
-num=100 runs on only 100 events. This is a good way to make sure that
    everything is working. If you skip this option, it will run through
    all the files.

Make sure that in np_tmb_stream/rcp/TMBSAMManager.rcp, the parameter
UseSAM = 0

In the job sub-dir (vj_nseg here), smuon_bphys*.tmb_out and
dimuon_bphys*.tmb_out are the output files. The names were specified in
np_tmb_stream/rcp/smuonWrite.rcp and dimuonWrite.rcp (the output file name is
of the form smuon_bphys_%f, where %f is the name of the input file).

TMBStream_x.out will have the output. Look in this file and see if it makes
sense. It lists all the triggers (specified in
np_tmb_stream/rcp/D0TriggerFilter.rcp) that we are interested in.
np_tmb_stream only looks at events which pass these filters, which saves time.
You then see the cuts used to skim events. At the bottom of the file, you will
see output like

    D0TriggerFilter - processed NNNNN events
                      filtered MMMMM events (effect of D0TriggerFilter).
    WriteEvent: Closing dimuon_bphys*.tmb_out
    Events written: YYYY
    WriteEvent: Closing smuon_bphys*.tmb_out
    Events written: XXXX

This information should match events.write.
events.read gives you information on the number of files and events read.

Once you are happy with this, you can run a test job on SAM.
Make sure that in np_tmb_stream/rcp/TMBSAMManager.rcp, the parameter
UseSAM = 1

    srt_setup SRT_QUAL=maxopt   (do this OR use the -maxopt option below)
    setenv SAM_PROJECT vj_proc_v1-`date +%Y%m%d-%H%M%S`
                                (use your initials instead of vj)
    rund0exe -exe=TMBStream_x -rcp=runTMBStream.rcp -rcppkg=np_tmb_stream \
        -localbuild -localrcp -localfwkrcp -name=vj_newversion_fromSAM \
        -defname=vj_test_p1302_tmb -batch -fwkparams -num_files 10

This will run a job on queue sam_lo. It looks for 10 files in the dataset
vj_test_p1302_tmb. Your results will be in the sub-dir vj_newversion_fromSAM.
Look in /prj_root/882/ckm_1/vj/p1303_latest/vj_newversion_fromSAM for a run
I did, and see the events.read file.
You will also see files called smuon*.metadata.py, etc. These files will be
needed when I put the skimmed thumbnails back into SAM. That will come later.

If you use a dataset definition which has, say, 1000 files, and you say
-num_files 10, you will not be very efficient. SAM will stage all 1000 files,
but processing will stop after 10 files. If you are just testing, use a
dataset with a small number of files, like the one I have.

for making root-trees from the skimmed thumbnails
-------------------------------------------------
    addpkg thumbnail    - gets package thumbnail from CVS
    addpkg tmb_tree     - gets package tmb_tree from CVS
    addpkg tmb_analyze  - gets package tmb_analyze from CVS

I recommend that you make a work-area on a different disk or under a different
username than above. The reason is that thumbnail/rcp/UnpThumbNail.rcp is
different from the one above.
My testing was done on /prj_root/877/ckm_1/vj/p1303

This step is simpler, in that we will change a few rcps only. We don't need
to compile/link anything - we will use the default executable.
Get
    thumbnail/rcp/UnpThumbNail.rcp - now I unpack most of the thumbnail
    tmb_tree/rcp/TMBCorePkg.rcp    - I make most root-branches and two jet algos

Make a file like /prj_root/877/ckm_1/vj/p1303/dimuon_std.dat or smuon_chp.dat

    d0setwa   --> sets up your personal rcp database
                  (11/27/02 - probably don't need this step for rund0exe)
    setup d0tools
    rund0exe -exe=TMBAnalyze_x -rcp=runTMBTreeMaker.rcp -rcppkg=tmb_analyze \
        -localrcp -localfwkrcp -name=vj_smuon_std -filelist=smuon_chp.dat \
        -batch -q=medium -maxopt

Notice that we don't say -localbuild anymore. We will use the default version
of TMBAnalyze_x. In the sub-dir given by -name (vj_smuon_std here), you will
see the soft-link. The root-tree is vj_smuon_std/tmb_tree.root

Notice the use of the -maxopt flag.
Even though we haven't changed the framework rcp runTMBTreeMaker.rcp in our
local area, there is no harm in using the -localfwkrcp flag.

This executable takes much longer. On d0mino, it will take about 1-2 seconds
per event. On the Clued0 cluster (batch names are different) or CAB (we'll
come to that later), it will be about 10 times faster.
One reason I point this out is that you should choose the appropriate batch
queue. Also, I am told that if you run on too many events at once, your code
can crash. What is too many events? Don't run on more than 20K events.
There is a cute trick around this, if your input file has, say, 40K events.

    rund0exe -exe=TMBAnalyze_x -rcp=runTMBTreeMaker.rcp -rcppkg=tmb_analyze \
        -localrcp -localfwkrcp -name=vj_smuon_std_file1 \
        -filelist=smuon_chp.dat -num=20000 -batch -q=large -maxopt

followed by another job,

    rund0exe -exe=TMBAnalyze_x -rcp=runTMBTreeMaker.rcp -rcppkg=tmb_analyze \
        -localrcp -localfwkrcp -name=vj_smuon_std_file2 \
        -filelist=smuon_chp.dat -num=20000 -skip=20000 -batch -q=large -maxopt

So, the first job analyzes the first 20K events, and the second job will skip
the first 20K events in the input file and analyze the rest.

Whew! Let me know if you have questions.
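The -num/-skip trick above generalizes to any number of events. Here is a
minimal sketch, in Python, that only prints the rund0exe command lines for you
to paste; it does not submit anything. The helper names (chunk_ranges,
rund0exe_commands) are made up for this illustration and are not part of
d0tools. The flags are copied from the two commands above, and you have to
supply the total number of events yourself (e.g. from events.write of the
skimming job).

```python
def chunk_ranges(total_events, chunk_size=20000):
    """Return (skip, num) pairs that cover total_events in chunks.

    chunk_size defaults to the 20K-event limit mentioned in the text.
    """
    if total_events < 0 or chunk_size <= 0:
        raise ValueError("total_events must be >= 0 and chunk_size > 0")
    return [(skip, min(chunk_size, total_events - skip))
            for skip in range(0, total_events, chunk_size)]


def rund0exe_commands(total_events, filelist, name_prefix, chunk_size=20000):
    """Build one rund0exe command line per chunk (hypothetical helper)."""
    # Fixed part of the command, copied from the examples in this file.
    base = ("rund0exe -exe=TMBAnalyze_x -rcp=runTMBTreeMaker.rcp "
            "-rcppkg=tmb_analyze -localrcp -localfwkrcp")
    commands = []
    for i, (skip, num) in enumerate(chunk_ranges(total_events, chunk_size),
                                    start=1):
        parts = [base,
                 f"-name={name_prefix}_file{i}",
                 f"-filelist={filelist}",
                 f"-num={num}"]
        if skip > 0:
            # The first job starts at event 0, so it needs no -skip.
            parts.append(f"-skip={skip}")
        parts += ["-batch", "-q=large", "-maxopt"]
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    # 40K events in smuon_chp.dat, as in the example above.
    for cmd in rund0exe_commands(40000, "smuon_chp.dat", "vj_smuon_std"):
        print(cmd)
```

For 40000 events this produces the same two jobs as above (vj_smuon_std_file1
and vj_smuon_std_file2). For a total that is not a multiple of 20000, the last
job gets only the remaining events.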
Here are some instructions to look at the resulting root tree.

Read tmb_analyze/macros/README.txt
Use the macros that I have in
/prj_root/877/ckm_1/vj/p1303/bin/IRIX6.5-KCC_4_0-maxopt/

For an initial test, do the following:

    root
    .x LoadTMBTree_so.C
    .L MYTreeClass_C.so   - this is a pre-made shared object
  OR
    .L MYTreeClass.C++    - takes a while to compile the code and makes
                            MYTreeClass_C.so. Do this if something changes.
    TMBZee t("dimuon.root")    - ideally, the name of the class "TMBZee"
                                 should be the same as the file name
                                 (MYTreeClass.C)
    t.Loop("test.root",3,2.0)  - look at MYTreeClass.h and MYTreeClass.C
                                 for these arguments

This will print out some info and make a new root file, test.root, which has
some histograms in it. The only way to look at it is to exit root and then do
root test.root. At the moment, this macro looks at muon and charged particle
information only.

How to run on CAB
------------------
1) Running TMBStream_x on CAB, where it gets its input from SAM.
   (IMPORTANT: read documentation at
    http://www.nuhep.nwu.edu/~schellma/cab/cab_doc_v2.html)
   (You have to sign up to use CAB)

on d0mino do the following:

    setup d0tools -t
    setup sam -q cab
    srt_setup SRT_QUAL=maxopt
    setenv SAM_PROJECT vj_proc_v1-`date +%Y%m%d-%H%M%S`
                                (use your initials instead of vj)

In the following example, I am running on a dataset in SAM called
bphysics_p1302_tmb_nov20_notprocessed. At the moment, this dataset has no
files in it, but if you've got to this point, talk to me, and I'll make up
something for you (or show you what to do).

    rund0exe -exe=TMBStream_x -rcp=runTMBStream.rcp -rcppkg=np_tmb_stream \
        -localbuild -localrcp -localfwkrcp -maxopt \
        -name=vj_p1302_nov20_CAB \
        -defname=bphysics_p1302_tmb_nov20_notprocessed -cab \
        -scratch=/prj_root/882/ckm_1/vj/scratch -jobs=10 \
        -jobname=vj_nov20_CAB \
        -caboutpath=/prj_root/882/ckm_1/vj/skim_results \
        -fwkparams -num_files 30

Wow! There are a lot of arguments!
Some of the new ones (all of them are explained in Heidi's documentation):

-cab         tells it to run on CAB
-scratch     is where the tarball will be kept
-jobs=10     tells the system to submit 10 parallel jobs. Each job will get
             files from the dataset specified in -defname
-jobname     helps you find the job. Use your name or something like that.
-caboutpath  tells it where the output will go
-name        an old argument, but it is used in a slightly different way:
             it will make a subdirectory under the area specified by
             -caboutpath. Because you've said -jobs=10, there will be
             10 sub-dirs. E.g., in my case I got sub-dirs like
             /prj_root/882/ckm_1/vj/skim_results/vj_p1302_nov20_CAB-3638
             /prj_root/882/ckm_1/vj/skim_results/vj_p1302_nov20_CAB-3639
             The numbers 3638, 3639 are given by CAB.
-fwkparams -num_files 30
             says that each job will get a maximum of 30 files.

After you've done this once, there is another flag you can use for subsequent
submissions:
-cabtar=     Specify the location and name of the tarball made in the
             previous step, and CAB will not remake the tarball. This saves
             time.

Remember not to have too much junk below the sub-dir where you issue the
rund0exe command (it will all go into the tarball!).

2) Running TMBAnalyze_x to make root-trees from thumbnails skimmed in step (1)

    rund0exe -exe=TMBAnalyze_x -rcp=runTMBTreeMaker.rcp -rcppkg=tmb_analyze \
        -localrcp -localfwkrcp \
        -filelist=smuon_recoT_all_0000168136_mrg_125-125.dat -maxopt -cab \
        -scratch=/scratch/7/zhangxj \
        -caboutpath=/projects/877/ckm_1/zhangxj/output_root_trees/smuon_1 \
        -jobname=test_cab

At the moment, we cannot parallelize jobs running on disk files. So, it is one
job per set of input files.
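Since it is one job per set of input files, one way to spread disk files over
several jobs is to prepare several filelists and submit one rund0exe job per
filelist by hand. The following is a minimal sketch in Python, not part of
d0tools. The helper names are made up for illustration, and it assumes that a
filelist (.dat) holds one thumbnail path per line, which the text above does
not spell out for more than one file.

```python
import os


def split_filelist(paths, files_per_job):
    """Split a list of thumbnail paths into groups of at most files_per_job."""
    if files_per_job <= 0:
        raise ValueError("files_per_job must be > 0")
    return [paths[i:i + files_per_job]
            for i in range(0, len(paths), files_per_job)]


def write_filelists(paths, files_per_job, prefix, out_dir="."):
    """Write one <prefix>_<n>.dat per group and return the file names.

    Assumption: one input path per line in each .dat file.
    """
    names = []
    groups = split_filelist(paths, files_per_job)
    for n, group in enumerate(groups, start=1):
        name = os.path.join(out_dir, f"{prefix}_{n}.dat")
        with open(name, "w") as fh:
            fh.write("\n".join(group) + "\n")
        names.append(name)
    return names


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    example = ["skim_a.tmb_out", "skim_b.tmb_out", "skim_c.tmb_out"]
    for group in split_filelist(example, 2):
        print(group)
```

Each resulting .dat file would then be passed to its own rund0exe command via
-filelist, with a different -jobname (and -caboutpath, if you want the outputs
kept apart).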