Plotting Pulses With Cuts

Below are all of our import statements. You probably won't need to mess with these, but if you want to use a new package put it here!

In [1]:
%matplotlib inline
import h5py
import numpy as np
import matplotlib.pyplot as plt
import helperfunctions as hf
from collections import defaultdict
from itertools import product

Convert a raw data file to a set of pulse blocks

Be sure to change the run number after -r in both lines to the run you want to study.

In [ ]:
%run ../Scripts/convert_hdf5_raw.py -r Run_1323
%run ../Physics/pulseAnalysis_shapetime.py -r Run_1323

Get the directories we store our processed pulses and pulse blocks in

dataDir contains our processed, uninterlaced pulses
blocksDir contains our interlaced pulse blocks

In [3]:
run = 'Run_1323'
dataDir = hf.checkDir(hf.getRootDir()+"/Data/Processed/Pulses/"+run+"/")
blocksDir = hf.checkDir(hf.getRootDir()+"/Physics/Blocks_Final/"+run+"/")

Here we print out what amplitudes are available in this dataset, as well as which amplitude we are currently looking at

In [4]:
amps = hf.getAmps(dataDir)
print('The available amplitudes are:', ', '.join(amps))
amp = amps[0] # if your data has more than one amplitude, changing 0 will allow you to look at those too
print('The amplitude you are using is: ' + amp)
The available amplitudes are: 0p2, 0p5, 1p0
The amplitude you are using is: 0p2

Looking through the data and its attributes

One benefit of saving our data in the HDF5 format (.hdf5 files) is that we can store attributes describing the testboard configuration alongside the data as we take it. This lets us see how the testboard was set up just by looking at the data file.

Let's load up our data and see what attributes are saved in it

In [5]:
inFile = h5py.File(blocksDir+"Blocks_Amp"+amp+"_1x.hdf5", "r")
print([attr for attr in inFile.attrs])
['adc_freq', 'attribute_combos', 'awg_freq', 'block_length', 'dac_ibi_g20', 'hg_lg_c2', 'n_adcs', 'n_channels', 'on_g20', 'shaper_constants', 'sw_ibo_g20']

We can look at any one of these attributes by indexing inFile.attrs like a dictionary. Let's check the ADC sampling frequency, in MHz.

In [6]:
print(inFile.attrs['adc_freq'])
40

Some attributes are saved every time a measurement is taken, so they hold one value per measurement. One example is dac_ibi_g20:

In [7]:
print(inFile.attrs['dac_ibi_g20'])
['63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63' '63'
 '63' '63' '63' '63' '63' '63' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30' '30'
 '30' '30' '30' '30' '30' '30' '30' '30' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '12'
 '12' '12' '12' '12' '12' '12' '12' '12' '12' '12' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0' '0'
 '0' '0' '0' '0']

That's a lot to look at, so let's instead just look at the unique values of dac_ibi_g20

In [8]:
print(np.unique(inFile.attrs['dac_ibi_g20']))
['0' '12' '30' '63']
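
If you also want to know how many measurements were taken at each setting, np.unique can count them for you. The sketch below uses a small made-up array (dac_values is a hypothetical stand-in for inFile.attrs['dac_ibi_g20'], not real data):

```python
import numpy as np

# Hypothetical stand-in for inFile.attrs['dac_ibi_g20']: one string per measurement
dac_values = np.array(['63'] * 5 + ['30'] * 3 + ['12'] * 2 + ['0'] * 4)

# return_counts=True also reports how often each unique value occurs
values, counts = np.unique(dac_values, return_counts=True)
counts_by_value = dict(zip(values.tolist(), counts.tolist()))
print(counts_by_value)
```

Note that the values are strings, so np.unique sorts them alphabetically rather than numerically (a value like '100' would be listed before '12').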

Now that we have datasets with multiple configurations, the main thing we want to do is produce plots comparing them. To do this, we first need to see which HDF5 attributes were varied as we took data. While we're at it, let's also print out all the values these attributes can take.

In [9]:
# First loop through all the attributes saved in inFile and save the ones with more than 1 unique value
cut_attrs = [attr for attr in inFile.attrs if len(np.unique(inFile.attrs[attr])) > 1 and attr != 'attribute_combos']
# Next, loop through these attributes and save the unique values they can take
cut_combos = [np.unique(inFile.attrs[attr]).tolist() for attr in cut_attrs]
print('The attributes we can vary are:', cut_attrs)
print('Their values are:', cut_combos)
The attributes we can vary are: ['dac_ibi_g20', 'on_g20', 'sw_ibo_g20']
Their values are: [['0', '12', '30', '63'], ['False', 'True'], ['False', 'True']]

Making Plots

Now that we know what we're working with, we can start making comparison plots. The code that produces these is rather tricky, but if we walk through it step by step it should start to make sense:

First we use the product function from Python's itertools module to get every possible combination of our cut attribute values (their Cartesian product). This could be accomplished by nesting several for loops, but if we were varying 10 different parameters we'd need 10 nested loops.

In [10]:
print(list(product(*cut_combos)))
[('0', 'False', 'False'), ('0', 'False', 'True'), ('0', 'True', 'False'), ('0', 'True', 'True'), ('12', 'False', 'False'), ('12', 'False', 'True'), ('12', 'True', 'False'), ('12', 'True', 'True'), ('30', 'False', 'False'), ('30', 'False', 'True'), ('30', 'True', 'False'), ('30', 'True', 'True'), ('63', 'False', 'False'), ('63', 'False', 'True'), ('63', 'True', 'False'), ('63', 'True', 'True')]
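
To make the connection with nested loops concrete, here is a small sketch that builds the same combinations both ways. The list combos is a hypothetical, shortened version of cut_combos:

```python
from itertools import product

# Hypothetical value lists, shaped like cut_combos in the text
combos = [['0', '63'], ['False', 'True'], ['False', 'True']]

# Three nested loops, one per varied attribute
nested = []
for dac in combos[0]:
    for on in combos[1]:
        for sw in combos[2]:
            nested.append((dac, on, sw))

# product gives the same tuples in the same order, for any number of lists
from_product = list(product(*combos))
same = nested == from_product
print(same, len(from_product))
```

The last list varies fastest, just like the innermost loop, and the product version keeps working unchanged if another attribute is added to combos.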

Next up is the first line inside the for loop we will write further down, where we define cut_index. Let's work through it with an example.

Let's take the third product of our cut attributes

In [11]:
prod = list(product(*cut_combos))[2]
print(prod)
print(cut_attrs) # print these so we know what each value corresponds to
('0', 'True', 'False')
['dac_ibi_g20', 'on_g20', 'sw_ibo_g20']

Now we're going to compare the value of each attribute to the value listed in prod, which will tell us which measurements had each of these attribute values. This will result in arrays of True and False, where True means the measurement had the attribute value we are looking at now

In [12]:
print( np.array( [inFile.attrs[cut_attrs[i]] == prod[i] for i in range(len(prod))]) )
[[False False False ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]]

Each row in this array corresponds to an attribute we varied, and each column corresponds to a different measurement during data taking.
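
The same comparison can be tried on a toy dataset where the result is easy to check by eye. This is only a sketch: attrs is a hypothetical dict standing in for inFile.attrs, with four made-up measurements.

```python
import numpy as np

# Hypothetical per-measurement attributes (4 measurements), stored as strings
attrs = {
    'dac_ibi_g20': np.array(['63', '63', '0', '0']),
    'on_g20': np.array(['True', 'False', 'True', 'True']),
}
names = list(attrs)
wanted = ('0', 'True')  # one combination, in the same order as names

# Comparing an array with a single string gives one boolean per measurement
masks = np.array([attrs[name] == value for name, value in zip(names, wanted)])
print(masks.shape)
print(masks)
```

Because the attribute values are stored as strings, the value being compared against must be a string too ('True', not the Python boolean True).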

Now that we know which measurements match each individual attribute, we combine the rows with a logical AND to find the measurements where every attribute matches at once.

In [13]:
cut_index = np.logical_and.reduce(np.array( [inFile.attrs[cut_attrs[i]] == prod[i] for i in range(len(prod))]))
print( cut_index )
[False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True]
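
For a smaller picture of what np.logical_and.reduce does, here is a sketch on a hand-written array of masks (hypothetical values, three attributes and four measurements):

```python
import numpy as np

# Hypothetical masks: one row per attribute, one column per measurement
masks = np.array([
    [False, False, True, True],
    [True, False, True, True],
    [True, True, True, False],
])

# AND down the rows: a column survives only if every attribute matched
combined = np.logical_and.reduce(masks)  # reduces along axis 0 by default
same_as_all = np.array_equal(combined, masks.all(axis=0))
print(combined, same_as_all, int(combined.sum()))
```

Only the third measurement passes all three cuts here. Calling masks.all(axis=0) gives the same result, and summing the combined mask is a quick way to count how many measurements pass.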

So we have this big array of True and False values, now what? NumPy supports boolean indexing: you can index an array with a boolean array of the same length, and you get back only the elements where the indexing array is True. Let's see a quick example:

In [14]:
test_array = np.array([0,1,2,3,4])
test_index = [True, True, False, False, True]
print( test_array[test_index] )
[0 1 4]
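
The same idea works on a two-dimensional array, which is how cut_index is used later on. The sketch below assumes a layout of one row per measurement and one column per sample; samples and mask are made-up values:

```python
import numpy as np

# Hypothetical samples: 4 measurements (rows) x 3 samples each (columns)
samples = np.arange(12).reshape(4, 3)
mask = np.array([False, True, False, True])  # one entry per measurement

selected = samples[mask]            # keeps rows 1 and 3
mean_pulse = selected.mean(axis=0)  # average over the selected measurements
print(selected.shape, mean_pulse)
```

A one-dimensional boolean mask applied to a two-dimensional array selects whole rows, so each selected measurement keeps all of its samples.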

We also want to check that at least one measurement actually matches the particular combination of values we've chosen, which we can do using Python's built-in any function.

In [15]:
print( any(cut_index) )
True

If we were to instead look at the first combination of attributes python produced, we would find that we didn't actually take any measurements with that particular setup.

  • To see how well you understood what we just walked through, verify that this is the case now!

The rest of this bit of code is packaging the data up into a dict and then plotting them all on one set of axes

In [16]:
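# Read the samples once, instead of re-reading the dataset for every combination
samples = inFile["/coluta1/channel2/samples"][()]

pulse_data = {}
for prod in product(*cut_combos):
    # True for measurements where every varied attribute matches this combination
    cut_index = np.logical_and.reduce(np.array( [inFile.attrs[cut_attrs[i]] == prod[i] for i in range(len(prod))] ))
    if not any(cut_index): continue  # skip combinations that were never measured

    # Label for the legend and dict key, e.g. 'dac_ibi_g20 0; on_g20 True; sw_ibo_g20 False'
    cut_attrs_str_title = "; ".join([cut_attrs[i]+' '+str(prod[i]) for i in range(len(prod))])

    # Keep at most the first 1200 matching measurements
    pulse_data[cut_attrs_str_title] = samples[cut_index][0:1200]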

for name, data in pulse_data.items():
    plt.plot(np.mean(data, axis = 0), label=name)
plt.legend()
plt.title('COLUTA 1, Channel 2, Pulse Comparison, Amplitude ' + amp)
plt.show()

Instead of plotting all 6 combinations at once, let's compare the pulse with the Gain 20 amplifier on and off, keeping the other attributes at their defaults.

First let's print out the dict keys so we know how to access the data

In [17]:
print( list(pulse_data.keys()) )
['dac_ibi_g20 0; on_g20 True; sw_ibo_g20 False', 'dac_ibi_g20 12; on_g20 True; sw_ibo_g20 False', 'dac_ibi_g20 30; on_g20 True; sw_ibo_g20 False', 'dac_ibi_g20 63; on_g20 False; sw_ibo_g20 False', 'dac_ibi_g20 63; on_g20 True; sw_ibo_g20 False', 'dac_ibi_g20 63; on_g20 True; sw_ibo_g20 True']
In [18]:
plt.plot(np.mean(pulse_data['dac_ibi_g20 63; on_g20 False; sw_ibo_g20 False'], axis = 0), 
        label = 'dac_ibi_g20 63; on_g20 False; sw_ibo_g20 False')
plt.plot(np.mean(pulse_data['dac_ibi_g20 63; on_g20 True; sw_ibo_g20 False'], axis = 0), 
        label = 'dac_ibi_g20 63; on_g20 True; sw_ibo_g20 False')
plt.title('COLUTA 1, Channel 2, on_g20 Comparison, Amplitude ' + amp)
plt.legend()
plt.show()

So turning on_g20 on makes the pulse amplitude higher, as would be expected when we turn on an amplifier. But does it look like the gain is actually 20?

  • Determine the gain of the Gain 20 amplifier by comparing the heights of the two pulses above. Be sure to take into account the fact that the baseline for each pulse is not at 0, but is significantly higher!
  • You should find that the gain is much less than 20. Edit the above code to also save data from channel 1, and then compare that pulse height to the pulse height of channel 2. Is this ratio closer to 20?
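
If you are unsure how to handle the baseline, here is one possible approach, shown on synthetic pulses rather than real data. The helper pulse_height and its n_baseline parameter are hypothetical; the sketch assumes a positive-going pulse and that the first few samples contain only baseline.

```python
import numpy as np

def pulse_height(mean_pulse, n_baseline=20):
    """Peak height above the baseline, estimated from the first n_baseline samples."""
    baseline = np.mean(mean_pulse[:n_baseline])
    return np.max(mean_pulse) - baseline

# Synthetic mean pulses (not real data): flat baseline plus a Gaussian bump
t = np.arange(200)
shape = np.exp(-0.5 * ((t - 100) / 10.0) ** 2)
pulse_off = 1000.0 + 50.0 * shape   # baseline 1000, height 50
pulse_on = 1200.0 + 200.0 * shape   # baseline 1200, height 200

gain = pulse_height(pulse_on) / pulse_height(pulse_off)
naive = np.max(pulse_on) / np.max(pulse_off)  # ignores the baseline
print(gain, naive)
```

In this toy case the baseline-subtracted ratio recovers the factor of 4 that was built in, while the naive ratio of the maxima is far too small. For the real data you would pass in np.mean(pulse_data[key], axis = 0) for each of the two keys and check that the chosen baseline window really is flat.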

Let's plot the noise for both of these pulses too:

In [19]:
plt.plot(np.std(pulse_data['dac_ibi_g20 63; on_g20 False; sw_ibo_g20 False'], axis = 0))
plt.title('COLUTA 1, Channel 2 Noise, on_g20 False, Amplitude ' + amp)
plt.show()
plt.plot(np.std(pulse_data['dac_ibi_g20 63; on_g20 True; sw_ibo_g20 False'], axis = 0))
plt.title('COLUTA 1, Channel 2 Noise, on_g20 True, Amplitude ' + amp)
plt.show()

These two plots are very different, and they're both different from the noise plot we saw in the previous notebook.
Here are a few ideas for how to investigate further:

  • Plot both of these noise distributions alongside their corresponding pulses and pulse derivatives
  • Look at what happens to the pulse and the noise when you vary other parameters. There are two more parameters that were varied during data taking in this run, so repeat the above analysis and plot-making for both of these
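
As a starting point for the first idea, the sketch below computes the three curves (mean pulse, noise, and pulse derivative) for one set of measurements. The array data is synthetic and only stands in for one entry of pulse_data; the pulse shape and noise level are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one pulse_data entry: 500 measurements x 200 samples
t = np.arange(200)
ideal = 1000.0 + 100.0 * np.exp(-0.5 * ((t - 100) / 10.0) ** 2)
data = ideal + rng.normal(scale=2.0, size=(500, t.size))

mean_pulse = data.mean(axis=0)        # average pulse
noise = data.std(axis=0)              # per-sample spread across measurements
derivative = np.gradient(mean_pulse)  # slope of the average pulse, per sample

print(mean_pulse.shape, noise.shape, derivative.shape)
```

All three arrays have one value per sample, so they can be drawn with plt.plot on shared or stacked axes to compare where the noise peaks relative to the rising and falling edges of the pulse.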

What now?

At this point you've seen how to load in a data file, loop through measurements, make cuts based on various attributes, and make comparison plots. Now you can look at pulse features you might have noticed during these two notebooks, make plots comparing pulses in different ways, or look at pulses from newer data-taking runs.

In [ ]: