Decoding digitizer data¶
The primary function for data conversion into raw-tier LH5 files is
build_raw.build_raw(). This is a one-to many function: one input DAQ
file can generate one or more output raw files. Control of which data ends up
in which files, and in which HDF5 groups inside of each file, is controlled via
raw_buffer.
Currently we support the following DAQ data formats:
ORCA, reading out:
FlashCam
The examples in the following are based on the FlashCam (via ORCA) decoders.
Configuration¶
Basic usage of build_raw() requires zero configuration:
from daq2lh5 import build_raw
build_raw("daq-data.ext", out_spec="raw-data.lh5")
daq2lh5 will autodetect the DAQ format (if not, the in_stream_type is your
friend), decode all the data it can and save it to an LH5 file named
raw-data.lh5. The data in the output file is organized by record types
(e.g. event stream, DAQ hardware status, configuration, etc.)
Tip
Check the build_raw() documentation for a full list of useful options.
When the out_spec argument is a dictionary or a string ending with .json
or .yaml, it is interpreted as a configuration dictionary or a file
containing it, respectively. Technically, this dictionary configures a
RawBufferLibrary.
Tip
The full configuration format specification is documented in depth in
raw_buffer.RawBufferLibrary.set_from_dict().
Let’s use the following configuration file as an example:
raw-out-spec.yaml¶ 1 ORFlashCamWaveformDecoder:
2 "group1-{key:07d}/raw":
3 key_list:
4 - [1, 3]
5 - 9
6 out_stream: "{filename}"
7
8 "group2-{key:07d}/raw":
9 key_list:
10 - [11, 13]
11 out_stream: "{filename}"
12
13 OrcaHeaderDecoder:
14 header-data:
15 key_list: ["*"]
16 out_stream: "{filename}"
17
18 "*":
19 "extra/{name}":
20 key_list: ["*"]
21 out_stream: "extra.lh5"
The first-level keys specify the names of the
DataDecoder-derived classes to be used in the
decoding. In the example above,
ORFlashCamWaveformDecoder and
OrcaHeaderDecoder. The user can also use just
*, which matches any other decoder known to legend-daq2lh5.
The second-level dictionary keys are the names used to label the decoded
objects (RawBuffers) in the output file. These
string can include format specifiers for
variable expansion (see next section). The first key in
ORFlashCamWaveformDecoder, for example, will result in data being written
to group1-0000001/raw, group1-0000002/raw etc., depending on the value
of key. The computed label is stored in a variable called name, which
can be expanded in other configuration fields.
Note
If the first-level key is *, name is expanded to the data decoder
name instead of the raw buffer name. The last configuration block from the
example will result in data from the e.g. AuxDecoder1 decoder being
written as extra/AuxDecoder1 in the output file extra.lh5.
The first fundamental configuration inside this block is key_list. In this
context, “keys” refer to the labels used by the specific data decoder for
DAQ “streams” or “channels”. The key_list list can be effectively use to
select channels to be decoded. Examples of possible values:
[1, 3, 5]: channels 1, 3 and 5[[1, 7]]: all channels from 1 to 7["*"]: all available channels
During decoding, the value of the current key is stored in the variable
key, which can be expanded in other configuration fields. This feature
allows, as seen above, to label channel data individually and programmatically.
The second configuration block, for example, in ORFlashCamWaveformDecoder
will result in data from channels 11, 12, and 13 to be written as
group2-0000011/raw, group2-0000012/raw and group2-0000013/raw.
The second configuration field is out_stream, i.e. the output stream to
which the data should be written. A colon (:) can be used to separate the
stream name or address from an in-stream path or port. Examples:
LH5 file and group:
/path/filename.lh5:/groupSocket and port:
198.0.0.100:8000Variable to be expanded:
{filename}
Variable expansion¶
As mentioned, the build_raw() configuration supports variable expansion through
the format string syntax. The two
predefined variables are key and name, but any other variable can be
expanded by passing its value to build_raw() as keyword argument. For example,
for the the configuration shown above, filename must be defined like this:
build_raw("daq-data.orca", out_spec="raw-out-spec.yaml", filename="raw-data.lh5")
Note
key and name can be overloaded by keyword arguments in build_raw().
Output¶
Running build_raw() with the examined configuration on an example ORCA DAQ file
results in the following two LH5 files being produced:
raw-data.lh5
├── group1-000001
│ └── raw
├── group1-000002
│ └── raw
├── group1-000003
│ └── raw
├── group1-000009
│ └── raw
├── group2-000011
│ └── raw
├── group2-000013
│ └── raw
└── header-data
extra.lH5
└── extra
├── FCConfig
└── ORRunDecoderForRun
Data post-processing¶
Warning
to be written
Command line interface¶
A command line interface to build_raw() is available through the
legend-daq2lh5 executable. This can be used to quickly convert digitizer
data without custom scripting. Here are some examples of what can be achieved:
$ legend-daq2lh5 --help # display usage and exit
Convert files and save them in the original directory with the same filenames
(but new extension .lh5):
$ legend-daq2lh5 [-v] data/*.orca # increase verbosity with -v
$ legend-daq2lh5 --overwrite data/*.orca # overwrite output files
$ # set maximum number of rows to be considered from each file
$ legend-daq2lh5 --max-rows 100 data/*.orca
Customize the group layout of the LH5 files in a YAML configuration file (see above section):
FCEventDecoder:
"ch{key:0>3d}/raw":
key_list:
- [0, 58]
out_stream: "{orig_basename}.lh5"
and pass it to the command line:
$ legend-daq2lh5 --out-spec fcio-config.yaml data/*.fcio
Note
A special keyword orig_basename is automatically replaced in the YAML
configuration by the original DAQ file name without extension. Such a
feature is useful to users that want to customize the HDF5 group layout
without having to worry about file naming. This keyword is only available
through the command line.
See also
See build_raw() and legend-daq2lh5 --help for a full list of
conversion options.