\section{Event Processing Software Frameworks}
% \fixme{(authors: Brett, Mike)}
\subsection{Description}
A software framework abstracts common functionality expected in some
domain.  It provides a generic implementation of a full system in
an abstract way that allows application-specific functionality to be
added through a modular implementation of framework interfaces.
Toolkit libraries provide functionality addressing some domain in a
form that requires the user-programmer to develop their own
applications.  In contrast, frameworks provide the overall flow
control and the main function, requiring the user-programmer to add
application-specific code in the form of modules.
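
To make this inversion of control concrete, the following is a minimal
sketch (hypothetical types and names, not any particular framework's
API) in which the framework owns \texttt{main} and the event loop, and
user code participates only by implementing a module interface:
\begin{verbatim}
// Minimal sketch: the framework owns main() and the event loop; user code
// participates only by implementing the Module interface.
#include <iostream>
#include <memory>
#include <vector>

struct Event { long id; };               // stand-in for a real event data model

struct Module {                          // framework-defined interface
  virtual ~Module() = default;
  virtual void process(Event& e) = 0;    // called once per event by the framework
};

// Application-specific code supplied by the experiment, not by the framework.
struct PrintId : Module {
  void process(Event& e) override { std::cout << "event " << e.id << "\n"; }
};

int main() {                             // the framework, not the user, owns main()
  std::vector<std::unique_ptr<Module>> path;
  path.push_back(std::make_unique<PrintId>());
  for (long id = 0; id < 3; ++id) {      // framework-driven event loop
    Event e{id};
    for (auto& m : path) m->process(e);
  }
  return 0;
}
\end{verbatim}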

In the context of HEP software, the terms ``event'' and ``module'' are often overloaded and poorly defined.
In the context of software frameworks, an ``event'' is a unit of data whose scope depends
on the ``module'' of code that is processing it.  In the context of a
code module that generates initial kinematics, an event is the
information about the interaction. In a module that simulates the
passage of particles through a detector, an event may contain all
energy depositions in active volumes. In a detector electronics
simulation, it may contain all signals collected from these active
volumes. In a trigger simulation module, it would be all readouts of
these signals above some threshold or other criteria.  At this point,
data from real detectors and from simulation take on the same structure.  Going further,
data reduction, calibration, reconstruction and other analysis modules
each have a unique concept of the ``event'' they operate on.

Depending on the nature of the physics, the detector, and the follow-on
analysis, a module may not preserve the multiplicity of the data.  For
example, a single interaction may produce multiple triggers, or none.

With that description, an event processing software framework is
largely responsible for marshalling data through a series (in general a
directed and possibly cyclic graph) of such code modules, which then mutate the data.  To support
these modules the framework provides access to external services such
as data access and file I/O, descriptions of the detectors,
visualization and statistical summaries, and databases of conditions
for applying calibrations.
%
The implementation of these services may be left up to the experiment
or some may be generically applicable.
%
How a framework marshals and associates data together as an event
varies widely across HEP experiments and may be unique for
a given data collection methodology (beam gate, online trigger, raw
timing, etc.).
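
As an illustration of this data marshalling, the sketch below
(hypothetical names, assuming a simple keyed event store and a
detector-description service) shows a module that reads one data
product, consults a framework-provided service, and writes a new
product back to the event:
\begin{verbatim}
// Schematic sketch: an event as a keyed store of data products plus a
// framework-provided service, used by a calibration module.
#include <any>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct EventStore {                      // products are exchanged only via the store
  std::map<std::string, std::any> products;
  template <class T>
  void put(const std::string& key, T value) { products[key] = std::move(value); }
  template <class T>
  const T& get(const std::string& key) const {
    return std::any_cast<const T&>(products.at(key));
  }
};

struct GeometryService {                 // stand-in for a detector-description service
  double cellVolume(int /*cellId*/) const { return 1.0; }
};

// A module that turns raw hits into calibrated energies using the service.
struct Calibrate {
  void process(EventStore& evt, const GeometryService& geo) const {
    const auto& hits = evt.get<std::vector<int>>("rawHits");
    std::vector<double> energies;
    for (int adc : hits) energies.push_back(adc * 0.01 / geo.cellVolume(adc));
    evt.put("calibHits", energies);
  }
};

int main() {
  EventStore evt;
  evt.put("rawHits", std::vector<int>{12, 40, 7});
  Calibrate{}.process(evt, GeometryService{});
  std::cout << evt.get<std::vector<double>>("calibHits").size()
            << " calibrated hits\n";
  return 0;
}
\end{verbatim}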
% The defining nature of these frameworks are to aggregate and organize
% code modules provided by a disparate set of developers. As such,
% quality frameworks provide means to manage this complexity. The
% modular design itself is one.
\subsection{Gaudi}
The Gaudi event processing framework\cite{gaudi_lhcb,gaudi_atlas} provides a comprehensive set of
features and is extensible enough that it is suitable for a wide
variety of experiments. It was conceived by LHCb and adopted by ATLAS
and these two experiments still drive its development. It has been
adopted by a diverse set of experiments including HARP, Fermi/GLAST,
MINER$\nu$A, Daya Bay and others. The experience of Daya Bay is
illuminating, both for Gaudi specifically and for more general issues
addressed in this report.

First, the adoption of Gaudi by the Daya Bay collaboration
was greatly helped by the support from the LHCb
and ATLAS Gaudi developers. Although not strictly their
responsibility, they found the time to offer help and support to this and the other
adopting experiments. Without this, the success of the adoption would have been uncertain
and at best would have taken much more effort. Daya Bay recognized
the need and importance of such support and, partly selfishly, formed
a mailing list\cite{gauditalk} and solicited the involvement of Gaudi developers from many
of the experiments involved in its development and use. It became a forum that more efficiently spread
beneficial information from the main developers. It also offloaded
some support effort to the newly minted experts from the other experiments so that they could help
themselves.

There were, however, areas in which the adoption of Gaudi could have
been made easier.  While described here specifically in terms of Gaudi,
they are general in nature.
The primary one is the lack of direct guides on how to actually adopt the framework.
This is something that must come from the community, likely in
conjunction with some future adoption.  Documentation on Gaudi itself
was also a problem, particularly for Daya Bay developers to whom many of the basic
underlying framework concepts were new.  Older Gaudi design documents and
some experiment-specific ones were available, but they were not always accurate
nor focused on just what was needed for adoption.  Over time, Daya Bay produced its own
Daya Bay-specific documentation, which unfortunately perpetuates this
problem.

Other aspects were beneficial to adoption.  The Gaudi build system,
based on CMT\cite{cmt}, is cross-platform, open and easy to port.  It has layers
of functionality (package build system, release build system, support
for experiment packages and ``external'' ones) but does not require
all-or-nothing adoption.  It supports a staged adoption approach that
allowed Daya Bay to get started using the framework more quickly.
The importance of having all Gaudi source code open and available
cannot be overstated.  Also important was that the Gaudi developers included
the growing community in the release process.

While Gaudi's CMT-based package and release build system ultimately
proved very useful, it hampered initial adoption because it was not
widely used outside of Gaudi and the level of understanding it required was high.
It is understood that there is now a
movement to provide a CMake-based build system.  This may alleviate
this particular hurdle for future adopters, as CMake is widely used both inside and outside HEP projects.

Finally, although Gaudi is full-featured and flexible, it did not come
with all of the needed framework-level functionality and, in its core, does
not provide some generally useful modules that do exist in experiment code repositories.  In particular, Daya Bay
adopted three Gaudi extensions from LHCb's code base.  These are
actually very general purpose but, for historical reasons, were not
provided separately.  They were GaudiObjDesc (data model definition),
GiGa (Geant4 interface) and DetDesc (detector description). Some
extensions developed by other experiments were rejected and in-house
implementations were developed. In particular, the extension that
provided file I/O was considered too much effort to adopt.  The
in-house implementation was simple and adequate, but its performance was
somewhat lacking.

One aspect of the default Gaudi implementation that had to be modified
for use by Daya Bay was the event processing model. Unlike collider
experiments, Daya Bay necessarily had to deal with a non-sequential,
non-linear event stream. Multiple detectors at multiple sites
produced data in time order but not synchronously. Simulation and
processing did not preserve the same ``event'' multiplicity. Multiple
sources of events (many independent backgrounds in addition to signal)
had to be properly mixed in time and at multiple stages in the
processing chain. Finally, delayed coincidence in time within one
detector stream and between those of different detectors had to be
formed.  The flexibility of Gaudi allowed Daya Bay to extend the core
event processing model to add the support necessary for these
features.
\subsection{CMSSW and art}
In 2005, the CMS experiment developed its current software framework, CMSSW~\cite{cmssw}, as a replacement for the previous ORCA framework. The framework was built around two guiding principles: software development should be modular, and modules may exchange information only through data products. Since the move to CMSSW, the complexity of the CMS reconstruction software has been greatly reduced compared with ORCA, and the modularity has lowered the barrier to entry for beginning software developers.\footnote{Thanks to Dr. Liz Sexton-Kennedy and Dr. Oli Gutsche for useful discussions concerning the history and design of CMSSW.}

The CMS software framework is designed around four basic elements: the framework itself, the event data model (EDM), software modules written by physicists, and the services needed by those modules\cite{cmssw_web}. The framework is a lightweight executable (cmsRun) that loads modules dynamically at run time. The configuration file for cmsRun defines the modules that take part in the processing, and thus which shared-object libraries containing the module definitions are loaded. It also defines the module parameters, the order of modules, the filters, the data to be processed, and the output of each path defined by the filters. The EDM has several important properties: events are trigger-based, the EDM contains only C++ object containers for all raw data and reconstruction objects, and it is directly browsable within ROOT. It should be noted that the CMSSW framework is not limited to trigger-based events, but this is the current implementation for the CMS experiment. Another important feature of the EDM over the ORCA data format is the requirement that all information about an event be contained within a single file. File parentage information is also kept, so that if objects from an input file are dropped (e.g.\ the raw data) that information can be recovered in downstream processes by reading both the parent file and the current file. The framework was also constructed such that the EDM contains all of the provenance information for all reconstructed objects; it is therefore possible to regenerate and reproduce any processing output from the raw data given a file produced by CMSSW. Another element of the framework that is useful for reproducibility is the strict requirement that no module may maintain state information about the processing; all such information must be contained within the base framework structures.
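
The sketch below illustrates the general idea of configuration-driven
module loading: a small executable assembles its processing path from
labels listed in a run-time configuration, via a factory, rather than
linking the modules statically.  The names and structure are purely
illustrative; this is not the actual cmsRun or art implementation, and
a real framework would resolve the labels to shared libraries loaded at
run time.
\begin{verbatim}
// Hypothetical sketch of configuration-driven module creation: the processing
// path is assembled at run time from labels found in a configuration, via a
// factory, instead of being linked statically into the executable.
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Event { long id; };
struct Module {
  virtual ~Module() = default;
  virtual void process(Event& e) = 0;
};

struct HitFinder : Module {
  void process(Event& e) override { std::cout << "HitFinder on event " << e.id << "\n"; }
};
struct TrackFitter : Module {
  void process(Event& e) override { std::cout << "TrackFitter on event " << e.id << "\n"; }
};

using Factory = std::map<std::string, std::function<std::unique_ptr<Module>()>>;

int main() {
  Factory factory = {
      {"HitFinder",   [] { return std::make_unique<HitFinder>(); }},
      {"TrackFitter", [] { return std::make_unique<TrackFitter>(); }},
  };
  // In a real framework these labels would come from the run-time
  // configuration file and the modules from dynamically loaded libraries.
  std::vector<std::string> configuredPath = {"HitFinder", "TrackFitter"};

  std::vector<std::unique_ptr<Module>> path;
  for (const auto& label : configuredPath) path.push_back(factory.at(label)());

  Event e{1};
  for (auto& m : path) m->process(e);
  return 0;
}
\end{verbatim}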

The art framework is an event processing framework that evolved from CMSSW. In 2010, the Fermilab Scientific Computing Division undertook the development of an experiment-agnostic framework for use by smaller experiments that lack the person-power to develop a new framework. Working from the general CMSSW framework, the developers maintained most of the design elements: a lightweight framework based on modular development, the event data model, and the services required by modules. The output file is ROOT-browsable and maintains the strict provenance requirements of CMSSW. For Intensity and Cosmic Frontier experiments, the strict definition of an event as trigger-based is not appropriate, so this restriction was removed and each instance of art allows the experiment to define the event period of interest as required. art is currently being used by the Muon g-2, Mu2e, NOvA, MicroBooNE, and LBNE 35-ton prototype experiments.

CMSSW initially had some limitations, the most significant being the use of non-interpreted, run-time configuration files defined by the FHiCL language. The significance is that configuration parameters could not be evaluated dynamically and had to be set explicitly in the input file, which made it impossible to include any scripting within the configuration file. The CMS Collaboration recognized this limitation and quickly chose to transition to Python-based configuration files (in 2006). At that time, the choice was made that the Python evaluation of configuration code would be distinctly delineated from framework and module processing: once the configuration file is interpreted, all configuration information is cast as const C++ objects and is immutable. Because CMSSW strictly requires provenance information to be included in the EDM, the dynamic evaluation of configuration files that are then cast as const parameters and stored in the EDM was not considered a limitation on reproduction from raw data. When the art framework was forked from CMSSW in 2010, it reverted to FHiCL configuration files; while acceptable to the experiments at the time of adoption, some consider this a serious limitation.

One of the challenges faced by the art framework has been the portability of the framework to platforms other than Scientific Linux Fermilab or CERN. The use of the Fermilab UPS and cetbuildtools products within the build and release system integrated into the art suite created a reliance upon those products that is difficult to remove, and this in turn makes porting to other platforms (OS X, Ubuntu, etc.) difficult. The CMSSW framework was implemented for CMS such that the build system is completely available from source, and mechanisms for porting to experiment-supported platforms are integrated into the build system. While the portability of art is not an inherent problem of the software framework design, and is currently being addressed by both the Fermilab SCD and collaborating experiments, it serves as a significant design lesson when moving forward with art or designing other frameworks in the future.
\subsection{IceTray}
IceTray\cite{icetray} is the software framework used by the IceCube experiment; it has also been ported, as SeaTray, to the ANTARES experiment. The framework is similar to the previously described frameworks in that it takes advantage of a modular design for development and processing. Processing within the framework uses both analysis modules and services, similar to those described for Gaudi, CMSSW, and art. The IceTray framework and its modules are written in C++. The data structure for IceTray is designated a ``frame'' and contains information about geometry, calibration, detector status, and physics events. Unlike the other frameworks described, IceTray allows multiple frames to be active in a module at the same time. This was implemented because of the nature of the IceCube detector and the need to delay processing an ``event'' until information from more than the current frame has been analyzed. This is accomplished through a uniquely designed I/O mechanism using Inboxes and Outboxes for modules; a module can have any number of Inboxes and Outboxes. IceTray was developed within the IceCube experiment in 2003, based upon a specific set of requirements.
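
The sketch below illustrates the buffering idea behind such an
Inbox/Outbox design (hypothetical names and logic, not the actual
IceTray API): a module holds frames in its inbox until a later frame
guarantees that no further coincidence is possible, and only then
releases them to its outbox for the next module.
\begin{verbatim}
// Schematic sketch of a module that buffers frames until it has enough
// context to release them; names and logic are illustrative only.
#include <deque>
#include <iostream>

struct Frame { double time; double charge; };

struct CoincidenceModule {
  std::deque<Frame> inbox;     // frames waiting to be examined together
  std::deque<Frame> outbox;    // frames released to the next module
  double window = 10.0;        // coincidence window (illustrative units)

  void push(const Frame& f) {
    inbox.push_back(f);
    // Release the oldest frame only once a later frame falls outside the
    // window, i.e. once no further coincidence with it is possible.
    while (inbox.size() > 1 && inbox.back().time - inbox.front().time > window) {
      outbox.push_back(inbox.front());
      inbox.pop_front();
    }
  }
};

int main() {
  CoincidenceModule m;
  for (double t : {0.0, 3.0, 25.0, 40.0}) m.push({t, 1.0});
  std::cout << m.outbox.size() << " frames released, "
            << m.inbox.size() << " still buffered\n";
  return 0;
}
\end{verbatim}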
\subsection{Opportunity for improvement}
Some best practices relevant to event processing frameworks are identified:
\begin{description}
\item[open community] Make source-code repositories, bug tickets and
mailing lists (user and developer) available for anonymous reading
and lower the barrier for accepting contributions from the community.
\item[modularity] Separate the framework code into modular compilation
units with clear interfaces which minimize recompilation. The
system should work when optional modules are omitted and allow
different modules to be linked at run-time.
\item[documentation] Produce descriptions of the concepts, design and
implementation of the framework and guides on installation,
extension and use of the framework.
\end{description}
\noindent The community should work towards a single event
processing framework that is general-purpose enough to serve
multiple experiments at different scales.  This framework
should be ultimately developed by a core team with representation from
multiple, major stake-holder experiments and with an open
user/developer community that spans other experiments. Steps to reach
this goal may include:
\begin{itemize}
\item Form an expert working group to identify requirements and
features needed by such a general-use event processing framework.
Much of this already exists in internal and published notes and needs to be
pulled together and made comprehensive.
\item The working group should evaluate existing frameworks with
a significant user base against these requirements and determine what
deficiencies exist and the amount of effort required to correct
them.
\item The working group should recommend one framework, existing or
novel, to develop as a widely-used, community-supported project.
\item The working group should conclude by gauging interest in the
community, surveying experiments to determine what involvement and use
can be expected, and compiling a list of candidate developers for the next
step.
\item Assemble a core team to provide this development and support
(something similar to the ROOT model).  Direct support for a
significant portion of this effort, independent of specific experiment
funding, is recommended.
\end{itemize}