Task Profiler
Justus Calvin edited this page May 24, 2014
MADNESS includes an event-based task profiler. It records the submit, start, and stop times of every task run by the program and writes them to task trace files. Task profiles can then be generated from these trace files.
To enable the task profiler, configure MADNESS with the --enable-task-profiler option.
When the task profiler is enabled, the task manager will automatically collect timing information for all tasks. The data is written to a text file before the program exits. The output file is specified with the MAD_TASKPROFILER_NAME environment variable. For example:
    export MAD_TASKPROFILER_NAME=<file name>
or, for C shell,
    setenv MAD_TASKPROFILER_NAME <file name>
If this environment variable is not set, the timing data is still collected, but no task profile data is written. You will also see the following warning at the beginning of your program:
    !!! WARNING: MAD_TASKPROFILER_NAME not set.
    !!! WARNING: There will be no task profile output.
The MPI rank and the number of threads in your application are appended to the specified file name. For example, if your file name is “world_profile” and you are using 8 threads on 2 nodes, the following files will be generated, one on each node:
    world_profile_0x8
    world_profile_1x8
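The naming scheme can be sketched as a small shell helper, which may be handy for checking that every expected trace file was written. This is only an illustration: the function name profile_file_names is hypothetical, and the pattern <name>_<rank>x<threads>, with ranks numbered from 0 and one process per node, is inferred from the world_profile example above rather than taken from the MADNESS source.

```shell
# List the trace files the profiler is expected to write.
# Assumption: files are named <name>_<rank>x<threads>, ranks start at 0.
# profile_file_names is a hypothetical helper, not part of MADNESS.
profile_file_names() {
    base=$1       # value of MAD_TASKPROFILER_NAME
    nranks=$2     # number of MPI ranks
    nthreads=$3   # number of threads per rank
    rank=0
    while [ "$rank" -lt "$nranks" ]; do
        printf '%s_%dx%d\n' "$base" "$rank" "$nthreads"
        rank=$((rank + 1))
    done
}

# Example: 2 ranks with 8 threads each
profile_file_names world_profile 2 8
```

With 2 ranks and 8 threads, this prints the two file names from the example, one per line.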
You can easily generate a flat profile with the taskprofile.pl script (located in src/bin):
    $ perl taskprofile.pl world_profile_*
This parses all files that match world_profile_* and combines the data. A sample profile from src/lib/world/world is given below.
    Wall time:       5.122230 (s)
    Total task time: 1.583669 (s)

                                                                               Time (s)
    name                                                     counts      total    average    %    latency
    val1d_func(int)                                          100000  1.4894500  0.0000149   29  0.0000174
    val_func()                                               100000  0.0634100  0.0000006    1  0.0000123
    null_func()                                              100000  0.0158480  0.0000002    0  0.0000205
    pounder(madness::WorldContainer<int, Mary, madness::Hash<int> > const&, int)
                                                                  7  0.0142600  0.0020371    0  0.0000500
    TTT::kate(madness::World*, std::string const&, double)        2  0.0002130  0.0001065    0  0.0000790
    TTT::mary()                                                   1  0.0001820  0.0001820    0  0.0000540
    TTT::carl()                                                   1  0.0001450  0.0001450    0  0.0000610
    UNKNOWN                                                       1  0.0000430  0.0000430    0  0.0000420
    TTT::fred()                                                   1  0.0000260  0.0000260    0  0.0000590
    madness::tr1::detail::memfunc_traits<double (Mary::*)(std::string const&, int, double)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<double (Mary::*)(std::string const&, int, double), std::string, int, double>(int const&, double (Mary::*)(std::string const&, int, double), std::string const&, int const&, double const&)
                                                                  1  0.0000200  0.0000200    0  0.0000100
    madness::tr1::detail::memfunc_traits<std::string (Mary::*)(int, int)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<std::string (Mary::*)(int, int), int, int>(int const&, std::string (Mary::*)(int, int), int const&, int const&)
                                                                  1  0.0000200  0.0000200    0  0.0000400
    TTT::jody(double, double, double)                             1  0.0000050  0.0000050    0  0.0000580
    TTT::hugh(std::vector<madness::Future<int>, std::allocator<madness::Future<int> > >&)
                                                                  1  0.0000050  0.0000050    0  0.0000170
    Foo::get3c(int, char, short) const                            1  0.0000040  0.0000040    0  0.0000140
    TTT::dave(madness::World*)                                    1  0.0000040  0.0000040    0  0.0000150
    TTT::sara(double, std::complex<double> const&)                1  0.0000040  0.0000040    0  0.0000690
    Foo::get0f()                                                  1  0.0000030  0.0000030    0  0.0000170
    dumb(int, int, int, int, int, int, int)                       1  0.0000030  0.0000030    0  0.0000080
    Foo::get0()                                                   1  0.0000030  0.0000030    0  0.0000140
    TTT::bert(int)                                                1  0.0000030  0.0000030    0  0.0000380
    Foo::get1c(int) const                                         1  0.0000020  0.0000020    0  0.0000130
    Foo::get4c(int, char, short, long) const                      1  0.0000020  0.0000020    0  0.0000160
    Foo::get4(int, char, short, long)                             1  0.0000020  0.0000020    0  0.0000130
    Foo::get5c(int, char, short, long, short) const               1  0.0000020  0.0000020    0  0.0000130
    Foo::get3(int, char, short)                                   1  0.0000020  0.0000020    0  0.0000130
    Foo::get1(int)                                                1  0.0000020  0.0000020    0  0.0000220
    Foo::get0c() const                                            1  0.0000020  0.0000020    0  0.0000160
    Foo::get2(int, char)                                          1  0.0000020  0.0000020    0  0.0000130
    Foo::get2c(int, char) const                                   1  0.0000010  0.0000010    0  0.0000140
    Foo::get5(int, char, short, long, short)                      1  0.0000010  0.0000010    0  0.0000130
    0x0                                                           7  0.0000000  0.0000000    0  0.0000757
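Because most tasks in a flat profile contribute almost nothing to the wall time, it can help to filter the output down to the expensive ones. The following is a minimal sketch, not part of MADNESS: hot_tasks is a hypothetical helper, and it assumes each task occupies one whitespace-separated line whose last five fields are counts, total, average, % and latency, as in the sample above. Task names may contain spaces, so fields are counted from the end of the line.

```shell
# Print the profile rows whose "%" column is at least the given threshold.
# Assumption: the last five fields of a row are counts, total, average, %, latency.
# hot_tasks is a hypothetical helper, not part of MADNESS.
hot_tasks() {
    # $1: minimum percentage (integer); the profile text is read from stdin
    awk -v min="$1" '
        # Skip summary and header lines: they have too few fields
        # or a non-numeric next-to-last field.
        NF >= 6 && $(NF-1) ~ /^[0-9]+$/ && $(NF-1) + 0 >= min + 0 { print }
    '
}
```

Under these assumptions it could be used as `perl taskprofile.pl world_profile_* | hot_tasks 1`, provided the script writes the profile to standard output.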