\section{Profiling}
\begin{itemize}
\item Question: different data structures have different complexities. Adding an
element might be cheap on one, but expensive on another that in turn is faster at
removing elements. How can we compare them based on the accesses that are actually
performed?
\item Idea: count each call to the data structures, compute the complexities at the
end and give recommendations
\item Achieved through AspectJ, which can be used to add functionality to existing code
(see section \ref{sec:aspectj}). This enables us to keep nearly all profiler
functionality separated from the other code: the aspects that enable counting reside in
\texttt{ProfilerAspects.aj}, while the counterpart that counts the accesses resides in
the class \texttt{Profiler}
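
A minimal sketch of such a counting aspect, for illustration only: the package
pattern and the \texttt{Profiler.count} hook are assumptions, not the actual
content of \texttt{ProfilerAspects.aj}.
\begin{verbatim}
// Illustrative AspectJ sketch -- names and package patterns are assumed.
public aspect ProfilerAspectsSketch {
    pointcut dsAdd():      call(* dna.datastructures..*.add(..));
    pointcut dsRemove():   call(* dna.datastructures..*.remove(..));
    pointcut dsContains(): call(* dna.datastructures..*.contains(..));

    // Forward each intercepted call to a (hypothetical) counting hook.
    before(): dsAdd()      { Profiler.count("add"); }
    before(): dsRemove()   { Profiler.count("remove"); }
    before(): dsContains() { Profiler.count("contains"); }
}
\end{verbatim}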
\item At runtime, the profiler counts calls based on the performed action (add, remove,
contains) and the callee signature (metric name, update type,\ldots), so that the
complexities for different use cases can be determined in a single execution
\item Combined complexities are computed from the per-action complexities of each data
structure, which are stored within the respective data structure class
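
As a rough illustration of this aggregation (not DNA's actual implementation), the
combined cost can be thought of as the sum of count-times-complexity terms per
access type, matching the \texttt{96000*O(1)} notation in the example output below:
\begin{verbatim}
// Conceptual Java sketch: aggregate symbolic cost terms from access
// counts and per-action complexity classes of the chosen data structure.
import java.util.Map;

public class CostSketch {
    // counts:       access type (add, remove, contains) -> number of calls
    // complexityOf: access type -> symbolic cost class, e.g. "O(1)", "O(n)"
    static String aggregate(Map<String, Long> counts,
                            Map<String, String> complexityOf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (e.getValue() == 0) continue;
            if (sb.length() > 0) sb.append(" + ");
            sb.append(e.getValue()).append("*")
              .append(complexityOf.get(e.getKey()));
        }
        return sb.toString(); // e.g. "96000*O(1)"
    }
}
\end{verbatim}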
\item Accesses are counted based on a list of distinguishable access types (add,
remove, contains) and the callee signature (e.g.\ metric name, update type). For the
output, both are concatenated so that each line contains the signature and the
access type
\item The output ends with the aggregated complexity that occurred in the current run and
a recommendation of three data structure combinations. As the computation of
recommendations is time-consuming, it is not done for each callee, but only for
aggregations.
\item Example output (shortened, full output at \ref{sec:exampleOutput}):
\begin{verbatim}
DegreeDistributionU.AddNodeGlobal=0
DegreeDistributionU.AddNodeLocal=0
[...]
DegreeDistributionU.SizeEdgeLocal=96000
DegreeDistributionU.RandomNodeGlobal=0
DegreeDistributionU.RandomEdgeGlobal=0
[...]
# Aggr: 96000*O(1)
# Recommendations:
# DArray;DArray;DArray: 96000*O(1)
\end{verbatim}
\item Access counters are written to files, split up by use case and summarized for
each callee, together with a recommendation for optimized complexities. These files
are stored in the following locations (a minimal sketch for reading them back follows
the tree):
\begin{tabular}{@{\hspace{1ex}}p{48em}}
\dirtree{%
.1 Root directory of data output.
.2 aggr.
.2 run.0.
.3 batch.0.
.4 metric.firstExample.
.5 \_\_\_metric.profiler.values\DTcomment{Output for a single metric}.
.4 metric.secondExample.
.5 \ldots .
.4 \_\_\_aggregated.profiler.values\DTcomment{Aggregation over the whole batch}.
.4 \_\_\_batchGeneration.profiler.values\DTcomment{Output for the batch generation}.
.4 \_\_\_metric.profiler.values\DTcomment{Aggregation over all metrics in this batch}.
.4 \_\_\_updates.profiler.values\DTcomment{Output for the batch application}.
.3 batch.1.
.4 \ldots .
.3 \_\_\_aggregated.profiler.values\DTcomment{Aggregation over the whole run}.
.3 \_\_\_batchGeneration.profiler.values\DTcomment{Aggregation of all batch
generations in this run}.
.3 \_\_\_graphGeneration.profiler.values\DTcomment{Output for the initial graph
generation}.
.3 \_\_\_metric.profiler.values\DTcomment{Aggregation over all metrics in this run}.
.3 \_\_\_updates.profiler.values\DTcomment{Aggregation of all batch
applications in this run}.
.2 run.1.
.3 \ldots .
.2 \_\_\_aggregated.profiler.values\DTcomment{Aggregation over the whole series}.
.2 \_\_\_batchGeneration.profiler.values\DTcomment{Aggregation of all batch
generations in this series}.
.2 \_\_\_metric.profiler.values\DTcomment{Aggregation over all metrics in this series}.
.2 \_\_\_updates.profiler.values\DTcomment{Aggregation of all batch
applications in this series}.
}
\end{tabular}
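Each of these \texttt{profiler.values} files uses the key/value format shown in the
example output above. A minimal, hypothetical sketch for reading one back, assuming
\texttt{key=value} lines and \texttt{\#}-prefixed comment and recommendation lines
(not part of DNA itself):
\begin{verbatim}
// Hypothetical reader sketch for a *.profiler.values file.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProfilerValuesReader {
    public static Map<String, Long> read(String path) throws IOException {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : Files.readAllLines(Paths.get(path))) {
            if (line.isEmpty() || line.startsWith("#")) continue;
            String[] kv = line.split("=", 2);
            if (kv.length < 2) continue;
            counts.put(kv[0], Long.parseLong(kv[1].trim()));
        }
        return counts;
    }
}
\end{verbatim}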
\item Important side note about the recommender: different data structures use different
ways to perform the same action (e.g.\ \texttt{DArray.add} calls
\texttt{DArray.contains}, while \texttt{DHashSet.add} does not). The profiler can only
count each access type for the currently used data structures and give recommendations
for this combination of calls. Thus, using another combination might not lead to the
complexity the recommender computed. In the given example, the recommender will work
based on calls to both \texttt{add} and \texttt{contains}, although \texttt{DHashSet}
never calls \texttt{contains} within \texttt{add}. It might be necessary to run the
recommender more than once to obtain better results.
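
For illustration only -- these are not the actual \texttt{DArray} and
\texttt{DHashSet} implementations -- the difference boils down to whether
\texttt{add} performs its own membership check:
\begin{verbatim}
// Illustrative sketch of why access counts depend on the profiled
// data structure; class and field names are assumed.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ArrayBackedSketch {
    private Object[] data = new Object[16];
    private int count = 0;

    boolean contains(Object e) {
        for (int i = 0; i < count; i++)
            if (data[i].equals(e)) return true;
        return false;
    }

    boolean add(Object e) {
        if (contains(e)) return false;   // extra "contains" access is counted
        if (count == data.length) data = Arrays.copyOf(data, 2 * count);
        data[count++] = e;
        return true;
    }
}

class HashBackedSketch {
    private Set<Object> set = new HashSet<Object>();

    boolean add(Object e) {
        return set.add(e);               // no separate "contains" access
    }
}
\end{verbatim}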
\item Computing the recommendations takes a nearly constant amount of time,
independent of the chosen data structures and the graph size. In a setup with a small
graph, having the recommender active can therefore cost more time than the data
structure optimization saves
\item The granularity of the recommender can be set to one of three levels through the
DNA configuration (an illustrative snippet follows this list):
\begin{itemize}
\item \texttt{PROFILER\_DISABLE\_ALL\_RECOMMENDATIONS = true} completely disables the
recommender
\item \texttt{PROFILER\_DISABLE\_ALL\_RECOMMENDATIONS = false} and
\texttt{PROFILER\_WRITE\_ALL\_RECOMMENDATIONS = false} (default) outputs
recommendations only for aggregations, but not for each individual access type
\item \texttt{PROFILER\_DISABLE\_ALL\_RECOMMENDATIONS = false} and
\texttt{PROFILER\_WRITE\_ALL\_RECOMMENDATIONS = true} outputs recommendations for each
individual access type
\end{itemize}
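For illustration only: assuming these keys can be overridden programmatically through
a \texttt{Config.overwrite(key, value)} helper (the exact configuration mechanism may
differ), enabling the most verbose level could look like this:
\begin{verbatim}
// Hypothetical sketch: enable recommendations for each individual
// access type. Config.overwrite is an assumed helper; consult the DNA
// configuration documentation for the actual mechanism.
Config.overwrite("PROFILER_DISABLE_ALL_RECOMMENDATIONS", "false");
Config.overwrite("PROFILER_WRITE_ALL_RECOMMENDATIONS", "true");
\end{verbatim}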
\end{itemize}