-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathnotes.md~
232 lines (195 loc) · 5.29 KB
/
notes.md~
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
Shortest Path in Neo4j
---------------------
### Experiment
Originally created as a response to the data presented in [this blog post](http://istc-bigdata.org/index.php/benchmarking-graph-databases/),
this experiment compares the performance of different path finding algorithm implementations in Neo4j.
The start and end nodes are selected uniformly at random from across the entire node space (~4000 nodes).
Each algorithm is run 1000 times, using different (well, randomly selected) start and end nodes each time.
### Dataset
Source : http://snap.stanford.edu/data/egonets-Facebook.html
Nodes : 4,040
Relationships : 88,234
### Environment
Processor : 4x Intel i3-2330M CPU @ 2.20GHz (2 cores, with HyperThreading)
Memory : 6GB
Storage : INTEL SSDSA2BW16 160GB SSD drive
Operating System : Ubuntu 12.04.3 LTS (Linux 3.2.0-52-generic (x86_64))
Java : Java HotSpot 64-Bit VM
Neo4j : Neo4j Enterprise 1.9.4
## Results
### Single Shortest Path
**- Shortest Path -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 222
50th PERCENTILE : 1
90th PERCENTILE : 6
95th PERCENTILE : 10
99th PERCENTILE : 18
MEAN : 2.769
Path Length
COUNT : 1000
MIN : 1
MIN : 7
50th PERCENTILE : 4
90th PERCENTILE : 5
95th PERCENTILE : 6
99th PERCENTILE : 7
MEAN : 3.613
**- Unweighted Dijkstra -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 12365
50th PERCENTILE : 143
90th PERCENTILE : 651
95th PERCENTILE : 1093
99th PERCENTILE : 2150
MEAN : 294.47
Path Length
COUNT : 1000
MIN : 1
MIN : 7
50th PERCENTILE : 4
90th PERCENTILE : 5
95th PERCENTILE : 6
99th PERCENTILE : 7
MEAN : 3.613
**- Weighted Dijkstra -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 589
50th PERCENTILE : 134
90th PERCENTILE : 209
95th PERCENTILE : 241
99th PERCENTILE : 363
MEAN : 126.423
Path Length
COUNT : 1000
MIN : 1
MIN : 32
50th PERCENTILE : 13
90th PERCENTILE : 21
95th PERCENTILE : 23
99th PERCENTILE : 28
MEAN : 13.146
### All Shortest Paths (Run Time is for retrieving the 1st path only)
**- Shortest Path -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 159
50th PERCENTILE : 2
90th PERCENTILE : 7
95th PERCENTILE : 10
99th PERCENTILE : 17
MEAN : 3.059
Path Length
COUNT : 1000
MIN : 1
MIN : 7
50th PERCENTILE : 4
90th PERCENTILE : 5
95th PERCENTILE : 6
99th PERCENTILE : 7
MEAN : 3.613
Discovered Path Count
COUNT : 1000
MIN : 1
MIN : 515
50th PERCENTILE : 2
90th PERCENTILE : 15
95th PERCENTILE : 25
99th PERCENTILE : 68
MEAN : 6.975
**- Unweighted Dijkstra -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 12580
50th PERCENTILE : 132
90th PERCENTILE : 652
95th PERCENTILE : 1064
99th PERCENTILE : 2203
MEAN : 293.552
Path Length
COUNT : 1000
MIN : 1
MIN : 7
50th PERCENTILE : 4
90th PERCENTILE : 5
95th PERCENTILE : 6
99th PERCENTILE : 7
MEAN : 3.613
Discovered Path Count
COUNT : 1000
MIN : 1
MIN : 515
50th PERCENTILE : 2
90th PERCENTILE : 15
95th PERCENTILE : 25
99th PERCENTILE : 68
MEAN : 6.975
**- Weighted Dijkstra -**
Run Time (ms)
COUNT : 1000
MIN : 0
MIN : 609
50th PERCENTILE : 129
90th PERCENTILE : 209
95th PERCENTILE : 241
99th PERCENTILE : 362
MEAN : 124.228
Path Length
COUNT : 1000
MIN : 1
MIN : 32
50th PERCENTILE : 13
90th PERCENTILE : 21
95th PERCENTILE : 23
99th PERCENTILE : 28
MEAN : 13.146
Discovered Path Count
COUNT : 1000
MIN : 1
MIN : 12
50th PERCENTILE : 1
90th PERCENTILE : 3
95th PERCENTILE : 4
99th PERCENTILE : 8
MEAN : 1.665
### Recreate at home
Follow these instructions if you wish to recreate the experiment at home.
**(1) Compile:**
mvn clean compile -Dmaven.compiler.source=1.6 -Dmaven.compiler.target=1.6
**(2) Generate Graph .csv Files:**
Parameter specifies if each Facebook relationships should be modeled as 1 Neo4j relationship or 2 (one in each direction)
mvn exec:java -Dexec.mainClass=org.neo4j.bench.shortestpath.InputFilesCreator -Dexec.arguments="<bidirectional|unidirectional>"
**(3) Download and build neo4j-import (see [README](https://github.com/dmontag/neo4j-import/blob/master/README.textile)):**
https://github.com/dmontag/neo4j-import
**(4) Load Generated .csv Files into Neo4j using neo4j-import:**
./run.sh /path/to/shortestpath_bench/db /path/to/shortestpath_bench/data/generated/nodes.csv /path/to/shortestpath_bench/data/generated/relationships.csv
After this step the project directory should look as follows:
shortestpath_bench/
data/
raw/
generated/
nodes.csv
relationships.csv
path-start-and-end-nodes.csv
db/
...
lib/
...
pom.xml
README.md
src/
...
target/
...
**(5) Run Benchmark:**
First parameter specifies if algorithm should find one shortest path or all shortest paths
Second parameter specific if relationships should be traversed in both directions or outgoing only
MAVEN_OPTS="-server -XX:+UseConcMarkSweepGC -Xmx512m" mvn exec:java -Dexec.mainClass=org.neo4j.bench.shortestpath.ShortestPathBench -Dexec.arguments="<single|all> <out|both>"