Add support for Apple silicon #299

Open, 1 commit to merge into base: develop

Developer-Ecosystem-Engineering

Use NEON when running on arm64
Add SIMD test
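
As an illustration of the kind of change described above, here is a minimal sketch of a squared-L2 distance kernel that takes a NEON path on arm64 and falls back to scalar code elsewhere. This is not the code from the patch: the function names `l2sqr` and `l2sqr_scalar` are hypothetical, and the choice of squared-L2 as the kernel is an assumption made only for the example. The intrinsics used (`vld1q_f32`, `vsubq_f32`, `vfmaq_f32`, `vaddvq_f32`, `vdupq_n_f32`) are standard AArch64 NEON intrinsics from `<arm_neon.h>`.

```cpp
#include <cstddef>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Scalar reference: squared Euclidean distance between two float vectors.
// (Hypothetical helper, used as the fallback and as a check for the SIMD path.)
inline float l2sqr_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Same computation, using NEON when compiled for arm64.
inline float l2sqr(const float* a, const float* b, std::size_t n) {
#if defined(__aarch64__)
    float32x4_t acc = vdupq_n_f32(0.0f);   // four partial sums
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const float32x4_t d = vsubq_f32(vld1q_f32(a + i), vld1q_f32(b + i));
        acc = vfmaq_f32(acc, d, d);        // acc += d * d, lane by lane
    }
    float sum = vaddvq_f32(acc);           // horizontal add of the four lanes
    for (; i < n; ++i) {                   // scalar tail for n not divisible by 4
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
#else
    return l2sqr_scalar(a, b, n);
#endif
}
```

A SIMD test in this spirit would compare `l2sqr` against `l2sqr_scalar` on the same inputs, including lengths that are not a multiple of four so the tail loop is exercised.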

Tested on macOS 11.2 + M1 Mac Mini

Running ./main before the patch, with subset_size_milllions = 10:

Parsing gt:
10000
Loaded gt
1 0.1698 30.7341 us
2 0.2773 36.831 us
3 0.3567 43.9868 us
4 0.4215 55.2377 us
5 0.4673 59.3118 us
6 0.5041 59.3977 us
7 0.5392 78.3159 us
8 0.57 90.795 us
9 0.5949 79.8075 us
10 0.6191 101.076 us
11 0.6392 95.6952 us
12 0.6581 98.4538 us
13 0.6731 102.356 us
14 0.6871 103.722 us
15 0.7002 110.004 us
16 0.7125 123.257 us
17 0.725 107.637 us
18 0.736 108.864 us
19 0.7458 118.216 us
20 0.7539 120.993 us
21 0.7619 121.811 us
22 0.7707 124.828 us
23 0.7774 133.726 us
24 0.785 132.239 us
25 0.7898 158.008 us
26 0.7968 144.931 us
27 0.8016 150.408 us
28 0.806 147.33 us
29 0.8115 144.731 us
30 0.8174 148.872 us
40 0.8544 182.585 us
50 0.8795 215.213 us
60 0.8957 258.998 us
70 0.9107 275.703 us
80 0.9212 601.916 us
90 0.9312 346.77 us
100 0.9393 375.449 us
140 0.9601 584.979 us
180 0.9708 693.418 us
220 0.976 813.026 us
260 0.9804 1002.02 us
300 0.9826 1327.13 us
340 0.9845 1207.63 us
380 0.9862 1212.03 us
420 0.9877 1269.56 us
460 0.9892 1421.35 us

Running ./main after the patch:

Parsing gt:
10000
Loaded gt
1 0.1698 7.2835 us
2 0.2773 9.7249 us
3 0.3567 11.4171 us
4 0.4215 13.3347 us
5 0.4673 14.5087 us
6 0.5041 16.2215 us
7 0.5392 17.185 us
8 0.57 18.4606 us
9 0.5949 19.4449 us
10 0.6191 21.1617 us
11 0.6392 22.1288 us
12 0.6581 23.3129 us
13 0.6731 25.1171 us
14 0.6871 25.3738 us
15 0.7002 26.4833 us
16 0.7125 27.3419 us
17 0.725 28.3687 us
18 0.736 29.8781 us
19 0.7458 30.5237 us
20 0.7539 31.8982 us
21 0.7619 33.4757 us
22 0.7707 36.2736 us
23 0.7774 35.1295 us
24 0.785 39.422 us
25 0.7898 37.6223 us
26 0.7968 39.5841 us
27 0.8016 40.383 us
28 0.806 40.2653 us
29 0.8115 39.7669 us
30 0.8174 40.4674 us
40 0.8544 49.5017 us
50 0.8795 58.0349 us
60 0.8957 68.4427 us
70 0.9107 78.293 us
80 0.9212 96.4384 us
90 0.9312 104.241 us
100 0.9393 104.883 us
140 0.9601 153.266 us
180 0.9708 176.604 us
220 0.976 199.746 us
260 0.9804 237.368 us
300 0.9826 273.229 us
340 0.9845 305.279 us
380 0.9862 337.581 us
420 0.9877 366.357 us
460 0.9892 385.812 us
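
To read the two logs side by side, the per-row speedup is the "before" time divided by the "after" time. The sketch below does that arithmetic for four rows copied from the output above. The `Timing` struct and function names are hypothetical, and the column interpretation is an assumption: the first column is treated as a search parameter and the last as a time in microseconds. The middle column is identical in both runs, so it is left out.

```cpp
#include <vector>

// One row taken from both logs: the first-column value plus the two timings.
// (Hypothetical type, introduced only for this example.)
struct Timing {
    int param;         // first column of the log
    double before_us;  // time before the patch, in microseconds
    double after_us;   // time after the patch, in microseconds
};

// Four rows copied verbatim from the "before" and "after" outputs.
inline std::vector<Timing> sample_rows() {
    return {
        {1, 30.7341, 7.2835},
        {10, 101.076, 21.1617},
        {100, 375.449, 104.883},
        {460, 1421.35, 385.812},
    };
}

// Speedup factor for one row: how many times faster the patched run is.
inline double speedup(const Timing& t) {
    return t.before_us / t.after_us;
}
```

For these four rows the ratio works out to roughly 3.6x to 4.8x, on a single machine and a single run of each configuration.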
