forked from clearlinux-pkgs/linux
rcuref-1.patch
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <[email protected]>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
by smtp.lore.kernel.org (Postfix) with ESMTP id 83011C64ED6
for <[email protected]>; Tue, 28 Feb 2023 14:33:34 +0000 (UTC)
Received: ([email protected]) by vger.kernel.org via listexpand
id S229850AbjB1Odc (ORCPT <rfc822;[email protected]>);
Tue, 28 Feb 2023 09:33:32 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56936 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
with ESMTP id S229618AbjB1Oda (ORCPT
<rfc822;[email protected]>); Tue, 28 Feb 2023 09:33:30 -0500
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55])
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2E7AF755;
Tue, 28 Feb 2023 06:33:28 -0800 (PST)
Message-ID: <[email protected]>
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de;
s=2020; t=1677594807;
h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
to:to:cc:cc:mime-version:mime-version:content-type:content-type:
references:references; bh=164HU9inGaDSStz20kDbBC51Xj3hrcXjyamsooLd+5g=;
b=i/JOnT2AKbM7BC9p9Q1bE+9kdzNuT9zADvs8sRa0yxUg4BHyvMmjDf0TsL3S9kfTJJ9Rhm
/4ddfUHSBqHHwfcL3CwSYWZPKqS6dpFg8MmvWVN0B1aQ/5+Em2gn2Fk9dZxBoGV1C6PvaN
ajTPa8brDAjcv79DeIM4ZXaJkAcqcU8Lb6UYA1fFkRBTyPj8Rg/+4Q4FzG2RialEYCv0jW
EFHjJOWKhh9GC1f/OdtRaw6lA85mv9atBfGFiY/hh3rZnhMaZnbYZtscLeBiG3kO72XSXG
coaZgRtrZo/23+7EXKYcw+sGnptVB9TTld4wZ3oMdpPTs7Zvvm11yBjGi7npzg==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de;
s=2020e; t=1677594807;
h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
to:to:cc:cc:mime-version:mime-version:content-type:content-type:
references:references; bh=164HU9inGaDSStz20kDbBC51Xj3hrcXjyamsooLd+5g=;
b=DcfZ7D0xyV3ezgSoUkW+DvoTF83VWC4wcoRRY6eQ5PHaotzI1Vkc+0xDrwPBY9i1wSg6SR
O4il+nVOdAmdyVAg==
From: Thomas Gleixner <[email protected]>
To: LKML <[email protected]>
Cc: Linus Torvalds <[email protected]>, [email protected],
Wangyang Guo <[email protected]>,
Arjan van De Ven <[email protected]>,
"David S. Miller" <[email protected]>,
Eric Dumazet <[email protected]>,
Jakub Kicinski <[email protected]>,
Paolo Abeni <[email protected]>, [email protected],
Will Deacon <[email protected]>,
Peter Zijlstra <[email protected]>,
Boqun Feng <[email protected]>,
Mark Rutland <[email protected]>,
Marc Zyngier <[email protected]>
Subject: [patch 1/3] net: dst: Prevent false sharing vs. dst_entry::__refcnt
References: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Date: Tue, 28 Feb 2023 15:33:26 +0100 (CET)
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: [email protected]
From: Wangyang Guo <[email protected]>
dst_entry::__refcnt is highly contended in scenarios where many connections
happen from and to the same IP. The reference count is an atomic_t, so the
reference count operations have to take the cache-line exclusive.
Aside from the unavoidable reference count contention there is another
significant problem which is caused by that: false sharing.
perf top identified two affected read accesses. dst_entry::lwtstate and
rtable::rt_genid.
dst_entry::__refcnt is located at offset 64 of dst_entry, which puts it into
a separate cacheline vs. the read mostly members located at the beginning
of the struct.
That prevents false sharing vs. the struct members in the first 64
bytes of the structure, but there is also
dst_entry::lwtstate
which is located after the reference count and in the same cache line. This
member is read after a reference count has been acquired.
struct rtable embeds a struct dst_entry at offset 0. struct dst_entry has a
size of 112 bytes, which means that the struct members of rtable which
follow the dst member share the same cache line as dst_entry::__refcnt.
Especially
rtable::rt_genid
is also read by the contexts which have a reference count acquired
already.
When dst_entry::__refcnt is incremented or decremented via an atomic
operation these read accesses stall.
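The cache line arithmetic behind this can be sketched in a few lines of C.
This is only an illustration, not part of the patch: it assumes 64-byte cache
lines and a cache-line aligned struct, and the helper names
(cacheline_index, shares_cacheline, check_old_layout) are made up for this
note. The two offsets are the ones quoted above for 64-bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines, struct starts on a cache line boundary. */
#define CACHELINE_BYTES 64u

/* Index of the cache line which holds the byte at @offset. */
static inline size_t cacheline_index(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

/* True if the bytes at offsets @a and @b live in the same cache line. */
static inline bool shares_cacheline(size_t a, size_t b)
{
	return cacheline_index(a) == cacheline_index(b);
}

/* Offsets quoted in the description (64-bit, before the change). */
enum {
	DST_REFCNT_OFFSET = 64,		/* dst_entry::__refcnt */
	DST_ENTRY_SIZE	  = 112,	/* first rtable byte after the embedded dst */
};

/* Hypothetical self-check of the statements made in the description. */
void check_old_layout(void)
{
	/* The read mostly head of the struct is isolated from __refcnt ... */
	assert(!shares_cacheline(0, DST_REFCNT_OFFSET));
	/* ... but whatever follows the embedded dst_entry in rtable is not. */
	assert(shares_cacheline(DST_REFCNT_OFFSET, DST_ENTRY_SIZE));
}
```

Any member whose offset falls into the range 64..127 ends up in the line
which the atomic operations on __refcnt pull exclusive, so readers of such
a member stall along with the reference count users.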
This was found when analysing the memtier benchmark in 1:100 mode, which
amplifies the problem extremely.
Rearrange and pad the structure so that the lwtstate member is in the next
cache-line. This increases the struct size from 112 to 136 bytes on 64-bit.
The resulting improvement depends on the micro-architecture and the number
of CPUs. It ranges from +20% to +120% with a localhost memtier/memcached
benchmark.
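The size change can be checked with a userspace mock of the layout. The
following is a sketch under stated assumptions, not the kernel definition:
the first 64 bytes are modelled as an opaque array, pointers and unsigned
long are modelled as uint64_t to mimic 64-bit, struct rcu_head is taken as
two pointer-sized words, and dev_tracker is assumed to occupy no space
(reference tracking disabled). All mock_* names are invented for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64

/* Stand-in for struct rcu_head: two pointer-sized words on 64-bit. */
struct mock_rcu_head {
	uint64_t next;
	uint64_t func;
};

/* Mock of the 64-bit layout before the change. */
struct mock_dst_old {
	unsigned char head[64];		/* read mostly members, opaque here */
	int32_t refcnt;			/* atomic_t __refcnt, offset 64 */
	int32_t use;
	uint64_t lastuse;
	uint64_t lwtstate;		/* pointer, same line as refcnt */
	struct mock_rcu_head rcu_head;
	int16_t error;
	int16_t pad;
	uint32_t tclassid;
};

/* Mock of the 64-bit layout after the change. */
struct mock_dst_new {
	unsigned char head[64];
	int32_t refcnt;
	int32_t use;
	uint64_t lastuse;
	struct mock_rcu_head rcu_head;
	int16_t error;
	int16_t pad;
	uint32_t tclassid;
	int32_t pad2[6];		/* pushes lwtstate to the next line */
	uint64_t lwtstate;
};

static_assert(sizeof(struct mock_dst_old) == 112, "old size");
static_assert(sizeof(struct mock_dst_new) == 136, "new size");

/* Before: lwtstate shares the cache line of refcnt. */
static_assert(offsetof(struct mock_dst_old, lwtstate) / CACHELINE_BYTES ==
	      offsetof(struct mock_dst_old, refcnt) / CACHELINE_BYTES,
	      "old layout: shared line");

/* After: lwtstate starts the following cache line. */
static_assert(offsetof(struct mock_dst_new, lwtstate) / CACHELINE_BYTES ==
	      offsetof(struct mock_dst_new, refcnt) / CACHELINE_BYTES + 1,
	      "new layout: next line");
```

In this mock the members which follow an embedded dst at offset 0 start at
byte 136 after the change, which is past the cache line of refcnt as well.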
[ tglx: Rearrange struct ]
Signed-off-by: Wangyang Guo <[email protected]>
Signed-off-by: Arjan van De Ven <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
---
include/net/dst.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -69,15 +69,25 @@ struct dst_entry {
 #endif
 	int __use;
 	unsigned long lastuse;
-	struct lwtunnel_state *lwtstate;
 	struct rcu_head rcu_head;
 	short error;
 	short __pad;
 	__u32 tclassid;
 #ifndef CONFIG_64BIT
+	struct lwtunnel_state *lwtstate;
 	atomic_t __refcnt; /* 32-bit offset 64 */
 #endif
 	netdevice_tracker dev_tracker;
+#ifdef CONFIG_64BIT
+	/*
+	 * Ensure that lwtstate is not in the same cache line as __refcnt,
+	 * because that would lead to false sharing under high contention
+	 * of __refcnt. This also ensures that rtable::rt_genid is not
+	 * sharing the same cache-line.
+	 */
+	int pad2[6];
+	struct lwtunnel_state *lwtstate;
+#endif
 };
 
 struct dst_metrics {