Commit | Line | Data |
---|---|---|
1 | From mboxrd@z Thu Jan 1 00:00:00 1970 | |
2 | Return-Path: <linux-pci-owner@kernel.org> | |
3 | X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on | |
4 | aws-us-west-2-korg-lkml-1.web.codeaurora.org | |
5 | X-Spam-Level: | |
6 | X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, | |
7 | DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, | |
8 | MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT | |
9 | autolearn=unavailable autolearn_force=no version=3.4.0 | |
10 | Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) | |
11 | by smtp.lore.kernel.org (Postfix) with ESMTP id 5293AC433ED | |
12 | for <linux-pci@archiver.kernel.org>; Wed, 5 May 2021 16:46:38 +0000 (UTC) | |
13 | Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) | |
14 | by mail.kernel.org (Postfix) with ESMTP id 336FB613BE | |
15 | for <linux-pci@archiver.kernel.org>; Wed, 5 May 2021 16:46:38 +0000 (UTC) | |
16 | Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand | |
17 | id S236089AbhEEQrX (ORCPT <rfc822;linux-pci@archiver.kernel.org>); | |
18 | Wed, 5 May 2021 12:47:23 -0400 | |
19 | Received: from mail.kernel.org ([198.145.29.99]:39944 "EHLO mail.kernel.org" | |
20 | rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP | |
21 | id S235175AbhEEQn0 (ORCPT <rfc822;linux-pci@vger.kernel.org>); | |
22 | Wed, 5 May 2021 12:43:26 -0400 | |
23 | Received: by mail.kernel.org (Postfix) with ESMTPSA id 7196361879; | |
24 | Wed, 5 May 2021 16:35:00 +0000 (UTC) | |
25 | DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; | |
26 | s=k20201202; t=1620232500; | |
27 | bh=fPfmCTyAQD8PT6TZkO8THxra9rEY81AdAkfBo+IZ5tU=; | |
28 | h=From:To:Cc:Subject:Date:In-Reply-To:References:From; | |
29 | b=GGQEKTJt6sQkhGf+9veu8zkPNAFhEHpim85V7tVqJPqB4sraqNoIP0BWX1keInmv1 | |
30 | ZKmmiJG5OUg5J9Es8W0iw0yQNPVmz34gbFy4b/BRUc7EKu46CVXeOvY7+dX1ivCaHW | |
31 | ikKmjPxAXiMqLH3vRNqoFgDWvRRAhatZ2B9XuscrV7BPfXufja4ykhb29irvNL2akj | |
32 | Hc4a6H7NHXnuVMvnnogfrZDleFUOYf/BVnamTiRmKRbnoBOWJPt6XCS+yoB5gVy3H7 | |
33 | /pzDotZdZ51NWdc/HzZfBT+40TkFWyD6fV5hW9V0Yi5BVVD/2LZDvbLecJkOQJfTqD | |
34 | +rF2SH3dN0A9Q== | |
35 | Received: by pali.im (Postfix) | |
36 | id 09BEF79D; Wed, 5 May 2021 18:34:57 +0200 (CEST) | |
37 | From: =?UTF-8?q?Pali=20Roh=C3=A1r?= <pali@kernel.org> | |
38 | To: Bjorn Helgaas <bhelgaas@google.com>, | |
39 | Kalle Valo <kvalo@codeaurora.org>, | |
40 | =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= <toke@redhat.com>, | |
41 | =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel@kernel.org>, | |
42 | =?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= <kw@linux.com> | |
43 | Cc: vtolkm@gmail.com, Rob Herring <robh@kernel.org>, | |
44 | Ilias Apalodimas <ilias.apalodimas@linaro.org>, | |
45 | Thomas Petazzoni <thomas.petazzoni@bootlin.com>, | |
46 | linux-pci@vger.kernel.org, ath10k@lists.infradead.org, | |
47 | linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org | |
48 | Subject: [PATCH v3] PCI: Disallow retraining link for Atheros chips on non-Gen1 PCIe bridges | |
49 | Date: Wed, 5 May 2021 18:33:57 +0200 | |
50 | Message-Id: <20210505163357.16012-1-pali@kernel.org> | |
51 | X-Mailer: git-send-email 2.20.1 | |
52 | In-Reply-To: <20210326124326.21163-1-pali@kernel.org> | |
53 | References: <20210326124326.21163-1-pali@kernel.org> | |
54 | MIME-Version: 1.0 | |
55 | Content-Type: text/plain; charset=UTF-8 | |
56 | Content-Transfer-Encoding: 8bit | |
57 | Precedence: bulk | |
58 | List-ID: <linux-pci.vger.kernel.org> | |
59 | X-Mailing-List: linux-pci@vger.kernel.org | |
60 | ||
61 | Atheros AR9xxx and QCA9xxx chips have behaviour issues not only after a | |
62 | bus reset, but also after doing retrain link, if PCIe bridge is not in | |
63 | GEN1 mode (at 2.5 GT/s speed): | |
64 | ||
65 | - QCA9880 and QCA9890 chips throw a Link Down event and completely | |
66 | disappear from the bus and their config space is not accessible | |
67 | afterwards. | |
68 | ||
69 | - QCA9377 chip throws a Link Down event followed by Link Up event, the | |
70 | config space is accessible and PCI device ID is correct. But trying to | |
71 | access chip's I/O space causes Uncorrected (Non-Fatal) AER error, | |
72 | followed by Synchronous external abort 96000210 and Segmentation fault | |
73 | of insmod while loading ath10k_pci.ko module. | |
74 | ||
75 | - AR9390 chip throws a Link Down event followed by Link Up event, config | |
76 | space is accessible, but contains nonsense values. PCI device ID is | |
77 | 0xABCD which indicates HW bug that chip itself was not able to read | |
78 | values from internal EEPROM/OTP. | |
79 | ||
80 | - AR9287 chip throws also Link Down and Link Up events, also has | |
81 | accessible config space containing correct values. But ath9k driver | |
82 | fails to initialize card from this state as it is unable to access HW | |
83 | registers. This also indicates that the chip itself is not able to read | |
84 | values from internal EEPROM/OTP. | |
85 | ||
86 | These issues related to PCI device ID 0xABCD and to reading internal | |
87 | EEPROM/OTP were previously discussed at ath9k-devel mailing list in | |
88 | following thread: | |
89 | ||
90 | https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html | |
91 | ||
92 | After experiments we've come up with a solution: it seems that Retrain | |
93 | link can be called only when using GEN1 PCIe bridge or when PCIe bridge | |
94 | link speed is forced to 2.5 GT/s. Applying this workaround fixes all | |
95 | mentioned cards. | |
96 | ||
97 | This issue was reproduced with more cards: | |
98 | - Compex WLE900VX (QCA9880 based / device ID 0x003c) | |
99 | - QCNFA435 (QCA9377 based / device ID 0x0042) | |
100 | - Compex WLE200NX (AR9287 based / device ID 0x002e) | |
101 | - "noname" card (QCA9890 based / device ID 0x003c) | |
102 | - Wistron NKR-DNXAH1 (AR9390 based / device ID 0x0030) | |
103 | on Armada 385 with pci-mvebu.c driver and also on Armada 3720 with | |
104 | pci-aardvark.c driver. | |
105 | ||
106 | To work around this issue, this change introduces a new PCI quirk called | |
107 | PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1, which is enabled for all | |
108 | Atheros chips with PCI_DEV_FLAGS_NO_BUS_RESET quirk, and also for Atheros | |
109 | chips AR9287 and QCA9377. | |
110 | ||
111 | When this quirk is set, kernel disallows triggering PCI_EXP_LNKCTL_RL | |
112 | bit in config space of PCIe Bridge in the case when PCIe Bridge is | |
113 | capable of higher speed than 2.5 GT/s and this higher speed is already | |
114 | allowed. When PCIe Bridge has accessible LNKCTL2 register, we try to | |
115 | force target link speed to 2.5 GT/s. After this change it is possible | |
116 | to trigger PCI_EXP_LNKCTL_RL bit without issues. | |
117 | ||
118 | Currently only PCIe ASPM kernel code triggers this PCI_EXP_LNKCTL_RL bit, | |
119 | so quirk check is added only into pcie/aspm.c file. | |
120 | ||
121 | Signed-off-by: Pali Rohár <pali@kernel.org> | |
122 | Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> | |
123 | Tested-by: Toke Høiland-Jørgensen <toke@redhat.com> | |
124 | Tested-by: Marek Behún <kabel@kernel.org> | |
125 | BugLink: https://lore.kernel.org/linux-pci/87h7l8axqp.fsf@toke.dk/ | |
126 | BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=84821 | |
127 | BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=192441 | |
128 | BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209833 | |
129 | Cc: stable@vger.kernel.org # c80851f6ce63a ("PCI: Add PCI_EXP_LNKCTL2_TLS* macros") | |
130 | ||
131 | --- | |
132 | Changes since v1: | |
133 | * Move whole quirk code into pcie_downgrade_link_to_gen1() function | |
134 | * Reformat to 80 chars per line where possible | |
135 | * Add quirk also for cards with AR9287 chip (PCI ID 0x002e) | |
136 | * Extend commit message description and add information about 0xABCD | |
137 | ||
138 | Changes since v2: | |
139 | * Add quirk also for Atheros QCA9377 chip | |
140 | --- | |
141 | drivers/pci/pcie/aspm.c | 44 +++++++++++++++++++++++++++++++++++++++++ | |
142 | drivers/pci/quirks.c | 39 ++++++++++++++++++++++++++++-------- | |
143 | include/linux/pci.h | 2 ++ | |
144 | 3 files changed, 77 insertions(+), 8 deletions(-) | |
145 | ||
146 | diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c | |
147 | index ac0557a305af..729b0389562b 100644 | |
148 | --- a/drivers/pci/pcie/aspm.c | |
149 | +++ b/drivers/pci/pcie/aspm.c | |
150 | @@ -192,12 +192,56 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) | |
151 | link->clkpm_disable = blacklist ? 1 : 0; | |
152 | } | |
153 | ||
154 | +static int pcie_downgrade_link_to_gen1(struct pci_dev *parent) | |
155 | +{ | |
156 | + u16 reg16; | |
157 | + u32 reg32; | |
158 | + int ret; | |
159 | + | |
160 | + /* Check if link is capable of higher speed than 2.5 GT/s */ | |
161 | + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &reg32); | |
162 | + if ((reg32 & PCI_EXP_LNKCAP_SLS) <= PCI_EXP_LNKCAP_SLS_2_5GB) | |
163 | + return 0; | |
164 | + | |
165 | + /* Check if link speed can be downgraded to 2.5 GT/s */ | |
166 | + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &reg32); | |
167 | + if (!(reg32 & PCI_EXP_LNKCAP2_SLS_2_5GB)) { | |
168 | + pci_err(parent, "ASPM: Bridge does not support changing Link Speed to 2.5 GT/s\n"); | |
169 | + return -EOPNOTSUPP; | |
170 | + } | |
171 | + | |
172 | + /* Force link speed to 2.5 GT/s */ | |
173 | + ret = pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2, | |
174 | + PCI_EXP_LNKCTL2_TLS, | |
175 | + PCI_EXP_LNKCTL2_TLS_2_5GT); | |
176 | + if (!ret) { | |
177 | + /* Verify that new value was really set */ | |
178 | + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &reg16); | |
179 | + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) | |
180 | + ret = -EINVAL; | |
181 | + } | |
182 | + | |
183 | + if (ret) { | |
184 | + pci_err(parent, "ASPM: Changing Target Link Speed to 2.5 GT/s failed: %d\n", ret); | |
185 | + return ret; | |
186 | + } | |
187 | + | |
188 | + pci_info(parent, "ASPM: Target Link Speed changed to 2.5 GT/s due to quirk\n"); | |
189 | + return 0; | |
190 | +} | |
191 | + | |
192 | static bool pcie_retrain_link(struct pcie_link_state *link) | |
193 | { | |
194 | struct pci_dev *parent = link->pdev; | |
195 | unsigned long end_jiffies; | |
196 | u16 reg16; | |
197 | ||
198 | + if ((link->downstream->dev_flags & PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1) && | |
199 | + pcie_downgrade_link_to_gen1(parent)) { | |
200 | + pci_err(parent, "ASPM: Retrain Link at higher speed is disallowed by quirk\n"); | |
201 | + return false; | |
202 | + } | |
203 | + | |
204 | pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16); | |
205 | reg16 |= PCI_EXP_LNKCTL_RL; | |
206 | pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); | |
207 | diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c | |
208 | index 653660e3ba9e..4999ad9d08b8 100644 | |
209 | --- a/drivers/pci/quirks.c | |
210 | +++ b/drivers/pci/quirks.c | |
211 | @@ -3553,31 +3553,55 @@ static void mellanox_check_broken_intx_masking(struct pci_dev *pdev) | |
212 | dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET; | |
213 | } | |
214 | ||
215 | +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) | |
216 | +{ | |
217 | + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET | | |
218 | + PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1; | |
219 | +} | |
220 | + | |
221 | /* | |
222 | * Some NVIDIA GPU devices do not work with bus reset, SBR needs to be | |
223 | * prevented for those affected devices. | |
224 | */ | |
225 | static void quirk_nvidia_no_bus_reset(struct pci_dev *dev) | |
226 | { | |
227 | if ((dev->device & 0xffc0) == 0x2340) | |
228 | quirk_no_bus_reset(dev); | |
229 | } | |
230 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, | |
231 | quirk_nvidia_no_bus_reset); | |
232 | ||
233 | /* | |
234 | - * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset. | |
235 | + * Atheros AR9xxx and QCA9xxx chips do not behave after a bus reset and also | |
236 | + * after retrain link when PCIe bridge is not in GEN1 mode at 2.5 GT/s speed. | |
237 | * The device will throw a Link Down error on AER-capable systems and | |
238 | * regardless of AER, config space of the device is never accessible again | |
239 | * and typically causes the system to hang or reset when access is attempted. | |
240 | + * Or if config space is accessible again then it contains only dummy values | |
241 | + * like fixed PCI device ID 0xABCD or values not initialized at all. | |
242 | + * Retrain link can be called only when using GEN1 PCIe bridge or when | |
243 | + * PCIe bridge has forced link speed to 2.5 GT/s via PCI_EXP_LNKCTL2 register. | |
244 | + * To reset these cards it is required to do PCIe Warm Reset via PERST# pin. | |
245 | * https://lore.kernel.org/r/20140923210318.498dacbd@dualc.maya.org/ | |
246 | + * https://lore.kernel.org/r/87h7l8axqp.fsf@toke.dk/ | |
247 | + * https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html | |
248 | */ | |
249 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset); | |
250 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset); | |
251 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset); | |
252 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset); | |
253 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset); | |
254 | -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003e, quirk_no_bus_reset); | |
255 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x002e, | |
256 | + quirk_no_bus_reset_and_no_retrain_link); | |
257 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, | |
258 | + quirk_no_bus_reset_and_no_retrain_link); | |
259 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, | |
260 | + quirk_no_bus_reset_and_no_retrain_link); | |
261 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, | |
262 | + quirk_no_bus_reset_and_no_retrain_link); | |
263 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, | |
264 | + quirk_no_bus_reset_and_no_retrain_link); | |
265 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, | |
266 | + quirk_no_bus_reset_and_no_retrain_link); | |
267 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003e, | |
268 | + quirk_no_bus_reset_and_no_retrain_link); | |
269 | +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0042, | |
270 | + quirk_no_bus_reset_and_no_retrain_link); | |
271 | ||
272 | /* | |
273 | * Root port on some Cavium CN8xxx chips do not successfully complete a bus | |
274 | diff --git a/include/linux/pci.h b/include/linux/pci.h | |
275 | index 86c799c97b77..fdbf7254e4ab 100644 | |
276 | --- a/include/linux/pci.h | |
277 | +++ b/include/linux/pci.h | |
278 | @@ -227,6 +227,8 @@ enum pci_dev_flags { | |
279 | PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11), | |
280 | /* Device does honor MSI masking despite saying otherwise */ | |
281 | PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12), | |
282 | + /* Don't Retrain Link for device when bridge is not in GEN1 mode */ | |
283 | + PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1 = (__force pci_dev_flags_t) (1 << 13), | |
284 | }; | |
285 | ||
286 | enum pci_irq_reroute_variant { | |
287 | -- | |
288 | 2.20.1 | |
289 | ||
290 |
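
The decision flow described in the commit message (allow Retrain Link only when the bridge is limited to 2.5 GT/s or can be forced to it) can be modelled in user space. The sketch below is not the kernel code and is only meant to make the three outcomes easy to follow. `struct model_bridge`, `model_downgrade_to_gen1()` and `model_may_retrain()` are hypothetical names introduced here, and the `lnkctl2_writable` field is an assumption used to imitate a bridge that ignores the Target Link Speed write. The field masks are the PCIe register values that the kernel's `PCI_EXP_LNKCAP_SLS`, `PCI_EXP_LNKCAP2_SLS_2_5GB` and `PCI_EXP_LNKCTL2_TLS*` macros name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register field values as defined for the PCIe capability. */
#define LNKCAP_SLS        0x0000000fu /* Max Link Speed field */
#define LNKCAP_SLS_2_5GB  0x00000001u /* Max Link Speed is 2.5 GT/s */
#define LNKCAP2_SLS_2_5GB 0x00000002u /* 2.5 GT/s bit in Supported Link Speeds */
#define LNKCTL2_TLS       0x000fu     /* Target Link Speed field */
#define LNKCTL2_TLS_2_5GT 0x0001u     /* Target Link Speed 2.5 GT/s */

/* Hypothetical stand-in for a bridge's PCIe capability registers. */
struct model_bridge {
	uint32_t lnkcap;
	uint32_t lnkcap2;
	uint16_t lnkctl2;
	bool lnkctl2_writable; /* false models a bridge that ignores the write */
};

/* Returns 0 if the link is (now) limited to 2.5 GT/s, -1 otherwise. */
static int model_downgrade_to_gen1(struct model_bridge *b)
{
	assert(b != NULL);

	/* Bridge cannot go faster than 2.5 GT/s: nothing to do. */
	if ((b->lnkcap & LNKCAP_SLS) <= LNKCAP_SLS_2_5GB)
		return 0;

	/* Bridge does not list 2.5 GT/s as a supported speed. */
	if (!(b->lnkcap2 & LNKCAP2_SLS_2_5GB))
		return -1;

	/* Request 2.5 GT/s, then read back to confirm the write stuck. */
	if (b->lnkctl2_writable)
		b->lnkctl2 = (uint16_t)((b->lnkctl2 & ~LNKCTL2_TLS) |
					LNKCTL2_TLS_2_5GT);

	return ((b->lnkctl2 & LNKCTL2_TLS) == LNKCTL2_TLS_2_5GT) ? 0 : -1;
}

/* Retraining is refused only when the quirk is set and the downgrade fails. */
static bool model_may_retrain(bool quirk_set, struct model_bridge *b)
{
	return !(quirk_set && model_downgrade_to_gen1(b) != 0);
}
```

In this model a GEN1-only bridge passes unchanged, a GEN2 bridge with a writable Target Link Speed field is first forced to 2.5 GT/s and then allowed to retrain, and a bridge whose field does not accept the write is refused. A device without the quirk is never restricted.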