This patch adds the driver for the 10 Gigabit Ethernet controller in AMD
SoCs. The driver is written to be compatible with the iflib framework. The
existing driver supports the old version of the hardware. The driver
submitted here is for recent versions of the hardware, where the Ethernet
controller is PCIe based.
Details
- Reviewers: manu, andrew
- Group Reviewers: Core Team, network
- Commits: rS366628: 10Gigabit Ethernet driver for AMD SoC
The submitted driver has been verified with recent versions of the AMD SoC.
If there are existing users of the old version of this driver, kindly
let us know; we would like to work with you in validating it.
Diff Detail
- Lint: Lint Passed
- Unit: No Test Coverage
- Build Status: Buildable 33524, Build 30788: arc lint + arc unit
Event Timeline
Due to the added text I would not call this a BSD-3-Clause, too bad they added that text.
- This file incorporates work covered by the following copyright and
- permission notice:
- The Synopsys DWC ETHER XGMAC Software Driver and documentation
- (hereinafter "Software") is an unsupported proprietary work of Synopsys,
- Inc. unless otherwise expressly agreed to in writing between Synopsys
This text had me worried until I went and read the rest of it, which does expressly in writing grant certain rights.
sys/dev/axgbe/if_axgbe.c | ||
---|---|---|
539 | C++ style comments should not be used, please convert to C style, or why not just delete the line? | |
sys/dev/axgbe/if_axgbe_pci.c | ||
4 | Why is "All Rights Reserved" still being asserted in 2020? | |
sys/dev/axgbe/xgbe-common.h | ||
58–59 | Technically a copyright and license are 2 separate and different things, the "Copyright" should be factored out of the license section and placed before it. | |
sys/dev/axgbe/xgbe-desc.c | ||
58–59 | Again the all rights reserved thing, yet it is not asserted in the GPLv2 license, copy paste repeat mistakes? |
Change details :
- Added the SPDX-License-Identifier (BSD-2-Clause-FreeBSD) to the files where that is possible
- Moved copyright tags out of the License text as per the comment
- Changed to C style comments.
- Removed commented code and added TODO tags.
sys/dev/axgbe/if_axgbe_pci.c | ||
---|---|---|
4 | To understand "All rights reserved": this text relates only to the copyright and is unrelated to the license, right? And in general the "year" is the year the particular file first came into existence, and if multiple years are specified, they are the years in which the file was actively touched, right? If my understanding is right, is changing the year to 2020 the problem here? | |
sys/dev/axgbe/xgbe-common.h | ||
58–59 | Moved copyright above the License text and just have one instance. | |
sys/dev/axgbe/xgbe-desc.c | ||
58–59 | Just moved the copyright above and retained the "All rights reserved" text. Once I get clarity about the text, I will make the appropriate changes. |
sys/dev/axgbe/if_axgbe_pci.c | ||
---|---|---|
4 | Correct, the text "All rights reserved" is associated with the copyright and not the license. But this text is no longer needed, and we are trying to remove it when we can and not use it on any new code if possible. All the templates in FreeBSD have been updated to not have this text. Your statement about the "year" is the interpretation I have always used; some disagree and say it is valid to just list the first and last years as a range, 2014-2020 for example. There is no problem with the use of the year 2020 in this file; it is a new file. | |
sys/dev/axgbe/xgbe-common.h | ||
4 | This should probably have dates updated to 2014-2016, 2020. |
Changes :
- Removed "All rights reserved" and corrected years in the copyright
- Added a minor change in i2c and phy-v2 files for link status update for multi-port
sys/dev/axgbe/xgbe-common.h | ||
---|---|---|
4 | Other places also modified as needed. |
Hi Rajesh,
Sorry I haven't had time to even compile test this.
I think we're ok on the already addressed comment.
I'll commit this during this weekend if no one has an objection by then.
Thanks.
Changes :
- Removed unneeded header file includes
- Changed the Rx path to use 2 freelists instead of 1. Each Rx descriptor holds a header buffer and a data buffer. Earlier, entries from a single freelist were used to populate one descriptor. Now, with 2 freelists, one entry from each freelist is used to populate one descriptor. This keeps the indexes in sync between the driver and iflib.
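The index relationship described in the last item can be sketched as follows. This is only an illustration, not code from the patch: the helper names are hypothetical, and the single-freelist layout (two consecutive entries per descriptor) is an assumption based on the description above.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 512	/* matches "Using ... 512 RX descriptors" in the logs */

/*
 * Assumed single-freelist layout: descriptor i consumes two consecutive
 * entries, so the freelist index runs ahead of the descriptor index.
 */
static size_t
single_fl_hdr_index(size_t desc)
{
	assert(desc < RING_SIZE);
	return (2 * desc);
}

static size_t
single_fl_data_index(size_t desc)
{
	assert(desc < RING_SIZE);
	return (2 * desc + 1);
}

/*
 * Two-freelist layout: descriptor i takes entry i from the header
 * freelist and entry i from the data freelist.
 */
static size_t
dual_fl_index(size_t desc)
{
	assert(desc < RING_SIZE);
	return (desc);
}
```

With two freelists, descriptor i maps to entry i in both lists, so the descriptor index and the freelist indexes cannot drift apart.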
Hello,
I've just tested on my machine, an IEI Puzzle. (Sorry it took so long).
I needed this to get the driver to compile and load: https://reviews.freebsd.org/P422
I don't know if something changed in iflib that causes the nrxqs change.
Also I don't know if my sfp+ modules aren't compatible or if it's due to the driver but I have :
ax0: xgbe_phy_sfp_detect: mod absent
in a loop when I up the interface.
Log when I load the module :
ax0: <AMD 10 Gigabit Ethernet Driver> mem 0xef7e0000-0xef7fffff,0xef7c0000-0xef7dffff,0xef80e000-0xef80ffff irq 40 at device 0.4 on pci6
ax0: Using 512 TX descriptors and 512 RX descriptors
ax0: Using 4 RX queues 4 TX queues
ax0: Using MSI-X interrupts with 8 vectors
ax0: xgbe_phy_reset: no phydev
ax1: <AMD 10 Gigabit Ethernet Driver> mem 0xef7a0000-0xef7bffff,0xef780000-0xef79ffff,0xef80c000-0xef80dfff irq 41 at device 0.5 on pci6
ax1: Using 512 TX descriptors and 512 RX descriptors
ax1: Using 4 RX queues 4 TX queues
ax1: Using MSI-X interrupts with 8 vectors
ax1: xgbe_phy_reset: no phydev
ax2: <AMD 10 Gigabit Ethernet Driver> mem 0xef760000-0xef77ffff,0xef740000-0xef75ffff,0xef80a000-0xef80bfff irq 42 at device 0.6 on pci6
ax2: Using 512 TX descriptors and 512 RX descriptors
ax2: Using 2 RX queues 2 TX queues
ax2: Using MSI-X interrupts with 6 vectors
ax2: xgbe_phy_reset: no phydev
ax3: <AMD 10 Gigabit Ethernet Driver> mem 0xef720000-0xef73ffff,0xef700000-0xef71ffff,0xef808000-0xef809fff irq 43 at device 0.7 on pci6
ax3: Using 512 TX descriptors and 512 RX descriptors
ax3: Using 2 RX queues 2 TX queues
ax3: Using MSI-X interrupts with 6 vectors
ax3: xgbe_phy_reset: no phydev
ax0: Enabling TSO in channel 0
ax0: Enabling TSO in channel 1
ax0: Enabling TSO in channel 2
ax0: Enabling TSO in channel 3
ax0: RSS Enabled
ax0: Receive checksum offload Enabled
ax0: VLAN filtering Enabled
ax0: VLAN Stripping Enabled
I'm wondering about the 'ax1: xgbe_phy_reset: no phydev' line.
Thanks,
Hi @manu, thanks for trying the patch and making the other necessary changes. Sorry, I missed making the MPASS (nrxqs == 2) change when I submitted the v4 patch.
Regarding the log
'ax0: xgbe_phy_reset: no phydev'
This is normal before the interface is brought up and is not an issue. The log may be confusing, so I will change it to a verbose-level log in future patches.
Problem here is,
ax0: xgbe_phy_sfp_detect: mod absent
This means the SFP module is not recognized. We have tested with multiple SFP+ modules. Which SFP+ module are you using? Can you try a different one to verify?
For more verbose logs, you can enable debug logging by setting the sysctl below to an appropriate value (supported values are 1-3).
sysctl dev.ax.0.axgbe_debug_level=3
Currently enabling verbose logs for one device instance will set the same level for all instances. This will be corrected in future patches.
Thanks.
Problem here is,
ax0: xgbe_phy_sfp_detect: mod absent
This means the SFP module is not recognized. We have tested with multiple SFP+ modules. Which SFP+ module are you using? Can you try a different one to verify?
I only have a Finisar FTLX8574D3BCL for now. Do you have a known model that is supposed to work?
Also, can you make the printf not run in a loop?
For more verbose logs, you can enable debug logging by setting the sysctl below to an appropriate value (supported values are 1-3).
sysctl dev.ax.0.axgbe_debug_level=3
Currently enabling verbose logs for one device instance will set the same level for all instances. This will be corrected in future patches.
OK, will try that on Monday or Tuesday.
Thanks.
Fine @manu .
I only have a Finisar FTLX8574D3BCL for now. Do you have a known model that is supposed to work?
Also, can you make the printf not run in a loop?
We have tested with Finisar 10G SFP-to-Optical modules (not RJ45). Other than that, we have tested with 10G FS, Prolabs and 10GTek (all SFP-to-RJ45). I will make the change to print that log only once and other changes I mentioned before and submit a new patch tomorrow.
In the patch tomorrow, shall I include the changes you made in https://reviews.freebsd.org/P422 as well?
OK, weird that my modules don't work then, no?
In the patch tomorrow, shall I include the changes you made in https://reviews.freebsd.org/P422 as well?
Sure, thanks.
OK, weird that my modules don't work then, no?
Yeah. I don't think optical (or) RJ45 should make any difference. Trying with other modules will make things clear. One other thing to try is to just have one port enabled and see whether that makes any difference.
Log with debug_level=3 : https://reviews.freebsd.org/P423
Also if you update the code could you add the PNPINFO so the module will autoload via devmatch(8) please ?
Will do more tests after buying other SFP+ modules.
Thanks.
Changes :
- Fixed the debug sysctl to be device instance specific
- Minor Logging changes
@manu, all the discussed changes (avoid repetitive logs, fix the debug sysctl, and your changes in P422) are done, and the patch here is updated.
Also if you update the code could you add the PNPINFO so the module will autoload via devmatch(8) please
Regarding the above comment, PNPINFO is already added in the code (IFLIB_PNP_INFO) and devmatch -d lists the details appropriately. With the recent patch, the necessary changes to load the driver automatically are also done.
Regarding the logs shared in P423, the config looks correct from the logs. So the problem could be with the port values being read from the SFP. Additional logs are now added to indicate that. Please try with other SFPs as well.
Let me know if you have any more comments.
So I feel a bit dumb, but I now know why it didn't work.
Naming of interfaces is of course not the same as on the panel, but what confused me is that if I remove/insert the module in the slot that corresponds to ax0 I have:
ax3: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7220
ax3: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x2220
Which led me to think that I was using the right interface.
So, TL;DR: inserting/removing a module in any of the 4 slots always prints info for ax3.
Do you have any idea what's going on ?
It's a logging mistake @manu. I will correct it accordingly. Each nibble in that value denotes a port status. When you say you are touching only ax0, I see the correct nibble reflecting the change (bits[15:12]). From the values you showed here, looks like your SFP module is detected, but Link is still not. Is the link connected?
What was the problem with the SFP module earlier? Was it faulty?
Apart from this logging issue, are there any other changes needed, so that I can include them in the next patch?
The paste I've added was done with no cable plugged, but I have link with a copper adapter, haven't yet setup my 10G switch but I guess it will be ok.
What was the problem with the SFP module earlier? Was it faulty?
Yes it was, I felt a bit stupid when really testing this afternoon :)
Apart from this logging issue, are there any other changes needed, so that I can include them in the next patch?
I think we're good.
Fine. Good to hear you could progress @manu. I will make that logging change and update the patch.
Hi Rajesh,
I'm working with Deciso in order to test the AMD SoC XGBE driver on a board using the EPYC 3201 (4 cores). Note that there are only two SFP+ ports on this board, which might be different to your situation.
While experimenting with hotplugging the SFP+ modules, I noticed some inconsistent behaviour.
Frequently, the link does not seem to be established (or at least, not right away). Below you can find the log output.
Both interfaces have IP-addresses assigned to them.
--- pull the module out of ax1
Sep 24 09:41:10 OPNsense kernel: ax0: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7702
Sep 24 09:41:10 OPNsense kernel: ax1: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7702
Sep 24 09:41:10 OPNsense kernel: ax1: xgbe_phy_sfp_detect: mod absent
-- insert module into ax1
Sep 24 09:41:17 OPNsense kernel: ax0: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7202
Sep 24 09:41:17 OPNsense kernel: ax1: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7202
Sep 24 09:41:19 OPNsense kernel: ax0: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7002
Sep 24 09:41:19 OPNsense kernel: ax1: xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7002 <-- still down, expected up
--- after some time... (this does not happen every time, this is wildly inconsistent)
Sep 24 09:41:45 OPNsense kernel: ax1: Link is UP - 10Gbps/Full - flow control rx/tx
Sep 24 09:41:45 OPNsense kernel: ax1: link state changed to UP
Please note that I tested this with 3 different (but equivalent) SFP+ modules, to rule out any defects.
Let me know if you need more information and I will get back to you as soon as possible.
sys/dev/axgbe/xgbe-i2c.c | ||
---|---|---|
424 | According to the logs with axgbe_debug_level=3, it seems that reading the EEPROM times out after +/- 60 bytes. Since the I2C controller operates at a frequency of 100 kHz, it seems that it needs at least 11ms to complete the read operation of the EEPROM. I changed this locally to ticks + (10 * hz) and that seemed to fix the issue. On the safe side I would set this to 20 or 30. I wonder why this didn't happen on your side? |
Regarding the following statement, does it mean the link doesn't come up at all, or that it takes some delay to come up?
Frequently, the link does not seem to be established (or at least, not right away).
Regarding the following log,
xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7702
Every nibble in this value corresponds to a port (port0/left to port3/right). Values of the disabled ports are invalid. Currently, even if port 1 is touched you may get the log on port0 instance, which might be confusing. I will correct that and submit a patch.
Having said that, the logs you showed above look correct. When you plugged the module into port 1, you can see the second nibble (from the left) changing value to 0 (which means the module and the link are recognized). So I assume the issue being described here is that Link UP (explicit log) happens after a delay rather than instantly? How frequent is this?
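To make the nibble layout concrete, here is a minimal sketch of how the per-port field could be pulled out of the logged sfp_gpio_inputs value. The function name is hypothetical and this is not the driver's code; it only assumes what is stated above, namely that port 0 is the leftmost nibble and port 3 the rightmost. The meaning of the individual bits inside a nibble is not interpreted here.

```c
#include <assert.h>
#include <stdint.h>

#define SFP_PORTS 4	/* one nibble per port in the 16-bit value */

/*
 * Return the 4-bit status field for `port`, where port 0 is the
 * leftmost nibble (bits[15:12]) and port 3 the rightmost (bits[3:0]).
 */
static unsigned
sfp_port_nibble(uint16_t gpio_inputs, unsigned port)
{
	unsigned shift;

	assert(port < SFP_PORTS);
	shift = 4u * (SFP_PORTS - 1u - port);
	return ((gpio_inputs >> shift) & 0xFu);
}
```

Applied to the values in this thread: for 0x7702 the port 1 nibble is 0x7 (module pulled), and for 0x7002 it is 0x0.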
According to the logs with axgbe_debug_level=3, it seems that reading the EEPROM times out after +/- 60 bytes. Since the I2C controller operates at a frequency of 100 kHz, it seems that it needs at least 11ms to complete the read operation of the EEPROM. I changed this locally to ticks + (10 * hz) and that seemed to fix the issue. On the safe side I would set this to 20 or 30.
I wonder why this didn't happen on your side?
Good to hear you have a fix for this. In my setup, I haven't faced a situation where reading the EEPROM times out. We faced a bit of delay in Link UP, but that's intermittent. It's interesting to hear about this timeout scenario. If there are timeouts, we have an "error" message which prints without increasing the log level. Have you seen any of those error messages? Have you tried adding any logs to compute the number of EEPROM bytes read before the timeout and the actual time needed to complete the EEPROM read? We had that 1ms timeout in line with Linux. I will check whether Linux also shows any delay.
FYI, I am working remotely and have limited hardware access until Oct 5th. I will try to respond to your questions ASAP anyway.
Regarding the following statement, does it mean the link doesn't come up at all, or that it takes some delay to come up?
Frequently, the link does not seem to be established (or at least, not right away).
Having said that, the logs you showed above look correct. When you plugged the module into port 1, you can see the second nibble (from the left) changing value to 0 (which means the module and the link are recognized). So I assume the issue being described here is that Link UP (explicit log) happens after a delay rather than instantly? How frequent is this?
In most cases the link comes up after a delay (a small delay could be normal, but 30 or more seconds isn't right). However, I noticed that switching between ports 0 and 1 always causes the port of the newly inserted module to give the correct signals, like so:
xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7002
But it does not always give a link UP signal, no matter how long I wait.
As you said, I also see the correct nibble changing upon insertion, be it port 0 or 1 (0x702 and 0x7002 respectively).
How frequent is this?
The problem here is that I tried to reproduce a consistent scenario, but couldn't. Rebooting, unloading and loading the driver all seemed to lead to varying results. Sometimes it would immediately recognize a link, at other times it would have a delay, and sometimes the link doesn't come up at all, even though the module recognizes a link.
Regarding the following log,
xgbe_phy_sfp_signals: sfp_gpio_inputs: 0x7702
Every nibble in this value corresponds to a port (port0/left to port3/right). Values of the disabled ports are invalid. Currently, even if port 1 is touched you may get the log on port0 instance, which might be confusing. I will correct that and submit a patch.
Please ignore the last nibble in this value, as one of the I/O ports of the PCA9535 is connected to something else.
According to the logs with axgbe_debug_level=3, it seems that reading the EEPROM times out after +/- 60 bytes. Since the I2C controller operates at a frequency of 100 kHz, it seems that it needs at least 11ms to complete the read operation of the EEPROM. I changed this locally to ticks + (10 * hz) and that seemed to fix the issue. On the safe side I would set this to 20 or 30.
I wonder why this didn't happen on your side?
Good to hear you have a fix for this. In my setup, I haven't faced a situation where reading the EEPROM times out. We faced a bit of delay in Link UP, but that's intermittent. It's interesting to hear about this timeout scenario. If there are timeouts, we have an "error" message which prints without increasing the log level. Have you seen any of those error messages? Have you tried adding any logs to compute the number of EEPROM bytes read before the timeout and the actual time needed to complete the EEPROM read? We had that 1ms timeout in line with Linux. I will check whether Linux also shows any delay.
I have indeed seen those error messages with and without the increased log level. Actually, the verbose logs already give an indication as to how many bytes are read before timing out: (if my interpretation of the logs is correct)
ax0: xgbe_i2c_read: rx_slots 0 rx_len 128
ax0: xgbe_i2c_write: tx_slots 15 tx_len 128
ax0: xgbe_i2c_write: cmd 256 tx_len 128
ax0: xgbe_i2c_write: cmd 256 tx_len 127
ax0: xgbe_i2c_write: cmd 256 tx_len 126
ax0: xgbe_i2c_write: cmd 256 tx_len 125
ax0: xgbe_i2c_write: cmd 256 tx_len 124
ax0: xgbe_i2c_write: cmd 256 tx_len 123
ax0: xgbe_i2c_write: cmd 256 tx_len 122
ax0: xgbe_i2c_write: cmd 256 tx_len 121
ax0: xgbe_i2c_write: cmd 256 tx_len 120
ax0: xgbe_i2c_write: cmd 256 tx_len 119
ax0: xgbe_i2c_write: cmd 256 tx_len 118
ax0: xgbe_i2c_write: cmd 256 tx_len 117
ax0: xgbe_i2c_write: cmd 256 tx_len 116
ax0: xgbe_i2c_write: cmd 256 tx_len 115
ax0: xgbe_i2c_write: cmd 256 tx_len 114
ax0: xgbe_i2c_isr: ret 0 stop 0
ax0: xgbe_i2c_isr: isr 0x2510
ax0: xgbe_i2c_isr: I2C interrupt status=0x00002510
ax0: xgbe_i2c_read: op cmd 0
ax0: xgbe_i2c_read: rx_slots 15 rx_len 128
ax0: xgbe_i2c_write: tx_slots 15 tx_len 113
ax0: xgbe_i2c_write: cmd 256 tx_len 113
ax0: xgbe_i2c_write: cmd 256 tx_len 112
ax0: xgbe_i2c_write: cmd 256 tx_len 111
ax0: xgbe_i2c_write: cmd 256 tx_len 110
ax0: xgbe_i2c_write: cmd 256 tx_len 109
ax0: xgbe_i2c_write: cmd 256 tx_len 108
ax0: xgbe_i2c_write: cmd 256 tx_len 107
ax0: xgbe_i2c_write: cmd 256 tx_len 106
ax0: xgbe_i2c_write: cmd 256 tx_len 105
ax0: xgbe_i2c_write: cmd 256 tx_len 104
ax0: xgbe_i2c_write: cmd 256 tx_len 103
ax0: xgbe_i2c_write: cmd 256 tx_len 102
ax0: xgbe_i2c_write: cmd 256 tx_len 101
ax0: xgbe_i2c_write: cmd 256 tx_len 100
ax0: xgbe_i2c_write: cmd 256 tx_len 99
ax0: xgbe_i2c_isr: ret 0 stop 0
ax0: xgbe_i2c_isr: isr 0x2510
ax0: xgbe_i2c_isr: I2C interrupt status=0x00002510
ax0: xgbe_i2c_read: op cmd 0
ax0: xgbe_i2c_read: rx_slots 15 rx_len 113
ax0: xgbe_i2c_write: tx_slots 15 tx_len 98
ax0: xgbe_i2c_write: cmd 256 tx_len 98
ax0: xgbe_i2c_write: cmd 256 tx_len 97
ax0: xgbe_i2c_write: cmd 256 tx_len 96
ax0: xgbe_i2c_write: cmd 256 tx_len 95
ax0: xgbe_i2c_write: cmd 256 tx_len 94
ax0: xgbe_i2c_write: cmd 256 tx_len 93
ax0: xgbe_i2c_write: cmd 256 tx_len 92
ax0: xgbe_i2c_write: cmd 256 tx_len 91
ax0: xgbe_i2c_write: cmd 256 tx_len 90
ax0: xgbe_i2c_write: cmd 256 tx_len 89
ax0: xgbe_i2c_write: cmd 256 tx_len 88
ax0: xgbe_i2c_write: cmd 256 tx_len 87
ax0: xgbe_i2c_write: cmd 256 tx_len 86
ax0: xgbe_i2c_write: cmd 256 tx_len 85
ax0: xgbe_i2c_write: cmd 256 tx_len 84
ax0: xgbe_i2c_isr: ret 0 stop 0
ax0: xgbe_i2c_xfer: operation timed out
ax0: xgbe_i2c_isr: isr 0x2510
ax0: xgbe_i2c_isr: I2C interrupt status=0x00002510
ax0: xgbe_i2c_read: op cmd 0
ax0: xgbe_i2c_read: rx_slots 0 rx_len 98
ax0: xgbe_i2c_write: tx_slots 15 tx_len 83
ax0: xgbe_i2c_write: cmd 256 tx_len 83
ax0: xgbe_i2c_write: cmd 256 tx_len 82
ax0: xgbe_i2c_write: cmd 256 tx_len 81
ax0: xgbe_i2c_write: cmd 256 tx_len 80
ax0: xgbe_i2c_write: cmd 256 tx_len 79
ax0: ax0: xgbe_i2c_write: cmd 256 tx_len 78 xgbe_i2c_disable: final i2c_disable 0
ax0: ax0: xgbe_phy_i2c_read: ret2 -60 retry 1 xgbe_i2c_write: cmd 256 tx_len 77
ax0: ax0: xgbe_i2c_write: cmd 256 tx_len 76
ax0: sfp_base[XGBE_SFP_BASE_ID] : 0x0003 xgbe_i2c_write: cmd 256 tx_len 75
ax0: ax0: xgbe_i2c_write: cmd 256 tx_len 74 sfp_base[XGBE_SFP_BASE_EXT_ID] : 0x0004
ax0: ax0: xgbe_i2c_write: cmd 256 tx_len 73 sfp_base[XGBE_SFP_BASE_CABLE] : 0x0000
ax0: xgbe_i2c_write: cmd 256 tx_len 72
ax0: ax0: I2C error reading SFP EEPROM xgbe_i2c_write: cmd 256 tx_len 71
ax0: i2c xfer started ---->>>
ax0: ax0: xgbe_i2c_write: cmd 3 tx_len 70 xgbe_i2c_disable: final i2c_disable 0
ax0: xgbe_i2c_write: cmd 4 tx_len 69
ax0: xgbe_i2c_isr: ret 0 stop 0
ax0: xgbe_i2c_isr: isr 0x710
ax0: xgbe_i2c_isr: I2C interrupt status=0x00000710
ax0: xgbe_i2c_read: op cmd 1
ax0: xgbe_i2c_write: tx_slots 15 tx_len 0
ax0: xgbe_i2c_isr: ret 0 stop 1
ax0: xgbe_i2c_xfer: I2C OP complete
ax0: xgbe_i2c_xfer: i2c xfer ret 0 abrt_source 0x0
ax0: i2c xfer finished ---->>>
ax0: xgbe_i2c_disable: final i2c_disable 0
ax0: xgbe_phy_sfp_detect: eeprom read failed
Furthermore, I applied a logic analyzer to actually see the I2C signal and see how much time it takes to complete an EEPROM read, this always comes up at around 11.5ms. If you are interested I can send over a CSV file containing the data.
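The measured figure is consistent with simple bus arithmetic. The sketch below is a back-of-the-envelope estimate, not driver code: the function name is hypothetical, it assumes 9 clock cycles per byte (8 data bits plus the ACK/NACK bit), and it ignores START/STOP conditions and any clock stretching. The 128-byte length comes from the rx_len 128 in the log above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the time, in microseconds, needed to clock `nbytes` over an
 * I2C bus running at `bus_hz`, at 9 SCL cycles per byte.
 */
static uint64_t
i2c_xfer_time_us(uint32_t nbytes, uint32_t bus_hz)
{
	assert(bus_hz > 0);
	return (((uint64_t)nbytes * 9u * 1000000u) / bus_hz);
}
```

For a 128-byte read at 100 kHz this gives 128 * 9 / 100000 s = 11.52 ms, which lines up with the roughly 11.5 ms seen on the logic analyzer, so any timeout shorter than that cannot cover a full EEPROM read.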
FYI, I am working remotely and have limited hardware access until Oct 5th. I will try to respond to your questions ASAP anyway.
Understandable given the current situation. I appreciate the effort!
Thanks @stephan.dewt_yahoo.co.uk for the detailed info and sharing the logs.
So, with the increased timeout I believe you are not seeing the EEPROM read issue and the link always comes up. But are you still often facing delays in link up? Is it blocking your progress in any way?
As I mentioned, we have experienced this delay in link up intermittently. But this EEPROM read issue is interesting. I will see what is different in my setup. If that timeout increase is the only change, I will set it to 20 ms, include the changes to correct those logs, test to see if there is any other impact, and submit the next patch. Please let me know if there are any other changes.
Furthermore, I applied a logic analyzer to actually see the I2C signal and see how much time it takes to complete an EEPROM read, this always comes up at around 11.5ms. If you are interested I can send over a CSV file containing the data.
This is interesting. It would be helpful if you share the details.
So, with the increased timeout I believe you are not seeing the EEPROM read issue and the link always comes up. But are you still often facing delays in link up? Is it blocking your progress in any way?
The increased timeout indeed eliminates the issue of reading the EEPROM. I don't think this has anything to do with the link issue, as there is (as far as I know) a service timer that executes every second to see if any signals have changed, it then proceeds to read the EEPROM again, regardless of whether the module has changed. We should not see a Link UP after 30 seconds or more (or never) if the service timer executes every second. Could this have something to do with the integration with iflib? My suspicion comes from the fact that the "Link UP" message comes from iflib.
By the way, I also applied the logic analyzer to the Linux driver and saw the exact same results, except no EEPROM read fail....
This is interesting. It would be helpful if you share the details.
Of course, I will upload the files.
As a potential optimization: the PCA9535 has an interrupt signal attached to a GPIO line in the EPYC. Could it be worthwhile to investigate replacing the service timer with an interrupt routine?
The increased timeout indeed eliminates the issue of reading the EEPROM. I don't think this has anything to do with the link issue, as there is (as far as I know) a service timer that executes every second to see if any signals have changed, it then proceeds to read the EEPROM again, regardless of whether the module has changed. We should not see a Link UP after 30 seconds or more (or never) if the service timer executes every second.
Just to make sure of my understanding: by link issue, do you just mean the delay of more than 30 secs for Link UP? Or delayed link up + link not coming up intermittently? I just wanted to check whether you still see the link not coming up after you fix the EEPROM read issue.
Could this have something to do with the integration with iflib? My suspicion comes from the fact that the "Link UP" message comes from iflib.
I don't think iflib has a part here, because the driver also has a Link UP message (at the default log level). If that's not shown, then the driver couldn't recognize the link up.
By the way, I also applied the logic analyzer to the Linux driver and saw the exact same results, except no EEPROM read fail....
Interesting to know that Linux also exhibits similar behavior. I thought Linux might also have the EEPROM read failure (as it uses the same timeout value), but not the other issue. But it looks like the inverse.
As a potential optimization: the PCA9535 has an interrupt signal attached to a GPIO line in the EPYC. Could it be worthwhile to investigate replacing the service timer with an interrupt routine?
From my perspective, whether it is an interrupt routine or a timer-based routine, the link state change will be notified if the driver sees the appropriate change in the hardware signals. Please correct me if I am wrong.
And is that connection of the PCA9535 interrupt signal to GPIO pins board specific? If so, how can the driver be notified about the GPIO pin to use as the interrupt line? That info is needed by "gpio_alloc_intr_resource", right?
Few more questions,
- Is this issue seen only during hotplug/hotunplug? Or even when you have the link fixed, but just do the driver load/unload or a soft link down/up?
- Does enabling verbose logs point to any errors when you hit the issue (like those shown during the EEPROM read timeout)?
- Which BIOS is being used? Is it a CUSTOM one or a standard release?
I will do some more debugging to recreate the issue in my setup and add more logs to debug the link status check path. I will also check internally to check for any known related issues.
Thanks for sharing the I2C logic analyzer details.
Hi Rajesh, apologies for the late reply.
Just to make sure of my understanding: by link issue, do you just mean the delay of more than 30 secs for Link UP? Or delayed link up + link not coming up intermittently? I just wanted to check whether you still see the link not coming up after you fix the EEPROM read issue.
I mean the delayed link up + link not coming up intermittently.
From my perspective, whether it is an interrupt routine or a timer-based routine, the link state change will be notified if the driver sees the appropriate change in the hardware signals. Please correct me if I am wrong.
And is that connection of the PCA9535 interrupt signal to GPIO pins board specific? If so, how can the driver be notified about the GPIO pin to use as the interrupt line? That info is needed by "gpio_alloc_intr_resource", right?
Scratch that idea, it is indeed board-specific. A timer based routine is fine.
Few more questions,
- Is this issue seen only during hotplug/hotunplug? Or even when you have the link fixed, but just do the driver load/unload or a soft link down/up?
The issue is seen both during hotplug/hotunplug. As I said, I tried to create a reproducible scenario. During testing this morning, I tried:
- Loading the driver while the module is plugged in caused the link to go UP immediately when assigning an IP address on interface 0 (different from last week, same driver code).
- Assigning an IP address to interface 1, after which I pulled out the module of ax0 and inserted it into ax1 caused the link to go UP immediately.
- Having both interfaces assigned, switching the module from ax1 to ax0 again caused the link to go UP, but only after about 10 seconds.
- Doing this again, the exact same thing happens, except the link goes UP immediately on the third step.
- Does enabling verbose logs point to any errors when you hit the issue (like those shown during the EEPROM read timeout)?
Not as far as I can see.
- Which BIOS is being used? Is it a CUSTOM one or a standard release?
It is a custom BIOS.
I will do some more debugging to recreate the issue in my setup and add more logs to debug the link status check path. I will also check internally to check for any known related issues.
I also tested the link issue with an optical module+cable. This time, I had both interfaces configured beforehand. After inserting the module into ax0, no link is established. Inserting it into ax1 also does nothing:
dmesg | grep Link
ax0: link_status returned Link: 0 an_restart: 0
ax1: ax0: Link Deactive
ax0: ax1: xgbe_phy_status: Link 0 phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
ax0: link_status returned Link: 0 an_restart: 0 Link Deactive
ax0: ax1: xgbe_phy_status: Link 0
ax0: phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive link_status returned Link: 0 an_restart: 0 Link Deactive
ax0: xgbe_phy_status: Link 0
ax0: phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive link_status returned Link: 0 an_restart: 0
ax0: ax1: Link Deactive
ax1: xgbe_phy_status: Link 0
ax1: phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
ax0: link_status returned Link: 0 an_restart: 0 Link Deactive
ax0: xgbe_phy_status: Link 0
ax0: phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
ax0: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
ax1: ax0: xgbe_phy_status: Link 0 phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
ax1: link_status returned Link: 0 an_restart: 0
ax0: Link Deactive
ax0: ax1: xgbe_phy_status: Link 0
ax0: ax1: phy_link 0 Link 0 new_state 0
ax1: link_status returned Link: 0 an_restart: 0
ax1: Link Deactive
Interestingly, after unloading the driver and loading it again, it immediately establishes a link, after which the link goes DOWN after a few seconds (exclusively on the optical module).
Furthermore, when running
ifconfig -m ax0
I do not see any option for 1000Base-T. My knowledge on this subject is a little fuzzy, but I would like to connect to a 1Gb switch via RJ45. Autonegotiation should then establish a 1Gbit link. The actual media reported when inserting both the RJ45 SFP+ module and the optical module is 10GBase-KR. I don't think this is correct. Could you elaborate on this?
The issue is seen both during hotplug/hotunplug. As I said, I tried to create a reproducible scenario. During testing this morning, I tried:
In an attempt to recreate the issue, I tested soft link up/down (a scripted ifconfig ax0 down/ifconfig ax0 up loop) for around 500 iterations with a maximum timeout of 10 seconds. I didn't see any delay, and the link never failed to come up. So it looks like this mostly has to do with hotplug/hotunplug scenarios only. Your comment about optical testing also supports this.
Unfortunately, I couldn't do the hotplug test until Oct 5th with my remote setup as mentioned before. Sorry about that. I will do my hotplug testing next week and see whether I can recreate the problem. Until then, I will see if I can recreate the problem somehow.
It is a custom BIOS.
I assume there are no issues on the BIOS side.
I also tested the link issue with an optical module+cable. This time, I had both interfaces configured beforehand. After inserting the module into ax0, no link is established. Inserting it into ax1 also does nothing:
We have tested with optical cable as well, but at minimal level compared to RJ45. I will try optical cable also in my testing next week for recreating the issue.
I do not see any option for 1000Base-T. My knowledge on this subject is a little fuzzy, but I would like to connect to a 1Gb switch via RJ45. Autonegotiation should then establish a 1Gbit link. The actual media reported when inserting both the RJ45 SFP+ module and the optical module is 10GBase-KR. I don't think this is correct. Could you elaborate on this?
We have a code issue here. Currently, we are hard coding (in media_status call back) the media type to 10GBase-KR if it's a 10G link irrespective of whatever module being plugged. Also, we haven't added 1000Base-T in the supported media(ifmedia_add). That is why you are not seeing 1000Base-T in ifconfig -m output. I will make the appropriate changes (along with the previous discussed changes) and submit a new patch by end of today.
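As a rough illustration of the fix described above, the reported media type can be derived from the detected module instead of being hard-coded. This is only a sketch: the enum, its values and the function name are hypothetical stand-ins, not the driver's identifiers, and the strings are plain labels. In the driver itself the result would presumably be expressed through the IFM_* constants and ifmedia_add() from net/if_media.h.

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for the module types the driver detects.
 * These names and values are assumptions made for this sketch only.
 */
enum sketch_sfp_base {
	SKETCH_SFP_UNKNOWN,
	SKETCH_SFP_1000_T,	/* RJ45 copper module */
	SKETCH_SFP_10000_SR	/* short-range optical module */
};

/*
 * Pick the media label from the port type and the inserted module,
 * rather than reporting 10GBase-KR for every 10G link.
 */
static const char *
sketch_media_name(int port_is_sfp, enum sketch_sfp_base base)
{
	if (!port_is_sfp)
		return ("10GBase-KR");	/* backplane port */

	switch (base) {
	case SKETCH_SFP_1000_T:
		return ("1000Base-T");
	case SKETCH_SFP_10000_SR:
		return ("10GBase-SR");
	default:
		return ("unknown");
	}
}
```

With a mapping like this, an optical module would show up as 10GBase-SR and an RJ45 module as 1000Base-T, provided the same media types are also registered as supported.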
We have tested 10G, 1G, back-to-back and switch configs. But I don't remember testing a 10G Link with a 1G switch and autoneg to 1G link. I will try to do this test as well next week. But from a code perspective, this should work.
Summarizing, I will give an updated patch today. And I will do some more testing next week. Meanwhile, if I have any clues/inputs I will update accordingly.
Hi Rajesh,
Unfortunately, I couldn't do the hotplug test until Oct 5th with my remote setup as mentioned before. Sorry about that. I will do my hotplug testing next week and see whether I can recreate the problem. Until then, I will see if I can recreate the problem somehow.
That's no problem at all. The link issue is indeed hotplug-related. I will also continue to see if I can recreate the problem. I suspect that it is a timing issue, so I'm logging everything to see if I can find the issue.
We have a code issue here. Currently, we are hard coding (in media_status call back) the media type to 10GBase-KR if it's a 10G link irrespective of whatever module being plugged. Also, we haven't added 1000Base-T in the supported media(ifmedia_add). That is why you are not seeing 1000Base-T in ifconfig -m output. I will make the appropriate changes (along with the previous discussed changes) and submit a new patch by end of today.
I suspected such a thing, thanks for confirming.
We have tested 10G, 1G, back-to-back and switch configs. But I don't remember testing a 10G Link with a 1G switch and autoneg to 1G link. I will try to do this test as well next week. But from a code perspective, this should work.
I have confirmed this to be working (the sfp+ module does the job of autonegotiation for us).
Thanks for all the hard work, I look forward to the patch.
Changes:
- Increased timeout from 1 to 20 for SFP EEPROM read
- Changes to add support for appropriate media types
- Changes to list appropriate media type in ifconfig output
- Logging changes.
Hi,
Thanks for the update.
It seems that you've also included unrelated changes (likely a mis-merge, like the IOMMU/DMAR rename in GENERIC).
After fixing this I'll commit this driver and we can take care of the latest bugs later in the tree; having it in tree will make it easier for people to report bugs.
Thanks.
Thanks for pointing out the mis-merge @manu. Sorry about that.
Shall I update it? Or are you planning to update it yourself and commit?
If you can update it that would be great, otherwise I'll handle this when committing later today or tomorrow.
Hi @manu. Fixed the GENERIC file. Please let me know if anything else needs to be taken care of.
Just did a build test for arm64 before committing:
ld: error: undefined symbol: xgbe_init_function_ptrs_phy_v1
referenced by if_axgbe.c
if_axgbe.o:(xgbe_v1)
ld: error: undefined symbol: bitrev32
referenced by xgbe-dev.c:823 (/usr/home/manu/Work/freebsd/freebsd-svn/base/head/sys/dev/axgbe/xgbe-dev.c:823)
xgbe-dev.o:(xgbe_update_vlan_hash_table)
--- kernel.bin ---
Will fix tomorrow if you haven't.
Hi @manu,
I am trying to build for arm64. It's taking a lot of time to buildworld (even with 16 threads). I will see how it goes.
Just to make sure: are the changes in the following files correct?
- sys/amd64/conf/GENERIC - added "device axp" - should this be "axgbe" as mentioned in below files?
- sys/conf/NOTES - added "device axgbe" (as this is common to both amd64 and arm64). Should this be "axp" specifically?
- sys/conf/files.amd64 - added the files for if_axp alone for now. Will it have any trouble with arm boards?
- sys/modules/Makefile - added "axgbe". Should this be "axp" specifically?
axp is correct since it's what the driver is named in sys/conf/files.
- sys/conf/NOTES - added "device axgbe" (as this is common to both amd64 and arm64). Should this be "axp" specifically?
Oh yeah, this should be axp and this should be in sys/amd64/conf/NOTES and sys/arm64/conf/NOTES since axp is amd64 or arm64 only iirc.
- sys/conf/files.amd64 - added the files for if_axp alone for now. Will it have any trouble with arm boards?
Yeah, all files common to axp and axa should be in sys/conf/files
AMD64 only files should be in sys/conf/files.amd64 and arm64 only files in sys/conf/files.arm64
- sys/modules/Makefile - added "axgbe". Should this be "axp" specifically?
No, as there are subdirs.
Buildworld finally completed after around 3 hours. It looks like the issue is caused by the missing files in files.<arch>
axp is correct since it's what the driver is named in sys/conf/files.
Corresponding change made in sys/arm64/conf/GENERIC as well.
Oh yeah, this should be axp and this should be in sys/amd64/conf/NOTES and sys/arm64/conf/NOTES since axp is amd64 or arm64 only iirc.
Necessary change done in sys/amd64/conf/NOTES and sys/arm64/conf/NOTES
Yeah, all files common to axp and axa should be in sys/conf/files
AMD64 only files should be in sys/conf/files.amd64 and arm64 only files in sys/conf/files.arm64
Moved the common files to sys/conf/files. But it's still breaking. If we have the common files placed in both files.amd64 and files.arm64, things are compiling properly. So, can I retain the common files in both?
Also, I noticed an "unused variable" error in phy-v1.c, which is also corrected now.
I am doing a clean run now. Once that's complete I will update the patch.
Regarding the link issue you are facing, can you try setting "an_cdr_workaround = 0" in if_axgbe_pci.c for the appropriate version and give it a try?
Hello Rajesh,
Regarding the link issue you are facing, can you try setting "an_cdr_workaround = 0" in if_axgbe_pci.c for the appropriate version and give it a try?
Unfortunately, this does not seem to fix the issue.
I have been testing with the fiber module and noticed different behaviour based on the link partner. Using a thunderbolt-to-SFP+ module and connecting this end to the SFP+ port on the board resulted in the driver recognizing a link, but hotplugging after this link seems to break the driver. By this I mean that after hotplugging, the external SFP module is being power-cycled (RRC) every 10 seconds in order to detect a change in link status, but the result always ends up in a PCS register read of 0xc2 (1100 0010), which according to the IEEE specification of the PCS, means no link, and a fault condition has been detected.
Plugging the other end into a 10Gb Switch (slower equipment) after a reboot, with the driver also being loaded on early boot, seems to create a very stable connection, although I have noticed the same issue happening on very rare occasions.
With this stable connection I see a Link UP happening every time I reconnect the module.
Interestingly, when I plug in a fiber connection from port 0 directly TO port 1, the driver alternates between Link UP and Link DOWN on interface 0 and interface 1 respectively. This behaviour should not occur. Note that this also happens on Linux.
Linux:
[ 425.402969] amd-xgbe 0000:07:00.4 enp7s0f4: Link is Down
[ 425.433840] amd-xgbe 0000:07:00.5 enp7s0f5: Link is Up - 10Gbps/Full - flow control rx/tx
[ 425.433856] IPv6: ADDRCONF(NETDEV_CHANGE): enp7s0f5: link becomes ready
[ 436.678755] amd-xgbe 0000:07:00.5 enp7s0f5: Link is Down
[ 437.670487] amd-xgbe 0000:07:00.4 enp7s0f4: Link is Up - 10Gbps/Full - flow control rx/tx
[ 437.670503] IPv6: ADDRCONF(NETDEV_CHANGE): enp7s0f4: link becomes ready
[ 448.935356] amd-xgbe 0000:07:00.4 enp7s0f4: Link is Down
[ 448.966479] amd-xgbe 0000:07:00.5 enp7s0f5: Link is Up - 10Gbps/Full - flow control rx/tx
[ 460.230995] amd-xgbe 0000:07:00.5 enp7s0f5: Link is Down
FreeBSD:
ax1: Link is DOWN ax0: Link is UP - 10Gbps/Full - flow control rx/tx ax0: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN ax1: Link is UP - 10Gbps/Full - flow control rx/tx ax1: Link is DOWN
The Linux driver is the stock module based on kernel 5.4.0.
I have been trying the hotplug/hotunplug experiment in my setup, which is a back-to-back connection between the AMD SFP ports and an external link partner. As I mentioned earlier, I intermittently see a delay in the link coming up, and when that happens I observe similar behavior: the PCS register has a value of 0xC2 (even though the SFP GPIO signals sense the link immediately). We expect a value of 0x46 (basically, bit[2] set). One possible reason could be the CDR workaround, but that option is ruled out. In addition, I see the delay only when the SFP module is touched, not when just the link is hotplugged/unplugged. But I haven't experienced a scenario where the link doesn't come up at all.
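To make the two register values discussed here easier to compare, the following sketch decodes them. It assumes the register in question is the Clause 45 PCS status 1 register, where bit 2 is the receive link status and bit 7 is the fault flag, which matches the interpretation given in this thread. The macro and function names are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the PCS status register (names are hypothetical). */
#define SKETCH_PCS_STAT1_RX_LINK	0x0004	/* bit 2: receive link status */
#define SKETCH_PCS_STAT1_FAULT		0x0080	/* bit 7: fault condition detected */

/* True when the PCS reports the receive link as up. */
static bool
sketch_pcs_link_up(uint16_t stat1)
{
	return ((stat1 & SKETCH_PCS_STAT1_RX_LINK) != 0);
}

/* True when the PCS reports a fault condition. */
static bool
sketch_pcs_fault(uint16_t stat1)
{
	return ((stat1 & SKETCH_PCS_STAT1_FAULT) != 0);
}
```

Under that assumption, 0xC2 (1100 0010) decodes as link down with a fault, while the expected 0x46 (0100 0110) decodes as link up without a fault.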
Talking to my peers here, looks like there is a similar issue being reported internally. That seems related to Auto-negotiation and PCS hardware. But, I am yet to get the full details of the same. I will update once I get the details.
Regarding the port0 - port1 direct back-to-back connection: I just did a quick test in my setup here and couldn't see the toggling issue. As expected, the link goes down on both sides when I pull the cable/SFP module on just one side, and comes back up. But thanks for bringing this up; let me do some more testing to see the problem.
If there are no other comments on the current code, let's have it upstream and then we shall have a ticket opened for the same and work on it. Let me know if you have any more comments on the current code.
Hi Rajesh,
Regarding the port0 - port1 direct back-to-back connection: I just did a quick test in my setup here and couldn't see the toggling issue. As expected, the link goes down on both sides when I pull the cable/SFP module on just one side, and comes back up. But thanks for bringing this up; let me do some more testing to see the problem.
Keep in mind that this test was performed using a fiber connection. Using an SFP module with a copper cable (RJ45) produces a different result: only one port establishes a link and the other does not; this does not seem to change over time.
Talking to my peers here, looks like there is a similar issue being reported internally. That seems related to Auto-negotiation and PCS hardware. But, I am yet to get the full details of the same. I will update once I get the details.
An update would be great, please keep me informed.
If there are no other comments on the current code, let's have it upstream and then we shall have a ticket opened for the same and work on it. Let me know if you have any more comments on the current code.
Could you update the patch to include support for netmap? I heard you tested this before and I would like to test this as well.
Talking to my peers here, looks like there is a similar issue being reported internally. That seems related to Auto-negotiation and PCS hardware. But, I am yet to get the full details of the same. I will update once I get the details.
Regarding this, it looks like the same issue. But it is reported specifically when the ports work in KR mode and auto-negotiation (CL73) is enabled. This issue is still being worked on internally.
In my setup, port_mode is SFP, and sfp_base is XGBE_SFP_BASE_10000_SR, where auto-negotiation is disabled. But I still see the same behavior. From earlier logs, your port mode is also SFP, but not sure about sfp_base and whether Auto-negotiation is enabled. Can you please check that (enabling verbose logs during hotplug would give that info)? It would be good if you can share a verbose log capture from your setup (starting from driver load).
I will discuss this internally, verify whether they are the same issue, and take it forward accordingly.
Apart from this, are there any other comments on the current code?
Hi Rajesh,
In my setup, port_mode is SFP, and sfp_base is XGBE_SFP_BASE_10000_SR, where auto-negotiation is disabled. But I still see the same behavior. From earlier logs, your port mode is also SFP, but not sure about sfp_base and whether Auto-negotiation is enabled. Can you please check that (enabling verbose logs during hotplug would give that info)? It would be good if you can share a verbose log capture from your setup (starting from driver load).
Of course, here are the relevant hotplug logs:
<<Module is inserted and I2C transfer begins>>
ax0: i2c xfer finished ---->>>
ax0: xgbe_i2c_disable: final i2c_disable 0
ax0: xgbe_phy_i2c_read: ret2 0 retry 1
ax0: sfp_base[XGBE_SFP_BASE_ID] : 0x0003
ax0: sfp_base[XGBE_SFP_BASE_EXT_ID] : 0x0004
ax0: sfp_base[XGBE_SFP_BASE_CABLE] : 0x0000
ax0: SFP detected:
ax0: vendor: Uptimed
ax0: part number: UP-TR-SR-CI
ax0: revision level: V02
ax0: serial number: UPC18SR260024
ax0: i2c xfer started ---->>>
ax0: xgbe_i2c_disable: final i2c_disable 0
ax0: xgbe_i2c_isr: isr 0x10
ax0: xgbe_i2c_isr: I2C interrupt status=0x00000010
ax0: xgbe_i2c_read: op cmd 1
ax0: xgbe_i2c_write: tx_slots 15 tx_len 1
ax0: xgbe_i2c_write: cmd 0 tx_len 1
ax0: xgbe_i2c_isr: ret 0 stop 0
ax0: xgbe_i2c_isr: isr 0x710
ax0: xgbe_i2c_isr: I2C interrupt status=0x00000710
ax0: xgbe_i2c_read: op cmd 1
ax0: xgbe_i2c_write: tx_slots 15 tx_len 0
ax0: xgbe_i2c_isr: ret 0 stop 1
ax0: xgbe_i2c_xfer: I2C OP complete
ax0: xgbe_i2c_xfer: i2c xfer ret 0 abrt_source 0x0
ax0: i2c xfer finished ---->>>
ax0: xgbe_i2c_disable: final i2c_disable 0
ax0: xgbe_phy_sfp_parse_eeprom: sfp_base: 0x5 sfp_speed: 0x3 sfp_cable: 0x1 rx_los 0x0 tx_fault 0x0
ax0: xgbe_phy_sfp_external_phy: sfp_changed: 0x1
ax0: xgbe_phy_sfp_phy_settings: link speed 2 spf_base 0x5 pause_autoneg 0 advert 0xa000 support 0xa000
ax0: xgbe_phy_sfp_detect: phy speed: 0x2 duplex: 0x2 autoneg: 0x0 pause_autoneg: 0x0
ax0: xgbe_phy_link_status: SFP changed observed
ax0: link_status returned Link:0 an_restart:1 aneg:1
ax0: xgbe_phy_find_phy_device: phydev 0 phydev_mode 1 sfp_phy_avail 0 phy_id 0x00000000
ax0: xgbe_phy_find_phy_device: port_mode 8 avail 0
ax0: xgbe_phy_an_config: find_phy_device return Success.
ax0: fixed PHY configuration
ax0: link_aneg - 0
ax0: xgbe_phy_link_status: calling phy detect
As you can see, the sfp_base is also XGBE_SFP_BASE_10000_SR (sfp_base 0x5). The log also shows that a fixed configuration is applied, meaning auto-negotiation is disabled (correct me if I'm wrong). In any case, a sysctl query reveals that autonegotiation is turned off:
dev.ax.0.pauseparam_info: Autonegotiate: off RX: on TX: on
Here are the driver load logs: (YDEBUG)
ax0: <AMD 10 Gigabit Ethernet Driver> mem 0x80160000-0x8017ffff,0x80140000-0x8015ffff,0x80188000-0x80189fff at device 0.4 on pci7
ax0: axgbe_if_attach_pre: Device ID: 0x1458
ax0: xpcs window def : 0x00009060
ax0: xpcs window sel : 0x00009064
ax0: xpcs window : 0x0000b000
ax0: xpcs window size : 0x00001000
ax0: xpcs window mask : 0x00000fff
ax0: port property 0 = 0x15800800
ax0: port property 1 = 0x10100c10
ax0: port property 2 = 0x00100010
ax0: port property 3 = 0x2dc0e100
ax0: port property 4 = 0x00801c03
ax0: max tx/rx channel count = 16/16
ax0: max tx/rx hw queue count = 16/12
ax0: -->xgbe_get_all_hw_features
ax0: xgbe_get_all_hw_features: Tx fifo 0x40000 Rx fifo 0x40000
ax0: Hardware features:
ax0: 1GbE support : yes
ax0: VLAN hash filter : yes
ax0: MDIO interface : yes
ax0: Wake-up packet support : no
ax0: Magic packet support : no
ax0: Management counters : yes
ax0: ARP offload : yes
ax0: IEEE 1588-2008 Timestamp : yes
ax0: Energy Efficient Ethernet : yes
ax0: TX checksum offload : yes
ax0: RX checksum offload : yes
ax0: Additional MAC addresses : 31
ax0: Timestamp source : internal/external
ax0: SA/VLAN insertion : yes
ax0: RX fifo size : 262144
ax0: TX fifo size : 262144
ax0: IEEE 1588 high word : yes
ax0: DMA width : 48
ax0: Data Center Bridging : yes
ax0: Split header : yes
ax0: TCP Segmentation Offload : yes
ax0: Debug memory interface : yes
ax0: Receive Side Scaling : yes
ax0: Traffic Class count : 8
ax0: Hash table size : 256
ax0: L3/L4 Filters : 8
ax0: RX queue count : 12
ax0: TX queue count : 16
ax0: RX DMA channel count : 16
ax0: TX DMA channel count : 16
ax0: PPS outputs : 0
ax0: Auxiliary snapshot inputs : 0
ax0: <--xgbe_get_all_hw_features
ax0: ncpu 4 intrcpu 4
ax0: TX/RX max channel count = 16/16
ax0: TX/RX max queue count = 16/12
ax0: TX/RX DMA ring count = 4/4
ax0: TX/RX hardware queue count = 4/12
ax0: max tx/rx max fifo size = 229376/229376
ax0: axgbe_alloc_channels: txqs 4 rxqs 4
ax0: Channel count set to: 4
ax0: Using 512 TX descriptors and 512 RX descriptors
ax0: Using 4 RX queues 4 TX queues
ax0: Using MSI-X interrupts with 8 vectors
ax0: axgbe_if_attach_post: tx fifo 0x38000 rx fifo 0x38000
ax0: adjusted TX 4/4 RX 4/12
ax0: Channel count set to: 4
ax0: -->xgbe_phy_init
ax0: port mode=8
ax0: port id=0
ax0: port speeds=0xb
ax0: conn type=1
ax0: mdio addr=0
ax0: xgbe_phy_init: redrv addr=0 redrv i/f=1
ax0: xgbe_phy_init: port mode 8
ax0: SFP: mux_address=0x73
ax0: SFP: mux_channel=0
ax0: SFP: gpio_address=0x21
ax0: SFP: gpio_mask=0x2
ax0: SFP: gpio_rx_los=13
ax0: SFP: gpio_tx_fault=14
ax0: SFP: gpio_mod_absent=12
ax0: SFP: gpio_rate_select=0
ax0: xgbe_phy_init: start 7 mode 1 adv 0x0
ax0: xgbe_phy_init: conn type 1 mode 1
ax0: xgbe_phy_init: return success
ax0: -->xgbe_init_rx_coalesce
ax0: <--xgbe_init_rx_coalesce
ax0: -->xgbe_init_tx_coalesce
ax0: <--xgbe_init_tx_coalesce
ax0: axgbe_if_attach_post: rx_buf_size 1536
ax0: mtu 1500
ax1: <AMD 10 Gigabit Ethernet Driver> mem 0x80120000-0x8013ffff,0x80100000-0x8011ffff,0x8018a000-0x8018bfff at device 0.5 on pci7
ax1: axgbe_if_attach_pre: Device ID: 0x1458
ax1: xpcs window def : 0x00009060
ax1: xpcs window sel : 0x00009064
ax1: xpcs window : 0x0000b000
ax1: xpcs window size : 0x00001000
ax1: xpcs window mask : 0x00000fff
ax1: port property 0 = 0x15800801
ax1: port property 1 = 0x10100c10
ax1: port property 2 = 0x00100010
ax1: port property 3 = 0x2980a100
ax1: port property 4 = 0x00801c13
ax1: max tx/rx channel count = 16/16
ax1: max tx/rx hw queue count = 16/12
ax1: -->xgbe_get_all_hw_features
ax1: xgbe_get_all_hw_features: Tx fifo 0x40000 Rx fifo 0x40000
ax1: Hardware features:
ax1: 1GbE support : yes
ax1: VLAN hash filter : yes
ax1: MDIO interface : yes
ax1: Wake-up packet support : no
ax1: Magic packet support : no
ax1: Management counters : yes
ax1: ARP offload : yes
ax1: IEEE 1588-2008 Timestamp : yes
ax1: Energy Efficient Ethernet : yes
ax1: TX checksum offload : yes
ax1: RX checksum offload : yes
ax1: Additional MAC addresses : 31
ax1: Timestamp source : internal/external
ax1: SA/VLAN insertion : yes
ax1: RX fifo size : 262144
ax1: TX fifo size : 262144
ax1: IEEE 1588 high word : yes
ax1: DMA width : 48
ax1: Data Center Bridging : yes
ax1: Split header : yes
ax1: TCP Segmentation Offload : yes
ax1: Debug memory interface : yes
ax1: Receive Side Scaling : yes
ax1: Traffic Class count : 8
ax1: Hash table size : 256
ax1: L3/L4 Filters : 8
ax1: RX queue count : 12
ax1: TX queue count : 16
ax1: RX DMA channel count : 16
ax1: TX DMA channel count : 16
ax1: PPS outputs : 0
ax1: Auxiliary snapshot inputs : 0
ax1: <--xgbe_get_all_hw_features
ax1: ncpu 4 intrcpu 4
ax1: TX/RX max channel count = 16/16
ax1: TX/RX max queue count = 16/12
ax1: TX/RX DMA ring count = 4/4
ax1: TX/RX hardware queue count = 4/12
ax1: max tx/rx max fifo size = 229376/229376
ax1: axgbe_alloc_channels: txqs 4 rxqs 4
ax1: Channel count set to: 4
ax1: Using 512 TX descriptors and 512 RX descriptors
ax1: Using 4 RX queues 4 TX queues
ax1: Using MSI-X interrupts with 8 vectors
ax1: axgbe_if_attach_post: tx fifo 0x38000 rx fifo 0x38000
ax1: adjusted TX 4/4 RX 4/12
ax1: Channel count set to: 4
ax1: -->xgbe_phy_init
ax1: port mode=8
ax1: port id=1
ax1: port speeds=0xb
ax1: conn type=1
ax1: mdio addr=0
ax1: xgbe_phy_init: redrv addr=0 redrv i/f=1
ax1: xgbe_phy_init: port mode 8
ax1: SFP: mux_address=0x73
ax1: SFP: mux_channel=1
ax1: SFP: gpio_address=0x21
ax1: SFP: gpio_mask=0x2
ax1: SFP: gpio_rx_los=9
ax1: SFP: gpio_tx_fault=10
ax1: SFP: gpio_mod_absent=8
ax1: SFP: gpio_rate_select=0
ax1: xgbe_phy_init: start 7 mode 1 adv 0x0
ax1: xgbe_phy_init: conn type 1 mode 1
ax1: xgbe_phy_init: return success
ax1: -->xgbe_init_rx_coalesce
ax1: <--xgbe_init_rx_coalesce
ax1: -->xgbe_init_tx_coalesce
ax1: <--xgbe_init_tx_coalesce
ax1: axgbe_if_attach_post: rx_buf_size 1536
ax1: mtu 1500
Apart from this, are there any other comments on the current code?
I have no further comments on the code right now.
Thanks @manu for committing this driver.
@stephan.dewt_yahoo.co.uk, thanks for your logs and details. It looks like your setup is more or less similar. I will follow up regarding the link issue and update accordingly.
Regarding the netmap support, I will provide the details and patch in a separate review.
Thanks for your work :)
Could I ask for a manpage too?
Otherwise I'll write a small one this week.
@stephan.dewt_yahoo.co.uk, thanks for your logs and details. It looks like your setup is more or less similar. I will follow up regarding the link issue and update accordingly.
Regarding the netmap support, I will provide the details and patch in a separate review.
Could I ask for a manpage too?
Yes, I will have one manpage prepared this week and place it for review @manu
head/sys/dev/axgbe/xgbe-phy-v2.c | ||
---|---|---|
2744 ↗ | (On Diff #78110) | GCC points out that this statement does nothing, XGBE_ADV expands to ((_phy)->advertising & ADVERTISED_##_mode) |
2746 ↗ | (On Diff #78110) | Ditto above re: XGBE_ADV |
2756 ↗ | (On Diff #78110) | Ditto above re: XGBE_ADV |
2772 ↗ | (On Diff #78110) | GCC also seems to think that common_adv_gb can be uninitialized here. I note that it's only initialized above if pdata->phy.supported == SUPPORTED_1000baseT_Half || pdata->phy.supported == SUPPORTED_1000baseT_Full, so I suspect that it's not wrong. |
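The two GCC diagnostics above can be reproduced in miniature. The sketch below uses a test macro with the same shape as the quoted XGBE_ADV expansion, plus a hypothetical setter; the struct, the bit value and every SKETCH_ name are stand-ins invented for illustration. The thread does not say whether the flagged statements were meant to set or clear bits, so setting is assumed here.

```c
#include <stdint.h>

#define SKETCH_ADVERTISED_1000baseT_Full	0x0020	/* illustrative bit value */

struct sketch_phy {
	uint32_t advertising;
};

/* Same shape as the quoted expansion: a pure test with no side effect. */
#define SKETCH_ADV(_phy, _mode) \
	((_phy)->advertising & SKETCH_ADVERTISED_##_mode)

/* Hypothetical setter: this form actually modifies the field. */
#define SKETCH_SET_ADV(_phy, _mode) \
	((_phy)->advertising |= SKETCH_ADVERTISED_##_mode)

static uint32_t
sketch_resolve(struct sketch_phy *phy, int gigabit_capable)
{
	/* Initialised up front so every path returns a defined value. */
	uint32_t common_adv_gb = 0;

	if (gigabit_capable) {
		/* A bare "SKETCH_ADV(phy, 1000baseT_Full);" here would do nothing. */
		SKETCH_SET_ADV(phy, 1000baseT_Full);
		common_adv_gb = SKETCH_ADV(phy, 1000baseT_Full);
	}
	return (common_adv_gb);
}
```

Initialising the variable at its declaration keeps the non-gigabit path well defined without changing what the gigabit path computes.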
Hi @kevans
GCC points out that this statement does nothing, XGBE_ADV expands to ((_phy)->advertising & ADVERTISED_##_mode)
Thanks for pointing this out. I will fix this sometime this week. I will correct "common_adv_gb" so that it is initialized to zero. This code is for direct RJ45 ports (non-SFP), which need some more testing. I am not sure whether any hardware with EPYC processors and direct RJ45 ports is available on the market.
I will post the above changes as a separate review, since this review is closed.
@kevans, is it possible to share your hardware details? Do you have test hardware with RJ45 ports? Is there any specific reason why it's being compiled with GCC?