Message ID | cover.1647904780.git.pisa@cmp.felk.cvut.cz (mailing list archive) |
---|---|
Headers | show |
Series | CTU CAN FD open-source IP core SocketCAN driver, PCI, platform integration and documentation | expand |
On 22.03.2022 00:32:27, Pavel Pisa wrote: > This driver adds support for the CTU CAN FD open-source IP core. > More documentation and core sources at project page > (https://gitlab.fel.cvut.cz/canbus/ctucanfd_ip_core). > The core integration to Xilinx Zynq system as platform driver > is available (https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top). > Implementation on Intel FPGA based PCI Express board is available > from project (https://gitlab.fel.cvut.cz/canbus/pcie-ctucanfd). > The CTU CAN FD core emulation send for review for QEMU mainline. > Development repository for QEMU emulation - ctu-canfd branch of > https://gitlab.fel.cvut.cz/canbus/qemu-canbus > > More about CAN bus related projects used and developed at CTU FEE > on the guidepost page http://canbus.pages.fel.cvut.cz/ . The driver looks much better now. Good work. Please have a look at the TX path of the mcp251xfd driver, especially the tx_stop_queue and tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A lockless implementation should work in your hardware, too. BTW: The PROP_SEG/PHASE_SEG1 issue is known: > +A curious reader will notice that the durations of the segments PROP_SEG > +and PHASE_SEG1 are not determined separately but rather combined and > +then, by default, the resulting TSEG1 is evenly divided between PROP_SEG > +and PHASE_SEG1. and the flexcan IP core in CAN-FD mode has the same problem. When working on the bit timing parameter, I'll plan to have separate PROP_SEG/PHASE_SEG1 min/max in the kernel, so that the bit timing algorithm can take care of this. regards, Marc
Hello Marc, thanks for positive reply for our years effort. On Tuesday 22 of March 2022 08:46:22 Marc Kleine-Budde wrote: > On 22.03.2022 00:32:27, Pavel Pisa wrote: > > This driver adds support for the CTU CAN FD open-source IP core. > > The driver looks much better now. Good work. Please have a look at the > TX path of the mcp251xfd driver, especially the tx_stop_queue and > tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A > lockless implementation should work in your hardware, too. Is this blocker for now? I would like to start with years tested base. We have HW timestamping implemented for actual stable CTU CAN FD IP core version, support for variable number of TX buffers which count can be parameterized up to 8 in the prepared version and long term desire to configurable-SW defined multi-queue which our HW interface allows to dynamically server by á TX buffers. But plan is to keep combinations of the design and driver compatible from the actual revision. I would be happy if we can agree on some base/minimal support and get it into mainline and use it as base for the followup patch series. I understand that I have sent code late for actual merge window, but I am really loaded by teaching, related RISC-V simulator https://github.com/cvut/qtrvsim , ESA and robotic projects at company. So I would prefer to go step by step and cooperate on updates and testing with my diploma students. > BTW: The PROP_SEG/PHASE_SEG1 issue is known: > > +A curious reader will notice that the durations of the segments PROP_SEG > > +and PHASE_SEG1 are not determined separately but rather combined and > > +then, by default, the resulting TSEG1 is evenly divided between PROP_SEG > > +and PHASE_SEG1. > > and the flexcan IP core in CAN-FD mode has the same problem. When > working on the bit timing parameter, I'll plan to have separate > PROP_SEG/PHASE_SEG1 min/max in the kernel, so that the bit timing > algorithm can take care of this. Hmm, when I have thought about that years ago I have not noticed real difference when time quanta is move between PROP_SEG and PHASE_SEG1. So for me it had no influence on the algorithm computation and could be done on the chip level when minimal and maximal sum is respected. But may it be I have overlooked something and there is difference for CAN FD. May it be my colleagues Jiri Novak and Ondrej Ille are more knowable. As for the optimal timequantas per bit value, I agree that it is not so simple. In the fact SJW and even tipple-sampling should be defined in percentage of bit time and then all should be optimized together and even combination with slight bitrate error should be preferred against other exact matching when there is significant difference in the other parameters values. But I am not ready to dive into it till our ESA space NanoXplore FPGA project passes final stage... By the way we have received report from Andrew Dennison about successful integration of CTU CAN FD into Litex based RISC-V system. Tested with the Linux our Linux kernel driver. The first iteration there, but he reported that some corrections from his actual development needs to be added to the public repo still to be usable out of the box https://github.com/AndrewD/litecan Best wishes, Pavel Pisa phone: +420 603531357 e-mail: pisa@cmp.felk.cvut.cz Department of Control Engineering FEE CVUT Karlovo namesti 13, 121 35, Prague 2 university: http://dce.fel.cvut.cz/ personal: http://cmp.felk.cvut.cz/~pisa projects: https://www.openhub.net/accounts/ppisa CAN related:http://canbus.pages.fel.cvut.cz/ Open Technologies Research Education and Exchange Services https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home
On 22.03.22 09:18, Pavel Pisa wrote: > Hello Marc, > > thanks for positive reply for our years effort. > > On Tuesday 22 of March 2022 08:46:22 Marc Kleine-Budde wrote: >> On 22.03.2022 00:32:27, Pavel Pisa wrote: >>> This driver adds support for the CTU CAN FD open-source IP core. >> >> The driver looks much better now. Good work. Please have a look at the >> TX path of the mcp251xfd driver, especially the tx_stop_queue and >> tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A >> lockless implementation should work in your hardware, too. > > Is this blocker for now? I would like to start with years tested base. > > We have HW timestamping implemented for actual stable CTU CAN FD IP core > version, support for variable number of TX buffers which count can be > parameterized up to 8 in the prepared version and long term desire to > configurable-SW defined multi-queue which our HW interface allows to > dynamically server by á TX buffers. But plan is to keep combinations > of the design and driver compatible from the actual revision. > > I would be happy if we can agree on some base/minimal support and get > it into mainline and use it as base for the followup patch series. IMHO I would vote for this approach too. There are many users of that open source IP CAN core right now and the out-of-tree maintenance is no fun for all of them. When the driver status is fine from the technical and programming style standpoint we should move the improvements for the lockless transmissions to a later date. Best regards, Oliver > > I understand that I have sent code late for actual merge window, > but I am really loaded by teaching, related RISC-V simulator > https://github.com/cvut/qtrvsim , ESA and robotic projects > at company. So I would prefer to go step by step and cooperate > on updates and testing with my diploma students. > >> BTW: The PROP_SEG/PHASE_SEG1 issue is known: >>> +A curious reader will notice that the durations of the segments PROP_SEG >>> +and PHASE_SEG1 are not determined separately but rather combined and >>> +then, by default, the resulting TSEG1 is evenly divided between PROP_SEG >>> +and PHASE_SEG1. >> >> and the flexcan IP core in CAN-FD mode has the same problem. When >> working on the bit timing parameter, I'll plan to have separate >> PROP_SEG/PHASE_SEG1 min/max in the kernel, so that the bit timing >> algorithm can take care of this. > > Hmm, when I have thought about that years ago I have not noticed real > difference when time quanta is move between PROP_SEG and PHASE_SEG1. > So for me it had no influence on the algorithm computation and > could be done on the chip level when minimal and maximal sum is > respected. But may it be I have overlooked something and there is > difference for CAN FD. May it be my colleagues Jiri Novak and > Ondrej Ille are more knowable. > > As for the optimal timequantas per bit value, I agree that it > is not so simple. In the fact SJW and even tipple-sampling > should be defined in percentage of bit time and then all should > be optimized together and even combination with slight bitrate > error should be preferred against other exact matching when > there is significant difference in the other parameters values. > > But I am not ready to dive into it till our ESA space NanoXplore > FPGA project passes final stage... > > By the way we have received report from Andrew Dennison about > successful integration of CTU CAN FD into Litex based RISC-V > system. Tested with the Linux our Linux kernel driver. > > The first iteration there, but he reported that some corrections > from his actual development needs to be added to the public > repo still to be usable out of the box > > https://github.com/AndrewD/litecan > > Best wishes, > > Pavel Pisa > phone: +420 603531357 > e-mail: pisa@cmp.felk.cvut.cz > Department of Control Engineering FEE CVUT > Karlovo namesti 13, 121 35, Prague 2 > university: http://dce.fel.cvut.cz/ > personal: http://cmp.felk.cvut.cz/~pisa > projects: https://www.openhub.net/accounts/ppisa > CAN related:http://canbus.pages.fel.cvut.cz/ > Open Technologies Research Education and Exchange Services > https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home >
On 22.03.2022 09:18:32, Pavel Pisa wrote: > > The driver looks much better now. Good work. Please have a look at the > > TX path of the mcp251xfd driver, especially the tx_stop_queue and > > tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A > > lockless implementation should work in your hardware, too. > > Is this blocker for now? I would like to start with years tested base. Makes sense. > We have HW timestamping implemented for actual stable CTU CAN FD IP core > version, support for variable number of TX buffers which count can be > parameterized up to 8 in the prepared version and long term desire to > configurable-SW defined multi-queue which our HW interface allows to > dynamically server by á TX buffers. But plan is to keep combinations > of the design and driver compatible from the actual revision. Is the number of RX and TX buffers and TX queues auto detectable by software, or do we need DT bindings for this? > I would be happy if we can agree on some base/minimal support and get > it into mainline and use it as base for the followup patch series. > > I understand that I have sent code late for actual merge window, > but I am really loaded by teaching, related RISC-V simulator > https://github.com/cvut/qtrvsim , ESA and robotic projects > at company. So I would prefer to go step by step and cooperate > on updates and testing with my diploma students. The net-next merge window closed with Monday evening, so this driver will go into net-next for v5.19. > > BTW: The PROP_SEG/PHASE_SEG1 issue is known: > > > +A curious reader will notice that the durations of the segments PROP_SEG > > > +and PHASE_SEG1 are not determined separately but rather combined and > > > +then, by default, the resulting TSEG1 is evenly divided between PROP_SEG > > > +and PHASE_SEG1. > > > > and the flexcan IP core in CAN-FD mode has the same problem. When > > working on the bit timing parameter, I'll plan to have separate > > PROP_SEG/PHASE_SEG1 min/max in the kernel, so that the bit timing > > algorithm can take care of this. > > Hmm, when I have thought about that years ago I have not noticed real > difference when time quanta is move between PROP_SEG and PHASE_SEG1. > So for me it had no influence on the algorithm computation and > could be done on the chip level when minimal and maximal sum is > respected. But may it be I have overlooked something and there is > difference for CAN FD. May it be my colleagues Jiri Novak and > Ondrej Ille are more knowable. Jiri, Ondrej, I'm interested in details :) > As for the optimal timequantas per bit value, I agree that it > is not so simple. In the fact SJW and even tipple-sampling > should be defined in percentage of bit time I thought of specifying SJW in a relative value, too. > and then all should be optimized together SJW has no influence on the bit rate and sample point. Although SJW is max phase seg 2, so it's maximum is influenced by the sample point. > and even combination with slight bitrate error should be preferred > against other exact matching when there is significant difference in > the other parameters values. Since linux-4.8, i.e. | 7da29f97d6c8 can: dev: can-calc-bit-timing(): better sample point calculation the algorithm optimizes for best bit minimal absolute bit rate error first and then for minimal sample point error. The sample point must be <= the nominal sample point. I have some experiments where the algorithm optimizes the absolute sample point error. For more complicated bit timing optimization you need to combine the bitrate error and the sample point error into a function that needs to be minimized. > But I am not ready to dive into it till our ESA space NanoXplore > FPGA project passes final stage... Ok > By the way we have received report from Andrew Dennison about > successful integration of CTU CAN FD into Litex based RISC-V > system. Tested with the Linux our Linux kernel driver. That sounds interesting! > The first iteration there, but he reported that some corrections > from his actual development needs to be added to the public > repo still to be usable out of the box > > https://github.com/AndrewD/litecan Nice! regards, Marc
Hello Marc, On Tuesday 22 of March 2022 10:22:12 Marc Kleine-Budde wrote: > > We have HW timestamping implemented for actual stable CTU CAN FD IP core > > version, support for variable number of TX buffers which count can be > > parameterized up to 8 in the prepared version and long term desire to > > configurable-SW defined multi-queue which our HW interface allows to > > dynamically server by á TX buffers. But plan is to keep combinations > > of the design and driver compatible from the actual revision. > > Is the number of RX and TX buffers and TX queues auto detectable by > software, or do we need DT bindings for this? we plan to count info available through field in the registers. See paragraph 3.1.39 TXTB_INFO http://canbus.pages.fel.cvut.cz/ctucanfd_ip_core/doc/Datasheet.pdf The bits are read as zeros on the older version so if the zero is found older default 4 will be chosen by the driver. So I prefer no side information channel DTC etc. Even on the actual PCI integration you can read number of channels from the support PCI BAR range and locate actual cores on the other BAR. > The net-next merge window closed with Monday evening, so this driver > will go into net-next for v5.19. My shame. I hoped to get driver into mainline till Embedded World Conference 2022 where we participate with our RISC-V education tools and materials so that I could report to others in the RISC-V camp and even CiA that we are finally easily to use. On the other hand more time gives time for better review and possible cleanup. But I would wait with features addition till some base accepted to the next or mainline. I hope that interfaces will not change too much before inclusion that actual tested code can get in. > > > BTW: The PROP_SEG/PHASE_SEG1 issue is known: > > > > +A curious reader will notice that the durations of the segments > > > > PROP_SEG +and PHASE_SEG1 are not determined separately but rather > > > > combined and +then, by default, the resulting TSEG1 is evenly divided > > > > between PROP_SEG +and PHASE_SEG1. > > > > > > and the flexcan IP core in CAN-FD mode has the same problem. When > > > working on the bit timing parameter, I'll plan to have separate > > > PROP_SEG/PHASE_SEG1 min/max in the kernel, so that the bit timing > > > algorithm can take care of this. > > > > Hmm, when I have thought about that years ago I have not noticed real > > difference when time quanta is move between PROP_SEG and PHASE_SEG1. > > So for me it had no influence on the algorithm computation and > > could be done on the chip level when minimal and maximal sum is > > respected. But may it be I have overlooked something and there is > > difference for CAN FD. May it be my colleagues Jiri Novak and > > Ondrej Ille are more knowable. > > Jiri, Ondrej, I'm interested in details :) > > > As for the optimal timequantas per bit value, I agree that it > > is not so simple. In the fact SJW and even tipple-sampling > > should be defined in percentage of bit time > > I thought of specifying SJW in a relative value, too. > > > and then all should be optimized together > > SJW has no influence on the bit rate and sample point. Although SJW is > max phase seg 2, so it's maximum is influenced by the sample point. > > > and even combination with slight bitrate error should be preferred > > against other exact matching when there is significant difference in > > the other parameters values. > > Since linux-4.8, i.e. > > | 7da29f97d6c8 can: dev: can-calc-bit-timing(): better sample point > | calculation > > the algorithm optimizes for best bit minimal absolute bit rate error > first and then for minimal sample point error. The sample point must be > <= the nominal sample point. I have some experiments where the algorithm > optimizes the absolute sample point error. Yes but you do not need exact bitrate even when it is available. I do no look into standards now, but I think 1% error should not be a problem. May it be even 3% error when whole jitter and clocks frequency errors fit below it should work (5 (bit stuff) x 3% = 15% is less than 100% - 80% (typical SP) and with well tuned SF even 25 or 50% of the last bit could be accepted that communication can at leas sometimes succeed). So allow error of 0.1% as better than too low or too high TQ per bit or strange SJW can be acceptable. On the other hand if CAN is used with time triggered stuff or keep/ synchronize whole plant global time then 0.1% is too much. So at the end in this case really tuning for concrete application comes into play. But in kernel algorithm is there to make most common usages easy... > For more complicated bit timing optimization you need to combine the > bitrate error and the sample point error into a function that needs to > be minimized. Yes. > > By the way we have received report from Andrew Dennison about > > successful integration of CTU CAN FD into Litex based RISC-V > > system. Tested with the Linux our Linux kernel driver. > > That sounds interesting! I hope that community joint forces can reach state where CAN FD will be solved on all FPGA and combinations easily and open source way. We need to find way for funding of certification and long term sustainability. But at least our previous release and actual public code is game level save point. Best wishes, Pavel -- Pavel Pisa phone: +420 603531357 e-mail: pisa@cmp.felk.cvut.cz Department of Control Engineering FEE CVUT Karlovo namesti 13, 121 35, Prague 2 university: http://dce.fel.cvut.cz/ personal: http://cmp.felk.cvut.cz/~pisa projects: https://www.openhub.net/accounts/ppisa CAN related:http://canbus.pages.fel.cvut.cz/ Open Technologies Research Education and Exchange Services https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home
On 30.03.2022 09:46:08, Ondrej Ille wrote: > sorry for the late reply, my work duties are keeping me very busy. Known problem :) > Let me just quickly comment on topics discussed in the emails above. > > *1. Separation of PROP and TSEG1* > > IMHO there is no real benefit. The reason why CTU CAN FD has this > split is legacy. First implementation back in 2015 had this split > since I wanted to follow the standard. In HW, the first thing done in > bit timing logic (prescaler module), these two numbers are added, and > all further resynchronization/hard-synchronization is done with TSEG1 > value... Thanks for the insight. It's not easy to get in touch with the developers of the proprietary IP cores :) Never the less, there's another IP core which has different sizes for the prop and tseg1 register. So an update of the bit timing constant would help both. > *2. Number of TXT Buffers and RX Buffer size:* > > Pavel already replied with TXTB_INFO. The same role has the RX_MEM_INFO > register, when it comes to RX side. Thanks for the clarification. regards, Marc
Hello Marc and others, On Tuesday 22 of March 2022 10:22:12 Marc Kleine-Budde wrote: > On 22.03.2022 09:18:32, Pavel Pisa wrote: > > > The driver looks much better now. Good work. Please have a look at the > > > TX path of the mcp251xfd driver, especially the tx_stop_queue and > > > tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A > > > lockless implementation should work in your hardware, too. > > > > Is this blocker for now? I would like to start with years tested base. > > Makes sense. I have missed timing for 5.18 but v5.18-rc1 is out so I would be happy if we do not miss 5.19 merge window at least with minimal version. If we succeeds in review reasonably early we could fit with inclusion or at least the first review round of Mataj Vasilevski's https://gitlab.fel.cvut.cz/canbus/ctucanfd_ip_core/-/tree/hw-timestamping Please, help us to finish this subsequent goal of our portfolio development. I think that our work is valuable for the community, code can be tested even in QEMU CAN bus subsystem which we architected as well https://www.qemu.org/docs/master/system/devices/can.html I hope that it is usable for others. I have the last support call from Magdeburg University where they use CAN emulation for some Volkswagen projects. The Xilinx uses code for their CAN FD controllers emulation. Thei have whole stack including mainline driver for their CAN FD controller in mainline but on the other hand, their CAN FD is bound to Xilinx devices emulation. But CTU CAN FD provides generic PCI integration and can be used even on broad range of FPGAs so its emulation and matching driver provides valuable tool even if you do not consider use its actual design on hardware. New version of the latency tester based on CTU CAN FD timestamping is in preparation as upgrade of original Martin Jerabek's work done on Oliver Hartkopp's and Volkswagen call https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top/wikis/uploads/56b4d27d8f81ae390fc98bdce803398f/F3-BP-2016-Jerabek-Martin-Jerabek-thesis-2016.pdf Best wishes, Pavel -- Pavel Pisa phone: +420 603531357 e-mail: pisa@cmp.felk.cvut.cz Department of Control Engineering FEE CVUT Karlovo namesti 13, 121 35, Prague 2 university: http://dce.fel.cvut.cz/ personal: http://cmp.felk.cvut.cz/~pisa projects: https://www.openhub.net/accounts/ppisa CAN related:http://canbus.pages.fel.cvut.cz/ Open Technologies Research Education and Exchange Services https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home
On 06.04.2022 10:20:42, Pavel Pisa wrote: > Hello Marc and others, > > On Tuesday 22 of March 2022 10:22:12 Marc Kleine-Budde wrote: > > On 22.03.2022 09:18:32, Pavel Pisa wrote: > > > > The driver looks much better now. Good work. Please have a look at the > > > > TX path of the mcp251xfd driver, especially the tx_stop_queue and > > > > tx_wake_queue in mcp251xfd_start_xmit() and mcp251xfd_handle_tefif(). A > > > > lockless implementation should work in your hardware, too. > > > > > > Is this blocker for now? I would like to start with years tested base. > > > > Makes sense. > > I have missed timing for 5.18 but v5.18-rc1 is out so I would be > happy if we do not miss 5.19 merge window at least with minimal version. I've taken the patch (almost) as is, I marked both can_bittiming_const static, as sparse complained about that and I changed the order of two variable declarations to look nicer :) Looking forward for more patches! regards, Marc