Message ID | 20190213175454.7506-8-logang@deltatee.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Support using MSI interrupts in ntb_transport | expand |
On Wed, Feb 13, 2019 at 10:54:49AM -0700, Logan Gunthorpe wrote: Hi > When using multi-ports each port uses resources (dbs, msgs, mws, etc) > on every other port. Creating a mapping for these resources such that > each port has a corresponding resource on every other port is a bit > tricky. > > Introduce the ntb_peer_resource_idx() function for this purpose. > It returns the peer resource number that will correspond with the > local peer index on the remote peer. > > Also, introduce ntb_peer_highest_mw_idx() which will use > ntb_peer_resource_idx() but return the MW index starting with the > highest index and working down. > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> > Cc: Jon Mason <jdmason@kudzu.us> > Cc: Dave Jiang <dave.jiang@intel.com> > Cc: Allen Hubbe <allenbh@gmail.com> > --- > include/linux/ntb.h | 70 +++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 70 insertions(+) > > diff --git a/include/linux/ntb.h b/include/linux/ntb.h > index 181d16601dd9..f5c69d853489 100644 > --- a/include/linux/ntb.h > +++ b/include/linux/ntb.h > @@ -1502,4 +1502,74 @@ static inline int ntb_peer_msg_write(struct ntb_dev *ntb, int pidx, int midx, > return ntb->ops->peer_msg_write(ntb, pidx, midx, msg); > } > > +/** > + * ntb_peer_resource_idx() - get a resource index for a given peer idx > + * @ntb: NTB device context. > + * @pidx: Peer port index. > + * > + * When constructing a graph of peers, each remote peer must use a different > + * resource index (mw, doorbell, etc) to communicate with each other > + * peer. > + * > + * In a two peer system, this function should always return 0 such that > + * resource 0 points to the remote peer on both ports. > + * > + * In a 5 peer system, this function will return the following matrix > + * > + * pidx \ port 0 1 2 3 4 > + * 0 0 0 1 2 3 > + * 1 0 1 2 3 4 > + * 2 0 1 2 3 4 > + * 3 0 1 2 3 4 > + * This table is too simplified to represent a generic case of port-index mapping table. In particular the IDT PCIe switch got it ports numbered with uneven integers like: 0 2 4 6 8 12 16 20 or 0 8 16, and so on. Moreover some of the ports might be disabled or may have NTB functions deactivated, in which case these ports shouldn't be considered by NTB subsystem at all. Basically we may have any increasing subset of that port numbers depending on the current IDT PCIe-switch ports setup. What I am trying to say by this is that if in real life the NTB MSI library or some other library used that matrix to calculate the MW index, then most likely it would either consume nearly all the MWs leaving holes in the space of MWs or just run out of them since depending on the configuration there might be from 0 to 24 MWs enabled in a IDT NTB function. In order to solve the problem we either need the ntb_peer_resource_idx() method somehow fixed so it would use the pidx instead of the pure port number or we should properly alter the current NTB API port-index table implementation, so the ntb_pingpong, NTB MSI library and others wouldn't need to a have some other table to distribute the NTB resources. A short preamble of why we introduced the ports-index NTB API. Since each NTB MW and NTB message registers can be setup to be related with any NTB device port, we needed to have the port attribute accepted by NTB API functions. But it turned out the IDT PCIe switch ports are unevenly numbered, which may cause difficulties in using their numbers for bulk configurations. So in order to simplify a client code utilizing the multi-ports NTB API (for example for simpler looping around the device ports), we decided to create the ports-index table in the NTB API internals. The table is defined by a set of ntb_port_*()/ntb_peer_port_*() functions. So now all the NTB API methods accept a port index instead of an irregular port number. Here is the port-index table currently defined for IDT 89HPES32NT8 PCIe-switch with all 8 ports having NTB functions enabled: peer port \ local port 0 2 4 6 8 12 16 20 0 x 0 0 0 0 0 0 0 2 0 x 1 1 1 1 1 1 4 1 1 x 2 2 2 2 2 6 2 2 2 x 3 3 3 3 8 3 3 3 3 x 4 4 4 12 4 4 4 4 4 x 5 5 16 5 5 5 5 5 5 x 6 20 6 6 6 6 6 6 6 x As you can see currently each NTB device side's got its own port-index mapping, so each port can access another ports, but not itself. I implemented it this way intentionally for two reasons: 1) I don't think anyone ever would need to have MW or Message registers setup pointing to the local port. 2) IDT PCIe switch will report "Unsupported TLP" error in case if there is an attempt to touch a MW frame initialized with port number matching the local port number. (Internally IDT PCIe-switch is working with port partitions. But I didn't want to complicate the API description, so we just utilize the port numbers naming, which is essentially the same as partitions for NTB.) The biggest drawback of the selected table representation is that the port-index mapping is unique for each local port, which makes the indexes unsuitable to distribute shared resources like outbound MWs, Doorbells and Scratchpads (if one ever implemented for multi-port NTB devices). For instance each port can access a port with index 0, which for local port 0 is port 2, while for the rest of the local ports it's port 0, and so on. This drawback already caused complications in the ntb_pingpong driver, where after your patchset acceptance we use port numbers to distribute the doorbell bits. The same problem is popping up here, but this time it can't be solved by the port numbers utilization due to the limited number of MWs available on IDT. As I already said there can be two ways to get rid of the complication. One of them is to redefine the port-index table, so to map the port numbers to global indexes instead of the local ones, like this: peer port \ local port 0 2 4 6 8 12 16 20 0 0 0 0 0 0 0 0 0 2 1 1 1 1 1 1 1 1 4 2 2 2 2 2 2 2 2 6 3 3 3 3 3 3 3 3 8 4 4 4 4 4 4 4 4 12 5 5 5 5 5 5 5 5 16 6 6 6 6 6 6 6 6 20 7 7 7 7 7 7 7 7 By doing so we'll be able to use the port indexes to distribute the shared resources like MWs, Doorbells and Scratchpads, but we may get the problem with "Unsupported TLP" error I described before for IDT devices, if a NTB API client code tries to define a MW pointing to the local port on IDT NTB hardware. I don't know whether the same problem exists for Switchtec multi-ports NTB devices, but this approach shall definitely lead to cleaner and easier NTB API as well as simplify the IDT NTB hw driver. We also won't need to have any additional tables like ntb_peer_resource_idx(). (the second solution is described further in the next comment) > + * For example, if this function is used to program peer's memory > + * windows, port 0 will program MW 0 on all it's peers to point to itself. > + * port 1 will program MW 0 in port 0 to point to itself and MW 1 on all > + * other ports. etc. > + * > + * For the legacy two host case, ntb_port_number() and ntb_peer_port_number() > + * both return zero and therefore this function will always return zero. > + * So MW 0 on each host would be programmed to point to the other host. > + * > + * Return: the resource index to use for that peer. > + */ > +static inline int ntb_peer_resource_idx(struct ntb_dev *ntb, int pidx) > +{ > + int local_port, peer_port; > + > + if (pidx >= ntb_peer_port_count(ntb)) > + return -EINVAL; > + > + local_port = ntb_port_number(ntb); > + peer_port = ntb_peer_port_number(ntb, pidx); > + > + if (peer_port < local_port) > + return local_port - 1; > + else > + return local_port; > +} > + Instead of redefining the port-index table we can just fix the ntb_peer_resource_idx() method, so it would return a global port index instead of some number based on the port number. It can be done just by the next modification: + if (peer_port <= local_port) + return pidx; + else + return pidx + 1; Personally I'd prefer the first solution even though it may lead to the "Unsupported TLP" errors and cause a greater code changes. Here is why: 1) the error might be IDT-specific, so we shouldn't limit the API due to one particular hardware peculiarity, 2) port-index table with global indexes implementation shall simplify the IDT NTB hw driver and provide a cleaner NTB API with simpler shared resources utilization code. The final decision is after the NTB subsystem maintainers. If they agree with solution #1 I'll send a corresponding patchset on this week, so you can alter this patchset being based on it. > +/** > + * ntb_peer_highest_mw_idx() - get a memory window index for a given peer idx > + * using the highest index memory windows first > + * > + * @ntb: NTB device context. > + * @pidx: Peer port index. > + * > + * Like ntb_peer_resource_idx(), except it returns indexes starting with > + * last memory window index. > + * > + * Return: the resource index to use for that peer. > + */ > +static inline int ntb_peer_highest_mw_idx(struct ntb_dev *ntb, int pidx) > +{ > + int ret; > + > + ret = ntb_peer_resource_idx(ntb, pidx); > + if (ret < 0) > + return ret; > + > + return ntb_mw_count(ntb, pidx) - ret - 1; > +} > + > #endif > -- > 2.19.0 > > -- > You received this message because you are subscribed to the Google Groups "linux-ntb" group. > To unsubscribe from this group and stop receiving emails from it, send an email to linux-ntb+unsubscribe@googlegroups.com. > To post to this group, send email to linux-ntb@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/linux-ntb/20190213175454.7506-8-logang%40deltatee.com. > For more options, visit https://groups.google.com/d/optout. -Sergey
On 2019-03-05 6:24 p.m., Serge Semin wrote: >> + * In a 5 peer system, this function will return the following matrix >> + * >> + * pidx \ port 0 1 2 3 4 >> + * 0 0 0 1 2 3 >> + * 1 0 1 2 3 4 >> + * 2 0 1 2 3 4 >> + * 3 0 1 2 3 4 >> + * Oh, first, oops: looks like I copied this down wrong anyway; the code was what I had intended, but the documented example should have been: pidx \ local_port 0 1 2 3 4 0 0 0 1 2 3 1 0 1 1 2 3 2 0 1 2 2 3 3 0 1 2 3 3 And this is definitely the correct table we are aiming for. ntb_peer_resource_idx() is supposed to return the result of ntb_peer_port_idx(ntb, local_port) when run on the peer specified by pidx. Note: this table also makes sense because it only uses 4 resources for 5 ports which is the best case scenario. (In other words, to communicate between N ports, N-1 resources are required on each peer). > This table is too simplified to represent a generic case of port-index > mapping table. In particular the IDT PCIe switch got it ports numbered > with uneven integers like: 0 2 4 6 8 12 16 20 or 0 8 16, and so on. > Moreover some of the ports might be disabled or may have NTB functions > deactivated, in which case these ports shouldn't be considered by NTB subsystem > at all. Basically we may have any increasing subset of that port > numbers depending on the current IDT PCIe-switch ports setup. Yes, I did not consider situations where there would be gaps in the "port number" space. It wasn't at all clear from the code that this was possible. Switchtec hardware could be configured for such an arrangement, but I don't know why anyone would do that as it just needlessly complicates everything. As you point out, with a gap, we end up with something that is wrong: pidx \ port 0 1 3 4 5 0 0 0 2 3 4 1 0 1 2 3 4 2 0 1 3 3 4 3 0 1 3 4 4 Here, the relationship between ntb_peer_resource_idx() and ntb_peer_port_idx() is not maintained and it seems to prescribe 5 resources for 5 ports. If there were more gaps it would be even more wrong. >> +static inline int ntb_peer_resource_idx(struct ntb_dev *ntb, int pidx) >> +{ >> + int local_port, peer_port; >> + >> + if (pidx >= ntb_peer_port_count(ntb)) >> + return -EINVAL; >> + >> + local_port = ntb_port_number(ntb); >> + peer_port = ntb_peer_port_number(ntb, pidx); >> + >> + if (peer_port < local_port) >> + return local_port - 1; >> + else >> + return local_port; >> +} >> + > > Instead of redefining the port-index table we can just fix the > ntb_peer_resource_idx() method, so it would return a global port index > instead of some number based on the port number. It can be done just by > the next modification: > > + if (peer_port <= local_port) > + return pidx; > + else > + return pidx + 1; > This creates a table that looks like: pidx \ port 0 1 2 3 4 0 1 0 0 0 0 1 2 2 1 1 1 2 3 3 3 2 2 3 4 4 4 4 3 Which is not correct. In fact, it seems to require 5 resources for 5 ports. This appears to be what is done in the current ntb_perf and I think I figured it out several months ago but it's way too messy and hard to understand and I don't want to spend the time to figure it out again. IMO, in order to support gaps, we'd need to, on some layer, create an un-gapped numbering scheme for the ports. I think the easiest thing is to just have Logical and Physical port numbers; so we would have something like: Physical Port Number: 0 2 4 6 8 12 16 20 Logical Port Number: 0 1 2 3 4 5 6 7 Peer Index (Port 0): x 0 1 2 3 4 5 6 Port Index (Port 8): 0 1 2 3 x 4 5 6 (etc) Where the Physical Port Number is whatever the hardware uses and the logical port number is a numbering scheme starting with zero with no gaps. Then the port indexes are still as we currently have them. If we say that the port numbers we have now are the Logical Port Number, then ntb_peer_resource_idx() is correct. I would strongly argue that the clients don't need to know anything about the Physical Port Number and these should be handled strictly inside the drivers. If multiple drivers need to do something similar to map the logical to physical port numbers then we should introduce helper functions to allow them to do so. If the Physical Numbers are not contained in the driver than the API would need to be expanded to expose which numbers are actually used to avoid needing to constantly loop through all the indexes to find this out. On a similar vein, I'd suggest that most clients shouldn't even really need to do anything with the Logical Port Number and should deal largely with Port Indexes. Ideally, Logical Port Numbers should only be used by helper code in the common layer to help assign resources used by the clients (like ntb_peer_resource_idx()). This world view isn't far off from what we have now, though you *may* need to adjust your IDT driver and we will have to eventually clean up the existing test clients to use the new helper functions. > Personally I'd prefer the first solution even though it may lead to the > "Unsupported TLP" errors and cause a greater code changes. Here is why: > 1) the error might be IDT-specific, so we shouldn't limit the API due to > one particular hardware peculiarity, > 2) port-index table with global indexes implementation shall simplify the IDT > NTB hw driver and provide a cleaner NTB API with simpler shared resources > utilization code. > The final decision is after the NTB subsystem maintainers. If they agree with > solution #1 I'll send a corresponding patchset on this week, so you can > alter this patchset being based on it. I think what we have right now is close enough and we just have to clean up the code and fix things. I don't think we need to do another big change to the semantics. I *certainly* don't want to risk breaking everything again to do it. Logan
On Wed, Mar 06, 2019 at 12:11:11PM -0700, Logan Gunthorpe wrote: > > > On 2019-03-05 6:24 p.m., Serge Semin wrote: > >> + * In a 5 peer system, this function will return the following matrix > >> + * > >> + * pidx \ port 0 1 2 3 4 > >> + * 0 0 0 1 2 3 > >> + * 1 0 1 2 3 4 > >> + * 2 0 1 2 3 4 > >> + * 3 0 1 2 3 4 > >> + * > > Oh, first, oops: looks like I copied this down wrong anyway; the code > was what I had intended, but the documented example should have been: > > pidx \ local_port 0 1 2 3 4 > 0 0 0 1 2 3 > 1 0 1 1 2 3 > 2 0 1 2 2 3 > 3 0 1 2 3 3 > > And this is definitely the correct table we are aiming for. > ntb_peer_resource_idx() is supposed to return the result of > ntb_peer_port_idx(ntb, local_port) when run on the peer specified by pidx. > > Note: this table also makes sense because it only uses 4 resources for 5 > ports which is the best case scenario. (In other words, to communicate > between N ports, N-1 resources are required on each peer). > Yes, it does use as much and as tight resources as it possible, but only for the case of pure integer ports numbering. While in case if there are gaps in the port numbers space (which is the only case we have in supported hardware at this moment) it will lead to a failure if there are ports with higher numbers, than there are MWs available (MWs availability depends on the IDT chip firmware). Additionally it creates gaps in the MWs space if physical ports are numbered with gaps. Since the only multi-port device we've got now is IDT and it always has it' ports numbered with gaps as I described, then the current implementation will definitely produced the problems. > > This table is too simplified to represent a generic case of port-index > > mapping table. In particular the IDT PCIe switch got it ports numbered > > with uneven integers like: 0 2 4 6 8 12 16 20 or 0 8 16, and so on. > > Moreover some of the ports might be disabled or may have NTB functions > > deactivated, in which case these ports shouldn't be considered by NTB subsystem > > at all. Basically we may have any increasing subset of that port > > numbers depending on the current IDT PCIe-switch ports setup. > > Yes, I did not consider situations where there would be gaps in the > "port number" space. It wasn't at all clear from the code that this was > possible. Switchtec hardware could be configured for such an > arrangement, but I don't know why anyone would do that as it just > needlessly complicates everything. > > As you point out, with a gap, we end up with something that is wrong: > > pidx \ port 0 1 3 4 5 > 0 0 0 2 3 4 > 1 0 1 2 3 4 > 2 0 1 3 3 4 > 3 0 1 3 4 4 > > Here, the relationship between ntb_peer_resource_idx() and > ntb_peer_port_idx() is not maintained and it seems to prescribe 5 > resources for 5 ports. If there were more gaps it would be even more wrong. > Exactly. The table will look even worse for the port numbers: 0 2 4 6 8 12 16 20. > >> +static inline int ntb_peer_resource_idx(struct ntb_dev *ntb, int pidx) > >> +{ > >> + int local_port, peer_port; > >> + > >> + if (pidx >= ntb_peer_port_count(ntb)) > >> + return -EINVAL; > >> + > >> + local_port = ntb_port_number(ntb); > >> + peer_port = ntb_peer_port_number(ntb, pidx); > >> + > >> + if (peer_port < local_port) > >> + return local_port - 1; > >> + else > >> + return local_port; > >> +} > >> + > > > > Instead of redefining the port-index table we can just fix the > > ntb_peer_resource_idx() method, so it would return a global port index > > instead of some number based on the port number. It can be done just by > > the next modification: > > > > + if (peer_port <= local_port) > > + return pidx; > > + else > > + return pidx + 1; > > > > This creates a table that looks like: > > pidx \ port 0 1 2 3 4 > 0 1 0 0 0 0 > 1 2 2 1 1 1 > 2 3 3 3 2 2 > 3 4 4 4 4 3 > > Which is not correct. In fact, it seems to require 5 resources for 5 > ports. This appears to be what is done in the current ntb_perf and I > think I figured it out several months ago but it's way too messy and > hard to understand and I don't want to spend the time to figure it out > again. > Yes, this is how it used to be done in ntb_pingpong and is still done in the ntb_perf driver. And it is correctly working. As I already described and you wrote further, this table provides a Logical Ports numbering space: peer port \ local port 0 2 4 6 8 12 16 20 0 0 0 0 0 0 0 0 0 2 1 1 1 1 1 1 1 1 4 2 2 2 2 2 2 2 2 6 3 3 3 3 3 3 3 3 8 4 4 4 4 4 4 4 4 12 5 5 5 5 5 5 5 5 16 6 6 6 6 6 6 6 6 20 7 7 7 7 7 7 7 7 (although I'd call it Global Port Indexes space) Currently local port indexes don't enumerate the local port number. So if you convert that table into the one you provided in the function comment, then it'll look like this (similar to what you called incorrect): pidx \ port 0 2 4 6 8 12 16 20 0 1 0 0 0 0 0 0 0 1 2 2 1 1 1 1 1 1 2 3 3 3 2 2 2 2 2 3 4 4 4 4 3 3 3 3 4 5 5 5 5 5 4 4 4 5 6 6 6 6 6 6 5 5 6 7 7 7 7 7 7 7 6 Yes, by using this table we'll waste one resource as always existing gap (is it the only incorrect thing you had in mind?). But it is smaller problem than to use physical port numbers, which produces much bigger gaps in case of your table implementation as well. Note, in addition in this case you'd need to reconsider your algorithm of the resources initialization. Lets for example take alook at Port 0. You'd need to have its outbound memory windows [1-7] pointing to the peers with ports [2,4,...,20] (correspond to pidx [0-6] of Port 0). So in this case Port 2 would have a port 0 inbound MW #1 retrieving data from Port 0 outbound MW #1, Port 4 would have a port 0 inbound MW #2 retrieving data from Port 0 outbound MW #2, and so on. So your current approach is inbound MW-centralized, while mine is developed around the outbound MWs. > IMO, in order to support gaps, we'd need to, on some layer, create an > un-gapped numbering scheme for the ports. I think the easiest thing is > to just have Logical and Physical port numbers; so we would have > something like: > > Physical Port Number: 0 2 4 6 8 12 16 20 > Logical Port Number: 0 1 2 3 4 5 6 7 > Peer Index (Port 0): x 0 1 2 3 4 5 6 > Port Index (Port 8): 0 1 2 3 x 4 5 6 > (etc) That's what I suggested in the two possible solutions: 1st solution: replace current pidx with Logical Port Number, 2nd solution: alter ntb_peer_resource_idx() so it would return the Logical Port Number. IMO In case of the 2nd solution I'd also suggest to rename the ntb_peer_resource_idx() method into ntb_peer_port_global_idx(), and then consider the current port indexes used in the NTB API as local port indexes. The resource indexing can be abstracted by a macro like this: #define ntb_peer_resource_idx ntb_peer_port_global_idx Finally in order to close the space up we'd also need to define a method: ntb_port_global_idx(), which would return a Logical (global) index of local port. > > Where the Physical Port Number is whatever the hardware uses and the > logical port number is a numbering scheme starting with zero with no > gaps. Then the port indexes are still as we currently have them. If we > say that the port numbers we have now are the Logical Port Number, then > ntb_peer_resource_idx() is correct. > Current port numbers are the physical port numbers with gaps. That's why we introduced the port-index NTB API abstraction in the first place, to have these gaps eliminated and to provide a simple way of bulk setup. Although that abstraction turned out not that suitable to distribute the shared resources. So the Logical (Global) indexing is needed to do it (that's what ntb_pingpong used to do and ntb_perf still does now). > I would strongly argue that the clients don't need to know anything > about the Physical Port Number and these should be handled strictly > inside the drivers. If multiple drivers need to do something similar to > map the logical to physical port numbers then we should introduce helper > functions to allow them to do so. If the Physical Numbers are not > contained in the driver than the API would need to be expanded to expose > which numbers are actually used to avoid needing to constantly loop > through all the indexes to find this out. > Absolutely agree with you. The main idea of NTB API was to provide a set of methods to access the NTB hardware without any abstractions but with possible useful helpers, like your NTB MSI library, or transport library, or anything else. So the physical port numbers must be available for the client drivers. > On a similar vein, I'd suggest that most clients shouldn't even really > need to do anything with the Logical Port Number and should deal largely > with Port Indexes. Ideally, Logical Port Numbers should only be used by > helper code in the common layer to help assign resources used by the > clients (like ntb_peer_resource_idx()). > This is the main question. Do we really need the current port indexes implementation at all? After all these years of NTB API usage I don't really see it useful in any case except to loop over the outbound MW resources automatically skipping the local port (usefulness of this is also questionable). As I already said I created the port-index table this way due to the IDT NTB MWs peculiarity, which doesn't seem to me a big problem now comparing to all these additional complications we intend to introduce. The rest of the drivers code really need to have the Logical (global) port indexes, at least to distribute the shared resources, and don't use the current pidx that much. Wouldn't it be better to just redefine the current port-index table in the following way? ntb_port_number() - local physical port number, ntb_port_idx() - local port logical (global) index, ntb_peer_port_count() - total number of ports NTB device provide (including the local one), ntb_peer_port_number() - physical port number of the peer with passed logical port index, ntb_peer_port_idx - logical port index of the passed physical port number. while currently we have: ntb_port_number() - local physical port number, ntb_peer_port_count() - total number of ports NTB device provide (excluding the local one), ntb_peer_port_number() - physical port number of the peer with passed port index, ntb_peer_port_idx - port index of the passed physical port number; -Sergey > This world view isn't far off from what we have now, though you *may* > need to adjust your IDT driver and we will have to eventually clean up > the existing test clients to use the new helper functions. > > > Personally I'd prefer the first solution even though it may lead to the > > "Unsupported TLP" errors and cause a greater code changes. Here is why: > > 1) the error might be IDT-specific, so we shouldn't limit the API due to > > one particular hardware peculiarity, > > 2) port-index table with global indexes implementation shall simplify the IDT > > NTB hw driver and provide a cleaner NTB API with simpler shared resources > > utilization code. > > > The final decision is after the NTB subsystem maintainers. If they agree with > > solution #1 I'll send a corresponding patchset on this week, so you can > > alter this patchset being based on it. > > I think what we have right now is close enough and we just have to clean > up the code and fix things. I don't think we need to do another big > change to the semantics. I *certainly* don't want to risk breaking > everything again to do it. > Logan > > -- > You received this message because you are subscribed to the Google Groups "linux-ntb" group. > To unsubscribe from this group and stop receiving emails from it, send an email to linux-ntb+unsubscribe@googlegroups.com. > To post to this group, send email to linux-ntb@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/linux-ntb/bd72f24f-5982-0fe7-59df-2fbbfe9f798a%40deltatee.com. > For more options, visit https://groups.google.com/d/optout.
On 2019-03-06 3:45 p.m., Serge Semin wrote: [Snip] Pretty sure everything above is just agreement... > So your current approach is inbound MW-centralized, while mine is developed > around the outbound MWs. I don't think this has anything to do with inbound vs outbound. The problem is the same no matter from which direction you assign things. >> Physical Port Number: 0 2 4 6 8 12 16 20 >> Logical Port Number: 0 1 2 3 4 5 6 7 >> Peer Index (Port 0): x 0 1 2 3 4 5 6 >> Port Index (Port 8): 0 1 2 3 x 4 5 6 >> (etc) > > That's what I suggested in the two possible solutions: > 1st solution: replace current pidx with Logical Port Number, > 2nd solution: alter ntb_peer_resource_idx() so it would return the Logical Port Number. Well my solution wasn't to change pidx and no, we don't want to change ntb_peer_resource_idx() to return the logical port number because that's not how it's being used and I have no use for a ntb_peer_port_global_idx() function. Functions are supposed to be added by code that needs to call them, not by some grand design. Part of the reason we have this confusing mess is because the API was reviewed and merged before any users of the API were presented. Usually this is not accepted in kernel development. My suggestion is to simply say that the existing port numbers are the logical port number and have the drivers handle the physical port number mapping internally. That requires the fewest changes. > IMO In case of the 2nd solution I'd also suggest to rename the > ntb_peer_resource_idx() method into ntb_peer_port_global_idx(), > and then consider the current port indexes used in the NTB API > as local port indexes. The resource indexing can be abstracted > by a macro like this: > #define ntb_peer_resource_idx ntb_peer_port_global_idx That define would not be useful. > Finally in order to close the space up we'd also need to define > a method: ntb_port_global_idx(), which would return a Logical (global) > index of local port. Again, I'd rather not add a bunch of large semantic and infrastructure changes at this point. It's confusing enough as it is and we don't need to introduce yet another indexing scheme API to the clients that really do not need it. What the clients need is a simple API to decide which resources to use for which peers, and to figure out which peers used which resources. ntb_peer_port_idx() and ntb_peer_resource_idx() suit these purposes. Nothing else really needs to exist. >> Where the Physical Port Number is whatever the hardware uses and the >> logical port number is a numbering scheme starting with zero with no >> gaps. Then the port indexes are still as we currently have them. If we >> say that the port numbers we have now are the Logical Port Number, then >> ntb_peer_resource_idx() is correct. >> > > Current port numbers are the physical port numbers with gaps. I think that's up for interpretation as, based on the existing code, I naturally interpreted it the other way and therefore it's pretty simple to say that it's the logical port number and fix the one driver that needs to change. > That's why we > introduced the port-index NTB API abstraction in the first place, to have these gaps > eliminated and to provide a simple way of bulk setup. Although that abstraction > turned out not that suitable to distribute the shared resources. So > the Logical (Global) indexing is needed to do it (that's what ntb_pingpong used > to do and ntb_perf still does now). My interpretation of the port-index was simply to match what was done in the two port case seeing code like ntb_transport simply uses the default 0 as the port index. There was no reason to believe, based on the code, that there would be gaps. >> I would strongly argue that the clients don't need to know anything >> about the Physical Port Number and these should be handled strictly >> inside the drivers. If multiple drivers need to do something similar to >> map the logical to physical port numbers then we should introduce helper >> functions to allow them to do so. If the Physical Numbers are not >> contained in the driver than the API would need to be expanded to expose >> which numbers are actually used to avoid needing to constantly loop >> through all the indexes to find this out. >> > > Absolutely agree with you. The main idea of NTB API was to provide a set > of methods to access the NTB hardware without any abstractions but > with possible useful helpers, like your NTB MSI library, or transport library, > or anything else. So the physical port numbers must be available for > the client drivers. Huh? How can you say you absolutely agree with me? I said the clients should not need to know about physical port numbers and you said the physical port numbers *must* be available to clients. I think this statement needs to be justified. Why should the clients need to know about the physical port numbers? >> On a similar vein, I'd suggest that most clients shouldn't even really >> need to do anything with the Logical Port Number and should deal largely >> with Port Indexes. Ideally, Logical Port Numbers should only be used by >> helper code in the common layer to help assign resources used by the >> clients (like ntb_peer_resource_idx()). >> > > This is the main question. Do we really need the current port indexes > implementation at all? After all these years of NTB API usage I don't really > see it useful in any case except to loop over the outbound MW resources > automatically skipping the local port (usefulness of this is also questionable). > As I already said I created the port-index table this way due to the IDT NTB > MWs peculiarity, which doesn't seem to me a big problem now comparing to all > these additional complications we intend to introduce. To me, it seems like the port indexes are used everywhere. A client almost always wants to deal with every port except itself. That seems entirely natural to me. Using the port index as the resource index makes perfect sense (which is why I implemented the inverse ntb_peer_resource_idx()). > The rest of the drivers code really need to have the Logical (global) > port indexes, at least to distribute the shared resources, and don't use > the current pidx that much. Drivers don't care about port indexes or how the resources are distributed. I would expect that all they'd do is map the port index (which all existing APIs take) to the physical address and program hardware as necessary. > Wouldn't it be better to just redefine the current port-index table > in the following way? > ntb_port_number() - local physical port number, > ntb_port_idx() - local port logical (global) index, > ntb_peer_port_count() - total number of ports NTB device provide (including the local one), > ntb_peer_port_number() - physical port number of the peer with passed logical port index, > ntb_peer_port_idx - logical port index of the passed physical port number. No, from the perspective of the client, the physical port numbers are useless. I don't want the count to include the local one because that means I'd always have to subtract one every time I want to loop through all peers (I never want to loop through all ports including the local one). I have no idea what your distinction between logical (global) index and logical port index is. In fact I find this all confusing. > while currently we have: > ntb_port_number() - local physical port number, > ntb_peer_port_count() - total number of ports NTB device provide (excluding the local one), > ntb_peer_port_number() - physical port number of the peer with passed port index, > ntb_peer_port_idx - port index of the passed physical port number; What we currently have all makes perfect sense to me and is exactly what I want for the clients, except I think ntb_port_number() needs to be the logical port number (with the gaps removed), so that we can more easily implement ntb_peer_resource_idx(). If ntb_port_number() corresponds to the physical port number, then the correct implementation of ntb_peer_resource_idx() is going to be crazy, essentially needing to generate a logical port number for every port, every time it's called. Logan
On Wed, Mar 06, 2019 at 04:22:33PM -0700, Logan Gunthorpe wrote: > > On 2019-03-06 3:45 p.m., Serge Semin wrote: > [Snip] > > Pretty sure everything above is just agreement... > > > So your current approach is inbound MW-centralized, while mine is developed > > around the outbound MWs. > > I don't think this has anything to do with inbound vs outbound. The > problem is the same no matter from which direction you assign things. > > >> Physical Port Number: 0 2 4 6 8 12 16 20 > >> Logical Port Number: 0 1 2 3 4 5 6 7 > >> Peer Index (Port 0): x 0 1 2 3 4 5 6 > >> Port Index (Port 8): 0 1 2 3 x 4 5 6 > >> (etc) > > > > That's what I suggested in the two possible solutions: > > 1st solution: replace current pidx with Logical Port Number, > > 2nd solution: alter ntb_peer_resource_idx() so it would return the Logical Port Number. > > Well my solution wasn't to change pidx and no, we don't want to change > ntb_peer_resource_idx() to return the logical port number because that's > not how it's being used and I have no use for a > ntb_peer_port_global_idx() function. Functions are supposed to be added > by code that needs to call them, not by some grand design. Part of the > reason we have this confusing mess is because the API was reviewed and > merged before any users of the API were presented. Usually this is not > accepted in kernel development. > > My suggestion is to simply say that the existing port numbers are the > logical port number and have the drivers handle the physical port number > mapping internally. That requires the fewest changes. > Now I see what you were suggesting. The solution you propose doesn't require any changes in your patches, but requires to update IDT driver. It also makes a change in the ports numbering semantics, which you said you didn't want to introduce. (read all further comments to get my conclusion regarding it) > > IMO In case of the 2nd solution I'd also suggest to rename the > > ntb_peer_resource_idx() method into ntb_peer_port_global_idx(), > > and then consider the current port indexes used in the NTB API > > as local port indexes. The resource indexing can be abstracted > > by a macro like this: > > #define ntb_peer_resource_idx ntb_peer_port_global_idx > > That define would not be useful. > > > Finally in order to close the space up we'd also need to define > > a method: ntb_port_global_idx(), which would return a Logical (global) > > index of local port. > > Again, I'd rather not add a bunch of large semantic and infrastructure > changes at this point. It's confusing enough as it is and we don't need > to introduce yet another indexing scheme API to the clients that really > do not need it. What the clients need is a simple API to decide which > resources to use for which peers, and to figure out which peers used > which resources. ntb_peer_port_idx() and ntb_peer_resource_idx() suit > these purposes. Nothing else really needs to exist. > If you don't want to add a large semantic and infrastructure change at this point, then it would be better to leave NTB port API the way it is now, and use logical port indexes in the ntb_peer_resource_idx() method. You'd also need to use this method to access both outbound and inbound memory windows. In this case we wouldn't need to change anything in current API and drivers. Yes, this approach would cause one resource being wasted, but it would lead to simpler semantic of the port numbers API. Don't get me wrong. I am not completely against the approach you suggest. It just has its pros and cons. The same way as mine does. I'll explain my opinion later in this email. > >> Where the Physical Port Number is whatever the hardware uses and the > >> logical port number is a numbering scheme starting with zero with no > >> gaps. Then the port indexes are still as we currently have them. If we > >> say that the port numbers we have now are the Logical Port Number, then > >> ntb_peer_resource_idx() is correct. > >> > > > > Current port numbers are the physical port numbers with gaps. > > I think that's up for interpretation as, based on the existing code, I > naturally interpreted it the other way and therefore it's pretty simple > to say that it's the logical port number and fix the one driver that > needs to change. > > > That's why we > > introduced the port-index NTB API abstraction in the first place, to have these gaps > > eliminated and to provide a simple way of bulk setup. Although that abstraction > > turned out not that suitable to distribute the shared resources. So > > the Logical (Global) indexing is needed to do it (that's what ntb_pingpong used > > to do and ntb_perf still does now). > > My interpretation of the port-index was simply to match what was done in > the two port case seeing code like ntb_transport simply uses the default > 0 as the port index. There was no reason to believe, based on the code, > that there would be gaps. > Sorry man, but how could you base your interpretation on a code, which didn't support multi-port case in the first place and just couldn't provide you a full impression by definition? You knew ntb_transport doesn't support the multi-port NTB devices, right? Moreover as far as I remember we already concerned similar problem in a discussion of your patches submitted for ntb_pingpong driver. You knew ntb_pingpong and ntb_perf driver support multi-port devices. So you could get your interpretation from there. BTW I didn't figure out it at that time, but you could fix the ntb_pingpong driver just by replacing the strict inequality in the conditional statement: --- a/drivers/ntb/test/ntb_pingpong.c +++ b/drivers/ntb/test/ntb_pingpong.c @@ -299,7 +299,7 @@ static void pp_init_flds(struct pp_ctx *pp) lport = ntb_port_number(pp->ntb); pcnt = ntb_peer_port_count(pp->ntb); for (pidx = 0; pidx < pcnt; pidx++) { - if (lport < ntb_peer_port_number(pp->ntb, pidx)) + if (lport <= ntb_peer_port_number(pp->ntb, pidx)) break; } This loop just finds out the logical index (as you named it) of the local port, which then is used to select a doorbell bit from the shared doorbell register. The similar algo is used in the ntb_perf driver. > >> I would strongly argue that the clients don't need to know anything > >> about the Physical Port Number and these should be handled strictly > >> inside the drivers. If multiple drivers need to do something similar to > >> map the logical to physical port numbers then we should introduce helper > >> functions to allow them to do so. If the Physical Numbers are not > >> contained in the driver than the API would need to be expanded to expose > >> which numbers are actually used to avoid needing to constantly loop > >> through all the indexes to find this out. > >> > > > > Absolutely agree with you. The main idea of NTB API was to provide a set > > of methods to access the NTB hardware without any abstractions but > > with possible useful helpers, like your NTB MSI library, or transport library, > > or anything else. So the physical port numbers must be available for > > the client drivers. > > Huh? How can you say you absolutely agree with me? I said the clients > should not need to know about physical port numbers and you said the > physical port numbers *must* be available to clients. I think this > statement needs to be justified. Why should the clients need to know > about the physical port numbers? > Ah, I misunderstood your statement. I thought you implied the other way around. I disagree then. Client drivers should be somehow able to retrieve the real physical port number. Physical port numbers can be connected with some ports-specific functionality (and they are in our projects), so they are used to enable/disable corresponding code of the client drivers. Additionally IDT PCIe NTB functions can be disabled/enabled by the device firmware, so one device may have for instance ports 2 4 6 with NTB, while another - ports 2 6 12, and so on. Both of these ports-set would be mapped to the same logical indexes space - 0 1 2. In case of having just the logical indexes, the code would have an impression of working with the same hardware, while in fact it isn't. You said: "Part of the reason we have this confusing mess is because the API was reviewed and merged before any users of the API were presented. Usually this is not accepted in kernel development." A source code of my project is using current port API and happy with it, so there was at least one user of the API at the time of the API submission. I bet there are others now, since I constantly get private questions regarding the IDT hardware configurations. So please don't be so judgemental. If you see some confusing from your point of view things it doesn't always mean it is a mess, you just may misunderstand something. I am sure a pro with experience like yours doesn't need this to be explained. > >> On a similar vein, I'd suggest that most clients shouldn't even really > >> need to do anything with the Logical Port Number and should deal largely > >> with Port Indexes. Ideally, Logical Port Numbers should only be used by > >> helper code in the common layer to help assign resources used by the > >> clients (like ntb_peer_resource_idx()). > >> > > > > This is the main question. Do we really need the current port indexes > > implementation at all? After all these years of NTB API usage I don't really > > see it useful in any case except to loop over the outbound MW resources > > automatically skipping the local port (usefulness of this is also questionable). > > As I already said I created the port-index table this way due to the IDT NTB > > MWs peculiarity, which doesn't seem to me a big problem now comparing to all > > these additional complications we intend to introduce. > > To me, it seems like the port indexes are used everywhere. A client > almost always wants to deal with every port except itself. That seems > entirely natural to me. Using the port index as the resource index makes > perfect sense (which is why I implemented the inverse > ntb_peer_resource_idx()). > > > The rest of the drivers code really need to have the Logical (global) > > port indexes, at least to distribute the shared resources, and don't use > > the current pidx that much. > > Drivers don't care about port indexes or how the resources are > distributed. I would expect that all they'd do is map the port index > (which all existing APIs take) to the physical address and program > hardware as necessary. > > > Wouldn't it be better to just redefine the current port-index table > > in the following way? > > ntb_port_number() - local physical port number, > > ntb_port_idx() - local port logical (global) index, > > ntb_peer_port_count() - total number of ports NTB device provide (including the local one), > > ntb_peer_port_number() - physical port number of the peer with passed logical port index, > > ntb_peer_port_idx - logical port index of the passed physical port number. > > No, from the perspective of the client, the physical port numbers are > useless. I don't want the count to include the local one because that > means I'd always have to subtract one every time I want to loop through > all peers (I never want to loop through all ports including the local > one). I have no idea what your distinction between logical (global) > index and logical port index is. In fact I find this all confusing. > > > while currently we have: > > ntb_port_number() - local physical port number, > > ntb_peer_port_count() - total number of ports NTB device provide (excluding the local one), > > ntb_peer_port_number() - physical port number of the peer with passed port index, > > ntb_peer_port_idx - port index of the passed physical port number; > > What we currently have all makes perfect sense to me and is exactly what > I want for the clients, except I think ntb_port_number() needs to be the > logical port number (with the gaps removed), so that we can more easily > implement ntb_peer_resource_idx(). If ntb_port_number() corresponds to > the physical port number, then the correct implementation of > ntb_peer_resource_idx() is going to be crazy, essentially needing to > generate a logical port number for every port, every time it's called. > > Logan > To sum the discussion up we now have two possible solutions of the problem discovered in this patch: 1) Leave current NTB port API as is, but fix this patch so the ntb_peer_resource_idx() method would return a port logical index: Physical port numbers: 0 2 4 6 8 12 16 20 (ntb_peer_port_number()/ntb_port_number()) Logical port numbers: 0 1 2 3 4 5 6 7 (ntb_peer_resource_idx()) Port indexes (Port 0): x 0 1 2 3 4 5 6 (ntb_peer_port_idx()) Port indexes (Port 8): 0 1 2 3 x 4 5 6 (ntb_peer_port_idx()) + pros: no need to change current NTB port API and available drivers; no need to apply the gap-less limitation on the NTB port numbers sequence; - cons: use ntb_peer_resource_idx() method to distribute a shared resource on each side of NTB (although it might be neutral or pros from some point of view); waste one memory window (no problem with shared doorbell register). 2) Apply the gap-less limitation on the NTB port numbers sequence, so the ntb_peer_port_number()/ntb_port_number() would return the logical port index: Physical port numbers: 0 2 4 6 8 12 16 20 (no NTB API method to get these numbers) Logical port numbers: 0 1 2 3 4 5 6 7 (ntb_peer_port_number()/ntb_port_number()) Port indexes (Port 0): x 0 1 2 3 4 5 6 (ntb_peer_port_idx()) Port indexes (Port 8): 0 1 2 3 x 4 5 6 (ntb_peer_port_idx()) + pros: no waste of one memory window; use port numbers returned from ntb_peer_port_number()//ntb_port_number() to distribute doorbell bits and outbound MWs; use ntb_peer_resource_idx() just at the peer side to select a proper inbound MW; - cons: need to change the port-index callbacks of the IDT hardware driver; introduces a change in the port numbers semantic; NTB port API gets to be an abstraction over the real physical port numbers, while the later ones won't de directly available to retrieve. The rest of the solutions would lead to overcomplications in the NTB port API, which we don't want to introduce. Personally after all the considerations I am now more inclined to the (2)-nd solution. Even though it causes more changes and makes the ports API more abstract, it provides a way to create a simpler shared resources distribution code as well as to exactly distribute the necessary number of memory windows. While the physical port number still can be found by client drivers directly from pci_dev descriptor. The final decision regarding the solution is after the subsystem maintainers. But although the provided by this patchset NTB MSI library consists some part with multi-port API utilization like MWs distribution, as I said in comments to the other patches, it doesn't really support the only multi-ports NTB device currently available - IDT (which I only interested in). So I don't see a reason why I should bother with providing a patch with alterations to the IDT hardware driver unless this patchset provides a portable NTB MSI library. -Sergey > -- > You received this message because you are subscribed to the Google Groups "linux-ntb" group. > To unsubscribe from this group and stop receiving emails from it, send an email to linux-ntb+unsubscribe@googlegroups.com. > To post to this group, send email to linux-ntb@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/linux-ntb/4acf79c1-766b-1db6-45c1-4cdd4cbba437%40deltatee.com. > For more options, visit https://groups.google.com/d/optout.
On 2019-03-12 2:42 p.m., Serge Semin wrote: > If you don't want to add a large semantic and infrastructure change at > this point, then it would be better to leave NTB port API the way it is > now, and use logical port indexes in the ntb_peer_resource_idx() method. > You'd also need to use this method to access both outbound and inbound > memory windows. In this case we wouldn't need to change anything in > current API and drivers. Yes, this approach would cause one resource being > wasted, but it would lead to simpler semantic of the port numbers API. Well my proposal would not require any changes in the API, it would just require a change to the IDT driver. And the mess in the test tools will still be a mess. > Sorry man, but how could you base your interpretation on a code, which didn't > support multi-port case in the first place and just couldn't provide you a full > impression by definition? You knew ntb_transport doesn't support the multi-port > NTB devices, right? Moreover as far as I remember we already concerned similar > problem in a discussion of your patches submitted for ntb_pingpong driver. > You knew ntb_pingpong and ntb_perf driver support multi-port devices. > So you could get your interpretation from there. Yes, but the test drivers were a mess and difficult to follow. Only now am I realizing that ntb_perf required one too many resources by design, thus are probably very broken for cases that previously worked because of it. > > BTW I didn't figure out it at that time, but you could fix the ntb_pingpong > driver just by replacing the strict inequality in the conditional statement: > --- a/drivers/ntb/test/ntb_pingpong.c > +++ b/drivers/ntb/test/ntb_pingpong.c > @@ -299,7 +299,7 @@ static void pp_init_flds(struct pp_ctx *pp) > lport = ntb_port_number(pp->ntb); > pcnt = ntb_peer_port_count(pp->ntb); > for (pidx = 0; pidx < pcnt; pidx++) { > - if (lport < ntb_peer_port_number(pp->ntb, pidx)) > + if (lport <= ntb_peer_port_number(pp->ntb, pidx)) > break; > } > > This loop just finds out the logical index (as you named it) of the local port, > which then is used to select a doorbell bit from the shared doorbell register. > The similar algo is used in the ntb_perf driver. Yes, it just feels overly complex to have to do that loop every time. > Ah, I misunderstood your statement. I thought you implied the other way around. > I disagree then. Client drivers should be somehow able to retrieve the real physical port > number. Physical port numbers can be connected with some ports-specific functionality > (and they are in our projects), so they are used to enable/disable corresponding > code of the client drivers. Ok, you're saying that the user will need to be able to map these ports somehow to their physical address. I buy that, but NTB transport for example, doesn't really have any method for this. You just get network interfaces that will likely be numbered similarly to the logical port number. But that's a whole other problem that will need to be solved later when there is multi-port ntb-transport. > You said: "Part of the reason we have this confusing mess is because the API was > reviewed and merged before any users of the API were presented. Usually this is not > accepted in kernel development." A source code of my project is using current port > API and happy with it, so there was at least one user of the API at the time of > the API submission. I bet there are others now, since I constantly get private questions > regarding the IDT hardware configurations. So please don't be so judgemental. If you see > some confusing from your point of view things it doesn't always mean it is a mess, > you just may misunderstand something. I am sure a pro with experience like yours > doesn't need this to be explained. Users of the code that are not in upstream do not count. Developers cannot look at that code and reason about how the API is supposed to be used. This isn't being judgemental, it's stating a fact: kernel developers do not like having incomplete code in upstream[1][2]. When it happens, it just makes the code very hard to maintain and very hard to develop on. Many developers call this a disaster and commonly call for the code in question to be removed. I can understand this completely because I'm facing the exact same issues trying to work with the current upstream NTB code. > - cons: > use ntb_peer_resource_idx() method to distribute a shared resource on each side > of NTB (although it might be neutral or pros from some point of view); > waste one memory window (no problem with shared doorbell register). I think wasting one memory window is a no-go and should never have been merged in the first place. The code originally worked fine in the situation where you have 2 peers, each with one memory window, and that needs to be maintained. > The rest of the solutions would lead to overcomplications in the NTB port API, > which we don't want to introduce. Well, frankly, it's a mess right now so we just have to deal with it and try to find a short term solution to start fixing it. Complexity be damned. > Personally after all the considerations I am now more inclined to the (2)-nd > solution. Even though it causes more changes and makes the ports API > more abstract, it provides a way to create a simpler shared resources > distribution code as well as to exactly distribute the necessary number > of memory windows. While the physical port number still can be found by > client drivers directly from pci_dev descriptor. Ok I think, for v3, I'll introduce a logical_port_number helper function which is essentially the loop you proposed. It's needlessly slow (altho speed isn't an issue) and ugly but at least, I think, it should be as close to correct as we can get. Someone else will have to eventually clean up all the test tools because I suspect they are still broken (but they at least work for me after my fixup set was merged). I personally have no interest in working on NTB code anymore unless I am being contracted to do so. > The final decision regarding the solution is after the subsystem maintainers. > But although the provided by this patchset NTB MSI library consists some part > with multi-port API utilization like MWs distribution, as I said in comments > to the other patches, it doesn't really support the only multi-ports NTB device > currently available - IDT (which I only interested in). So I don't see a reason > why I should bother with providing a patch with alterations to the IDT hardware > driver unless this patchset provides a portable NTB MSI library. Yes, well the reason it won't work is because of the current mess which I don't feel like I should be responsible for sorting out. My code works correctly for the existing ntb_transport and I've made a best effort to get as much multiport support as I feel I can, given the available infrastructure. Logan [1] https://lwn.net/Articles/757124/ [2] https://lwn.net/ml/linux-kernel/20190130075256.GA29665@lst.de/
diff --git a/include/linux/ntb.h b/include/linux/ntb.h index 181d16601dd9..f5c69d853489 100644 --- a/include/linux/ntb.h +++ b/include/linux/ntb.h @@ -1502,4 +1502,74 @@ static inline int ntb_peer_msg_write(struct ntb_dev *ntb, int pidx, int midx, return ntb->ops->peer_msg_write(ntb, pidx, midx, msg); } +/** + * ntb_peer_resource_idx() - get a resource index for a given peer idx + * @ntb: NTB device context. + * @pidx: Peer port index. + * + * When constructing a graph of peers, each remote peer must use a different + * resource index (mw, doorbell, etc) to communicate with each other + * peer. + * + * In a two peer system, this function should always return 0 such that + * resource 0 points to the remote peer on both ports. + * + * In a 5 peer system, this function will return the following matrix + * + * pidx \ port 0 1 2 3 4 + * 0 0 0 1 2 3 + * 1 0 1 2 3 4 + * 2 0 1 2 3 4 + * 3 0 1 2 3 4 + * + * For example, if this function is used to program peer's memory + * windows, port 0 will program MW 0 on all it's peers to point to itself. + * port 1 will program MW 0 in port 0 to point to itself and MW 1 on all + * other ports. etc. + * + * For the legacy two host case, ntb_port_number() and ntb_peer_port_number() + * both return zero and therefore this function will always return zero. + * So MW 0 on each host would be programmed to point to the other host. + * + * Return: the resource index to use for that peer. + */ +static inline int ntb_peer_resource_idx(struct ntb_dev *ntb, int pidx) +{ + int local_port, peer_port; + + if (pidx >= ntb_peer_port_count(ntb)) + return -EINVAL; + + local_port = ntb_port_number(ntb); + peer_port = ntb_peer_port_number(ntb, pidx); + + if (peer_port < local_port) + return local_port - 1; + else + return local_port; +} + +/** + * ntb_peer_highest_mw_idx() - get a memory window index for a given peer idx + * using the highest index memory windows first + * + * @ntb: NTB device context. + * @pidx: Peer port index. + * + * Like ntb_peer_resource_idx(), except it returns indexes starting with + * last memory window index. + * + * Return: the resource index to use for that peer. + */ +static inline int ntb_peer_highest_mw_idx(struct ntb_dev *ntb, int pidx) +{ + int ret; + + ret = ntb_peer_resource_idx(ntb, pidx); + if (ret < 0) + return ret; + + return ntb_mw_count(ntb, pidx) - ret - 1; +} + #endif
When using multi-ports each port uses resources (dbs, msgs, mws, etc) on every other port. Creating a mapping for these resources such that each port has a corresponding resource on every other port is a bit tricky. Introduce the ntb_peer_resource_idx() function for this purpose. It returns the peer resource number that will correspond with the local peer index on the remote peer. Also, introduce ntb_peer_highest_mw_idx() which will use ntb_peer_resource_idx() but return the MW index starting with the highest index and working down. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Allen Hubbe <allenbh@gmail.com> --- include/linux/ntb.h | 70 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+)