[v6,00/25] RTRS (former IBTRS) rdma transport library and RNBD (former IBNBD) rdma network block device

Message ID 20191230102942.18395-1-jinpuwang@gmail.com (mailing list archive)

Message

Jinpu Wang Dec. 30, 2019, 10:29 a.m. UTC
Hi all,

here is V6 of the RTRS (former IBTRS) rdma transport library and the
corresponding RNBD (former IBNBD) rdma network block device.

Changelog since v5:
1 rebased to linux-5.5-rc4
2 fix typo in my email address in first patch
3 cleanup copyright as suggested by Leon Romanovsky
4 remove 2 redundant kobject_del in error path as suggested by Leon Romanovsky
5 add MAINTAINERS entries in alphabetical order as Gal Pressman suggested


Introduction
-------------

RTRS (RDMA Transport) is a reliable high speed transport library
which allows for establishing connections between client and server
machines via RDMA. It is based on RDMA-CM, so it is also expected to
support RoCE and iWARP, but we have mainly tested it in an IB
environment. It is optimized to
transfer (read/write) IO blocks in the sense that it follows the BIO
semantics of providing the possibility to either write data from a
scatter-gather list to the remote side or to request ("read") data
transfer from the remote side into a given set of buffers.

RTRS is multipath capable and provides I/O fail-over and load-balancing
functionality. In RTRS terminology, a path is a set of RDMA connections,
and a particular path is selected according to the load-balancing policy.
RTRS is not bound to RNBD and can also be used by other components.
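
To illustrate the path selection described above, here is a minimal
user-space sketch in C. It assumes a round-robin policy; the names
struct path, struct session and pick_path_rr are hypothetical and are
not taken from the RTRS sources, which may implement their policies
differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one path: a set of RDMA connections to the server. */
struct path {
	bool connected;   /* false while the path is down or reconnecting */
};

/* Hypothetical session that owns several paths to the same server. */
struct session {
	struct path *paths;
	size_t n_paths;
	size_t next;      /* round-robin cursor */
};

/*
 * Return the index of the next usable path in round-robin order, or -1
 * if no path is connected. Skipping disconnected paths is what gives
 * fail-over in this sketch: IO simply moves to the remaining paths.
 */
static int pick_path_rr(struct session *s)
{
	for (size_t i = 0; i < s->n_paths; i++) {
		size_t idx = (s->next + i) % s->n_paths;

		if (s->paths[idx].connected) {
			s->next = (idx + 1) % s->n_paths;
			return (int)idx;
		}
	}
	return -1;
}
```

A caller would invoke pick_path_rr() once per IO and fail the request
only when it returns -1, i.e. when every path is down.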

RNBD (RDMA Network Block Device) is a pair of kernel modules
(client and server) that allow for remote access to a block device on
the server over the RTRS protocol. After being mapped, the remote block
devices can be accessed on the client side as local block devices.
Internally RNBD uses RTRS as an RDMA transport library.

The kernel commits can be found here:
   https://github.com/ionos-enterprise/ibnbd/commits/linux-5.5-rc4-ibnbd-v6
The out-of-tree modules are here:
   https://github.com/ionos-enterprise/ibnbd

As always, please review and share your comments.

Thanks.

Jack Wang (25):
  sysfs: export sysfs_remove_file_self()
  rtrs: public interface header to establish RDMA connections
  rtrs: private headers with rtrs protocol structs and helpers
  rtrs: core: lib functions shared between client and server modules
  rtrs: client: private header with client structs and functions
  rtrs: client: main functionality
  rtrs: client: statistics functions
  rtrs: client: sysfs interface functions
  rtrs: server: private header with server structs and functions
  rtrs: server: main functionality
  rtrs: server: statistics functions
  rtrs: server: sysfs interface functions
  rtrs: include client and server modules into kernel compilation
  rtrs: a bit of documentation
  rnbd: private headers with rnbd protocol structs and helpers
  rnbd: client: private header with client structs and functions
  rnbd: client: main functionality
  rnbd: client: sysfs interface functions
  rnbd: server: private header with server structs and functions
  rnbd: server: main functionality
  rnbd: server: functionality for IO submission to file or block dev
  rnbd: server: sysfs interface functions
  rnbd: include client and server modules into kernel compilation
  rnbd: a bit of documentation
  MAINTAINERS: Add maintainers for RNBD/RTRS modules

 Documentation/ABI/testing/sysfs-block-rnbd    |   51 +
 .../ABI/testing/sysfs-class-rnbd-client       |  117 +
 .../ABI/testing/sysfs-class-rnbd-server       |   57 +
 .../ABI/testing/sysfs-class-rtrs-client       |  190 ++
 .../ABI/testing/sysfs-class-rtrs-server       |   81 +
 MAINTAINERS                                   |   14 +
 drivers/block/Kconfig                         |    2 +
 drivers/block/Makefile                        |    1 +
 drivers/block/rnbd/Kconfig                    |   28 +
 drivers/block/rnbd/Makefile                   |   17 +
 drivers/block/rnbd/README                     |   92 +
 drivers/block/rnbd/rnbd-clt-sysfs.c           |  641 ++++
 drivers/block/rnbd/rnbd-clt.c                 | 1743 ++++++++++
 drivers/block/rnbd/rnbd-clt.h                 |  151 +
 drivers/block/rnbd/rnbd-common.c              |   25 +
 drivers/block/rnbd/rnbd-log.h                 |   43 +
 drivers/block/rnbd/rnbd-proto.h               |  307 ++
 drivers/block/rnbd/rnbd-srv-dev.c             |  144 +
 drivers/block/rnbd/rnbd-srv-dev.h             |  112 +
 drivers/block/rnbd/rnbd-srv-sysfs.c           |  213 ++
 drivers/block/rnbd/rnbd-srv.c                 |  864 +++++
 drivers/block/rnbd/rnbd-srv.h                 |   81 +
 drivers/infiniband/Kconfig                    |    1 +
 drivers/infiniband/ulp/Makefile               |    1 +
 drivers/infiniband/ulp/rtrs/Kconfig           |   27 +
 drivers/infiniband/ulp/rtrs/Makefile          |   17 +
 drivers/infiniband/ulp/rtrs/README            |  149 +
 drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c  |  435 +++
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c  |  501 +++
 drivers/infiniband/ulp/rtrs/rtrs-clt.c        | 2934 +++++++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-clt.h        |  296 ++
 drivers/infiniband/ulp/rtrs/rtrs-log.h        |   32 +
 drivers/infiniband/ulp/rtrs/rtrs-pri.h        |  408 +++
 drivers/infiniband/ulp/rtrs/rtrs-srv-stats.c  |   91 +
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c  |  297 ++
 drivers/infiniband/ulp/rtrs/rtrs-srv.c        | 2169 ++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-srv.h        |  141 +
 drivers/infiniband/ulp/rtrs/rtrs.c            |  628 ++++
 drivers/infiniband/ulp/rtrs/rtrs.h            |  316 ++
 fs/sysfs/file.c                               |    1 +
 40 files changed, 13418 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-block-rnbd
 create mode 100644 Documentation/ABI/testing/sysfs-class-rnbd-client
 create mode 100644 Documentation/ABI/testing/sysfs-class-rnbd-server
 create mode 100644 Documentation/ABI/testing/sysfs-class-rtrs-client
 create mode 100644 Documentation/ABI/testing/sysfs-class-rtrs-server
 create mode 100644 drivers/block/rnbd/Kconfig
 create mode 100644 drivers/block/rnbd/Makefile
 create mode 100644 drivers/block/rnbd/README
 create mode 100644 drivers/block/rnbd/rnbd-clt-sysfs.c
 create mode 100644 drivers/block/rnbd/rnbd-clt.c
 create mode 100644 drivers/block/rnbd/rnbd-clt.h
 create mode 100644 drivers/block/rnbd/rnbd-common.c
 create mode 100644 drivers/block/rnbd/rnbd-log.h
 create mode 100644 drivers/block/rnbd/rnbd-proto.h
 create mode 100644 drivers/block/rnbd/rnbd-srv-dev.c
 create mode 100644 drivers/block/rnbd/rnbd-srv-dev.h
 create mode 100644 drivers/block/rnbd/rnbd-srv-sysfs.c
 create mode 100644 drivers/block/rnbd/rnbd-srv.c
 create mode 100644 drivers/block/rnbd/rnbd-srv.h
 create mode 100644 drivers/infiniband/ulp/rtrs/Kconfig
 create mode 100644 drivers/infiniband/ulp/rtrs/Makefile
 create mode 100644 drivers/infiniband/ulp/rtrs/README
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-clt.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-clt.h
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-log.h
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-pri.h
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-srv-stats.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-srv.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-srv.h
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs.h

Comments

Bart Van Assche Dec. 31, 2019, 12:11 a.m. UTC | #1
On 2019-12-30 02:29, Jack Wang wrote:
> here is V6 of the RTRS (former IBTRS) rdma transport library and the
> corresponding RNBD (former IBNBD) rdma network block device.

Please provide more information about the RTRS_IO_RSP_IMM and
RTRS_IO_RSP_W_INV_IMM server to client message types. Does one of these
message types perhaps mean that the receiver of the message is
responsible for invalidating the rkey associated with the RDMA transfer?

Thanks,

Bart.
Bart Van Assche Dec. 31, 2019, 2:39 a.m. UTC | #2
On 2019-12-30 02:29, Jack Wang wrote:
> here is V6 of the RTRS (former IBTRS) rdma transport library and the
> corresponding RNBD (former IBNBD) rdma network block device.
> 
> Changelog since v5:
> 1 rebased to linux-5.5-rc4
> 2 fix typo in my email address in first patch
> 3 cleanup copyright as suggested by Leon Romanovsky
> 4 remove 2 redundant kobject_del in error path as suggested by Leon Romanovsky
> 5 add MAINTAINERS entries in alphabetical order as Gal Pressman suggested

Please always include the full changelog when posting a new version.
Every other Linux kernel patch series I have seen includes a full
changelog in version two and later versions of its cover letter.

Information about how this patch series has been tested would be
welcome. How big were the changes between v4 and v5 and how much testing
have these changes received? Was this patch series tested in the Ionos
data center or is it the out-of-tree version of these drivers that runs
in the Ionos data center?

Thanks,

Bart.
Jinpu Wang Jan. 2, 2020, 8:48 a.m. UTC | #3
On Tue, Dec 31, 2019 at 1:11 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2019-12-30 02:29, Jack Wang wrote:
> > here is V6 of the RTRS (former IBTRS) rdma transport library and the
> > corresponding RNBD (former IBNBD) rdma network block device.
>
> Please provide more information about the RTRS_IO_RSP_IMM and
> RTRS_IO_RSP_W_INV_IMM server to client message types. Does one of these
> message types perhaps mean that the receiver of the message is
> responsible for invalidating the rkey associated with the RDMA transfer?
>
> Thanks,
>
> Bart.
Hi Bart,

You're right, RTRS_IO_RSP_W_INV_IMM means that the client, upon receiving
the message, should invalidate the rkey associated with the RDMA transfer.

We will document it in the PROTOCOL part of the README.
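
The rule above can be sketched in user-space C as follows. This is only
an illustration: the two message type names come from this thread, but
the enum values, struct client_mr, invalidate_rkey and handle_io_rsp are
hypothetical. The thread does not say what the client does for
RTRS_IO_RSP_IMM, so the sketch assumes nothing needs to be done there.

```c
#include <stdbool.h>

/*
 * The two names come from the thread; the numeric values are placeholders
 * chosen for this sketch, not the values used on the wire.
 */
enum rsp_type {
	RTRS_IO_RSP_IMM,
	RTRS_IO_RSP_W_INV_IMM,
};

/* Hypothetical client-side view of a registered memory region. */
struct client_mr {
	unsigned int rkey;
	bool valid;       /* true while the rkey can still be used */
};

/* Stand-in for posting a local invalidate request for mr->rkey. */
static void invalidate_rkey(struct client_mr *mr)
{
	mr->valid = false;
}

/*
 * Handle an IO response on the client. Only RTRS_IO_RSP_W_INV_IMM makes
 * the receiver responsible for invalidating the rkey.
 */
static void handle_io_rsp(enum rsp_type type, struct client_mr *mr)
{
	if (type == RTRS_IO_RSP_W_INV_IMM)
		invalidate_rkey(mr);
	/* Assumption: for RTRS_IO_RSP_IMM the rkey is left untouched here. */
}
```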

Thanks,
Jack
Jinpu Wang Jan. 2, 2020, 9:20 a.m. UTC | #4
On Tue, Dec 31, 2019 at 3:39 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2019-12-30 02:29, Jack Wang wrote:
> > here is V6 of the RTRS (former IBTRS) rdma transport library and the
> > corresponding RNBD (former IBNBD) rdma network block device.
> >
> > Changelog since v5:
> > 1 rebased to linux-5.5-rc4
> > 2 fix typo in my email address in first patch
> > 3 cleanup copyright as suggested by Leon Romanovsky
> > 4 remove 2 redundant kobject_del in error path as suggested by Leon Romanovsky
> > 5 add MAINTAINERS entries in alphabetical order as Gal Pressman suggested
>
> Please always include the full changelog when posting a new version.
> Every other Linux kernel patch series I have seen includes a full
> changelog in version two and later versions of its cover letter.
Sorry, it was my mistake, will include the full changelog next time.
>
> Information about how this patch series has been tested would be
> welcome. How big were the changes between v4 and v5 and how much testing
> have these changes received? Was this patch series tested in the Ionos
> data center or is it the out-of-tree version of these drivers that runs
> in the Ionos data center?
As mentioned in the v5 cover letter, the changes between v4 and v5
"'
 Main changes are the following:
1. Fix the security problem pointed out by Jason
2. Implement code-style/readability/API/etc suggestions by Bart van Assche
3. Rename IBTRS and IBNBD to RTRS and RNBD accordingly
4. Fileio mode support in rnbd-srv has been removed.

The main functional change is a fix for the security problem pointed out by
Jason and discussed both on the mailing list and during the last LPC
RDMA MC 2019.
On the server side we now invalidate in RTRS each rdma buffer before we hand it
over to RNBD server and in turn to the block layer. A new rkey is generated and
registered for the buffer after it returns back from the block layer and RNBD
server. The new rkey is sent back to the client along with the IO result.
This procedure is the default behaviour of the driver. The invalidation and
registration on each IO causes a performance drop of up to 20%. A user of the
driver may choose to load the modules with this mechanism switched off
(always_invalidate=N) if they understand and accept the risk of a malicious
client being able to corrupt the memory of a server it is connected to. This might
be a reasonable option in a scenario where all the clients and all the servers
are located within a secure datacenter.
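
The per-IO cycle described above can be modelled with a small user-space
sketch in C. The flag name always_invalidate is taken from the text;
struct srv_buf, buf_before_io and buf_after_io are hypothetical, and
incrementing the key merely stands in for generating and registering a
fresh rkey.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one server-side RDMA buffer. */
struct srv_buf {
	uint32_t rkey;     /* key the client uses to access the buffer */
	bool registered;   /* false while the block layer owns the buffer */
};

/* Mirrors the module parameter named in the text; the default is on. */
static bool always_invalidate = true;

/* Called before the buffer is handed to RNBD and the block layer. */
static void buf_before_io(struct srv_buf *b)
{
	if (always_invalidate)
		b->registered = false;   /* the old rkey can no longer be used */
}

/*
 * Called when the buffer comes back. Returns the rkey to send to the
 * client together with the IO result.
 */
static uint32_t buf_after_io(struct srv_buf *b)
{
	if (always_invalidate) {
		b->rkey++;               /* placeholder for generating a fresh key */
		b->registered = true;
	}
	return b->rkey;
}
```

With always_invalidate switched off, both helpers leave the buffer
alone and the client keeps using the same rkey, which is the trade-off
between performance and exposure to a malicious client discussed above.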

Huge thanks to Bart van Assche for the very detailed review of both RNBD and
RTRS. These included suggestions for style fixes, better readability and
documentation, code simplifications, eliminating usage of deprecated APIs,
too many to name.

The transport library and the network block device using it have been renamed to
RTRS and RNBD accordingly in order to reflect the fact that they are based on
the rdma subsystem and not bound to InfiniBand only.

Fileio mode support in rnbd-server is not very efficient, as pointed out by
Bart, and a loop device can be used in between if needed, hence we just
removed the fileio mode support.
"'
Regarding testing, all the changes have been tested with our
regression tests in our staging environment in the IONOS data center.
There are around 200 test cases, run for both the always_invalidate=N and
always_invalidate=Y configurations.

I will mention it in the cover letter next time.

Thanks for your comments, Bart.
>
> Thanks,
>
> Bart.
Jason Gunthorpe Jan. 2, 2020, 6:28 p.m. UTC | #5
On Mon, Dec 30, 2019 at 06:39:00PM -0800, Bart Van Assche wrote:
> On 2019-12-30 02:29, Jack Wang wrote:
> > here is V6 of the RTRS (former IBTRS) rdma transport library and the
> > corresponding RNBD (former IBNBD) rdma network block device.
> > 
> > Changelog since v5:
> > 1 rebased to linux-5.5-rc4
> > 2 fix typo in my email address in first patch
> > 3 cleanup copyright as suggested by Leon Romanovsky
> > 4 remove 2 redundant kobject_del in error path as suggested by Leon Romanovsky
> > 5 add MAINTAINERS entries in alphabetical order as Gal Pressman suggested
> 
> Please always include the full changelog when posting a new version.
> Every other Linux kernel patch series I have seen includes a full
> changelog in version two and later versions of its cover letter.

We now also like it if you include URLs to lore.kernel.org for the
prior submissions.

Jason
Jinpu Wang Jan. 3, 2020, 12:34 p.m. UTC | #6
On Thu, Jan 2, 2020 at 7:28 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Mon, Dec 30, 2019 at 06:39:00PM -0800, Bart Van Assche wrote:
> > On 2019-12-30 02:29, Jack Wang wrote:
> > > here is V6 of the RTRS (former IBTRS) rdma transport library and the
> > > corresponding RNBD (former IBNBD) rdma network block device.
> > >
> > > Changelog since v5:
> > > 1 rebased to linux-5.5-rc4
> > > 2 fix typo in my email address in first patch
> > > 3 cleanup copyright as suggested by Leon Romanovsky
> > > 4 remove 2 redundant kobject_del in error path as suggested by Leon Romanovsky
> > > 5 add MAINTAINERS entries in alphabetical order as Gal Pressman suggested
> >
> > Please always include the full changelog when posting a new version.
> > Every other Linux kernel patch series I have seen includes a full
> > changelog in version two and later versions of its cover letter.
>
> We now also like it if you include URLs to lore.kernel.org for the
> prior submissions.
>
> Jason
Will do.