Message ID | 54B63028.3090701@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Alex, On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > On 12/02/2014 01:57 AM, Will Deacon wrote: > > This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't > > actually called outside of a few drivers) into arch_setup_dma_ops, so > > that we can use IOMMUs for DMA transfers in a more generic fashion. > > > > Since this significantly complicates the arch_setup_dma_ops function, > > it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU > > is not set, the iommu parameter is ignored and the normal ops are used > > instead. > > A series for IOMMU support with Tegra/Nouveau ceased to work after this > patch. Which series? This code shouldn't even be executed unless you start using of_xlate and the generic bindings. > The Tegra IOMMU is not registered by the time the DT is parsed, > and thus all devices end up without the proper DMA ops set up because > the phandle to the IOMMU cannot be resolved. You might want to look at the patches posted for the exynos, renesas and ARM SMMUs for some hints in how to use the new API. > Subsequently calling arm_iommu_create_mapping() and > arm_iommu_attach_device() from the driver (as I used to do until 3.18) > does not help since the call to set_dma_ops() has been moved out of > arm_iommu_attach_device(). Therefore there seems to be no way for a device > to gets its correct DMA ops unless the IOMMU is ready by the time the DT > is parsed. > > Also potentially affected by this are the Rockchip DRM and OMAP3 ISP > drivers, which follow the same pattern. I don't understand why any code currently in mainline should be affected. Please can you elaborate on the failure case? > This raises the following questions: > > 1) Why are arm_iommu_create_mapping() and arm_iommu_attach_device() > still public since they cannot set the DMA ops and thus seem to be > useless outside of arch_setup_dma_ops()? It has callers outside of the file. I'd like to make it static, but that means doing some non-trivial porting of all the callers, which I'm also unable to test. > 2) Say you want to use the IOMMU API in your driver, and have an iommu > property in your device's DT node. If by chance your IOMMU is registered > early, you will already have a mapping automatically created even before > your probe function is called. Can this be avoided? Is it even safe? Currently, I think you have to either teardown the ops manually or return an error from of_xlate. Thierry was also looking at this sort of thing, so it might be worth talking to him. Will
Hi Will, Alexandre, Daniel, Am Mittwoch, 14. Januar 2015, 10:46:10 schrieb Will Deacon: > Hi Alex, > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > On 12/02/2014 01:57 AM, Will Deacon wrote: > > > This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't > > > actually called outside of a few drivers) into arch_setup_dma_ops, so > > > that we can use IOMMUs for DMA transfers in a more generic fashion. > > > > > > Since this significantly complicates the arch_setup_dma_ops function, > > > it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU > > > is not set, the iommu parameter is ignored and the normal ops are used > > > instead. > > > > A series for IOMMU support with Tegra/Nouveau ceased to work after this > > patch. > > Which series? This code shouldn't even be executed unless you start using > of_xlate and the generic bindings. > > > The Tegra IOMMU is not registered by the time the DT is parsed, > > and thus all devices end up without the proper DMA ops set up because > > the phandle to the IOMMU cannot be resolved. > > You might want to look at the patches posted for the exynos, renesas and ARM > SMMUs for some hints in how to use the new API. > > > Subsequently calling arm_iommu_create_mapping() and > > arm_iommu_attach_device() from the driver (as I used to do until 3.18) > > does not help since the call to set_dma_ops() has been moved out of > > arm_iommu_attach_device(). Therefore there seems to be no way for a device > > to gets its correct DMA ops unless the IOMMU is ready by the time the DT > > is parsed. > > > > Also potentially affected by this are the Rockchip DRM and OMAP3 ISP > > drivers, which follow the same pattern. > > I don't understand why any code currently in mainline should be affected. > Please can you elaborate on the failure case? As Alexandre suspected the new Rockchip drm code seems to be affected by this. I hadn't played with the drm code before last weekend and was then stumbling over different iommu related issues. As I hadn't to much contact with iommus till now I didn't get very far. But with Alexandre's bandaid patch of adding set_dma_ops(dev, &iommu_ops); to arm_iommu_attach_device both problems go away. So to elaborate on the two failure cases: When attaching either hdmi or vga connectors at runtime, I get [drm:drm_mode_debug_printmodeline] Modeline 26:"1024x768" 85 94500 1024 1072 1168 1376 768 769 772 808 0x40 0x5 [drm:drm_crtc_helper_set_mode] [CRTC:14] [drm:vop_crtc_dpms] crtc[14] mode[0] rockchip-vop ff930000.vop: Attached to iommu domain [drm] processing encoder TMDS-18 [drm] processing encoder DAC-22 [drm] processing connector HDMI-A-1 [drm] processing connector VGA-1 [drm:vop_win_update] [PLANE:12] [FB:-1->24] update rk_iommu ff930300.iommu: Page fault at 0x2d400500 of type read rk_iommu ff930300.iommu: iova = 0x2d400500: dte_index: 0xb5 pte_index: 0x0 page_offset: 0x500 rk_iommu ff930300.iommu: mmu_dte_addr: 0x2e3b3000 dte@0x2e3b32d4: 0x000000 valid: 0 pte@0x00000000: 0x000000 valid: 0 page@0x00000000 flags: 0x0 [drm:drm_crtc_helper_set_mode] [ENCODER:22:DAC-22] set [MODE:26:1024x768] =============== When my wip vga-connector is plugged in at boot, I get [drm:drm_target_preferred] found mode 1280x1024 [drm:drm_setup_crtcs] picking CRTCs for 4096x4096 config [drm:drm_setup_crtcs] desired mode 1280x1024 set on crtc 14 (0,0) ------------[ cut here ]------------ WARNING: CPU: 1 PID: 33 at mm/page_alloc.c:2645 __alloc_pages_nodemask+0x18c/0x6a8() Modules linked in: CPU: 1 PID: 33 Comm: kworker/u8:1 Not tainted 3.19.0-rc1+ #1512 Hardware name: Rockchip Cortex-A9 (Device Tree) Workqueue: deferwq deferred_probe_work_func [<c00148e4>] (unwind_backtrace) from [<c00111e0>] (show_stack+0x10/0x14) [<c00111e0>] (show_stack) from [<c0426564>] (dump_stack+0x6c/0x84) [<c0426564>] (dump_stack) from [<c0021f34>] (warn_slowpath_common+0x80/0xac) [<c0021f34>] (warn_slowpath_common) from [<c0021f78>] (warn_slowpath_null+0x18/0x1c) [<c0021f78>] (warn_slowpath_null) from [<c008b114>] (__alloc_pages_nodemask+0x18c/0x6a8) [<c008b114>] (__alloc_pages_nodemask) from [<c001a274>] (__dma_alloc_buffer.isra.18+0x2c/0x80) [<c001a274>] (__dma_alloc_buffer.isra.18) from [<c001a2dc>] (__alloc_remap_buffer.isra.22+0x14/0x5c) [<c001a2dc>] (__alloc_remap_buffer.isra.22) from [<c001a490>] (__dma_alloc+0x16c/0x1d8) [<c001a490>] (__dma_alloc) from [<c001a614>] (arm_dma_alloc+0x84/0x90) [<c001a614>] (arm_dma_alloc) from [<c022f610>] (rockchip_gem_create_object+0x8c/0xc4) [<c022f610>] (rockchip_gem_create_object) from [<c022f158>] (rockchip_drm_fbdev_create+0x6c/0x1ec) [<c022f158>] (rockchip_drm_fbdev_create) from [<c021499c>] (drm_fb_helper_initial_config+0x230/0x328) [<c021499c>] (drm_fb_helper_initial_config) from [<c022f388>] (rockchip_drm_fbdev_init+0xa4/0xc0) [<c022f388>] (rockchip_drm_fbdev_init) from [<c022ea18>] (rockchip_drm_load+0x1b8/0x1f4) [<c022ea18>] (rockchip_drm_load) from [<c021c218>] (drm_dev_register+0x80/0x100) [<c021c218>] (drm_dev_register) from [<c022e77c>] (rockchip_drm_bind+0x48/0x74) [<c022e77c>] (rockchip_drm_bind) from [<c0234a58>] (try_to_bring_up_master.part.2+0xa4/0xf4) [<c0234a58>] (try_to_bring_up_master.part.2) from [<c0234c58>] (component_add+0x9c/0x104) [<c0234c58>] (component_add) from [<c023a1e4>] (platform_drv_probe+0x48/0x90) [<c023a1e4>] (platform_drv_probe) from [<c0238c30>] (driver_probe_device+0x130/0x340) [<c0238c30>] (driver_probe_device) from [<c0237270>] (bus_for_each_drv+0x70/0x84) [<c0237270>] (bus_for_each_drv) from [<c0238a8c>] (device_attach+0x64/0x88) [<c0238a8c>] (device_attach) from [<c0238090>] (bus_probe_device+0x28/0x98) [<c0238090>] (bus_probe_device) from [<c0238548>] (deferred_probe_work_func+0x78/0xa4) [<c0238548>] (deferred_probe_work_func) from [<c0034008>] (process_one_work+0x1c8/0x2f4) [<c0034008>] (process_one_work) from [<c0034448>] (worker_thread+0x2e8/0x450) [<c0034448>] (worker_thread) from [<c00380a8>] (kthread+0xdc/0xf0) [<c00380a8>] (kthread) from [<c000e8b8>] (ret_from_fork+0x14/0x3c) ---[ end trace ce23b3730f8c0d32 ]--- [drm:rockchip_gem_create_object] *ERROR* failed to allocate 0x500000 byte dma buffer where Daniel Kurtz already deducted in private: "But, more importantly, this call stack has "arm_dma_alloc", which suggests that are not actually using the iommu dma allocators. The allocation should have been handled by arm_iommu_alloc_attrs(), but for some reason, it is not. The iommu allocator should have been installed as the allocator for the drm device by the call to arm_iommu_attach_device() in rockchip_drm_load." > > This raises the following questions: > > > > 1) Why are arm_iommu_create_mapping() and arm_iommu_attach_device() > > still public since they cannot set the DMA ops and thus seem to be > > useless outside of arch_setup_dma_ops()? > > It has callers outside of the file. I'd like to make it static, but that > means doing some non-trivial porting of all the callers, which I'm also > unable to test. > > > 2) Say you want to use the IOMMU API in your driver, and have an iommu > > property in your device's DT node. If by chance your IOMMU is registered > > early, you will already have a mapping automatically created even before > > your probe function is called. Can this be avoided? Is it even safe? > > Currently, I think you have to either teardown the ops manually or return > an error from of_xlate. Thierry was also looking at this sort of thing, > so it might be worth talking to him. > > Will [drm:output_poll_execute] [CONNECTOR:23:VGA-1] status updated from disconnected to connected [drm:drm_fb_helper_hotplug_event] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] disconnected [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] [drm:drm_mode_debug_printmodeline] Modeline 26:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5 [drm:drm_mode_prune_invalid] Not using 1280x1024 mode 12 [drm:drm_mode_debug_printmodeline] Modeline 27:"1152x864" 0 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5 [drm:drm_mode_prune_invalid] Not using 1152x864 mode 12 [drm:drm_mode_debug_printmodeline] Modeline 38:"1280x1024" 0 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5 [drm:drm_mode_prune_invalid] Not using 1280x1024 mode 12 [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] probed modes : [drm:drm_mode_debug_printmodeline] Modeline 28:"1024x768" 85 94500 1024 1072 1168 1376 768 769 772 808 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 39:"1024x768" 75 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 40:"1024x768" 70 75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 41:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 42:"832x624" 75 57284 832 864 928 1152 624 625 628 667 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 29:"800x600" 85 56250 800 832 896 1048 600 601 604 631 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 43:"800x600" 75 49500 800 816 896 1056 600 601 604 625 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 44:"800x600" 72 50000 800 856 976 1040 600 637 643 666 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 31:"800x600" 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 32:"800x600" 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 30:"640x480" 85 36000 640 696 752 832 480 481 484 509 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 33:"640x480" 75 31500 640 656 720 840 480 481 484 500 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 34:"640x480" 73 31500 640 664 704 832 480 489 491 520 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 35:"640x480" 67 30240 640 704 768 864 480 483 486 525 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 36:"640x480" 60 25200 640 656 752 800 480 490 492 525 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 37:"720x400" 70 28320 720 738 846 900 400 412 414 449 0x40 0x6 [drm:drm_setup_crtcs] [drm:drm_enable_connectors] connector 20 enabled? no [drm:drm_enable_connectors] connector 23 enabled? yes [drm:drm_target_preferred] looking for cmdline mode on connector 23 [drm:drm_target_preferred] looking for preferred mode on connector 23 0 [drm:drm_target_preferred] found mode 1024x768 [drm:drm_setup_crtcs] picking CRTCs for 4096x4096 config [drm:drm_setup_crtcs] desired mode 1024x768 set on crtc 14 (0,0) [drm:drm_crtc_helper_set_config] [drm:drm_crtc_helper_set_config] [CRTC:8] [NOFB] [drm:vop_crtc_dpms] crtc[8] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:vop_crtc_dpms] crtc[14] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:drm_crtc_helper_set_config] [drm:drm_crtc_helper_set_config] [CRTC:14] [FB:24] #connectors=1 (x y) (0 0) [drm:drm_crtc_helper_set_config] crtc has no fb, full mode set [drm:drm_crtc_helper_set_config] modes are different, full mode set [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0 [drm:drm_mode_debug_printmodeline] Modeline 26:"1024x768" 85 94500 1024 1072 1168 1376 768 769 772 808 0x40 0x5 [drm:drm_crtc_helper_set_config] encoder changed, full mode switch [drm:drm_crtc_helper_set_config] [CONNECTOR:20:HDMI-A-1] to [NOCRTC] [drm:drm_crtc_helper_set_config] crtc changed, full mode switch [drm:drm_crtc_helper_set_config] [CONNECTOR:23:VGA-1] to [CRTC:14] [drm:drm_crtc_helper_set_config] attempting to set mode from userspace [drm:drm_mode_debug_printmodeline] Modeline 26:"1024x768" 85 94500 1024 1072 1168 1376 768 769 772 808 0x40 0x5 [drm:drm_crtc_helper_set_mode] [CRTC:14] [drm:vop_crtc_dpms] crtc[14] mode[0] rockchip-vop ff930000.vop: Attached to iommu domain [drm] processing encoder TMDS-18 [drm] processing encoder DAC-22 [drm] processing connector HDMI-A-1 [drm] processing connector VGA-1 [drm:vop_win_update] [PLANE:12] [FB:-1->24] update rk_iommu ff930300.iommu: Page fault at 0x2d400500 of type read rk_iommu ff930300.iommu: iova = 0x2d400500: dte_index: 0xb5 pte_index: 0x0 page_offset: 0x500 rk_iommu ff930300.iommu: mmu_dte_addr: 0x2e3b3000 dte@0x2e3b32d4: 0x000000 valid: 0 pte@0x00000000: 0x000000 valid: 0 page@0x00000000 flags: 0x0 [drm:drm_crtc_helper_set_mode] [ENCODER:22:DAC-22] set [MODE:26:1024x768] [drm:drm_crtc_helper_set_config] Setting connector DPMS state to on [drm:drm_crtc_helper_set_config] [CONNECTOR:23:VGA-1] set DPMS on [drm:vop_crtc_dpms] crtc[8] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:drm_do_probe_ddc_edid] drm: skipping non-existent adapter rk3x-i2c [drm:output_poll_execute] [CONNECTOR:23:VGA-1] status updated from connected to disconnected [drm:drm_fb_helper_hotplug_event] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] disconnected [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent adapter rk3x-i2c [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] disconnected [drm:drm_setup_crtcs] [drm:drm_enable_connectors] connector 20 enabled? no [drm:drm_enable_connectors] connector 23 enabled? no [drm:drm_setup_crtcs] picking CRTCs for 4096x4096 config [drm:drm_crtc_helper_set_config] [drm:drm_crtc_helper_set_config] [CRTC:8] [NOFB] [drm:vop_crtc_dpms] crtc[8] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:drm_crtc_helper_set_config] [drm:drm_crtc_helper_set_config] [CRTC:14] [NOFB] [drm:vop_crtc_dpms] crtc[8] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:vop_crtc_dpms] crtc[14] mode[3] rk_iommu ff930300.iommu: Enable stall request timed out, status: 0x00004b rk_iommu ff930300.iommu: Disable paging request timed out, status: 0x00004b rockchip-vop ff930000.vop: Detached from iommu domain rockchip-drm display-subsystem: bound ff940000.vop (ops vop_component_ops) rockchip-drm display-subsystem: bound ff930000.vop (ops vop_component_ops) dwhdmi-rockchip ff980000.hdmi: Detected HDMI controller 0x20:0xa:0xa0:0xc1 rockchip-drm display-subsystem: bound ff980000.hdmi (ops dw_hdmi_rockchip_ops) rockchip-drm display-subsystem: bound vga-bridge (ops vga_bridge_ops) [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [drm] No driver support for vblank timestamp query. [drm:vop_crtc_dpms] crtc[8] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:vop_crtc_dpms] crtc[14] mode[3] [drm:vop_crtc_dpms] desired dpms mode is same as previous one. [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:20:HDMI-A-1] disconnected [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:23:VGA-1] probed modes : [drm:drm_mode_debug_printmodeline] Modeline 25:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5 [drm:drm_mode_debug_printmodeline] Modeline 37:"1280x1024" 75 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 26:"1152x864" 75 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 27:"1024x768" 85 94500 1024 1072 1168 1376 768 769 772 808 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 38:"1024x768" 75 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 39:"1024x768" 70 75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 40:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 41:"832x624" 75 57284 832 864 928 1152 624 625 628 667 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 28:"800x600" 85 56250 800 832 896 1048 600 601 604 631 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 42:"800x600" 75 49500 800 816 896 1056 600 601 604 625 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 43:"800x600" 72 50000 800 856 976 1040 600 637 643 666 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 30:"800x600" 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 31:"800x600" 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5 [drm:drm_mode_debug_printmodeline] Modeline 29:"640x480" 85 36000 640 696 752 832 480 481 484 509 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 32:"640x480" 75 31500 640 656 720 840 480 481 484 500 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 33:"640x480" 73 31500 640 664 704 832 480 489 491 520 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 34:"640x480" 67 30240 640 704 768 864 480 483 486 525 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 35:"640x480" 60 25200 640 656 752 800 480 490 492 525 0x40 0xa [drm:drm_mode_debug_printmodeline] Modeline 36:"720x400" 70 28320 720 738 846 900 400 412 414 449 0x40 0x6 [drm:drm_setup_crtcs] [drm:drm_enable_connectors] connector 20 enabled? no [drm:drm_enable_connectors] connector 23 enabled? yes [drm:drm_target_preferred] looking for cmdline mode on connector 23 [drm:drm_target_preferred] looking for preferred mode on connector 23 0 [drm:drm_target_preferred] found mode 1280x1024 [drm:drm_setup_crtcs] picking CRTCs for 4096x4096 config [drm:drm_setup_crtcs] desired mode 1280x1024 set on crtc 14 (0,0) ------------[ cut here ]------------ WARNING: CPU: 1 PID: 33 at mm/page_alloc.c:2645 __alloc_pages_nodemask+0x18c/0x6a8() Modules linked in: CPU: 1 PID: 33 Comm: kworker/u8:1 Not tainted 3.19.0-rc1+ #1512 Hardware name: Rockchip Cortex-A9 (Device Tree) Workqueue: deferwq deferred_probe_work_func [<c00148e4>] (unwind_backtrace) from [<c00111e0>] (show_stack+0x10/0x14) [<c00111e0>] (show_stack) from [<c0426564>] (dump_stack+0x6c/0x84) [<c0426564>] (dump_stack) from [<c0021f34>] (warn_slowpath_common+0x80/0xac) [<c0021f34>] (warn_slowpath_common) from [<c0021f78>] (warn_slowpath_null+0x18/0x1c) [<c0021f78>] (warn_slowpath_null) from [<c008b114>] (__alloc_pages_nodemask+0x18c/0x6a8) [<c008b114>] (__alloc_pages_nodemask) from [<c001a274>] (__dma_alloc_buffer.isra.18+0x2c/0x80) [<c001a274>] (__dma_alloc_buffer.isra.18) from [<c001a2dc>] (__alloc_remap_buffer.isra.22+0x14/0x5c) [<c001a2dc>] (__alloc_remap_buffer.isra.22) from [<c001a490>] (__dma_alloc+0x16c/0x1d8) [<c001a490>] (__dma_alloc) from [<c001a614>] (arm_dma_alloc+0x84/0x90) [<c001a614>] (arm_dma_alloc) from [<c022f610>] (rockchip_gem_create_object+0x8c/0xc4) [<c022f610>] (rockchip_gem_create_object) from [<c022f158>] (rockchip_drm_fbdev_create+0x6c/0x1ec) [<c022f158>] (rockchip_drm_fbdev_create) from [<c021499c>] (drm_fb_helper_initial_config+0x230/0x328) [<c021499c>] (drm_fb_helper_initial_config) from [<c022f388>] (rockchip_drm_fbdev_init+0xa4/0xc0) [<c022f388>] (rockchip_drm_fbdev_init) from [<c022ea18>] (rockchip_drm_load+0x1b8/0x1f4) [<c022ea18>] (rockchip_drm_load) from [<c021c218>] (drm_dev_register+0x80/0x100) [<c021c218>] (drm_dev_register) from [<c022e77c>] (rockchip_drm_bind+0x48/0x74) [<c022e77c>] (rockchip_drm_bind) from [<c0234a58>] (try_to_bring_up_master.part.2+0xa4/0xf4) [<c0234a58>] (try_to_bring_up_master.part.2) from [<c0234c58>] (component_add+0x9c/0x104) [<c0234c58>] (component_add) from [<c023a1e4>] (platform_drv_probe+0x48/0x90) [<c023a1e4>] (platform_drv_probe) from [<c0238c30>] (driver_probe_device+0x130/0x340) [<c0238c30>] (driver_probe_device) from [<c0237270>] (bus_for_each_drv+0x70/0x84) [<c0237270>] (bus_for_each_drv) from [<c0238a8c>] (device_attach+0x64/0x88) [<c0238a8c>] (device_attach) from [<c0238090>] (bus_probe_device+0x28/0x98) [<c0238090>] (bus_probe_device) from [<c0238548>] (deferred_probe_work_func+0x78/0xa4) [<c0238548>] (deferred_probe_work_func) from [<c0034008>] (process_one_work+0x1c8/0x2f4) [<c0034008>] (process_one_work) from [<c0034448>] (worker_thread+0x2e8/0x450) [<c0034448>] (worker_thread) from [<c00380a8>] (kthread+0xdc/0xf0) [<c00380a8>] (kthread) from [<c000e8b8>] (ret_from_fork+0x14/0x3c) ---[ end trace ce23b3730f8c0d32 ]--- [drm:rockchip_gem_create_object] *ERROR* failed to allocate 0x500000 byte dma buffer
On Wed, Jan 14, 2015 at 01:51:36PM +0000, Heiko Stübner wrote: > Am Mittwoch, 14. Januar 2015, 10:46:10 schrieb Will Deacon: > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > On 12/02/2014 01:57 AM, Will Deacon wrote: > > > > This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't > > > > actually called outside of a few drivers) into arch_setup_dma_ops, so > > > > that we can use IOMMUs for DMA transfers in a more generic fashion. > > > > > > > > Since this significantly complicates the arch_setup_dma_ops function, > > > > it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU > > > > is not set, the iommu parameter is ignored and the normal ops are used > > > > instead. > > > > > > A series for IOMMU support with Tegra/Nouveau ceased to work after this > > > patch. > > > > Which series? This code shouldn't even be executed unless you start using > > of_xlate and the generic bindings. > > > > > The Tegra IOMMU is not registered by the time the DT is parsed, > > > and thus all devices end up without the proper DMA ops set up because > > > the phandle to the IOMMU cannot be resolved. > > > > You might want to look at the patches posted for the exynos, renesas and ARM > > SMMUs for some hints in how to use the new API. > > > > > Subsequently calling arm_iommu_create_mapping() and > > > arm_iommu_attach_device() from the driver (as I used to do until 3.18) > > > does not help since the call to set_dma_ops() has been moved out of > > > arm_iommu_attach_device(). Therefore there seems to be no way for a device > > > to gets its correct DMA ops unless the IOMMU is ready by the time the DT > > > is parsed. > > > > > > Also potentially affected by this are the Rockchip DRM and OMAP3 ISP > > > drivers, which follow the same pattern. > > > > I don't understand why any code currently in mainline should be affected. > > Please can you elaborate on the failure case? > > As Alexandre suspected the new Rockchip drm code seems to be affected by > this. I hadn't played with the drm code before last weekend and was then > stumbling over different iommu related issues. As I hadn't to much contact > with iommus till now I didn't get very far. > > But with Alexandre's bandaid patch of adding > set_dma_ops(dev, &iommu_ops); > to arm_iommu_attach_device both problems go away. > > > So to elaborate on the two failure cases: Aha, I see what you mean now -- the issue is that attaching to an IOMMU domain no longer swizzles the DMA ops. Furthermore, we also need to take into account the coherency of the device, which we didn't do before (and assumedly "worked" because all of the users happened to be non-coherent). Maybe we just need to add something like arm_iommu_swizzle_dma_ops, so that direct users of the arm_iommu_* API can manage things manually? Will
On 01/14/2015 07:46 PM, Will Deacon wrote: > Hi Alex, > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: >> On 12/02/2014 01:57 AM, Will Deacon wrote: >>> This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't >>> actually called outside of a few drivers) into arch_setup_dma_ops, so >>> that we can use IOMMUs for DMA transfers in a more generic fashion. >>> >>> Since this significantly complicates the arch_setup_dma_ops function, >>> it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU >>> is not set, the iommu parameter is ignored and the normal ops are used >>> instead. >> >> A series for IOMMU support with Tegra/Nouveau ceased to work after this >> patch. > > Which series? This code shouldn't even be executed unless you start using > of_xlate and the generic bindings. > >> The Tegra IOMMU is not registered by the time the DT is parsed, >> and thus all devices end up without the proper DMA ops set up because >> the phandle to the IOMMU cannot be resolved. > > You might want to look at the patches posted for the exynos, renesas and ARM > SMMUs for some hints in how to use the new API. > >> Subsequently calling arm_iommu_create_mapping() and >> arm_iommu_attach_device() from the driver (as I used to do until 3.18) >> does not help since the call to set_dma_ops() has been moved out of >> arm_iommu_attach_device(). Therefore there seems to be no way for a device >> to gets its correct DMA ops unless the IOMMU is ready by the time the DT >> is parsed. >> >> Also potentially affected by this are the Rockchip DRM and OMAP3 ISP >> drivers, which follow the same pattern. > > I don't understand why any code currently in mainline should be affected. > Please can you elaborate on the failure case? Here is the sequence of events: 1) DT is populated and of_dma_configure() is called for every device. Here is the stack trace: of_dma_configure of_platform_device_create_pdata of_platform_bus_create of_platform_bus_create of_platform_populate customize_machine do_one_initcall of_dma_configure() is supposed to set the iommu_ops for every device (and this is currently the only place in the kernel where this can happen), but since the IOMMU is not ready yet it falls back to the default arm_dma_ops. 2) Driver calls arm_iommu_create_mapping() and arm_iommu_attach_device() (which were already called during step 1 btw). Both calls succeed, but since arm_iommu_attach_device() does not change the ops anymore we are still left with arm_dma_ops. 3) Things don't go well. :) There are several problems with the way things are currently setup IMHO: 1) CONFIG_ARM_DMA_USE_IOMMU forcibly makes the DMA API go through the IOMMU. This effectively forbids the simultaneous use of the IOMMU API and DMA API because of address-space collisions issue. Being a kernel config option, there is no way to turn this behavior off which will likely become a problem for multi-configs where some platforms might want this and some other not. 2) IIUC arm_iommu_*() are now not supposed to be called directly by drivers anymore. At least that's what their current behavior makes me think, as well as the fact that they are not defined if CONFIG_ARM_DMA_USE_IOMMU is not set. Yet they are still public and drivers that use them have not been updated consequently. 3) This is a longer-standing issue, but IIUC the fact that the IOMMU VM used by the DMA API is not available effectively prevents anyone from using the DMA API behind a IOMMU and the IOMMU API simultaneously (which you now have to assume is always the case because of 1)). These are huge issues - under the current conditions the only safe path for me is to eschew the DMA API completely and port my work to the IOMMU API, and I suspect this will affect other people as well. > >> This raises the following questions: >> >> 1) Why are arm_iommu_create_mapping() and arm_iommu_attach_device() >> still public since they cannot set the DMA ops and thus seem to be >> useless outside of arch_setup_dma_ops()? > > It has callers outside of the file. I'd like to make it static, but that > means doing some non-trivial porting of all the callers, which I'm also > unable to test. Yeah, but all the callers of these functions are broken anyway as of 3.18-rc4... > >> 2) Say you want to use the IOMMU API in your driver, and have an iommu >> property in your device's DT node. If by chance your IOMMU is registered >> early, you will already have a mapping automatically created even before >> your probe function is called. Can this be avoided? Is it even safe? > > Currently, I think you have to either teardown the ops manually or return > an error from of_xlate. Thierry was also looking at this sort of thing, > so it might be worth talking to him. Tearing down the ops manually does not sound like something drivers should do - how can they even be aware of the private dma_ops anyway? Thanks, Alex.
On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: [...] > > 2) Say you want to use the IOMMU API in your driver, and have an iommu > > property in your device's DT node. If by chance your IOMMU is registered > > early, you will already have a mapping automatically created even before > > your probe function is called. Can this be avoided? Is it even safe? > > Currently, I think you have to either teardown the ops manually or return > an error from of_xlate. Thierry was also looking at this sort of thing, > so it might be worth talking to him. I already explained in earlier threads why I think this is a bad idea. It's completely unnatural for any driver to manually tear down something that it didn't want set up in the first place. It also means that you have to carefully audit any users of these IOMMU APIs to make sure that they do tear down. That doesn't sound like a good incremental approach, as evidenced by the breakage that Alex and Heiko have encountered. The solution for me has been to completely side-step the issue and not register the IOMMU with the new mechanism at all. That is, there's no .of_xlate() implementation, which means that the ARM DMA API glue won't try to be smart and use the IOMMU in ways it's not meant to be used. This has several advantages, such as that I can also use the regular driver model for suspend/resume of the IOMMU, and I get to enjoy the benefits of devres in the IOMMU driver. Probe ordering is still a tiny issue, but we can easily solve that using explicit initcall ordering (which really isn't any worse than IOMMU_OF_DECLARE()). Ideally of course I could use deferred probing to solve that. Hiroshi and I posted patches to implement that over a year ago[0], but it was mostly ignored after some, in my opinion, fruitful discussion. That patch requires drivers to explicitly call iommu_attach() to advertise that they want to use the IOMMU. This was done because a previous attempt to call iommu_attach() from the driver core was rejected on the sole argument that "it doesn't belong in the core", without further explanation. Note that with that approach things all happen at driver probe time, so we can simply return -EPROBE_DEFER when the IOMMU isn't ready yet. At the same time the IOMMU can be a regular driver just like any of the other resource providers like clocks, regulators, etc. That all said, I'm fine with the current situation, too. Nobody can force the Tegra SMMU driver to register via IOMMU_OF_DECLARE() and I'm actively recommending the use of the IOMMU API directly. That way we get the control we need for drivers where it matters most (i.e. where close control of the address space and switching of address spaces per process are needed). Any other drivers, like USB or SDMMC simply use the DMA API and get their buffers from a contiguous pool. Thierry [0]: https://lkml.org/lkml/2014/6/26/476
On Wed, Jan 14, 2015 at 07:17:50PM +0000, Will Deacon wrote: > On Wed, Jan 14, 2015 at 01:51:36PM +0000, Heiko Stübner wrote: > > Am Mittwoch, 14. Januar 2015, 10:46:10 schrieb Will Deacon: > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > > On 12/02/2014 01:57 AM, Will Deacon wrote: > > > > > This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't > > > > > actually called outside of a few drivers) into arch_setup_dma_ops, so > > > > > that we can use IOMMUs for DMA transfers in a more generic fashion. > > > > > > > > > > Since this significantly complicates the arch_setup_dma_ops function, > > > > > it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU > > > > > is not set, the iommu parameter is ignored and the normal ops are used > > > > > instead. > > > > > > > > A series for IOMMU support with Tegra/Nouveau ceased to work after this > > > > patch. > > > > > > Which series? This code shouldn't even be executed unless you start using > > > of_xlate and the generic bindings. > > > > > > > The Tegra IOMMU is not registered by the time the DT is parsed, > > > > and thus all devices end up without the proper DMA ops set up because > > > > the phandle to the IOMMU cannot be resolved. > > > > > > You might want to look at the patches posted for the exynos, renesas and ARM > > > SMMUs for some hints in how to use the new API. > > > > > > > Subsequently calling arm_iommu_create_mapping() and > > > > arm_iommu_attach_device() from the driver (as I used to do until 3.18) > > > > does not help since the call to set_dma_ops() has been moved out of > > > > arm_iommu_attach_device(). Therefore there seems to be no way for a device > > > > to gets its correct DMA ops unless the IOMMU is ready by the time the DT > > > > is parsed. > > > > > > > > Also potentially affected by this are the Rockchip DRM and OMAP3 ISP > > > > drivers, which follow the same pattern. > > > > > > I don't understand why any code currently in mainline should be affected. > > > Please can you elaborate on the failure case? > > > > As Alexandre suspected the new Rockchip drm code seems to be affected by > > this. I hadn't played with the drm code before last weekend and was then > > stumbling over different iommu related issues. As I hadn't to much contact > > with iommus till now I didn't get very far. > > > > But with Alexandre's bandaid patch of adding > > set_dma_ops(dev, &iommu_ops); > > to arm_iommu_attach_device both problems go away. > > > > > > So to elaborate on the two failure cases: > > Aha, I see what you mean now -- the issue is that attaching to an IOMMU > domain no longer swizzles the DMA ops. Furthermore, we also need to take > into account the coherency of the device, which we didn't do before (and > assumedly "worked" because all of the users happened to be non-coherent). > > Maybe we just need to add something like arm_iommu_swizzle_dma_ops, so > that direct users of the arm_iommu_* API can manage things manually? Why does this even have to be an ARM-specific API? Couldn't this equally well be generic across all platforms so that we get unified handling of IOMMU through the DMA API? Thierry
On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > [...] > > > 2) Say you want to use the IOMMU API in your driver, and have an iommu > > > property in your device's DT node. If by chance your IOMMU is registered > > > early, you will already have a mapping automatically created even before > > > your probe function is called. Can this be avoided? Is it even safe? > > > > Currently, I think you have to either teardown the ops manually or return > > an error from of_xlate. Thierry was also looking at this sort of thing, > > so it might be worth talking to him. > > I already explained in earlier threads why I think this is a bad idea. > It's completely unnatural for any driver to manually tear down something > that it didn't want set up in the first place. It also means that you > have to carefully audit any users of these IOMMU APIs to make sure that > they do tear down. That doesn't sound like a good incremental approach, > as evidenced by the breakage that Alex and Heiko have encountered. Well, perhaps we hide that behind a get_iommu API or something. We *do* need this manual teardown step to support things like VFIO, so it makes sense to reuse it for other users too imo. > The solution for me has been to completely side-step the issue and not > register the IOMMU with the new mechanism at all. That is, there's no > .of_xlate() implementation, which means that the ARM DMA API glue won't > try to be smart and use the IOMMU in ways it's not meant to be used. > This has several advantages, such as that I can also use the regular > driver model for suspend/resume of the IOMMU, and I get to enjoy the > benefits of devres in the IOMMU driver. Probe ordering is still a tiny > issue, but we can easily solve that using explicit initcall ordering > (which really isn't any worse than IOMMU_OF_DECLARE()). That's a pity. I'd much rather extend what we currently have to satisfy your use-case. Ho-hum. Will
On Thu, Jan 15, 2015 at 08:30:06AM +0000, Thierry Reding wrote: > On Wed, Jan 14, 2015 at 07:17:50PM +0000, Will Deacon wrote: > > On Wed, Jan 14, 2015 at 01:51:36PM +0000, Heiko Stübner wrote: > > > As Alexandre suspected the new Rockchip drm code seems to be affected by > > > this. I hadn't played with the drm code before last weekend and was then > > > stumbling over different iommu related issues. As I hadn't to much contact > > > with iommus till now I didn't get very far. > > > > > > But with Alexandre's bandaid patch of adding > > > set_dma_ops(dev, &iommu_ops); > > > to arm_iommu_attach_device both problems go away. > > > > > > > > > So to elaborate on the two failure cases: > > > > Aha, I see what you mean now -- the issue is that attaching to an IOMMU > > domain no longer swizzles the DMA ops. Furthermore, we also need to take > > into account the coherency of the device, which we didn't do before (and > > assumedly "worked" because all of the users happened to be non-coherent). > > > > Maybe we just need to add something like arm_iommu_swizzle_dma_ops, so > > that direct users of the arm_iommu_* API can manage things manually? > > Why does this even have to be an ARM-specific API? Couldn't this equally > well be generic across all platforms so that we get unified handling of > IOMMU through the DMA API? It's already an ARM-specific API. Having something generic would be great, but I was thinking more about a short-term fix for the current issue rather than implementing something brand new. Will
On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > [...] > > > >>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > >>> property in your device's DT node. If by chance your IOMMU is > >>> registered early, you will already have a mapping automatically > >>> created even before your probe function is called. Can this be > >>> avoided? Is it even safe? > >> > >> Currently, I think you have to either teardown the ops manually or > >> return an error from of_xlate. Thierry was also looking at this sort of > >> thing, so it might be worth talking to him. > > > > I already explained in earlier threads why I think this is a bad idea. > > It's completely unnatural for any driver to manually tear down something > > that it didn't want set up in the first place. It also means that you > > have to carefully audit any users of these IOMMU APIs to make sure that > > they do tear down. That doesn't sound like a good incremental approach, > > as evidenced by the breakage that Alex and Heiko have encountered. > > Well, perhaps we hide that behind a get_iommu API or something. We *do* > need this manual teardown step to support things like VFIO, so it makes > sense to reuse it for other users too imo. > > > The solution for me has been to completely side-step the issue and not > > register the IOMMU with the new mechanism at all. That is, there's no > > .of_xlate() implementation, which means that the ARM DMA API glue won't > > try to be smart and use the IOMMU in ways it's not meant to be used. That will break when someone will want to use the same IOMMU type for devices that use the DMA mapping API to hide the IOMMU. That might not be the case for your IOMMU today, but it's pretty fragile, we need to fix it. > > This has several advantages, such as that I can also use the regular > > driver model for suspend/resume of the IOMMU, and I get to enjoy the > > benefits of devres in the IOMMU driver. Probe ordering is still a tiny > > issue, but we can easily solve that using explicit initcall ordering > > (which really isn't any worse than IOMMU_OF_DECLARE()). > > That's a pity. I'd much rather extend what we currently have to satisfy > your use-case. Ho-hum. Assuming we want the IOMMU to be handled transparently for the majority of devices I only see two ways to fix this, The first way is to create a default DMA mapping unconditionally and let drivers that can't live with it tear it down. That's what is implemented today. The second way is to implement a mechanism to let drivers signal that they want to handle DMA mappings themselves. As the mappings need in the general case to be created before the probe function is called we can't signal this by calling a function in probe(). A new flag field for struct device_driver is a possible solution. This would however require delaying the creation of DMA mappings until right before probe time. Attaching to the IOMMU could be pushed to right before probe() as well, which would have the added benefit of making IOMMU driver implementable as real platform drivers.
On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: >> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: >>> On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: >>>> On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: >>> [...] >>> >>>>> 2) Say you want to use the IOMMU API in your driver, and have an iommu >>>>> property in your device's DT node. If by chance your IOMMU is >>>>> registered early, you will already have a mapping automatically >>>>> created even before your probe function is called. Can this be >>>>> avoided? Is it even safe? >>>> >>>> Currently, I think you have to either teardown the ops manually or >>>> return an error from of_xlate. Thierry was also looking at this sort of >>>> thing, so it might be worth talking to him. >>> >>> I already explained in earlier threads why I think this is a bad idea. >>> It's completely unnatural for any driver to manually tear down something >>> that it didn't want set up in the first place. It also means that you >>> have to carefully audit any users of these IOMMU APIs to make sure that >>> they do tear down. That doesn't sound like a good incremental approach, >>> as evidenced by the breakage that Alex and Heiko have encountered. >> >> Well, perhaps we hide that behind a get_iommu API or something. We *do* >> need this manual teardown step to support things like VFIO, so it makes >> sense to reuse it for other users too imo. >> >>> The solution for me has been to completely side-step the issue and not >>> register the IOMMU with the new mechanism at all. That is, there's no >>> .of_xlate() implementation, which means that the ARM DMA API glue won't >>> try to be smart and use the IOMMU in ways it's not meant to be used. > > That will break when someone will want to use the same IOMMU type for devices > that use the DMA mapping API to hide the IOMMU. That might not be the case for > your IOMMU today, but it's pretty fragile, we need to fix it. > >>> This has several advantages, such as that I can also use the regular >>> driver model for suspend/resume of the IOMMU, and I get to enjoy the >>> benefits of devres in the IOMMU driver. Probe ordering is still a tiny >>> issue, but we can easily solve that using explicit initcall ordering >>> (which really isn't any worse than IOMMU_OF_DECLARE()). >> >> That's a pity. I'd much rather extend what we currently have to satisfy >> your use-case. Ho-hum. > > Assuming we want the IOMMU to be handled transparently for the majority of > devices I only see two ways to fix this, > > The first way is to create a default DMA mapping unconditionally and let > drivers that can't live with it tear it down. That's what is implemented > today. I strongly support Thierry's point that drivers should not have to tear down things they don't need. The issue we are facing today is a very good illustration of why one should not have to do this. Everybody hates to receive unsollicited email with a link that says "to unsubscribe, click here". Let's not import that unpleasant culture into the kernel. I am arriving late in this discussion, but what is wrong with asking drivers to explicitly state that they want the DMA API to be backed by the IOMMU instead of forcibly making it work that way? > > The second way is to implement a mechanism to let drivers signal that they > want to handle DMA mappings themselves. As the mappings need in the general > case to be created before the probe function is called Sorry for being ignorant here, but why is that? > we can't signal this by > calling a function in probe(). A new flag field for struct device_driver is a > possible solution. This would however require delaying the creation of DMA > mappings until right before probe time. Attaching to the IOMMU could be pushed > to right before probe() as well, which would have the added benefit of making > IOMMU driver implementable as real platform drivers. Keeping the ability to write IOMMU drivers as platform drivers would be nice indeed. The problem with the opt-out flag though is that one will need to check every single driver out there to see whether it stills behave correctly if its hardware is suddently put behind a IOMMU. Doing it the other way (a flag that enables IOMMU if available) sounds safer to me. What we have right now is a mechanism that basically makes it impossible to use the DMA API on many ARM platforms if ARM_DMA_USE_IOMMU is set (and I suspect it would also make the IOMMU unusable as well, without any way to fix things). This is quite concerning. Even more concerning is that -rc5 is about to be released and we have in-tree drivers (Rockchip DRM) that are not working as they should because of this patch. Will, what is your plan to fix this? Do we have stuff that absolutely depends on this patch? If not, can we just revert it until all these issues are solved?
Hi Alexandre, On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > >> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > >>> On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > >>>> On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > >>> [...] > >>> > >>>>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > >>>>> property in your device's DT node. If by chance your IOMMU is > >>>>> registered early, you will already have a mapping automatically > >>>>> created even before your probe function is called. Can this be > >>>>> avoided? Is it even safe? > >>>> > >>>> Currently, I think you have to either teardown the ops manually or > >>>> return an error from of_xlate. Thierry was also looking at this sort of > >>>> thing, so it might be worth talking to him. > >>> > >>> I already explained in earlier threads why I think this is a bad idea. > >>> It's completely unnatural for any driver to manually tear down something > >>> that it didn't want set up in the first place. It also means that you > >>> have to carefully audit any users of these IOMMU APIs to make sure that > >>> they do tear down. That doesn't sound like a good incremental approach, > >>> as evidenced by the breakage that Alex and Heiko have encountered. > >> > >> Well, perhaps we hide that behind a get_iommu API or something. We *do* > >> need this manual teardown step to support things like VFIO, so it makes > >> sense to reuse it for other users too imo. > >> > >>> The solution for me has been to completely side-step the issue and not > >>> register the IOMMU with the new mechanism at all. That is, there's no > >>> .of_xlate() implementation, which means that the ARM DMA API glue won't > >>> try to be smart and use the IOMMU in ways it's not meant to be used. > > > > That will break when someone will want to use the same IOMMU type for > > devices that use the DMA mapping API to hide the IOMMU. That might not be > > the case for your IOMMU today, but it's pretty fragile, we need to fix > > it. > > > >>> This has several advantages, such as that I can also use the regular > >>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > >>> benefits of devres in the IOMMU driver. Probe ordering is still a tiny > >>> issue, but we can easily solve that using explicit initcall ordering > >>> (which really isn't any worse than IOMMU_OF_DECLARE()). > >> > >> That's a pity. I'd much rather extend what we currently have to satisfy > >> your use-case. Ho-hum. > > > > Assuming we want the IOMMU to be handled transparently for the majority of > > devices I only see two ways to fix this, > > > > The first way is to create a default DMA mapping unconditionally and let > > drivers that can't live with it tear it down. That's what is implemented > > today. > > I strongly support Thierry's point that drivers should not have to tear > down things they don't need. The issue we are facing today is a very > good illustration of why one should not have to do this. > > Everybody hates to receive unsollicited email with a link that says "to > unsubscribe, click here". Let's not import that unpleasant culture into > the kernel. > > I am arriving late in this discussion, but what is wrong with asking > drivers to explicitly state that they want the DMA API to be backed by > the IOMMU instead of forcibly making it work that way? The vast majority of the drivers are not IOMMU-aware. We would thus need to add a call at the beginning of the probe function of nearly every driver that can perform DMA to state that the driver doesn't need to handle any IOMMU that might be present in the system itself. I don't think that's a better solution. Explicitly tearing down mappings in drivers that want to manage IOMMUs isn't a solution I like either. A possibly better solution would be to call a function to state that the DMA mapping API shouldn't not handle IOMMUs. Something like dma_mapping_ignore_iommu(dev); at the beginning of the probe function of such drivers could do. The function would perform behind the scene all operations needed to tear down everything that shouldn't have been set up. > > The second way is to implement a mechanism to let drivers signal that they > > want to handle DMA mappings themselves. As the mappings need in the > > general case to be created before the probe function is called > > Sorry for being ignorant here, but why is that? Because a driver can call the DMA mapping API in its probe function, to allocate DMA coherent memory for instance. We need to ensure that the DMA mapping IOMMU has set up the required IOMMU ops by that time. As explained above I don't like the idea of sprinkling explicit calls to initialize IOMMU support in the vast majority of drivers, especially when they shouldn't be IOMMU-aware, so we then need to initialize everything that is needed before the first call to the DMA mapping API. > > we can't signal this by calling a function in probe(). A new flag field > > for struct device_driver is a possible solution. This would however > > require delaying the creation of DMA mappings until right before probe > > time. Attaching to the IOMMU could be pushed to right before probe() as > > well, which would have the added benefit of making IOMMU driver > > implementable as real platform drivers. > > Keeping the ability to write IOMMU drivers as platform drivers would be > nice indeed. > > The problem with the opt-out flag though is that one will need to check > every single driver out there to see whether it stills behave correctly > if its hardware is suddently put behind a IOMMU. That's actually my default assumption :-) On ARM platforms at least, for many devices, whether an IOMMU is present or not is an integration property, not a property of the device. The same way a device shouldn't care about the exact bus topology that connects it to the CPU and memory, it shouldn't care about whether an IOMMU is present on that bus, except in special cases. > Doing it the other way (a flag that enables IOMMU if available) sounds safer > to me. > > What we have right now is a mechanism that basically makes it impossible > to use the DMA API on many ARM platforms if ARM_DMA_USE_IOMMU is set > (and I suspect it would also make the IOMMU unusable as well, without > any way to fix things). This is quite concerning. > > Even more concerning is that -rc5 is about to be released and we have > in-tree drivers (Rockchip DRM) that are not working as they should > because of this patch. Will, what is your plan to fix this? Do we have > stuff that absolutely depends on this patch? If not, can we just revert > it until all these issues are solved?
On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > >> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > >>> On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > >>>> On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > >>> [...] > > >>> > > >>>>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > > >>>>> property in your device's DT node. If by chance your IOMMU is > > >>>>> registered early, you will already have a mapping automatically > > >>>>> created even before your probe function is called. Can this be > > >>>>> avoided? Is it even safe? > > >>>> > > >>>> Currently, I think you have to either teardown the ops manually or > > >>>> return an error from of_xlate. Thierry was also looking at this sort of > > >>>> thing, so it might be worth talking to him. > > >>> > > >>> I already explained in earlier threads why I think this is a bad idea. > > >>> It's completely unnatural for any driver to manually tear down something > > >>> that it didn't want set up in the first place. It also means that you > > >>> have to carefully audit any users of these IOMMU APIs to make sure that > > >>> they do tear down. That doesn't sound like a good incremental approach, > > >>> as evidenced by the breakage that Alex and Heiko have encountered. > > >> > > >> Well, perhaps we hide that behind a get_iommu API or something. We *do* > > >> need this manual teardown step to support things like VFIO, so it makes > > >> sense to reuse it for other users too imo. > > >> > > >>> The solution for me has been to completely side-step the issue and not > > >>> register the IOMMU with the new mechanism at all. That is, there's no > > >>> .of_xlate() implementation, which means that the ARM DMA API glue won't > > >>> try to be smart and use the IOMMU in ways it's not meant to be used. > > > > > > That will break when someone will want to use the same IOMMU type for > > > devices that use the DMA mapping API to hide the IOMMU. That might not be > > > the case for your IOMMU today, but it's pretty fragile, we need to fix > > > it. > > > > > >>> This has several advantages, such as that I can also use the regular > > >>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > > >>> benefits of devres in the IOMMU driver. Probe ordering is still a tiny > > >>> issue, but we can easily solve that using explicit initcall ordering > > >>> (which really isn't any worse than IOMMU_OF_DECLARE()). > > >> > > >> That's a pity. I'd much rather extend what we currently have to satisfy > > >> your use-case. Ho-hum. > > > > > > Assuming we want the IOMMU to be handled transparently for the majority of > > > devices I only see two ways to fix this, > > > > > > The first way is to create a default DMA mapping unconditionally and let > > > drivers that can't live with it tear it down. That's what is implemented > > > today. > > > > I strongly support Thierry's point that drivers should not have to tear > > down things they don't need. The issue we are facing today is a very > > good illustration of why one should not have to do this. > > > > Everybody hates to receive unsollicited email with a link that says "to > > unsubscribe, click here". Let's not import that unpleasant culture into > > the kernel. > > > > I am arriving late in this discussion, but what is wrong with asking > > drivers to explicitly state that they want the DMA API to be backed by > > the IOMMU instead of forcibly making it work that way? > > The vast majority of the drivers are not IOMMU-aware. We would thus need to > add a call at the beginning of the probe function of nearly every driver that > can perform DMA to state that the driver doesn't need to handle any IOMMU that > might be present in the system itself. I don't think that's a better solution. > > Explicitly tearing down mappings in drivers that want to manage IOMMUs isn't a > solution I like either. A possibly better solution would be to call a function > to state that the DMA mapping API shouldn't not handle IOMMUs. Something like > > dma_mapping_ignore_iommu(dev); > > at the beginning of the probe function of such drivers could do. The function > would perform behind the scene all operations needed to tear down everything > that shouldn't have been set up. An alternative would be to add a flag to platform_driver, like we have for "prevent_deferred_probe" which is something like "prevent_dma_configure". For the moment, that would actually teardown the DMA configuration in platform_drv_probe, but if things are reordering in future then we can avoid setting up the ops altogether without an API change. Will
Hi Will, On Monday 19 January 2015 11:12:02 Will Deacon wrote: > On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > >> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > >>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > >>>> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > >>>>> On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > >>>>>> On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > >>>>> [...] > >>>>> > >>>>>>> 2) Say you want to use the IOMMU API in your driver, and have an > >>>>>>> iommu property in your device's DT node. If by chance your IOMMU > >>>>>>> is registered early, you will already have a mapping automatically > >>>>>>> created even before your probe function is called. Can this be > >>>>>>> avoided? Is it even safe? > >>>>>> > >>>>>> Currently, I think you have to either teardown the ops manually or > >>>>>> return an error from of_xlate. Thierry was also looking at this > >>>>>> sort of thing, so it might be worth talking to him. > >>>>> > >>>>> I already explained in earlier threads why I think this is a bad > >>>>> idea. It's completely unnatural for any driver to manually tear down > >>>>> something that it didn't want set up in the first place. It also > >>>>> means that you have to carefully audit any users of these IOMMU APIs > >>>>> to make sure that they do tear down. That doesn't sound like a good > >>>>> incremental approach, as evidenced by the breakage that Alex and > >>>>> Heiko have encountered. > >>>> > >>>> Well, perhaps we hide that behind a get_iommu API or something. We > >>>> *do* need this manual teardown step to support things like VFIO, so > >>>> it makes sense to reuse it for other users too imo. > >>>> > >>>>> The solution for me has been to completely side-step the issue and > >>>>> not register the IOMMU with the new mechanism at all. That is, > >>>>> there's no .of_xlate() implementation, which means that the ARM DMA > >>>>> API glue won't try to be smart and use the IOMMU in ways it's not > >>>>> meant to be used. > >>> > >>> That will break when someone will want to use the same IOMMU type for > >>> devices that use the DMA mapping API to hide the IOMMU. That might not > >>> be the case for your IOMMU today, but it's pretty fragile, we need to > >>> fix it. > >>> > >>>>> This has several advantages, such as that I can also use the regular > >>>>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > >>>>> benefits of devres in the IOMMU driver. Probe ordering is still a > >>>>> tiny issue, but we can easily solve that using explicit initcall > >>>>> ordering (which really isn't any worse than IOMMU_OF_DECLARE()). > >>>> > >>>> That's a pity. I'd much rather extend what we currently have to > >>>> satisfy your use-case. Ho-hum. > >>> > >>> Assuming we want the IOMMU to be handled transparently for the > >>> majority of devices I only see two ways to fix this, > >>> > >>> The first way is to create a default DMA mapping unconditionally and > >>> let drivers that can't live with it tear it down. That's what is > >>> implemented today. > >> > >> I strongly support Thierry's point that drivers should not have to tear > >> down things they don't need. The issue we are facing today is a very > >> good illustration of why one should not have to do this. > >> > >> Everybody hates to receive unsollicited email with a link that says "to > >> unsubscribe, click here". Let's not import that unpleasant culture into > >> the kernel. > >> > >> I am arriving late in this discussion, but what is wrong with asking > >> drivers to explicitly state that they want the DMA API to be backed by > >> the IOMMU instead of forcibly making it work that way? > > > > The vast majority of the drivers are not IOMMU-aware. We would thus need > > to add a call at the beginning of the probe function of nearly every > > driver that can perform DMA to state that the driver doesn't need to > > handle any IOMMU that might be present in the system itself. I don't think > > that's a better solution. > > > > Explicitly tearing down mappings in drivers that want to manage IOMMUs > > isn't a solution I like either. A possibly better solution would be to > > call a function to state that the DMA mapping API shouldn't not handle > > IOMMUs. Something like > > > > dma_mapping_ignore_iommu(dev); > > > > at the beginning of the probe function of such drivers could do. The > > function would perform behind the scene all operations needed to tear > > down everything that shouldn't have been set up. > > An alternative would be to add a flag to platform_driver, like we have for > "prevent_deferred_probe" which is something like "prevent_dma_configure". That's a solution I have proposed (albeit as a struct device_driver field, but that's a small detail), so I'm fine with it :-) > For the moment, that would actually teardown the DMA configuration in > platform_drv_probe, but if things are reordering in future then we can avoid > setting up the ops altogether without an API change.
On Mon, Jan 19, 2015 at 01:34:24PM +0200, Laurent Pinchart wrote: > Hi Will, > > On Monday 19 January 2015 11:12:02 Will Deacon wrote: > > On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > >> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > >>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > >>>> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > >>>>> On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > >>>>>> On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > >>>>> [...] > > >>>>> > > >>>>>>> 2) Say you want to use the IOMMU API in your driver, and have an > > >>>>>>> iommu property in your device's DT node. If by chance your IOMMU > > >>>>>>> is registered early, you will already have a mapping automatically > > >>>>>>> created even before your probe function is called. Can this be > > >>>>>>> avoided? Is it even safe? > > >>>>>> > > >>>>>> Currently, I think you have to either teardown the ops manually or > > >>>>>> return an error from of_xlate. Thierry was also looking at this > > >>>>>> sort of thing, so it might be worth talking to him. > > >>>>> > > >>>>> I already explained in earlier threads why I think this is a bad > > >>>>> idea. It's completely unnatural for any driver to manually tear down > > >>>>> something that it didn't want set up in the first place. It also > > >>>>> means that you have to carefully audit any users of these IOMMU APIs > > >>>>> to make sure that they do tear down. That doesn't sound like a good > > >>>>> incremental approach, as evidenced by the breakage that Alex and > > >>>>> Heiko have encountered. > > >>>> > > >>>> Well, perhaps we hide that behind a get_iommu API or something. We > > >>>> *do* need this manual teardown step to support things like VFIO, so > > >>>> it makes sense to reuse it for other users too imo. > > >>>> > > >>>>> The solution for me has been to completely side-step the issue and > > >>>>> not register the IOMMU with the new mechanism at all. That is, > > >>>>> there's no .of_xlate() implementation, which means that the ARM DMA > > >>>>> API glue won't try to be smart and use the IOMMU in ways it's not > > >>>>> meant to be used. > > >>> > > >>> That will break when someone will want to use the same IOMMU type for > > >>> devices that use the DMA mapping API to hide the IOMMU. That might not > > >>> be the case for your IOMMU today, but it's pretty fragile, we need to > > >>> fix it. > > >>> > > >>>>> This has several advantages, such as that I can also use the regular > > >>>>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > > >>>>> benefits of devres in the IOMMU driver. Probe ordering is still a > > >>>>> tiny issue, but we can easily solve that using explicit initcall > > >>>>> ordering (which really isn't any worse than IOMMU_OF_DECLARE()). > > >>>> > > >>>> That's a pity. I'd much rather extend what we currently have to > > >>>> satisfy your use-case. Ho-hum. > > >>> > > >>> Assuming we want the IOMMU to be handled transparently for the > > >>> majority of devices I only see two ways to fix this, > > >>> > > >>> The first way is to create a default DMA mapping unconditionally and > > >>> let drivers that can't live with it tear it down. That's what is > > >>> implemented today. > > >> > > >> I strongly support Thierry's point that drivers should not have to tear > > >> down things they don't need. The issue we are facing today is a very > > >> good illustration of why one should not have to do this. > > >> > > >> Everybody hates to receive unsollicited email with a link that says "to > > >> unsubscribe, click here". Let's not import that unpleasant culture into > > >> the kernel. > > >> > > >> I am arriving late in this discussion, but what is wrong with asking > > >> drivers to explicitly state that they want the DMA API to be backed by > > >> the IOMMU instead of forcibly making it work that way? > > > > > > The vast majority of the drivers are not IOMMU-aware. We would thus need > > > to add a call at the beginning of the probe function of nearly every > > > driver that can perform DMA to state that the driver doesn't need to > > > handle any IOMMU that might be present in the system itself. I don't think > > > that's a better solution. > > > > > > Explicitly tearing down mappings in drivers that want to manage IOMMUs > > > isn't a solution I like either. A possibly better solution would be to > > > call a function to state that the DMA mapping API shouldn't not handle > > > IOMMUs. Something like > > > > > > dma_mapping_ignore_iommu(dev); > > > > > > at the beginning of the probe function of such drivers could do. The > > > function would perform behind the scene all operations needed to tear > > > down everything that shouldn't have been set up. > > > > An alternative would be to add a flag to platform_driver, like we have for > > "prevent_deferred_probe" which is something like "prevent_dma_configure". > > That's a solution I have proposed (albeit as a struct device_driver field, but > that's a small detail), so I'm fine with it :-) I think Marek had proposed something similar initially as well. I don't see an immediate downside to that solution. It's still somewhat ugly in that a lot of stuff is set up before it's known to actually be used at all, but it seems like there's some consensus that this can be improved later on, so I have no objections to such a patch. Of course that doesn't solve the current breakage for the Rockchip DRM and OMAP ISP drivers. There's an additional issue with this automatic type of setting up mapping: on Tegra quite a few devices can use the IOMMU. Among those are SD/MMC controllers, HDA, SATA and things like the AHB bus. Creating a separate domain (== address space) for each of those devices is wasteful and there's currently no way to make them use a single virtual address space. Each of them really has no use for a full 4 GiB of address space. My earliest attempt to implement that was as a policy decision within the IOMMU driver, but that wasn't very well received. I'm now wondering whether the same could be done using this new type of flag. Perhaps it can be assumed that every driver that doesn't want its IOMMU master hooked up to the DMA API can (and will) manually manage the virtual address space. Conversely every driver that wants to use the DMA API to abstract away the existence of an IOMMU could be considered part of the group where a separate domain isn't necessary. Thierry
On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > [...] > > > > > >>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > > >>> property in your device's DT node. If by chance your IOMMU is > > >>> registered early, you will already have a mapping automatically > > >>> created even before your probe function is called. Can this be > > >>> avoided? Is it even safe? > > >> > > >> Currently, I think you have to either teardown the ops manually or > > >> return an error from of_xlate. Thierry was also looking at this sort of > > >> thing, so it might be worth talking to him. > > > > > > I already explained in earlier threads why I think this is a bad idea. > > > It's completely unnatural for any driver to manually tear down something > > > that it didn't want set up in the first place. It also means that you > > > have to carefully audit any users of these IOMMU APIs to make sure that > > > they do tear down. That doesn't sound like a good incremental approach, > > > as evidenced by the breakage that Alex and Heiko have encountered. > > > > Well, perhaps we hide that behind a get_iommu API or something. We *do* > > need this manual teardown step to support things like VFIO, so it makes > > sense to reuse it for other users too imo. > > > > > The solution for me has been to completely side-step the issue and not > > > register the IOMMU with the new mechanism at all. That is, there's no > > > .of_xlate() implementation, which means that the ARM DMA API glue won't > > > try to be smart and use the IOMMU in ways it's not meant to be used. > > That will break when someone will want to use the same IOMMU type for devices > that use the DMA mapping API to hide the IOMMU. That might not be the case for > your IOMMU today, but it's pretty fragile, we need to fix it. No, there's absolutely no issue here. It simply means that you can't do this on Tegra. So far I'm not sure I even see an advantage in using the IOMMU for devices that don't care about it anyway. Consider the example of the SD/MMC or HDA. They typically allocate fairly small buffers, the order of a single page typically. They can simply use memory handed out by the CMA. So as long as we don't add a .of_xlate() implementation or instantiate via the IOMMU_OF_DECLARE() mechanism we simply don't support IOMMU-over- DMA on Tegra. Thierry
On Sun, Jan 18, 2015 at 01:18:51PM +0200, Laurent Pinchart wrote: > Hi Alexandre, > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: [...] > > > The second way is to implement a mechanism to let drivers signal that they > > > want to handle DMA mappings themselves. As the mappings need in the > > > general case to be created before the probe function is called > > > > Sorry for being ignorant here, but why is that? > > Because a driver can call the DMA mapping API in its probe function, to > allocate DMA coherent memory for instance. We need to ensure that the DMA > mapping IOMMU has set up the required IOMMU ops by that time. As explained > above I don't like the idea of sprinkling explicit calls to initialize IOMMU > support in the vast majority of drivers, especially when they shouldn't be > IOMMU-aware, so we then need to initialize everything that is needed before > the first call to the DMA mapping API. The original patch that Hiroshi posted based on my comments was to have the driver core call iommu_attach(), which would then set up everything needed right before calling into the driver's ->probe(). This works quite nicely because by definition the driver can't allocate any DMA before ->probe(). And, like you said, this allows deferred probe to be used. To me it's so obviously the right solution that I remain flabbergasted with how much resistance it's received (or how much it's being ignored). Thierry
On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: [...] > The second way is to implement a mechanism to let drivers signal that they > want to handle DMA mappings themselves. As the mappings need in the general > case to be created before the probe function is called we can't signal this by > calling a function in probe(). A new flag field for struct device_driver is a > possible solution. This would however require delaying the creation of DMA > mappings until right before probe time. Attaching to the IOMMU could be pushed > to right before probe() as well, which would have the added benefit of making > IOMMU driver implementable as real platform drivers. Right. This is a pretty important point, too. One of the things that we've been working on is suspend/resume. Now if you don't have a struct device you can't easily implement suspend/resume. You'd have to play tricks like using syscore_ops, which then leads to potentially problems with suspend/resume ordering. It also means we have to keep around global variables for driver-private data because there's no struct device to attach it to. By properly encoding the dependencies via deferred probe we get the proper ordering and we can use the regular driver model with all the goodies that we've come up with over the years. Thierry
On Mon, Jan 19, 2015 at 12:43:06PM +0000, Thierry Reding wrote: > On Sun, Jan 18, 2015 at 01:18:51PM +0200, Laurent Pinchart wrote: > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > [...] > > > > The second way is to implement a mechanism to let drivers signal that they > > > > want to handle DMA mappings themselves. As the mappings need in the > > > > general case to be created before the probe function is called > > > > > > Sorry for being ignorant here, but why is that? > > > > Because a driver can call the DMA mapping API in its probe function, to > > allocate DMA coherent memory for instance. We need to ensure that the DMA > > mapping IOMMU has set up the required IOMMU ops by that time. As explained > > above I don't like the idea of sprinkling explicit calls to initialize IOMMU > > support in the vast majority of drivers, especially when they shouldn't be > > IOMMU-aware, so we then need to initialize everything that is needed before > > the first call to the DMA mapping API. > > The original patch that Hiroshi posted based on my comments was to have > the driver core call iommu_attach(), which would then set up everything > needed right before calling into the driver's ->probe(). This works > quite nicely because by definition the driver can't allocate any DMA > before ->probe(). And, like you said, this allows deferred probe to be > used. > > To me it's so obviously the right solution that I remain flabbergasted > with how much resistance it's received (or how much it's being ignored). Have you considered reposting the patches based on what we currently have (which has the advantage of identifying a specific IOMMU instance)? Will
On Mon, Jan 19, 2015 at 12:50:52PM +0000, Will Deacon wrote: > On Mon, Jan 19, 2015 at 12:43:06PM +0000, Thierry Reding wrote: > > On Sun, Jan 18, 2015 at 01:18:51PM +0200, Laurent Pinchart wrote: > > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > [...] > > > > > The second way is to implement a mechanism to let drivers signal that they > > > > > want to handle DMA mappings themselves. As the mappings need in the > > > > > general case to be created before the probe function is called > > > > > > > > Sorry for being ignorant here, but why is that? > > > > > > Because a driver can call the DMA mapping API in its probe function, to > > > allocate DMA coherent memory for instance. We need to ensure that the DMA > > > mapping IOMMU has set up the required IOMMU ops by that time. As explained > > > above I don't like the idea of sprinkling explicit calls to initialize IOMMU > > > support in the vast majority of drivers, especially when they shouldn't be > > > IOMMU-aware, so we then need to initialize everything that is needed before > > > the first call to the DMA mapping API. > > > > The original patch that Hiroshi posted based on my comments was to have > > the driver core call iommu_attach(), which would then set up everything > > needed right before calling into the driver's ->probe(). This works > > quite nicely because by definition the driver can't allocate any DMA > > before ->probe(). And, like you said, this allows deferred probe to be > > used. > > > > To me it's so obviously the right solution that I remain flabbergasted > > with how much resistance it's received (or how much it's being ignored). > > Have you considered reposting the patches based on what we currently have > (which has the advantage of identifying a specific IOMMU instance)? No, I hadn't. Initially my patches included a solution for identifying individual IOMMU instances, too. There was a small registry with a list of struct iommus. That was supposed to get used to store per-instance data (by being embedded in a driver-specific structure). I'd need to look in more detail how that could be done with the infrastructure that your patchset creates. I'm somewhat burried below other tasks right now so don't expect to have any time to look into this anytime before -rc6 or -rc7 at the earliest. Thierry
On Monday 19 January 2015 13:36:24 Thierry Reding wrote: > On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > > On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > > > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > > [...] > > > > > > > >>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > > > >>> property in your device's DT node. If by chance your IOMMU is > > > >>> registered early, you will already have a mapping automatically > > > >>> created even before your probe function is called. Can this be > > > >>> avoided? Is it even safe? > > > >> > > > >> Currently, I think you have to either teardown the ops manually or > > > >> return an error from of_xlate. Thierry was also looking at this sort of > > > >> thing, so it might be worth talking to him. > > > > > > > > I already explained in earlier threads why I think this is a bad idea. > > > > It's completely unnatural for any driver to manually tear down something > > > > that it didn't want set up in the first place. It also means that you > > > > have to carefully audit any users of these IOMMU APIs to make sure that > > > > they do tear down. That doesn't sound like a good incremental approach, > > > > as evidenced by the breakage that Alex and Heiko have encountered. > > > > > > Well, perhaps we hide that behind a get_iommu API or something. We *do* > > > need this manual teardown step to support things like VFIO, so it makes > > > sense to reuse it for other users too imo. > > > > > > > The solution for me has been to completely side-step the issue and not > > > > register the IOMMU with the new mechanism at all. That is, there's no > > > > .of_xlate() implementation, which means that the ARM DMA API glue won't > > > > try to be smart and use the IOMMU in ways it's not meant to be used. > > > > That will break when someone will want to use the same IOMMU type for devices > > that use the DMA mapping API to hide the IOMMU. That might not be the case for > > your IOMMU today, but it's pretty fragile, we need to fix it. > > No, there's absolutely no issue here. It simply means that you can't do > this on Tegra. So far I'm not sure I even see an advantage in using the > IOMMU for devices that don't care about it anyway. Consider the example > of the SD/MMC or HDA. They typically allocate fairly small buffers, the > order of a single page typically. They can simply use memory handed out > by the CMA. > > So as long as we don't add a .of_xlate() implementation or instantiate > via the IOMMU_OF_DECLARE() mechanism we simply don't support IOMMU-over- > DMA on Tegra. It breaks as soon as you have a system with memory above the 4GB boundary, which is the whole point of iommus for most users. CMA does not work for streaming mappings, only for the coherent API. Arnd
On Sunday 18 January 2015 13:18:51 Laurent Pinchart wrote: > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > >> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > >>> This has several advantages, such as that I can also use the regular > > >>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > > >>> benefits of devres in the IOMMU driver. Probe ordering is still a tiny > > >>> issue, but we can easily solve that using explicit initcall ordering > > >>> (which really isn't any worse than IOMMU_OF_DECLARE()). > > >> > > >> That's a pity. I'd much rather extend what we currently have to satisfy > > >> your use-case. Ho-hum. > > > > > > Assuming we want the IOMMU to be handled transparently for the majority of > > > devices I only see two ways to fix this, > > > > > > The first way is to create a default DMA mapping unconditionally and let > > > drivers that can't live with it tear it down. That's what is implemented > > > today. > > > > I strongly support Thierry's point that drivers should not have to tear > > down things they don't need. The issue we are facing today is a very > > good illustration of why one should not have to do this. > > > > Everybody hates to receive unsollicited email with a link that says "to > > unsubscribe, click here". Let's not import that unpleasant culture into > > the kernel. > > > > I am arriving late in this discussion, but what is wrong with asking > > drivers to explicitly state that they want the DMA API to be backed by > > the IOMMU instead of forcibly making it work that way? > > The vast majority of the drivers are not IOMMU-aware. We would thus need to > add a call at the beginning of the probe function of nearly every driver that > can perform DMA to state that the driver doesn't need to handle any IOMMU that > might be present in the system itself. I don't think that's a better solution. Right, abstracting the presence of an IOMMU (along with things like cache management) is the whole point for having the dma-mapping API. The iommu driver should instead be able to make the decision on whether the device uses the iommu for DMA or not. In some cases, it's not an option because the iommu is mandatory for all DMA and there is no working passthrough mode. In other cases, it depends on the dma mask: as long as the device's dma_mask covers all of RAM, we can avoid using the IOMMU and get better performance (and also avoid setting up tables that may need to be removed again), but when the dma mask is too small, we have to use the iommu or fall back to swiotlb (which is currently not implemeted on arm32). > > > we can't signal this by calling a function in probe(). A new flag field > > > for struct device_driver is a possible solution. This would however > > > require delaying the creation of DMA mappings until right before probe > > > time. Attaching to the IOMMU could be pushed to right before probe() as > > > well, which would have the added benefit of making IOMMU driver > > > implementable as real platform drivers. > > > > Keeping the ability to write IOMMU drivers as platform drivers would be > > nice indeed. > > > > The problem with the opt-out flag though is that one will need to check > > every single driver out there to see whether it stills behave correctly > > if its hardware is suddently put behind a IOMMU. > > That's actually my default assumption :-) On ARM platforms at least, for many > devices, whether an IOMMU is present or not is an integration property, not a > property of the device. The same way a device shouldn't care about the exact > bus topology that connects it to the CPU and memory, it shouldn't care about > whether an IOMMU is present on that bus, except in special cases. Agreed. This must work by default, or basically all arm64 machines are broken. At the moment, arm64 does not support IOMMUs properly and uses uses swiotlb instead, but it's a huge performance bottleneck. On arm32, very few systems need an IOMMU at the moment, but it's getting more common. > > Doing it the other way (a flag that enables IOMMU if available) sounds safer > > to me. > > > > What we have right now is a mechanism that basically makes it impossible > > to use the DMA API on many ARM platforms if ARM_DMA_USE_IOMMU is set > > (and I suspect it would also make the IOMMU unusable as well, without > > any way to fix things). This is quite concerning. > > > > Even more concerning is that -rc5 is about to be released and we have > > in-tree drivers (Rockchip DRM) that are not working as they should > > because of this patch. Will, what is your plan to fix this? Do we have > > stuff that absolutely depends on this patch? If not, can we just revert > > it until all these issues are solved? keystone, shmobile, mvebu and highbank all have PCI buses that are unable to access all of RAM, with different kinds of hacks to work around that. Arnd
On Mon, Jan 19, 2015 at 04:52:41PM +0100, Arnd Bergmann wrote: > On Monday 19 January 2015 13:36:24 Thierry Reding wrote: > > On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > > > On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > > > > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > > > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > > > [...] > > > > > > > > > >>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > > > > >>> property in your device's DT node. If by chance your IOMMU is > > > > >>> registered early, you will already have a mapping automatically > > > > >>> created even before your probe function is called. Can this be > > > > >>> avoided? Is it even safe? > > > > >> > > > > >> Currently, I think you have to either teardown the ops manually or > > > > >> return an error from of_xlate. Thierry was also looking at this sort of > > > > >> thing, so it might be worth talking to him. > > > > > > > > > > I already explained in earlier threads why I think this is a bad idea. > > > > > It's completely unnatural for any driver to manually tear down something > > > > > that it didn't want set up in the first place. It also means that you > > > > > have to carefully audit any users of these IOMMU APIs to make sure that > > > > > they do tear down. That doesn't sound like a good incremental approach, > > > > > as evidenced by the breakage that Alex and Heiko have encountered. > > > > > > > > Well, perhaps we hide that behind a get_iommu API or something. We *do* > > > > need this manual teardown step to support things like VFIO, so it makes > > > > sense to reuse it for other users too imo. > > > > > > > > > The solution for me has been to completely side-step the issue and not > > > > > register the IOMMU with the new mechanism at all. That is, there's no > > > > > .of_xlate() implementation, which means that the ARM DMA API glue won't > > > > > try to be smart and use the IOMMU in ways it's not meant to be used. > > > > > > That will break when someone will want to use the same IOMMU type for devices > > > that use the DMA mapping API to hide the IOMMU. That might not be the case for > > > your IOMMU today, but it's pretty fragile, we need to fix it. > > > > No, there's absolutely no issue here. It simply means that you can't do > > this on Tegra. So far I'm not sure I even see an advantage in using the > > IOMMU for devices that don't care about it anyway. Consider the example > > of the SD/MMC or HDA. They typically allocate fairly small buffers, the > > order of a single page typically. They can simply use memory handed out > > by the CMA. > > > > So as long as we don't add a .of_xlate() implementation or instantiate > > via the IOMMU_OF_DECLARE() mechanism we simply don't support IOMMU-over- > > DMA on Tegra. > > It breaks as soon as you have a system with memory above the 4GB boundary, > which is the whole point of iommus for most users. Why does it break? The IOMMU API simply gets a list of pages and gets the physical addresses from those pages when it maps them to the IO virtual addresses. How is .of_xlate() or of_iommu_configure() related? > CMA does not work for streaming mappings, only for the coherent API. Why not? And if it doesn't I'm not sure we currently care on Tegra since we've gotten away with using CMA just fine so far. Thierry
On Monday 19 January 2015 17:21:14 Thierry Reding wrote: > On Mon, Jan 19, 2015 at 04:52:41PM +0100, Arnd Bergmann wrote: > > On Monday 19 January 2015 13:36:24 Thierry Reding wrote: > > > On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > > > > On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > > > > On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > > > > > > On Wed, Jan 14, 2015 at 10:46:10AM +0000, Will Deacon wrote: > > > > > > > On Wed, Jan 14, 2015 at 09:00:24AM +0000, Alexandre Courbot wrote: > > > > > > [...] > > > > > > > > > > > >>> 2) Say you want to use the IOMMU API in your driver, and have an iommu > > > > > >>> property in your device's DT node. If by chance your IOMMU is > > > > > >>> registered early, you will already have a mapping automatically > > > > > >>> created even before your probe function is called. Can this be > > > > > >>> avoided? Is it even safe? > > > > > >> > > > > > >> Currently, I think you have to either teardown the ops manually or > > > > > >> return an error from of_xlate. Thierry was also looking at this sort of > > > > > >> thing, so it might be worth talking to him. > > > > > > > > > > > > I already explained in earlier threads why I think this is a bad idea. > > > > > > It's completely unnatural for any driver to manually tear down something > > > > > > that it didn't want set up in the first place. It also means that you > > > > > > have to carefully audit any users of these IOMMU APIs to make sure that > > > > > > they do tear down. That doesn't sound like a good incremental approach, > > > > > > as evidenced by the breakage that Alex and Heiko have encountered. > > > > > > > > > > Well, perhaps we hide that behind a get_iommu API or something. We *do* > > > > > need this manual teardown step to support things like VFIO, so it makes > > > > > sense to reuse it for other users too imo. > > > > > > > > > > > The solution for me has been to completely side-step the issue and not > > > > > > register the IOMMU with the new mechanism at all. That is, there's no > > > > > > .of_xlate() implementation, which means that the ARM DMA API glue won't > > > > > > try to be smart and use the IOMMU in ways it's not meant to be used. > > > > > > > > That will break when someone will want to use the same IOMMU type for devices > > > > that use the DMA mapping API to hide the IOMMU. That might not be the case for > > > > your IOMMU today, but it's pretty fragile, we need to fix it. > > > > > > No, there's absolutely no issue here. It simply means that you can't do > > > this on Tegra. So far I'm not sure I even see an advantage in using the > > > IOMMU for devices that don't care about it anyway. Consider the example > > > of the SD/MMC or HDA. They typically allocate fairly small buffers, the > > > order of a single page typically. They can simply use memory handed out > > > by the CMA. > > > > > > So as long as we don't add a .of_xlate() implementation or instantiate > > > via the IOMMU_OF_DECLARE() mechanism we simply don't support IOMMU-over- > > > DMA on Tegra. > > > > It breaks as soon as you have a system with memory above the 4GB boundary, > > which is the whole point of iommus for most users. > > Why does it break? The IOMMU API simply gets a list of pages and gets > the physical addresses from those pages when it maps them to the IO > virtual addresses. How is .of_xlate() or of_iommu_configure() related? Because almost no drivers use the IOMMU API, and they expect the dma-mapping API to do the right thing independent of the hardware configuration. > > CMA does not work for streaming mappings, only for the coherent API. > > Why not? And if it doesn't I'm not sure we currently care on Tegra since > we've gotten away with using CMA just fine so far. CMA is used as the backing for the dma_alloc_* API. It's impossible to use it for streaming data that originates from arbitrary page cache pages, like a send() operation on a user space network socket, or writing to disk. Arnd
Hi Thierry, On Monday 19 January 2015 17:21:14 Thierry Reding wrote: > On Mon, Jan 19, 2015 at 04:52:41PM +0100, Arnd Bergmann wrote: > > On Monday 19 January 2015 13:36:24 Thierry Reding wrote: > >> On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > >>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > >>>> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: [...] > >>>>> The solution for me has been to completely side-step the issue and > >>>>> not register the IOMMU with the new mechanism at all. That is, > >>>>> there's no .of_xlate() implementation, which means that the ARM > >>>>> DMA API glue won't try to be smart and use the IOMMU in ways it's > >>>>> not meant to be used. > >>> > >>> That will break when someone will want to use the same IOMMU type for > >>> devices that use the DMA mapping API to hide the IOMMU. That might > >>> not be the case for your IOMMU today, but it's pretty fragile, we > >>> need to fix it. > >> > >> No, there's absolutely no issue here. It simply means that you can't do > >> this on Tegra. So far I'm not sure I even see an advantage in using the > >> IOMMU for devices that don't care about it anyway. Consider the example > >> of the SD/MMC or HDA. They typically allocate fairly small buffers, the > >> order of a single page typically. They can simply use memory handed out > >> by the CMA. > >> > >> So as long as we don't add a .of_xlate() implementation or instantiate > >> via the IOMMU_OF_DECLARE() mechanism we simply don't support IOMMU-over- > >> DMA on Tegra. > > > > It breaks as soon as you have a system with memory above the 4GB boundary, > > which is the whole point of iommus for most users. > > Why does it break? The IOMMU API simply gets a list of pages and gets > the physical addresses from those pages when it maps them to the IO > virtual addresses. How is .of_xlate() or of_iommu_configure() related? Arnd's point was that using the IOMMU for devices that deal with small buffers is mandatory in case those devices only support a 32-bit DMA address space, and the system has memory over 4GB that you want to use for DMA purpose. That's the case of Renesas' platforms with LPAE support for instance. Another use case for IOMMUs is device isolation, which is, depending on your use cases, something you might want for devices that are not IOMMU-aware. It should also be noted that even DRM drivers don't need to be explicit IOMMU users. The rcar-du driver allocates large frame buffers through the DMA mapping API, without caring about whether an IOMMU is present or not. > > CMA does not work for streaming mappings, only for the coherent API. > > Why not? And if it doesn't I'm not sure we currently care on Tegra since > we've gotten away with using CMA just fine so far.
Hi Thierry, On Monday 19 January 2015 14:36:38 Thierry Reding wrote: > On Mon, Jan 19, 2015 at 12:50:52PM +0000, Will Deacon wrote: > > On Mon, Jan 19, 2015 at 12:43:06PM +0000, Thierry Reding wrote: > >> On Sun, Jan 18, 2015 at 01:18:51PM +0200, Laurent Pinchart wrote: > >> > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > >> > > On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > >> [...] > >> > >>>>> The second way is to implement a mechanism to let drivers signal > >>>>> that they want to handle DMA mappings themselves. As the mappings > >>>>> need in the general case to be created before the probe function > >>>>> is called > >>>> > >>>> Sorry for being ignorant here, but why is that? > >>> > >>> Because a driver can call the DMA mapping API in its probe function, > >>> to allocate DMA coherent memory for instance. We need to ensure that > >>> the DMA mapping IOMMU has set up the required IOMMU ops by that time. > >>> As explained above I don't like the idea of sprinkling explicit calls > >>> to initialize IOMMU support in the vast majority of drivers, > >>> especially when they shouldn't be IOMMU-aware, so we then need to > >>> initialize everything that is needed before the first call to the DMA > >>> mapping API. > >> > >> The original patch that Hiroshi posted based on my comments was to have > >> the driver core call iommu_attach(), which would then set up everything > >> needed right before calling into the driver's ->probe(). This works > >> quite nicely because by definition the driver can't allocate any DMA > >> before ->probe(). And, like you said, this allows deferred probe to be > >> used. > >> > >> To me it's so obviously the right solution that I remain flabbergasted > >> with how much resistance it's received (or how much it's being ignored). > > > > Have you considered reposting the patches based on what we currently have > > (which has the advantage of identifying a specific IOMMU instance)? > > No, I hadn't. Initially my patches included a solution for identifying > individual IOMMU instances, too. There was a small registry with a list > of struct iommus. That was supposed to get used to store per-instance > data (by being embedded in a driver-specific structure). I'd need to > look in more detail how that could be done with the infrastructure that > your patchset creates. I'm somewhat burried below other tasks right now > so don't expect to have any time to look into this anytime before -rc6 > or -rc7 at the earliest. I'm pretty buried below other tasks as well, but if you find time to work on this I'll try to prioritize testing and review.
Hi Thierry, On Monday 19 January 2015 13:49:36 Thierry Reding wrote: > On Fri, Jan 16, 2015 at 01:18:21AM +0200, Laurent Pinchart wrote: > [...] > > > The second way is to implement a mechanism to let drivers signal that they > > want to handle DMA mappings themselves. As the mappings need in the > > general case to be created before the probe function is called we can't > > signal this by calling a function in probe(). A new flag field for struct > > device_driver is a possible solution. This would however require delaying > > the creation of DMA mappings until right before probe time. Attaching to > > the IOMMU could be pushed to right before probe() as well, which would > > have the added benefit of making IOMMU driver implementable as real > > platform drivers. > > Right. This is a pretty important point, too. One of the things that > we've been working on is suspend/resume. Now if you don't have a struct > device you can't easily implement suspend/resume. You'd have to play > tricks like using syscore_ops, which then leads to potentially problems > with suspend/resume ordering. It also means we have to keep around > global variables for driver-private data because there's no struct > device to attach it to. > > By properly encoding the dependencies via deferred probe we get the > proper ordering and we can use the regular driver model with all the > goodies that we've come up with over the years. Marek's patch set to port the Exynos IOMMU driver on Will's patches uses of_platform_device_create() in the init handler registered with IOMMU_OF_DECLARE() to create a platform device for the IOMMU. I've initially considered this as a dubious hack, but on the other hand it avoids modifying the whole IOMMU driver to get rid of struct device, making it easier to move to a deferred probe approach later if needed (and possible and desired).
Hi Thierry and Will, On Monday 19 January 2015 13:31:00 Thierry Reding wrote: > On Mon, Jan 19, 2015 at 01:34:24PM +0200, Laurent Pinchart wrote: > > On Monday 19 January 2015 11:12:02 Will Deacon wrote: > >> On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > >>> On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > >>>> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > >>>>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: [snip] > >>>> I am arriving late in this discussion, but what is wrong with asking > >>>> drivers to explicitly state that they want the DMA API to be backed > >>>> by the IOMMU instead of forcibly making it work that way? > >>> > >>> The vast majority of the drivers are not IOMMU-aware. We would thus > >>> need to add a call at the beginning of the probe function of nearly > >>> every driver that can perform DMA to state that the driver doesn't need > >>> to handle any IOMMU that might be present in the system itself. I don't > >>> think that's a better solution. > >>> > >>> Explicitly tearing down mappings in drivers that want to manage IOMMUs > >>> isn't a solution I like either. A possibly better solution would be to > >>> call a function to state that the DMA mapping API shouldn't not handle > >>> IOMMUs. Something like > >>> > >>> dma_mapping_ignore_iommu(dev); > >>> > >>> at the beginning of the probe function of such drivers could do. The > >>> function would perform behind the scene all operations needed to tear > >>> down everything that shouldn't have been set up. > >> > >> An alternative would be to add a flag to platform_driver, like we have > >> for "prevent_deferred_probe" which is something like > >> "prevent_dma_configure". > > > > That's a solution I have proposed (albeit as a struct device_driver field, > > but that's a small detail), so I'm fine with it :-) > > I think Marek had proposed something similar initially as well. I don't > see an immediate downside to that solution. It's still somewhat ugly in > that a lot of stuff is set up before it's known to actually be used at > all, but it seems like there's some consensus that this can be improved > later on, so I have no objections to such a patch. > > Of course that doesn't solve the current breakage for the Rockchip DRM > and OMAP ISP drivers. And, as I came to realize after a long bisect yesternight, the Renesas IPMMU driver :-/ Basically any platform that relied on arm_iommu_attach_device() to set the IOMMU DMA ops is now broken. > There's an additional issue with this automatic type of setting up > mapping: on Tegra quite a few devices can use the IOMMU. Among those are > SD/MMC controllers, HDA, SATA and things like the AHB bus. Creating a > separate domain (== address space) for each of those devices is wasteful > and there's currently no way to make them use a single virtual address > space. Each of them really has no use for a full 4 GiB of address space. > My earliest attempt to implement that was as a policy decision within > the IOMMU driver, but that wasn't very well received. A similar discussion started at http://www.spinics.net/lists/arm-kernel/msg385805.html but didn't go very far. In the end it's really a policy decision. The question is how to express that policy. As policies in DT are frowned upon we have several subsystems currently hacking around similar issues by implementing heuristics or defaults that nobody really complained about so far. We might be able to do the same for IOMMUs as a first step. > I'm now wondering whether the same could be done using this new type of > flag. Perhaps it can be assumed that every driver that doesn't want its > IOMMU master hooked up to the DMA API can (and will) manually manage the > virtual address space. Conversely every driver that wants to use the DMA > API to abstract away the existence of an IOMMU could be considered part > of the group where a separate domain isn't necessary.
On Tue, Jan 20, 2015 at 03:14:01PM +0000, Laurent Pinchart wrote: > On Monday 19 January 2015 13:31:00 Thierry Reding wrote: > > On Mon, Jan 19, 2015 at 01:34:24PM +0200, Laurent Pinchart wrote: > > > On Monday 19 January 2015 11:12:02 Will Deacon wrote: > > >> On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > > >>> On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > > >>>> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > > >>>>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > [snip] > > > >>>> I am arriving late in this discussion, but what is wrong with asking > > >>>> drivers to explicitly state that they want the DMA API to be backed > > >>>> by the IOMMU instead of forcibly making it work that way? > > >>> > > >>> The vast majority of the drivers are not IOMMU-aware. We would thus > > >>> need to add a call at the beginning of the probe function of nearly > > >>> every driver that can perform DMA to state that the driver doesn't need > > >>> to handle any IOMMU that might be present in the system itself. I don't > > >>> think that's a better solution. > > >>> > > >>> Explicitly tearing down mappings in drivers that want to manage IOMMUs > > >>> isn't a solution I like either. A possibly better solution would be to > > >>> call a function to state that the DMA mapping API shouldn't not handle > > >>> IOMMUs. Something like > > >>> > > >>> dma_mapping_ignore_iommu(dev); > > >>> > > >>> at the beginning of the probe function of such drivers could do. The > > >>> function would perform behind the scene all operations needed to tear > > >>> down everything that shouldn't have been set up. > > >> > > >> An alternative would be to add a flag to platform_driver, like we have > > >> for "prevent_deferred_probe" which is something like > > >> "prevent_dma_configure". > > > > > > That's a solution I have proposed (albeit as a struct device_driver field, > > > but that's a small detail), so I'm fine with it :-) > > > > I think Marek had proposed something similar initially as well. I don't > > see an immediate downside to that solution. It's still somewhat ugly in > > that a lot of stuff is set up before it's known to actually be used at > > all, but it seems like there's some consensus that this can be improved > > later on, so I have no objections to such a patch. > > > > Of course that doesn't solve the current breakage for the Rockchip DRM > > and OMAP ISP drivers. > > And, as I came to realize after a long bisect yesternight, the Renesas IPMMU > driver :-/ Basically any platform that relied on arm_iommu_attach_device() to > set the IOMMU DMA ops is now broken. We could restore the set_dma_ops call in arm_iommu_attach_device as a temporary hack (along with a big fat comment), since arch_setup_dma_ops actually sets the ops correct *after* the call to arm_get_iommu_dma_map_ops... It doesn't provide any motivation for people to consider moving over to the new framework, but it fixes the current issues affecting mainline. Will
On Tue, Jan 20, 2015 at 03:19:10PM +0000, Will Deacon wrote: > We could restore the set_dma_ops call in arm_iommu_attach_device as a > temporary hack (along with a big fat comment), since arch_setup_dma_ops > actually sets the ops correct *after* the call to > arm_get_iommu_dma_map_ops... s/arm_get_iommu_dma_map_ops/arm_setup_iommu_dma_ops/ Will
Hi Will, On Tuesday 20 January 2015 15:19:11 Will Deacon wrote: > On Tue, Jan 20, 2015 at 03:14:01PM +0000, Laurent Pinchart wrote: > > On Monday 19 January 2015 13:31:00 Thierry Reding wrote: > >> On Mon, Jan 19, 2015 at 01:34:24PM +0200, Laurent Pinchart wrote: > >>> On Monday 19 January 2015 11:12:02 Will Deacon wrote: > >>>> On Sun, Jan 18, 2015 at 11:18:51AM +0000, Laurent Pinchart wrote: > >>>>> On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > >>>>>> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > >>>>>>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > > > > [snip] > > > >>>>>> I am arriving late in this discussion, but what is wrong with > >>>>>> asking drivers to explicitly state that they want the DMA API to be > >>>>>> backed by the IOMMU instead of forcibly making it work that way? > >>>>> > >>>>> The vast majority of the drivers are not IOMMU-aware. We would thus > >>>>> need to add a call at the beginning of the probe function of nearly > >>>>> every driver that can perform DMA to state that the driver doesn't > >>>>> need to handle any IOMMU that might be present in the system itself. > >>>>> I don't think that's a better solution. > >>>>> > >>>>> Explicitly tearing down mappings in drivers that want to manage > >>>>> IOMMUs isn't a solution I like either. A possibly better solution > >>>>> would be to call a function to state that the DMA mapping API > >>>>> shouldn't not handle IOMMUs. Something like > >>>>> > >>>>> dma_mapping_ignore_iommu(dev); > >>>>> > >>>>> at the beginning of the probe function of such drivers could do. The > >>>>> function would perform behind the scene all operations needed to > >>>>> tear down everything that shouldn't have been set up. > >>>> > >>>> An alternative would be to add a flag to platform_driver, like we > >>>> have for "prevent_deferred_probe" which is something like > >>>> "prevent_dma_configure". > >>> > >>> That's a solution I have proposed (albeit as a struct device_driver > >>> field, but that's a small detail), so I'm fine with it :-) > >> > >> I think Marek had proposed something similar initially as well. I don't > >> see an immediate downside to that solution. It's still somewhat ugly in > >> that a lot of stuff is set up before it's known to actually be used at > >> all, but it seems like there's some consensus that this can be improved > >> later on, so I have no objections to such a patch. > >> > >> Of course that doesn't solve the current breakage for the Rockchip DRM > >> and OMAP ISP drivers. > > > > And, as I came to realize after a long bisect yesternight, the Renesas > > IPMMU driver :-/ Basically any platform that relied on > > arm_iommu_attach_device() to set the IOMMU DMA ops is now broken. > > We could restore the set_dma_ops call in arm_iommu_attach_device as a > temporary hack (along with a big fat comment), since arch_setup_dma_ops > actually sets the ops correct *after* the call to > arm_get_iommu_dma_map_ops... > > It doesn't provide any motivation for people to consider moving over to the > new framework, but it fixes the current issues affecting mainline. I'm all for incentives, but I think avoiding a major v3.19 regression would be good, too :-) I wanted to test your LPAE page table allocator yesterday with the Renesas IPMMU driver, and ended up spending the whole night bisecting the regression instead.
Hi Arnd, On Monday 19 January 2015 17:13:14 Arnd Bergmann wrote: > On Sunday 18 January 2015 13:18:51 Laurent Pinchart wrote: > > On Sunday 18 January 2015 15:54:34 Alexandre Courbot wrote: > >> On 01/16/2015 08:18 AM, Laurent Pinchart wrote: > >>> On Thursday 15 January 2015 11:12:17 Will Deacon wrote: > >>>> On Thu, Jan 15, 2015 at 08:28:44AM +0000, Thierry Reding wrote: > >>>>> This has several advantages, such as that I can also use the regular > >>>>> driver model for suspend/resume of the IOMMU, and I get to enjoy the > >>>>> benefits of devres in the IOMMU driver. Probe ordering is still a > >>>>> tiny issue, but we can easily solve that using explicit initcall > >>>>> ordering (which really isn't any worse than IOMMU_OF_DECLARE()). > >>>> > >>>> That's a pity. I'd much rather extend what we currently have to > >>>> satisfy your use-case. Ho-hum. > >>> > >>> Assuming we want the IOMMU to be handled transparently for the > >>> majority of devices I only see two ways to fix this, > >>> > >>> The first way is to create a default DMA mapping unconditionally and > >>> let drivers that can't live with it tear it down. That's what is > >>> implemented today. > >> > >> I strongly support Thierry's point that drivers should not have to tear > >> down things they don't need. The issue we are facing today is a very > >> good illustration of why one should not have to do this. > >> > >> Everybody hates to receive unsollicited email with a link that says "to > >> unsubscribe, click here". Let's not import that unpleasant culture into > >> the kernel. > >> > >> I am arriving late in this discussion, but what is wrong with asking > >> drivers to explicitly state that they want the DMA API to be backed by > >> the IOMMU instead of forcibly making it work that way? > > > > The vast majority of the drivers are not IOMMU-aware. We would thus need > > to add a call at the beginning of the probe function of nearly every > > driver that can perform DMA to state that the driver doesn't need to > > handle any IOMMU that might be present in the system itself. I don't think > > that's a better solution. > > Right, abstracting the presence of an IOMMU (along with things like > cache management) is the whole point for having the dma-mapping API. > > The iommu driver should instead be able to make the decision on whether > the device uses the iommu for DMA or not. In some cases, it's not an > option because the iommu is mandatory for all DMA and there is no working > passthrough mode. In other cases, it depends on the dma mask: as long as > the device's dma_mask covers all of RAM, we can avoid using the IOMMU > and get better performance (and also avoid setting up tables that may > need to be removed again), but when the dma mask is too small, we have > to use the iommu or fall back to swiotlb (which is currently not implemeted > on arm32). > > >>> we can't signal this by calling a function in probe(). A new flag > >>> field for struct device_driver is a possible solution. This would > >>> however require delaying the creation of DMA mappings until right > >>> before probe time. Attaching to the IOMMU could be pushed to right > >>> before probe() as well, which would have the added benefit of making > >>> IOMMU driver implementable as real platform drivers. > >> > >> Keeping the ability to write IOMMU drivers as platform drivers would be > >> nice indeed. > >> > >> The problem with the opt-out flag though is that one will need to check > >> every single driver out there to see whether it stills behave correctly > >> if its hardware is suddently put behind a IOMMU. > > > > That's actually my default assumption :-) On ARM platforms at least, for > > many devices, whether an IOMMU is present or not is an integration > > property, not a property of the device. The same way a device shouldn't > > care about the exact bus topology that connects it to the CPU and memory, > > it shouldn't care about whether an IOMMU is present on that bus, except > > in special cases. > > Agreed. This must work by default, or basically all arm64 machines are > broken. At the moment, arm64 does not support IOMMUs properly and uses > uses swiotlb instead, but it's a huge performance bottleneck. On arm32, > very few systems need an IOMMU at the moment, but it's getting more common. > > >> Doing it the other way (a flag that enables IOMMU if available) sounds > >> safer to me. > >> > >> What we have right now is a mechanism that basically makes it impossible > >> to use the DMA API on many ARM platforms if ARM_DMA_USE_IOMMU is set > >> (and I suspect it would also make the IOMMU unusable as well, without > >> any way to fix things). This is quite concerning. > >> > >> Even more concerning is that -rc5 is about to be released and we have > >> in-tree drivers (Rockchip DRM) that are not working as they should > >> because of this patch. Will, what is your plan to fix this? Do we have > >> stuff that absolutely depends on this patch? If not, can we just revert > >> it until all these issues are solved? > > keystone, shmobile, mvebu and highbank all have PCI buses that are unable > to access all of RAM, with different kinds of hacks to work around that. But Will's series doesn't fix that, does it ?
--- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -1961,6 +1961,7 @@ int arm_iommu_attach_device(struct device *dev, kref_get(&mapping->kref); dev->archdata.mapping = mapping; + set_dma_ops(dev, &iommu_ops); pr_debug("Attached IOMMU controller to %s device.\n", dev_name(dev));