Message ID | 20210530150628.2063957-9-ltykernel@gmail.com (mailing list archive)
---|---
State | RFC
Series | x86/Hyper-V: Add Hyper-V Isolation VM support
Context | Check | Description
---|---|---
netdev/tree_selection | success | Guessing tree name failed - patch did not apply
On Sun, May 30, 2021 at 11:06:25AM -0400, Tianyu Lan wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>
> For a Hyper-V isolation VM with AMD SEV-SNP, the bounce buffer (shared
> memory) needs to be accessed via an extra address space (e.g. addresses
> above bit 39). Hyper-V code may remap the extra address space outside of
> swiotlb. swiotlb_bounce() needs to use the remapped virtual address to
> copy data from/to the bounce buffer. Add a new interface,
> swiotlb_set_bounce_remap(), to do that.

Why can't you use the bus_dma_region ranges to remap to your preferred
address?
On 6/7/2021 2:43 PM, Christoph Hellwig wrote:
> On Sun, May 30, 2021 at 11:06:25AM -0400, Tianyu Lan wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> For a Hyper-V isolation VM with AMD SEV-SNP, the bounce buffer (shared
>> memory) needs to be accessed via an extra address space (e.g. addresses
>> above bit 39). Hyper-V code may remap the extra address space outside of
>> swiotlb. swiotlb_bounce() needs to use the remapped virtual address to
>> copy data from/to the bounce buffer. Add a new interface,
>> swiotlb_set_bounce_remap(), to do that.
>
> Why can't you use the bus_dma_region ranges to remap to your preferred
> address?

Thanks for your suggestion.

The addresses in the extra address space work as a mirror of system memory.
Memory shared with the host in an Isolation VM needs to be accessed via the
extra address space, which is above the shared GPA boundary. When the
swiotlb bounce buffer pool is initialized, only addresses below the shared
GPA boundary can be accepted by the swiotlb API, because they are treated
as system memory and managed by the memory management code. This is why the
Hyper-V swiotlb bounce buffer pool needs to be allocated in Hyper-V code,
with the associated physical addresses mapped into the extra address space.
The goal of this patch is to add a new interface that sets the start
virtual address of the bounce buffer pool, so that the swiotlb bounce
buffer copy function uses the right virtual address for the extra address
space.

bus_dma_region translates a CPU physical address to a DMA address. It can't
change the virtual address of the bounce buffer pool so that the swiotlb
code copies data with the right address. If I have missed something, please
correct me.

Thanks.
On 6/7/2021 10:56 PM, Tianyu Lan wrote:
> On 6/7/2021 2:43 PM, Christoph Hellwig wrote:
>> Why can't you use the bus_dma_region ranges to remap to your preferred
>> address?
>
> bus_dma_region translates a CPU physical address to a DMA address. It
> can't change the virtual address of the bounce buffer pool so that the
> swiotlb code copies data with the right address. If I have missed
> something, please correct me.

Hi Christoph:
Sorry to bother you. Could you have a look at my previous reply? I am
trying to figure out the right approach. Thanks.
On Mon, Jun 07, 2021 at 10:56:47PM +0800, Tianyu Lan wrote:
> The addresses in the extra address space work as a mirror of system
> memory. Memory shared with the host in an Isolation VM needs to be
> accessed via the extra address space, which is above the shared GPA
> boundary.

Why?
On 6/14/21 2:12 AM, Christoph Hellwig wrote:
> On Mon, Jun 07, 2021 at 10:56:47PM +0800, Tianyu Lan wrote:
>> The addresses in the extra address space work as a mirror of system
>> memory. Memory shared with the host in an Isolation VM needs to be
>> accessed via the extra address space, which is above the shared GPA
>> boundary.
>
> Why?

IIUC, this is using the vTOM feature of SEV-SNP. When this feature is
enabled for a VMPL level, any physical memory addresses below vTOM are
considered private/encrypted and any physical memory addresses above vTOM
are considered shared/unencrypted. With this option, you don't need a fully
enlightened guest that sets and clears page table encryption bits. You just
need the DMA buffers to be allocated in the proper range above vTOM.

See the section on "Virtual Machine Privilege Levels" in
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf.

Thanks,
Tom
On 6/14/2021 3:12 PM, Christoph Hellwig wrote:
> On Mon, Jun 07, 2021 at 10:56:47PM +0800, Tianyu Lan wrote:
>> The addresses in the extra address space work as a mirror of system
>> memory. Memory shared with the host in an Isolation VM needs to be
>> accessed via the extra address space, which is above the shared GPA
>> boundary.
>
> Why?

The shared_gpa_boundary in the AMD SEV-SNP spec is called the virtual top
of memory (vTOM). Memory addresses below vTOM are automatically treated as
private, while memory above vTOM is treated as shared. Using vTOM to
separate memory in this way avoids the need to augment the standard x86
page tables with C-bit markings, simplifying guest OS software.
On 6/14/2021 9:37 PM, Tianyu Lan wrote:
> The shared_gpa_boundary in the AMD SEV-SNP spec is called the virtual top
> of memory (vTOM). Memory addresses below vTOM are automatically treated
> as private, while memory above vTOM is treated as shared.

Here is the spec link; the vTOM description is on page 14.

https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf

Thanks.
On 2021-06-07 07:43, Christoph Hellwig wrote:
> On Sun, May 30, 2021 at 11:06:25AM -0400, Tianyu Lan wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> For a Hyper-V isolation VM with AMD SEV-SNP, the bounce buffer (shared
>> memory) needs to be accessed via an extra address space (e.g. addresses
>> above bit 39). Hyper-V code may remap the extra address space outside of
>> swiotlb. swiotlb_bounce() needs to use the remapped virtual address to
>> copy data from/to the bounce buffer. Add a new interface,
>> swiotlb_set_bounce_remap(), to do that.
>
> Why can't you use the bus_dma_region ranges to remap to your preferred
> address?

FWIW, I think a better generalisation for this would be allowing
set_memory_decrypted() to return an address rather than implicitly
operating in-place, and hiding all the various hypervisor hooks behind
that.

Robin.
On Mon, Jun 14, 2021 at 02:49:51PM +0100, Robin Murphy wrote:
> FWIW, I think a better generalisation for this would be allowing
> set_memory_decrypted() to return an address rather than implicitly
> operating in-place, and hiding all the various hypervisor hooks behind
> that.

Yes, something like that would be a good idea. As-is,
set_memory_decrypted() is a pretty horrible API anyway: it passes the
address as a void pointer and takes a size parameter, while it works in
units of pages. So I'd very much welcome a major overhaul of this API.
On 6/14/2021 11:32 PM, Christoph Hellwig wrote:
> On Mon, Jun 14, 2021 at 02:49:51PM +0100, Robin Murphy wrote:
>> FWIW, I think a better generalisation for this would be allowing
>> set_memory_decrypted() to return an address rather than implicitly
>> operating in-place, and hiding all the various hypervisor hooks behind
>> that.
>
> Yes, something like that would be a good idea. As-is,
> set_memory_decrypted() is a pretty horrible API anyway: it passes the
> address as a void pointer and takes a size parameter, while it works in
> units of pages. So I'd very much welcome a major overhaul of this API.

Hi Christoph and Robin:
Thanks for your suggestion. I will try this idea in the next version.
Besides moving the address translation into set_memory_decrypted() and
returning the address, do you want other changes to the API to make it more
reasonable (e.g. the size parameter)?

Thanks
Hi Christoph and Robin:
In the latest version I introduced a new interface,
set_memory_decrypted_map(), to hide all the hypervisor code behind it. In
the current generic code, only the swiotlb bounce buffer needs to be
decrypted and remapped at the same time, so set_memory_decrypted() is kept.
If more users of set_memory_decrypted_map() appear on other platforms, we
may replace set_memory_decrypted() step by step. Please have a look.

https://lkml.org/lkml/2021/7/7/570

Thanks.

On 6/15/2021 11:24 PM, Tianyu Lan wrote:
> On 6/14/2021 11:32 PM, Christoph Hellwig wrote:
>> Yes, something like that would be a good idea. As-is,
>> set_memory_decrypted() is a pretty horrible API anyway: it passes the
>> address as a void pointer and takes a size parameter, while it works in
>> units of pages. So I'd very much welcome a major overhaul of this API.
>
> Thanks for your suggestion. I will try this idea in the next version.
> Besides moving the address translation into set_memory_decrypted() and
> returning the address, do you want other changes to the API to make it
> more reasonable (e.g. the size parameter)?
```diff
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 216854a5e513..43f53cf52f48 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -113,8 +113,13 @@ unsigned int swiotlb_max_segment(void);
 size_t swiotlb_max_mapping_size(struct device *dev);
 bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long size);
+void swiotlb_set_bounce_remap(unsigned char *vaddr);
 #else
 #define swiotlb_force SWIOTLB_NO_FORCE
+static inline void swiotlb_set_bounce_remap(unsigned char *vaddr)
+{
+}
+
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
 	return false;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8ca7d505d61c..fbc827ab5fb4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -70,6 +70,7 @@ struct io_tlb_mem *io_tlb_default_mem;
  * not be bounced (unless SWIOTLB_FORCE is set).
  */
 static unsigned int max_segment;
+static unsigned char *swiotlb_bounce_remap_addr;
 
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
 
@@ -334,6 +335,11 @@ void __init swiotlb_exit(void)
 	io_tlb_default_mem = NULL;
 }
 
+void swiotlb_set_bounce_remap(unsigned char *vaddr)
+{
+	swiotlb_bounce_remap_addr = vaddr;
+}
+
 /*
  * Bounce: copy the swiotlb buffer from or back to the original dma location
  */
@@ -345,7 +351,13 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
 	phys_addr_t orig_addr = mem->slots[index].orig_addr;
 	size_t alloc_size = mem->slots[index].alloc_size;
 	unsigned long pfn = PFN_DOWN(orig_addr);
-	unsigned char *vaddr = phys_to_virt(tlb_addr);
+	unsigned char *vaddr;
+
+	if (swiotlb_bounce_remap_addr)
+		vaddr = swiotlb_bounce_remap_addr + tlb_addr -
+			io_tlb_default_mem->start;
+	else
+		vaddr = phys_to_virt(tlb_addr);
 
 	if (orig_addr == INVALID_PHYS_ADDR)
 		return;
```