Adding Hyper-V Nested Virt and Guest UEFI Support #729
willronchetti wants to merge 2 commits into OpenXT:master
| @@ -1,48 +1,8 @@ | |||
| ################################################################################ | |||
Why are you removing this header?
My mistake - this patch was giving me some trouble. Will add it back in.
|
Fails to build (xen patchqueue doesn't apply): http://openxt-builder.ainfosec.com:8010/builders/openxt/builds/1042/steps/Build/logs/stdio |
|
I have done a cursory read of the change set and will provide comments later. Before that, I would note that this adds two major features in a single PR consisting of one commit. The single PR is fine since the features are related, but I would suggest that this at least gets broken into two commits: one that adds the guest UEFI support and one that builds on it to add the nested virtualization support. |
| if ( !++port ) | ||
| nvcpu->nv_vmexit_pending = 1; | ||
| } while ( !nvcpu->nv_vmexit_pending ); | ||
| +#if 0 |
Instead of just disabling this, why not test for DEBUG? That way it is easier to turn on when troubleshooting.
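A minimal sketch of what testing for a debug macro could look like, instead of `#if 0`. The macro name `NESTED_DEBUG` and the helper `trace_vmexit_port` are hypothetical and are not taken from the patch; the existing code base may already have a debug macro that should be used instead.

```c
#include <stdio.h>

/* Hypothetical switch: build with -DNESTED_DEBUG=1 to enable the trace. */
#ifndef NESTED_DEBUG
#define NESTED_DEBUG 0
#endif

/* Returns 1 if the trace was emitted, 0 if it was compiled out. */
static int trace_vmexit_port(unsigned int port)
{
#if NESTED_DEBUG
    fprintf(stderr, "nested vmexit pending on port %#x\n", port);
    return 1;
#else
    (void)port; /* unused when tracing is disabled */
    return 0;
#endif
}
```

With this shape the block stays in the tree and is compiled again simply by flipping the macro at build time.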
| +} | ||
| + | ||
| + | ||
| +static int paging_read_l2_entry(struct vcpu *v, unsigned long l2_pa, uint64_t * entry) |
Please be consistent with spacing and honor spacing conventions of existing code you are patching.
| + return 0; | ||
| +} | ||
| + | ||
| +static unsigned int paging_get_l2_pa_from_l2_va(struct vcpu *v,unsigned long l2_va, unsigned long *l2_pa) |
| + | ||
| + uint64_t pml4e_addr; | ||
| + uint64_t pml4e; | ||
| + |
Nitpick: Is this extra white space needed?
| + | ||
| + uint64_t l2_pa; | ||
| + | ||
| +if (paging_get_l2_pa_from_l2_va(v,va,&l2_pa)) |
|
|
||
| if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) && nestedhvm_is_n2(v) ) | ||
| { | ||
| +#if 0 |
Please don't just turn off code blocks; if you are replacing the code, then replace it. If it is being left in to aid understanding, then add a comment within the block to let people know why it's still here.
| if ( (vendor_id == 0xffff) && (device_id == 0xffff) ) | ||
| continue; | ||
|
|
||
| +#if 0 |
Again, don't #if out replaced code.
| Device (ISA) | ||
| { | ||
| - Name (_ADR, 0x00010000) /* device 1, fn 0 */ | ||
| + //Name (_ADR, 0x00010000) /* device 1, fn 0 */ |
Just like with #if: if you are leaving this in for understanding, then explain why; otherwise just delete the line.
| + if (b_info->stubdomain_version == LIBXL_STUBDOMAIN_VERSION_LINUX) { | ||
| + drive = libxl__sprintf | ||
| + (gc, "file=%s%c,if=scsi,bus=0,unit=%d,format=%s,cache=writeback", | ||
| + "/dev/xvd", 'a'+disk, disk, format); |
Is there any way to avoid hard-coding the device path? Maybe make it a #define constant?
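As a rough sketch of that suggestion, the prefix can live in one named constant and the path can be built by a small helper. The names `STUBDOM_DISK_PREFIX` and `stubdom_disk_path` are hypothetical; only the literal `"/dev/xvd"` and the `'a' + disk` suffix come from the patch.

```c
#include <stdio.h>

/* Hypothetical name; the patch hard-codes the literal "/dev/xvd". */
#define STUBDOM_DISK_PREFIX "/dev/xvd"

/*
 * Builds the device path for the given disk index (0 -> "/dev/xvda").
 * Returns 0 on success, -1 if the index is out of range or buf is too small.
 */
static int stubdom_disk_path(char *buf, size_t len, int disk)
{
    int n;

    if (disk < 0 || disk > 25)
        return -1;
    n = snprintf(buf, len, "%s%c", STUBDOM_DISK_PREFIX, 'a' + disk);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping the range check next to the prefix also makes the single-letter suffix assumption explicit.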
| libxl__xs_get_dompath(gc, dm_domid)), | ||
| "%d", guest_domid); | ||
| if (guest_config->b_info.stubdomain_version == LIBXL_STUBDOMAIN_VERSION_LINUX) { | ||
| +#if 0 /* LIES */ |
If this is wrong, then just replace it.
| (AHCI_NUM_COMMAND_SLOTS << 8) | | ||
| (AHCI_SUPPORTED_SPEED_GEN1 << AHCI_SUPPORTED_SPEED) | | ||
| - HOST_CAP_NCQ | HOST_CAP_AHCI; | ||
| + /*HOST_CAP_NCQ |*/ HOST_CAP_AHCI; |
Just delete it and, if you feel it's needed, add a comment saying that it was removed and why.
| return; | ||
| } | ||
| - dma_blk_unmap(dbs); | ||
| + //dma_blk_unmap(dbs); |
| #include "migration/migration.h" | ||
| #include "kvm_i386.h" | ||
|
|
||
| +extern int xen_q35; |
I do not like blindly accessible global state: it is too easily abused and leaves you open to potential exploits. Make it local and export wrappers to set and get it that guard for correct values, etc.
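A minimal sketch of the wrapper approach described here, assuming the flag only ever needs the values 0 and 1. The accessor names `xen_q35_get` and `xen_q35_set` are hypothetical and do not exist in QEMU.

```c
#include <stdbool.h>

/* File-local state: not reachable from other translation units. */
static bool xen_q35_enabled;

/* Hypothetical accessor; the name is illustrative only. */
bool xen_q35_get(void)
{
    return xen_q35_enabled;
}

/* Accepts only 0 or 1; returns 0 on success, -1 on an invalid value. */
int xen_q35_set(int value)
{
    if (value != 0 && value != 1)
        return -1;
    xen_q35_enabled = (value == 1);
    return 0;
}
```

The trade-off is one extra function call per access in exchange for a single place where invalid values are rejected.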
Seems like it's how QEMU defines a bunch of other Xen variables too in include/hw/xen/xen.h, so I would just place this next to the other ones:
extern uint32_t xen_domid;
extern enum xen_mode xen_mode;
extern bool xen_allowed;
| + fatal_errmsg = g_strdup_printf("failed to get backing file size"); | ||
| + } else if (size == 0) { | ||
| + fatal_errmsg = g_strdup_printf("PC system firmware (pflash) " | ||
| + "cannot have zero size"); |
I am all about observing the 80-character limit, but this looks like it would have been fine on one line. Please don't split unnecessarily.
| + } | ||
| + | ||
| + | ||
| + blk = blk_by_legacy_dinfo(pflash_drv); |
| + | ||
| + | ||
| + if (size < 0) { | ||
| + fatal_errmsg = g_strdup_printf("failed to get backing file size"); |
| pc_memory_init(pcms, get_system_memory(), | ||
| rom_memory, &ram_memory); | ||
| + } else { | ||
| + xen_map_efi_var_rom(rom_memory); |
Will this logic trigger for a non-UEFI guest on Xen, and if so, is it handled gracefully?
| pc_q35_machine_options(m); | ||
| - m->alias = "q35"; | ||
| + if (xen_q35) | ||
| + m->alias = "pc"; |
| IN EFI_HANDLE Controller | ||
| ) | ||
| { | ||
| +#if 0 |
|
|
||
| #include "AcpiPlatform.h" | ||
|
|
||
| +#define FISH \ |
I get this for development, but maybe a more meaningful debug augmentation would be better.
| EFI_STATUS Status; | ||
| VOID *Interface; | ||
| EFI_EVENT PciEnumerated; | ||
| +#if 0 |
Just remove this; leave a comment if necessary.
| return Status; | ||
| } | ||
|
|
||
| +#if 0 |
|
|
||
| EFI_ACPI_2_0_ROOT_SYSTEM_DESCRIPTION_POINTER *XenAcpiRsdpStructurePtr = NULL; | ||
|
|
||
| +#define FISH \ |
| SETTINGS Settings; | ||
|
|
||
| Status = GetSettings (&Settings); | ||
| +#if 0 |
| AddReservedMemoryBaseSizeHob (0xFC000000, 0x1000000, FALSE); | ||
|
|
||
| - PcdSetBool (PcdPciDisableBusEnumeration, TRUE); | ||
| + //PcdSetBool (PcdPciDisableBusEnumeration, TRUE); |
| // 0xFEE00000 LAPIC 1 MB | ||
| // | ||
| PciSize = 0xFC000000 - PciBase; | ||
| + |
Are these added blank lines necessary? Dropping them would reduce diff churn.
| + PcdSet64 (PcdPciMmio32Base, 0x80000000); | ||
| + PcdSet64 (PcdPciMmio32Size, 0x7c000000); | ||
| + | ||
| +# if 0 |
This turns off code that you added; should it just not be added?
| XENIO_PROTOCOL *XenIo; | ||
| EFI_DEVICE_PATH_PROTOCOL *DevicePath; | ||
|
|
||
| + return EFI_UNSUPPORTED; |
Instead of just short-circuiting this function, can you not go to the invocations of this function and implement a check for whether to call it at all?
| EFI_STATUS Status; | ||
| XENBUS_PROTOCOL *XenBusIo; | ||
|
|
||
| + return EFI_UNSUPPORTED; |
| XEN_BLOCK_FRONT_DEVICE *Dev; | ||
| EFI_BLOCK_IO_MEDIA *Media; | ||
|
|
||
| + return EFI_UNSUPPORTED; |
|
I'm very pleased to see this progress on nested virtualization -- thanks for the PR. We need a detailed and clear set of test validation steps for this feature, or set of features, in this PR so that we will be able to verify that the new functionality has been preserved correctly when OpenXT uprevs to a newer version of the Xen hypervisor and toolstack software. We need to know exactly what software, and what features of that software, this introduces a requirement to remain compatible with, please. |
|
A few updates:
|
|
I have attached two documents here that may be freely distributed that describe how to go about testing out the new features. I hope these documents are helpful. I will be periodically keeping track of this PR and hope to contribute more in the future, but for now the remaining work will be picked up by someone else at AIS. There is, however, one unfortunate thing to note. In my latest round of testing I've found that enabling Windows Hyper-V causes a triple fault upon boot as both a standard and UEFI guest. Having glanced through the Xen 4.6.6 changeset, I am not convinced that the problem was caused there. It's possible that other changes have come in recently that may have had an effect. Fortunately, the surface where this could have happened is relatively small. In addition, the original document we received suggests that all of this functionality (and more) was working in April 2017, thus the time frame for such a change is relatively small. This is the only issue I've found in my (limited) testing since merging many commits from master today. |
|
You may wish to leave the "#if 0"s in place, as this functionality is likely to be merged upstream (Citrix are also interested in q35, ovmf, Credential Guard and HVCI), and it will make unpicking your PQ from the upstream changes easier. |
|
Okay, then don't just #if 0 out the code. Use a descriptive flag and possibly comments that a) make the code easily locatable and b) help reviewers understand why the code is disabled. |
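A rough sketch of such a descriptive flag, using the NCQ capability bit from the AHCI hunk as the example. The flag name `OPENXT_AHCI_ADVERTISE_NCQ`, the helper `ahci_host_caps`, and the `CAP_*` stand-ins with their values are assumptions made for illustration, not identifiers from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical flag: set to 1 to advertise NCQ again.
 * State here why the block is disabled, so reviewers and anyone
 * rebasing onto upstream can find and understand it.
 */
#ifndef OPENXT_AHCI_ADVERTISE_NCQ
#define OPENXT_AHCI_ADVERTISE_NCQ 0
#endif

/* Stand-ins for the real capability bits; values are illustrative. */
#define CAP_AHCI (UINT32_C(1) << 18)
#define CAP_NCQ  (UINT32_C(1) << 30)

/* Returns the capability word, with NCQ included only when the flag is set. */
static uint32_t ahci_host_caps(void)
{
    uint32_t caps = CAP_AHCI;

#if OPENXT_AHCI_ADVERTISE_NCQ
    caps |= CAP_NCQ;
#endif
    return caps;
}
```

A grep for the flag name then locates every disabled block, which plain `#if 0` does not allow.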
|
Now OVMF won't build. Does it need |
|
You should obtain the original documentation provided with this patch series; it explains in detail what each part of the patches does and why. How you wish to modify and maintain them in this PR is entirely up to you, but each modification that diverges significantly from upstream is something you're going to either have to merge in later or for which a PQ must be maintained, e.g. it's quite likely that Xen upstream will implement q35 and ovmf support itself soon. |
|
@jean-edouard I did not experience any issues building OVMF, but judging from that error message it looks like nasm is indeed needed to build OVMF. My build machine must've had it, which is why I did not make any note of it. The only original documentation given is the 'readme' posted above, which gives some additional background/details. |
|
Thanks @willronchetti. In the meantime, I will manually add it to the AIS builders to be able to build and test this PR. |
|
It also requires adding a device to qemu, which is not covered by this PR: |
|
I ran a build with a qemu fix here: http://openxt-builder.ainfosec.com:8010/builders/openxt/builds/1077 I really feel that this PR should be split into multiple ones. I was able to smoke-test the guest UEFI support (ovmf), and it actually works pretty well. It's also optional and not enabled by default, so I would be willing to merge it pretty quickly if it was in its own PR. The nested-virt support is a more important feature that needs more reviewing, and not having guest UEFI support mixed-in would also help. |
|
The nesting support is dependent on a work-around in this PR. Quoting readme.pdf that @willronchetti added:
See also comments inline:
Until a proper fix is available, this is a -1 from me. @jean-edouard +1 on splitting the PR to get the guest UEFI support merged independently. |
|
What are the test cases for guest UEFI support and nested virt in this PR? In today's community call, it was stated that new developers would be working on these features. Should we wait for them to submit a new PR? |
|
@eric-ch the fix is already upstream, and I believe there's a 4.8 backport. |
| - outb(0x4d0, (uint8_t)(PCI_ISA_IRQ_MASK >> 0)); | ||
| - outb(0x4d1, (uint8_t)(PCI_ISA_IRQ_MASK >> 8)); | ||
| + if ((pci_readw(PCI_ISA_DEVFN, PCI_VENDOR_ID) == 0x8086) && | ||
| + (pci_readw(PCI_ISA_DEVFN, PCI_VENDOR_ID) == 0x7000)) { |
The second read here should use PCI_DEVICE_ID, not PCI_VENDOR_ID.
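A sketch of the corrected comparison, runnable outside the firmware. The `pci_readw` below is a hypothetical stand-in that reads from a fake four-byte config header rather than real hardware, and `is_8086_7000` is an illustrative name; only the vendor and device values and the two register names come from the hunk.

```c
#include <stdint.h>

#define PCI_VENDOR_ID 0x00 /* standard config-space offsets */
#define PCI_DEVICE_ID 0x02

/* Fake config header: vendor 0x8086, device 0x7000 (little-endian). */
static const uint8_t fake_cfg[4] = { 0x86, 0x80, 0x00, 0x70 };

/* Hypothetical stand-in for the firmware's 16-bit config-space read. */
static uint16_t pci_readw(uint32_t devfn, uint32_t reg)
{
    (void)devfn;
    return (uint16_t)(fake_cfg[reg] | (fake_cfg[reg + 1] << 8));
}

/* True when the device is vendor 0x8086, device 0x7000. */
static int is_8086_7000(uint32_t devfn)
{
    return pci_readw(devfn, PCI_VENDOR_ID) == 0x8086 &&
           pci_readw(devfn, PCI_DEVICE_ID) == 0x7000;
}
```

As written in the patch, both reads hit the vendor register, so the condition can never be true because one value cannot equal both 0x8086 and 0x7000.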
|
Relevant discussion on xen-devel: https://lists.xenproject.org/archives/html/xen-devel/2018-06/msg00379.html It reaches the same conclusion about this approach that I shared during the meeting we had in the fall: "While Secure Boot can be enabled with this implementation, it is not sufficiently secure because the guest is able to write anything it wants to the emulated flash. KVM solved this problem with SMM mode but I don't like that solution either." |
|
The correct solution is to implement EFI variable reads and writes with a hypercall. There are only 3 functions: GetNextVariableName needs only 3 arguments, and Get and Set both take 5 arguments. Since the interface is not often used, and performance isn't an issue, you could easily
This would be easy to implement; the only subtlety is that the code needs to register |
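To make the shape of such an interface concrete, here is a sketch of a request block covering the three services and their argument counts as stated above. Every name in it (`efi_var_op`, `efi_var_request`, `efi_var_request_check`) is hypothetical; it does not correspond to any existing Xen hypercall and leaves out the actual transport.

```c
#include <stddef.h>

/* Hypothetical operation codes, one per UEFI variable service. */
enum efi_var_op {
    EFI_VAR_GET,        /* GetVariable */
    EFI_VAR_SET,        /* SetVariable */
    EFI_VAR_GET_NEXT,   /* GetNextVariableName */
};

/* Number of arguments each service takes, per the comment above. */
static int efi_var_op_nargs(enum efi_var_op op)
{
    switch (op) {
    case EFI_VAR_GET:
    case EFI_VAR_SET:
        return 5;
    case EFI_VAR_GET_NEXT:
        return 3;
    }
    return -1; /* unknown operation */
}

/* Hypothetical request block a guest could hand to the hypervisor. */
struct efi_var_request {
    enum efi_var_op op;
    size_t nargs;
    unsigned long args[5];
};

/* Returns 0 if the request is well-formed, -1 otherwise. */
static int efi_var_request_check(const struct efi_var_request *req)
{
    int expected;

    if (req == NULL)
        return -1;
    expected = efi_var_op_nargs(req->op);
    return (expected > 0 && req->nargs == (size_t)expected) ? 0 : -1;
}
```

Validating the argument count on the receiving side keeps a malformed guest request from being dispatched at all.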
This patch set integrates nested virtualization and guest UEFI support into OpenXT. These patches have been tested on OpenXT. The following tests have been carried out. Areas where testing can be improved are highlighted.
This is a very large patch set that will require thorough testing beyond what I've done before being merged into master.