Web lists-archives.com

Re: [RFC 1/2] misc: Add vboxguest driver for Virtual Box Guest integration




On Fri, Aug 11, 2017 at 3:23 PM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
> This commit adds a driver for the Virtual Box Guest PCI device used in
> Virtual Box virtual machines. Enabling this driver will add support for
> Virtual Box Guest integration features such as copy-and-paste, seamless
> mode and OpenGL pass-through.
>
> This driver also offers vboxguest IPC functionality which is needed
> for the vboxfs driver which offers folder sharing support.
>
> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx>

Since you are still working on coding style cleanups, I'll comment mainly on
the ioctl interface for now. Overall it already looks much better than
I expected
from previous experience with the vbox source.

Most of the issues I would normally comment on are already moot based
on the assumption that we won't be able to make substantial changes to
either the user space portion or the hypervisor side.

> +/**
> + * Inflate the balloon by one chunk.
> + *
> + * The caller owns the balloon mutex.
> + *
> + * @returns VBox status code
> + * @param   gdev        The Guest extension device.
> + * @param   chunk_idx  Index of the chunk.
> + */
> +static int vbg_balloon_inflate(PVBOXGUESTDEVEXT gdev, u32 chunk_idx)
> +{
> +       VMMDevChangeMemBalloon *req = gdev->mem_balloon.change_req;
> +       struct page **pages;
> +       int i, rc;
> +
> +       pages = kmalloc(sizeof(*pages) * VMMDEV_MEMORY_BALLOON_CHUNK_PAGES,
> +                       GFP_KERNEL | __GFP_NOWARN);
> +       if (!pages)
> +               return VERR_NO_MEMORY;
> +
> +       req->header.size = sizeof(*req);
> +       req->inflate = true;
> +       req->pages = VMMDEV_MEMORY_BALLOON_CHUNK_PAGES;

The balloon section seems to be rather simplistic with its ioctl interface.
Ideally this should use CONFIG_BALLOON_COMPACTION and an
oom notifier like the virtio_balloon driver does. Yes, I realize that only
one of the six (or more) balloon drivers we have in the kernel does it ;-)

> +/**
> + * Callback for heartbeat timer.
> + */
> +static void vbg_heartbeat_timer(unsigned long data)
> +{
> +       PVBOXGUESTDEVEXT gdev = (PVBOXGUESTDEVEXT)data;
> +
> +       vbg_req_perform(gdev, gdev->guest_heartbeat_req);
> +       mod_timer(&gdev->heartbeat_timer,
> +                 msecs_to_jiffies(gdev->heartbeat_interval_ms));
> +}

This looks like a watchdog and should use the drivers/watchdog
subsystem interfaces.

> +/**
> + * vbg_query_host_version try get the host feature mask and version information
> + * (vbg_host_version).
> + *
> + * @returns 0 or negative errno value (ignored).
> + * @param   gdev         The Guest extension device.
> + */
> +static int vbg_query_host_version(PVBOXGUESTDEVEXT gdev)
> +{
> +       VMMDevReqHostVersion *req;
> +       int rc, ret;
> +
> +       req = vbg_req_alloc(sizeof(*req), VMMDevReq_GetHostVersion);
> +       if (!req)
> +               return -ENOMEM;

While I understand you want to keep the ioctl interface, the version information
looks like something that we should also have in sysfs in an appropriate
location.

> diff --git a/drivers/misc/vboxguest/vboxguest_linux.c b/drivers/misc/vboxguest/vboxguest_linux.c
> new file mode 100644
> index 000000000000..8468c7139b98
> --- /dev/null
> +++ b/drivers/misc/vboxguest/vboxguest_linux.c

This looks like a fairly short file, and could be merged into the core file.

> +/** The file_operations structures. */
> +static const struct file_operations vbg_misc_device_fops = {
> +       .owner                  = THIS_MODULE,
> +       .open                   = vbg_misc_device_open,
> +       .release                = vbg_misc_device_close,
> +       .unlocked_ioctl         = vgdrvLinuxIOCtl,
> +};
> +static const struct file_operations vbg_misc_device_user_fops = {
> +       .owner                  = THIS_MODULE,
> +       .open                   = vbg_misc_device_user_open,
> +       .release                = vbg_misc_device_close,
> +       .unlocked_ioctl         = vgdrvLinuxIOCtl,
> +};

These lack a .compat_ioctl callback.

> +/**
> + * Device I/O Control entry point.
> + *
> + * @returns -ENOMEM or -EFAULT for errors inside the ioctl callback; 0
> + * on success, or a positive VBox status code on vbox guest-device errors.
> + *
> + * @param   pInode      Associated inode pointer.
> + * @param   pFilp       Associated file pointer.
> + * @param   uCmd        The function specified to ioctl().
> + * @param   ulArg       The argument specified to ioctl().
> + */
> +static long vgdrvLinuxIOCtl(struct file *pFilp, unsigned int uCmd,
> +                           unsigned long ulArg)
> +{
> +       PVBOXGUESTSESSION session = (PVBOXGUESTSESSION) pFilp->private_data;
> +       u32 cbData = _IOC_SIZE(uCmd);
> +       void *pvBufFree;
> +       void *pvBuf;
> +       int rc, ret = 0;
> +       u64 au64Buf[32 / sizeof(u64)];
> +
> +       /*
> +        * For small amounts of data being passed we use a stack based buffer
> +        * except for VMMREQUESTs where the data must not be on the stack.
> +        */
> +       if (cbData <= sizeof(au64Buf) &&
> +           VBOXGUEST_IOCTL_STRIP_SIZE(uCmd) !=
> +           VBOXGUEST_IOCTL_STRIP_SIZE(VBOXGUEST_IOCTL_VMMREQUEST(0))) {
> +               pvBufFree = NULL;
> +               pvBuf = &au64Buf[0];
> +       } else {
> +               /* __GFP_DMA32 for VBOXGUEST_IOCTL_VMMREQUEST */
> +               pvBufFree = pvBuf = kmalloc(cbData, GFP_KERNEL | __GFP_DMA32);
> +               if (!pvBuf)
> +                       return -ENOMEM;
> +       }
> +       if (copy_from_user(pvBuf, (void *)ulArg, cbData) == 0) {

There should be a check for a maximum argument size. I'd also change
the commands
to not always do both read and write, but only whichever applies. This function
would then do the copy_from_user/copy_to_user conditionally.

> +static int hgcm_call_preprocess_linaddr(const HGCMFunctionParameter *src_parm,
> +                                       bool is_user, void **bounce_buf_ret,
> +                                       size_t *pcb_extra)
> +{
> +       void *buf, *bounce_buf;
> +       bool copy_in;
> +       u32 len;
> +       int ret;
> +
> +       buf = (void *)src_parm->u.Pointer.u.linearAddr;
> +       len = src_parm->u.Pointer.size;
> +       copy_in = src_parm->type != VMMDevHGCMParmType_LinAddr_Out;
> +
> +       if (!is_user) {
> +               if (WARN_ON(len > VBGLR0_MAX_HGCM_KERNEL_PARM))
> +                       return VERR_OUT_OF_RANGE;
> +
> +               hgcm_call_inc_pcb_extra(buf, len, pcb_extra);
> +               return VINF_SUCCESS;
> +       }
> +
> +       if (len > VBGLR0_MAX_HGCM_USER_PARM)
> +               return VERR_OUT_OF_RANGE;
> +
> +       if (len <= PAGE_SIZE * 2)
> +               bounce_buf = kmalloc(len, GFP_KERNEL);
> +       else
> +               bounce_buf = vmalloc(len);
> +

we have kvmalloc() for this now


> +#ifdef CONFIG_X86_64
> +int vbg_hgcm_call32(VBOXGUESTDEVEXT *gdev, VBoxGuestHGCMCallInfo * pCallInfo,
> +                   u32 cbCallInfo, u32 timeout_ms, bool is_user)
> +{
> +       VBoxGuestHGCMCallInfo *pCallInfo64 = NULL;
> +       HGCMFunctionParameter *pParm64 = NULL;
> +       HGCMFunctionParameter32 *pParm32 = NULL;
> +       u32 cParms = pCallInfo->cParms;
> +       u32 iParm;
> +       int rc = VINF_SUCCESS;
> +
> +       /*
> +        * The simple approach, allocate a temporary request and convert the parameters.
> +        */
> +       pCallInfo64 = kzalloc(sizeof(*pCallInfo64) +
> +                             cParms * sizeof(HGCMFunctionParameter),
> +                             GFP_KERNEL);
> +       if (!pCallInfo64)
> +               return VERR_NO_MEMORY;
> +
> +       *pCallInfo64 = *pCallInfo;
> +       pParm32 = VBOXGUEST_HGCM_CALL_PARMS32(pCallInfo);
> +       pParm64 = VBOXGUEST_HGCM_CALL_PARMS(pCallInfo64);
> +       for (iParm = 0; iParm < cParms; iParm++, pParm32++, pParm64++) {
> +               switch (pParm32->type) {
> +               case VMMDevHGCMParmType_32bit:
> +                       pParm64->type = VMMDevHGCMParmType_32bit;
> +                       pParm64->u.value32 = pParm32->u.value32;
> +                       break;
> +
> +               case VMMDevHGCMParmType_64bit:
> +                       pParm64->type = VMMDevHGCMParmType_64bit;
> +                       pParm64->u.value64 = pParm32->u.value64;
> +                       break;
> +
> +               case VMMDevHGCMParmType_LinAddr_Out:
> +               case VMMDevHGCMParmType_LinAddr:
> +               case VMMDevHGCMParmType_LinAddr_In:
> +                       pParm64->type = pParm32->type;
> +                       pParm64->u.Pointer.size = pParm32->u.Pointer.size;
> +                       pParm64->u.Pointer.u.linearAddr =
> +                           pParm32->u.Pointer.u.linearAddr;
> +                       break;

It would be good to clean this up and always use the same structure here.

> diff --git a/include/linux/vbox_err.h b/include/linux/vbox_err.h
> new file mode 100644
> index 000000000000..906ff7d2585d
> --- /dev/null
> +++ b/include/linux/vbox_err.h
> @@ -0,0 +1,6 @@
> +#ifndef __VBOX_ERR_H__
> +#define __VBOX_ERR_H__
> +
> +#include <uapi/linux/vbox_err.h>
> +
> +#endif
> diff --git a/include/linux/vbox_ostypes.h b/include/linux/vbox_ostypes.h
> new file mode 100644
> index 000000000000..ea2a391f135f
> --- /dev/null
> +++ b/include/linux/vbox_ostypes.h
> @@ -0,0 +1,6 @@
> +#ifndef __VBOX_OSTYPES_H__
> +#define __VBOX_OSTYPES_H__
> +
> +#include <uapi/linux/vbox_ostypes.h>
> +
> +#endif
> diff --git a/include/linux/vboxguest.h b/include/linux/vboxguest.h
> new file mode 100644
> index 000000000000..fca5d199a884
> --- /dev/null
> +++ b/include/linux/vboxguest.h
> @@ -0,0 +1,6 @@
> +#ifndef __VBOXGUEST_H__
> +#define __VBOXGUEST_H__
> +
> +#include <uapi/linux/vboxguest.h>
> +
> +#endif

This should not be needed, we add -I${srctree}/include/uapi/ to the include path
already.

> +/**
> + * The layout of VMMDEV RAM region that contains information for guest.
> + */
> +typedef struct VMMDevMemory {
> +       /** The size of this structure. */
> +       u32 u32Size;
> +       /** The structure version. (VMMDEV_MEMORY_VERSION) */
> +       u32 u32Version;
> +
> +       union {
> +               struct {
> +                       /** Flag telling that VMMDev has events pending. */
> +                       bool fHaveEvents;
> +               } V1_04;
> +

As this is a hardware interface, maybe use u32 instead of 'bool'.

Strictly speaking, the ring buffer structures would even have to
be __le32 and use byteorder specific read/write access, but that
could perhaps be skipped here when you add a Kconfig
dependency on !CONFIG_CPU_BIG_ENDIAN (which won't
ever be set on x86).

> diff --git a/include/uapi/linux/vbox_err.h b/include/uapi/linux/vbox_err.h
> new file mode 100644
> index 000000000000..e6e7ba835e36
> --- /dev/null
> +++ b/include/uapi/linux/vbox_err.h
> diff --git a/include/uapi/linux/vbox_ostypes.h b/include/uapi/linux/vbox_ostypes.h
> new file mode 100644
> index 000000000000..abe9a38ebfbd
> --- /dev/null
> +++ b/include/uapi/linux/vbox_ostypes.h
> @@ -0,0 +1,158 @@

Some of these headers are not really ABI between the kernel and user
space but are between the vbox host and guest, so include/uapi is maybe
not the best place for them.

Do we need all of this both in the kernel and in the header that actually
defines the kernel/user interface?

> +/**
> + * @name VBoxGuest IOCTL codes and structures.
> + *
> + * The range 0..15 is for basic driver communication.
> + * The range 16..31 is for HGCM communication.
> + * The range 32..47 is reserved for future use.
> + * The range 48..63 is for OS specific communication.
> + * The 7th bit is reserved for future hacks.
> + * The 8th bit is reserved for distinguishing between 32-bit and 64-bit
> + * processes in future 64-bit guest additions.
> + * @{
> + */
> +#if __BITS_PER_LONG == 64
> +#define VBOXGUEST_IOCTL_FLAG           128
> +#else
> +#define VBOXGUEST_IOCTL_FLAG           0
> +#endif
> +/** @} */

This seems misguided, we already know the difference between
32-bit and 64-bit tasks based on the entry point, and if the
interface is properly designed then we don't care about this
difference at all.

> +#define VBOXGUEST_IOCTL_CODE_(function, size) \
> +       _IOC(_IOC_READ|_IOC_WRITE, 'V', (function), (size))
> +#define VBOXGUEST_IOCTL_STRIP_SIZE(code) \
> +       VBOXGUEST_IOCTL_CODE_(_IOC_NR((code)), 0)
> +
> +#define VBOXGUEST_IOCTL_CODE(function, size) \
> +       VBOXGUEST_IOCTL_CODE_((function) | VBOXGUEST_IOCTL_FLAG, size)
> +/* Define 32 bit codes to support 32 bit applications in 64 bit guest driver. */
> +#define VBOXGUEST_IOCTL_CODE_32(function, size) \
> +VBOXGUEST_IOCTL_CODE_(function, size)

If the command numbers can be changed wihtout causing too many
problems, I'd just do away with these wrappers and use _IOR/_IOW/_IORW
as needed.

> +/** Input and output buffers layout of the IOCTL_VBOXGUEST_WAITEVENT */
> +typedef struct VBoxGuestWaitEventInfo {
> +       /** timeout in milliseconds */
> +       u32 u32TimeoutIn;
> +       /** events to wait for */
> +       u32 u32EventMaskIn;
> +       /** result code */
> +       u32 u32Result;
> +       /** events occurred */
> +       u32 u32EventFlagsOut;
> +} VBoxGuestWaitEventInfo;
> +VBOXGUEST_ASSERT_SIZE(VBoxGuestWaitEventInfo, 16);

'u32' is not an approprioate type for a kernel header, use '__u32'
instead.

      Arnd