[Spca50x-devs] (Re)design of gspca-v4l2

Hi All,

I've been thinking some more (a lot more) about the current gspca-v4l2 design, 
and I'm sad to say that I believe that after all it is not such a good design.

The current design, which queues urbs for later processing in the application's
kernel context when a read or dqbuf call is made, has one big advantage: the
urb_complete handler, which gets called in bottom half interrupt context, is
extremely short.

It however also has several big drawbacks:

1) It abuses the urb data structure as data buffers; this is not how/what the
isoc handling of the kernel usb subsystem was designed for. The whole idea of
that design is that you do the minimum amount of processing necessary in
complete_urb and then resubmit the urb. This is the least of my concerns, however.

2) The current design basically ignores the v4l2 api design completely. It
doesn't matter how many frames an application requests in its request buffers
ioctl, only as much data as fits in the urbs will be buffered. Take for
example an application which does not live stream, but rather records to disk
and doesn't want to miss a single frame, so it requests (and gets) 16 v4l
buffers. If there is some significant latency in calling dqbuf, caused
by whatever, it will still lose data, as the whole isoc flow will have stopped
because it has run out of urbs. This is: bad, bad, bad! Repeating myself: the
current design basically ignores the buffer management as intended and
designed in the current v4l2 api, so it's a rather bad (non-faithful)
implementation of that api. This really should be fixed.


So what do I suggest instead: use 2 queues for v4l_buf structs (or structs
"derived" from / containing v4l_buf structs): one queue which holds unused
frames, which have been queued with the qbuf ioctl to get filled with data, and
one queue which holds captured frames, which can then be dequeued
using dqbuf. This way, if an application has 16 buffers queued for filling with
frames, it can actually get delayed for whatever reason for 14 frames, then
start dequeuing (and requeuing once processed) and not miss a single frame, as
intended by the api design.

Yes, this is the way it's currently done in my usbvideo2 "core", and no, I didn't
think this up myself; it's stolen from the drivers by Luca Risolia, and it works!
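
To make the two-queue idea concrete, here is a minimal userspace sketch in
plain C. It is not the usbvideo2 or Luca Risolia code: every name in it
(frame_buf, buf_queue, cam_qbuf, cam_frame_done, cam_dqbuf) is hypothetical,
the real v4l2 structs are not used, and locking is left out. A real driver
would need a spinlock around the queues, since the capture path runs in
interrupt context.

```c
#include <stddef.h>

/* Hypothetical names throughout; these are not the gspca or V4L2 structs. */
#define NBUF_MAX 32

struct frame_buf {
    int index;          /* buffer index as seen by the application */
    size_t bytesused;   /* bytes of frame data currently stored */
};

/* Fixed-capacity FIFO of buffer pointers (no locking in this sketch). */
struct buf_queue {
    struct frame_buf *slot[NBUF_MAX];
    unsigned head, count;
};

struct cam {
    struct buf_queue unused;   /* queued by qbuf, waiting to be filled */
    struct buf_queue out;      /* filled frames, waiting for dqbuf */
};

static int queue_put(struct buf_queue *q, struct frame_buf *b)
{
    if (q->count == NBUF_MAX)
        return -1;
    q->slot[(q->head + q->count) % NBUF_MAX] = b;
    q->count++;
    return 0;
}

static struct frame_buf *queue_get(struct buf_queue *q)
{
    struct frame_buf *b;

    if (q->count == 0)
        return NULL;
    b = q->slot[q->head];
    q->head = (q->head + 1) % NBUF_MAX;
    q->count--;
    return b;
}

/* qbuf: the application hands an empty buffer to the driver. */
static int cam_qbuf(struct cam *c, struct frame_buf *b)
{
    b->bytesused = 0;
    return queue_put(&c->unused, b);
}

/* Capture path, called once per completed frame: moves one buffer from the
 * unused queue to the out queue. Returns -1 (frame dropped) only when the
 * application has no buffer queued at all. */
static int cam_frame_done(struct cam *c, size_t len)
{
    struct frame_buf *b = queue_get(&c->unused);

    if (!b)
        return -1;
    b->bytesused = len;
    return queue_put(&c->out, b);
}

/* dqbuf: the application takes the oldest captured frame, or NULL. */
static struct frame_buf *cam_dqbuf(struct cam *c)
{
    return queue_get(&c->out);
}
```

With 16 buffers queued through cam_qbuf, 14 calls to cam_frame_done succeed
before the application makes its first cam_dqbuf call, which is the scenario
described above.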

So what does urb_complete have to do then (simplified, misses a few corner cases):
1) check if there is a v4l buf currently being filled, if not get one
    from the unused frames queue, if this fails resubmit urb and exit
2) check for sof found: goto 3, not found goto 4
3a) copy data before sof to v4lbuf currently being filled
3b) put current v4lbuf in outqueue
3c) get a new v4lbuf from unusedqueue, if this fails resubmit urb and exit
3d) move data pointer to behind sof, decrease length accordingly
3e) goto 2 (look for another sof, can we ever have 2 sof's in one
     iso packet? That would be one low res cam!)
4) copy data of packet to current v4lbuf
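
The steps above can be sketched in plain C as follows. This is a userspace
model, not driver code: the sof marker (two 0xFF bytes), the tiny buffer
sizes and all struct and function names are assumptions made for
illustration, and the corner cases stay unhandled (for example a sof split
across two packets, or the partial frame seen before the first sof). The
caller is assumed to resubmit the urb after handle_packet returns, whatever
path was taken.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 64   /* tiny on purpose, illustration only */
#define NBUF 4         /* total number of buffers must not exceed this */
#define SOF_LEN 2      /* hypothetical sof marker: 0xFF 0xFF */

struct frame_buf { unsigned char data[FRAME_MAX]; size_t len; };

struct cam {
    struct frame_buf *unused[NBUF]; int n_unused;   /* unused frames queue */
    struct frame_buf *out[NBUF];    int n_out;      /* captured frames queue */
    struct frame_buf *cur;          /* v4l buf currently being filled */
};

/* Take the oldest buffer from the unused queue, or NULL if it is empty. */
static struct frame_buf *take_unused(struct cam *c)
{
    struct frame_buf *b;

    if (c->n_unused == 0)
        return NULL;
    b = c->unused[0];
    c->n_unused--;
    memmove(&c->unused[0], &c->unused[1], (size_t)c->n_unused * sizeof(b));
    b->len = 0;
    return b;
}

/* Return the offset of the first sof marker in p, or -1 if there is none. */
static long find_sof(const unsigned char *p, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i++)
        if (p[i] == 0xFF && p[i + 1] == 0xFF)
            return (long)i;
    return -1;
}

static void append(struct frame_buf *b, const unsigned char *p, size_t len)
{
    if (len > FRAME_MAX - b->len)
        len = FRAME_MAX - b->len;   /* truncate rather than overflow */
    memcpy(b->data + b->len, p, len);
    b->len += len;
}

/* Handle the data of one iso packet. */
static void handle_packet(struct cam *c, const unsigned char *p, size_t len)
{
    long sof;

    if (!c->cur && !(c->cur = take_unused(c)))   /* step 1 */
        return;
    while ((sof = find_sof(p, len)) >= 0) {      /* step 2 */
        append(c->cur, p, (size_t)sof);          /* 3a */
        c->out[c->n_out++] = c->cur;             /* 3b */
        c->cur = take_unused(c);                 /* 3c */
        if (!c->cur)
            return;
        p += sof + SOF_LEN;                      /* 3d */
        len -= (size_t)sof + SOF_LEN;
    }                                            /* 3e: look for another sof */
    append(c->cur, p, len);                      /* step 4 */
}
```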


Problems with the suggested design: what to do with processing which needs to be
done on the raw isoc packet data before it is suitable for userspace?

1) Don't, always feed raw data (frame aligned by checking for sof's but
   otherwise raw) to userspace.

   Pro: simple, elegant
   Con: won't work for many of today's applications

2) Process the data when being removed from the outqueue, so in the
   applications (kernel) context.

   Pro: simple, elegant, no possibly heavy processing done in interrupt context
   Con:
    a) one could argue none of this kind of processing belongs in kernelspace. I
       disagree: the kernel should also serve as a hardware abstraction layer.
       Sure, jpeg and raw bayer processing should be done in userspace, but
       manufacturer specific raw bayer decompression like pac207 and sn2c109
       belongs in the driver, as this is hardware specific knowledge, which
       should all be bundled in one place / one body of code / one object.

    b) if the processing cannot be done inline / in the buffer, a bounce buffer
       is needed, and then a memcpy back over the original data.

3) As 2, but:
   -In reqbufs give the application one more buffer than it has asked for.
   -In dqbuf / read, take a v4l_buf from the unused queue, and use that as
    destination buffer for the processing, then return this buf to the
    application and put the buf taken from the outqueue into the unused
    buffers queue.
   -If the unused queue is empty for whatever reason (shouldn't happen
    often), fall back to using a bounce buffer.

   Pro: simple, elegant, no possibly heavy processing done in interrupt context
   Con: see a) con of 2
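
A minimal sketch of the buffer swap in method 3, again as a plain C
userspace model with hypothetical names. decode_frame is a stand-in for the
real hardware specific processing (here it merely inverts every byte), and
the fifo and cam structs are assumptions, not existing driver code.

```c
#include <stddef.h>

#define FRAME_MAX 64   /* tiny on purpose, illustration only */
#define NBUF 8         /* total number of buffers must not exceed this */

struct frame_buf { unsigned char data[FRAME_MAX]; size_t len; };
struct fifo { struct frame_buf *slot[NBUF]; unsigned head, count; };

struct cam {
    struct fifo unused;        /* empty buffers, including the one spare */
    struct fifo out;           /* captured frames, still holding raw data */
    struct frame_buf bounce;   /* fallback when no unused buffer is free */
};

static int fifo_put(struct fifo *q, struct frame_buf *b)
{
    if (q->count == NBUF)
        return -1;
    q->slot[(q->head + q->count++) % NBUF] = b;
    return 0;
}

static struct frame_buf *fifo_get(struct fifo *q)
{
    struct frame_buf *b;

    if (q->count == 0)
        return NULL;
    b = q->slot[q->head];
    q->head = (q->head + 1) % NBUF;
    q->count--;
    return b;
}

/* Hypothetical stand-in for the real processing: inverts every byte of src
 * into dst. dst and src must be different buffers. */
static void decode_frame(struct frame_buf *dst, const struct frame_buf *src)
{
    size_t i;

    for (i = 0; i < src->len; i++)
        dst->data[i] = (unsigned char)~src->data[i];
    dst->len = src->len;
}

/* dqbuf as in method 3: process in the application's context, using an
 * unused buffer as destination and recycling the raw one. */
static struct frame_buf *cam_dqbuf(struct cam *c)
{
    struct frame_buf *raw = fifo_get(&c->out);
    struct frame_buf *dst;

    if (!raw)
        return NULL;
    dst = fifo_get(&c->unused);
    if (dst) {
        decode_frame(dst, raw);
        (void)fifo_put(&c->unused, raw);   /* raw buf becomes an unused buf */
        return dst;
    }
    /* Unused queue empty: decode into the bounce buffer, then copy back
     * over the original data. */
    decode_frame(&c->bounce, raw);
    *raw = c->bounce;
    return raw;
}
```

In the common case the only cost on top of the processing itself is two
queue operations; the extra copy is paid only on the bounce buffer path.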

4) Process the data one iso packet at a time in the urb_complete handler,
    immediately writing application ready data to the v4lbuf.

    Assuming the processing code knows when it has a complete frame, this would
    allow us to optimize things by not looking for a sof until a frame is either
    complete or some error has happened, thereby avoiding the not always cpu
    friendly sof finding most of the time.

    Notice that this is what the usbvideo2 "core" + pac207 driver currently does,
    except that the optimization of no unnecessary sof finding has not been
    made yet.

   Pro: possible sof finding optimization resulting in less cpu use in
        interrupt context (which we can always do if the data has a fixed size,
        and can never do if it's compressed in a format which we send compressed
        to userspace, e.g. jpg).
   Con: The processing code needs to keep lots of state, as it can get fed half a
        scanline, etc., resulting in a non-trivial state machine; see the pac207
        code for an example.

        Lots of processing done in a bottom half int. handler!
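
The sof finding optimization for fixed size data can be sketched like this.
It is a plain C userspace model, not the pac207 code: FRAME_SIZE, the
two-byte 0xFF 0xFF marker and the stream struct are assumptions, the
per-packet "processing" is just a memcpy, and a marker split across two
packets is not handled. The sof_searches counter exists only to show how
often the search runs.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 8   /* hypothetical fixed frame size in bytes */
#define SOF_LEN 2      /* hypothetical sof marker: 0xFF 0xFF */

struct stream {
    unsigned char frame[FRAME_SIZE];   /* stands in for the current v4lbuf */
    size_t filled;          /* bytes of the current frame stored so far */
    int in_frame;           /* nonzero while a frame is being filled */
    unsigned frames_done;   /* completed frames */
    unsigned sof_searches;  /* how often the sof search ran */
};

/* Return the offset of the first sof marker in p, or -1 if there is none. */
static long find_sof(struct stream *s, const unsigned char *p, size_t len)
{
    size_t i;

    s->sof_searches++;
    for (i = 0; i + 1 < len; i++)
        if (p[i] == 0xFF && p[i + 1] == 0xFF)
            return (long)i;
    return -1;
}

/* Handle one iso packet. The sof search runs only between frames, not on
 * every packet, because the frame size tells us where a frame ends. */
static void handle_packet(struct stream *s, const unsigned char *p, size_t len)
{
    while (len > 0) {
        size_t n;

        if (!s->in_frame) {
            long sof = find_sof(s, p, len);

            if (sof < 0)
                return;   /* no frame start here, drop the rest */
            p += sof + SOF_LEN;
            len -= (size_t)sof + SOF_LEN;
            s->in_frame = 1;
            s->filled = 0;
            continue;
        }
        n = FRAME_SIZE - s->filled;
        if (n > len)
            n = len;
        memcpy(s->frame + s->filled, p, n);
        s->filled += n;
        p += n;
        len -= n;
        if (s->filled == FRAME_SIZE) {   /* frame complete */
            s->frames_done++;
            s->in_frame = 0;
        }
    }
}
```

Even this reduced version already carries state from one packet to the next
(filled, in_frame), which hints at why the real per-packet processing ends
up as a non-trivial state machine.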


After having written this all down, I believe I've answered my own question
with regard to how to solve the processing of raw data problem: the answer, I
believe, is method 3.

So what do others think, what is the best design for usb webcam drivers?

Thanks & Regards,

