[Spca50x-devs] (Re)design of gspca-v4l2
- Date: Sun, 30 Mar 2008 11:46:33 +0200
- From: Hans de Goede <j.w.r.degoede@xxxxxx>
- Subject: [Spca50x-devs] (Re)design of gspca-v4l2
Hi All,
I've been thinking some more (a lot more) about the current gspca-v4l2 design,
and I'm sad to say that I believe that after all it is not such a good design.
The current design queues urbs for later processing in the application's
kernel context when a read or dqbuf call is made. It has one big advantage: the
urb_complete handler, which gets called in bottom half interrupt context, is
extremely short.
It however also has several big drawbacks:
1) It abuses the urb data structure as a data buffer; this is not what the
isoc handling of the kernel usb subsystem was designed for. The whole idea of
that design is that you do the minimum amount of processing necessary in
complete_urb and then resubmit the urb. This is the least of my concerns however:
2) The current design basically ignores the v4l2 api design completely. It
doesn't matter how many frames an application requests in its request buffers
ioctl; only as much data as fits in the urbs will be buffered. Take for
example an application which does not live stream, but rather records to disk
and doesn't want to miss a single frame. It requests (and gets) 16 v4l
buffers. However, if there is some significant latency in calling dqbuf, caused
by whatever, it will still lose data, as the whole isoc flow will have stopped
because it has run out of urbs. This is bad, bad, bad! Repeating myself: the
current design basically ignores the buffer management as intended and
designed in the current v4l2 api, so it's a rather bad (unfaithful)
implementation of that api. This really should be fixed.
---
So what do I suggest instead: use 2 queues of v4l_buf structs (or structs
"derived" from / containing v4l_buf structs): one queue which holds unused
frames, which have been queued with the qbuf ioctl to get filled with data, and
one queue which holds captured frames, which can then be dequeued
using dqbuf. This way, if an application has 16 buffers queued for filling with
frames, it can actually get delayed for whatever reason for 14 frames, then
start dequeuing (and requeuing once processed) and not miss a single frame, as
intended by the api design.
Yes, this is the way it's currently done in my usbvideo2 "core", and no, I didn't
think this up myself; it's stolen from the drivers by Luca Risolia, and it works!
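A minimal userspace sketch of the two-queue scheme described above, just to make the buffer flow concrete. All names here (frame_buf, buf_queue, capture_dev, dev_qbuf, dev_frame_captured, dev_dqbuf) are hypothetical and are not taken from gspca, usbvideo2 or the v4l2 headers; a real driver would also need locking between the interrupt handler and the ioctl paths.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUFS 16   /* hypothetical upper bound on buffers */

/* Hypothetical stand-in for a struct wrapping a v4l2 buffer. */
struct frame_buf {
    int index;           /* buffer index as seen by the application */
    size_t bytes_used;   /* amount of captured data in the buffer */
};

/* Fixed-size FIFO of buffer pointers (ring buffer). */
struct buf_queue {
    struct frame_buf *slots[MAX_BUFS];
    size_t head;    /* next slot to pop */
    size_t count;   /* number of queued buffers */
};

/* Append a buffer; returns 0 on success, -1 if the queue is full. */
static int queue_push(struct buf_queue *q, struct frame_buf *b)
{
    assert(q != NULL && b != NULL);
    if (q->count == MAX_BUFS)
        return -1;
    q->slots[(q->head + q->count) % MAX_BUFS] = b;
    q->count++;
    return 0;
}

/* Remove the oldest buffer; returns NULL if the queue is empty. */
static struct frame_buf *queue_pop(struct buf_queue *q)
{
    struct frame_buf *b;

    if (q->count == 0)
        return NULL;
    b = q->slots[q->head];
    q->head = (q->head + 1) % MAX_BUFS;
    q->count--;
    return b;
}

struct capture_dev {
    struct buf_queue unused;   /* filled by qbuf, drained by the capture side */
    struct buf_queue done;     /* filled by the capture side, drained by dqbuf */
};

/* qbuf: the application hands an empty buffer to the driver. */
static int dev_qbuf(struct capture_dev *d, struct frame_buf *b)
{
    b->bytes_used = 0;
    return queue_push(&d->unused, b);
}

/* Capture side: one complete frame of len bytes has arrived.
 * Returns -1 if no unused buffer is available (the frame is dropped). */
static int dev_frame_captured(struct capture_dev *d, size_t len)
{
    struct frame_buf *b = queue_pop(&d->unused);

    if (b == NULL)
        return -1;
    b->bytes_used = len;
    return queue_push(&d->done, b);
}

/* dqbuf: the application takes the oldest captured frame, or NULL. */
static struct frame_buf *dev_dqbuf(struct capture_dev *d)
{
    return queue_pop(&d->done);
}
```

With 16 buffers queued through dev_qbuf, 14 calls to dev_frame_captured succeed before the application calls dev_dqbuf even once, which is the behaviour the recording example needs.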
So what does urb_complete have to do then (simplified, misses a few corner
cases):
1) check if there is a v4l buf currently being filled; if not, get one
from the unused frames queue; if this fails, resubmit the urb and exit
2) look for a sof in the packet: if found goto 3, if not found goto 4
3a) copy the data before the sof to the v4lbuf currently being filled
3b) put the current v4lbuf in the outqueue
3c) get a new v4lbuf from the unused queue; if this fails, resubmit the urb and exit
3d) move the data pointer to behind the sof, decrease the length accordingly
3e) goto 2 (look for another sof; can we ever have 2 sofs in one
iso packet? That would be one low res cam!)
4) copy the data of the packet to the current v4lbuf
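The steps above can be sketched in plain userspace C roughly as follows. This is only an illustration under stated assumptions: the 2-byte SOF marker, the buffer sizes and all names (stream, take_unused, find_sof, handle_packet) are hypothetical, the urb resubmit is left to the caller, and a marker split across two packets is one of the corner cases not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_CAP 64   /* hypothetical frame buffer size */
#define MAX_BUFS  4    /* hypothetical total number of buffers */

/* Hypothetical 2-byte start-of-frame marker; real cameras differ. */
static const unsigned char SOF[2] = { 0xff, 0xa5 };

struct frame_buf {
    unsigned char data[FRAME_CAP];
    size_t used;
};

struct stream {
    struct frame_buf *unused[MAX_BUFS]; size_t n_unused;   /* FIFO */
    struct frame_buf *done[MAX_BUFS];   size_t n_done;     /* outqueue */
    struct frame_buf *cur;              /* buffer currently being filled */
};

/* Take the oldest unused buffer, or NULL if none is queued. */
static struct frame_buf *take_unused(struct stream *s)
{
    struct frame_buf *b;

    if (s->n_unused == 0)
        return NULL;
    b = s->unused[0];
    s->n_unused--;
    memmove(&s->unused[0], &s->unused[1], s->n_unused * sizeof(s->unused[0]));
    b->used = 0;
    return b;
}

/* Append packet data to a buffer, truncating if the frame is oversized. */
static void append(struct frame_buf *b, const unsigned char *p, size_t len)
{
    if (len > FRAME_CAP - b->used)
        len = FRAME_CAP - b->used;
    memcpy(b->data + b->used, p, len);
    b->used += len;
}

/* Return a pointer to the first SOF marker in the packet, or NULL. */
static const unsigned char *find_sof(const unsigned char *p, size_t len)
{
    size_t i;

    for (i = 0; i + sizeof(SOF) <= len; i++)
        if (memcmp(p + i, SOF, sizeof(SOF)) == 0)
            return p + i;
    return NULL;
}

/* Process one iso packet; the caller resubmits the urb afterwards. */
static void handle_packet(struct stream *s, const unsigned char *p, size_t len)
{
    const unsigned char *sof;

    if (s->cur == NULL)                              /* step 1 */
        s->cur = take_unused(s);
    if (s->cur == NULL)
        return;

    while ((sof = find_sof(p, len)) != NULL) {       /* step 2 */
        append(s->cur, p, (size_t)(sof - p));        /* 3a */
        assert(s->n_done < MAX_BUFS);
        s->done[s->n_done++] = s->cur;               /* 3b */
        s->cur = take_unused(s);                     /* 3c */
        if (s->cur == NULL)
            return;
        len -= (size_t)(sof - p) + sizeof(SOF);      /* 3d */
        p = sof + sizeof(SOF);
    }                                                /* 3e: loop to step 2 */
    append(s->cur, p, len);                          /* step 4 */
}
```

Note that the handler only copies data and moves pointers between the two queues; any decoding of the frame contents is left for later.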
---
Problems with the suggested design: what to do with processing which needs to be
done on the raw isoc packet data before it is suitable for userspace?
Solutions:
1) Don't, always feed raw data (frame aligned by checking for sof's but
otherwise raw) to userspace.
Pro: simple, elegant
Con: won't work for many of today's applications
2) Process the data when it is removed from the outqueue, so in the
application's (kernel) context.
Pro: simple, elegant, no possibly heavy processing done in interrupt context
Con:
a) one could argue none of this kind of processing belongs in kernelspace. I
disagree: the kernel should also serve as a hardware abstraction layer.
Sure, jpeg and raw bayer processing should be done in userspace, but
manufacturer specific raw bayer decompression like pac207 and sn2c109
belongs in the driver, as this is hardware specific knowledge, which
should all be bundled in one place / one body of code / one object.
b) if the processing cannot be done inline / in the buffer, a bounce buffer
is needed, and then a memcpy back over the original data.
3) As 2, but:
-In reqbufs, give the application one more buffer than it has asked for.
-In dqbuf / read, take a v4l_buf from the unused queue and use that as the
destination buffer for the processing, then return this buf to the
application and put the buf taken from the outqueue into the unused
buffers queue.
-If the unused queue is empty for whatever reason (shouldn't happen
often), fall back to using a bounce buffer.
Pro: simple, elegant, no possibly heavy processing done in interrupt context
Con: see a) con of 2
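To illustrate the buffer swap of method 3, here is a rough userspace sketch. Everything in it is an assumption made for the example: the names (buf_list, capture_dev, dev_dqbuf), the sizes, and especially decode(), which is a made-up nibble-expanding transform standing in for vendor specific decompression that cannot be done in place. Locking against the interrupt handler is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_CAP 32   /* hypothetical frame buffer size */
#define MAX_BUFS  8    /* hypothetical total number of buffers */

struct frame_buf {
    unsigned char data[FRAME_CAP];
    size_t used;
};

/* Simple FIFO of buffer pointers. */
struct buf_list {
    struct frame_buf *items[MAX_BUFS];
    size_t count;
};

static void list_push(struct buf_list *l, struct frame_buf *b)
{
    assert(l->count < MAX_BUFS);
    l->items[l->count++] = b;
}

static struct frame_buf *list_pop(struct buf_list *l)
{
    struct frame_buf *b;

    if (l->count == 0)
        return NULL;
    b = l->items[0];
    l->count--;
    memmove(&l->items[0], &l->items[1], l->count * sizeof(l->items[0]));
    return b;
}

/* Made-up stand-in for hardware specific decompression: every raw byte
 * expands to two output bytes (high nibble, low nibble), so the output
 * is larger than the input and cannot be produced in place. */
static size_t decode(const unsigned char *raw, size_t raw_len,
                     unsigned char *out, size_t out_cap)
{
    size_t i, n = 0;

    for (i = 0; i < raw_len && n + 2 <= out_cap; i++) {
        out[n++] = raw[i] >> 4;
        out[n++] = raw[i] & 0x0f;
    }
    return n;
}

struct capture_dev {
    struct buf_list unused;             /* buffers waiting to be filled */
    struct buf_list done;               /* raw captured frames (outqueue) */
    unsigned char bounce[FRAME_CAP];    /* fallback when unused is empty */
};

/* dqbuf as in method 3: decode into a spare buffer and swap roles. */
static struct frame_buf *dev_dqbuf(struct capture_dev *d)
{
    struct frame_buf *raw = list_pop(&d->done);
    struct frame_buf *dst;
    size_t n;

    if (raw == NULL)
        return NULL;

    dst = list_pop(&d->unused);
    if (dst != NULL) {
        dst->used = decode(raw->data, raw->used, dst->data, FRAME_CAP);
        raw->used = 0;
        list_push(&d->unused, raw);   /* raw buffer goes back to capturing */
        return dst;
    }

    /* Fallback: decode via the bounce buffer, then copy back. */
    n = decode(raw->data, raw->used, d->bounce, FRAME_CAP);
    memcpy(raw->data, d->bounce, n);
    raw->used = n;
    return raw;
}
```

The extra buffer handed out in reqbufs is what keeps the unused queue from being empty in the common case, so the memcpy in the fallback path should rarely run.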
4) Process the data one iso packet at a time in the urb_complete handler,
immediately writing application ready data to the v4lbuf.
Assuming the processing code knows when it has a complete frame, this would
allow us to optimize things by not looking for a sof until a frame is either
complete or some error has happened, thereby avoiding the not always cpu
friendly sof finding most of the time.
Notice that this is what the usbvideo2 "core" + pac207 driver currently does,
except that the optimization of no unnecessary sof finding has not been
made yet.
Pro: possible sof finding optimization resulting in less cpu use during
interrupt context (which we can always do if the data has a fixed size,
and can never do if it's compressed in a format which we send compressed
to userspace (jpg)).
Con: The process code needs to keep lots of state, as it can get fed half a
scanline, etc., resulting in a non-trivial state machine; see the pac207
code for example.
Lots of processing done in a bottom half int. handler!
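A small sketch of the sof finding optimization for the fixed-size case, to show the kind of state that has to be carried between packets. It is not the pac207 code: the one-byte marker, the tiny FRAME_SIZE and the names (parser, feed_packet) are hypothetical, and a real driver would write into a v4lbuf and queue it instead of just counting frames.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 4      /* hypothetical fixed frame payload size */
#define SOF_BYTE   0xa5   /* hypothetical one-byte start-of-frame marker */

enum parse_state { SYNCING, FILLING };

struct parser {
    enum parse_state state;           /* SYNCING: looking for a sof */
    unsigned char frame[FRAME_SIZE];  /* frame being assembled */
    size_t filled;                    /* bytes of the frame received so far */
    unsigned frames_done;             /* number of completed frames */
};

/* Feed one iso packet. A sof is only searched for while SYNCING; while
 * FILLING, payload bytes are copied without being inspected. */
static void feed_packet(struct parser *ps, const unsigned char *p, size_t len)
{
    while (len > 0) {
        if (ps->state == SYNCING) {
            const unsigned char *sof = memchr(p, SOF_BYTE, len);

            if (sof == NULL)
                return;   /* no marker in the rest of this packet */
            len -= (size_t)(sof - p) + 1;
            p = sof + 1;
            ps->filled = 0;
            ps->state = FILLING;
        } else {
            size_t n = FRAME_SIZE - ps->filled;

            assert(ps->filled < FRAME_SIZE);
            if (n > len)
                n = len;
            memcpy(ps->frame + ps->filled, p, n);
            ps->filled += n;
            p += n;
            len -= n;
            if (ps->filled == FRAME_SIZE) {
                ps->frames_done++;     /* frame complete */
                ps->state = SYNCING;   /* next bytes should be a sof */
            }
        }
    }
}
```

Because the frame length is known, a payload byte that happens to equal the marker is copied like any other byte, and the search only runs at frame boundaries or after data has been lost.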
---
After having written this all down, I believe I've answered my own question
with regard to how to solve the processing of raw data problem: the answer, I
believe, is method 3.
So what do others think, what is the best design for usb webcam drivers?
Thanks & Regards,
Hans