Re: [linux-usb-devel] OHCI hangs after failing to free resources

David Brownell wrote:
> On Wednesday 02 May 2007, Mike Nuss wrote:
> It's possible that the SKIP bit isn't handled correctly.
> I've not looked at that code in some time, but I seem to
> recall thinking that setting SKIP was an action more in
> the "defensive paranoia" category than the "essential" one,
> so far as "functionally necessary" criteria go.  It's just
> an optimization ... every time it's applied, the QH is also
> removed from the schedule.  Either one alone should prevent
> the HCD from doing much more with that QH...
> But I don't have time to sort out the relationship between
> the SKIP bit and the software DEQUEUE flag.  The complication
> is dl_done_list().
> See if you can make that code behave without turning on SKIP.
> A quick'n'dirty experiment might be #defining that bit to zero
> and deferring the clear of ED_H so that dl_done_list() still
> has a way to tell when it's cleaning up after a halt.
> - Dave

Thanks for your reply. ISTR trying to clear SKIP after the fact without any success, but the damage is probably already done at that point, especially if the hardware doesn't handle it as we'd expect. I'll take a look at the code and see if I can get things to work properly without it, per your suggestion.

I posted this dump in my last message. It looks like the transfer completed but the TD was never put on the donelist:

ohci_hcd 0000:00:13.0: read endpoint, ed c2d912c0 state 0x0 type intr; next ed 00000000
ohci_hcd 0000:00:13.0:   info 08405110 MAX=64 DQ SKIP EP=2-IN DEV=16
ohci_hcd 0000:00:13.0:   tds: head 02ba7300 DATA0 tail 02ba7300
ohci_hcd 0000:00:13.0:   -> td c2ba7340; urb c272ca40 index 0; hw next td 00000000
ohci_hcd 0000:00:13.0:      info 02140000 CC=0 DATA0 DI=0 IN R
ohci_hcd 0000:00:13.0:      cbp 02dbe37a be 02dbe39f (len 38)
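
The two info words in the dump can be checked by hand against the field annotations printed next to them. The following is a minimal sketch, assuming the ED and general TD bit layout from the OHCI spec; the helper names are made up for illustration and are not the ones used in ohci-hcd. Bit 27 of the ED word is reserved in the spec, and it is assumed here to be the software flag that the dump prints as DQ.

```c
#include <stdint.h>

/* Endpoint descriptor (ED) info word, OHCI spec layout. */
unsigned ed_dev(uint32_t info)  { return info & 0x7f; }          /* function address */
unsigned ed_ep(uint32_t info)   { return (info >> 7) & 0xf; }    /* endpoint number */
unsigned ed_dir(uint32_t info)  { return (info >> 11) & 0x3; }   /* 1 = OUT, 2 = IN */
unsigned ed_skip(uint32_t info) { return (info >> 14) & 0x1; }   /* sKip bit */
unsigned ed_mps(uint32_t info)  { return (info >> 16) & 0x7ff; } /* max packet size */
unsigned ed_dq(uint32_t info)   { return (info >> 27) & 0x1; }   /* assumed: software DQ flag */

/* General transfer descriptor (TD) info word, OHCI spec layout. */
unsigned td_r(uint32_t info)      { return (info >> 18) & 0x1; } /* buffer rounding */
unsigned td_dp(uint32_t info)     { return (info >> 19) & 0x3; } /* 0 = SETUP, 1 = OUT, 2 = IN */
unsigned td_di(uint32_t info)     { return (info >> 21) & 0x7; } /* delay interrupt */
unsigned td_toggle(uint32_t info) { return (info >> 24) & 0x3; } /* 2 = DATA0, 3 = DATA1 */
unsigned td_cc(uint32_t info)     { return (info >> 28) & 0xf; } /* condition code */

/*
 * Bytes still unused in the TD buffer. A zero CurrentBufferPointer
 * means the buffer was consumed completely.
 */
uint32_t td_remaining(uint32_t cbp, uint32_t be)
{
    return cbp ? be - cbp + 1 : 0;
}
```

Fed with the values above, ed_mps(0x08405110) gives 64, ed_skip() and ed_dq() both give 1, td_cc(0x02140000) gives 0, and td_remaining(0x02dbe37a, 0x02dbe39f) gives 38, which agrees with what the driver printed.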

1) CC = 0 seems to indicate that the HC has successfully completed the transfer (I believe the HCD sets it to 0xf initially).
2) All our URBs are submitted with 64-byte buffers. len=38 means 26 bytes have already been transferred (64 - 38 = 26), which is the number of bytes we were expecting in this particular test.
3) HwNextTD is null. This would happen when the HC has moved it to the donelist when the donelist was previously empty (which it should be, because HccaDoneHead is updated and WDH is sent after every single completion).
4) However, it never shows up on the donelist. I added some 'tracking' code to keep track of the last 50 TDs pulled off the donelist. The last TD for this endpoint that appeared on the donelist was the TD at 0x02ba7300 (the current 'dummy'). 
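
The tracking code mentioned in point 4 is not shown in this message. A minimal sketch of that kind of history buffer might look like the following; the struct and function names are hypothetical, and the real instrumentation may well have differed. The idea is to record the bus address of every TD as it is pulled off the donelist and keep only the most recent 50.

```c
#include <stdint.h>

#define TD_HISTORY 50   /* number of recent donelist TDs to remember */

/* Must be zero-initialized before first use. */
struct td_history {
    uint32_t td_dma[TD_HISTORY]; /* bus addresses of TDs seen on the donelist */
    unsigned next;               /* slot the next record is written to */
    unsigned count;              /* number of valid entries, capped at TD_HISTORY */
};

/* Remember one TD; the oldest entry is overwritten once the buffer is full. */
void td_history_record(struct td_history *h, uint32_t td_dma)
{
    h->td_dma[h->next] = td_dma;
    h->next = (h->next + 1) % TD_HISTORY;
    if (h->count < TD_HISTORY)
        h->count++;
}

/* Returns 1 if td_dma is among the remembered entries, 0 otherwise. */
int td_history_seen(const struct td_history *h, uint32_t td_dma)
{
    for (unsigned i = 0; i < h->count; i++)
        if (h->td_dma[i] == td_dma)
            return 1;
    return 0;
}
```

Calling td_history_record() from the donelist processing path and td_history_seen() when an endpoint appears stuck is enough to tell whether a given TD was ever handed back by the controller.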

The spec mentions that setting CC and updating HwNextTD "may" be done in the same write cycle, but I don't know about updating the donehead. Who knows what this particular controller is doing. Maybe if the HCD happens to set SKIP in that small timing window it gets mishandled.
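
To make the ordering in question concrete, here is a toy model of how a controller retires a TD, assuming the done-queue behavior described in the OHCI spec. It is an illustration only: the types and function names are hypothetical and none of this is driver or hardware code. Retirement and writeback are separate steps, so a TD can have CC = 0 and a null HwNextTD while still being invisible to the driver if the second step never covers it.

```c
#include <stdint.h>

struct toy_td {
    uint32_t cc;       /* condition code; 0xf means not yet accessed */
    uint32_t next_td;  /* models HwNextTD */
};

struct toy_hc {
    uint32_t done_head;      /* controller-internal HcDoneHead */
    uint32_t hcca_done_head; /* HccaDoneHead, the copy the driver reads */
    int wdh;                 /* WritebackDoneHead interrupt pending */
};

/* Complete a TD and push it onto the controller's internal done queue. */
void toy_retire(struct toy_hc *hc, struct toy_td *td, uint32_t td_dma)
{
    td->cc = 0;                   /* transfer completed without error */
    td->next_td = hc->done_head;  /* becomes 0 if the done queue was empty */
    hc->done_head = td_dma;
}

/* Publish the done queue to the driver and raise WDH. */
void toy_writeback(struct toy_hc *hc)
{
    hc->hcca_done_head = hc->done_head;
    hc->done_head = 0;
    hc->wdh = 1;
}
```

After toy_retire() on an empty queue the TD looks exactly like the one in the dump (CC = 0, next TD null), yet hcca_done_head is still 0 until toy_writeback() runs.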

As a side note - even though we don't know what the problem is, it seems to me that the error message "INTR_SF lossage" and the comments surrounding it should be changed. We're not losing interrupts. SOF and WDH are both still being generated, actually. ISTR the 2.4 code had a "?" after the equivalent message, which was at least a little more accurate ;)

