Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nut_libusb_get_interrupt: Connection timed out on debug #2644

Open
masterwishx opened this issue Sep 27, 2024 · 38 comments
Open

nut_libusb_get_interrupt: Connection timed out on debug #2644

masterwishx opened this issue Sep 27, 2024 · 38 comments

Comments

@masterwishx
Copy link
Contributor

Using NUT plugin for Unraid
When debug is on have many interrupt transfers :

have a lot of nut_libusb_get_interrupt: Connection timed out in debug log and
maybe also affect on #2643
or its just closing connection every time for cycle of reading data?

From libusb/libusb#1548 (comment) by @tormodvolden (have issues with interrupt transfers )

Mostly looks like this :

send_to_all: SETINFO driver.state "quiet"
Sep 27 09:37:30 DaRKNaS usbhid-ups[24218]: [D6:24218] send_to_all: write 29 bytes to socket 16 succeeded (ret=29): SETINFO driver.state "quiet"
Sep 27 09:37:31 DaRKNaS usbhid-ups[24218]: [D5:24218] send_to_all: SETINFO driver.state "updateinfo"
Sep 27 09:37:31 DaRKNaS usbhid-ups[24218]: [D6:24218] send_to_all: write 34 bytes to socket 16 succeeded (ret=34): SETINFO driver.state "updateinfo"
Sep 27 09:37:31 DaRKNaS usbhid-ups[24218]: [D1:24218] upsdrv_updateinfo...
Sep 27 09:37:32 DaRKNaS usbhid-ups[24218]: [D2:24218] nut_libusb_get_interrupt: Connection timed out
Sep 27 09:37:32 DaRKNaS usbhid-ups[24218]: [D1:24218] Got 0 HID objects...
Sep 27 09:37:32 DaRKNaS usbhid-ups[24218]: [D1:24218] Quick update...
Sep 27 09:37:32 DaRKNaS usbhid-ups[24218]: [D4:24218] 

image

@masterwishx
Copy link
Contributor Author

@tormodvolden @jimklimov do you think it's related to how nut proccess libusb or I can test it with xusb if it related to libusb itself?

I have also nut client on windows that read values from nut, also telegraf that read values and write to influxdbv2 for grafana dashboard.
Not shure if it can affect on isshue...

@tormodvolden
Copy link
Contributor

Is this causing a problem or is it just that you see the debug message in the log? You'll have to find out what the nut code tries to do, then what to expect from the device and libusb calls.

If you have suspicions about a libusb issue, you must provide LIBUSB_DEBUG=4 logs to start.

xusb can do some HID interrupt transfers, but I don't know if it fits with your device.

@masterwishx
Copy link
Contributor Author

Is this causing a problem or is it just that you see the debug message in the log? You'll have to find out what the nut code tries to do, then what to expect from the device and libusb calls.

Yes the problem upsrw commands not working for me , hovewer it write OK or Success #2643

@masterwishx
Copy link
Contributor Author

Is this causing a problem or is it just that you see the debug message in the log? You'll have to find out what the nut code tries to do, then what to expect from the device and libusb calls.

If you have suspicions about a libusb issue, you must provide LIBUSB_DEBUG=4 logs to start.

xusb can do some HID interrupt transfers, but I don't know if it fits with your device.

image

@tormodvolden
Copy link
Contributor

Personally I am allergic to text being pasted as images :) There are so many great tools to efficiently search and process textual information.

@masterwishx
Copy link
Contributor Author

Sorry, got it, will post the log here..

@masterwishx
Copy link
Contributor Author

Personally I am allergic to text being pasted as images :) There are so many great tools to efficiently search and process textual information.

nut debug log 1.10.24.txt

@masterwishx
Copy link
Contributor Author

Also tryed to add exported LIBUSB_DEBUG = 4 to ups.conf with debug_min = 6 but seems no effect with LIBUSB_DEBUG = 4

@tormodvolden
Copy link
Contributor

Please see #2616.

@masterwishx
Copy link
Contributor Author

Please see #2616.

Thanks, by #2616
I tryed to add LIBUSB_DEBUG = 4, but seems not worked for, I mean only debug_min=6 working by log I posted

@tormodvolden
Copy link
Contributor

Maybe bring that up in #2616, if you can't figure out how to enable this, the documentation is definitely not good enough.

@tormodvolden
Copy link
Contributor

Did you already read docs/man/nut.conf.txt ?

@masterwishx
Copy link
Contributor Author

Did you already read docs/man/nut.conf.txt ?

No just was reading #2616

@masterwishx
Copy link
Contributor Author

exported LIBUSB_DEBUG = 4

seems exported LIBUSB_DEBUG = 4 not documented yet in docs/man/nut.conf.txt
im using nut plugin for Unraid ,
So i just select UPS Driver Debug Level: 6 and its adding debug_min = 6 in ups.cong file

@tormodvolden
Copy link
Contributor

The LIBUSB_DEBUG variable is mentioned in docs/man/nut_usb_addvars.txt but docs/man/nut.conf.txt explains more in general how to set variables correctly.

@masterwishx
Copy link
Contributor Author

The LIBUSB_DEBUG variable is mentioned in docs/man/nut_usb_addvars.txt but docs/man/nut.conf.txt explains more in general how to set variables correctly.

So i added exported LIBUSB_DEBUG = 4 like in

`LIBUSB_DEBUG` before starting a NUT driver (may be set and "exported" via

my bad was i added to ups.conf instead of nut.conf

and nut started OK, if wrong variable was set in ups.conf NUT will not start .
So i added to nut.conf same effect ...

and from both logs , no info about libusb debug :

0.000000#011[D1:24188] debug level is '6'
0.000008#011[D5:24188] send_to_all: SETINFO driver.debug "6"
0.000021#011[D5:24188] send_to_all: SETFLAGS driver.debug RW NUMBER

@tormodvolden
Copy link
Contributor

What does docs/man/nut.conf.txt say about spaces? :)

@masterwishx
Copy link
Contributor Author

What does docs/man/nut.conf.txt say about spaces? :)

:) OK Got it, i will try it
but in nut plugin for Unraid that im using all config with spaces around equal , sorry for picture:

image

image

@jimklimov jimklimov added this to the 2.8.4 milestone Oct 7, 2024
@michal4132
Copy link
Contributor

michal4132 commented Nov 6, 2024

Hi,

I'm experiencing similar issue: after a certain period, the UPS (APC Back-UPS BX750MI) stops responding without any indication of stale data.

Nov 04 09:52:55 taylor usbhid-ups[16704]: [D2] nut_libusb_get_interrupt: Connection timed out
Nov 04 09:52:55 taylor usbhid-ups[16704]: [D1] Got 0 HID objects...
Nov 04 09:52:55 taylor usbhid-ups[16704]: [D1] Quick update...

Under normal operation, the HIDGetEvents function returns a positive non-zero value (e.g., 1). However, when the data appears stale, HIDGetEvents returns 0.

/* On success, returns an item count > 0. When no notifications are available,
 * returns an error or 'no event' code.
 */
int HIDGetEvents(hid_dev_handle_t udev, HIDData_t **event, int eventsize);

The documentation specifies that, on success, the return value should be greater than 0. Perhaps we should interpret a return value of 0 as an error and attempt to reconnect in that case?

I've tested this approach with the attached patch, and it has worked without any data staleness for over two days.

0001-apc-fix.patch.txt

@jimklimov
Copy link
Member

Looks reasonable; would you care to make it a pull request in your name?

Also, with an APC BX###MI device, #2565 and related issues and PRs may be relevant.

In your case, before the fix, did it begin reporting "0 HID objects" and then continued to do so indefinitely, or recovered after some time (~10-30 sec)?

@michal4132
Copy link
Contributor

Sure, I'll create a PR later.

I’ve looked through other issues, including #2565, and can confirm that the LB+RB bug is still present even with this fix.

Before applying the patch, the UPS would remain stuck at "0 HID objects" indefinitely (at least for several hours) and wouldn’t recover without a driver restart.

@masterwishx
Copy link
Contributor Author

I'm experiencing similar issue: after a certain period, the UPS (APC Back-UPS BX750MI) stops responding without any indication of stale data.

In my case ups not stop responding but seems some upsrw commands have no effect, also seems I'm not mentioned that I have ups Eaton 9E2000i model.

@masterwishx
Copy link
Contributor Author

with LIBUSB_DEBUG = 4 i can see next on timeouts

@tormodvolden
can you please check if you see some issues here :

Nov 16 14:42:07 server rc.nut:    7.463963#011[D1:11606] upsdrv_updateinfo...
Nov 16 14:42:07 server rc.nut: [ 6.649112] [00002d56] libusb: debug [libusb_submit_transfer] transfer 0x480c40
Nov 16 14:42:07 server rc.nut: [ 6.649113] [00002d56] libusb: debug [add_to_flying_list] arm timer for timeout in 750ms (first in line)
Nov 16 14:42:07 server rc.nut: [ 6.649118] [00002d56] libusb: debug [submit_bulk_transfer] need 1 urbs for new transfer with length 512
Nov 16 14:42:07 server rc.nut: [ 6.649128] [00002d56] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
Nov 16 14:42:07 server rc.nut: [ 6.649130] [00002d56] libusb: debug [usbi_wait_for_events] poll() 3 fds with timeout in 60000ms
Nov 16 14:42:08 server rc.nut: [ 7.399124] [00002d56] libusb: debug [usbi_wait_for_events] poll() returned 1
Nov 16 14:42:08 server rc.nut: [ 7.399135] [00002d56] libusb: debug [libusb_cancel_transfer] transfer 0x480c40
Nov 16 14:42:08 server rc.nut: [ 7.399481] [00002d56] libusb: debug [arm_timer_for_next_timeout] no timeouts, disarming timer
Nov 16 14:42:08 server rc.nut: [ 7.399489] [00002d56] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
Nov 16 14:42:08 server rc.nut: [ 7.399491] [00002d56] libusb: debug [usbi_wait_for_events] poll() 3 fds with timeout in 60000ms
Nov 16 14:42:08 server rc.nut: [ 7.399494] [00002d56] libusb: debug [usbi_wait_for_events] poll() returned 1
Nov 16 14:42:08 server rc.nut: [ 7.399498] [00002d56] libusb: debug [reap_for_handle] urb type=1 status=-2 transferred=0
Nov 16 14:42:08 server rc.nut: [ 7.399500] [00002d56] libusb: debug [handle_bulk_completion] handling completion status -2 of bulk urb 1/1
Nov 16 14:42:08 server rc.nut: [ 7.399502] [00002d56] libusb: debug [handle_bulk_completion] abnormal reap: urb status -2
Nov 16 14:42:08 server rc.nut: [ 7.399503] [00002d56] libusb: debug [handle_bulk_completion] abnormal reap: last URB handled, reporting
Nov 16 14:42:08 server rc.nut: [ 7.399505] [00002d56] libusb: debug [usbi_handle_transfer_cancellation] detected timeout cancellation
Nov 16 14:42:08 server rc.nut: [ 7.399506] [00002d56] libusb: debug [arm_timer_for_next_timeout] no timeouts, disarming timer
Nov 16 14:42:08 server rc.nut: [ 7.399508] [00002d56] libusb: debug [usbi_handle_transfer_completion] transfer 0x480c40 has callback 0x14aae8475630
Nov 16 14:42:08 server rc.nut: [ 7.399510] [00002d56] libusb: debug [sync_transfer_cb] actual_length=0
Nov 16 14:42:08 server rc.nut: [ 7.399512] [00002d56] libusb: debug [libusb_free_transfer] transfer 0x480c40
Nov 16 14:42:08 server rc.nut:    8.214372#011[D2:11606] nut_libusb_get_interrupt: Connection timed out
Nov 16 14:42:08 server rc.nut:    8.214378#011[D1:11606] Got 0 HID objects...
Nov 16 14:42:08 server rc.nut:    8.214381#011[D1:11606] Quick update...

@tormodvolden
Copy link
Contributor

I see a bulk transfer that times out, with the timeout set to 750ms. Have you tried a longer timeout?
What is the request? Is it something that the device should always handle?

@tormodvolden
Copy link
Contributor

Note that this is bulk transfer. Is it a bulk endpoint? I was confused by the talk about "interrupt transfers" in the other ticket. What is nut_libusb_get_interrupt() supposed to do?

@masterwishx
Copy link
Contributor Author

masterwishx commented Nov 16, 2024

Note that this is bulk transfer. Is it a bulk endpoint? I was confused by the talk about "interrupt transfers" in the other ticket. What is nut_libusb_get_interrupt() supposed to do?

It's for Eaton 9E 2000i ups that you fixed buggy firmware with wrong language ID. #2604
Maybe @jimklimov can help here?

@masterwishx
Copy link
Contributor Author

Nov 16 11:47:33 server rc.nut:    0.961876#011[D1:31179] Checking device 10 of 11 (0463/FFFF)
Nov 16 11:47:33 server rc.nut:    0.961876#011[D1:31179] nut_usb_get_string: Broken language identifier, assuming en_US
Nov 16 11:47:33 server rc.nut:    1.016918#011[D1:31179] nut_usb_get_string: Broken language identifier, assuming en_US
Nov 16 11:47:33 server rc.nut:    1.082520#011[D1:31179] nut_usb_get_string: Broken language identifier, assuming en_US
Nov 16 11:47:33 server rc.nut:    1.131127#011[D2:31179] - VendorID: 0463
Nov 16 11:47:33 server rc.nut:    1.131148#011[D2:31179] - ProductID: ffff
Nov 16 11:47:33 server rc.nut:    1.131159#011[D2:31179] - Manufacturer: EATON
Nov 16 11:47:33 server rc.nut:    1.131161#011[D2:31179] - Product: Eaton 9E
Nov 16 11:47:33 server rc.nut:    1.131163#011[D2:31179] - Serial Number: reducted
Nov 16 11:47:33 server rc.nut:    1.131165#011[D2:31179] - Bus: 003
Nov 16 11:47:33 server rc.nut:    1.131166#011[D2:31179] - Bus Port: 003
Nov 16 11:47:33 server rc.nut:    1.131168#011[D2:31179] - Device: 002
Nov 16 11:47:33 server rc.nut:    1.131170#011[D2:31179] - Device release number: 0202
Nov 16 11:47:33 server rc.nut:    1.131173#011[D2:31179] Trying to match device
Nov 16 11:47:33 server rc.nut:    1.131176#011[D2:31179] match_function_subdriver (non-SHUT mode): matching a device...
Nov 16 11:47:33 server rc.nut:    1.131179#011[D3:31179] match_function_regex: matching a device...
Nov 16 11:47:33 server rc.nut:    1.131182#011[D2:31179] Device matches
Nov 16 11:47:33 server rc.nut:    1.131184#011[D2:31179] Reading configuration descriptor 1 of 1
Nov 16 11:47:33 server rc.nut:    1.131193#011[D3:31179] libusb_kernel_driver_active() returned 0: Success
Nov 16 11:47:33 server rc.nut:    1.131209#011[D2:31179] Claimed interface 0 successfully
Nov 16 11:47:33 server rc.nut:    1.131212#011[D3:31179] nut_libusb_set_altinterface: skipped libusb_set_interface_alt_setting(udev, 0, 0)
Nov 16 11:47:33 server rc.nut:    1.162016#011[D2:31179] Retrieved HID descriptor (expected 9, got 9)
Nov 16 11:47:33 server rc.nut:    1.162038#011[D3:31179] HID descriptor, method 1: (9 bytes) => 09 21 10 01 21 01 22 05 08
Nov 16 11:47:33 server rc.nut:    1.162040#011[D3:31179] HID descriptor length (method 1) 2053
Nov 16 11:47:33 server rc.nut:    1.162042#011[D4:31179] i=0, extra[i]=09, extra[i+1]=21
Nov 16 11:47:33 server rc.nut:    1.162045#011[D3:31179] HID descriptor, method 2: (9 bytes) => 09 21 10 01 21 01 22 f6 03
Nov 16 11:47:33 server rc.nut:    1.162049#011[D3:31179] HID descriptor length (method 2) 1014
Nov 16 11:47:33 server rc.nut:    1.162052#011[D1:31179] Eaton device v2.02. Using full report descriptor
Nov 16 11:47:33 server rc.nut:    1.162054#011[D2:31179] Warning: two different HID descriptors retrieved (Reportlen = 2053 vs. 1014)
Nov 16 11:47:33 server rc.nut:    1.162056#011[D2:31179] HID descriptor length 2053
Nov 16 11:47:37 server rc.nut:    4.427419#011[D2:31179] Report Descriptor size = 2053

@masterwishx
Copy link
Contributor Author

masterwishx commented Nov 16, 2024

found only :

https://github.com/networkupstools/nut/blob/0f6ae03925b139f64e8f45576dd9d5623cd1fd34/drivers/usbhid-ups.c#L1120C1-L1124C1

@jimklimov can it be related to pollonly ?
but seems its only when "nut_libusb_get_interrupt: Input/Output Error\"

@jimklimov
Copy link
Member

Not really sure, it was before my time and USB code is still not something I am able to navigate well. But just tracing the method/variable names gets us quite far along:

  • nut/drivers/libusb0.c

    Lines 920 to 950 in 0f6ae03

    /* Expected evaluated types for the API:
    * static int nut_libusb_get_interrupt(usb_dev_handle *udev,
    * unsigned char *buf, int bufsize, int timeout)
    */
    static int nut_libusb_get_interrupt(
    usb_dev_handle *udev,
    usb_ctrl_charbuf buf,
    usb_ctrl_charbufsize bufsize,
    usb_ctrl_timeout_msec timeout)
    {
    int ret;
    if (!udev) {
    return -1;
    }
    /* Interrupt EP is USB_ENDPOINT_IN with offset defined in hid_ep_in, which is 0 by default, unless overridden in subdriver. */
    ret = usb_interrupt_read(udev, USB_ENDPOINT_IN + usb_subdriver.hid_ep_in, (char *)buf, bufsize, timeout);
    #ifdef WIN32
    errno = -ret;
    #endif
    /* Clear stall condition */
    if (ret == -EPIPE) {
    ret = usb_clear_halt(udev, 0x81);
    }
    return nut_libusb_strerror(ret, __func__);
    }
    called usb_interrupt_read()
  • nut/drivers/libusb1.c

    Lines 1032 to 1097 in 0f6ae03

    /* Expected evaluated types for the API:
    * static int nut_libusb_get_interrupt(libusb_device_handle *udev,
    * unsigned char *buf, int bufsize, int timeout)
    */
    static int nut_libusb_get_interrupt(
    libusb_device_handle *udev,
    usb_ctrl_charbuf buf,
    usb_ctrl_charbufsize bufsize,
    usb_ctrl_timeout_msec timeout)
    {
    int ret, tmpbufsize;
    #if (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_PUSH_POP) && ( (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TYPE_LIMITS) || (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_CONSTANT_OUT_OF_RANGE_COMPARE) || (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_UNSIGNED_ZERO_COMPARE) )
    # pragma GCC diagnostic push
    #endif
    #ifdef HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TYPE_LIMITS
    # pragma GCC diagnostic ignored "-Wtype-limits"
    #endif
    #ifdef HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_CONSTANT_OUT_OF_RANGE_COMPARE
    # pragma GCC diagnostic ignored "-Wtautological-constant-out-of-range-compare"
    #endif
    #ifdef HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_UNSIGNED_ZERO_COMPARE
    # pragma GCC diagnostic ignored "-Wtautological-unsigned-zero-compare"
    #endif
    if (!udev
    || bufsize < 0 || (uintmax_t)bufsize > (uintmax_t)INT_MAX
    ) {
    #if (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_PUSH_POP) && ( (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TYPE_LIMITS) || (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_CONSTANT_OUT_OF_RANGE_COMPARE) || (defined HAVE_PRAGMA_GCC_DIAGNOSTIC_IGNORED_TAUTOLOGICAL_UNSIGNED_ZERO_COMPARE) )
    # pragma GCC diagnostic pop
    #endif
    return -1;
    }
    /* NOTE: With all the fuss about word sized arguments,
    * the libusb_interrupt_transfer() lengths are about ints:
    * int LIBUSB_CALL libusb_interrupt_transfer(libusb_device_handle *dev_handle,
    * unsigned char endpoint, unsigned char *data, int length,
    * int *actual_length, unsigned int timeout);
    */
    tmpbufsize = (int)bufsize;
    /* FIXME: hardcoded interrupt EP => need to get EP descr for IF descr */
    /* ret = libusb_interrupt_transfer(udev, 0x81, buf, bufsize, &bufsize, timeout); */
    /* libusb0: ret = usb_interrupt_read(udev, USB_ENDPOINT_IN + usb_subdriver.hid_ep_in, (char *)buf, bufsize, timeout); */
    /* Interrupt EP is LIBUSB_ENDPOINT_IN with offset defined in hid_ep_in, which is 0 by default, unless overridden in subdriver. */
    ret = libusb_interrupt_transfer(udev,
    LIBUSB_ENDPOINT_IN + usb_subdriver.hid_ep_in,
    (unsigned char *)buf, tmpbufsize, &tmpbufsize, timeout);
    /* Clear stall condition */
    if (ret == LIBUSB_ERROR_PIPE) {
    ret = libusb_clear_halt(udev, 0x81);
    }
    /* In case of success, return the operation size, as done with libusb 0.1 */
    if (ret == LIBUSB_SUCCESS) {
    if (tmpbufsize < 0
    || (uintmax_t)tmpbufsize > (uintmax_t)USB_CTRL_CHARBUFSIZE_MAX
    ) {
    return -1;
    }
    ret = (usb_ctrl_charbufsize)bufsize;
    }
    return nut_libusb_strerror(ret, __func__);
    }
    calls libusb_interrupt_transfer()
  • nut/drivers/usbhid-ups.c

    Lines 1271 to 1274 in 0f6ae03

    /* ignore (broken) interrupt pipe */
    if (testvar("pollonly")) {
    use_interrupt_pipe = FALSE;
    }
    Here the pollonly setting is translated use_interrupt_pipe boolean in C code, which goes to call or not HIDGetEvents() in upsdrv_updateinfo() at

    nut/drivers/usbhid-ups.c

    Lines 1144 to 1187 in 0f6ae03

    /* Get HID notifications on Interrupt pipe first */
    if (use_interrupt_pipe == TRUE) {
    evtCount = HIDGetEvents(udev, event, MAX_EVENT_NUM);
    switch (evtCount)
    {
    case LIBUSB_ERROR_BUSY: /* Device or resource busy */
    upslog_with_errno(LOG_CRIT, "Got disconnected by another driver");
    goto fallthrough_reconnect;
    #if WITH_LIBUSB_0_1 /* limit to libusb 0.1 implementation */
    case -EPERM: /* Operation not permitted */
    #endif
    case NO_EVENTS: /* No HID Events */
    case LIBUSB_ERROR_NO_DEVICE: /* No such device */
    case LIBUSB_ERROR_ACCESS: /* Permission denied */
    #if WITH_LIBUSB_0_1 /* limit to libusb 0.1 implementation */
    case -ENXIO: /* No such device or address */
    #endif
    case LIBUSB_ERROR_NOT_FOUND: /* No such file or directory */
    case LIBUSB_ERROR_NO_MEM: /* Insufficient memory */
    fallthrough_reconnect:
    /* Uh oh, got to reconnect! */
    dstate_setinfo("driver.state", "reconnect.trying");
    hd = NULL;
    return;
    case LIBUSB_ERROR_IO: /* I/O error */
    /* Uh oh, got to reconnect, with a special suggestion! */
    dstate_setinfo("driver.state", "reconnect.trying");
    interrupt_pipe_EIO_count++;
    hd = NULL;
    return;
    default:
    upsdebugx(1, "Got %i HID objects...", (evtCount >= 0) ? evtCount : 0);
    break;
    }
    } else {
    evtCount = 0;
    upsdebugx(1, "Not using interrupt pipe...");
    }
    /* Process pending events (HID notifications on Interrupt pipe) */
    for (i = 0; i < evtCount; i++) {
    if (HIDGetDataValue(udev, event[i], &value, poll_interval) != 1)
    continue;
  • My understanding that the alternative to this is to just rely on calling hid_ups_walk() regularly, lower in the method.

@masterwishx
Copy link
Contributor Author

Not really sure, it was before my time and USB code is still not something I am able to navigate well. But just tracing the method/variable names gets us quite far along:

Thanks for hint , take me some time to get on the code :(

Note that this is bulk transfer. Is it a bulk endpoint? I was confused by the talk about "interrupt transfers" in the other ticket. What is nut_libusb_get_interrupt() supposed to do?

Maybe you can help here also to questions from @tormodvolden above ?

@jimklimov
Copy link
Member

Can't really elaborate on that authoritatively, but my understanding is that we can get device data from regular polling, which is probably "expensive" (walking all data points known to be served by the device, more so for a "Full update"), so done infrequently (pollfreq in some drivers ~ 5-30 sec and poll_interval in all via the data loop in main.c may both have an impact on timing).

So where interrupts are a possibility (and do not freeze the controller firmware, hence the toggles) they allow us to see individual events more frequently as they happen.

But that's "handwavium" - I can only hope the inherited code actually does that :)

@masterwishx
Copy link
Contributor Author

But that's "handwavium" - I can only hope the inherited code actually does that :)

Thanks for explaining, just wanted to mention again that I dont have a big problem with these connection timeout and 0 hid objects, only suspect that some commands and r/w values not working sometime becose of it :(

@tormodvolden
Copy link
Contributor

About my question, I was doubly confused. Internally, libusb wraps interrupt transfers in the bulk transfer code path, so it is perfectly normal to see "bulk transfer" in the log for interrupt transfers.

A question is how often does the device "post" a new value on its interrupt endpoint? The polling interval is defined by the device, in its descriptors. If you are polling faster than that, I can imagine it is normal to have some timeouts, but I am not sure what exactly you will see.

Please see https://libusb.sourceforge.io/api-1.0/group__libusb__syncio.html#ga0f171a72904a552fc43e6e6564d108a3 about the transferred parameter and https://libusb.sourceforge.io/api-1.0/group__libusb__asyncio.html#asynctimeout which also explains that timeouts do not always mean failure, it can be a partial data transfer. So you should check for any transferred data in this case.

@masterwishx
Copy link
Contributor Author

masterwishx commented Nov 17, 2024

So you should check for any transferred data in this case.

Thanks i will check, I can see in debug log after upsdrv updateinfo and nut_libusb_get_interrupt: connection timed out, Got 0 HID object
and libusb_get_report all current relevant paths from NUT i beleave.

Also when send by r/w value or command I got OK or Success but need to check again if it really worked and how we got OK/success in this case.

@masterwishx
Copy link
Contributor Author

just for the info by #2685 user log for Eaton 9SX have same : nut_libusb_get_interrupt: connection timed out, Got 0 HID object
but not have issues with commands ...

@masterwishx
Copy link
Contributor Author

just for the info by #2685 user log for Eaton 9SX have same : nut_libusb_get_interrupt: connection timed out, Got 0 HID object but not have issues with commands ...

Also in log I see PONG that I don't have, maybe related to OS.
@jimklimov not sure how to check if data received how @tormodvolden was said to check...

@jimklimov
Copy link
Member

I suppose "PONG" sighting can be related to having upsd running and the debug level involved: the data server occasionally PING's the drivers, if there was otherwise no data from them coming in recently, and the driver is expected to reply PONG or is considered frozen => Data stale situation.

@masterwishx
Copy link
Contributor Author

I suppose "PONG" sighting can be related to having upsd running and the debug level involved: the data server occasionally PING's the drivers, if there was otherwise no data from them coming in recently, and the driver is expected to reply PONG or is considered frozen => Data stale situation.

That is very strange because i dont have any PING or PONG in my debug logs :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants