[BUG] Could not retrieve device information #538

Open
arthurcruz1 opened this issue Nov 1, 2023 · 12 comments
Labels: bug (Something isn't working), enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments


arthurcruz1 commented Nov 1, 2023

The drive is not shown in the web UI.

docker compose exec scrutiny /opt/scrutiny/bin/scrutiny-collector-metrics run

Error:
Could not retrieve device information for sda: exit status 2  type=metrics

smartctl --info --json --device sat /dev/sda type=metrics

smartctl --info --json --device sat /dev/sda  type=metrics
{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      3
    ],
    "svn_revision": "5338",
    "platform_info": "aarch64-linux-5.15.0-1042-raspi",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "--info",
      "--json",
      "--device",
      "sat",
      "/dev/sda",
      "type=metrics"
    ],
    "messages": [
      {
        "string": "ERROR: smartctl takes ONE device name as the final command-line argument.",
        "severity": "error"
      }
    ],
    "exit_status": 1
  }
}

docker logs scrutiny

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                                dev-0.7.2

time="2023-11-01T14:38:46Z" level=info msg="Verifying required tools" type=metrics
time="2023-11-01T14:38:46Z" level=info msg="Executing command: smartctl --scan --json" type=metrics
time="2023-11-01T14:38:46Z" level=info msg="Executing command: smartctl --info --json --device sat /dev/sda" type=metrics
time="2023-11-01T14:38:46Z" level=error msg="Could not retrieve device information for sda: exit status 2" type=metrics
time="2023-11-01T14:38:46Z" level=info msg="Sending detected devices to API, for filtering & validation" type=metrics
time="2023-11-01T14:38:46Z" level=info msg="127.0.0.1 - 8848b38a722f [01/Nov/2023:14:38:46 +0000] \"POST /api/devices/register\" 200 40 \"\" \"Go-http-client/1.1\" (1ms)" clientIP=127.0.0.1 hostname=8848b38a722f latency=1 method=POST path=/api/devices/register referer= respLength=40 statusCode=200 type=web userAgent=Go-http-client/1.1
time="2023-11-01T14:38:46Z" level=info msg="Main: Completed" type=metrics

smartctl --scan

/dev/sda -d sntasmedia # /dev/sda [USB NVMe ASMedia], NVMe device

Using omnibus image

arthurcruz1 added the bug label on Nov 1, 2023

vincejv commented Nov 1, 2023

@arthurcruz1

--cap-add SYS_RAWIO is necessary to allow smartctl permission to query your device SMART data
NOTE: If you have NVMe drives, you must add --cap-add SYS_ADMIN as well. See issue #26 (comment)

Author

arthurcruz1 commented Nov 1, 2023

> @arthurcruz1
>
> --cap-add SYS_RAWIO is necessary to allow smartctl permission to query your device SMART data
> NOTE: If you have NVMe drives, you must add --cap-add SYS_ADMIN as well. See issue #26 (comment)

@vincejv The issue still occurs with SYS_ADMIN added to the docker-compose file:

version: '3.5'

services:
  scrutiny:
    container_name: scrutiny
    image: ghcr.io/analogj/scrutiny:master-omnibus
    cap_add:
      - SYS_RAWIO
      - SYS_ADMIN
    ports:
      - "8082:8080" # webapp
      - "8086:8086" # influxDB admin
    volumes:
      - /run/udev:/run/udev:ro
      - ./config:/opt/scrutiny/config
      - ./influxdb:/opt/scrutiny/influxdb
    devices:
      - "/dev/sda"
    network_mode: bridge
    restart: unless-stopped

Running the container with --privileged returns the same error.

Author

arthurcruz1 commented Nov 1, 2023

I also read #344 (comment), but it does not resolve the issue for me. Thank you for your help.


vincejv commented Nov 1, 2023

For debugging purposes, try running this command instead, with the trailing type=metrics removed: smartctl --info --json --device sat /dev/sda
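
The trailing type=metrics in the collector output appears to be a structured-log field rather than part of the smartctl command line, which would explain why smartctl complained that it takes ONE device name. A minimal Python sketch for pulling the exact command out of a collector log line, assuming only the two log layouts shown in this thread; extract_command is a hypothetical helper name, not part of Scrutiny:

```python
import re

# Layout 1 (docker logs): ... msg="Executing command: <cmd>" type=metrics
_QUOTED = re.compile(r'msg="Executing command: (?P<cmd>[^"]+)"')
# Layout 2 (interactive run): INFO[0000] Executing command: <cmd>  type=metrics
_PLAIN = re.compile(r'Executing command: (?P<cmd>.+?)\s+type=\S+\s*$')


def extract_command(log_line):
    """Return the command from a collector log line, or None if absent."""
    for pattern in (_QUOTED, _PLAIN):
        match = pattern.search(log_line)
        if match:
            return match.group("cmd").strip()
    return None


if __name__ == "__main__":
    line = ('INFO[0000] Executing command: smartctl --info --json '
            '--device sat /dev/sda  type=metrics')
    print(extract_command(line))
```

The returned string can be pasted into a shell as-is, without the log field that caused the argument error.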

@arthurcruz1
Author

sudo smartctl --info --json --device sntasmedia /dev/sda
{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      4
    ],
    "pre_release": false,
    "svn_revision": "5530",
    "platform_info": "aarch64-linux-5.15.0-1042-raspi",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "--info",
      "--json",
      "--device",
      "sntasmedia",
      "/dev/sda"
    ],
    "exit_status": 0
  },
  "local_time": {
    "time_t": 1698900025,
    "asctime": "Thu Nov  2 00:40:25 2023 EDT"
  },
  "device": {
    "name": "/dev/sda",
    "info_name": "/dev/sda [USB NVMe ASMedia]",
    "type": "sntasmedia",
    "protocol": "NVMe"
  },
  "model_name": "WD Blue SN570 1TB",
  "serial_number": "23124D800392",
  "firmware_version": "234110WD",
  "nvme_pci_vendor": {
    "id": 5559,
    "subsystem_id": 5559
  },
  "nvme_ieee_oui_identifier": 6980,
  "nvme_total_capacity": 1000204886016,
  "nvme_unallocated_capacity": 0,
  "nvme_controller_id": 0,
  "nvme_version": {
    "string": "1.4",
    "value": 66560
  },
  "nvme_number_of_namespaces": 1,
  "nvme_namespaces": [
    {
      "id": 1,
      "size": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "capacity": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "utilization": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "formatted_lba_size": 512,
      "eui64": {
        "oui": 6980,
        "ext_id": 598312908214
      }
    }
  ],
  "user_capacity": {
    "blocks": 1953525168,
    "bytes": 1000204886016
  },
  "logical_block_size": 512,
  "smart_support": {
    "available": true,
    "enabled": true
  }
}

Owner

AnalogJ commented Nov 17, 2023

Are you running those smartctl commands inside the container or on your host?

Author

arthurcruz1 commented Nov 18, 2023

@AnalogJ I ran the commands on my host. This is the output inside the container:

root@3fbcca32c0c0:/opt/scrutiny# smartctl --info --json --device sntasmedia /dev/sda
{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      2
    ],
    "svn_revision": "5155",
    "platform_info": "aarch64-linux-5.15.0-1042-raspi",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "--info",
      "--json",
      "--device",
      "sntasmedia",
      "/dev/sda"
    ],
    "messages": [
      {
        "string": "/dev/sda: Unknown SNT device type 'sntasmedia'",
        "severity": "error"
      },
      {
        "string": "=======> VALID ARGUMENTS ARE: ata, scsi[+TYPE], nvme[,NSID], sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N], usbprolific, usbsunplus, sntjmicron[,NSID], sntrealtek, intelliprop,N[+TYPE], jmb39x[-q],N[,sLBA][,force][+TYPE], jms56x,N[,sLBA][,force][+TYPE], marvell, areca,N/E, 3ware,N, hpt,L/M/N, megaraid,N, aacraid,H,L,ID, cciss,N, auto, test <=======",
        "severity": "error"
      }
    ],
    "exit_status": 1
  }
}

I noticed that when I run it inside the container, the smartctl version is 7.2, which does not support the 'sntasmedia' device type. On the host it's 7.3. I'm not sure why that is.
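
The version comparison can be automated from the same JSON output. A minimal Python sketch, assuming (as reported in this thread) that 'sntasmedia' is accepted by smartctl 7.3 and rejected by 7.2; the function names and the MIN_SNTASMEDIA_VERSION constant are hypothetical:

```python
import json

# Assumption taken from this thread: 7.2 rejects 'sntasmedia', 7.3 accepts it.
MIN_SNTASMEDIA_VERSION = (7, 3)


def smartctl_version(raw_json):
    """Return the smartctl version as a tuple, e.g. (7, 2)."""
    doc = json.loads(raw_json)
    return tuple(doc["smartctl"]["version"])


def supports_sntasmedia(raw_json):
    """True if the reporting smartctl is at least MIN_SNTASMEDIA_VERSION."""
    return smartctl_version(raw_json) >= MIN_SNTASMEDIA_VERSION


if __name__ == "__main__":
    # Trimmed-down stand-in for the container output shown above.
    container_output = '{"smartctl": {"version": [7, 2], "exit_status": 1}}'
    print(smartctl_version(container_output))
    print(supports_sntasmedia(container_output))
```

Feeding it the output of the same smartctl --json command from both the host and the container makes the mismatch visible without reading the full dumps.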

Owner

AnalogJ commented Nov 18, 2023

hey @arthurcruz1
Scrutiny is basically just a web UI for smartctl. If smartctl is failing within the container, you're going to be missing data or seeing errors in the UI as well. The Scrutiny container comes with its own version of smartctl pre-installed, which is why the versions are different.

Usually I'd say that it's unlikely the error was caused by a version difference, since smartctl is pretty bulletproof, but I just found this with a quick google:

You might need to update your device's firmware?


iball commented Dec 14, 2023

I'm seeing the same thing with the same device. It seems collector-metrics runs smartctl with the "sat" device type, which fails; however, replacing "sat" with "sntasmedia" or "scsi" works just fine.

And yes, my device is on the latest firmware from the manufacturer.

root@e65834db6996:/opt/scrutiny# scrutiny-collector-metrics run
2023/12/14 23:50:03 No configuration file found at /opt/scrutiny/config/collector.yaml. Using Defaults.

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                                dev-0.7.2

INFO[0000] Verifying required tools                      type=metrics
INFO[0000] Executing command: smartctl --scan --json     type=metrics
INFO[0000] Executing command: smartctl --info --json --device sat /dev/sda  type=metrics
ERRO[0000] Could not retrieve device information for sda: exit status 2  type=metrics
INFO[0000] Sending detected devices to API, for filtering & validation  type=metrics
INFO[0000] Main: Completed                               type=metrics

RUNNING sntasmedia switch:

root@Argon:~# smartctl --info --json --device sntasmedia /dev/sda

{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      3
    ],
    "svn_revision": "5338",
    "platform_info": "aarch64-linux-6.1.21-v8+",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "--info",
      "--json",
      "--device",
      "sntasmedia",
      "/dev/sda"
    ],
    "exit_status": 0
  },
  "local_time": {
    "time_t": 1702598057,
    "asctime": "Thu Dec 14 23:54:17 2023 GMT"
  },
  "device": {
    "name": "/dev/sda",
    "info_name": "/dev/sda [USB NVMe ASMedia]",
    "type": "sntasmedia",
    "protocol": "NVMe"
  },
  "model_name": "PCIe SSD",
  "serial_number": "23030910000351",
  "firmware_version": "ELFM00.3",
  "nvme_pci_vendor": {
    "id": 6535,
    "subsystem_id": 6535
  },
  "nvme_ieee_oui_identifier": 6584743,
  "nvme_total_capacity": 1000204886016,
  "nvme_unallocated_capacity": 0,
  "nvme_controller_id": 0,
  "nvme_version": {
    "string": "1.4",
    "value": 66560
  },
  "nvme_number_of_namespaces": 1,
  "nvme_namespaces": [
    {
      "id": 1,
      "size": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "capacity": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "utilization": {
        "blocks": 1953525168,
        "bytes": 1000204886016
      },
      "formatted_lba_size": 512,
      "eui64": {
        "oui": 6584743,
        "ext_id": 510465672036
      }
    }
  ],
  "user_capacity": {
    "blocks": 1953525168,
    "bytes": 1000204886016
  },
  "logical_block_size": 512,
  "smart_support": {
    "available": true,
    "enabled": true
  }
}

RUNNING SAT SWITCH:

root@Argon:~# smartctl --info --json --device sat /dev/sda

{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      3
    ],
    "svn_revision": "5338",
    "platform_info": "aarch64-linux-6.1.21-v8+",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "--info",
      "--json",
      "--device",
      "sat",
      "/dev/sda"
    ],
    "exit_status": 2
  },
  "local_time": {
    "time_t": 1702598090,
    "asctime": "Thu Dec 14 23:54:50 2023 GMT"
  },
  "device": {
    "name": "/dev/sda",
    "info_name": "/dev/sda [SAT]",
    "type": "sat",
    "protocol": "ATA"
  }
}
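
smartctl's exit status is a bitmask, so the "exit status 2" in the collector log carries a specific meaning. A minimal Python sketch for decoding it, with bit meanings paraphrased from the EXIT STATUS section of the smartctl(8) man page; EXIT_BITS and decode_exit_status are hypothetical names used only for illustration:

```python
# Bit meanings paraphrased from the smartctl(8) man page, EXIT STATUS section.
EXIT_BITS = {
    0: "command line did not parse",
    1: ("device open failed, device did not return an IDENTIFY DEVICE "
        "structure, or device is in a low-power mode"),
    2: ("a SMART or other ATA command to the disk failed, or a SMART data "
        "structure had a checksum error"),
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "some attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the description of every bit set in a smartctl exit status."""
    return [EXIT_BITS[bit] for bit in range(8) if status & (1 << bit)]


if __name__ == "__main__":
    for status in (1, 2):
        print(status, decode_exit_status(status))
```

Status 2 sets only bit 1, which fits the sat run above returning no model or serial number; status 1 (bit 0) matches the earlier runs in this thread where smartctl rejected the command line itself.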


iball commented Dec 15, 2023

OK, running smartctl --scan on the server itself results in this:

root@Argon:~# smartctl --scan
/dev/sda -d sntasmedia # /dev/sda [USB NVMe ASMedia], NVMe device

Whereas running the same command inside the docker container results in this:

root@e65834db6996:/opt/scrutiny# smartctl --scan
/dev/sda -d sat # /dev/sda [SAT], ATA device
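
The two scan results differ only in the detected device type. A minimal Python sketch for comparing them, assuming the plain-text `smartctl --scan` line layout shown above (`<device> -d <type> # <comment>`); parse_scan and type_mismatches are hypothetical helper names:

```python
import re

# Matches lines such as: /dev/sda -d sat # /dev/sda [SAT], ATA device
SCAN_LINE = re.compile(r"^(?P<device>\S+)\s+-d\s+(?P<type>\S+)\s+#")


def parse_scan(output):
    """Map each device path to the type that smartctl --scan reported."""
    detected = {}
    for line in output.splitlines():
        match = SCAN_LINE.match(line.strip())
        if match:
            detected[match.group("device")] = match.group("type")
    return detected


def type_mismatches(host_output, container_output):
    """Return {device: (host_type, container_type)} where the types differ."""
    host = parse_scan(host_output)
    container = parse_scan(container_output)
    return {
        dev: (host[dev], container[dev])
        for dev in host.keys() & container.keys()
        if host[dev] != container[dev]
    }


if __name__ == "__main__":
    host = "/dev/sda -d sntasmedia # /dev/sda [USB NVMe ASMedia], NVMe device"
    container = "/dev/sda -d sat # /dev/sda [SAT], ATA device"
    print(type_mismatches(host, container))
```

Any device listed in the result is one where the collector inside the container would be handed a different device type than the host detects.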


iball commented Dec 15, 2023

Figured out the problem. The Scrutiny Docker image ships smartmontools 7.2 (dated 2020-12-30), whereas the version installed on my server is 7.3 (dated 2022-02-28).

My server runs Debian Bookworm, and the Scrutiny container is built on Debian Bullseye.

The older 7.2 version of smartmontools does not support sntasmedia-controlled drives.

I believe the "fix" is to upgrade smartmontools inside the container to 7.3. I'm not sure how feasible that is: the Scrutiny Docker image is based on Debian Bullseye, and when I try to upgrade from inside the container it reports that smartmontools is already at the latest version. It may be time to re-base the image on Debian Bookworm?


iball commented Dec 15, 2023

After further testing, the cause is indeed that smartmontools 7.2 does not support sntasmedia.
I upgraded smartmontools inside the container itself and now it works properly. Of course, that change will be lost when the container is recreated, so @AnalogJ might want to think about re-basing the Scrutiny container image on Debian Bookworm.

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                                dev-0.7.2

INFO[0000] Verifying required tools                      type=metrics
INFO[0000] Executing command: smartctl --scan --json     type=metrics
INFO[0000] Executing command: smartctl --info --json --device sntasmedia /dev/sda  type=metrics
INFO[0000] Using WWN Fallback                            type=metrics
INFO[0000] Sending detected devices to API, for filtering & validation  type=metrics
INFO[0000] Collecting smartctl results for sda           type=metrics
INFO[0000] Executing command: smartctl --xall --json --device sntasmedia /dev/sda  type=metrics
INFO[0000] Publishing smartctl results for 23030910000351  type=metrics
INFO[0000] Main: Completed                               type=metrics


AnalogJ added the enhancement, help wanted, and good first issue labels and removed the waiting for response label on Jan 23, 2024