Bug - Number summaries need to handle missing values better #9

Open
2 tasks
a-l-holmes opened this issue Dec 19, 2024 · 0 comments

@a-l-holmes (Collaborator)

Describe the bug

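In summarize_csv, if a numeric column contains missing values (e.g., NA, ., nan, or empty cells), some of the summary statistics are not calculated correctly: in the example below, Mean, Median, Q1, and Q3 come out as .nan, while Min, Max, and the counts are still reported.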

To Reproduce

Steps to reproduce the behavior:

  1. Run summarize_csv on a data file that includes empty cells (here, teamds_conditions_harmonized_2024-12-17.csv). Example command: summarize_csv -dd data/input/condition_INCLUDE_copy.csv -dt data/input/teamds_conditions_harmonized_2024-12-17.csv -m "" -e data/output/TEAM-DS_conditions_summary_2024_12_18.yaml
  2. Example output with missing values:
Age At Condition or Measure Observation:
  Max: 6209.0
  Mean: .nan
  Median: .nan
  Min: 2192.0
  Q1: .nan
  Q3: .nan
  Total Count of Observations: 2990
  Total Missing Values(None): 2654

Expected behavior

Numeric summaries should drop missing values (empty cells) before computing, so that every statistic (Mean, Median, Q1, Q3, Min, Max) is calculated from the numbers that are actually present.
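
A minimal sketch of that behavior, not the actual summarize_csv.py code: the helper names to_number and summarize_numeric are hypothetical, and the sketch assumes that any cell that cannot be parsed as a number (empty string, NA, ., None) or that parses to NaN should count as missing. It also assumes the total count includes the missing cells, which the issue does not confirm.

```python
import math
import statistics


def to_number(cell):
    """Return the cell as a float, or None if it is missing or not numeric."""
    try:
        value = float(cell)
    except (TypeError, ValueError):
        # None, "", "NA", "." and other non-numeric text land here.
        return None
    # The string "nan" parses to a float NaN, so filter it explicitly.
    return None if math.isnan(value) else value


def summarize_numeric(cells):
    """Summarize a column after dropping missing values (hypothetical helper)."""
    values = [v for v in (to_number(c) for c in cells) if v is not None]
    summary = {
        "Total Count of Observations": len(cells),
        "Total Missing Values(None)": len(cells) - len(values),
    }
    if values:
        summary["Min"] = min(values)
        summary["Max"] = max(values)
        summary["Mean"] = statistics.mean(values)
        summary["Median"] = statistics.median(values)
    if len(values) >= 2:
        # statistics.quantiles needs at least two data points on Python 3.12.
        q1, _, q3 = statistics.quantiles(values, n=4)
        summary["Q1"] = q1
        summary["Q3"] = q3
    return summary


example = summarize_numeric(["1", "", "NA", "3", ".", "nan", "2", None, "4"])
print(example)
```

With this approach, a column that is entirely missing simply omits the statistics instead of raising or emitting .nan; whether that is the desired output is a design choice for the maintainers.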

Desktop:

  • OS: Windows
  • Python Version: 3.12.2

Additional context

The calls to statistics.quantiles, statistics.mean, min, and max in the summarize_csv.py script need review for how they handle missing values.
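
One possible explanation for the pattern in the output, offered as a sketch rather than a diagnosis: if empty cells reach these functions as float NaN (an assumption, the issue does not show how the script parses them), statistics.mean propagates the NaN, while min and max can still return ordinary numbers because every comparison against NaN is False. The sample values below are made up for illustration.

```python
import math
import statistics

nan = float("nan")

# Made-up sample with one missing value in the middle.
values = [2192.0, nan, 6209.0]

# statistics.mean propagates NaN through the sum.
mean_result = statistics.mean(values)
print(math.isnan(mean_result))  # True

# min and max keep their current candidate whenever a comparison with NaN
# is False, so here they skip over the NaN and look correct.
min_result = min(values)
max_result = max(values)
print(min_result, max_result)  # 2192.0 6209.0

# The result depends on position: a leading NaN is never displaced.
nan_first_min = min([nan, 2192.0, 6209.0])
print(math.isnan(nan_first_min))  # True
```

Because min and max are order-dependent in the presence of NaN, a correct-looking Min or Max in the current output should not be taken as proof that those calls are safe.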

@a-l-holmes a-l-holmes added the bug Something isn't working label Dec 19, 2024
@a-l-holmes a-l-holmes self-assigned this Dec 19, 2024