Second half of 2022 missing from fuel_receipts_costs_aggs_eia table #2956

Closed
arengel opened this issue Oct 19, 2023 · 4 comments
Labels
bug Things that are just plain broken.

Comments

@arengel
Collaborator

arengel commented Oct 19, 2023

Describe the bug

In the fuel_receipts_costs_aggs_eia table of pudl.sqlite, at least as of Oct 18, the 2022 data is incomplete for the 'quarterly' and 'monthly' temporal aggregations and missing entirely for 'annual'.

Bug Severity

How badly is this bug affecting you?

  • High: This bug is preventing me from using PUDL for 2022 fuel costs.

To Reproduce

Steps to reproduce the behavior -- ideally including a code snippet that causes the error to appear.

import pandas as pd
import sqlalchemy as sa

pd.read_sql_table(
    "fuel_receipts_costs_aggs_eia",
    sa.create_engine("sqlite:////.../pudl.sqlite").connect(),
).groupby("temporal_agg").report_date.max()
temporal_agg
annual      2021-01-01
monthly     2022-06-01
quarterly   2022-04-01
Name: report_date, dtype: datetime64[ns]
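
To see exactly which 2022 periods are absent rather than only the latest date per aggregation, a check along the following lines may help. This is a minimal sketch using only the standard library. The helper name `missing_periods` and the toy in-memory table are hypothetical, and it assumes `report_date` is stored as ISO-formatted text whose first ten characters are the date and that each period is labeled by its first day, as in the output above.

```python
import sqlite3

# First month of each expected period, per aggregation level.
PERIOD_START_MONTHS = {
    "annual": [1],
    "quarterly": [1, 4, 7, 10],
    "monthly": list(range(1, 13)),
}


def missing_periods(conn, table, year):
    """Return {temporal_agg: [missing period start dates]} for one year."""
    # A table name cannot be bound as a query parameter, so it is
    # interpolated here; only pass trusted table names.
    rows = conn.execute(
        f'SELECT DISTINCT temporal_agg, substr(report_date, 1, 10) FROM "{table}"'
    ).fetchall()
    present = set(rows)
    missing = {}
    for agg, months in PERIOD_START_MONTHS.items():
        expected = [f"{year}-{m:02d}-01" for m in months]
        missing[agg] = [d for d in expected if (agg, d) not in present]
    return missing


# Toy in-memory stand-in for pudl.sqlite (assumption: not real PUDL data),
# built to mirror the per-aggregation maxima reported above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fuel_receipts_costs_aggs_eia (temporal_agg TEXT, report_date TEXT)"
)
toy_rows = (
    [("annual", "2021-01-01")]
    + [("quarterly", d) for d in ("2022-01-01", "2022-04-01")]
    + [("monthly", f"2022-{m:02d}-01") for m in range(1, 7)]
)
conn.executemany("INSERT INTO fuel_receipts_costs_aggs_eia VALUES (?, ?)", toy_rows)

gaps = missing_periods(conn, "fuel_receipts_costs_aggs_eia", 2022)
```

Against a real database, the same function could be called with `sqlite3.connect(...)` pointed at the local pudl.sqlite file; a non-empty list for any aggregation indicates periods with no rows at all.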

Expected behavior

A clear and concise description of what you expected to happen, or what you expected the data to look like.

Software Environment?

  • Operating System. MacOS 13.5.2
  • Python version and distribution '3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:33:12) [Clang 15.0.7 ]'
  • How did you install PUDL?
    • PUDL not installed, reading from pudl.sqlite

@UdayVaradarajan, adding you here for visibility or to add any info.

@arengel arengel added the bug Things that are just plain broken. label Oct 19, 2023
@zaneselvans
Member

zaneselvans commented Oct 19, 2023

@arengel in previous outputs was the data available through the end of 2022? The input archive looks like it's from February, 2023, and I think EIA updates that bulk data output frequently, so I would imagine it should have been. But just wanted to clarify if this was a regression, or a newly discovered deficiency.

Looking at an older pudl.sqlite I have lying around from 2023-09-21, I get the same result as above.

@arengel
Collaborator Author

arengel commented Oct 19, 2023

It's a newly discovered deficiency as far as I know, found while updating our input datasets to include 2022.

@zaneselvans
Member

Okay, good to know.

Unfortunately Zenodo's migration to a new backend at the end of last week has temporarily hosed our archiving infrastructure. @e-belfer is updating it to work with the new API over in this PR. As soon as that's fixed we can make a new bulk electricity data archive and get the most recent data in there.

@e-belfer e-belfer moved this from New to Backlog in Catalyst Megaproject Oct 23, 2023
@e-belfer
Member

e-belfer commented Aug 2, 2024

Doing some old issue cleanup! We've updated the bulk electricity data 3-4 times since this issue was made, and it looks to me like there is 2022 data in the core_eia__yearly_fuel_receipts_costs_aggs table. So I'm closing this issue, but let me know if it has not been addressed as expected!

@e-belfer e-belfer closed this as completed Aug 2, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in Catalyst Megaproject Aug 2, 2024