Merge pull request #3028 from catalyst-cooperative/create-renaming-re…
…lease-notes

Add naming convention change to release notes
bendnorman authored Nov 9, 2023
2 parents c329804 + 479ec7f commit 53d5618
Showing 3 changed files with 39 additions and 2 deletions.
14 changes: 12 additions & 2 deletions docs/dev/naming_conventions.rst
@@ -29,7 +29,9 @@ names should generally follow this naming convention:
``eia860``, ``ferc1`` and ``epacems``.
* ``asset_type`` describes how the asset is modeled.
* ``asset_name`` should describe the entity, categorical code type, or measurement of
- the asset.
+ the asset. Note: FERC Form 1 assets typically include the schedule number in the
+ ``asset_name`` so users and contributors know which schedule the cleaned asset
+ refers to.

Raw layer
^^^^^^^^^
@@ -55,14 +57,22 @@ These assets are typically stored in parquet files or tables in a database.

Naming convention: ``core_{source}__{asset_type}_{asset_name}``

* ``source`` is sometimes ``pudl``. This means the asset
is a derived connection the contributors of PUDL created to connect multiple
datasets via manual or machine learning methods.

* ``asset_type`` describes how the asset is modeled and its role in PUDL’s
collection of core assets. There are a handful of table types in this layer:

* ``assn``: Association tables provide connections between entities. This data
- can be manually compiled or extracted from data sources. Examples:
+ can be manually compiled or extracted from data sources. If the asset associates
+ data from two sources, the source names should be included in the ``asset_name``
+ in alphabetical order. Examples:

* ``core_pudl__assn_plants_eia`` associates EIA Plant IDs and manually assigned
PUDL Plant IDs.
* ``core_epa__assn_epacamd_eia`` associates EPA units with EIA plants, boilers,
and generators.
* ``codes``: Code tables contain more verbose descriptions of categorical codes
typically manually compiled from source data dictionaries. Examples:

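The core-layer pattern ``core_{source}__{asset_type}_{asset_name}`` can be taken apart mechanically. The following is a minimal sketch, not code from the PUDL repository: ``AssetName`` and ``parse_core_asset_name`` are hypothetical names, and the sketch assumes that the source and the asset type never contain an underscore themselves.

```python
from typing import NamedTuple


class AssetName(NamedTuple):
    """Parts of a core-layer asset name (hypothetical helper type)."""

    layer: str
    source: str
    asset_type: str
    asset_name: str


def parse_core_asset_name(name: str) -> AssetName:
    """Split ``core_{source}__{asset_type}_{asset_name}`` into its parts.

    Assumption: neither the source nor the asset type contains an
    underscore, so the first underscore on each side of the double
    underscore is the separator.
    """
    prefix, sep, rest = name.partition("__")
    if not sep:
        raise ValueError(f"{name!r} has no double underscore separator")
    layer, _, source = prefix.partition("_")
    asset_type, _, asset_name = rest.partition("_")
    if layer != "core" or not (source and asset_type and asset_name):
        raise ValueError(f"{name!r} does not follow the core naming pattern")
    return AssetName(layer, source, asset_type, asset_name)


parsed = parse_core_asset_name("core_pudl__assn_plants_eia")
```

Here ``parsed.source`` is ``pudl`` and ``parsed.asset_type`` is ``assn``; everything after the asset type is kept together as the ``asset_name``.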
16 changes: 16 additions & 0 deletions docs/release_notes.rst
@@ -67,6 +67,22 @@ Dagster Adoption
* :mod:`pudl.convert.censusdp1tract_to_sqlite` and :mod:`pudl.output.censusdp1tract`
are now integrated into dagster. See :issue:`1973` and :pr:`2621`.

New Asset Naming Convention
^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are hundreds of new tables in ``pudl.sqlite`` now that the methods in ``PudlTabl``
have been converted to Dagster assets. This significant increase in tables and diversity
of table types prompted us to create a new naming convention to make the table names
more descriptive and organized. You can read about the new naming convention in the
:ref:`docs <asset-naming>`.

To help users migrate away from using ``PudlTabl`` and our temporary table names,
we've created a `google sheet <https://docs.google.com/spreadsheets/d/1RBuKl_xKzRSLgRM7GIZbc5zUYieWFE20cXumWuv5njo/edit?usp=sharing>`__
that maps the old table names and ``PudlTabl`` methods to the new table names.

We've added deprecation warnings to the ``PudlTabl`` class. We plan to remove
``PudlTabl`` from the ``pudl`` package once our known users have
successfully migrated to pulling data directly from ``pudl.sqlite``.
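Reading a table straight from the database, rather than through ``PudlTabl``, might look roughly like the sketch below. It uses only the standard-library ``sqlite3`` module. The in-memory database, the two column names, and the inserted row are illustrative stand-ins so that the example runs anywhere; ``list_tables`` and ``read_table`` are hypothetical helpers, not part of the ``pudl`` package.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list:
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def read_table(conn: sqlite3.Connection, table_name: str) -> list:
    """Return every row of ``table_name`` as a list of dicts."""
    # Table names cannot be bound as SQL parameters, so check the name
    # against the database's own table list before interpolating it.
    if table_name not in list_tables(conn):
        raise KeyError(f"No table named {table_name!r} in this database")
    cursor = conn.execute(f'SELECT * FROM "{table_name}"')
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in for a real pudl.sqlite file: an in-memory database holding one
# table with made-up columns and a made-up row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_pudl__assn_plants_eia "
    "(plant_id_eia INTEGER, plant_id_pudl INTEGER)"
)
conn.execute("INSERT INTO core_pudl__assn_plants_eia VALUES (3, 32)")

rows = read_table(conn, "core_pudl__assn_plants_eia")
```

Against a real database, the connection would instead be opened on the local path of the ``pudl.sqlite`` file, and the table name would come from the new naming convention.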

Data Coverage
^^^^^^^^^^^^^

11 changes: 11 additions & 0 deletions src/pudl/output/pudltabl.py
@@ -89,6 +89,11 @@ def __init__(
            unit_ids: If True, use several heuristics to assign
                individual generators to functional units. EXPERIMENTAL.
        """
        logger.warning(
            "PudlTabl is deprecated and will be removed from the pudl package "
            "once known users have migrated to accessing the data directly from "
            "pudl.sqlite. "
        )
        if not isinstance(pudl_engine, sa.engine.base.Engine):
            raise TypeError(
                "PudlTabl needs pudl_engine to be a SQLAlchemy Engine, but we "
@@ -296,6 +301,12 @@ def _get_table_from_db(
                "It is retained for backwards compatibility only."
            )
        table_name = self._agg_table_name(table_name)
        logger.warning(
            "PudlTabl is deprecated and will be removed from the pudl package "
            "once known users have migrated to accessing the data directly from "
            "pudl.sqlite. To access the data returned by this method, "
            f"use the {table_name} table in the pudl.sqlite database."
        )
        resource = Resource.from_id(table_name)
        return pd.concat(
            [
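The two hunks above follow one pattern: log a deprecation message when the object is constructed, then log it again on table access with the name of the table to use instead. A self-contained sketch of that pattern is shown here. ``LegacyTabl``, ``ListHandler``, and the logger name are hypothetical stand-ins, not the actual ``PudlTabl`` implementation.

```python
import logging

logger = logging.getLogger("deprecation_sketch")

DEPRECATION_MSG = (
    "PudlTabl is deprecated and will be removed from the pudl package "
    "once known users have migrated to accessing the data directly from "
    "pudl.sqlite."
)


class LegacyTabl:
    """Hypothetical stand-in for a deprecated accessor class."""

    def __init__(self) -> None:
        # Warn once per instance, at construction time.
        logger.warning(DEPRECATION_MSG)

    def get_table(self, table_name: str) -> str:
        # Warn again on access, naming the table to query instead.
        logger.warning(
            "%s To access the data returned by this method, use the %s "
            "table in the pudl.sqlite database.",
            DEPRECATION_MSG,
            table_name,
        )
        return table_name


class ListHandler(logging.Handler):
    """Collect log messages in a list so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
LegacyTabl().get_table("core_pudl__assn_plants_eia")
```

After this runs, ``handler.messages`` holds one message from construction and one from the table access, and only the second names the replacement table.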
