Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add docstrings for Identifier, Properties, RecursiveDict #1530

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rodrigc
Copy link
Contributor

@rodrigc rodrigc commented Jan 16, 2025

This allows mkdocstrings to generate hyperlinks
to these types in the documentation.

Fixes: #1529

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.
Looks like there's a linter issue in CI, you can fix it by running make lint locally.

Also, it would be helpful to see the rendered doc page for these fields along with showing that it is hyperlinked in create_table_if_not_exists

pyiceberg/typedef.py Outdated Show resolved Hide resolved
@rodrigc rodrigc force-pushed the add-typedef-docstrings branch from f3bd4a6 to f21a448 Compare January 17, 2025 00:52
@rodrigc
Copy link
Contributor Author

rodrigc commented Jan 17, 2025

The issue with the linter is that it does not allow docstrings to appear after the variable.

For this code:

Identifier = Tuple[str, ...]
"""A tuple of strings representing a table identifier.

Each string in the tuple represents a part of the table's unique path. For example,
a table in a namespace might be identified as:

    ("namespace", "table_name")

Examples:
    >>> identifier: Identifier = ("namespace", "table_name")
"""

Properties = Dict[str, Any]
"""A dictionary type for properties in PyIceberg."""


RecursiveDict = Dict[str, Union[str, "RecursiveDict"]]
"""A recursive dictionary type for nested structures in PyIceberg."""

The linter fails with:

- hook id: check-docstring-first
- exit code: 1

pyiceberg/typedef.py:77: Module docstring appears after code (code seen on line 17).

If I change to:

"""A tuple of strings representing a table identifier.

Each string in the tuple represents a part of the table's unique path. For example,
a table in a namespace might be identified as:

    ("namespace", "table_name")

Examples:
    >>> identifier: Identifier = ("namespace", "table_name")
"""
Identifier = Tuple[str, ...]

"""A dictionary type for properties in PyIceberg."""
Properties = Dict[str, Any]


"""A recursive dictionary type for nested structures in PyIceberg."""
RecursiveDict = Dict[str, Union[str, "RecursiveDict"]]

the linter fails with:

check docstring is first.................................................Failed
- hook id: check-docstring-first
- exit code: 1

pyiceberg/typedef.py:76: Module docstring appears after code (code seen on line 17).

Any ideas as to the best way forward?

@pawamoy
Copy link

pawamoy commented Jan 17, 2025

Docstrings must definitely be written below attribute assignments for mkdocstrings to pick them up. I'd recommend disabling this lint rule as it apparently doesn't support code with several docstrings at the module-level (which is something Python itself happily deals with, using only the first docstring as the module's __doc__ attribute).

@rodrigc
Copy link
Contributor Author

rodrigc commented Jan 17, 2025

I see what you are saying. It looks like this issue has come up before:
pre-commit/pre-commit-hooks#159

This project chose to remove the check-docstring-first pre-commit hook:
tiacsys/verylittlewire@07b2cd2

Do you suggest that pyiceberg do the same?
Should that sort of change be done in a separate PR, or can it be combined with this one?

@rodrigc
Copy link
Contributor Author

rodrigc commented Jan 17, 2025

@pawamoy I submitted a separate PR: #1531 to remove check-docstring-first

@pawamoy
Copy link

pawamoy commented Jan 17, 2025

@rodrigc do note that I'm just a passer-by! I have no authority here 😄 I'm the maintainer of mkdocstrings and found this issue while stalking a bit, just wanted to help 🙂 I understand it's not great to just drop the lint rule (because it can be helpful), but I'm not sure if there's a better rule out there 🤷 I don't use pre-commit (the tool) myself.

@kevinjqliu
Copy link
Contributor

Looks like #1531 is merged, could you rebase this PR?

This allows mkdocstrings to generate hyperlinks
to these types in the documentation.

Fixes: apache#1529
@rodrigc rodrigc force-pushed the add-typedef-docstrings branch from f21a448 to 62f352f Compare January 17, 2025 17:26
@rodrigc
Copy link
Contributor Author

rodrigc commented Jan 17, 2025

I rebased this PR. Do you have any further suggestions for improving the docstrings for Properties and RecursiveDict?

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Properties and RecursiveDict are pretty generic types so the current docstring looks good to me

@kevinjqliu
Copy link
Contributor

Could you add some screenshots to the PR description showing the hyperlink?

@rodrigc
Copy link
Contributor Author

rodrigc commented Jan 17, 2025

a
b
c
d
e

@kevinjqliu
Copy link
Contributor

Amazing, thank you!

@kevinjqliu kevinjqliu requested a review from Fokko January 17, 2025 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Typedefs such as Identifier, Properties, and RecursiveDict are not hyperlinked in the generated documentation
3 participants