General Trilinos
maintenance: useless spaces are making up to 2 MB
#12398
Comments
This will replace any line that is only spaces with an empty line:
This will remove any trailing spaces on a line (including a line that is only spaces, note missing
e.g. for Tpetra:
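For reference, the two clean-ups described above can be sketched in Python. This is only an illustration under simple assumptions (plain ASCII spaces, text already loaded as a string); the helper names `blank_space_only_lines` and `strip_trailing_spaces` are made up here and are not part of Trilinos.

```python
import re

# Hypothetical helpers, for illustration only.
# With re.MULTILINE, "^" and "$" match at the start and end of every line.
_SPACE_ONLY_LINE = re.compile(r"^ +$", flags=re.MULTILINE)
_TRAILING_SPACES = re.compile(r" +$", flags=re.MULTILINE)


def blank_space_only_lines(text: str) -> str:
    """Replace every line made only of spaces with an empty line."""
    return _SPACE_ONLY_LINE.sub("", text)


def strip_trailing_spaces(text: str) -> str:
    """Remove trailing spaces on every line, space-only lines included."""
    return _TRAILING_SPACES.sub("", text)


sample = "int a;  \n   \n  int b;\n"
print(repr(blank_space_only_lines(sample)))
print(repr(strip_trailing_spaces(sample)))
```

The first helper leaves `int a;  ` untouched because that line is not made only of spaces; the second one cleans both lines while keeping leading indentation.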
@romintomasetti @cwpearson I have done some work related to this previously in MueLu with #11258 using a
Developing such a GitHub action is still a work in progress, and it covers only one package. Enforcing it across all of Trilinos is a different story entirely and would require mass adoption from developers and leadership.
Hi @cwpearson and @GrahamBenHarper ! Thanks for sharing your thoughts. I think there are 2 points w.r.t. trailing spaces:
Based on these 2 points, I would propose that a GitHub action be set up to ban new trailing spaces in any Trilinos package. See below for a simple script that does that. This means:
@GrahamBenHarper Enforcing Here is a quickly drafted script. I used Python instead of any bash-based solution because of the additional flexibility, for instance to filter files you want to ignore. I could run in

import argparse
import logging
import os
import pathlib
import typing

import git
import typeguard


@typeguard.typechecked
def parse_args() -> argparse.Namespace:
    """Parse CLI arguments."""
    parser = argparse.ArgumentParser()

    parser.add_argument("--directory", help = "Directory wherein checks are performed (recursively)", required = False, default = os.getcwd())

    group = parser.add_argument_group(title = 'Tracked changed only', description = "Only look at Git tracked files that changed between 2 commits.")
    group.add_argument("--tracked-changed-only", action = "store_true", help = "Enable looking at tracked changed files only.")
    group.add_argument("--commit-start", type = str, required = False, help = "If only tracked changed files must be considered, this is the start commit.")
    group.add_argument("--commit-end"  , type = str, required = False, help = "If only tracked changed files must be considered, this is the end commit.")

    args = parser.parse_args()

    if args.tracked_changed_only and (args.commit_start is None or args.commit_end is None):
        parser.error("Start and end commits are required.")

    return args


@typeguard.typechecked
def files_to_look_at(args : argparse.Namespace) -> typing.Iterator[pathlib.Path]:
    """
    Generator that either:
    - only keeps files that are tracked and modified between 2 Git commits
    - keeps all tracked files
    """
    logging.info(f"Listing tracked files to look at based on {args}.")

    repo = git.Repo(path = args.directory)

    # Git reports paths relative to the repository root, not to the current working directory.
    root = pathlib.Path(args.directory)

    if args.tracked_changed_only:
        commit_start = repo.commit(args.commit_start)
        commit_end   = repo.commit(args.commit_end)
        changed_files = repo.git.diff('--name-only', commit_start, commit_end).splitlines()
        logging.debug(
            "Filtering any file matching generator with files changed between "
            + f"'{args.commit_start}' ({commit_start}) and '{args.commit_end}' ({commit_end}):\n"
            + str(changed_files)
        )
        for changed in changed_files:
            path = root / changed
            # A file deleted between the two commits is listed by the diff but cannot be opened.
            if path.is_file():
                yield path
    else:
        for tracked in repo.commit().tree.traverse():
            path = root / tracked.path
            if path.is_file():
                yield path


@typeguard.typechecked
def filter_what_you_want_to_ignore(files : typing.Iterable[pathlib.Path]) -> typing.Iterator[pathlib.Path]:
    """
    Add any filtering rule.
    """
    logging.info("Filtering to keep only files we want.")

    # As an example, let's filter out any PNG or YAML file.
    for file in files:
        if not any(file.suffix == suffix for suffix in ['.png', '.yaml']):
            yield file


@typeguard.typechecked
def check_trailing_space(files : typing.Iterable[pathlib.Path]) -> None:
    """
    Check for any trailing space in provided files.
    """
    logging.info("Checking for trailing spaces.")

    has_error = False

    for file in files:
        with open(file, 'r') as f:
            for counter, line in enumerate(f):
                if line.endswith(' \n'):
                    has_error = True
                    logging.error(f"Trailing space at {file}:{counter+1}")

    if has_error:
        raise RuntimeError("At least one trailing space detected.")


if __name__ == "__main__":

    logging.basicConfig(level = logging.INFO)

    args = parse_args()

    check_trailing_space(files = filter_what_you_want_to_ignore(files = files_to_look_at(args = args)))
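The `line.endswith(' \n')` test in the script only catches a space right before a Unix newline. A slightly broader variant could look like the sketch below; `has_trailing_whitespace` is a hypothetical helper, and whether tabs should count as trailing whitespace is a policy choice that this thread does not settle.

```python
def has_trailing_whitespace(line: str) -> bool:
    """Return True if spaces or tabs sit right before the line ending (or end of file)."""
    content = line.rstrip("\r\n")  # drop the line terminator only
    return content != content.rstrip(" \t")


for candidate in ["x = 1 \n", "x = 1\n", "x = 1\t\r\n", "x = 1 ", "\n"]:
    print(repr(candidate), has_trailing_whitespace(candidate))
```

Compared with the original test, this also flags a last line that has no final newline and lines that end in CRLF.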
I would like to add this topic to the TUG's Developers Day Open Discussion to get additional input and see if we can reach consensus.
@ccober6 It seems the topic was not addressed during TUG's Open Discussion session... What's the way forward? 😄
@romintomasetti We will put together a clang-format approach at the MueLu level (GitHub action, pre-commit hook, and some type of bot). Once we feel like we have a reasonable approach, other packages can opt in. We will have one massive reformatting commit that can be ignored in git blame.
This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
This issue was closed due to inactivity for 395 days.
Enhancement
Trilinos is full of:
On today's develop, running the Python script attached (see below) leads to:
That is, each time someone makes a shallow clone of Trilinos (thinking of the CI mostly), it gets almost 2 megabytes of... spaces (I don't know what happens for the 480 files for which I got a UnicodeDecodeError, and I guess it deserves a new issue of its own).
This:
Actions:
commonTools/test/utilities/check-mpi-comm-world-usage.py (@JacobDomagala) that ensures no new trailing space is added.
Trilinos packages, removes lines whose content is only spaces.
201760 characters (I guess a bit less than that, see next bullet point).
Google Test, Kokkos and others)
Trilinos packages so it's worth a try.
To quote someone that phrased it much better than me
Last pro: useless tokens clearly have a (slight) impact on the time the compiler needs to parse Trilinos source code. Removing the trailing spaces might bring very tiny gains in that respect.
trailing_white_space.py.txt
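The byte count quoted above can also be estimated without decoding the files at all, which sidesteps the `UnicodeDecodeError`. A minimal sketch, assuming files are read as raw bytes and only the ASCII space (0x20) counts; `count_trailing_space_bytes` is a hypothetical helper and not the attached script.

```python
import pathlib
import tempfile


def count_trailing_space_bytes(path: pathlib.Path) -> int:
    """Count ASCII spaces located right before a line ending, reading raw bytes."""
    total = 0
    with open(path, "rb") as stream:  # binary mode: nothing is decoded
        for line in stream:
            content = line.rstrip(b"\r\n")
            total += len(content) - len(content.rstrip(b" "))
    return total


# Tiny demonstration on a temporary file that contains a byte that is not valid UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    demo = pathlib.Path(tmp) / "demo.txt"
    demo.write_bytes(b"a  \n\xff   \n    \nb\n")
    demo_count = count_trailing_space_bytes(demo)
    print(demo_count)
```

Summing this helper over all tracked files would give a figure comparable to the one reported in the issue, with the caveat that binary files (images, archives) should still be filtered out first.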