Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated style checking #619

Open
nworden opened this issue Apr 2, 2019 · 11 comments
Open

Automated style checking #619

nworden opened this issue Apr 2, 2019 · 11 comments

Comments

@nworden
Copy link
Contributor

nworden commented Apr 2, 2019

It'd be nice to have some kind of style check run automatically. yapf seems like a good place to start. The docs say if you run it with -d you get a non-zero return code if changes are required, so you can integrate it into an automated script. It's not a complete solution (e.g., it doesn't seem to detect imports of individual functions, issue #615, even if you run it with --style google), but it's better than nothing.

We should also consider using pylint, which won't fix things automatically but is more thorough.

I don't think it would be practical to try to do this all at once, but we could probably do it the same way I'm handling Python 3 testing: have a list of specific files that should be checked, and add files to it as we update them to be in line with the style guide. I'll start by running pylint and yapf over the new Django view classes as I write them.

@gimite
Copy link
Contributor

gimite commented Apr 3, 2019

Oh so yapf performs auto formatting like clang-format. That sounds nice! I like auto formatting (though some people hates it).

I don't think it would be practical to try to do this all at once

Just curious, why is that? Is it possible that yapf breaks the code, not just changing formatting? Can't we apply it to all files at once if it safely reformats files without changing semantics?

Also just curious, do you know of anything similar for JS(X)?

@nworden
Copy link
Contributor Author

nworden commented Apr 3, 2019

yapf isn't expected to change semantics, but it looks like there are going to be thousands of lines that need to change just under /app (not including app/vendors), so if we tried to do it all at once we'd have a high risk of merge conflicts and a huge PR to review. Maybe we could have a rule that if you change a file that's not already checked by yapf you should run yapf on it and add it to the list? Since every file is probably going to be affected by the Python 3 work, every file would get done this year (hopefully).

I'm not sure what to do about pylint, which will be a lot more work; I wouldn't want to block PRs on fixing pylint warnings.

It looks like there's a bunch of options for formatting JS, but none of them really dominate the market (for lack of a better way to put it). The Google JS style guide recommends clang-format (the Closure team has instructions here), but I'd rather not make people go out of their way to install too much stuff when we can help it (pylint and yapf don't seem like a problem since we can just add them to the pip install command). Maybe we could use js-beautifier or prettier (both available as NPM packages).

@gimite
Copy link
Contributor

gimite commented Apr 4, 2019

yapf isn't expected to change semantics, but it looks like there are going to be thousands of lines that need to change just under /app (not including app/vendors), so if we tried to do it all at once we'd have a high risk of merge conflicts and a huge PR to review. Maybe we could have a rule that if you change a file that's not already checked by yapf you should run yapf on it and add it to the list?

That sounds good to me if the formatting PR is separated from the main change PR (otherwise the diff would be messy).

I'm still wondering maybe we can still try applying it to all files for following reasons, but it's up to you:

  • Code review: We probably don't need a full code review once we trust that the tool does the right thing. In order to trust it, we can:

    • Do manual spot review to see if the diff looks good
    • Run unit/server tests
    • Perform manual QA for basic scenarios
  • Merge conflicts: I believe there are not many people modifying Python code now, maybe just you and the contributor of Make imports in Python consistent with the style guide #615. (I'm currently mostly touching just JS code.) So I hope it's not a disaster if you do that whenever you are comfortable, and maybe ask the contributor of Make imports in Python consistent with the style guide #615 to put their work on hold when you do that.

Since every file is probably going to be affected by the Python 3 work, every file would get done this year (hopefully).

I'm not sure what to do about pylint, which will be a lot more work; I wouldn't want to block PRs on fixing pylint warnings.

It looks like there's a bunch of options for formatting JS, but none of them really dominate the market (for lack of a better way to put it). The Google JS style guide recommends clang-format (the Closure team has instructions here), but I'd rather not make people go out of their way to install too much stuff when we can help it (pylint and yapf don't seem like a problem since we can just add them to the pip install command). Maybe we could use js-beautifier or prettier (both available as NPM packages).

Makes sense, agreed to prefer whatever available in NPM if it is good quality. I'm happy to take the task for JS and try the tools unless you want to take it.

@nworden
Copy link
Contributor Author

nworden commented Apr 4, 2019

I'm still wondering maybe we can still try applying it to all files for following reasons...

Maybe we could do it in batches. yapf seems to support regexes in the file list (e.g., tools/lint yapf-fix app/[c-f]*.py fixes all the files starting with letters c-f), so I could do a set of files I know aren't being touched right now and then do other batches later.

Makes sense, agreed to prefer whatever available in NPM if it is good quality. I'm happy to take the task for JS and try the tools unless you want to take it.

If you have time to take on the JS side that'd be great!

@gimite
Copy link
Contributor

gimite commented Apr 5, 2019

I'm still wondering maybe we can still try applying it to all files for following reasons...

Maybe we could do it in batches. yapf seems to support regexes in the file list (e.g., tools/lint yapf-fix app/[c-f]*.py fixes all the files starting with letters c-f), so I could do a set of files I know aren't being touched right now and then do other batches later.

SGTM!

Makes sense, agreed to prefer whatever available in NPM if it is good quality. I'm happy to take the task for JS and try the tools unless you want to take it.

If you have time to take on the JS side that'd be great!

Sure. I may not have time immediately but I will take a look when I have time.

nworden added a commit to nworden/personfinder that referenced this issue Apr 7, 2019
This covers all our files under app/ which don't start with a, f, m, or v.

Starts addressing issue google#619.
@nworden
Copy link
Contributor Author

nworden commented Apr 9, 2019

Update from PR #630: YAPF had a new release and broke our build, after being enabled for two files for four days, so I think we should reconsider enforcing it.

Quoting from @gimite from that PR:

If we cannot find a good alternative, does it make sense to stick to a specific version of yapf (and perform project-wide batch update when we feel the version is too old)?

I'd rather not make auto-reformatting the codebase a regular thing if we can help it. It'll be annoying to avoid merge conflicts, and it messes up file history too: if we reformat things once, then at least it's just one hop to get back before the reformat, but if we have to do it repeatedly it could make it really inconvenient to trace back through the code's history.

I think YAPF is more likely to have this sort of problem than other formatters, because of this (quoting from YAPF docs):

YAPF takes a different approach.... In essence, the algorithm takes the code and reformats it to the best formatting that conforms to the style guide, even if the original code didn’t violate the style guide.

If it makes changes to code that already follows the style guide (which I don't think is something other auto-formatters are likely to do?), we're more likely to have problems caused by changes on their side.

There's a couple other libraries that look promising:

  • pycodestyle, which doesn't do auto-formatting but just finds style issues (like a lighter-weight version of pylint)
  • autopep8, which does auto-formatting.

So we could either:

  • Suggest that people use YAPF to format their code, but only use pycodestyle in Travis instead of enforcing exactly what YAPF generates. This would let people use an auto-formatter of their choice if they preferred; as long as they end up with something pycodestyle accepts, it's fine.
  • Use autopep8 the way we were using YAPF.

I have a slight preference for the former because pycodestyle is a dependency of autopep8 anyway, plus it'd make it simpler for us to use YAPF to do our own formatting if we liked.

YAPF's the only library I know of that has an option specifically for Google style rather than the PEP 8 standard, but the differences between them don't look that significant (the docs explaining all those settings is here). We can set up a pycodestyle config to deal with the different line lengths, and I'm not sure the other differences in the YAPF setup are even really part of either style guide.

@gimite
Copy link
Contributor

gimite commented Apr 10, 2019

Update from PR #630: YAPF had a new release and broke our build, after being enabled for two files for four days, so I think we should reconsider enforcing it.

Quoting from @gimite from that PR:

If we cannot find a good alternative, does it make sense to stick to a specific version of yapf (and perform project-wide batch update when we feel the version is too old)?

I'd rather not make auto-reformatting the codebase a regular thing if we can help it. It'll be annoying to avoid merge conflicts, and it messes up file history too: if we reformat things once, then at least it's just one hop to get back before the reformat, but if we have to do it repeatedly it could make it really inconvenient to trace back through the code's history.

Oh I didn't mean to perform the "project-wide batch update" often. I imagined something like doing it every 3 years :) It of course assumes that the current snapshot of yapf is good enough for most cases though.

I think YAPF is more likely to have this sort of problem than other formatters, because of this (quoting from YAPF docs):

YAPF takes a different approach.... In essence, the algorithm takes the code and reformats it to the best formatting that conforms to the style guide, even if the original code didn’t violate the style guide.

If it makes changes to code that already follows the style guide (which I don't think is something other auto-formatters are likely to do?), we're more likely to have problems caused by changes on their side.

There's a couple other libraries that look promising:

  • pycodestyle, which doesn't do auto-formatting but just finds style issues (like a lighter-weight version of pylint)
  • autopep8, which does auto-formatting.

So we could either:

  • Suggest that people use YAPF to format their code, but only use pycodestyle in Travis instead of enforcing exactly what YAPF generates. This would let people use an auto-formatter of their choice if they preferred; as long as they end up with something pycodestyle accepts, it's fine.
  • Use autopep8 the way we were using YAPF.

I have a slight preference for the former because pycodestyle is a dependency of autopep8 anyway, plus it'd make it simpler for us to use YAPF to do our own formatting if we liked.

I have slight preference for the latter but it is up to you. We can try whatever option to see how well it works.

My concern for the former is that, if each person uses an auto-formatter of their choice, we end up changing format of a file for each commit (depending on who edits it), making it harder to review the diff, which would be mixture of the real diff and formatting diff. So my preference is to stick to a single formatter with stable output and force it for everyone, if we use auto formatter.

YAPF's the only library I know of that has an option specifically for Google style rather than the PEP 8 standard, but the differences between them don't look that significant (the docs explaining all those settings is here). We can set up a pycodestyle config to deal with the different line lengths, and I'm not sure the other differences in the YAPF setup are even really part of either style guide.

@nworden
Copy link
Contributor Author

nworden commented Apr 11, 2019

Oh I didn't mean to perform the "project-wide batch update" often. I imagined something like doing it every 3 years :) It of course assumes that the current snapshot of yapf is good enough for most cases though.

Ah, ok, that wouldn't be as bad as I thought. I think we might have to do it more often than that though, e.g. to keep up with new version of Python, or if there's a YAPF bug and we needed to upgrade to fix it. And it might be annoying to external folks if they had to specifically install an old version of YAPF.

I have slight preference for the latter but it is up to you. We can try whatever option to see how well it works.

My concern for the former is that, if each person uses an auto-formatter of their choice, we end up changing format of a file for each commit (depending on who edits it), making it harder to review the diff, which would be mixture of the real diff and formatting diff. So my preference is to stick to a single formatter with stable output and force it for everyone, if we use auto formatter.

Oh, that's a good point. It might not be as big a problem with most formatters as for YAPF (since they won't necessarily try to change things that already comply with the style rules like YAPF does), but it could get us into trouble if some of us use YAPF.

I'll try out autopep8 (maybe I'll run YAPF on a file first and then autopep8, to see if autopep8 changes anything) and report back. I'll also check out what some other open-source Python projects do to handle this.

@gimite
Copy link
Contributor

gimite commented Apr 11, 2019

Sounds good, thanks for working on this!

@nworden
Copy link
Contributor Author

nworden commented Apr 29, 2019

I tried a few different auto-formatters (yapf, autopep8, and a newish one called black), and it seems like they all change things they don't necessarily have to (though yapf is the most problematic in this department). I'm a little hesitant to make a long-term commitment to any particular auto-formatter.

Style-checking is an easier question though. There's kinda two major options, but they're really both the same:

  • pycodestyle (formerly called pep8), which checks code style. It's easy to configure, so we don't have to use PEP 8's default 79-character max line length, and I haven't seen it complain about anything it shouldn't complain about (I've run it over all our Django files and a couple other things).
  • flake8, which uses pycodestyle under the hood to check style, but also looks for other things like unused imports and stuff like that. flake8's really widely used in open-source Python (I think every project I looked at that had automatic style-checking did it with flake8). Ideally we'll use flake8 for everything at some point, but we have a lot of unused imports and the like, so that's not going to happen overnight.

My plan is to:

  1. Set up flake8 for the files that don't need too much work.
  2. Do an autopep8 or yapf run to get the whole codebase in good style, and use pycodestyle by itself over the whole codebase to avoid regressions.
  3. Once we clean up unused imports and the other stuff (imports seem to be the main problem) we can switch to using flake8 for everything.

@nworden nworden mentioned this issue Apr 29, 2019
@gimite
Copy link
Contributor

gimite commented May 7, 2019

The plan looks good to me!

it seems like they all change things they don't necessarily have to

Just curious, what is an example of "changing things they don't necessarily have to"? I'm personally OK if they change one format to another format where both are actually acceptable. It wouldn't be a problem if we once apply the formatter in batch and stick with it.

It would be a concern if it changes a good format to a bad format, or their formatting changes too often (as in yapf) though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants