-
-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: Check commit message compliance #5248
base: master
Are you sure you want to change the base?
Conversation
ad65ea7
to
000db7e
Compare
000db7e
to
0e4e42c
Compare
This is essentially a different version of #5000. Please see the concerns listed there. I don't see these concerns covered in the description:
Again, to me, the 50 character limit is not hard limit. There are some cases where making a commit subject fit within 50 character (excluding the prefix) makes it less comprehensible than if you had used 53 characters and still came in under 72 characters total. The 50 character limit is a suggestion or guideline that should be taken seriously, but the absolute hard limit is 72 characters total, as that is when GitHub will wrap the text. |
Agreed on the 50 characters being more of a suggestion. I frequently go over that trying to make a concise title, but definitely the 72 hard limit should apply. |
0bb53c2
to
e781c43
Compare
The concerns mentioned by @RytoEX are addressed in the commit message and the description of this PR. |
This is incorrect. The hard limit on maximum length of the subject, including the prefix, is 72 characters. GitHub's UI will wrap the subject if it goes beyond 72 characters. |
"De-escalate" is technically a correct spelling of this verb. "Deescalate" is also acceptable. I don't know that it's easily detectable if a hyphen used in the first word is correct or not. |
88ec435
to
b419832
Compare
The title line criteria is revised.
The revised script will accept words with a single hyphen at the middle of the word such as "De-escalate". Also accept single hyphens appear multiple times such as "Click-and-drag". If two or more hyphens appear at the same time, such as "De--escalate", it will be rejected. Description is also updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, my method for testing this PR was to grab every commit since 27.0.0 (35c07bb) and run it through the script.
Of 289 recent commits
- Errors: 39
- Warnings: 26
- Info: 26
More specifically:
- 34 description lines were too long (~76/72 characters)
- 26 had titles too long (~52/50 characters)
- 26 had prefixes that were considered "not frequent"
- 5 had an "invalid" first word
Overall, I'm happy with the results and can see a benefit to having this workflow. Thank you so much for putting in the work! :)
My primary concern looking through the prefix results & the code that triggered them is that check_module_name()
is hardcoded and a subset of the actual prefixes that may appear.
Here are the ones that it complained about:
UI,obs-transitions:
decklink-output-ui:
flatpak:
graphics-hook:
libobs,
libobs,deps/media-playback:
libobs-d3d11:
libobs-winrt,
libobs-winrt:
mac-capture:
mac-syphon:
mac-virtualcam:
obs-ffmpeg,
obs-frontend-api:
obs-qsv11:
text-freetype2:
I personally think that this check could be as simple as checking whether items in the commit prefix (comma or slash separated) match any one of the following:
- the modified file's current directory
- the modified file's root-level directory
- the modified file's name without extension
- if within a plugin subdirectory, the plugin's directory name
While I realise this is an 'Info' level warning, it can also be noisy.
Finally, for the "first word" check, I think it might be a little too aggressive:
first word: FIx
first word: pthread_mutex_init_recursive
first word: DrawSrgbDecompressPremultiplied
first word: add
first word: NVENC
Catching "FIx" and "add" is good, but function names and component names are valid first words in a commit title (in my opinion).
I did have issues running this script outside of the GitHub Actions workflow, and was wondering if there's something we could do about that - specifically, my awk
is in /usr/bin/awk
and not /bin/awk
.
And I'd also like to see some comments in the awk script, specifically around the purposes/goals of some of the regular expressions.
b419832
to
166157b
Compare
As far as I understand it, this script isn't checking modified files at all. It's just checking the output of In any case, what about commits that modify files in multiple directories?
Technically, a function name is probably against the commit guidelines unless it's treated as a verb, but there's no easy way to detect "is this being used as a verb or not". All of these issues though continue to make me think that a lot of these either need to be non-blocking warnings, or question if the signal-to-noise ratio will be high enough to justify implementing this, per the previous PR thread.
I'll be honest, I kind of dislike that it's written in awk for a couple of reasons. Git for Windows doesn't come with awk installed, and while it can be installed on Windows, I think it's unlikely that most Windows-based devs would have it installed. I just don't know if there's an overwhelming reason to try to use a different tool than awk, because most other tools will have similar problems, aside from my opinion that awk is perhaps a bit esoteric and complicated compared to other possible tools. I agree that at minimum, this requires additional comments/documentation, because I don't expect every dev to understand how awk, or regex, works. |
Let me consider to retrieve to check the prefix from the changes of files.
I thought the function names are always noun.
Then, how about python3? I guess more developers are familiar with python. Visual Studio includes python. Regex module ( |
While I appreciate the idea in general and also the work put into this, I am not sure about added value: Workflows do not run on all PRs automatically - this has to be triggered manually after review by a maintainer, which means that a human review has taken place yielding the same information as this check. But it is specifically PRs by non-regulars which might run afoul of this. I share the reservations with regards to using Last, I maintain my reservation about introducing too many single-job workflows in parallel, we have too many of those running already, leading to an overcrowded "checks" area where one cannot see the forest for the trees. EDIT: I also see the unfortunate side-effect of hardcoding "good" commit prefixes, which would require adjusting this PR whenever we add a new plugin or component, which will happen once or twice, but will fall by the wayside every time after that. |
I will consider your valuable comments. |
a6a5cfc
to
40cd840
Compare
e6f2439
to
abd134e
Compare
I'm sorry for my late response. I've revised the script.
|
I'm afraid I accidentally clicked something and got removed request for PatTheMav. |
abd134e
to
fba9389
Compare
This commit adds these checks. - Title consists of module name(s) and subject or just subject. - If there are two or more module names, the module names has to be separated by a comma followed by an optional space. - First word of the subject is starting from an upper case letter and remaining characters in the first word are lower case letters. "Don't" and a word with a hyphen at the middle are also acceptable. - The title is 72 characters at max including module prefix. - If the subject exceed 50 characters excluding the module prefix, display a warning message. - Titles for commits of revert and merge are ignored. - Full description lines are 72 columns max. Links and co-author names are excluded from this check.
fba9389
to
58f0f2d
Compare
Added a small modification to the script to exclude links in the commit message like below.
|
It would probably make more sense to convert this into a repository action and run it as part of the |
Description
This PR adds these checks for commit messages in each pull request.
/
, each module name is searched into at most 3-level depths directories, submodules, and top-level file names./
, that hierarchy should exist at the top level. For example,data/locale
will be rejected.CONTRIBUTING
is correct butCONTRIBUTING.rst
is not correct.UI/updater
such as 47e441b?"Don't" and a word with a hyphen in the middle are also acceptable.
.
.Co-Authored-By:
will be ignored.https?://
for example?The empty checkboxes above might need further discussion.
I'd like to highlight there are some limitations:
.gitmodules
file.In the CI, the commit message is taken by using GitHub REST API. In local, you may run a command like this:
gh api repos/obsproject/obs-studio/pulls/5248/commits | ./CI/check-log-msg.py -v -j -
To check commit messages without using GitHub REST API, a revision range can be specified as
git log
can accept.For example, a command below will check the
HEAD
newer thanorigin/master
. This will be useful to check the message before making a PR.Motivation and Context
There are a lot of violations of the commit guidelines.
This PR will enable to check the commit messages before a PR is merged.
How Has This Been Tested?
The script is tested with old commit messages and confirmed no false errors except listed in the above limitations.
Errors later than 28.0.0 are as below.
clock_gettime_nsec_np
is used to convert from mach tick unitsTypes of changes
Checklist: