-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove VISITED_STAT #5261
base: master
Are you sure you want to change the base?
Remove VISITED_STAT #5261
Conversation
Obj linelist = ELM_PLIST(profileState.visitedStatements, nameid); | ||
|
||
if (LEN_PLIST(linelist) < line || !ELM_PLIST(linelist, line)) { | ||
AssPlist(linelist, line, True); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can use a blist instead of a plist to reduce memory usage by a factor of 64 or so? I think the main missing ingredient would be a function which takes a blist and efficiently increases its size by adding zero blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh and perhaps another helper which takes a blist and an index and returns TRUE
or FALSE
(not True
or False
) ... this helper then could also just return 0 for anything beyond the bounds. That would simplify the code here a bit, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you say, I started trying to use blists then discovered there is no way to extend a blist while having it remain a blist.
It probably wouldn't be a bad idea to add such extensions to blists (maybe even accessible from the GAP level), but that was a rabbit-hole I didn't want to fall down, as it could be done later / more generally (should we just add a generic "extend list, and use this member to fill the new values, and special case that for blists?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't try to make something clever or generic. Thinking again about it, I would indeed just aim for a purely kernel internal API which does exactly what we need. These could even be private inside profile.c
, no need to have them in blister.h
:
int SAFE_TEST_BIT_BLIST(Obj list, UInt pos)
{
if (pos > LEN_BLIST(list)
return 0;
return TEST_BIT_BLIST(list, pos);
}
void SAFE_SET_BIT_BLIST(Obj list, UInt pos)
{
if (SIZE_OBJ(list) < SIZE_PLEN_BLIST(pos))
ResizeBag(list, SIZE_PLEN_BLIST(pos));
SET_BIT_BLIST(list, pos);
//CLEAR_FILTS_LIST(list); // not really necessary for our application
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have a quick look. I worry about not testing such functions (you didn't do SET_LEN_BLIST in SAFE_SET_BIT_BLIST, which I think is required), I will take a look at doing this with BLists, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sure. I'm not even sure that code compiles, I just copy&pasted it quickly. Yeah, it should have something like if (pos > LEN_BLIST(list)) SET_LEN_BLIST(list, pos);
added
I've converted this to "draft" status so that it is clear that it should not be merged. You can turn mark it as "ready for review" as soon as |
So this now needs a rebase, but hopefully it won't be too hard. Also you wanted to update |
3eb09cb
to
f463762
Compare
40b1eac
to
39aaf19
Compare
I believe this PR is now complete. I haven't switched to using blists as the memory usage isn't that big, and I would prefer some general and tested code for extending blists, rather than a poorly tested bit of code straight in the profiling. This does make GAP as a whole a bit slower, but not massively, and that's probably unavoidable as we used to be able to just dig into the STAT, while now we have to go and look up through a couple of plists. After merging, and ensuring everything is working correctly, I might have a go at optimising. There is, however one issue -- this pushes the time taken to test HPC-GAP very close to 1 hour, which is (a) too long, and (b) close to where the test will fail if it hits the one-hour limit and times out. I'm going to investigate if this slowness is related to this PR, or a more general issue. |
Note that the 1 hour limit is self-imposed by us, we could increase it -- but I'd rather not. HPC-GAP has been exceptionally slow for quite some time now, much more so in the CI than on my laptop, and I don't know why |
src/profile.c
Outdated
** see this when we use ActivateProfileColour. However, without some | ||
** serious additional overheads, I can't see how to provide this | ||
** 3) Operating at just a line basis can sometimes be too course. However, | ||
** without some serious additional overheads, I can't see how to provide this | ||
** functionality in output (basically we would have to store | ||
** line and character positions for the start and end of every expression). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I do think we could provide that without a problem nor overhead, by replacing the line number in the stat by a byte offset into the file. Converting this into line number + offset into the line can then be deferred to a post-processing step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bit more problem here ( I could describe it better).
We tend to set up statements we tend to (as you would expect) parse, then make a Stat
, so if we just grabbed the position when making the Stat
we would point to the end of the statement / loop, / whatever.
With just covering lines this (usually) isn't too bad, but if you want to capture expressions usefully you probably want to catch their start and end (as expressions are often nested), and then we'd need to catch the start of expressions.
You are right, I looked on my machine and profiling is a little slower, unfortunately for CI generating coverage is particularly hit, because once a line is covered it was very cheap to check if the stat was covered and exit. However, the slowdown isn't unreasonable. On my machine, HPC-GAP is about 50% slower than master, and coverage is about 50% slower than no coverage. I could make a PR where we do a much smaller set of HPC-GAP testing? We could consider stopping testing it, but I don't know if the code is being partially used for the Julia integration? |
This comment was marked as outdated.
This comment was marked as outdated.
7533674
to
308de18
Compare
I have rebased this now |
308de18
to
72734e7
Compare
Rebased again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks fine to me, except for some HPC-GAP related locking concerns. Also, do you understand why this makes HPC-GAP so much slower? Or does it also make regular GAP with profiling a lot slower?
NEW_PLIST(T_PLIST, 0)); | ||
} | ||
|
||
Obj linelist = ELM_PLIST(profileState.visitedStatements, fileid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess executing all this for every single file is quite excessive. A possible optimization: we do something like for PtrBody
and PtrLVars
, by tracking a STATE(ProfileLineList)
which essential contains linelist
from above and is updated by SWITCH_TO_OLD_LVARS
if profiling is active (well and also by the code which activate/deactivates profiling). The all of the above can be replaced by something like this:
Obj linelist = STATE(ProfileLineList);
if (!linelist)
return visited;
Or one could go one step further and store ADDR_OBJ(linelist)
(and then update that pointer in VarsAfterCollectBags
, too). Then of course LEN_PLIST, ELM_PLIST etc. below would need to be substituted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I have no idea if either of these would be worthwhile, but then again, this is executed for every single expression or statement, so we should measure and make it as fast as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
markedVisited is only about 2% of the time taken in profiling (that's less than I expected, but it's what cachegrind reports). Almost all the time is spent in printing.
72734e7
to
2d08664
Compare
Rebased once again |
@@ -532,11 +530,36 @@ static inline void printOutput(int fileid, int line, BOOL exec, BOOL visited) | |||
} | |||
} | |||
|
|||
// Mark line as visited, and return true if the line has been previously | |||
// visited (executed) | |||
static BOOL markVisited(int fileid, UInt line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not thread safe, yet is not called from a critical section either. Can't that cause problems in HPC-GAP?
So perhaps it needs a lock? Or perhaps we can make profileState.visitedStatements
thread local -- it suffices to merge this into a single plist at the time we write out the profiles, no? (Though I guess that would break if a thread was killed before we do that...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(old reply!)
We only call this from one thread -- we lock profiling to a single thread (you can see this from the caller).
There could be issues to do with ownership of the datastructures here in HPC-GAP, if they are created in one thread and used in another. I'm not opposed to a fix for that, but it wouldn't shock me if various things don't work in HPC-GAP with profiling.
636c70f
to
c9fe133
Compare
c9fe133
to
971d2c0
Compare
Almost all the time taken in profiling is from the fprintf printing out JSON -- the main way to speed this up would be to switch to a binary format, or do something where we just print less output. Both of these are reasonable projects, but I'm not planning on working on either of them soon. |
@ChrisJefferson what are your intentions with this PR? It is still marked as draft. What is needed to complete it then? |
// `visited` starts out as 0 and is set to 1 if the statement or | ||
// expression has ever been executed while profiling is turned on | ||
unsigned int visited : 1; | ||
|
||
// `line` records the line number in the source file in which the | ||
// statement or expression started | ||
unsigned int line : 31; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well put that bit to use again (and possibly allow generation of slightly nicer code?)
unsigned int line : 31; | |
unsigned int line : 32; |
This remove VISITED_STAT and SET_VISITED_STAT, and the 'visited' bit from Stat, and keeps track of these things in profiling.c, and keep track of covered lines in profiling.c
This remove
ActivateProfileColour
, which allowed colouring functions with covered lines, hitting the exact parts of lines executed, as that was easy to do when we tracked what was covered per Stat, but I don't do that now (only by line), as I believe no-one was ever using this functionality!If this is fine, please don't merge it yet, as I'll have to release an updating to the profiling package, as to sync up with this change (in particular, remove tests which use
ActivateProfileColour
)