Support for GNU Make jobserver (alternative implementation) #104

Open: bonzini wants to merge 5 commits into base: master

bonzini commented Apr 19, 2024

Alternative implementation to #94.

The main advantage is that the integration with the build() event loop is very clean, as it simply uses a pipe to signal the availability of tokens. Interacting with the job server is entirely embedded within a new token.c file that implements a simple API:

int tokeninit(void);
bool tokenget(struct edge *e);
void tokenput(void);

and on top of this, the integration is about 20 lines of code.

On the other hand, token.c uses pthreads, which some may find less appealing. Waiting for reviews. :)
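For reviewers unfamiliar with the protocol, here is a rough sketch of the kind of lookup an init step such as tokeninit() has to perform before any tokens can be exchanged. It is not the code from token.c: the helper name parse_jobserver_auth() is hypothetical, and it only handles the pipe form `--jobserver-auth=R,W` (plus the older `--jobserver-fds=R,W` spelling). The `fifo:PATH` form introduced in GNU Make 4.4 is deliberately rejected here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the PR: find the jobserver pipe
 * descriptors advertised in a MAKEFLAGS-style string.
 * Returns 0 and fills *rfd / *wfd on success, -1 otherwise.
 */
int
parse_jobserver_auth(const char *makeflags, int *rfd, int *wfd)
{
	static const char *const opts[] = { "--jobserver-auth=", "--jobserver-fds=" };
	const char *found = NULL, *p;
	size_t i, len;
	int r, w;

	assert(rfd != NULL && wfd != NULL);
	if (makeflags == NULL)
		return -1;

	/* prefer the newer spelling; fall back to the older one */
	for (i = 0; i < sizeof(opts) / sizeof(opts[0]) && found == NULL; i++) {
		len = strlen(opts[i]);
		/* the option may be repeated; keep the last instance */
		for (p = makeflags; (p = strstr(p, opts[i])) != NULL; p += len)
			found = p + len;
	}
	if (found == NULL)
		return -1;

	/* "fifo:PATH" and malformed values fail the two-integer match */
	if (sscanf(found, "%d,%d", &r, &w) != 2 || r < 0 || w < 0)
		return -1;

	*rfd = r;
	*wfd = w;
	return 0;
}
```

A caller would typically pass getenv("MAKEFLAGS") and fall back to plain -j behavior when the function returns -1.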

bonzini force-pushed the jobserver branch 2 times, most recently from 4aabf9c to bdf99fa (May 3, 2024 10:18)
bonzini commented May 3, 2024

(rebased on top of #106, which provides "give back tokens on signals" behavior for free)

bonzini commented Dec 6, 2024

Sorry to be very explicit, but I would like a clear answer on whether the project is still maintained.

Extract it from jobwork() so that build() can call it on a signal.

Signed-off-by: Paolo Bonzini <[email protected]>

Keep the system clean by propagating SIGTERM to all children,
and by not starting new jobs on either SIGTERM or SIGINT.

The only tricky bit is that previously fd[i].revents was used
to skip both jobs that are not in use and jobs that did not
have output; that's because negative file descriptors
do not cause POLLNVAL and therefore fd[i].revents is zero for
inactive jobs as well.  But because all jobs must be killed,
build() now has to check fd[i].fd == -1 explicitly.
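
The poll() behavior described above can be checked in isolation. The following is a minimal sketch, not code from this commit: the names slot_state() and slot_release() are made up for illustration. It relies on the POSIX rule that an entry with a negative fd is ignored and gets revents set to 0, which is why revents alone cannot tell an unused slot from an active slot without output.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

enum slot_state { SLOT_UNUSED, SLOT_IDLE, SLOT_READY };

/*
 * Hypothetical helper: classify a slot after poll() has run.
 * revents is 0 for both unused and idle slots, so the fd must be
 * checked explicitly to tell them apart.
 */
enum slot_state
slot_state(const struct pollfd *p)
{
	assert(p != NULL);
	if (p->fd == -1)
		return SLOT_UNUSED;
	return p->revents ? SLOT_READY : SLOT_IDLE;
}

/* Hypothetical helper: mark a slot unused so later poll() calls skip it. */
void
slot_release(struct pollfd *p)
{
	assert(p != NULL);
	if (p->fd != -1)
		close(p->fd);
	p->fd = -1;
	p->revents = 0;
}
```

A loop that must signal every running child would iterate over all slots and skip only those for which slot_state() returns SLOT_UNUSED, rather than skipping on revents == 0.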

While at it, also clean up jobdone() by clearing job[i].edge;
it's not nice to leave a dangling pointer in the jobs array,
even if it's harmless.

Signed-off-by: Paolo Bonzini <[email protected]>

GNU Make has a neat feature called the jobserver protocol, where
the top-level make allocates a fixed number of job slots, and
child makes take slots to do work in. This was designed to avoid the
parallelisation problem where a top-level make -j10 spawns
10 separate sub-makes, all with -j10, for up to 100 parallel jobs.

However, it's also useful for resource control in systems which build
multiple pieces of software at once. For example, Bitbake can build
N different pieces of software at once, and each of those is passed a
-jM flag. If each of these N tasks is compiling then that's N*M jobs,
so you don't want N or M to be too high; but if only 1 of N is building
then you want M to be high.

With the jobserver protocol there are N slots in total for all
sub-makes, so you can control the resource utilisation more accurately.
By supporting the jobserver protocol instead of just -j, Samurai can
join in the resource pooling and builds can be more efficient.
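
To make the shared-pool idea concrete, here is a small self-contained sketch of the token exchange over a pipe. It is an illustration under stated assumptions, not the implementation in this series: pool_init(), token_try_acquire() and token_release() are hypothetical names, and the pool is a private pipe created by the example rather than one inherited from make. In the real protocol each client also owns one implicit slot that it never reads from the pipe, and a released token should be the same byte that was read.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical: create a private pipe preloaded with n tokens. */
int
pool_init(int fd[2], int n)
{
	if (pipe(fd) < 0)
		return -1;
	while (n-- > 0)
		if (write(fd[1], "+", 1) != 1)
			return -1;
	return 0;
}

/*
 * Hypothetical: take a token if one is available right now.
 * Returns 1 with *tok filled, 0 if the pool is empty, -1 on error.
 * With several clients on one pipe, another process can win the race
 * between poll() and read(), in which case read() would block.
 */
int
token_try_acquire(int rfd, char *tok)
{
	struct pollfd p = { .fd = rfd, .events = POLLIN };
	ssize_t r;
	int n;

	assert(tok != NULL);
	n = poll(&p, 1, 0);
	if (n <= 0)
		return n;
	do
		r = read(rfd, tok, 1);
	while (r < 0 && errno == EINTR);
	return r == 1 ? 1 : -1;
}

/* Hypothetical: give back a token by writing the same byte that was read. */
int
token_release(int wfd, char tok)
{
	ssize_t r;

	do
		r = write(wfd, &tok, 1);
	while (r < 0 && errno == EINTR);
	return r == 1 ? 0 : -1;
}
```

With a pool of two tokens, two acquisitions succeed and a third reports an empty pool until one token is released, which is the behavior that lets independent builds share a single job limit.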

Signed-off-by: Paolo Bonzini <[email protected]>