Add support for splitting linker invocation to a second execution of `rustc`
This issue is intended to track support for splitting a `rustc` invocation that ends up invoking a system linker (e.g. `dylib`, and even `staticlib` in the sense that everything is assembled) into two different `rustc` invocations. There are a number of reasons to do this, including:
- This can improve pipelined compilation support. The initial pass of pipelined compilation explicitly did not pipeline linkable compilations because the linking step needs to wait for codegen of all previous steps. By literally splitting it out, build systems could then synchronize with previous codegen steps and only execute the link step once everything is finished.
- This makes more artifacts cacheable with caching solutions like `sccache`. Anything involving the system linker cannot be cached by `sccache` because it pulls in too many system dependencies. The output of the first half of these linkable compilations, however, is effectively an `rlib`, which can already be cached.
- This can provide build systems which desire more control over the linker step with, well, more control over the linker step. We could presumably extend the second half here with more options eventually. This is a somewhat amorphous reason to do this; the previous two are the most compelling ones so far.
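To make the pipelining argument above concrete, here is a minimal Rust sketch of a toy timing model for one library and one binary that depends on it. The struct, the function names, and the idea of measuring in abstract time units are all assumptions made up for illustration; nothing here is a measurement or part of rustc or Cargo.

```rust
/// Made-up timings (arbitrary units) for a two-crate build: a library `dep`
/// and a binary that depends on it. Hypothetical model, not real data.
struct Timings {
    dep_metadata: u32, // time until dep's metadata is available
    dep_codegen: u32,  // additional time until dep's codegen finishes
    bin_compile: u32,  // crate-local work for the binary (everything but linking)
    bin_link: u32,     // time spent in the system linker
}

/// One invocation for the binary: it is not pipelined, so it only starts
/// once the dependency's codegen has finished.
fn unsplit_total(t: &Timings) -> u32 {
    t.dep_metadata + t.dep_codegen + t.bin_compile + t.bin_link
}

/// Two invocations for the binary: the crate-local step starts as soon as
/// dep's metadata exists; only the link step waits for all codegen.
fn split_total(t: &Timings) -> u32 {
    let dep_done = t.dep_metadata + t.dep_codegen;
    let bin_compiled = t.dep_metadata + t.bin_compile;
    dep_done.max(bin_compiled) + t.bin_link
}
```

With the made-up timings `dep_metadata: 2`, `dep_codegen: 8`, `bin_compile: 6`, `bin_link: 1`, `unsplit_total` gives 17 and `split_total` gives 11, because the binary's crate-local work overlaps with the dependency's codegen.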
This is a relatively major feature of rustc, and as such this may even require an RFC. This issue is intended to get the conversation around this feature started and see if we can drum up support and/or more use cases. To give a bit of an idea about what I'm thinking, though, a strawman for this might be:
- A build system first compiles the `bin` crate type by passing the `--do-not-link` flag, along with all the flags it normally passes today.
- It then runs `rustc` again, only this time passing the `--only-link` flag.
These two flags would indicate to `rustc` what's happening, notably:
- `--do-not-link` indicates that rustc should be creating a linkable artifact, such as one of the ones mentioned above. This means that rustc should not actually perform the link phase of compilation; it is skipped entirely. In lieu of this, a temporary artifact is emitted in the output directory, such as `*.rlink`. Maybe this artifact is a folder of files? Unsure. (Maybe it's just an rlib!)
- The converse of `--do-not-link`, `--only-link`, is then passed to indicate that the compiler's normal phases should all be entirely skipped except for the link phase. Note that this is crucial for performance: it does not rely on incremental compilation, nor on queries, or anything like that. Instead the compiler forcibly skips all this work and goes straight to linking. Anything the compiler needs as input for linking should either be in command line flags (which are reparsed and guaranteed to be the same as the `--do-not-link` invocation) or be an output of the `--do-not-link` invocation. For example, maybe the `--do-not-link` invocation emits a file that indicates where to find everything to link (or something like that).
The general gist is that `--do-not-link` says "prepare to emit the final crate type, like `bin`, but only do the crate-local stuff". This step can be pipelined, doesn't require upstream objects, and can be cached. This is also the longest step for most final compilations. The gist of `--only-link` is that its execution time is 99% the linker. The compiler should do the absolute minimal amount of work to figure out how to invoke the linker, invoke it, and then exit. To reiterate, this will not rely on incremental compilation because engaging all of the incremental infrastructure takes quite some time, and additionally the "inputs" to this phase are just object files, not source code.
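As a rough illustration of the strawman, the sketch below shows how a build system might assemble the two invocations in Rust. The flag names `--do-not-link` and `--only-link` are the hypothetical ones proposed in this issue, the helper names are invented for the example, and nothing is executed. Both commands receive the same common arguments, since the link step is expected to reparse them.

```rust
use std::process::Command;

/// First half: the usual flags plus the strawman's hypothetical `--do-not-link`.
fn compile_step(common_args: &[&str]) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.args(common_args).arg("--do-not-link");
    cmd
}

/// Second half: the same flags again, plus the hypothetical `--only-link`.
fn link_step(common_args: &[&str]) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.args(common_args).arg("--only-link");
    cmd
}

/// Render a command's arguments for inspection; the command is never run.
fn rendered_args(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}
```

A build system would run `compile_step` as soon as upstream metadata is ready, and `link_step` only after every upstream codegen step has finished.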
In any case this is just a strawman. I think it'd be best to prototype this in rustc, learn some requirements, and then perhaps open an RFC asking for feedback on the implementation. This is a big enough change that it'd want a good deal of buy-in! That being said, I believe (without data at this time, but with a strong hunch) that the improvements to both pipelining and the ability to use `sccache` would be quite significant and worth pursuing.
Answer from tmandry:
> What is even the benefit of performing codegen and linking on different machines by the way? It doesn't increase parallelism, as the linking has to wait on codegen anyway.
Linking, in theory, depends on a lot more artifacts than codegen does. Codegen should only require source code and the rmeta files from any crates you depend on. Linking requires all the generated code. In our sccache-like environment, this would mean uploading many rlib files and possibly system libraries to the worker. Network bandwidth becomes a bottleneck. So it's much better to send compile steps to the workers, hitting cache when possible, and do linking locally.
> That would require making the linker arguments stable, as changes could break such thing.
Link args don't need to be stable, just the file format which contains them. I don't think the fact that the file contains references to implementation details like compiler-builtins is a problem, actually. As long as those details can change without changing the schema, a well-written tool should be able to consume them without breakage.
That said, stabilizing rlink seems more ambitious than having a rustc option which spits out the final linker line, allowing you to run it yourself.
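One way to picture the "stable file format, unstable contents" idea is a small consumer like the Rust sketch below. Everything about the format is an assumption invented for this example (the `key=value` layout and the `linker`, `object`, `arg`, and `output` keys); it is not the rlink format or anything rustc emits. The consumer treats values as opaque and skips keys it does not know, so the details can change without breaking it.

```rust
/// Hypothetical link manifest, one `key=value` pair per line.
#[derive(Debug, Default, PartialEq)]
struct LinkManifest {
    linker: String,
    objects: Vec<String>,
    args: Vec<String>,
    output: String,
}

/// Parse the invented format; blank lines and `#` comments are ignored.
fn parse_manifest(text: &str) -> Result<LinkManifest, String> {
    let mut manifest = LinkManifest::default();
    for (idx, line) in text.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split at the first `=` only, so values may themselves contain `=`.
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected key=value", idx + 1))?;
        match key {
            "linker" => manifest.linker = value.to_string(),
            "object" => manifest.objects.push(value.to_string()),
            "arg" => manifest.args.push(value.to_string()),
            "output" => manifest.output = value.to_string(),
            // Unknown keys are skipped so a producer can add fields later
            // without breaking this consumer.
            _ => {}
        }
    }
    if manifest.linker.is_empty() || manifest.output.is_empty() {
        return Err("manifest must name a linker and an output".to_string());
    }
    Ok(manifest)
}

/// Assemble the final linker line as an argv; nothing is executed.
/// Assumes a cc-style driver that accepts `-o <output>`.
fn linker_command_line(manifest: &LinkManifest) -> Vec<String> {
    let mut argv = vec![manifest.linker.clone()];
    argv.extend(manifest.objects.iter().cloned());
    argv.extend(manifest.args.iter().cloned());
    argv.push("-o".to_string());
    argv.push(manifest.output.clone());
    argv
}
```

The same `linker_command_line` output is roughly what the simpler alternative would provide directly: an option that prints the final linker line so a tool can run it itself.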