Automatic removal of unused standard header#192800
Automatic removal of unused standard header#192800serge-sans-paille wants to merge 6 commits intollvm:mainfrom
Conversation
f39fd4f to
faef13e
Compare
Sorry if it's a stupid question - does the tool consider whether C/C++ library includes incidentally/indirectly are taking place? In which case it might appear to be unused only because some other prior include pulled it in (for this particular C/C++ library implementation, for this particular version of it etc). |
It's not :-)
The algorithm of diskarzhañ is stupidly simple: each standard header is associated with the set of symbols it defines (e.g. |
3b348e9 to
46581d3
Compare
aengelke
left a comment
There was a problem hiding this comment.
Why? Please add a motivation to the PR description. For larger changes/long-term efforts/adding CI checks, an RFC would be good.
From a compile-time perspective, standard headers are not the problem, because many expensive ones get pulled in almost everywhere through Support/ headers anyway. If anything, we should focus on reducing unneeded includes from LLVM headers, also by using forward declarations where possible. Just focusing on standard headers provides little value, IMO.
1ed39da to
34238db
Compare
The goal would be to add a CI step that prevents future regression on that topic. Agreed for the RFC.
Based on that change I can confirm that the amount of preprocessed lines before / after the patch was only shrunk by 24_507 lines, which is mostly negligible compare to the 395_187_372 original ones (only compiling LLVM). As such the only value of this PR would be to pave the way for automatic regression, if that's a path we want to follow. |
|
We have commits that do IWYU cleanup every so often - how well does diskarzhañ match with that? I'm worried we'll end up in an infinite loop of commits from different tools micro-optimizing include placement.... |
|
For reference the location of the tool in question: https://pypi.org/project/diskarzhan/. FYI'ing @kimgr and @bolshakov-a from IWYU |
|
FWIW looking at the source of diskarzhan we now have another set of header-to-symbol mappings in the cosmos of LLVM (LLVM itself, IWYU and the new tool in town). |
|
CC'ing @alejandro-colomar and @AaronBallman as they are involved with mapping stuff |
| #include "llvm/ADT/DenseMapInfo.h" | ||
| #include "llvm/Support/Compiler.h" | ||
|
|
||
| #include <cassert> | ||
|
|
||
| namespace clang { | ||
|
|
||
| namespace tok { |
There was a problem hiding this comment.
The commit message seems to be incomplete. It says removal, but for example in this file, we only have additions.
| #ifndef LLVM_PROFILEDATA_CTXINSTRCONTEXTNODE_H | ||
| #define LLVM_PROFILEDATA_CTXINSTRCONTEXTNODE_H | ||
|
|
||
| #include <stddef.h> | ||
| #include <stdint.h> | ||
| #include <stdlib.h> | ||
|
|
||
| namespace llvm { | ||
| namespace ctx_profile { |
There was a problem hiding this comment.
I'd like to learn the rules that diskharzan uses. I'd like to learn how they differ from iwyu(1) and why.
Why does diskharzan exist in the first place? Is there anything from iwyu(1) that you believe is better addressed by diskharzan?
+1 to this concern, also this is a significant amount of churn for an NFC change that likely will cause merge conflicts for downstreams. So definitely +1 to having some kind of RFC to see if there's community buy-in, but also CC @llvm/clang-vendors for awareness of this PR directly. |
|
Cc: @bolshakov-a , @kimgr |
|
I think the tool is generally good because, for standard library symbols, the simple textual lookup is usually sufficient for doing IWYU (but not always, see https://discourse.llvm.org/t/should-stacktrace-provide-string/89605). However, I've noticed that the mapping in diskarzhan is not full even for C++17 ( |
|
hey folks, As of why diskharzan exists, we tried to deploy clang-include-cleaner and iwyu in the context of firefox codebase, but we have a lot (a lot) of different configurations, leading to various preprocessor guards, and so the status of whether a file is needed or not depends on the configuration, which is fixable but requires a lot of work. @bolshakov-a I'll make sure to look at the database from iwyu, that will avoid duplicating effort. |
There already is flip-flopping with |
See cppcheck-opensource/cppcheck#8474. Having a short look at the |
|
Note that diskarzhan and iwyu are indeed incompatible. When IWYU encounters e.g. There are some more cases when a standard library header is required even when the type is not explicitly written: // Requires '#include <initializer_list>' despite std::initializer_list is not explicitly spelled.
for (int i : {1,2,3})
do_smth_with(i);
// Requires '#include <typeinfo>' despite std::type_info is not explicitly spelled.
const auto& ti = typeid(int);That said, iwyu and diskarzhan implement different approaches: while iwyu exposes some more paranoid behavior, diskarzhan requires a header only when some symbol from it is explicitly written and thus provides a reasonable approximation for most use cases. |
patch automatically generated through