Schema compilations are not de-duplicated, even though scenarios can easily reference the same files multiple times. For example, the Validator Configuration for XRechnung compiles 6 unique .xsl files a total of 34 times. Both the computational overhead and the memory overhead are significant, because every redundantly compiled document is kept in memory.
I have implemented a small fix (~8 lines of code) that caches compiled schemas in ContentRepository. Here are my measurement results for cold starts on the same machine with the default usage example from https://github.com/itplr-kosit/validator-configuration-xrechnung (/usr/bin/time -v for measurement):
| Version | Time | Peak Memory |
| --- | --- | --- |
| v1.6.2 | ~11 s | ~800 MB |
| v1.6.2 patched | ~7.3 s | ~400 MB |
Note that the Saxon documentation explicitly states that XsltExecutable is thread-safe by design, so a single cached instance can be shared safely.
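For illustration, the caching idea can be sketched roughly as follows. This is not the actual patch: the class name `CompiledCache`, its `get` method, and the generic compile function are hypothetical stand-ins, and the real ContentRepository would pass the Saxon compilation step and hold XsltExecutable values instead.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: memoizes one compiled artifact per source URI.
// T stands in for a thread-safe compiled object such as XsltExecutable.
final class CompiledCache<T> {

    private final Map<URI, T> cache = new ConcurrentHashMap<>();
    private final Function<URI, T> compiler;

    // The compiler function is an assumption; it represents the existing
    // (expensive) compile step of the repository.
    CompiledCache(Function<URI, T> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached artifact, compiling at most once per normalized URI.
    T get(URI source) {
        return cache.computeIfAbsent(source.normalize(), compiler);
    }

    // Number of distinct sources compiled so far.
    int size() {
        return cache.size();
    }
}
```

Keying on the normalized URI means that paths such as `a/./b.xsl` and `a/b.xsl` share one entry. Since `computeIfAbsent` does not allow checked exceptions, a real compile step that throws one would need to wrap it in an unchecked exception.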