perf: fix performance regression when using CUDAGM with CPU Colvars#934
Conversation
It looks like the vector with custom allocator is copied instead of moved in the original code of cvm::atom_group::positions_shifted. This commit tries to fix the issue by avoiding any potential constructing and destroying new vectors in positions_shifted.
giacomofiorin
left a comment
There was a problem hiding this comment.
Looks good, I'm just quite puzzled by the fact that a full copy is happening despite the fact that the right-hand side is a temporary object:
shifted_pos_soa = atoms->positions_shifted(-1.0 * atoms_cog);
What is preventing the move-assignment operator from being used, and thus making a deep-copy assignment happening?
cvm::ag_vector_real_t is a std::vector of cvm::rvector objects (which are trivially copyable). Is CudaHostAllocator the problem?
| } | ||
|
|
||
| cvm::ag_vector_real_t cvm::atom_group::positions_shifted(cvm::rvector const &shift) const | ||
| int cvm::atom_group::positions_shifted(cvm::rvector const &shift, cvm::ag_vector_real_t& out_soa) const |
There was a problem hiding this comment.
Nitpick: if you're returning COLVARS_OKanyway, and never checking the return code, maybe this could be a void function?
There was a problem hiding this comment.
I am not sure why functions like calc_value() do not return Colvars error codes like others, so I guess you might refactor the code some day, and I make positions_shifted return the error code for the future.
There was a problem hiding this comment.
Because they are very old: the easier thing at the time was to throw an irrecoverable error, ending the simulation. Later, when we interfaced with VMD we started using error codes to raise errors without closing the user's process.
Maybe the custom allocator should be implemented to support move semantic but I don't know how to do it. |
|
@giacomofiorin It looks like if you assign a vector to a variable from a function return value, then at least the destructor is always called (see https://godbolt.org/z/njrnf9eGc). My understanding is that even if the compiler decides to move the return value into the variable, the underlying old object that the variable holds still needs to be deconstructed, and that's why the deconstructor is called. |
It looks like the vector with custom allocator is copied instead of moved in the original code of cvm::atom_group::positions_shifted. This commit tries to fix the issue by avoiding any potential constructing and destroying new vectors in positions_shifted.
Before this fix, nsys-ui shows:

After this fix, nsys-ui shows:
