Greptile Summary: This PR introduces a mesh-centric datapipe infrastructure with multiple mesh-based datapipes, loading surface and volumetric meshes directly from physicsnemo mesh files.
peterdsharpe left a comment:
Adding another round of comments!
```markdown
# Unified External Aerodynamics Recipe

NOTE THIS README IS AI GENERATED AND YOU SHOULD USE IT WITH EXTREME CAUTION.
I WILL GO THROUGH IT CAREFULLY FOR ACCURACY BEFORE FINAL REVIEW OR MERGE.
```
Dropping a TODO here, just so we don't forget
```python
def __init__(
    self,
    axes: list[Literal["x", "y", "z"]] | None = None,
    angle_range: tuple[float, float] = (-math.pi, math.pi),
```
Rather than parameterizing the rotation via:
- Random axis from ["x", "y", "z"]
- Random angle
It might be nicer to have a truly random rotation? With the current setup, the diversity of rotations is quite limited.
One possible algorithm:
- Generate a random vector -> normalize it -> that's your new x'.
- Generate another random vector -> project it into the plane normal to x' -> normalize it; that's your y'.
- z' is the cross product of x' and y'.

This generates a random rotation while preserving angles and a right-handed coordinate system (i.e., it's orthonormal).
You then arrange all three of these (x', y', z') as the columns of a 3x3 matrix and do mesh.transform(matrix).
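The suggested construction can be sketched in a few lines of PyTorch (a minimal sketch of the idea above, not the PR's implementation; the function name is illustrative):

```python
import torch

def random_rotation_matrix(generator=None):
    """Random right-handed orthonormal frame via Gram-Schmidt (sketch)."""
    x = torch.randn(3, generator=generator)
    x = x / x.norm()                      # x': random unit vector
    y = torch.randn(3, generator=generator)
    y = y - (y @ x) * x                   # project into the plane normal to x'
    y = y / y.norm()                      # y': unit vector orthogonal to x'
    z = torch.linalg.cross(x, y)          # z' = x' x y' keeps the frame right-handed
    return torch.stack([x, y, z], dim=1)  # columns are x', y', z'
```

Something like `mesh.transform(random_rotation_matrix())` would then apply it, assuming `mesh.transform` accepts a 3x3 matrix as in the comment above.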
I went with this algorithm, basically: https://kya8.github.io/p/uniform-sampling-of-so3-rotations/
But there are definitely cases where we may want to restrict to physically plausible rotations: rotating a car left to right is fine, but cars don't drive upside down, for example. So I put in a mode switch too.
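The linked approach boils down to sampling a unit quaternion uniformly on the 3-sphere, which a normalized 4-D Gaussian gives directly (a sketch of the sampling step only, not the PR's code):

```python
import torch

def random_unit_quaternion(generator=None):
    # A 4-D standard Gaussian, normalized, is uniform on S^3; unit
    # quaternions uniform on S^3 give uniform (Haar) rotations on SO(3).
    q = torch.randn(4, generator=generator)
    return q / q.norm()
```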
```python
points = mesh.points
if "L_ref" in gd:
    points = points / gd["L_ref"].float()
```
One thing here: nondimensionalization of the points is sometimes done in aerospace, but it's not super common. (By contrast, something like $C_p$ or $C_f$ nondimensionalization is pretty ubiquitous.) This is a good idea to do, but we should be sure to call it out in downstream use cases. This PR already does a good job of noting this in the docstring; just flagging it for any downstream consumers.
We can note it in the readme, yeah. And I think it's OK to just set L_ref to 1.0 and "turn it off", so to speak, too.
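For reference, the ubiquitous pressure-coefficient nondimensionalization mentioned above is just the following (illustrative helper, not the PR's code; symbols are the standard freestream quantities):

```python
import torch

def pressure_coefficient(p, p_inf, q_inf):
    # C_p = (p - p_inf) / q_inf, with q_inf the freestream dynamic pressure
    return (p - p_inf) / q_inf
```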
```python
target = torch.cat([pressure, wss], dim=-1)  # (N, 4)
...
points_list.append(pts)
embedding_list.append(torch.cat([pts, normals], dim=-1))  # (N, 6)
```
This collate function seems to be Transolver/GeoTransolver-specific, and in particular this part with the embeddings breaks equivariance. Do we want to have these controlled by a kwarg? Or, more broadly, is there a way to make this collate function less Transolver/GeoTransolver-specific?
Yes, this is absolutely specific to GeoTransolver, and it's something we could improve. Most datapipes use a generic collation, and a challenge with the mesh datapipes is that it doesn't totally make sense to "collate" a few mesh objects; IMO we might want to leave mesh datapipes at batch_size=1 locally.
Then we're really viewing this as "how do I massage the output keys from the datapipe into input keys for the network", which is what's actually going on here, honestly.
tl;dr we should just call this map_data_to_model, not collate, and have a dictionary of mappings for the models we want to support?
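That idea could look roughly like this (the names `map_data_to_model` and `MODEL_INPUT_MAPS` come from the suggestion above; everything else is illustrative, not the PR's actual API):

```python
import torch

def _geotransolver_inputs(sample):
    pts = sample["points"]       # (N, 3)
    normals = sample["normals"]  # (N, 3)
    return {
        "points": pts,
        # (N, 6) embedding GeoTransolver expects; as noted above,
        # concatenating raw coordinates breaks equivariance.
        "embeddings": torch.cat([pts, normals], dim=-1),
    }

# One mapping per supported model, instead of a model-specific collate.
MODEL_INPUT_MAPS = {"geotransolver": _geotransolver_inputs}

def map_data_to_model(model_name, sample):
    return MODEL_INPUT_MAPS[model_name](sample)
```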
```python
idx = 0
for name, ftype in field_types.items():
    if ftype == "pressure":
        out[..., idx] = out[..., idx] * q_inf + p_inf
```
This logic is technically duplicated with the code near line 171. It feels like there should be a way to deduplicate this by having an abstraction here, which would help long-term maintainability. This is just a suggestion though!
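One hedged sketch of such an abstraction: both call sites delegate to a single helper so the redimensionalization rule lives in exactly one place (channel bookkeeping is simplified to one channel per field here; the real field layout may differ):

```python
import torch

def to_physical(out, field_types, q_inf, p_inf):
    # Redimensionalize model outputs in place, one rule per field type.
    idx = 0
    for name, ftype in field_types.items():
        if ftype == "pressure":
            out[..., idx] = out[..., idx] * q_inf + p_inf
        idx += 1
    return out
```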
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
```yaml
name: drivaer_ml_surface

train_datadir: /path/to/your/PhysicsNeMo-DrivaerML/
train_datadir: /lustre/fsw/portfolios/coreai/projects/coreai_modulus_cae/datasets/PhysicsNeMo-DrivaerML/
```
Yes I had these purged, and then I broke all my training scripts. Arg.
```yaml
name: drivaer_ml_volume

train_datadir: /path/to/your/PhysicsNeMo-DrivaerML/
train_datadir: /lustre/fsw/portfolios/coreai/projects/coreai_modulus_cae/datasets/PhysicsNeMo-DrivaerML/
```

```yaml
name: highlift_surface

train_datadir: /lustre/fsw/portfolios/coreai/projects/coreai_modulus_cae/datasets/PhysicsNeMo-HighLiftAeroML/
```
peterdsharpe left a comment:
Good work overall! This is a thoughtful and large contribution and really cleans up the datapipes stack. I also think the unified training recipe will be well-received.
I'm approving to not hold this up further, and we've discussed most of this together. Some of the comments are suggestions, others are things that we should fix before merging.
Tests
Separately, if we can improve test coverage of some of these new features, that would be great. Here are some parts missing coverage; don't feel like you need to cover it all, but if there are any high-risk items that jump out, let's cover those:
- No tests for `MultiDataset.set_generator`/`set_epoch` in [test/datapipes/core/test_multi_dataset.py](test/datapipes/core/test_multi_dataset.py).
- No tests for `Dataset.set_generator`/`set_epoch` in [test/datapipes/core/test_dataset.py](test/datapipes/core/test_dataset.py).
- No tests for the `MeshDataset` + CUDA streams path (analogue of the stream tests in `test_dataset.py`).
- No tests for `MultiDataset` mixing `MeshDataset` sub-datasets (the docstring says it is supported).
- No tests for `MeshReader` subsampling + RNG reproducibility end-to-end, or for `DomainMeshReader.extra_boundaries`.
- No tests for `state_mixing_mode="concat_project"`.
- `test_close_closes_all` in `test_multi_dataset.py:197-203` has no assertions after `multi.close()` (passes trivially).
- `test_scale_each` in `test_mesh_readers.py:291-299` does not assert scaling actually changed coordinates (would pass if `ScaleMesh` were a no-op).
- Several tests depend on private attributes (`dataset._prefetch_futures`, `_ensure_executor()`).
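For instance, the missing `test_scale_each` assertion could look like this (a sketch in which a plain tensor scale stands in for `ScaleMesh`, whose exact API may differ):

```python
import torch

def test_scale_changes_coordinates():
    points = torch.rand(10, 3) + 1.0  # strictly positive, so scaling must move them
    scaled = points * 2.0             # stand-in for ScaleMesh(2.0)(mesh).points
    assert not torch.allclose(scaled, points)
    assert torch.allclose(scaled, points * 2.0)
```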
```python
self,
axes: list[Literal["x", "y", "z"]] | None = None,
distribution: torch.distributions.Distribution | None = None,
mode: Literal["axis_aligned", "uniform"] = "axis_aligned",
```
I thought we were defaulting to uniform?
| """Apply this transform to a DomainMesh. | ||
|
|
||
| Default: broadcasts ``__call__`` to interior and all boundaries | ||
| via :meth:`DomainMesh._map_meshes`, leaving domain-level |
This was changed from the private-method DomainMesh._map_meshes to the public-method DomainMesh.apply() as part of the now-merged #1558: https://github.com/NVIDIA/physicsnemo/pull/1558/changes#diff-11f619a0c0840c0bb17aec96535e01fce9c5ea67831bfe2e6d70f9345f09d6f4
For this PR, we probably want to: 1. merge from main (which will contain this), 2. update to use DomainMesh.apply instead, and 3. re-run tests just to double-check.
Thanks for pointing this out; I reviewed your PR, so I knew the API changed, but I hadn't adapted to it yet. I'll absolutely update to align with the new API before merge.
```python
# Derived from the standard unit-quaternion rotation formula using
# w²+x²+y²+z² = 1 to rewrite 1-2(…) terms as sums of squared components.
# ww wx wy wz xw xx xy xz yw yx yy yz zw zx zy zz
self._q2r_map = torch.tensor(
```
Soft-recommendation (not a requirement) to reduce the "magic" of this hardcoded mapping, and instead use the Rodrigues formula (which exists in physicsnemo/mesh/transformations/geometric.py:202-216). The current 16x16 tensor is a little hard to verify correctness for by inspection.
Here, that might look like a) deleting this mapping, and then b) later on using something like:
```python
@staticmethod
def _quaternion_to_rotation_matrix(
    q: Float[torch.Tensor, "4"],
) -> Float[torch.Tensor, "3 3"]:
    r"""Unit quaternion :math:`(w, \vec v)` to rotation matrix via Rodrigues' formula.

    :math:`R = (2w^2 - 1)\,I + 2\,\vec v\vec v^\top + 2w\,[\vec v]_\times`,
    where :math:`[\vec v]_\times` is the skew-symmetric cross-product matrix of
    :math:`\vec v`.

    Parameters
    ----------
    q : torch.Tensor
        Unit quaternion :math:`(w, x, y, z)`, shape :math:`(4,)`.

    Returns
    -------
    torch.Tensor
        Rotation matrix, shape :math:`(3, 3)`.
    """
    w, x, y, z = q.unbind()
    zero = torch.zeros_like(w)
    v_cross = torch.stack(
        [
            torch.stack([zero, -z, y]),
            torch.stack([z, zero, -x]),
            torch.stack([-y, x, zero]),
        ]
    )
    return (
        (2 * w * w - 1) * torch.eye(3, dtype=q.dtype, device=q.device)
        + 2 * torch.outer(q[1:], q[1:])
        + 2 * w * v_cross
    )
```

```python
transform_global_data=self.transform_global_data,
assume_invertible=True,
)
```
An `elif self.mode == "axis_aligned":` with a trailing `else: raise ...` (or a match-case block) would be safer here: if the user accidentally typos "uniform", we are silently falling back to axis_aligned rather than loudly warning.
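A minimal sketch of that explicit dispatch (the function body is illustrative, not the PR's rotation code):

```python
def resolve_mode(mode):
    # A trailing else turns a typo'd mode into a loud error
    # instead of a silent fallback to axis_aligned.
    if mode == "uniform":
        return "uniform"
    elif mode == "axis_aligned":
        return "axis_aligned"
    else:
        raise ValueError(f"Unknown rotation mode: {mode!r}")
```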
```python
transform_global_data=self.transform_global_data,
assume_invertible=True,
)
```
The same `elif`/`else` comment as near line 453 applies here.
```python
_NONDIM_TYPE_MAP = {"scalar": "pressure", "vector": "stress"}


def _to_physical(
```
```python
return torch.mean((pred - target) ** 2.0)


def compute_rmse(
```
Typically "RMSE" means root mean squared error; could we name this `relative_mse` to reduce confusion, perhaps?
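For clarity, relative MSE (squared error normalized by the target's energy) is a different quantity from root mean squared error; a hedged sketch of the former (the PR's exact definition may differ):

```python
import torch

def relative_mse(pred, target):
    # Mean squared error normalized by the target's mean squared magnitude.
    return torch.mean((pred - target) ** 2) / torch.mean(target ** 2)
```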
```python
Mapping of field names to types. Order determines channel indices.
Example: {"pressure": "scalar", "velocity": "vector", "turbulence": "scalar"}
loss_type : Literal["huber", "mse", "rmse"], optional
    Type of loss to compute. Default is "huber".
```
This says we default to Huber, below in the code we actually default to MSE
Ah, OK, sure. Pretty sure all my YAMLs are set to huber, so it's not too wrong, but I'll fix the default below.
```python
# limitations under the License.

"""
Mesh readers - Load physicsnemo Mesh / DomainMesh from physicsnemo mesh format (.pt).
```
Throughout this file (~5 places) we refer to the Mesh format as using .pt extensions rather than .pmsh.
```python
def to(self, device: torch.device | str) -> Transform:
    """
```
Transform.to does not migrate Generators, but MeshTransform.to does - seems like a potential footgun?
PhysicsNeMo Pull Request
This brings in a new training recipe for external aerodynamics. We will build on this in the coming weeks, but right now it contains the following functionality:
Note that the dataloader built here is a hybrid of Hydra and Python initialization: the easiest way to make a flexible multi-dataset loader with all of this was to instantiate each dataset with Hydra, but then package the whole business up in Python.
There aren't significant changes to the GeoTransolver recipe other than these, and eventually I'll get Transolver in here too. Next I want to intercept DomainMesh for volumetric learning: I've got a prototype datapipe reader for it, and the mesh-based datapipes should handle it, but it's not tested yet. I'm willing to prune this from the PR, if desired, until we stand up an example.
One thing I want to add to this example is the ability to set and log a random seed for the training runs. I think it will be necessary for the experiments.
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.