Skip to content

Numeric array as atom#3

Closed
bdlucas1 wants to merge 9 commits into
masterfrom
numeric-array-as-atom
Closed

Numeric array as atom#3
bdlucas1 wants to merge 9 commits into
masterfrom
numeric-array-as-atom

Conversation

@bdlucas1
Copy link
Copy Markdown
Owner

@bdlucas1 bdlucas1 commented Nov 6, 2025

WIP implementation of NumericArray as atom

Ignore tests - they are not correct yet

Comment thread test/builtin/test_numericarray.py Outdated
@@ -0,0 +1,68 @@
import numpy as np
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TESTS ARE WRONG - IGNORE FOR NOW

Copy link
Copy Markdown
Collaborator

@rocky rocky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some general comments.

Look at Mathics3#1505. And when that gets merged, Changes to mathics.list.constructing probably won't be needed.

A "SORT_KEY" will probably need to be defined. See https://github.com/Mathics3/mathics-core/blob/b71f487160472f46b227b37899815dabc3acda48/mathics/core/keycomparable.py#L8-L39 . A better name for this probably would have been "ELEMENT_ORDER". The WMA Builtin function "Order[]" can probably be used to figure out where this fits in the somewhat arbitrary heirarchy, e.g. Before or after ByteArray.

There is of course a lot of little cleanup to be done, some of which you've already noted.

@bdlucas1
Copy link
Copy Markdown
Owner Author

bdlucas1 commented Nov 8, 2025

And when that gets merged, Changes to mathics.list.constructing probably won't be needed.

OK. If I understand correctly, it means that NumericArray will have a lazily instantiated property .elements that returns a list or tuple of NumericArrays, each wrapping a (view on a) subarray of the numpy array, or a list or tuple of numeric types (Integer, etc.) in the case of the .elements for a 1-d array.

That's very cool from the perspective of maintainability, but I have a slight concern about performance for Normal as this requires an extra step of wrapping the subarrays in NumericArray, although Normal applied to a large NumericArray is (very) expensive anyway, so maybe it doesn't matter so much.

The bigger performance concern is that this must never happen incidentally, i.e. some code unwittingly looking at .elements, else performance will suffer (possibly enormously). This was one of the issues with NumpyArrayListExpression - there were several places in the code, even during construction, that innocently looked at .elements. Naming this e.g. .parts instead of .elements might help in that regard.

For Normal by the way this has to happen recursively for NumericArray, but I'm not sure it is in the code currently.

This same treatment could be applied to Part.

However from my perspective all of the above is nice-to-have but not really essential for now - NumericArray without Normal, Part, First etc. would still be useful for GraphicsComplex (and eventually CSG). I'll remove the Normal stuff from this PR and assume it will be handled somehow, somewhere, somewhen.

@bdlucas1
Copy link
Copy Markdown
Owner Author

bdlucas1 commented Nov 8, 2025

A "SORT_KEY" will probably need to be defined

There's an element_order property that was created by Codex and I just left it - that's not what is wanted?

@bdlucas1
Copy link
Copy Markdown
Owner Author

bdlucas1 commented Nov 8, 2025

wrapping a (view on a) subarray of the numpy array

The validity of using a view instead of a copy depends on a) whether long-term references to .elements are stored by anyone and b) whether we consider NumericArrays to be immutable.

On the latter point I didn't see anything in the WL documentation, but the implementation seems to disallow assignments like na[[1,1]]=17 (although the error message is weird - "Partspecification x〚1,1〛is longer than depth of object". I'm fine if we treat NumericArray as immutable (and I think there's a flag that can be set in numpy.ndarray to disallow writes).

Comment thread mathics/core/atoms.py

@property
def element_order(self) -> tuple:
return (BASIC_ATOM_STRING_OR_BYTEARRAY_SORT_KEY, *self._summary)
Copy link
Copy Markdown
Collaborator

@rocky rocky Nov 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name BASIC_ATOM_SRING_OR_BYTEARRAY_SORT_KEY is changed in Mathics3#1505 so this will need to be adjust.

But the reason this is not correct is that, in Python, you can't compare a NumericArray with a String, even though they have similar structure (have elements).

Exprementation using the WMA function Order[] can be used to find out which of these comes (somewhat arbitrarily) first. And then how that compares with ByteArray, and this then gets set in mathics.core.keycomparable.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, right now I am thinking the SORT_KEY name to ELEMENT_ORDER across the board.

@bdlucas1 bdlucas1 force-pushed the numeric-array-as-atom branch from 1f95f16 to 1d92e34 Compare November 9, 2025 14:07
@rocky
Copy link
Copy Markdown
Collaborator

rocky commented Nov 9, 2025

And when that gets merged, Changes to mathics.list.constructing probably won't be needed.

OK. If I understand correctly, it means that NumericArray will have a lazily instantiated property .elements that returns a list or tuple of NumericArrays, each wrapping a (view on a) subarray of the numpy array, or a list or tuple of numeric types (Integer, etc.) in the case of the .elements for a 1-d array.

That's very cool from the perspective of maintainability, but I have a slight concern about performance for Normal as this requires an extra step of wrapping the subarrays in NumericArray, although Normal applied to a large NumericArray is (very) expensive anyway, so maybe it doesn't matter so much.

The bigger performance concern is that this must never happen incidentally, i.e. some code unwittingly looking at .elements, else performance will suffer (possibly enormously). This was one of the issues with NumpyArrayListExpression - there were several places in the code, even during construction, that innocently looked at .elements. Naming this e.g. .parts instead of .elements might help in that regard.

For Normal by the way this has to happen recursively for NumericArray, but I'm not sure it is in the code currently.

This same treatment could be applied to Part.

However from my perspective all of the above is nice-to-have but not really essential for now - NumericArray without Normal, Part, First etc. would still be useful for GraphicsComplex (and eventually CSG). I'll remove the Normal stuff from this PR and assume it will be handled somehow, somewhere, somewhen.

As you see with ByteArray, there are lots of little things in adding any Atom or new Builtin Function. It is impossbile and probably not desirable to get everything done in one mongo PR. Instead, we've been merging early and often.

@bdlucas1
Copy link
Copy Markdown
Owner Author

bdlucas1 commented Nov 9, 2025

I screwed this PR up so closing in favor of #9

@bdlucas1 bdlucas1 closed this Nov 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants