MNIST Trainer bug fix by longforu · Pull Request #12 · tinygrad/teenygrad

longforu · 2025-08-01T01:21:05Z

Potential bug fix for MNIST trainer.

When cloning the repo and running it on the latest numpy, this error occurs:

Traceback (most recent call last):
  File "/Users/longtran/Programs/teenygrad/mnist.py", line 90, in <module>
    train(model, X_train, Y_train, optimizer, steps=100)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/longtran/Programs/teenygrad/mnist.py", line 22, in train
    loss = lossfn(out, y)
  File "/Users/longtran/Programs/teenygrad/mnist.py", line 10, in <lambda>
    def train(model, X_train, Y_train, optim, steps, BS=128, lossfn=lambda out,y: out.sparse_categorical_crossentropy(y),
                                                                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/Users/longtran/Programs/teenygrad/teenygrad/tensor.py", line 794, in sparse_categorical_crossentropy
    loss_mask = Y != ignore_index
                ^^^^^^^^^^^^^^^^^
  File "/Users/longtran/Programs/teenygrad/teenygrad/tensor.py", line 753, in __ne__
    def __ne__(self, x) -> Tensor: return (self<x) + (self>x)   # type: ignore
                                           ^^^^^^
  File "/Users/longtran/Programs/teenygrad/teenygrad/tensor.py", line 749, in __lt__
    def __lt__(self, x) -> Tensor: return mlops.Less.apply(*self._broadcasted(x, False))
                                                            ~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/Users/longtran/Programs/teenygrad/teenygrad/tensor.py", line 661, in _broadcasted
    y = Tensor(y, device=self.device, requires_grad=False, dtype=self.dtype if self.dtype != dtypes.bool and self.dtype.__class__ is not ImageDType else dtypes.float32)
  File "/Users/longtran/Programs/teenygrad/teenygrad/tensor.py", line 62, in __init__
    data = LazyBuffer.loadop(LoadOps.CONST, tuple(), dtype or Tensor.default_type, device, data)
  File "/Users/longtran/Programs/teenygrad/teenygrad/lazy.py", line 35, in loadop
    elif op == LoadOps.CONST: return LazyBuffer(np.full(shape, arg, dtype=dtype.np))
                                                ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/longtran/Programs/teenygrad/env/lib/python3.13/site-packages/numpy/_core/numeric.py", line 387, in full
    multiarray.copyto(a, fill_value, casting='unsafe')
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python integer -1 out of bounds for uint8

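This happens because `sparse_categorical_crossentropy` compares a uint8 label tensor against `ignore_index` (-1). `_broadcasted` builds the constant with the tensor's own dtype, and -1 cannot be represented as a uint8, so the latest numpy raises an OverflowError instead of wrapping the value. This can be fixed with a cast.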
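For anyone who wants to see the failure in isolation, here is a minimal NumPy-only sketch of the same situation. It is not the diff from this PR: the helper names `try_const` and `loss_mask` are made up for illustration, and the choice of `int32` as the cast target is an assumption. On NumPy 2.x the uint8 case raises the OverflowError shown in the traceback; older releases may wrap the value silently instead, so the sketch handles both.

```python
import numpy as np

def try_const(value, dtype):
    """Build a scalar constant the same way lazy.py does (np.full with an empty shape).

    Returns the array, or None if NumPy refuses the value for this dtype.
    """
    try:
        return np.full(tuple(), value, dtype=dtype)
    except OverflowError:
        # NumPy 2.x: "Python integer -1 out of bounds for uint8"
        return None

def loss_mask(labels, ignore_index=-1):
    """Hypothetical helper: cast labels to a signed type before comparing.

    int32 is wide enough to hold every uint8 label unchanged. Casting to int8
    would be wrong here, because label 255 would wrap to -1 and get masked.
    """
    signed = labels.astype(np.int32)
    return signed != ignore_index

labels = np.array([3, 0, 255, 7], dtype=np.uint8)

unsigned_const = try_const(-1, np.uint8)  # None on NumPy 2.x, wrapped value on older releases
signed_const = try_const(-1, np.int32)    # always representable
mask = loss_mask(labels)

print("uint8 constant:", unsigned_const)
print("int32 constant:", signed_const)
print("mask:", mask)
```

Where exactly the cast belongs (in `mnist.py` when loading the labels, or inside `tensor.py`) is a design choice; the point is only that the comparison has to happen in a dtype that can hold -1.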

The other change is to the loss function. Cross-entropy loss should be the negative log-probability (-logprob) of the target class. Without this change the trainer did not converge for me.
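To illustrate the sign convention, below is a small NumPy sketch of sparse categorical cross-entropy. It is an independent reference version written for this comment thread, not teenygrad's implementation, and it leaves out `ignore_index` handling for brevity. The loss is the mean of the negated log-probability of the correct class, so it is non-negative and gets smaller as the model becomes more confident in the right answer.

```python
import numpy as np

def sparse_categorical_crossentropy(logits, labels):
    """Mean of -log p(correct class), computed via a numerically stable log-softmax."""
    # Subtract the row maximum so np.exp cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the target class in each row.
    picked = log_probs[np.arange(len(labels)), labels]
    # Note the minus sign: log-probabilities are <= 0, the loss must be >= 0.
    return -picked.mean()

logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 2.0]])

right = sparse_categorical_crossentropy(logits, np.array([0, 2]))  # confident and correct
wrong = sparse_categorical_crossentropy(logits, np.array([1, 1]))  # confident and incorrect

print("loss when predictions match labels:", right)
print("loss when predictions miss labels: ", wrong)
```

If the minus sign is dropped, an optimizer that minimizes the value would push the log-probability of the correct class down rather than up, which would be consistent with the non-convergence described above.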

syzygy137 · 2025-08-11T05:13:24Z

I got this error too

0danylo

This fixed it for me as well. Could help for people looking at this repo.

Fix mnist trainer

73323da

0danylo approved these changes Nov 3, 2025

longforu wants to merge 1 commit into tinygrad:main from longforu:fix-mnist-trainer
