This line in the dataloader causes a segmentation fault when the dataloader is used on the Mac/MPS platform:
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=True)
Changing pin_memory to False works as a temporary fix.
I'm going to investigate further whether this setting causes any problems.
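
In the meantime, one possible workaround is to request pinned memory only when CUDA is available, since pinned (page-locked) host memory is mainly useful for speeding up host-to-CUDA copies. This is a minimal sketch, not the repository's actual loader code: the `tokens` list and the `use_pinned` name are placeholders I made up for illustration, and I have not verified that this avoids the crash on every PyTorch version.

```python
import torch

# Assumption: pinning is only worth requesting when a CUDA device exists.
# On Mac/MPS this evaluates to False, which matches the temporary fix above.
use_pinned = torch.cuda.is_available()

# Placeholder data standing in for the tokens read by the dataloader.
tokens = [1, 2, 3, 4]

# Same call as the failing line, but with pin_memory decided at runtime.
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=use_pinned)
```

With this, CUDA machines keep the pinned buffer while Mac/MPS machines fall back to ordinary pageable memory.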