Hi Andrej and NanoGPT community,
I wanted to share a beginner-friendly NanoGPT fork called Learn-nanoGPT by niloydebbarma-code. It’s focused on Shakespeare character-level text generation and offers several practical features:
- Automatically detects whether you're running on a GPU or CPU and adjusts the model size and training parameters accordingly.
- Supports multiple tokenization methods, including character-level, BPE, and compressed formats.
- Provides clear documentation and robust error handling to help newcomers get started easily.
- Trains significantly faster than the original NanoGPT: about 15-20 minutes on an NVIDIA P100 GPU and roughly 1-1.5 hours on CPU for 1,000 iterations.
- Supports Flash Attention on GPU and runs smoothly on CPU without extra configuration.
- Is optimized for educational use and quick experimentation, making it well suited to limited hardware.
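The hardware-adaptive idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: the function name `auto_config` and the specific hyperparameter values are my own assumptions.

```python
def auto_config():
    """Choose model hyperparameters based on available hardware.

    A hypothetical sketch of hardware-adaptive configuration; the
    exact knobs and values used by Learn-nanoGPT may differ.
    """
    try:
        import torch  # fall back to CPU settings if torch is absent
        has_gpu = torch.cuda.is_available()
    except ImportError:
        has_gpu = False

    if has_gpu:
        # Larger model and batch size when a GPU is detected
        return dict(device="cuda", n_layer=6, n_head=6, n_embd=384, batch_size=64)
    # Scaled-down settings so CPU training stays tractable
    return dict(device="cpu", n_layer=4, n_head=4, n_embd=128, batch_size=12)
```

Training code would then build the model from the returned dict, so the same script runs unchanged on either device.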
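For readers new to the character-level mode mentioned above, the core of such a tokenizer is just a bidirectional map between characters and integer ids, built from the training text's vocabulary. A minimal sketch (the helper name `build_char_codec` is my own, not from the fork):

```python
def build_char_codec(text):
    """Build character-level encode/decode functions from a corpus.

    Illustrative only; Learn-nanoGPT's actual tokenizer code may be
    structured differently.
    """
    chars = sorted(set(text))                 # vocabulary: every distinct character
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    encode = lambda s: [stoi[c] for c in s]   # string -> list of ids
    decode = lambda ids: "".join(itos[i] for i in ids)
    return encode, decode, len(chars)
```

For the Shakespeare corpus this yields a vocabulary of only a few dozen symbols, which is what keeps character-level models small enough to train quickly on modest hardware.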
I believe this fork could be useful for the community, especially for users looking for a simpler, hardware-adaptive way to train GPT models on classic text datasets.
Thanks for the amazing work on NanoGPT.