Feat/scaling laws notebook setup #612

philinhphan · 2025-06-03T22:53:41Z

No description provided.

This commit completes the setup of the "scaling_laws.ipynb" notebook as per the issue requirements. Key changes include:

- Creation of a dummy "gutenberg_poetry.txt" and its processed version in "data/gutenberg_poetry/" for initial experimentation.
- Added a notebook cell to calculate and display non-embedding parameters (N) for predefined GPT models, following Chinchilla guidelines (N = Total - WTE - WPE, accounting for tied lm_head).
- Added notebook cells to derive the formula for training steps (S) and to calculate S and total tokens (D) for various model sizes and compute budgets.
- Replaced the original Task 3 placeholder with a comprehensive markdown guide detailing how to:
  - Perform model training using `train.py`.
  - Record final validation losses.
  - Plot L vs. N.
  - Extract N_opt and D_opt.
  - Fit scaling laws (N_opt vs. C, D_opt vs. C) to derive parameters N0, a, D0, and b.

The notebook is now structured for you to replace the dummy dataset, run the preparatory calculations, and follow the guide to perform the full scaling law analysis.
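For readers without the notebook at hand, the three calculations named in this description can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual cells: it assumes a GPT-2-style block layout with biases, the common approximation C ≈ 6·N·D for training compute, and power laws of the form N_opt = N0 · C^a fitted by least squares in log-log space. The function names `gpt_param_counts`, `tokens_and_steps`, and `fit_power_law`, and the `tokens_per_step` parameter, are hypothetical.

```python
import math


def gpt_param_counts(n_layer, n_embd, vocab_size, block_size):
    """Return (total, non_embedding) parameter counts.

    Assumes a GPT-2-style architecture with biases and an lm_head tied
    to the token embedding, so the lm_head adds no parameters of its own.
    """
    wte = vocab_size * n_embd   # token embedding (shared with lm_head)
    wpe = block_size * n_embd   # learned position embedding
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d)
    per_block = 12 * n_embd ** 2 + 13 * n_embd
    ln_f = 2 * n_embd           # final LayerNorm weight and bias
    total = wte + wpe + n_layer * per_block + ln_f
    return total, total - wte - wpe   # N = Total - WTE - WPE


def tokens_and_steps(compute_flops, n_params, tokens_per_step):
    """Return (D, S) for a compute budget C, assuming C ~= 6 * N * D.

    tokens_per_step is a hypothetical parameter: the number of tokens
    consumed per optimizer step (batch size * context length * any
    gradient accumulation).
    """
    d_tokens = compute_flops / (6 * n_params)
    steps = math.ceil(d_tokens / tokens_per_step)
    return d_tokens, steps


def fit_power_law(xs, ys):
    """Fit y = coeff * x**exponent by linear regression on (log x, log y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    exponent = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum(
        (a - mx) ** 2 for a in lx
    )
    coeff = math.exp(my - exponent * mx)
    return coeff, exponent


if __name__ == "__main__":
    # GPT-2 small hyperparameters, used only as a familiar reference point.
    total, n = gpt_param_counts(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024)
    print(f"total={total:,}  non-embedding N={n:,}")

    d, s = tokens_and_steps(compute_flops=1e17, n_params=n, tokens_per_step=12 * 1024)
    print(f"D={d:.3e} tokens, S={s} steps")

    # Synthetic points generated from a known law, purely to exercise the fit.
    cs = [1e15, 1e16, 1e17, 1e18]
    n0, a = fit_power_law(cs, [2.0 * c ** 0.5 for c in cs])
    print(f"N0={n0:.3f}, a={a:.3f}")
```

The same `fit_power_law` call applies to D_opt vs. C to obtain D0 and b. Real fits would use the (C, N_opt) and (C, D_opt) pairs extracted from the recorded validation losses rather than synthetic points.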

philinhphan and others added 2 commits June 3, 2025 23:57

task file added

863b333

