Conversation

@derpyplops derpyplops commented Jul 13, 2023

Solves NOT-291

This is quite a complex change. It trains a reporter model per prompt, then evaluates each reporter both on its individual prompt and on the mean credence across prompts. I should probably also add tests for the new file structure.

The new flag is --probe_per_prompt, added on Run.

To test, run elk elicit gpt2 imdb --num_gpus 2 --probe_per_prompt, with and without the flag.
elk eval should also work.
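A minimal sketch of the per-prompt scheme described above, using toy data and scikit-learn in place of elk's reporter training (the array layout, `LogisticRegression` probes, and all variable names here are illustrative assumptions, not elk's actual API):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy hidden states: (num_prompts, num_examples, hidden_dim),
# with one binary label per example shared across prompt templates.
num_prompts, num_examples, hidden_dim = 3, 200, 16
labels = rng.integers(0, 2, size=num_examples)
hiddens = rng.normal(size=(num_prompts, num_examples, hidden_dim))
# Make one feature weakly label-dependent so the probes have signal to find.
hiddens[..., 0] += labels * 2.0

# Train one probe per prompt template (the --probe_per_prompt idea).
probes = [LogisticRegression().fit(hiddens[p], labels) for p in range(num_prompts)]

# Evaluate each probe on its own prompt...
per_prompt_acc = [probes[p].score(hiddens[p], labels) for p in range(num_prompts)]

# ...and also via the mean credence averaged over prompts.
credences = np.stack(
    [probes[p].predict_proba(hiddens[p])[:, 1] for p in range(num_prompts)]
)
mean_credence = credences.mean(axis=0)
mean_acc = ((mean_credence > 0.5) == labels).mean()
```

Averaging credences over prompts tends to cancel prompt-specific noise, which is why evaluating on the mean credence is worth comparing against the per-prompt scores.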

@derpyplops derpyplops force-pushed the not-291-train-probe-per-prompt branch from daec121 to 2420ae0 Compare July 14, 2023 15:40
@derpyplops derpyplops marked this pull request as ready for review July 20, 2023 15:44
@lauritowal lauritowal left a comment

We'll run the sweeps first and see if it improves anything. If yes, we can review this and merge it.
