Running out of GPU memory after several minutes of training #7

@ganzhi

Hi,

I get a CUDA out-of-memory error after several minutes of training. Is there a way to fix it?

(py38) C:\Src\GitHub\MadMario>python main.py
Loading model at checkpoints\2021-02-20T16-13-06\trained_mario.chkpt with exploration rate 0.1
Episode 0 - Step 660 - Epsilon 0.1 - Mean Reward 2990.0 - Mean Length 660.0 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 10.198 - Time 2021-02-20T16:29:03
Episode 20 - Step 5262 - Epsilon 0.1 - Mean Reward 1311.095 - Mean Length 250.571 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 61.936 - Time 2021-02-20T16:30:05
Episode 40 - Step 9888 - Epsilon 0.1 - Mean Reward 1149.829 - Mean Length 241.171 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 62.843 - Time 2021-02-20T16:31:08
Episode 60 - Step 13407 - Epsilon 0.1 - Mean Reward 1072.361 - Mean Length 219.787 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 47.898 - Time 2021-02-20T16:31:56
Episode 80 - Step 19197 - Epsilon 0.1 - Mean Reward 1144.407 - Mean Length 237.0 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 77.715 - Time 2021-02-20T16:33:14
Episode 100 - Step 22474 - Epsilon 0.1 - Mean Reward 1060.12 - Mean Length 218.14 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 44.237 - Time 2021-02-20T16:33:58
Episode 120 - Step 26864 - Epsilon 0.1 - Mean Reward 1015.29 - Mean Length 216.02 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 58.86 - Time 2021-02-20T16:34:57
Episode 140 - Step 32109 - Epsilon 0.1 - Mean Reward 1094.56 - Mean Length 222.21 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 71.322 - Time 2021-02-20T16:36:08
Traceback (most recent call last):
  File "main.py", line 59, in <module>
    action = mario.act(state)
  File "C:\Src\GitHub\MadMario\agent.py", line 57, in act
    state = torch.FloatTensor(state).cuda() if self.use_cuda else torch.FloatTensor(state)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.00 GiB total capacity; 7.56 GiB already allocated; 0 bytes free; 7.74 GiB reserved in total by PyTorch)
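
One possible explanation, offered as an assumption rather than a diagnosis: the log shows Mean Loss 0.0 and Mean Q Value 0.0 at every checkpoint, so no learning step appears to have run yet. That points away from the training graph and toward something that grows with every environment step, such as a replay buffer whose observations are kept on the GPU. The sketch below only estimates how large such a buffer would be at the last logged step. The observation shape (4 x 84 x 84, float32) and the two observations per transition are assumed values, not taken from the traceback, and `replay_buffer_bytes` is a hypothetical helper.

```python
# Rough size estimate for a replay buffer after a given number of steps.
# ASSUMPTIONS (not confirmed by the issue): each stored transition holds
# two float32 observations (state and next_state) of shape 4 x 84 x 84.

BYTES_PER_FLOAT32 = 4


def replay_buffer_bytes(num_transitions, obs_shape=(4, 84, 84), obs_per_transition=2):
    """Return the bytes needed to store the observations of num_transitions."""
    elements = 1
    for dim in obs_shape:
        elements *= dim
    return num_transitions * obs_per_transition * elements * BYTES_PER_FLOAT32


def to_gib(num_bytes):
    """Convert bytes to GiB, the unit used in the PyTorch error message."""
    return num_bytes / 2**30


steps_at_last_log = 32109  # last step count printed before the crash
estimate = replay_buffer_bytes(steps_at_last_log)
print(f"Estimated buffer size: {to_gib(estimate):.2f} GiB")
```

If the estimate is of the same order as the 7.56 GiB reported as already allocated, keeping stored transitions in CPU memory and moving only each sampled batch to the GPU would be a reasonable thing to try. Shrinking the buffer's maximum length is another option under the same assumption.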
