How to use pre-made model #3

Closed
ccallahan opened this issue Mar 19, 2021 · 1 comment

Comments

@ccallahan

Hey! Admittedly I just came across this, but how hard would it be to use a model I've already generated?

@johnnymcmike

Hi, I was in the same boat for a while until I figured it out. Assuming that you're like me and trying to use the bot with a model you've already finetuned, you've gotta clone the repo, start from scratch, and then copy the .tar file of your checkpoint into that directory. Decompress it so that the directory now has a "checkpoint" folder in it.
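
If you'd rather script the unpacking step, here is a minimal sketch using Python's standard tarfile module. The function name extract_checkpoint is made up for illustration, and it assumes the archive was created with a top-level "checkpoint" folder inside it.

```python
import tarfile
from pathlib import Path


def extract_checkpoint(archive, repo_dir="."):
    """Unpack a checkpoint archive into repo_dir and return the checkpoint path.

    Hypothetical helper: assumes the archive contains a top-level
    "checkpoint" folder.
    """
    repo_dir = Path(repo_dir)
    # "r:*" lets tarfile detect the compression (plain .tar, .tar.gz, ...).
    with tarfile.open(archive, "r:*") as tar:
        # Only extract archives you created yourself or otherwise trust.
        tar.extractall(repo_dir)
    checkpoint = repo_dir / "checkpoint"
    if not checkpoint.is_dir():
        raise FileNotFoundError(
            f"no 'checkpoint' folder in {repo_dir} after extracting {archive}"
        )
    return checkpoint
```

Run from the cloned repo, something like extract_checkpoint("my_checkpoint.tar") (a placeholder file name) should leave the "checkpoint" folder next to main.py, or raise an error if the archive was laid out differently.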

From here you want to download the base model that your finetuned model is built on. This will be 124M, 355M, etc. To do this, open the Python interpreter in that same directory and run gpt2.download_gpt2(model_name="124M"), for example. After this, exit Python and change the "MODEL_NAME" variable in main.py to the name of the model you just downloaded.
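
The download step can also go in a small script instead of the interpreter. This is only a sketch: it assumes gpt2 here means the gpt-2-simple package imported as gpt2, and that downloads land in a "models" folder (that library's default). The helper name ensure_base_model is made up.

```python
import os

# Must match the base model your checkpoint was finetuned from.
MODEL_NAME = "124M"


def ensure_base_model(model_name=MODEL_NAME, model_dir="models"):
    """Download the base GPT-2 model unless it is already on disk.

    Hypothetical helper; returns the folder the model lives in.
    """
    target = os.path.join(model_dir, model_name)
    if not os.path.isdir(target):
        # Imported lazily because it pulls in TensorFlow.
        # Assumption: the bot uses gpt-2-simple, imported as gpt2.
        import gpt_2_simple as gpt2

        gpt2.download_gpt2(model_dir=model_dir, model_name=model_name)
    return target
```

Calling ensure_base_model() from the repo directory skips the download when models/124M already exists, so it is safe to run more than once.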

If all goes well, after this you'll be able to run python main.py -t and test out your model. I'm not sure this is the cleanest, easiest, or most correct way of doing this, but it worked for me, and I'll respond here if I discover something better.
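
Before running main.py, a quick sanity check of the folder layout can save a confusing error. The sketch below is an illustration, not part of the repo: it assumes main.py sets the model with a plain line like MODEL_NAME = "124M" and that downloaded models sit under "models". The function name preflight is made up.

```python
import re
from pathlib import Path


def preflight(repo_dir="."):
    """Return a list of problems found; an empty list means the layout looks right."""
    repo = Path(repo_dir)
    problems = []
    if not (repo / "checkpoint").is_dir():
        problems.append("missing 'checkpoint' folder")
    main_py = repo / "main.py"
    if not main_py.is_file():
        problems.append("missing main.py")
        return problems
    # Assumption: a plain string assignment such as MODEL_NAME = "124M".
    match = re.search(
        r'^MODEL_NAME\s*=\s*["\']([^"\']+)["\']', main_py.read_text(), re.M
    )
    if match is None:
        problems.append("could not find a MODEL_NAME string in main.py")
    elif not (repo / "models" / match.group(1)).is_dir():
        problems.append(f"base model '{match.group(1)}' not found under models/")
    return problems
```

If preflight() returns an empty list, the checkpoint folder, main.py, and the base model named in MODEL_NAME are all where this walkthrough expects them.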

@NickBrisebois NickBrisebois pinned this issue May 3, 2021