Add an asset #160

Open
russellkim opened this issue Mar 30, 2024 · 2 comments · May be fixed by #167

Comments

@russellkim

SOLAR - https://arxiv.org/abs/2312.15166

name: SOLAR
organization: Upstage.ai
description:

We present a methodology for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications and continued pretraining. In other words, we integrated Mistral 7B weights into the upscaled layers and then continued pretraining the entire model.

SOLAR-10.7B has remarkable performance. It outperforms models with up to 30B parameters, even surpassing the recent Mixtral 8x7B model. For detailed information, please refer to the experimental table. SOLAR-10.7B is also an ideal choice for fine-tuning, offering robustness and adaptability. Our simple instruction fine-tuning using the SOLAR-10.7B pre-trained model yields significant performance improvements (SOLAR-10.7B-Instruct-v1.0).

created date: 2023
url: https://arxiv.org/abs/2312.15166
model card: https://huggingface.co/upstage/SOLAR-10.7B-v1.0
modality: text
analysis:
size: 10.7B
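
For readers unfamiliar with depth up-scaling, the architectural step in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Upstage's implementation: the function name `depth_upscale`, the 32-layer base model, and the choice of dropping 8 layers per copy are assumptions for the example and are not given in this issue. Layers are represented by plain integer indices rather than real weights, and the continued pretraining step is not shown.

```python
def depth_upscale(layers, m):
    """Sketch of the layer-stacking step of depth up-scaling (DUS).

    Two copies of the base model's layer list are made. The last `m`
    layers are dropped from the first copy and the first `m` layers are
    dropped from the second copy, then the two pieces are concatenated.
    `layers` and `m` are hypothetical inputs chosen for illustration.
    """
    n = len(layers)
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < number of layers")
    first_copy = layers[: n - m]   # base layers 0 .. n-m-1
    second_copy = layers[m:]       # base layers m .. n-1
    return first_copy + second_copy


# Assumed example: a 32-layer base model with 8 layers dropped per copy.
base_layers = list(range(32))
upscaled_layers = depth_upscale(base_layers, 8)
print(len(upscaled_layers))  # 2 * (32 - 8) = 48 layers
```

The upscaled stack reuses the base model's weights in every layer, which is why continued pretraining of the whole model follows as a separate step.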

@rishibommasani
Contributor

Thanks, this looks great - could you add a PR @russellkim?

@russellkim
Author

@rishibommasani Thanks, please review it: #167

@rishibommasani rishibommasani linked a pull request Jan 17, 2025 that will close this issue