During the last three years, interest in Large Language Models (LLMs) has risen meteorically, leaving virtually no domain untouched. The complexity of the models themselves, however, has increased to such an extent that access to powerful computing resources has become a requirement for anyone wanting to develop products with this novel approach.
In this intensive course, held over two half-days, participants will dive into the world of LLMs and their development on supercomputers. From the fundamentals to hands-on implementations, the course offers a comprehensive exploration of LLMs, including cutting-edge techniques and tools such as parameter-efficient fine-tuning (PEFT), quantization, zero redundancy optimizers (ZeRO), fully sharded data parallelism (FSDP), DeepSpeed, and Huggingface Accelerate.
By the end of this course, participants will have gained the understanding, knowledge and practical skills to develop LLMs effectively on supercomputers, empowering them to tackle challenging natural language processing tasks across various domains.
This course is jointly organized by the VSC Research Center, TU Wien, and EuroCC Austria.
The overall goals of this course are the following:
- Introduction to LLMs (Overview, Huggingface Ecosystem, Transformer Anatomy, Tokenization & Embeddings);
- Memory-efficient Training (Quantization, PEFT, unsloth, Hands-on example; see the first sketch after this list);
- Distributed Training (Huggingface Accelerate, ZeRO, FSDP & DeepSpeed; see the second sketch after this list);
- Evaluation (Methods & Metrics, Monitoring, Inference).
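To give a flavour of the memory-efficient training tools listed above, here is a minimal sketch (not taken from the course material) that loads a causal language model in 4-bit precision via bitsandbytes and attaches LoRA adapters with the Huggingface peft library. The model name and all hyperparameters are placeholder assumptions; the target modules must match the chosen model's architecture.

```python
# Minimal quantization + PEFT sketch (illustrative only, not the course's hands-on code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder; any causal LM on the Hub works

# Load the base model with its weights quantized to 4 bit (NF4), computing in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Attach small trainable LoRA adapters; the frozen 4-bit base model stays untouched.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections for this model family
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```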
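The distributed-training stack built around Huggingface Accelerate can likewise be illustrated with a small, self-contained training loop; this is again an assumed sketch with dummy data and a toy model, not the course's hands-on material. The same script can be run under DDP, FSDP, or DeepSpeed/ZeRO by choosing the backend with `accelerate config` and starting it with `accelerate launch`, without changing the code.

```python
# Minimal Accelerate training loop (illustrative sketch with dummy data and a toy model).
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Dummy dataset and model stand in for a real tokenized corpus and an LLM.
dataset = TensorDataset(torch.randn(256, 128), torch.randint(0, 2, (256,)))
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
model = torch.nn.Linear(128, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Accelerate wraps model, optimizer, and dataloader for the configured backend
# (single GPU, DDP, FSDP, DeepSpeed, ...).
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for epoch in range(3):
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), labels)
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
```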
More detailed information and links for the course can be found on the course website.
License: CC BY-SA 4.0 (Attribution-ShareAlike), see https://creativecommons.org/licenses/by-sa/4.0/legalcode