Then pretraining and fine-tuning the Large Language Model for classification and instruction following.
This repository contains the code and documentation for a Transformer-based Large Language Model (LLM) that I built from scratch, pretrained, and then fine-tuned. The repository is organized as follows:
├── dataprocessing.ipynb # Processing the data for the LLM
├── transformer.ipynb # Transformer architecture implementation
├── LLMcore.py # Core classes and functions for the LLM
├── gpt_download.py # Download the GPT-2 pretrained model parameters
├── pretraining.ipynb # Pretraining a Transformer from scratch
├── weightloading.ipynb # Load the weights from the pretrained model
├── finetuningclassification.ipynb # Fine-tuning on classification tasks
├── finetuninginstruction.ipynb # Fine-tuning for instruction following
└── README.md # Documentation

Ensure you have the required dependencies installed before running the notebooks:

pip install torch transformers datasets
🔹 Fine-Tuning the Classification Model

The classification fine-tuning was performed on GPT-2 small (124M parameters), with the pretrained weights loaded before fine-tuning. Run the finetuningclassification.ipynb notebook to train a Transformer-based classifier.
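As a rough illustration of the approach, the sketch below wraps a pretrained backbone with a small classification head and classifies from the final token's representation; the wrapper class and the assumed backbone interface are illustrative, not the actual code in the notebook or in LLMcore.py.

```python
import torch
import torch.nn as nn

class GPTForClassification(nn.Module):
    """Adds a linear classification head on top of a frozen, pretrained backbone.

    The backbone is assumed to map token IDs (batch, seq_len) to hidden states
    (batch, seq_len, emb_dim); the real interface in LLMcore.py may differ.
    """
    def __init__(self, backbone, emb_dim=768, num_classes=2):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # keep the pretrained weights frozen
            p.requires_grad = False
        self.head = nn.Linear(emb_dim, num_classes)

    def forward(self, input_ids):
        hidden = self.backbone(input_ids)         # (batch, seq_len, emb_dim)
        return self.head(hidden[:, -1, :])        # classify from the last token's state
```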
🔹 Fine-Tuning for Instruction Following

The instruction fine-tuning was performed on GPT-2 medium (355M parameters), with the pretrained weights loaded before fine-tuning. Use the finetuninginstruction.ipynb notebook to fine-tune the LLM for instruction-following tasks.
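For intuition, the sketch below shows one common way to turn an instruction example into a training prompt (an Alpaca-style template; the exact template and data schema used in finetuninginstruction.ipynb may differ).

```python
def format_instruction_example(entry):
    """Builds an Alpaca-style prompt from a dict with 'instruction',
    'input', and 'output' keys (an assumed schema, for illustration)."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{entry['instruction']}"
    )
    if entry.get("input"):
        prompt += f"\n\n### Input:\n{entry['input']}"
    prompt += "\n\n### Response:\n"
    return prompt, entry["output"]      # the model is trained to generate the output


example = {"instruction": "Rewrite the sentence in passive voice.",
           "input": "The chef cooked the meal.",
           "output": "The meal was cooked by the chef."}
prompt, target = format_instruction_example(example)
```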
To pretrain a Transformer model from scratch, execute the pretraining.ipynb notebook.
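The pretraining objective is next-token prediction with a cross-entropy loss. A minimal sketch of a single training step is shown below; the function and argument names are placeholders rather than the notebook's actual code.

```python
import torch
import torch.nn.functional as F

def pretraining_step(model, optimizer, input_ids):
    """One next-token-prediction step on a batch of token IDs (batch, seq_len).

    The targets are simply the inputs shifted left by one position.
    """
    inputs, targets = input_ids[:, :-1], input_ids[:, 1:]
    logits = model(inputs)                                   # (batch, seq_len-1, vocab_size)
    loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```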
The transformer.ipynb notebook provides an in-depth implementation of Transformer blocks, including multi-head self-attention, feed-forward layers, layer normalization, and residual (shortcut) connections.
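As a rough sketch of what such a block can look like (a pre-LayerNorm, GPT-style block built from standard PyTorch modules; the notebook's actual classes and hyperparameters may differ):

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """GPT-style block: pre-LayerNorm self-attention and feed-forward sublayers,
    each wrapped in a residual (shortcut) connection."""
    def __init__(self, emb_dim=768, n_heads=12, drop_rate=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_dim)
        self.attn = nn.MultiheadAttention(emb_dim, n_heads,
                                          dropout=drop_rate, batch_first=True)
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(nn.Linear(emb_dim, 4 * emb_dim),
                                nn.GELU(),
                                nn.Linear(4 * emb_dim, emb_dim))
        self.drop = nn.Dropout(drop_rate)

    def forward(self, x, attn_mask=None):
        normed = self.norm1(x)
        attn_out, _ = self.attn(normed, normed, normed,
                                attn_mask=attn_mask, need_weights=False)
        x = x + self.drop(attn_out)                    # residual around attention
        x = x + self.drop(self.ff(self.norm2(x)))      # residual around feed-forward
        return x
```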
✅ Accuracy Scores for the classification fine-tuned LLM:

Training Accuracy: 94.81%
Validation Accuracy: 96.53%
Test Accuracy: 92.57%

✅ Accuracy Score for the instruction fine-tuned LLM:

Accuracy Score: 45.84, as adjudicated by the gpt-3.5-turbo model.

There is room for improvement by tuning hyperparameters such as the learning rate, batch size, cosine-decay schedule, LoRA configuration, and model size.
📊 Pretraining Loss Curve:
🔥 Temperature Scaling in Pretraining:
📊 Loss Curves for Classification Fine-Tuning:
📊 Classification Fine-Tuning Performance:
📊 Instruction Fine-Tuning Results:
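The temperature-scaling plot above relates to how sharply the model's next-token distribution is peaked during text generation. A minimal sketch of the idea (the function below is illustrative, not taken from the repository):

```python
import torch

def sample_next_token(logits, temperature=1.0):
    """Samples the next token ID from a 1-D logits vector.

    temperature < 1.0 sharpens the distribution (closer to greedy decoding),
    temperature > 1.0 flattens it (more diverse output).
    """
    if temperature <= 0:                      # treat 0 as greedy decoding
        return torch.argmax(logits).item()
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```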
The learning rate is adjusted using cosine decay for stable convergence.
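A minimal sketch of a cosine-decay learning-rate schedule using PyTorch's built-in scheduler (the model and hyperparameters below are stand-ins, not the notebook's actual settings):

```python
import torch

model = torch.nn.Linear(10, 2)                     # stand-in for the GPT model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)

num_training_steps = 1000
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_training_steps, eta_min=1e-5)

for step in range(num_training_steps):
    # ... forward pass, loss.backward(), optimizer.step() go here ...
    scheduler.step()                               # LR follows a cosine curve toward eta_min
```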
To prevent instability, gradients are clipped during backpropagation.
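In PyTorch, gradient clipping is typically a single call between the backward pass and the optimizer step, for example:

```python
import torch

model = torch.nn.Linear(10, 2)                     # stand-in for the GPT model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

x, y = torch.randn(8, 10), torch.randn(8, 2)
loss = torch.nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip the gradient norm
optimizer.step()
```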
LoRA (Low-Rank Adaptation) is implemented to enable efficient fine-tuning at minimal computational cost.
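At its core, LoRA freezes a pretrained weight matrix and learns a low-rank update on top of it. A minimal sketch of a LoRA-wrapped linear layer (the class name, rank, and scaling are illustrative, not the repository's actual implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update x @ A @ B."""
    def __init__(self, linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():          # freeze the pretrained weights
            p.requires_grad = False
        in_dim, out_dim = linear.in_features, linear.out_features
        self.A = nn.Parameter(torch.randn(in_dim, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.scaling = alpha / rank

    def forward(self, x):
        # Original output plus the scaled low-rank correction
        return self.linear(x) + self.scaling * (x @ self.A @ self.B)


# Usage: replace a linear layer with its LoRA-wrapped version
layer = nn.Linear(768, 768)
lora_layer = LoRALinear(layer, rank=8)
out = lora_layer(torch.randn(2, 10, 768))           # (batch, seq_len, emb_dim)
```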
Contributions are welcome! Please feel free to submit issues or pull requests.
This project is licensed under the MIT License.
Β© 2025 Syed Faizan. All Rights Reserved.