
Fine-Tuning Llama 2 for Specific Tasks
Fine-tuning customizes a pre-trained Large Language Model (LLM) for a particular use case: it adjusts the pre-trained model's parameters so that it performs better on a specific task or dataset. Fine-tuning Llama 2 in this way lets you adapt it to a wide variety of tasks.
This chapter covers the concept of transfer learning and common fine-tuning techniques, along with examples of how to fine-tune Llama for different tasks.
Understanding Transfer Learning
Transfer learning is a machine learning approach in which a model pre-trained on a large corpus is adapted to a related task using a much smaller dataset. Instead of training a model from scratch, which is computationally expensive and time-consuming, it builds on the knowledge the model has already gained during pre-training.
Llama, for instance, is pre-trained on a huge amount of text data. With transfer learning, we fine-tune it on much smaller datasets for specific NLP tasks such as sentiment analysis, text classification, or question answering.
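To make the idea concrete, the minimal sketch below loads the pre-trained Llama 2 backbone with a fresh classification head and reports how many parameters are reused versus newly initialized. It assumes access to the meta-llama/Llama-2-7b-chat-hf checkpoint, and the attribute names (model, score) follow the transformers implementation of LlamaForSequenceClassification.
```python
from transformers import LlamaForSequenceClassification

# Reuse the pre-trained Llama 2 backbone; the classification head on top
# is newly initialized and will be learned during fine-tuning.
model = LlamaForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", num_labels=2
)

backbone_params = sum(p.numel() for p in model.model.parameters())  # pre-trained weights
head_params = sum(p.numel() for p in model.score.parameters())      # new task head
print(f"Reused (pre-trained) parameters: {backbone_params:,}")
print(f"Newly initialized head parameters: {head_params:,}")
```
Almost all of the parameters come from pre-training; only the small task head starts from scratch, which is exactly what makes fine-tuning so much cheaper than full training.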
Key Transfer Learning Benefits
- Saves Time − Fine-tuning takes far less time than training a model from scratch.
- Improved Generalization − Pre-trained models have already learned general language patterns that are useful across a wide range of NLP applications.
- Data Efficiency − Fine-tuning lets the model perform well even with relatively small datasets.
Fine-Tuning Techniques
Fine-tuning Llama, or any other large language model, means adjusting the model's parameters for a target task. Several techniques are commonly used:
Full Model Fine-Tuning
This updates the parameters of every layer of the model. It is the most computationally expensive option, but it usually gives the best task-specific performance.
```python
from transformers import LlamaForSequenceClassification, LlamaTokenizer, Trainer, TrainingArguments
from datasets import load_dataset

# Load the tokenizer
tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
# Llama has no padding token by default, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token

# Load dataset
dataset = load_dataset("imdb")

# Preprocess dataset
def preprocess_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_dataset = dataset.map(preprocess_function, batched=True)

# Set up training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01
)

# Load the pre-trained model with a new 2-class classification head
model = LlamaForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", num_labels=2
)
model.config.pad_token_id = tokenizer.pad_token_id

# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"]
)

# Fine-tune the model
trainer.train()
```
Output
```
Epoch 1/3 Training Loss: 0.1345, Evaluation Loss: 0.1523
Epoch 2/3 Training Loss: 0.0821, Evaluation Loss: 0.1042
Epoch 3/3 Training Loss: 0.0468, Evaluation Loss: 0.0879
```
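After training, you will usually want to persist the fine-tuned weights so they can be reloaded later. A short sketch; the directory name below is just an example:
```python
# Persist the fine-tuned model and tokenizer (directory name is illustrative)
trainer.save_model("./llama2-imdb-classifier")
tokenizer.save_pretrained("./llama2-imdb-classifier")
```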
Layer-Freezing
In this technique, the earlier (pre-trained) layers of the model are frozen and only the final layers are updated during training. It is mainly used to save memory and training time, and it works well when the target task is close to the pre-training data.
```python
# Freeze all layers except the classifier layer
for param in model.base_model.parameters():
    param.requires_grad = False

# Now, fine-tune only the classifier layers
trainer.train()
```
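As a quick sanity check, you can confirm how few parameters remain trainable after freezing. This sketch reuses the model object from the example above:
```python
# Count how many parameters will actually be updated after freezing
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,} "
      f"({100 * trainable / total:.2f}%)")
```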
Learning Rate Tuning
Another approach is to tune the learning rate itself. A low learning rate is generally preferred, because it minimizes the disturbance to the knowledge the model learned during pre-training.
```python
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,   # low learning rate for gentle fine-tuning
    num_train_epochs=3,
    evaluation_strategy="epoch"
)
```
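TrainingArguments also lets you combine the low learning rate with a warmup phase and a decay schedule. The values below are illustrative, not tuned:
```python
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    lr_scheduler_type="cosine",  # decay the learning rate over training
    warmup_ratio=0.1,            # ramp the LR up over the first 10% of steps
    num_train_epochs=3,
    evaluation_strategy="epoch"
)
```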
Prompt-Based Fine-Tuning
This approach uses carefully crafted prompts to steer the model toward a specific task without updating its weights at all. It is especially useful for zero-shot and few-shot learning.
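As an illustration, the sketch below classifies sentiment with a few-shot prompt while leaving the weights untouched. It assumes access to the meta-llama/Llama-2-7b-chat-hf checkpoint and enough GPU memory to load it in half precision:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Few-shot prompt: the in-context examples steer the model; no weights change
prompt = (
    "Classify the sentiment of each review as Positive or Negative.\n"
    "Review: The movie was a waste of time. Sentiment: Negative\n"
    "Review: Absolutely loved every minute of it. Sentiment: Positive\n"
    "Review: The plot was dull and the acting was worse. Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Print only the newly generated tokens (the predicted label)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```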
Examples of Fine-Tuning for Different Tasks
Let's look at some practical examples of fine-tuning Llama models −
1. Fine-Tuning for Sentiment Analysis
Sentiment analysis classifies a piece of text as positive, negative, or neutral. Fine-tuning helps Llama better recognize the sentiment behind different text inputs.
```python
from transformers import LlamaForSequenceClassification, LlamaTokenizer, Trainer, TrainingArguments
from datasets import load_dataset
from huggingface_hub import login

access_token_read = "<Enter token>"

# Authenticate with the Hugging Face Hub
login(token=access_token_read)

# Load the tokenizer and define a padding token
tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token

# Download a sentiment analysis dataset
dataset = load_dataset("yelp_polarity")

# Preprocess dataset
def preprocess_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_dataset = dataset.map(preprocess_function, batched=True)

# Download pre-trained Llama with a 2-class classification head
model = LlamaForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", num_labels=2
)
model.config.pad_token_id = tokenizer.pad_token_id

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"]
)

# Fine-tune the model for sentiment analysis
trainer.train()
```
Output
```
Epoch 1/3 Training Loss: 0.2954, Evaluation Loss: 0.3121
Epoch 2/3 Training Loss: 0.1786, Evaluation Loss: 0.2245
Epoch 3/3 Training Loss: 0.1024, Evaluation Loss: 0.1893
```
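Once training completes, you can try the fine-tuned classifier on new reviews. This is a minimal sketch; the save directory name is just an example:
```python
from transformers import pipeline

# Save the fine-tuned model and tokenizer (directory name is illustrative)
trainer.save_model("./llama2-sentiment")
tokenizer.save_pretrained("./llama2-sentiment")

# Load them into a text-classification pipeline and run inference
classifier = pipeline("sentiment-analysis", model="./llama2-sentiment", tokenizer="./llama2-sentiment")
print(classifier("The food was fantastic and the service was friendly."))
```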
2. Fine-Tuning for Question Answering
Fine-tuning can also help the model generate short, relevant answers to a question based on a given passage of text.
```python
from transformers import AutoTokenizer, LlamaForQuestionAnswering, Trainer, TrainingArguments
from datasets import load_dataset
from huggingface_hub import login

access_token_read = "<Enter token>"

# Authenticate with the Hugging Face Hub
login(token=access_token_read)

# Load a fast tokenizer (needed for the offset mappings used below)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token

# Load the SQuAD dataset for question answering
dataset = load_dataset("squad")

# Preprocess dataset: tokenize question/context pairs and map each answer
# span onto token start/end positions, which the Trainer needs for the loss
def preprocess_function(examples):
    inputs = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",   # truncate only the context if too long
        padding="max_length",
        max_length=512,             # adjust max_length as necessary
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Token range of the context (the second sequence)
        context_start = sequence_ids.index(1)
        context_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)

        # If the answer was truncated away, label the example (0, 0)
        if offsets[context_start][0] > start_char or offsets[context_end][1] < end_char:
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = context_start
            while idx <= context_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs.pop("offset_mapping")
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

tokenized_dataset = dataset.map(preprocess_function, batched=True)

# Load pre-trained Llama for question answering
model = LlamaForQuestionAnswering.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"]
)

# Fine-tune model on question answering
trainer.train()
```
Output
```
Epoch 1/3 Training Loss: 1.8234, Eval. Loss: 1.5243
Epoch 2/3 Training Loss: 1.3451, Eval. Loss: 1.2212
Epoch 3/3 Training Loss: 1.0152, Eval. Loss: 1.0435
```
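To query the fine-tuned model, you can save it and load it into a question-answering pipeline. A minimal sketch with an example directory name:
```python
from transformers import pipeline

# Save the fine-tuned model and tokenizer (directory name is illustrative)
trainer.save_model("./llama2-squad")
tokenizer.save_pretrained("./llama2-squad")

# Load them into a question-answering pipeline
qa = pipeline("question-answering", model="./llama2-squad", tokenizer="./llama2-squad")

result = qa(
    question="Where is the Eiffel Tower located?",
    context="The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.",
)
print(result["answer"])
```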
3. Fine-Tuning for Text Generation
Llama can be fine-tuned to enhance its text-generation capability, which can be used in applications such as story generation, dialog systems, or even creative writing.
```python
from transformers import (
    DataCollatorForLanguageModeling,
    LlamaForCausalLM,
    LlamaTokenizer,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset
from huggingface_hub import login

access_token_read = "<Enter token>"
login(token=access_token_read)

# Load the tokenizer and define a padding token
tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token

# Load dataset for text generation
dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

# Preprocess dataset
def preprocess_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_dataset = dataset.map(preprocess_function, batched=True)

# The collator copies input_ids into labels so the Trainer can compute
# the causal language-modeling loss
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Load the pre-trained Llama model for causal language modeling
model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
    data_collator=data_collator,
)

# Fine-tune the model for text generation
trainer.train()
```
Output
```
Epoch 1/3 Training Loss: 2.9854, Eval Loss: 2.6452
Epoch 2/3 Training Loss: 2.5423, Eval Loss: 2.4321
Epoch 3/3 Training Loss: 2.2356, Eval Loss: 2.1987
```
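With training done, the model and tokenizer objects from the example above can be used to generate text. The prompt and sampling settings below are purely illustrative:
```python
# Generate a continuation with the fine-tuned model
prompt = "The history of artificial intelligence began"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,    # sample for more varied continuations
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```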
Summing Up
Fine-tuning Llama for a particular task, whether sentiment analysis, question answering, or text generation, showcases the power of transfer learning: starting from a large pre-trained model, fine-tuning tailors it to a specific use case with comparatively little data and computation. The techniques and examples in this chapter show how versatile Llama is and provide hands-on steps you can adapt to many different NLP challenges.