
Environmental impact of running BERT


Analysis of data

Notebooks

  1. power_monitor_analysis/ExtractReading - gets readings from the power monitor for a given time interval. The power monitor writes a reading to a database every 3 seconds.

  2. Fine-tuningAnalysis - extracts data from nvidia-smi and the power monitor and combines them for analysis. The time-based models are compared with the empirical values from the power monitor, and the carbon footprint is calculated (a rough sketch of the energy/carbon calculation follows this list). Also plots how energy and time scale with dataset size.

  3. RunInference - runs inference with the MRPC, CoLA and STS-B models fine-tuned earlier.

  4. InferenceAnalysis - extracts nvidia-smi and power monitor data for inference and combines them for analysis. The time-based models are compared with the empirical values from the power monitor. The overall carbon footprint is calculated and combined with the pre-training and fine-tuning footprints.

  5. CompareTimeModels - compares the time-based models.
    Merges all data from pre-training, fine-tuning and inference to test how the models scale with time compared with the analytical models.

  6. nvidia-smi data exploration - extracts nvidia-smi data for the fine-tuning tasks and does the initial exploration.

  7. Time series data stationary test - data exploration and stationarity testing with the Augmented Dickey-Fuller (ADF) test.
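
The core of the energy and carbon-footprint calculation in these notebooks amounts to integrating power samples over time. Below is a minimal sketch, assuming a pandas DataFrame of power-monitor readings taken every 3 seconds; the column name, function name and grid carbon-intensity figure are placeholders, not values taken from the notebooks.

```python
# Minimal sketch: turn power-monitor samples (one every 3 seconds) into
# energy (kWh) and an approximate carbon footprint (kg CO2e).
import pandas as pd

SAMPLE_INTERVAL_S = 3                 # the power monitor logs a reading every 3 s
CARBON_INTENSITY_KG_PER_KWH = 0.233   # placeholder grid carbon intensity

def energy_and_carbon(readings: pd.DataFrame, power_col: str = "power_w"):
    """Integrate power samples (watts) into kWh and convert to kg CO2e."""
    energy_joules = readings[power_col].sum() * SAMPLE_INTERVAL_S  # rectangle rule
    energy_kwh = energy_joules / 3.6e6                             # 1 kWh = 3.6e6 J
    carbon_kg = energy_kwh * CARBON_INTENSITY_KG_PER_KWH
    return energy_kwh, carbon_kg

# Example: energy_and_carbon(pd.read_csv("power_monitor.csv"))
```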

Data collection during training and inference

Requirements:

  1. Python 3.6+
  2. TensorFlow 2.2.0
  3. PyTorch 1.5.0
  4. CUDA 10.2

Virtual environment

  1. Download Miniconda and set the paths

  2. conda update conda
    conda create -n venv python=3.7
    conda install -n venv jupyter scipy numpy matplotlib tensorflow-gpu tensorflow-hub seaborn

  3. conda activate venv
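
As a quick sanity check (not part of the original setup steps), you can confirm that the TensorFlow build in the new environment sees the GPU:

```python
# Optional check that the conda environment's TensorFlow can see the GPU.
import tensorflow as tf

print("TensorFlow:", tf.__version__)
print("GPUs visible:", tf.config.list_physical_devices("GPU"))
```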

Pre-train

Version issues :(

Converted the TF1 code to TF2 with tf_upgrade_v2.
tensorflow/tensorflow#26854

Steps:

  1. Download the model: sh download_uncased_base.sh
  2. Get wiki data from https://github.com/pytorch/examples/tree/master/word_language_model/data
  3. Preprocess the data: sh pretrain_data.sh
  4. Run training: sh pre_train.sh
    OR
    train and record power data (a rough sketch of the recording loop follows these steps):
    sh pretrain_and_record_power.sh
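
pretrain_and_record_power.sh is not reproduced here; the sketch below shows, in Python, roughly what such a wrapper can do: start nvidia-smi logging in the background, run the training script, then stop the logger. The log file name and the 1-second polling interval are assumptions, not taken from the script.

```python
# Rough sketch of "train and record power": poll nvidia-smi in the background
# while the training script runs. Paths and the polling interval are illustrative.
import subprocess

LOG_FILE = "gpu_power_log.csv"   # illustrative output path

with open(LOG_FILE, "w") as log:
    monitor = subprocess.Popen(
        ["nvidia-smi",
         "--query-gpu=timestamp,power.draw,utilization.gpu,memory.used",
         "--format=csv", "-l", "1"],     # one CSV row per second
        stdout=log,
    )
    try:
        subprocess.run(["sh", "pre_train.sh"], check=True)  # run training
    finally:
        monitor.terminate()              # stop logging when training ends
```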

Pre-train with more data

google-research/bert#341
https://github.com/dsindex/bert

  1. Download a wiki dump

  2. Extract it using https://github.com/attardi/wikiextractor
    python ../wikiextractor/WikiExtractor.py /media/data/wikidownload.xml.bz2 --output /media/data/wikidump --processes 1 -q

  3. Clean it using
    bash create_pretraining_data.sh

    You may need to install nltk and download the punkt tokenizer (see the sentence-splitting sketch after these steps):
    pip install nltk
    import nltk
    nltk.download('punkt')

  4. Run pre-training: sh pretrain_large.sh
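
The punkt download above suggests nltk's sentence tokenizer is used when preparing the corpus; BERT's create_pretraining_data.py expects one sentence per line with a blank line between documents. A minimal sketch of that formatting step, with illustrative file paths and a crude document-splitting heuristic:

```python
# Split extracted wiki text into one sentence per line, with a blank line
# between documents, as expected by BERT's create_pretraining_data.py.
import nltk

nltk.download("punkt")  # sentence tokenizer model

def to_pretraining_format(in_path: str, out_path: str) -> None:
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for doc in fin.read().split("\n\n"):           # crude document split
            text = doc.strip()
            if not text:
                continue
            for sentence in nltk.sent_tokenize(text):  # one sentence per line
                fout.write(sentence + "\n")
            fout.write("\n")                           # blank line between documents

# to_pretraining_format("/media/data/wikidump/AA/wiki_00", "pretrain_corpus.txt")
```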

Data collection for fine-tune training

Runs fine-tune training and records power draw and utilisation with nvidia-smi.

Usage: sh train_and_record_power.sh <task> <batch size> <max sequence length> <model (cased/uncased)>

Example: sh train_and_record_power.sh CoLA 32 128 bert-base-cased

Example: fine-tuning on MRPC

  1. Get the model - download it from https://github.com/google-research/bert

  2. Get the data using download_glue_data.py

    python download_glue_data.py --data_dir data --tasks MRPC

  3. Prepare the fine-tuning data using sudo sh fine_tune.sh
    (edit the fields as needed)

  4. Run: python bert_finetune.py

https://github.com/tensorflow/models/tree/master/official/nlp/bert
https://github.com/tensorflow/models/tree/master/official/nlp/bert#process-datasets

Hugging Face Transformers example

For the PyTorch implementation:

  1. pip install statsmodels

  2. git clone https://github.com/huggingface/transformers

    cd transformers

    pip install .

  3. pip install -r ./examples/requirements.txt

    (to update later:
    git pull
    pip install --upgrade .)

  4. Download the data as in the TensorFlow example. There is no need to download the model separately.

  5. cd ..

  6. sh fine_tune_example.sh MRPC 32

    The task argument can be CoLA, SST-2, MRPC, STS-B, QQP, MNLI, QNLI, RTE or WNLI.
    The second argument, the batch size, can be 16, 32, 64, etc.

  7. Record GPU utilisation details (a sketch of summarising the resulting log follows this list):
    sh nvidiasmi.sh
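
A small sketch of how such a CSV log from an nvidia-smi query can be summarised into mean utilisation and GPU energy. The column names, units and 1-second sampling interval are assumptions about how the log was produced, not a description of nvidiasmi.sh:

```python
# Summarise an nvidia-smi CSV log into mean GPU utilisation and energy (kWh).
# Assumes columns like "power.draw [W]" and "utilization.gpu [%]" sampled every second.
import pandas as pd

def summarise_gpu_log(path: str, interval_s: float = 1.0) -> dict:
    df = pd.read_csv(path, skipinitialspace=True)
    power_w = df["power.draw [W]"].str.rstrip(" W").astype(float)
    util = df["utilization.gpu [%]"].str.rstrip(" %").astype(float)
    energy_kwh = power_w.sum() * interval_s / 3.6e6   # W * s -> J -> kWh
    return {"mean_utilisation_pct": util.mean(), "gpu_energy_kwh": energy_kwh}

# summarise_gpu_log("gpu_power_log.csv")
```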

Language modelling

Download wikitext2 from https://blog.einstein.ai/the-wikitext-long-term-dependency-language-modeling-dataset/

sh mlm_fine_tune_bert.sh

More details are in /modelsFT.
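
mlm_fine_tune_bert.sh wraps the Transformers language-modelling example; the sketch below illustrates the same idea directly with the library's classes. The file path, model name and hyperparameters are placeholders, and this is not the script itself:

```python
# Minimal masked-language-model fine-tuning sketch on WikiText-2 with Transformers.
from transformers import (BertForMaskedLM, BertTokenizer,
                          DataCollatorForLanguageModeling,
                          LineByLineTextDataset, Trainer, TrainingArguments)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                file_path="wikitext-2/wiki.train.tokens",  # placeholder path
                                block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm=True, mlm_probability=0.15)

trainer = Trainer(model=model,
                  args=TrainingArguments(output_dir="mlm_output",
                                         num_train_epochs=1,
                                         per_device_train_batch_size=16),
                  data_collator=collator,
                  train_dataset=dataset)
trainer.train()
```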
