utkarsh-284/LLM-Learning-Journey

My AI/LLM Learning Journey for Finance & Consultancy

The Transformer Architecture

The Transformer Architecture: This diagram, from the original "Attention Is All You Need" paper by Vaswani et al. (2017), illustrates the key components of the Transformer model, including the Encoder and Decoder stacks, Multi-Head Attention layers, and Positional Encoding.

This repository documents my learning journey into the world of AI and Large Language Models (LLMs), with a specific focus on applications in Finance and Consultancy. The structure of this journey is based on the "AI/LLM Learning Plan for Finance & Consultancy Roles" document.

Here, I will share my notes, projects, and implementations as I progress through the learning plan.

Phase 1: Deep LLM Foundations (Weeks 1-4)

Week 1: Transformer Architecture Mastery

Learning Objectives:

  • Master transformer architecture from first principles
  • Understand attention mechanisms mathematically
  • Grasp positional encodings, layer normalization, and residual connections

Projects & Learnings:

  • Transformers from Scratch: This project implements the Transformer architecture from the ground up in TensorFlow, following the seminal paper "Attention Is All You Need", and provides a detailed, step-by-step guide to the model's core components. The notebook serves as a practice guide for the concepts covered in the Sequence Models course from the DeepLearning.AI Natural Language Processing Specialization on Coursera.
    • Jupyter Notebook: Transformers-from-scratch.ipynb
    • Requirements: The requirements.txt file in the Phase 1 folder contains the necessary packages for this notebook.
    • Key Concepts Covered:
      • Positional Encodings
      • Masking (Padding and Look-Ahead)
      • Self-Attention (Scaled Dot Product Attention)
      • Encoder (Encoder Layer and Full Encoder)
      • Decoder (Decoder Layer and Full Decoder)
      • Transformer Assembly
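
    As a quick reference for the attention and positional-encoding concepts above, here is a minimal NumPy sketch (illustrative only; the notebook's actual implementation uses TensorFlow, and the function names here are my own):

    ```python
    import numpy as np

    def scaled_dot_product_attention(q, k, v, mask=None):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        d_k = q.shape[-1]
        scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
        if mask is not None:
            # Padding or look-ahead mask: blocked positions get a large
            # negative score, so their softmax weight is ~0.
            scores = np.where(mask == 0, -1e9, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v, weights

    def positional_encoding(seq_len, d_model):
        # PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(pos / 10000^(2i/d))
        pos = np.arange(seq_len)[:, None]
        i = np.arange(d_model)[None, :]
        angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
        return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))
    ```

    Each row of the attention weights sums to 1, and the sinusoidal encoding lets the model infer relative positions from fixed-frequency patterns.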

Week 2: Build a Mini-Transformer

Learning Objectives:

  • Implement an Encoder-only Transformer model for a classification task.
  • Apply the model to a real-world financial dataset.
  • Evaluate the model's performance and identify areas for improvement.

Projects & Learnings:

  • Mini-Transformer for Financial Sentiment Analysis: This project builds a smaller, Encoder-only Transformer, reusing the Encoder layer from the previously developed transformers_model.py, to perform sentiment analysis on financial news headlines. The model is trained on the financial_phrasebank dataset from Hugging Face.
    • Jupyter Notebook: Mini-Transformer.ipynb
    • Key Concepts Covered:
      • Using a pre-built Transformer Encoder.
      • Sentiment analysis as a classification task.
      • Data preprocessing for financial text.
      • Training and evaluating a Transformer-based model.
      • Analyzing model performance and suggesting improvements.
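
    The classification step on top of the Encoder can be sketched as follows (a minimal NumPy illustration under my own naming, not the notebook's actual code): pool the Encoder's per-token outputs into one vector, then apply a linear head with softmax over the sentiment classes.

    ```python
    import numpy as np

    def sentiment_head(encoder_out, w, b):
        """Map Encoder outputs to class probabilities (illustrative names).

        encoder_out: (batch, seq_len, d_model) activations from the Encoder.
        w, b: weights of a linear head mapping d_model -> n_classes.
        """
        pooled = encoder_out.mean(axis=1)            # mean-pool over tokens
        logits = pooled @ w + b                      # linear classification head
        exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
        return exp / exp.sum(axis=-1, keepdims=True) # softmax probabilities
    ```

    With three classes (positive / neutral / negative, as in financial_phrasebank), the head outputs a probability distribution per headline.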

Week 3: Fine-tuning Mastery

Learning Objectives:

  • Understand different pre-training objectives, such as masked language modeling (MLM) and causal language modeling (CLM)
  • Master fine-tuning strategies and when to use each
  • Learn about parameter-efficient fine-tuning (LoRA, Adapters)

Projects & Learnings:

  • Fine-tuning BERT for Financial Text Classification: This project explores two methods for fine-tuning a pre-trained BERT model for a financial sentiment analysis task: full fine-tuning and Parameter-Efficient Fine-Tuning (PEFT) using Low-Rank Adaptation (LoRA).
    • Jupyter Notebook: Fine_tune_BERT.ipynb
    • Performance Comparison Report: performance_comparison_report.md
    • Key Concepts Covered:
      • Full fine-tuning of a pre-trained BERT model.
      • Parameter-Efficient Fine-Tuning (PEFT) with LoRA.
      • Comparison of performance, training time, and trainable parameters between the two methods.
      • Cost analysis of the two fine-tuning approaches.
    • Results: The project demonstrates that LoRA can achieve performance comparable to full fine-tuning with significantly fewer trainable parameters, leading to faster training times and lower computational costs. The following image summarizes the comparison:
      • Financial BERT Comparison
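
    The parameter savings behind that result follow directly from the LoRA construction: a frozen weight W of shape (d, k) gets a trainable low-rank update B·A with rank r, so only r·(d + k) parameters train instead of d·k. A small NumPy sketch with BERT-base-like dimensions (illustrative numbers; the notebook's exact LoRA config may differ):

    ```python
    import numpy as np

    # One BERT-base attention weight matrix (768 x 768) with LoRA rank r = 8.
    d, k, r, alpha = 768, 768, 8, 16

    rng = np.random.default_rng(0)
    W = rng.standard_normal((d, k))          # frozen pre-trained weight
    A = rng.standard_normal((r, k)) * 0.01   # trainable, small random init
    B = np.zeros((d, r))                     # trainable, zero init: no change at start

    def lora_forward(x):
        # Base path plus scaled low-rank update: x W^T + (alpha/r) * x (B A)^T
        return x @ W.T + (alpha / r) * (x @ (B @ A).T)

    full_params = d * k        # 589,824 trainable if fine-tuning W itself
    lora_params = r * (d + k)  # 12,288 trainable with LoRA (~2% of full)
    ```

    Because B starts at zero, the adapted model initially behaves exactly like the frozen base model, and training only ever touches A and B.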

Week 4: Advanced Fine-tuning Project

Learning Objectives:

  • Apply fine-tuning to a complex financial NLP task
  • Master evaluation metrics for NLP models
  • Understand overfitting prevention in fine-tuning

Projects & Learnings:

  • (Add your notes and project links here)

Phase 2: Specialized NLP for Finance (Weeks 5-8)

Week 5-6: Financial NLP Tasks & Domain Adaptation

Learning Objectives:

  • Master financial text preprocessing and domain-specific challenges
  • Understand financial entity recognition and relationship extraction
  • Learn about financial document summarization and key information extraction

Projects & Learnings:

  • (Add your notes and project links here)

Week 7-8: Advanced NLP Applications

Learning Objectives:

  • Master question-answering systems for financial documents
  • Understand retrieval-augmented generation (RAG) systems
  • Learn about conversational AI for financial advisory
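
    The core RAG loop planned here (retrieve relevant passages, then stuff them into the prompt as context) can be previewed with a toy retriever. This is a bag-of-words sketch with made-up documents; a real system would use learned embeddings and a vector store:

    ```python
    import math
    from collections import Counter

    # Toy document store (hypothetical example sentences).
    documents = [
        "The company's quarterly revenue grew 12% driven by cloud services.",
        "Management raised full-year guidance citing strong consumer demand.",
        "The auditor flagged a material weakness in internal controls.",
    ]

    def embed(text):
        # Bag-of-words "embedding": word -> count.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def retrieve(query, k=1):
        # Rank documents by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
        return ranked[:k]

    def build_rag_prompt(question):
        # Retrieved context goes in front of the question for the LLM.
        context = "\n".join(retrieve(question, k=1))
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ```

    Swapping the bag-of-words similarity for dense embeddings and the list for a vector database gives the standard production RAG pipeline.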

Projects & Learnings:

  • (Add your notes and project links here)

Phase 3: Large Language Models & Production (Weeks 9-12)

Week 9-10: Modern LLMs & Prompt Engineering

Learning Objectives:

  • Understand GPT family evolution (GPT-1 to GPT-4+)
  • Master prompt engineering techniques and best practices
  • Learn about in-context learning and few-shot prompting
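
    Few-shot prompting amounts to prepending labeled examples so the model infers the task in-context. A minimal builder for the financial-sentiment setting (labels and wording are my own illustrations, not prompts from this repo):

    ```python
    # Hypothetical labeled examples for in-context learning.
    FEW_SHOT_EXAMPLES = [
        ("Company profits surged 20% this quarter.", "positive"),
        ("The firm reported revenue in line with expectations.", "neutral"),
        ("Shares tumbled after the earnings miss.", "negative"),
    ]

    def build_few_shot_prompt(headline):
        # Each shot shows the input format and the expected label; the final
        # line leaves the label blank for the model to complete.
        shots = "\n\n".join(
            f"Headline: {h}\nSentiment: {s}" for h, s in FEW_SHOT_EXAMPLES
        )
        return f"{shots}\n\nHeadline: {headline}\nSentiment:"
    ```

    Zero-shot prompting is the same template with the examples list left empty; adding even a handful of shots typically pins down the label vocabulary and output format.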

Projects & Learnings:

  • (Add your notes and project links here)

Week 11-12: Model Deployment & MLOps for LLMs

Learning Objectives:

  • Learn LLM deployment strategies and optimization
  • Understand model serving, caching, and scaling
  • Master monitoring and evaluation of LLM applications

Projects & Learnings:

  • (Add your notes and project links here)

Phase 4: Portfolio Development & Job Preparation (Weeks 13-16)

Week 13-14: Capstone Project & Portfolio Enhancement

Learning Objectives:

  • Integrate all learned concepts into a comprehensive project
  • Create professional documentation and presentation materials
  • Optimize existing projects for maximum impact

Projects & Learnings:

  • (Add your notes and project links here)

Week 15-16: Interview Preparation & Industry Connections

Learning Objectives:

  • Master technical interviews for AI/ML roles
  • Understand business case studies relevant to finance/consulting
  • Build industry connections and personal brand

Projects & Learnings:

  • (Add your notes and project links here)

About

This repo will include all my learning notes and projects for LLM and NLP
