This tool scrapes LinkedIn posts and generates new, engaging content using the Llama 3 LLM served through the Groq API. It learns from successful posts via few-shot learning and generates posts that match their style. Using natural language processing (NLP), it understands text and context, keeping the generated content relevant and personalized across multiple languages.

This project consists of two main components:
- LinkedIn Scraper: Collects posts from LinkedIn profiles
- Post Generator: Analyzes existing posts and generates new ones using an LLM (Llama 3 via Groq)
The system uses few-shot learning techniques to understand what makes LinkedIn posts engaging, and generates new content following similar patterns.
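The few-shot idea can be sketched as follows: scraped posts are embedded directly in the prompt as examples, so the LLM imitates their tone and structure. This is an illustrative sketch, not the project's actual prompt code; the example data and wording are made up.

```python
# Minimal few-shot prompt construction: real scraped posts become
# in-prompt examples the model should imitate. Example data is illustrative.
examples = [
    {"text": "Excited to share my first open-source contribution!", "tags": "Personal"},
    {"text": "5 lessons from debugging a production outage...", "tags": "Technology"},
]

def build_prompt(topic: str, examples: list) -> str:
    prompt = f"Write a LinkedIn post about: {topic}\n\n"
    prompt += "Match the tone and style of these examples:\n"
    for i, ex in enumerate(examples, 1):
        prompt += f"\nExample {i} ({ex['tags']}):\n{ex['text']}\n"
    return prompt
```

The resulting string is what gets sent to the LLM; the more (and better) examples are included, the closer the output matches the target style.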
LinkedIn Post Scraping:
- Automated login and data collection
- Extraction of post text, engagement metrics (likes, comments, shares)
- Profile activity analysis
- Data export to JSON
Post Analysis:
- Automatic language detection (English, Indonesian, Mixed)
- Content tagging and categorization
- Length analysis (Short, Medium, Long)
- Engagement metrics processing
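The length analysis above can be sketched as a simple line-count bucketing; the thresholds mirror the Short (1-5), Medium (6-10), and Long (11-15) ranges used later in this README. This is a hedged sketch of the idea, not necessarily the exact logic in preprocess.py.

```python
# Bucket a post by line count, matching the Short/Medium/Long
# ranges described in this README (illustrative implementation).
def categorize_length(text: str) -> str:
    lines = len(text.strip().split("\n"))
    if lines <= 5:
        return "Short"
    elif lines <= 10:
        return "Medium"
    return "Long"
```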
AI-Powered Post Generation:
- Topic-based content creation
- Multi-language support (English, Indonesian, Mixed)
- Configurable length and style
- Example-based learning from successful posts
Project Structure:
linkedin-post-generator/
├── data/
│ ├── processed_posts.json # Enriched post data with metadata
│ └── raw_posts.json # Raw scraped posts
├── linkedin-scraper/
│ ├── credentials.py # LinkedIn login credentials (not included in repo)
│ ├── linkedin-scraper.py # Scraping logic
│ └── [output files] # Generated during scraping
├── .env # Environment variables for API keys
├── few_shot.py # Few-shot learning implementation
├── llm_helper.py # LLM integration utilities
├── main.py # Application entry point
├── post_generator.py # Post generation logic
├── preprocess.py # Data preprocessing utilities
└── requirements.txt # Python dependencies
Clone the repository:
git clone https://github.com/caernations/linkedin-post-generator
cd linkedin-post-generator
Install dependencies:
pip install -r requirements.txt
Create a .env file with your API key:
GROQ_API_KEY=your_groq_api_key_here
For LinkedIn scraping, create a credentials.py file in the linkedin-scraper directory:
username = "your_linkedin_email"
password = "your_linkedin_password"
Update the profile URL in linkedin-scraper.py:
profile_url = 'https://www.linkedin.com/in/target_profile/'
Run the scraper:
python linkedin-scraper/linkedin-scraper.py
The scraper will:
- Login to LinkedIn using your credentials
- Navigate to the specified profile
- Scroll through their activity to load posts
- Extract and save post data to JSON files
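The final step, normalizing engagement counts and saving them to JSON, can be sketched as below. LinkedIn displays counts like "1,234" or "2.1K", so they need to be parsed into integers before export. The field names and parse_count helper are illustrative; the real scraper's schema may differ.

```python
# Sketch of the save step: normalize LinkedIn-style engagement counts
# and write the collected posts to a JSON file. Names are illustrative.
import json

def parse_count(raw: str) -> int:
    """Turn a LinkedIn-style count like '1,234' or '2.1K' into an int."""
    raw = raw.strip().upper().replace(",", "")
    if raw.endswith("K"):
        return int(float(raw[:-1]) * 1_000)
    return int(raw or 0)

posts = [
    {"text": "Hello LinkedIn!", "likes": parse_count("2.1K"), "comments": parse_count("45")},
]
with open("raw_posts.json", "w", encoding="utf-8") as f:
    json.dump(posts, f, ensure_ascii=False, indent=2)
```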
To enrich raw posts with metadata:
python preprocess.py
This will:
- Extract language, line count and tags from each post
- Unify and categorize tags
- Save the processed data to data/processed_posts.json
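The enrichment step can be sketched as below: each raw post gets a line count and a naive language guess. The hint-word heuristic and function names here are illustrative assumptions, not necessarily what preprocess.py actually does.

```python
# Hedged sketch of per-post enrichment: line count plus a naive
# language guess based on common Indonesian function words.
INDONESIAN_HINTS = {"yang", "dan", "untuk", "dengan", "saya"}

def detect_language(text: str) -> str:
    words = set(text.lower().split())
    hits = len(words & INDONESIAN_HINTS)
    if hits == 0:
        return "English"
    # Mostly non-hint words alongside some Indonesian -> treat as Mixed
    return "Mixed" if len(words) > 3 * hits else "Indonesian"

def enrich(post: dict) -> dict:
    post["line_count"] = post["text"].strip().count("\n") + 1
    post["language"] = detect_language(post["text"])
    return post
```

A production version would likely use a proper language-detection library, but this shows the shape of the metadata being added.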
To generate a LinkedIn post:
from post_generator import generate_post
# Parameters:
# - Length: "Short" (1-5 lines), "Medium" (6-10 lines), "Long" (11-15 lines)
# - Language: "English", "Indonesia", "Mixed"
# - Tag: Topic category like "Personal", "Technology", etc.
post = generate_post("Medium", "Mixed", "Software Engineering")
print(post)

Start the main application:
streamlit run main.py
To run this project, ensure you have Python 3.8+ installed and the following Python packages:
- selenium – for web automation and scraping
- beautifulsoup4 – for parsing HTML content
- pandas – for data manipulation and analysis
- python-dateutil – for flexible date parsing
- langchain==0.2.14 – for building with LLMs
- langchain-core==0.2.39 – core components for LangChain
- langchain-community==0.2.12 – community-contributed LangChain modules
- langchain-groq==0.1.9 – integration with Groq LLMs
- streamlit==1.35.0 – for building interactive frontends
- python-dotenv – to manage environment variables
- The LinkedIn scraper requires a valid LinkedIn account
- Excessive scraping may lead to LinkedIn rate limiting or account restrictions
- For better post generation, collect at least 50-100 posts as examples
- The quality of generated posts depends on the quality and quantity of scraped examples
- This project uses Llama3-70B via the Groq API