Shark Tank AI Analyzer

A comprehensive AI-powered analysis system for Shark Tank pitches, built with LangGraph and Streamlit. It provides investment analysis, pitch evaluation, and success prediction based on historical Shark Tank data from the US, India, and Australia.

Features

Multi-Agent Analysis System

Query Classifier: Routes queries to appropriate analysis paths
Country Analyzer: Analyzes patterns across different countries
Shark Profiler: Profiles individual shark investment strategies
Industry Analyzer: Identifies hot industries and success patterns
Pitch Evaluator: Evaluates specific pitches and ideas
ML Analyzer: Advanced machine learning predictions
Success Predictor: Predicts deal success probability
Recommendation Engine: Generates actionable recommendations
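
As a rough illustration of how a query classifier could route a question to one of the analysis paths listed above, the sketch below uses plain keyword matching. The route names, keyword sets, and the `classify_query` function are hypothetical and are not taken from langgraph_workflow.py, which may well use an LLM for this step.

```python
import re

# Hypothetical route names and keywords, chosen only for illustration.
# "shark" (singular) is left out on purpose because it appears in the show's name.
ROUTES = {
    "pitch_evaluator": {"pitch", "equity", "ask", "asking", "valuation"},
    "country_analyzer": {"country", "countries", "us", "india", "australia"},
    "shark_profiler": {"sharks", "investor", "investors"},
    "industry_analyzer": {"industry", "industries", "sector", "sectors"},
}


def classify_query(query: str) -> str:
    """Return the route whose keywords overlap most with the query, or 'general'."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {route: len(words & keywords) for route, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first route in ROUTES
    return best if scores[best] > 0 else "general"


if __name__ == "__main__":
    print(classify_query("Compare investment patterns between US and India sharks"))
```

A keyword router like this is cheap and deterministic, but it misreads anything phrased outside its vocabulary, which is one reason an LLM-based classifier is a common alternative.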

Advanced Analytics

Machine Learning Models: Random Forest classifier for success prediction
Interactive Visualizations: Plotly charts for data exploration
Real-time Analysis: Live processing of pitch queries
Comprehensive Reports: Detailed markdown reports with insights

Interface

Natural Language Queries: Ask questions in plain English
File Upload Support: Upload pitch decks, business plans, or data files
Chat History: Track all previous analyses
Export Functionality: Download reports and analysis data

Project Structure

```
Sharktank_GPT_Streamlit/
├── streamlit_app.py              # Main Streamlit application
├── langgraph_workflow.py         # LangGraph workflow implementation
├── groq_integration.py           # Groq LLM integration
├── advanced_analysis.py          # ML and advanced analytics
├── config.py                     # Configuration settings
├── requirements.txt              # Python dependencies
├── README.md                     # This file
├── Shark Tank US dataset.csv     # US dataset
├── Shark Tank India.csv          # India dataset
├── Shark Tank Australia dataset.csv  # Australia dataset
└── shark_tank_merged.csv         # Merged dataset
```

Prerequisites

Python 3.8 or higher
Required CSV dataset files in the project directory
Groq API key (get from https://console.groq.com/)
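
The Python requirement above can be checked before installing anything. This is a minimal sketch; the `meets_minimum` helper and the `MIN_PYTHON` constant are names invented for this example and are not part of the project.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def meets_minimum(version=None) -> bool:
    """Return True if a (major, minor, ...) version tuple satisfies MIN_PYTHON."""
    current = tuple(version or sys.version_info)[:2]
    return current >= MIN_PYTHON


if __name__ == "__main__":
    if not meets_minimum():
        raise SystemExit("Python 3.8 or higher is required")
    print("Python version OK")
```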

Installation

Option 1: Direct Installation

Clone or download the project files:

```
git clone https://github.com/yourusername/Sharktank_GPT_Streamlit.git
cd Sharktank_GPT_Streamlit
```

Install dependencies:
```
pip install -r requirements.txt
```

Set up environment variables by creating a .env file in the project root:

```
GROQ_API_KEY=your_groq_api_key_here
LANGFUSE_SECRET_KEY=sk-lf-your_secret_key_here  # Optional
LANGFUSE_PUBLIC_KEY=pk-lf-your_public_key_here  # Optional
LANGFUSE_BASE_URL=https://cloud.langfuse.com    # Optional
LANGFUSE_ENABLED=false                          # Optional
```

Ensure all CSV dataset files are in the project root directory:
- Shark Tank US dataset.csv
- Shark Tank India.csv
- Shark Tank Australia dataset.csv
- shark_tank_merged.csv

Run the application:
```
streamlit run streamlit_app.py
```
Open your browser to http://localhost:8501

Option 2: Virtual Environment (Recommended)

Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:
- On Windows:
```
venv\Scripts\activate
```
- On Mac/Linux:
```
source venv/bin/activate
```
Install dependencies:
```
pip install -r requirements.txt
```
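Then follow the remaining steps from Option 1: create the .env file, confirm the CSV files are in the project root, run the application, and open the browser.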

Configuration

Environment Variables

The app requires environment variables to be set. Create a .env file in the root directory:

```
GROQ_API_KEY=your_groq_api_key_here
LANGFUSE_SECRET_KEY=sk-lf-...  # Optional, for observability
LANGFUSE_PUBLIC_KEY=pk-lf-...  # Optional, for observability
LANGFUSE_BASE_URL=https://cloud.langfuse.com  # Optional
LANGFUSE_ENABLED=false  # Set to true to enable Langfuse
```

Important: Never commit your .env file to version control. It's already in .gitignore.

Langfuse Setup (Optional): To enable observability with Langfuse:

Sign up at https://cloud.langfuse.com
Create a new project
Get your API keys from Settings → API Keys
Add them to your .env file and set LANGFUSE_ENABLED=true
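
To show how these variables might be consumed, here is a minimal sketch of a settings loader. The `load_settings` function and the keys of the dictionary it returns are assumptions made for this example; config.py may organize things differently. Locally, calling python-dotenv's `load_dotenv()` first would copy the .env values into `os.environ`.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        # The Groq key is the only variable the README marks as required.
        raise RuntimeError("GROQ_API_KEY is not set")
    return {
        "groq_api_key": api_key,
        # Only the literal string "true" (any case) switches Langfuse on.
        "langfuse_enabled": env.get("LANGFUSE_ENABLED", "false").strip().lower() == "true",
        "langfuse_base_url": env.get("LANGFUSE_BASE_URL", "https://cloud.langfuse.com"),
    }


if __name__ == "__main__":
    demo = load_settings({"GROQ_API_KEY": "example-key", "LANGFUSE_ENABLED": "true"})
    print(demo["langfuse_enabled"])
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.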

Customization

Edit config.py to customize:

Analysis thresholds
Visualization colors
File upload limits
Report settings

Usage Examples

Sample Queries

Pitch Analysis:

"I want to pitch a food tech startup asking for $500k for 15% equity"

Industry Research:

"What are the most successful industries in Shark Tank?"

Shark Comparison:

"Compare investment patterns between US and India sharks"

Success Prediction:

"Analyze my pitch: AI-powered fitness app, $1M ask, 20% equity"
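
Queries like the ones above carry two numbers that drive most of the analysis: the ask and the equity share. The sketch below shows one way they could be pulled out of free text and turned into an implied valuation (ask divided by equity share). The `parse_pitch` function and its regular expressions are illustrative assumptions, not the app's actual parser.

```python
import re

# Suffix multipliers for shorthand amounts such as "$500k" or "$1M".
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def parse_pitch(text: str) -> dict:
    """Extract the ask and equity share from a query and derive the implied valuation."""
    ask = re.search(r"\$\s*(\d[\d,]*(?:\.\d+)?)\s*([kKmM])?", text)
    equity = re.search(r"(\d+(?:\.\d+)?)\s*%", text)
    if ask is None or equity is None:
        raise ValueError("expected an ask like '$500k' and an equity share like '15%'")
    amount = float(ask.group(1).replace(",", ""))
    amount *= _MULTIPLIERS.get((ask.group(2) or "").lower(), 1)
    share = float(equity.group(1)) / 100
    if share <= 0:
        raise ValueError("equity share must be greater than zero")
    return {"ask": amount, "equity": share, "implied_valuation": amount / share}


if __name__ == "__main__":
    result = parse_pitch("Analyze my pitch: AI-powered fitness app, $1M ask, 20% equity")
    print(f"Implied valuation: ${result['implied_valuation']:,.0f}")
```

For the sample queries, $500k for 15% implies a valuation of roughly $3.33M, and $1M for 20% implies $5M.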

File Upload

Upload CSV files with pitch data
Upload text files with business descriptions
Upload markdown files with pitch decks

Analysis Features

Success Prediction

Probability Score: 0-100% success likelihood
Confidence Level: Model confidence in prediction
Success Level: High/Medium/Low classification
ML Models: Random Forest with feature importance
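
The mapping from a probability score to a High/Medium/Low level can be sketched as a simple threshold function. The cut-offs of 0.7 and 0.4 below are placeholders chosen for illustration; the thresholds actually used by the app are not documented here and may live in config.py.

```python
def success_level(probability: float, high: float = 0.7, medium: float = 0.4) -> str:
    """Bucket a success probability in [0, 1] into High, Medium, or Low.

    The default thresholds are hypothetical, not the project's real values.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high:
        return "High"
    if probability >= medium:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    for p in (0.85, 0.55, 0.2):
        print(f"{p:.0%} -> {success_level(p)}")
```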

Investment Patterns

Country Analysis: Success rates by country
Industry Trends: Hot industries and success patterns
Shark Profiles: Individual investment strategies
Gender Analysis: Investment patterns by gender

Risk Assessment

Equity Analysis: Optimal equity ranges
Valuation Checks: Reasonable ask amounts
Industry Risks: Sector-specific challenges
Market Factors: External risk considerations

Visualizations

Interactive Charts: Plotly-powered visualizations
Country Comparison: Success rates and metrics
Industry Analysis: Performance by sector
Shark Profiles: Investment patterns
Trend Analysis: Historical patterns

Reports

Executive summary with key metrics
Detailed country and industry analysis
Shark investment profiles
Success factors and risk assessment
Actionable recommendations
Downloadable markdown format
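
As an illustration of the downloadable markdown format, the sketch below assembles a small report from a metrics dictionary and a list of recommendations. The `build_report` function, the section titles, and the sample values are invented for this example and do not reflect the app's real report template.

```python
def build_report(title: str, metrics: dict, recommendations: list) -> str:
    """Render a minimal markdown report with key metrics and numbered recommendations."""
    lines = [f"# {title}", "", "## Key Metrics", ""]
    lines += [f"- **{name}**: {value}" for name, value in metrics.items()]
    lines += ["", "## Recommendations", ""]
    lines += [f"{i}. {text}" for i, text in enumerate(recommendations, start=1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are made up purely to show the output shape.
    print(build_report(
        "Pitch Analysis",
        {"Ask": "$500k", "Equity": "15%"},
        ["Prepare a clear justification for the valuation"],
    ))
```

The returned string can be written to a .md file or offered through a download button in the interface.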

Deployment to Streamlit Cloud

Prerequisites

GitHub account with repository set up
Streamlit Cloud account (sign up at https://share.streamlit.io)
Groq API key

Step 1: Push Code to GitHub

Initialize git repository (if not already done):
```
git init
```
Add all files:
```
git add .
```

Commit changes:

```
git commit -m "Ready for Streamlit deployment"
```

Create a new repository on GitHub (if not exists):
- Go to https://github.com and sign in
- Click "+" icon → "New repository"
- Repository name: Sharktank_GPT_Streamlit
- Choose Public (for free Streamlit Cloud) or Private
- DO NOT initialize with README, .gitignore, or license
- Click "Create repository"

Connect and push to GitHub:

```
git remote add origin https://github.com/YOUR_USERNAME/Sharktank_GPT_Streamlit.git
git branch -M main
git push -u origin main
```

Note: If you get authentication errors, use a GitHub Personal Access Token:

GitHub → Settings → Developer settings → Personal access tokens → Tokens (classic)
Generate new token with repo permissions
Use token as password when pushing

Step 2: Deploy to Streamlit Cloud

Go to https://share.streamlit.io and sign in with GitHub
Click "New app"
Configure the app:
- Select your repository: Sharktank_GPT_Streamlit
- Select branch: main (or master)
- Main file path: streamlit_app.py

Add Secrets (API keys):

Go to app settings → Secrets

Add your environment variables:

```
GROQ_API_KEY = "your_groq_api_key_here"
LANGFUSE_SECRET_KEY = "sk-lf-..."  # Optional
LANGFUSE_PUBLIC_KEY = "pk-lf-..."  # Optional
LANGFUSE_BASE_URL = "https://cloud.langfuse.com"
LANGFUSE_ENABLED = "false"
```

Click "Deploy" and wait 2-5 minutes
Your app will be live at: https://your-app-name.streamlit.app

Important Notes for Deployment

Dataset Files

Make sure your CSV files are committed to the repository:

Shark Tank US dataset.csv
Shark Tank India.csv
Shark Tank Australia dataset.csv
shark_tank_merged.csv

These files should be in the root directory of your repository and are required for the app to function.
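
A quick pre-deployment check can confirm that all four files are present before pushing. This is a minimal sketch; the `missing_datasets` helper is a name invented for this example and is not part of the project.

```python
from pathlib import Path

# File names exactly as listed in this README.
REQUIRED_DATASETS = (
    "Shark Tank US dataset.csv",
    "Shark Tank India.csv",
    "Shark Tank Australia dataset.csv",
    "shark_tank_merged.csv",
)


def missing_datasets(root=".") -> list:
    """Return the required dataset files that are not present under root."""
    root_path = Path(root)
    return [name for name in REQUIRED_DATASETS if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        raise SystemExit("Missing dataset files: " + ", ".join(missing))
    print("All dataset files found")
```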

Requirements.txt

Your requirements.txt file is already configured with all necessary dependencies. Streamlit Cloud will automatically install them.

Memory and Performance

The Streamlit Cloud free tier provides 1GB of RAM
For heavy ML workloads, consider upgrading to paid tier
The app loads datasets at startup, so initial load may take a few seconds

Environment Variables

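The app uses a .env file locally, loaded by python-dotenv. On Streamlit Cloud, use the Secrets feature instead: python-dotenv does not read Streamlit secrets, but root-level secrets are exposed to the app as environment variables and through st.secrets, so no .env file is needed there.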

Updating Your App

To update your deployed app:

Make changes to your code

Commit and push to GitHub:

```
git add .
git commit -m "Update description"
git push origin main
```

Streamlit Cloud will automatically redeploy
You can also manually trigger redeploy from the app settings

Troubleshooting

Common Issues

Import Errors:

```
pip install --upgrade -r requirements.txt
```

Groq API Errors:
- Check your internet connection
- Verify the API key in .env file or Streamlit Cloud secrets
- Get your API key from https://console.groq.com/

File Not Found Errors:
- Verify CSV files are in the repository root directory
- Check file names match exactly (case-sensitive)
- Ensure files are committed to GitHub

API Key Errors:
- Verify secrets are set correctly in Streamlit Cloud
- Check that keys don't have extra spaces or quotes
- Ensure .env file exists locally with correct keys

Memory Issues:
- Dataset files might be too large for free tier
- Consider using data caching (already implemented in the code)
- Close other applications if running locally

Port Already in Use:

```
streamlit run streamlit_app.py --server.port 8502
```
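
If it is unclear which port is free, a short script can probe for one before launching. The sketch below tries to bind to successive local ports starting at Streamlit's default of 8501; `find_free_port` is a hypothetical helper, not something shipped with the project.

```python
import socket


def find_free_port(start: int = 8501, attempts: int = 20) -> int:
    """Return the first port in [start, start + attempts) that can be bound locally."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port is taken, try the next one
            return port
    raise RuntimeError(f"no free port found in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    port = find_free_port()
    print(f"streamlit run streamlit_app.py --server.port {port}")
```

The probe socket is closed before the function returns, so another process could still claim the port in the moment before Streamlit starts.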

Virtual Environment Issues:
- Ensure virtual environment is activated
- Reinstall packages: pip install -r requirements.txt
- Verify Python version: python --version (should be 3.8+)

Git Authentication Issues:
- Use GitHub Personal Access Token instead of password
- Or use SSH: git@github.com:YOUR_USERNAME/Sharktank_GPT_Streamlit.git

Checking Logs (Streamlit Cloud)

Go to your app in Streamlit Cloud
Click the menu (three dots) in the top right
Select "Manage app"
View logs for error messages

Large Files

If any CSV file exceeds GitHub's 100MB per-file limit:

Consider using Git LFS: git lfs install && git lfs track "*.csv"
Or upload datasets to cloud storage and load from URL
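
To find out whether any dataset crosses the limit before pushing, the file sizes can be checked locally. This is a minimal sketch; `oversized_csvs` and `GITHUB_FILE_LIMIT` are names made up for this example.

```python
from pathlib import Path

# GitHub blocks individual files larger than 100 MiB on a normal push.
GITHUB_FILE_LIMIT = 100 * 1024 * 1024


def oversized_csvs(root=".", limit: int = GITHUB_FILE_LIMIT) -> list:
    """Return the names of CSV files directly under root that exceed limit bytes."""
    return sorted(p.name for p in Path(root).glob("*.csv") if p.stat().st_size > limit)


if __name__ == "__main__":
    too_big = oversized_csvs()
    if too_big:
        print("Consider Git LFS or external storage for:", ", ".join(too_big))
    else:
        print("All CSV files are within the limit")
```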

Security Best Practices

DO:

Use Streamlit Secrets for all API keys on Streamlit Cloud
Keep your .gitignore updated
Never commit .env files
Review your code before pushing to GitHub
Use environment variables instead of hardcoded values

DON'T:

Hardcode API keys in your code
Commit sensitive data to GitHub
Share your repository secrets publicly
Use production API keys in development

System Requirements

Python 3.8 or higher
4GB RAM recommended
Internet connection for Groq API
Modern web browser

Dependencies

All required packages are listed in requirements.txt:

streamlit
pandas
numpy
plotly
langgraph
langchain
langchain-groq
groq
scikit-learn
xgboost
shap
python-dotenv
langfuse

Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly
Submit a pull request

License

This project is open source and available under the MIT License.

Support

For issues or questions:

Check the troubleshooting section above
Review the code comments
Open an issue on GitHub
Check Streamlit Cloud documentation: https://docs.streamlit.io/streamlit-cloud
Visit Streamlit Community: https://discuss.streamlit.io

Future Enhancements

Real-time data updates
Additional ML models
API integration
Mobile app version
Advanced NLP features
Custom dashboard creation

Built with LangGraph, Streamlit, and Python
