Craigslist Scraper

A fast CLI + Python tool to scrape Craigslist listings — title, price, location, and listing URL — to CSV or JSON. No API key, no browser automation.

Scrape apartments, cars, jobs, or any "for sale" category from any Craigslist city in one command. Built on the stable static HTML search page, so there's no flaky JSON decoding or headless browser to babysit.

Features

  • Listing data — title, price, location, and listing URL for every result
  • Any city, any category — pass a city subdomain (seattle, newyork, …) and a category code (apa, sss, jjj, cta, …)
  • No API key, no account, no Selenium — plain HTTP + HTML parsing
  • CSV / JSON / JSONL output — pipe-friendly, ready for Excel, pandas, or a database
  • Built-in rate limiting — polite request delay by default
  • Python API — use it as a library in your own scripts
  • Minimal dependencies: requests + beautifulsoup4

Installation

pip install craigslist-listings-scraper

Requires Python 3.10+.

Quick Start

# Apartments for rent in Seattle → pretty JSON
craigslist seattle apa

# Cars & trucks in New York → CSV file
craigslist newyork cta --format csv -o ny-cars.csv

# First 10 "for sale" listings in Los Angeles
craigslist losangeles sss --limit 10

# Just count the results
craigslist seattle apa --count

Example JSON record:

{
  "title": "Stunning 1 BR - Quartz Counters, Fitness Studio, Clubhouse",
  "price": "$1,995",
  "location": "Seattle",
  "url": "https://seattle.craigslist.org/see/apa/d/seattle-stunning-1-br/7891234568.html"
}
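
Prices come back as display strings such as "$1,995", so sorting or filtering by price needs a conversion step first. The helper below is a minimal sketch and not part of the package: `parse_price` is a hypothetical name, and it assumes prices look like the record above, returning None for anything it cannot read.

```python
import re


def parse_price(text):
    """Convert a display price like "$1,995" to an int, or None if unparseable.

    Hypothetical helper for illustration; not part of craigslist_scraper.
    """
    if not text:
        return None
    # Grab the first run of digits, allowing thousands separators.
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


record = {
    "title": "Stunning 1 BR - Quartz Counters, Fitness Studio, Clubhouse",
    "price": "$1,995",
    "location": "Seattle",
}
print(parse_price(record["price"]))  # 1995
```

Returning None instead of raising keeps a single listing with an empty or non-numeric price from aborting a whole batch.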

CLI Reference

craigslist [OPTIONS] CITY [CATEGORY]

| Argument / Flag | Default | Description |
| --- | --- | --- |
| CITY | (required) | Craigslist city subdomain, e.g. seattle, newyork, losangeles |
| CATEGORY | apa | Category code (see table below) |
| --limit N | all | Max listings to return |
| --delay SECONDS | 2.0 | Wait before the request (rate limiting) |
| --format, -f | json | Output format: csv, json, jsonl |
| --output, -o FILE | stdout | Write to file |
| --count | off | Print only the listing count |

Common category codes:

| Code | Category |
| --- | --- |
| apa | apartments / housing for rent |
| sss | all for sale |
| jjj | jobs |
| cta | cars & trucks |
| hhh | housing (all) |
| ggg | gigs |
| bbb | services |
| rrr | resumes |

You can find more category codes in the URL of any Craigslist search — they're the /search/{code} segment.
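
If you already have a search URL copied from the browser, the city and category code can be pulled out of it programmatically. This is a rough sketch using only the standard library. `city_and_category` is a hypothetical helper, not part of the package, and it assumes the plain `https://{city}.craigslist.org/search/{code}` shape described above. URLs with extra path segments, such as a sub-area, are not handled.

```python
from urllib.parse import urlparse

SUFFIX = ".craigslist.org"


def city_and_category(url):
    """Return (city, category) from a Craigslist search URL, or None.

    Hypothetical helper; assumes https://{city}.craigslist.org/search/{code}.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if not host.endswith(SUFFIX):
        return None
    city = host[: -len(SUFFIX)]
    segments = [s for s in parsed.path.split("/") if s]
    if "search" not in segments:
        return None
    idx = segments.index("search")
    if idx + 1 >= len(segments):
        return None
    # Take the segment immediately after /search/ as the category code.
    return city, segments[idx + 1]


print(city_and_category("https://newyork.craigslist.org/search/cta"))
# ('newyork', 'cta')
```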

Python API

from craigslist_scraper import scrape, parse_listings, search_url

# Scrape apartments in Seattle
listings = scrape("seattle", "apa", limit=20)
for item in listings:
    print(item["price"], item["location"], item["title"])

# Build a search URL without fetching
url = search_url("newyork", "cta")
# → https://newyork.craigslist.org/search/cta

# Parse HTML you've already downloaded (no network)
with open("saved_page.html") as f:
    rows = parse_listings(f.read())
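
Since each listing is a plain dict, the standard csv module is enough to write results from your own script. The sketch below uses hand-written sample rows so it runs without network access. `listings_to_csv` and `FIELDS` are hypothetical names, and the four keys are assumed to match the example JSON record shown earlier.

```python
import csv
import io

# Assumed column order, matching the keys in the example JSON record.
FIELDS = ["title", "price", "location", "url"]


def listings_to_csv(listings):
    """Serialize listing dicts to a CSV string with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops unexpected keys instead of raising.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(listings)
    return buffer.getvalue()


sample = [
    {
        "title": "Stunning 1 BR",
        "price": "$1,995",
        "location": "Seattle",
        "url": "https://seattle.craigslist.org/see/apa/d/seattle-stunning-1-br/7891234568.html",
    }
]
print(listings_to_csv(sample))
```

The csv module quotes "$1,995" automatically because it contains a comma, which is the main reason to avoid joining fields by hand.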

How it works

Craigslist renders its search results server-side as a list of <li class="cl-static-search-result"> elements — each containing the listing title, a .price, a .location, and an <a href> to the detail page. This tool fetches that page with a real Chrome fingerprint and parses those fields with BeautifulSoup. No JavaScript engine, no JSON decoding, no headless browser.
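
The parsing step described above can be sketched as follows. This is an illustration, not the tool's implementation: it uses the standard library's html.parser instead of BeautifulSoup so it runs with no dependencies. The sample markup is hand-written, and the `title` class name in particular is an assumption. Only `cl-static-search-result`, `.price`, `.location`, and the `<a href>` come from the description above.

```python
from html.parser import HTMLParser


class StaticResultParser(HTMLParser):
    """Collect title, price, location, and URL from static search results."""

    FIELDS = {"title", "price", "location"}  # "title" class is an assumption

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # record being built, or None outside a result <li>
        self._field = None  # field whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "li" and "cl-static-search-result" in classes:
            self._row = {"title": "", "price": "", "location": "", "url": ""}
        elif self._row is not None:
            # The first link inside the result is taken as the detail page URL.
            if tag == "a" and attrs.get("href") and not self._row["url"]:
                self._row["url"] = attrs["href"]
            matched = self.FIELDS.intersection(classes)
            if matched:
                self._field = matched.pop()

    def handle_data(self, data):
        if self._row is not None and self._field:
            self._row[self._field] += data

    def handle_endtag(self, tag):
        if tag == "li" and self._row is not None:
            # Strip surrounding whitespace once the whole record is collected.
            self.rows.append({k: v.strip() for k, v in self._row.items()})
            self._row = None
        self._field = None  # any closing tag ends the current field


# Hand-written sample in the assumed shape of one search result.
SAMPLE = """
<ol>
  <li class="cl-static-search-result">
    <a href="https://seattle.craigslist.org/see/apa/d/seattle-stunning-1-br/7891234568.html">
      <div class="title">Stunning 1 BR</div>
      <div class="details">
        <div class="price">$1,995</div>
        <div class="location">Seattle</div>
      </div>
    </a>
  </li>
</ol>
"""

parser = StaticResultParser()
parser.feed(SAMPLE)
print(parser.rows)
```

Because the selectors are the only Craigslist-specific part, a markup change should only require updating the class names.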

Limitations

This is a lightweight tool that reads publicly available listing data. Be aware:

  • No posting dates. The static HTML search page does not include the post date/time for each listing — only title, price, location, and URL. (Fetching each detail page would add the date but also a request per listing.)
  • Rate limiting is real. Craigslist aggressively throttles datacenter and cloud IPs, so keep it to roughly ≤ 5 requests/minute, which works out to about 12 seconds between requests. The default --delay 2.0 helps for a single run; for larger jobs you'll need to slow down further and/or rotate residential IPs.
  • One page per request. This returns the listings on a single search page. For deep pagination across thousands of results you'll need to handle paging and IP rotation yourself.
  • Markup can change. If Craigslist changes their static-result markup, the selectors may need updating.
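
For scripts that make several requests in a row, one way to stay near the ≤ 5 requests/minute guideline is a small throttle that enforces a minimum gap between calls. The class below is a minimal sketch and not part of the package. `Throttle` is a hypothetical name, and the clock and sleep functions are injectable only so the behavior can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls to wait()."""

    def __init__(self, max_per_minute=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, or None before the first

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


print(Throttle(max_per_minute=5).min_interval)  # 12.0

# Possible usage with the Python API (left commented out: needs network access):
# throttle = Throttle(max_per_minute=5)
# for city in ["seattle", "newyork"]:
#     throttle.wait()
#     listings = scrape(city, "apa")
```

The first call returns immediately, and later calls sleep only for whatever part of the 12 seconds has not already elapsed.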

💡 Don't want to write code or handle rate limits? Thunderbit is an AI web scraper Chrome extension that scrapes Craigslist (and any site) in 2 clicks, no code — it handles pagination, subpages, and anti-bot for you. Built for sales, marketing & ops teams who don't code.

Development

git clone https://github.com/thunderbit-operations/craigslist-scraper.git
cd craigslist-scraper
pip install -e ".[dev]"
pytest

Tests run against a saved HTML fixture and require no network access.

Legal

Scrape responsibly and at a polite rate. Only collect publicly available data, and review Craigslist's Terms of Use and your local regulations before use.

License

MIT — Built by Thunderbit, AI-powered web scraper & data extraction tools.
