A conjoint experiment measuring whether open-source LLMs exhibit price sensitivity and coherent economic preferences when choosing between hotel rooms.
LLMs have dramatically different economic "personalities." Same task, same prompt --- very different price-quality tradeoffs.
Qwen3 4B is an aggressive price optimizer with near-constant elasticity --- a doubling of price always hurts the same amount, whether from $50 to $100 or $2k to $4k. Textbook log-price consumer.
Gemma3 4B shows a threshold effect: it's insensitive to price below ~$300/night, then drops sharply. A $50 room and a $200 room are treated as essentially equivalent.
Llama 3.2 3B sits in between, with moderate sensitivity kicking in around $200.
AI agents are starting to book travel, compare products, and make purchases on behalf of users. Even with guardrails ("stay under $300, at least 3 stars"), the model's default preferences determine the marginal choice among qualifying options.
We benchmark LLMs on intelligence (MMLU, reasoning, code). We don't benchmark them on what they choose when there's no right answer --- just tradeoffs. That's exactly the situation a shopping agent faces.
3,600 binary hotel room comparisons, each varying four attributes:
| Attribute | Range | Distribution |
|---|---|---|
| Price | $20 -- $10,000/night | Log-uniform |
| Star rating | 2, 3, 4, 5 | Uniform |
| Room size | 150 -- 800 sq ft | Uniform (25 sq ft steps) |
| Bed type | single, double, queen, king | Uniform |
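For illustration, profiles matching these distributions can be drawn as follows. This is a sketch, not the repo's code; the actual logic lives in `generate_conjoint_tasks.py`.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_room(rng):
    """Draw one hotel room profile matching the design table above."""
    # Log-uniform price: uniform on log($20)..log($10,000), then exponentiate,
    # so each price doubling is equally likely across the whole range.
    price = float(np.exp(rng.uniform(np.log(20), np.log(10_000))))
    stars = int(rng.choice([2, 3, 4, 5]))
    sqft = int(rng.choice(np.arange(150, 801, 25)))  # 150-800 in 25 sq ft steps
    bed = str(rng.choice(["single", "double", "queen", "king"]))
    return {"price": round(price, 2), "stars": stars, "sqft": sqft, "bed": bed}
```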
Every task is scored twice per model (original + swapped option order) to cancel position bias, yielding 7,200 observations per model.
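Pooling the two runs requires flipping the choice label in the swapped run so that both orderings refer to the same underlying profile. A minimal sketch, with assumed file and column names (the repo's output schema may differ):

```python
import pandas as pd

# Hypothetical files and columns: task_id, choice in {"A", "B"}.
orig = pd.read_csv("results_original.csv").assign(swapped=0)
swap = pd.read_csv("results_swapped.csv").assign(swapped=1)
runs = pd.concat([orig, swap], ignore_index=True)

# In the swapped run the profiles were shown in reverse order, so flip the
# label: "chose_first" means "chose the original Option A profile" under
# either ordering, which cancels pure position bias on average.
runs["chose_first"] = ((runs["choice"] == "A") ^ (runs["swapped"] == 1)).astype(int)
```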
The utility function is estimated via binary logit. We use non-parametric price deciles --- instead of assuming a functional form (log or linear), we assign prices to decile bins and let each bin's coefficient speak for itself. This reveals the shape of the price response without imposing it.
We also estimate parametric specifications (log-price and linear-price) for comparison.
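A minimal sketch of the decile specification with `statsmodels` (column names are assumptions; see `estimation/nonparametric.py` for the actual implementation):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical input: one row per observation with columns price_a, price_b,
# and chose_a in {0, 1}; the repo's schema may differ.
df = pd.read_csv("scored_tasks.csv")

# Pool both options' prices, cut into deciles, and build dummy columns.
prices = pd.concat([df["price_a"], df["price_b"]], ignore_index=True)
dums = pd.get_dummies(pd.qcut(prices, 10, labels=False)).astype(float)

# The regressor is the A-minus-B difference in decile membership; drop the
# first decile as the base category (the ten difference columns sum to zero).
diff = dums.iloc[: len(df)].values - dums.iloc[len(df):].values
X = sm.add_constant(diff[:, 1:])

# Binary logit of "chose A" on the decile differences. The full model would
# also include star, size, and bed difference terms, omitted for brevity.
fit = sm.Logit(df["chose_a"].values, X).fit(disp=False)
print(fit.params)  # one coefficient per price decile, relative to decile 1
```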
All models run locally via Ollama with deterministic settings (temperature=0, top_k=1, seed=42).
| Model | Params | Quantization | Developer |
|---|---|---|---|
| Qwen3 4B | 4B | Q8_0 | Alibaba |
| Gemma3 4B | 4B | Default | Google DeepMind |
| Llama 3.2 3B | 3B | Default | Meta |
| Mistral 7B | 7B | FP16 | Mistral AI |
| Qwen3 4B (Q4) | 4B | Q4 | Alibaba |
You are booking a hotel room in New York, New York for a one-night stay.
You must choose between the following two options.
Option A:
- Star rating: {a_stars} stars
- Room size: {a_sqft} square feet
- Bed type: {a_bed}
- En-suite bathroom: Yes
- Price per night: ${a_price}
Option B:
- Star rating: {b_stars} stars
- Room size: {b_sqft} square feet
- Bed type: {b_bed}
- En-suite bathroom: Yes
- Price per night: ${b_price}
Which option do you choose? Reply with only the letter A or B.
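Scoring a single task with the deterministic settings above might look like this using the `ollama` Python client. This is a sketch (the real loop lives in `run_conjoint_llm.py`, and the prompt path below is hypothetical):

```python
import ollama

# Hypothetical path to one filled-in copy of the template above.
prompt = open("task_0001_prompt.txt").read()

# temperature=0 / top_k=1 / seed=42 pin decoding so reruns are reproducible.
response = ollama.generate(
    model="gemma3:latest",
    prompt=prompt,
    options={"temperature": 0, "top_k": 1, "seed": 42},
)
answer = response["response"].strip().upper()[:1]  # expect "A" or "B"
```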
Prerequisites: Python (numpy, pandas, statsmodels, matplotlib), Ollama
# 1. Generate tasks (or use existing conjoint_tasks.csv)
python generate_conjoint_tasks.py
# 2. Score a model (original + swap)
python run_conjoint_llm.py --model gemma3:latest
python run_conjoint_llm.py --model gemma3:latest --swap
# 3. Register in config.py, then estimate and plot
python estimation/parametric.py
python estimation/nonparametric.py
python plots/decile_coefficients.py
python plots/parametric_vs_nonparametric.py
python plots/by_star_tier.py
python plots/by_room_size.py

All scripts auto-adapt to the number of models registered in config.py.
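Registering a model in `config.py` might look something like this (a hypothetical structure; check the actual file for the real registry format):

```python
# config.py (illustrative excerpt, not the repo's actual contents)
MODELS = {
    "gemma3:latest": {"label": "Gemma3 4B",    "params": "4B"},
    "qwen3:4b":      {"label": "Qwen3 4B",     "params": "4B"},
    "llama3.2:3b":   {"label": "Llama 3.2 3B", "params": "3B"},
}
```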
config.py # Model registry & shared helpers
generate_conjoint_tasks.py # Creates conjoint_tasks.csv
run_conjoint_llm.py # Scores tasks via Ollama
estimation/
parametric.py # Log-price & linear-price logit
nonparametric.py # Price decile dummies logit
plots/
decile_coefficients.py # Price response curves (all models)
parametric_vs_nonparametric.py # Parametric vs non-parametric overlay
by_star_tier.py # Split by 2-3 vs 4-5 star hotels
by_room_size.py # Split by small vs large rooms
hero.py # Polished dark-theme plot (3 models)
figures/ # Committed output plots
The plots in `figures/` include all evaluated models. Mistral 7B exhibits zero sensitivity to any attribute: it picks "Option A" 99.9% of the time regardless of content (pure position bias). The Q4 quantization of Qwen3 shows substantially attenuated price sensitivity compared to Q8, with the curve plateauing at higher prices.