Practical Assignment – Machine Learning 2025

Available documentation in Romanian here

Authors: Bechea Flavia-Ioana, Radu Marian-Sebastian
Date: January 2026

1. Introduction

This project addresses a ranking problem for upsell from the perspective of a restaurant owner. The goal is to increase sales by recommending relevant additional products (specifically sauces) to customers based on their current basket.

The core objective is to rank candidate products (sauces) by their estimated relevance to the customer and their potential revenue impact. Formally, for each candidate product $p$, we use Machine Learning algorithms to estimate the probability $P(p \mid \text{cart})$ that the customer buys $p$ given the current cart, and convert it into a ranking score equal to the expected revenue of recommending $p$:

$$ \text{Score}(p \mid \text{cart}) = P(p \mid \text{cart}) \cdot \text{price}(p) $$
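As a minimal sketch of how this score could turn model outputs into a ranking, the snippet below multiplies each probability by the product price and sorts in descending order. The function name, the probabilities, the prices, and the third sauce are hypothetical values chosen for illustration, not taken from the project code or dataset.

```python
def rank_by_score(probabilities, prices, k=3):
    """Return the top-k candidates ordered by P(p | cart) * price(p)."""
    scores = {p: probabilities[p] * prices[p] for p in probabilities}
    # Sort by descending score; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:k]


# Hypothetical probabilities and prices, for illustration only.
probs = {"Crazy Sauce": 0.40, "Garlic Sauce": 0.30, "Hypothetical Sauce": 0.10}
prices = {"Crazy Sauce": 5.0, "Garlic Sauce": 8.0, "Hypothetical Sauce": 6.0}
top = rank_by_score(probs, prices, k=2)
```

Here Garlic Sauce ranks first even though Crazy Sauce is more probable, because its higher price gives it the larger score.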

We evaluate the system by removing a target sauce from a receipt to form a partial cart and checking whether the top $K$ recommendations for that cart include the removed item.

2. Dataset & Preprocessing

The dataset consists of restaurant receipts from September to December 2025.

Preprocessing Strategy

Receipt Grouping: Raw data was grouped by id_bon (receipt ID) so that each row represents a single transaction.
Feature Engineering:
- Binary Product Vectors: Columns for each product (e.g., Crazy Schnitzel, Fries) acting as binary indicators (1 if present, 0 otherwise).
- Temporal Features: Extracted day_of_week (1-7) and hour from the timestamp to capture time-based preferences (e.g., weekend vs. weekday patterns).
- Cart Statistics: Total value of the cart and the number of items.
Target Variable: A binary variable indicating the presence of a specific sauce (e.g., Crazy Sauce, Garlic Sauce) in the receipt.
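The preprocessing steps above can be sketched in plain Python as follows. Only the `id_bon` column name comes from the description; the `product`, `price`, and `timestamp` keys, the ISO timestamp format, and the function name are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime


def build_receipt_rows(raw_rows, products):
    """Group line items by receipt and build one feature dict per receipt."""
    grouped = defaultdict(list)
    for row in raw_rows:
        grouped[row["id_bon"]].append(row)

    receipts = []
    for receipt_id, items in grouped.items():
        names = {item["product"] for item in items}
        ts = datetime.fromisoformat(items[0]["timestamp"])
        # Binary product vector: 1 if the product is on the receipt, else 0.
        features = {name: int(name in names) for name in products}
        features["day_of_week"] = ts.isoweekday()  # 1 = Monday ... 7 = Sunday
        features["hour"] = ts.hour
        features["cart_total"] = sum(item["price"] for item in items)
        features["n_items"] = len(items)
        receipts.append((receipt_id, features))
    return receipts


# Hypothetical raw line items, one dict per product sold.
raw = [
    {"id_bon": 1, "product": "Crazy Schnitzel", "price": 30.0, "timestamp": "2025-10-04T13:15:00"},
    {"id_bon": 1, "product": "Crazy Sauce", "price": 5.0, "timestamp": "2025-10-04T13:15:00"},
    {"id_bon": 2, "product": "Fries", "price": 12.0, "timestamp": "2025-10-06T19:40:00"},
]
rows = build_receipt_rows(raw, products=["Crazy Schnitzel", "Fries", "Crazy Sauce"])
```

In this sketch the target for a given sauce is simply that sauce's binary column, which would be separated from the input features before training.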

3. Methodology & Algorithms

We explored several classification algorithms to predict the probability of a sauce being ordered:

ID3 (Decision Tree)
Naive Bayes
Logistic Regression
AdaBoost

Manual Implementation

Based on initial experiments, Naive Bayes and Logistic Regression showed the most promise, outperforming the tree-based models (ID3 and AdaBoost) on this specific sparse, binary dataset. Consequently, we implemented these two algorithms from scratch to deepen our understanding of their mechanics.

Naive Bayes: Implemented with Laplace smoothing to handle the zero-frequency problem (a feature value never observed together with a class in the training data would otherwise force that class's probability to zero). It assumes that features are conditionally independent given the class, which is a strong assumption but works surprisingly well for sparse transaction data.
Logistic Regression: Implemented using Gradient Descent with L2 Regularization to prevent overfitting. It models the probability using the sigmoid function: $P(y=1 \mid x) = \sigma(w^T x + b)$, where $\sigma(z) = 1 / (1 + e^{-z})$.
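To make the two from-scratch models concrete, here is a compact sketch of a Bernoulli Naive Bayes classifier with Laplace smoothing and a logistic regression trained by batch gradient descent with an L2 penalty. It is not the project's actual code: the class names, the hyperparameters (`alpha`, `lr`, `l2`, `epochs`), the smoothed class prior, and the toy data are all assumptions made for illustration.

```python
import math


class BernoulliNaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed default)

    def fit(self, X, y):
        n_features = len(X[0])
        self.log_prior, self.theta = {}, {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            # Smoothed class prior and smoothed P(x_j = 1 | class c).
            self.log_prior[c] = math.log((len(rows) + self.alpha) / (len(X) + 2 * self.alpha))
            self.theta[c] = [
                (sum(r[j] for r in rows) + self.alpha) / (len(rows) + 2 * self.alpha)
                for j in range(n_features)
            ]
        return self

    def predict_proba(self, x):
        log_joint = {}
        for c in (0, 1):
            log_joint[c] = self.log_prior[c] + sum(
                math.log(t if xj else 1.0 - t) for xj, t in zip(x, self.theta[c])
            )
        # Normalise in log space for numerical stability.
        m = max(log_joint.values())
        e0, e1 = math.exp(log_joint[0] - m), math.exp(log_joint[1] - m)
        return e1 / (e0 + e1)


def sigmoid(z):
    # Branch on the sign of z to avoid overflow in math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LogisticRegressionGD:
    def __init__(self, lr=0.1, l2=0.01, epochs=500):
        self.lr, self.l2, self.epochs = lr, l2, epochs

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            # Prediction errors for the whole batch, using the current weights.
            errors = [self.predict_proba(x) - t for x, t in zip(X, y)]
            for j in range(d):
                grad = sum(e * x[j] for e, x in zip(errors, X)) / n + self.l2 * self.w[j]
                self.w[j] -= self.lr * grad
            self.b -= self.lr * sum(errors) / n  # the bias is not regularised
        return self

    def predict_proba(self, x):
        return sigmoid(sum(wj * xj for wj, xj in zip(self.w, x)) + self.b)


# Toy data: feature 0 stands for a product that always co-occurs with the sauce.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 1, 0]
nb = BernoulliNaiveBayes().fit(X, y)
lr_model = LogisticRegressionGD().fit(X, y)
```

Both models expose `predict_proba`, so either one can supply the $P(p \mid \text{cart})$ term of the ranking score.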

4. Experimental Results

We compared our models against a Popularity Baseline, which simply recommends sauces based on their global frequency (ignoring the specific cart context).
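A popularity baseline of this kind could look like the sketch below, which counts how many training receipts contain each sauce and always returns the same ordering regardless of the cart. The function name and the toy receipts are hypothetical.

```python
from collections import Counter


def popularity_ranking(receipts, sauces):
    """Rank sauces by the number of training receipts containing them."""
    counts = Counter(
        item for receipt in receipts for item in set(receipt) if item in sauces
    )
    # Most frequent first; ties are broken alphabetically for determinism.
    return sorted(sauces, key=lambda s: (-counts[s], s))


# Hypothetical training receipts, each given as a set of product names.
train = [
    {"Crazy Schnitzel", "Crazy Sauce"},
    {"Fries", "Crazy Sauce"},
    {"Fries", "Garlic Sauce"},
]
baseline = popularity_ranking(train, sauces=["Garlic Sauce", "Crazy Sauce"])
```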

Performance Metrics

The primary metric is Hit Rate @ K (for $K \in \{1, 3, 5\}$), measuring the percentage of test cases where the hidden sauce appeared in the top $K$ recommendations.
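The metric can be computed along the lines of the following sketch. It assumes that every sauce on a receipt is hidden in turn to create one test case each, which may differ from the project's exact protocol; the helper names and the fixed `recommend` function are hypothetical.

```python
def make_test_cases(receipts, sauces):
    """Hide each sauce of each receipt in turn; the rest is the partial cart."""
    cases = []
    for receipt in receipts:
        for sauce in sorted(receipt & sauces):
            cases.append((receipt - {sauce}, sauce))
    return cases


def hit_rate_at_k(test_cases, recommend, k):
    """Fraction of test cases whose hidden sauce is in the top-k recommendations."""
    if not test_cases:
        return 0.0
    hits = sum(
        1 for partial_cart, hidden in test_cases if hidden in recommend(partial_cart)[:k]
    )
    return hits / len(test_cases)


# Hypothetical receipts and a recommender that ignores the cart.
sauces = {"Crazy Sauce", "Garlic Sauce"}
receipts = [{"Crazy Schnitzel", "Crazy Sauce"}, {"Fries", "Garlic Sauce"}]
cases = make_test_cases(receipts, sauces)


def recommend(cart):
    return ["Crazy Sauce", "Garlic Sauce"]


hr1 = hit_rate_at_k(cases, recommend, k=1)
hr2 = hit_rate_at_k(cases, recommend, k=2)
```

With this fixed ordering, only the first receipt is a hit at $K=1$, while both are hits at $K=2$.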

Overall Comparison

As seen in the chart below, both Logistic Regression and Naive Bayes significantly outperform the baseline, especially at $K=1$ and $K=3$.

Figure 1: Hit Rate comparison between Manual Logistic Regression, Manual Naive Bayes, and the Baseline.

Logistic Regression provides the most stable and accurate ranking. Because it does not assume that features are independent, it can weight correlated cart items jointly, while L2 regularization limits overfitting.
Naive Bayes is a close second, proving robust to noise.
ID3 and AdaBoost (not shown in the manual comparison above, but analyzed in preliminary tests) tended to overfit or struggle with the class imbalance inherent in the dataset.

Detailed Analysis per Algorithm

Logistic Regression

Logistic Regression models the log-odds of a sauce purchase as a linear function of the cart features. The confusion matrix for Crazy Sauce (a popular item) shows a strong ability to correctly identify positive cases (True Positives) while maintaining a reasonable false positive rate.

Figure 2: Confusion Matrix for Crazy Sauce using Logistic Regression.
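For reference, a confusion matrix like the one in the figure can be derived from predicted probabilities as in the sketch below. The 0.5 decision threshold, the function name, and the labels and probabilities are assumptions for illustration; they are not the project's results.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding the predicted probabilities."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, prob in zip(y_true, y_prob):
        predicted = int(prob >= threshold)
        if predicted == 1 and truth == 1:
            counts["tp"] += 1
        elif predicted == 1 and truth == 0:
            counts["fp"] += 1
        elif predicted == 0 and truth == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical labels and predicted probabilities.
y_true = [1, 1, 0, 0, 1]
y_prob = [0.9, 0.4, 0.2, 0.6, 0.7]
cm = confusion_matrix(y_true, y_prob)

# True positive rate (recall) and false positive rate from the counts.
tpr = cm["tp"] / (cm["tp"] + cm["fn"])
fpr = cm["fp"] / (cm["fp"] + cm["tn"])
```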

Naive Bayes

Naive Bayes excels in speed and robustness. Despite the "naive" assumption of independence, it correctly identifies patterns in the binary data.

Figure 3: Confusion Matrix for Crazy Sauce using Naive Bayes.

5. Conclusion

Our analysis shows that Logistic Regression is the most effective model for this upsell ranking task. It successfully identifies patterns between cart items and sauce preferences, significantly outperforming the simple popularity baseline.

While Naive Bayes also performed well, tree-based models (ID3, AdaBoost) were less effective on this sparse, imbalanced dataset, where they tended to overfit. Ultimately, leveraging Machine Learning to understand the cart context provides much more accurate recommendations than relying on global popularity alone.
