fix: restrict starter code download to allowed file extensions #438

Open
atul-upadhyay-7 wants to merge 4 commits into komalharshita:main from atul-upadhyay-7:fix/file-extension-validation

Conversation

@atul-upadhyay-7

Description

The resolve_starter_file() function in utils/file_server.py had no file extension validation — any file present in the starter_code/ directory could be served as a download, even if it wasn't a legitimate code file.

Changes

  1. Added ALLOWED_EXTENSIONS — a set defining which file extensions are permitted: .py, .js, .html, .css, .json, .md, .txt

  2. Extension check in resolve_starter_file(): after extracting the basename, the function compares the file extension against ALLOWED_EXTENSIONS. If the extension is not in the set, it returns None, which causes the endpoint to return a 404.
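
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function signature, the `base_dir` parameter, the `STARTER_CODE_DIR` constant, and the case-insensitive comparison are assumptions made for the example. Only the names `ALLOWED_EXTENSIONS` and `resolve_starter_file()` and the extension list come from the description.

```python
import os

# Extensions listed in the PR description.
ALLOWED_EXTENSIONS = {".py", ".js", ".html", ".css", ".json", ".md", ".txt"}

# Assumed location of the starter files; the real constant may differ.
STARTER_CODE_DIR = "starter_code"


def resolve_starter_file(filename, base_dir=STARTER_CODE_DIR):
    """Return a path to a starter file, or None if it must not be served.

    Hypothetical signature: the actual function in utils/file_server.py
    may take different arguments.
    """
    if not filename:
        return None
    # Drop any directory components to block path traversal.
    safe_name = os.path.basename(filename)
    # splitext(".env") yields an empty extension, so dotfiles are rejected too.
    _, ext = os.path.splitext(safe_name)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        return None
    path = os.path.join(base_dir, safe_name)
    # Only existing regular files are served.
    return path if os.path.isfile(path) else None
```

Because `os.path.splitext()` treats a leading dot as part of the name, files such as `.env` and `.gitignore` have an empty extension and fall outside the allowed set without needing a separate rule.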

Security Impact

Previously, any file placed in starter_code/ (e.g. .env, .gitignore, private keys) could be downloaded via the /project/<id>/download endpoint. The extension check acts as a defense-in-depth measure alongside the existing os.path.basename() path traversal protection.

Verification

  • All 30 existing tests pass
  • .py, .html, .css files resolve normally
  • .env, no-extension, and other non-code extensions return None
  • Backward compatible with all existing starter code files

Closes #378

Atul Upadhyay added 4 commits May 22, 2026 08:35
…lly works

The cache variable and clear_cache() were defined but never actually
used by load_all_projects() — it always read from disk. This wires
the cache in so repeated calls avoid redundant I/O.
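
A rough sketch of what wiring the cache in might look like. The variable name `_cache`, the JSON file path, and the `path` parameter are assumptions for illustration; only `load_all_projects()` and `clear_cache()` are named in the commit message.

```python
import json

# Hypothetical module-level cache; None means "not loaded yet".
_cache = None


def load_all_projects(path="data/projects.json"):
    """Load projects from disk once, then serve later calls from memory."""
    global _cache
    if _cache is None:
        with open(path, encoding="utf-8") as fh:
            _cache = json.load(fh)
    return _cache


def clear_cache():
    """Force the next load_all_projects() call to re-read the file."""
    global _cache
    _cache = None
```

With this shape, edits to the data file are only picked up after `clear_cache()` is called or the process restarts.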
…vascript

'web dev' in SKILL_ALIASES mapped to only 'javascript', which meant
projects listing HTML and CSS (without JS) were completely excluded
from results. This changes the alias to a list so all three core web
skills get matched, and updates parse_skills() to handle list-valued
aliases by extending rather than appending.
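
The extend-versus-append distinction can be illustrated with a small sketch. The comma-separated input format, the lowercasing, and the `"js"` alias are assumptions for the example; the commit message only confirms that `'web dev'` now maps to a list covering HTML, CSS, and JavaScript.

```python
# Aliases may map to a single skill (str) or to several (list).
SKILL_ALIASES = {
    "web dev": ["html", "css", "javascript"],
    "js": "javascript",  # hypothetical single-valued alias
}


def parse_skills(raw):
    """Split a comma-separated skills string and expand known aliases."""
    skills = []
    for token in raw.split(","):
        key = token.strip().lower()
        if not key:
            continue
        mapped = SKILL_ALIASES.get(key, key)
        if isinstance(mapped, list):
            # extend() adds each skill; append() would nest the whole list.
            skills.extend(mapped)
        else:
            skills.append(mapped)
    return skills
```

Using `append()` on a list-valued alias would leave a nested list inside `skills`, which would never compare equal to a plain skill string during matching.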
…ebreaker

When multiple projects scored the same, the sort was unstable —
equal-scoring projects appeared in arbitrary order that could change
between runs or after JSON edits. Adding project id as a secondary
sort key guarantees consistent results for identical inputs.
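
A minimal sketch of the tiebreaker, assuming projects are dicts with an `id` key and arrive paired with a numeric score; `rank_projects()` is a hypothetical name, not one taken from the PR. Python's built-in sort keeps tied items in their input order, so without a secondary key the result depends on the order in which projects were loaded.

```python
def rank_projects(scored):
    """Sort (score, project) pairs by score descending, then by project id.

    The id acts as a tiebreaker so equal scores always come out in the
    same order, regardless of how the input list was ordered.
    """
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]["id"]))
```

Negating the score gives descending order on score while keeping ascending order on id within a single sort key.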
resolve_starter_file() only checked file existence but not the file
extension, meaning any file sitting in starter_code/ could be served
as a download. Added an ALLOWED_EXTENSIONS set and a check that
rejects files with extensions outside the expected code formats.
@vercel

vercel Bot commented May 22, 2026

Someone is attempting to deploy a commit to the komalsony234-1530's projects Team on Vercel.

A member of the Team first needs to authorize it.

github-actions Bot added the gssoc-2026, type:bug, type:performance, and type:security labels and removed the type:bug and gssoc-2026 labels on May 22, 2026

Development

Successfully merging this pull request may close these issues.

[CRITICAL] Arbitrary File Read vulnerability in starter code download endpoint

1 participant