Brief tour of the codebase and the public API. It is intended for users and those working on the interpreter itself.
This document is maintained by AI following the writing style guide. You will find a copy of the official Starlark language spec in the same directory.
These are the user-visible places this implementation diverges from the
Bazel Java reference and from starlark-go. Everything else aims for
exact behavioral and string-output equivalence.
- Integers are Python `int`. Arbitrary precision; no overflow. The Java reference uses a `StarlarkInt` union of int32 / int64 / BigInteger and surfaces overflow at the boundary.
- Strings are indexed by Unicode code point. The Java reference indexes by UTF-16 code unit, which produces surprising results on non-BMP characters (a single emoji is one index in our world, two indices in Java's). The spec leaves this implementation-defined.
- No 32-bit-range checks for `range()`, `*` (repeat), etc. Native `int` arithmetic doesn't overflow, so these checks would be artificial. We instead cap container allocations at 16M elements with a less specific error message.
- `if` and `for` are allowed at top level. This matches `starlark-go` in `-globalreassign` mode and the `.bzl` file dialect; it diverges from BUILD-file mode, which forbids them. We don't currently distinguish dialects — the host is expected to apply a stricter pre-check if it cares (Bazel does this via `FileOptions`).
- `while` and recursion are forbidden. Same as both references in their default mode.
- `load()` is host-mediated. The host supplies a `Loader` callable (`Callable[[str], Module]`); without one, `load()` raises. There is no filesystem access by default.
- `print()` writes to stderr and ends with a newline. This matches both references; cross-validation in `tests/test_cross_validation.py` asserts byte-equal stderr+stdout output.
src/starlark/
__init__.py Public API: eval, exec_file, compile, EvalError,
Module, Thread, plus re-exports of values/syntax names.
values.py Public host integration surface: Dict, StarlarkList,
Mutability, to_value, from_value, namespace, etc.
program.py compile() and Program (parse-once / run-many).
cmd.py Argparse CLI ('starlark-python' console script & 'python -m starlark').
__main__.py Trampoline so 'python -m starlark' works.
syntax/ Source -> AST. Mirrors net.starlark.java.syntax.
tokens.py TokenKind enum + Token dataclass.
location.py FileLocations: offset -> (line, column) lookup.
errors.py SyntaxError record + StarlarkSyntaxException.
lexer.py Stream-style scanner with INDENT/OUTDENT.
ast.py Dataclass nodes for every grammar construct.
parser.py Recursive-descent parser; produces a StarlarkFile.
resolver.py Classifies each Identifier (LOCAL/FREE/GLOBAL/...);
computes per-function locals + freevars; structural
checks (break outside loop, etc.).
eval/ Runtime. Mirrors net.starlark.java.eval.
errors.py EvalError + CallFrame.
mutability.py The per-Module Mutability token.
module.py Module: globals dict + Mutability.
values.py Native types (None/bool/int/float/str/tuple) +
wrappers (StarlarkList, Dict, StarlarkSet, Range,
BuiltinFunction). Helpers: starlark_type, truth,
equal, less_than, repr_starlark, str_starlark,
check_hashable.
function.py StarlarkFunction (def / lambda) + bind_arguments.
evaluator.py Tree-walking interpreter. Frame, Thread, eval_file,
call.
methods.py Per-type method-table dispatch (string / list /
dict / set).
string_methods.py All string methods.
collection_methods.py All list / dict / set methods.
builtins.py Universal builtins (len, range, sorted, sum, ...).
json_module.py json.encode / decode / encode_indent / indent.
test_driver.py Bazel ScriptTest-style predeclared helpers
(assert_eq, assert_, assert_fails, freeze, struct).
loader.py Loader protocol + FileLoader for load() statements.
tests/ Pytest suite, ~400 tests.
test_lexer.py / test_parser.py / test_resolver.py / test_values.py /
test_eval.py / test_methods.py / test_builtins.py / test_load.py /
test_json.py
test_lexer_conformance.py / test_parser_conformance.py /
test_resolver_conformance.py
test_conformance.py Parameterized over conformance/*.star, splits
chunks on '\n---\n', honors '### regex' error
markers exactly as Bazel ScriptTest.java does.
test_cross_validation.py Optional cross-check against starlark-go's
binary; skipped cleanly when not on PATH.
conformance/ 38 .star files copied verbatim from Bazel.
Everything lives at the package root. The whole surface is:
import starlark

`starlark.eval` parses, resolves, and evaluates source as a single
expression and returns the value. `**env` is added to the universal
namespace on top of the built-in set (`len`, `range`, `print`, `json`, …).
starlark.eval("1 + 2") # 3
starlark.eval("len('hello')") # 5
starlark.eval("json.encode([1, 2])")  # '[1,2]'

Equivalent to `starlark.compile(source, mode="expression").eval(**env)`.
`starlark.compile` parses source once into a reusable `Program`. The same Program can be
invoked many times with different host environments — useful for
data-transformation pipelines that apply one script to many inputs.
mode is "auto" (default; tries expression first, falls back to
file), "expression", or "file". The mode="file" override exists
because a single-line do_something() parses as a valid expression
and as a one-statement file; the host has to say which it meant.
program = starlark.compile("[x * 2 for x in data]")
program.eval(data=[1, 2, 3]) # [2, 4, 6]
program.eval(data=[10])       # [20]

Program.eval(*, predeclared=None, universal=None, max_steps=..., loader=..., **env) -> Any evaluates an expression Program.
Program.exec(*, predeclared=None, universal=None, max_steps=..., loader=...) -> Module executes a file Program. Each call gets a
fresh Module and Thread; only the parsed AST is reused.
A Program is per-thread. Each .eval() / .exec() entry takes a
non-blocking re-entrant lock; concurrent cross-thread use raises
RuntimeError("Program is in use by another thread; ...") instead
of silently racing on Identifier.binding. Same-thread re-entry
(e.g. a host builtin that calls back into the same Program) works
fine. To run in parallel, compile() once per thread. The
top-level eval() / exec_file() entry points are unaffected:
each call compiles internally and shares no AST state.
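The raise-instead-of-block guard described above can be sketched in isolation with `threading.RLock` (the `ProgramGuard` class and the exact message wording here are illustrative, not the package's actual internals):

```python
import threading

class ProgramGuard:
    """Illustrative guard (hypothetical name): raise on cross-thread
    concurrent entry, allow same-thread re-entry."""

    def __init__(self):
        self._lock = threading.RLock()

    def __enter__(self):
        # Non-blocking acquire: succeeds for the owning thread (RLock
        # re-entry), fails immediately for any other thread.
        if not self._lock.acquire(blocking=False):
            raise RuntimeError(
                "Program is in use by another thread; compile() once per thread"
            )
        return self

    def __exit__(self, *exc):
        self._lock.release()

guard = ProgramGuard()

with guard:
    with guard:              # same-thread re-entry works
        reentered = True

errors = []

def worker():
    try:
        with guard:
            pass
    except RuntimeError as e:
        errors.append(str(e))

with guard:                  # hold the guard while another thread tries it
    t = threading.Thread(target=worker)
    t.start()
    t.join()
```

`RLock.acquire(blocking=False)` is what makes both behaviors fall out of one primitive: re-entry by the owner succeeds, any other thread fails immediately instead of waiting.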
starlark.exec_file(source, filename="<file>", *, predeclared=None, universal=None, loader=None, max_steps=None, on_max_steps=None, max_allocs=None, on_max_allocs=None) -> Module
Parse, resolve, and execute source as a Starlark file. Returns the
populated Module. predeclared adds host-supplied names visible to
this file only; universal adds names to the read-only universe;
loader resolves load() statements. The max_* / on_max_* kwargs
configure the opt-in resource limits — see "Resource limits" below.
Equivalent to starlark.compile(source, mode="file").exec(...).
m = starlark.exec_file('''
def fact(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
z = fact(5)
''')
m.globals["z"]  # 120
m.freeze()      # all values created in this module are now read-only

`starlark.Module` is a container of module-global bindings (`module.globals: dict[str, Any]`) plus
the Mutability token (module.mutability) shared by every mutable value
created during execution. module.freeze() is O(1) and locks every owned
value.
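Why freeze() is O(1): every mutable value holds a reference to the same token, so flipping one flag locks them all. A minimal standalone sketch (stand-in classes, not the package's own):

```python
class Mutability:
    """Shared token: freezing it freezes every value that references it."""

    def __init__(self):
        self.frozen = False

    def freeze(self):
        self.frozen = True        # O(1): one flag, no per-value traversal

class ListValue:
    """Stand-in for a mutable container sharing the module's token."""

    def __init__(self, items, mutability):
        self.items = list(items)
        self.mutability = mutability

    def append(self, x):
        if self.mutability.frozen:
            raise ValueError("cannot mutate a frozen list")
        self.items.append(x)

mut = Mutability()
a = ListValue([1, 2], mut)
b = ListValue([3], mut)
a.append(99)                      # fine while unfrozen
mut.freeze()                      # locks a and b together

try:
    b.append(4)
    frozen_error = None
except ValueError as e:
    frozen_error = str(e)
```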
`starlark.Thread` is the runtime state for a single evaluation: the executing module, the
predeclared/universal envs, the call stack, and an optional loader.
Most users construct one indirectly via exec_file/eval.
`starlark.EvalError` is raised for any runtime semantic error (type mismatch, division by zero,
frozen mutation, undefined name, etc.). Carries message: str and
frames: list[CallFrame] for traceback rendering.
StepLimitExceeded and AllocLimitExceeded both subclass
ResourceLimitExceeded, which subclasses EvalError. Existing
except EvalError handlers catch them; hosts that want to distinguish
"DoS-style abort" from a normal Starlark error catch
ResourceLimitExceeded. See "Resource limits" below.
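The catch pattern can be illustrated with a standalone mirror of the documented hierarchy (classes redeclared here for the example only):

```python
# Mirror of the documented hierarchy, redeclared for this example only.
class EvalError(Exception): ...
class ResourceLimitExceeded(EvalError): ...
class StepLimitExceeded(ResourceLimitExceeded): ...
class AllocLimitExceeded(ResourceLimitExceeded): ...

def classify(exc):
    # Test the narrower class first to separate DoS-style aborts
    # from ordinary Starlark errors.
    try:
        raise exc
    except ResourceLimitExceeded:
        return "limit"
    except EvalError:
        return "starlark-error"

kinds = [
    classify(StepLimitExceeded()),
    classify(AllocLimitExceeded()),
    classify(EvalError()),
]
```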
`StarlarkSyntaxException` is raised by parse, parse_expression, compile, eval, exec_file,
and Program.eval/exec whenever the source has lex, parse, or
resolve errors. The exception carries a .errors: list[StarlarkSyntaxError]
attribute; each entry is a frozen dataclass with position and
message fields. The dataclass is exported as StarlarkSyntaxError
to avoid shadowing Python's builtin SyntaxError on
from starlark import *; the original name is still available as
starlark.syntax.SyntaxError.
For error-recovery tooling that wants to keep going past the first
error, call the lower-level Parser class directly:
Parser(Lexer(source)).parse_file() returns a StarlarkFile with
.errors populated and never raises.
The starlark.values submodule exposes the runtime types and the
helpers hosts use to move data into and out of Starlark.
from starlark.values import (
Dict, StarlarkList, StarlarkSet, Range,
BuiltinFunction, Mutability, IMMUTABLE,
Namespace,
)

`Dict`, `StarlarkList`, and `StarlarkSet` are mutable containers that
share a Mutability token. Range is the immutable lazy integer
sequence built by range(). BuiltinFunction wraps a Python callable
exposed to Starlark code. Namespace is the value type returned by
namespace() (see below).
Python primitives (None, bool, int, float, str, bytes,
tuple, datetime types) cross the Starlark boundary as themselves —
no wrapping needed. Note: bytes is not a Starlark type. Python
bytes values pass through the evaluator as opaque objects, usable
only by host-provided builtins.
`starlark.to_value` recursively wraps a Python value for use in Starlark. Containers
(dict, list) become Dict / StarlarkList; tuples stay as
tuples. If mutability is None, a fresh frozen Mutability is
created — the resulting tree is read-only. Pass module.mutability
explicitly if Starlark code needs to mutate the input.
sv = starlark.to_value({"x": [1, 2, 3]}) # frozen tree
mod = starlark.Module("host")
mut = starlark.to_value([1, 2], mutability=mod.mutability)  # mutable until mod.freeze()

`starlark.from_value` recursively unwraps a Starlark value to plain
Python data: Dict → dict, StarlarkList → list, Range → list, tuple → list. Sets
raise UnsupportedTypeError (their order is insertion-defined; convert
with sorted(s) or list(s) in Starlark first). Functions and
arbitrary host objects also raise.
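The unwrap rules above can be sketched as a standalone recursive converter; the wrapper classes here are stand-ins with hypothetical names (`SList`, `SSet`, `SDict`), not the real `starlark.values` types:

```python
class UnsupportedTypeError(TypeError):
    pass

# Stand-in wrapper classes (hypothetical names), only to show the
# recursion shape of the unwrap rules.
class SList:
    def __init__(self, items): self.items = items

class SSet:
    def __init__(self, items): self.items = items

class SDict:
    def __init__(self, d): self.d = d

def unwrap(v):
    if isinstance(v, SDict):
        return {unwrap(k): unwrap(val) for k, val in v.d.items()}
    if isinstance(v, SList):
        return [unwrap(x) for x in v.items]
    if isinstance(v, tuple):                  # tuple -> list, per the rules
        return [unwrap(x) for x in v]
    if isinstance(v, SSet):
        raise UnsupportedTypeError("convert sets with sorted()/list() first")
    if isinstance(v, (type(None), bool, int, float, str)):
        return v
    raise UnsupportedTypeError(f"cannot unwrap {type(v).__name__}")

out = unwrap(SDict({"xs": SList([1, (2, 3)])}))
```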
`starlark.namespace(name, fields)` builds a struct-like value exposing fields as attributes. Python
callables in fields are auto-wrapped as `BuiltinFunction(name=f"{name}.{key}", impl=fn)`; non-callable values are stored verbatim.
The returned object follows the same fields-dict + _starlark_type
protocol that the built-in json namespace uses, so json.encode(ns)
serializes a namespace as a sorted-key object.
helpers = starlark.namespace("remarshal", {
"bytes_to_str": lambda b, encoding="utf-8": b.decode(encoding),
"version": "1.0",
})
starlark.eval("remarshal.bytes_to_str(data)", remarshal=helpers, data=b"hi")
# 'hi'

For hosts that apply the same Starlark transform to many inputs:
program = starlark.compile("[x * 2 for x in data]")

def transform(doc):
    mut = starlark.Mutability("transform")
    try:
        return starlark.from_value(
            program.eval(data=starlark.to_value(doc, mutability=mut))
        )
    finally:
        mut.freeze()

compile() parses (and validates) once. Each program.eval(...) call
re-resolves the AST against the supplied env and runs against a fresh
Module — Module.globals doesn't leak between runs.
Off by default. Hosts that accept untrusted Starlark configure them
via exec_file / eval kwargs:
m = starlark.exec_file(
    src,
    max_steps=10_000_000,         # CPU bound (Starlark operations)
    max_allocs=64 * 1024 * 1024,  # memory bound, approximate
    on_max_steps=lambda t: log(f"step cap at {t.steps}"),
    on_max_allocs=lambda t: log(f"alloc cap at {t.allocs} bytes"),
)
print(m.thread.steps, m.thread.allocs)  # readable after a successful run

Thread.steps is a monotonic counter; Thread.max_steps is the cap
(None = unlimited). On excess, raises StepLimitExceeded. Charged at
three sites: top of every statement (_exec_stmt), top of every
expression node (_eval_expr), and entry of every call (call()).
The unit is intentionally coarse — Starlark operations, not Python
instructions or bytecode — and matches starlark-java's documented
choice. Sub-expressions tick recursively, so
sum([i for i in range(N)]) is bounded by O(N), not O(1).
It is not a hard CPU bound: a single big builtin like
sorted(huge_list) does O(N log N) Python-level work for one step
charge. Combine with resource.setrlimit or a subprocess for a hard
ceiling.
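The meter's behavior at the cap can be sketched standalone (the `EvalThread` stand-in is illustrative; the real charge sites are the three listed above):

```python
class StepLimitExceeded(Exception):
    pass

class EvalThread:
    """Illustrative stand-in; the real Thread charges at statement,
    expression, and call entry."""

    def __init__(self, max_steps=None):
        self.steps = 0
        self.max_steps = max_steps

    def charge(self):
        self.steps += 1
        if self.max_steps is not None and self.steps > self.max_steps:
            raise StepLimitExceeded(f"step limit {self.max_steps} exceeded")

t = EvalThread(max_steps=100)
try:
    while True:
        t.charge()        # one tick per simulated Starlark operation
except StepLimitExceeded:
    tripped_at = t.steps  # the first operation past the cap
```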
Thread.allocs is a monotonic byte counter; Thread.max_allocs is
the cap (None = unlimited). On excess, raises AllocLimitExceeded.
Charged in every container constructor, every mutating
append/extend/update/add, and every +/* that produces a
new container or string. Sizes are approximate constants in
eval/limits.py.
The counter is charge-only: values that go out of scope are not
refunded. The bound it expresses is cumulative allocation, not
live memory. A program that allocates 64 MB in scratch values and
lets the GC reclaim them will still report 64 MB used. Size
max_allocs at 2–4× the expected steady-state working set.
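The charge-only semantics can be shown with a standalone sketch (illustrative stand-in, not `eval/limits.py` itself): scratch values are dropped, but the meter only ever goes up.

```python
class AllocLimitExceeded(Exception):
    pass

class AllocMeter:
    """Illustrative charge-only byte meter: nothing is ever refunded."""

    def __init__(self, max_allocs):
        self.allocs = 0
        self.max_allocs = max_allocs

    def charge(self, nbytes):
        self.allocs += nbytes
        if self.allocs > self.max_allocs:
            raise AllocLimitExceeded(f"alloc limit {self.max_allocs} exceeded")

meter = AllocMeter(max_allocs=1024)
scratch = []
for _ in range(10):
    meter.charge(64)      # a 64-byte scratch value is created...
    scratch.clear()       # ...and dropped, but the meter never goes down

cumulative = meter.allocs  # cumulative allocation, not live memory
```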
Each on_max_* callback is invoked once, before the
corresponding *LimitExceeded is raised. The callback can:
- Return normally (the default raise still fires).
- Raise its own exception (pre-empts the default raise; the host sees the custom exception).
- Mutate the `Thread` (e.g. log, increment a host metric).
Subsequent overruns within the same evaluation do not re-fire the callback — it's a one-shot.
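The one-shot contract can be sketched standalone (illustrative stand-in, not the package's `Thread`): the callback fires once before the first raise, and later overruns in the same evaluation only raise.

```python
class StepLimitExceeded(Exception):
    pass

class EvalThread:
    """Illustrative stand-in showing the one-shot callback contract."""

    def __init__(self, max_steps, on_max_steps=None):
        self.steps = 0
        self.max_steps = max_steps
        self.on_max_steps = on_max_steps
        self._fired = False

    def charge(self):
        self.steps += 1
        if self.steps > self.max_steps:
            if self.on_max_steps is not None and not self._fired:
                self._fired = True            # one-shot: never re-fires
                self.on_max_steps(self)       # may raise its own exception
            raise StepLimitExceeded("step limit exceeded")

calls = []
t = EvalThread(max_steps=3, on_max_steps=lambda th: calls.append(th.steps))
for _ in range(5):
    try:
        t.charge()
    except StepLimitExceeded:
        pass                                  # keep charging past the cap
```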
security/threat-model.md documents
the full sandbox boundary: what the interpreter defends against (no
filesystem / network / subprocess / Python introspection, concurrent
use is safe), what the opt-in counters do and don't promise, and the
recommended host-side belt-and-braces (run in a subprocess with
resource.setrlimit).
Loader is just Callable[[str], Module]. Pass it as loader= to
exec_file (or set Thread.loader directly).
from starlark.eval.loader import FileLoader
loader = FileLoader(exec_file=starlark.exec_file, search_paths=[".", "lib"])
starlark.exec_file(open("main.star").read(), loader=loader)

FileLoader caches modules by path and freezes them on first load.
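Because a loader is just `Callable[[str], Module]`, an in-memory variant is easy to sketch. The `Module` and `MemLoader` classes below are stand-ins imitating FileLoader's cache-and-freeze behavior, not the package's classes:

```python
class Module:
    """Stand-in record, not the package's Module."""

    def __init__(self, globals_):
        self.globals = globals_
        self.frozen = False

    def freeze(self):
        self.frozen = True

class MemLoader:
    """Dict-backed loader: a plain callable, cached and frozen on first load."""

    def __init__(self, modules):
        self._modules = modules   # path -> pre-built globals
        self._cache = {}
        self.loads = 0

    def __call__(self, path):
        if path not in self._cache:
            self.loads += 1
            mod = Module(dict(self._modules[path]))
            mod.freeze()          # imitates FileLoader's freeze-on-first-load
            self._cache[path] = mod
        return self._cache[path]

loader = MemLoader({"math.star": {"pi": 3.14159}})
m1 = loader("math.star")
m2 = loader("math.star")          # cache hit: same frozen Module
```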
The Bazel conformance suite uses predeclared functions: assert_eq,
assert_, assert_fails, freeze, struct, mutablestruct,
int_mul_slow. They live in eval/test_driver.py and aren't installed
unless the host explicitly passes them via predeclared=.
from starlark.eval.test_driver import make_predeclared, with_reporter

with with_reporter() as reporter:
    starlark.exec_file("assert_eq(1, 2)", predeclared=make_predeclared())
print(reporter.errors)  # ['assert_eq: 1 != 2']

`push_reporter` / `pop_reporter` exist for backwards compatibility but
with_reporter is the preferred form — both are thread-safe under
nesting.
For a universal builtin available to every Starlark file:
# src/starlark/eval/builtins.py
def b_double(x):
    if not isinstance(x, int) or isinstance(x, bool):
        raise EvalError(f"double() requires int, got {starlark_type(x)}")
    return x * 2

# In make_universal():
("double", b_double),

For a method on an existing type (string, list, dict, set):
# src/starlark/eval/string_methods.py (or collection_methods.py)
def s_shout(self, suffix=""):
    return self.upper() + suffix

# In register_all():
("shout", s_shout),

For builtins that need to call back into Starlark (e.g. key= callbacks)
use _call_starlark(fn, *args) from eval.builtins. It reads the
current Thread from the _CURRENT_THREAD context variable, which the
evaluator sets for the duration of every call. Concurrent evaluations
in different host threads are isolated automatically.
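The context-variable isolation can be sketched standalone (names here are illustrative, not the evaluator's actual internals): each host thread sees its own "current evaluation thread" because a `ContextVar` set is scoped to the setting thread's context.

```python
import contextvars
import threading

# Illustrative stand-ins: the real evaluator stores its Thread in a
# context variable for the duration of every call.
_CURRENT_THREAD = contextvars.ContextVar("starlark_thread")

class EvalThread:
    def __init__(self, name):
        self.name = name

def call(thread, fn):
    token = _CURRENT_THREAD.set(thread)   # set for the duration of the call...
    try:
        return fn()
    finally:
        _CURRENT_THREAD.reset(token)      # ...then restore

def current_thread_name():
    # What a _call_starlark-style helper would read.
    return _CURRENT_THREAD.get().name

seen = {}

def run(name):
    seen[name] = call(EvalThread(name), current_thread_name)

a = threading.Thread(target=run, args=("eval-A",))
b = threading.Thread(target=run, args=("eval-B",))
a.start(); b.start(); a.join(); b.join()
```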
For builtins that allocate mutables (lists, dicts), call _mut() to get
the current Module's Mutability:
from .builtins import _mut
return StarlarkList(items, _mut())

To add a new syntax construct:

- Add a `@dataclass(slots=True)` subclass of `Expression` or `Statement` in `syntax/ast.py`.
- Teach the parser to emit it (`syntax/parser.py`).
- Teach the resolver to recurse through it (`syntax/resolver.py`).
- Teach the evaluator (`eval/evaluator.py::_exec_stmt` or `_eval_expr`).
- Add unit tests in `tests/test_parser.py`, `tests/test_resolver.py`, `tests/test_eval.py`.
- Run `poe test` — failures appear in `tests/test_conformance.py`.
To debug one file, drop into a Python REPL:

import starlark
from starlark.eval.test_driver import make_predeclared, push_reporter, pop_reporter

src = open("conformance/dict.star").read()
r = push_reporter()
try:
    for chunk in src.split("\n---\n"):
        try:
            starlark.exec_file(chunk, predeclared=make_predeclared())
        except Exception as e:
            print("EXC:", e)
    for msg in r.errors:
        print(msg)
finally:
    pop_reporter()
Once a file passes, remove it from `XFAIL_FILES` in `tests/test_conformance.py`.