Skip to content

Add BenchmarkDotNet benchmarks for RunAllTestsInPackage and AdderProgram#22

Open
Copilot wants to merge 22 commits intomasterfrom
copilot/benchmark-slow-parts-analysis
Open

Add BenchmarkDotNet benchmarks for RunAllTestsInPackage and AdderProgram#22
Copilot wants to merge 22 commits intomasterfrom
copilot/benchmark-slow-parts-analysis

Conversation

Copy link
Contributor

Copilot AI commented Feb 23, 2026

RunAllTestsInPackage takes ~12s and AdderProgramTests takes ~2.2s with no existing benchmark baseline to measure against or track regressions.

Changes

  • TestExecutorTests: Add [Benchmark] + [Category("Slow")] to RunAllTestsInPackage; refactor executor to inline readonly field (eliminates [SetUp]); add BenchmarkCompare manual runner; decorate class with [MemoryDiagnoser] + [SimpleJob(warmupCount: 1, iterationCount: 10)]

  • AdderProgramTests: Same treatment on AddTotalsForThreeNumbers; inline readonly BytecodeInterpreter (safe — Execute() calls Clear() on entry)

Both follow the pattern established in BooleanTests. BenchmarkDotNet flows transitively from Strict.Language.Tests; no new package refs needed.

[MemoryDiagnoser]
[SimpleJob(RunStrategy.Throughput, warmupCount: 1, iterationCount: 10)]
public sealed class TestExecutorTests
{
    private readonly TestExecutor executor = new();

    [Test]
    [Category("Slow")]
    [Benchmark]
    public void RunAllTestsInPackage() => executor.RunAllTestsInPackage(TestPackage.Instance);

    //ncrunch: no coverage start
    [Test]
    [Category("Manual")]
    public void BenchmarkCompare() => BenchmarkRunner.Run<TestExecutorTests>();
}

🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

Co-authored-by: BenjaminNitschke <1650127+BenjaminNitschke@users.noreply.github.com>
Copilot AI changed the title [WIP] Analyze slow parts using BenchmarkDotNet for performance improvements Add BenchmarkDotNet benchmarks for RunAllTestsInPackage and AdderProgram Feb 23, 2026
…onary type not be generic in the second run for some strange reason, have to slowly investigate. not super important, but long term good to have it more stable for subsequent runs and also allows us to profile with a warmup step
…n value all the way back to the Evaluator.Execute entry method, added Statistics to help figure out how many things are called (kind of our little mini profiler), we already know that boxing is the main issue, but it helps to know what is called how many times before optimizing ..
…before the fix of all the boxing/unboxing, which comes next), memory consumption will go down a lot and speed increase even more.

Total execution time: 00:00:00.0016387
Allocated bytes (current thread): 536816
Packages: 2
Types: 46
Methods: 152
Statistics { MethodCount = 444, TestsCount = 456, ExpressionCount = 2787, ValueInstanceCount = 2474, BooleanCount = 399, NumberCount = 1174, TextCount = 212, ListCount = 272, DictionaryCount = 16, BodyCount = 223, IfCount = 72, SelectorIfCount = 0, ForCount = 64, MethodCallCount = 708, MemberCallCount = 186, ListCallCount = 24, BinaryCount = 412, ArithmeticCount = 120, CompareCount = 274, LogicalOperationCount = 18, UnaryCount = 20, FromCreationsCount = 164, ToConversionCount = 14, ReturnCount = 4, VariableDeclarationCount = 20, VariableCallCount = 308, MutableDeclarationCount = 0, MutableUsageCount = 0, FindTypeCount = 1484, FindVariableCount = 484 }
…ing/unboxing, still a mess with many issues ..
… lot of manual work, but llms are all confused and stupid as fuck making a way bigger mess. I guess huge refactorings stay messy (it is not possible to stay in TDD loop when moving a type to a lower namespace when it is used in hundreds of places, it just takes a lot of time to go through all the code and fix all the little issues) .. some things just stay hard and time consuming (I remember doing this 30 years ago already and since then all the time when big refactoring is on the menu
… be made and a lot of changes. Now we just have to fix the higher level projects
…s still going on as well a lot of unneeded nullables. plus a crazy amount of dictionary and list creation, some can be avoided, but need more careful thinking. first we need to get the tests working again.
…ake common case compares faster (noneInstance, trueInstance, falseInstance, etc.)
@BenjaminNitschke BenjaminNitschke marked this pull request as ready for review February 26, 2026 16:49
Copilot AI review requested due to automatic review settings February 26, 2026 16:49
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds BenchmarkDotNet benchmarks and manual runners intended to establish performance baselines for expensive test scenarios, but the PR also contains substantial runtime refactors (new ValueInstance implementation + Executor API changes + new statistics counters).

Changes:

  • Add BenchmarkDotNet attributes and manual benchmark runners for RunAllTestsInPackage and AdderProgramTests.
  • Add NCrunch config to ignore manual/slow benchmark tests and add a manual Program.cs runner.
  • Introduce broad HighLevelRuntime refactors: new Strict.Expressions.ValueInstance, Executor constructor/API changes, and execution statistics counters.

Reviewed changes

Copilot reviewed 51 out of 51 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
Strict.sln.DotSettings Adds a custom dictionary word entry (appears to include a typo).
Strict.VirtualMachine.Tests/Strict.Runtime.Tests.v3.ncrunchproject Ignores benchmark/manual AdderProgram tests in NCrunch.
Strict.VirtualMachine.Tests/AdderProgramTests.cs Adds BenchmarkDotNet attributes and a manual benchmark runner.
Strict.Validators/ConstantCollapser.cs Updates constant folding to work with updated expression/value representation.
Strict.Validators.Tests/ConstantCollapserTests.cs Adjusts assertions to match updated ToString() output.
Strict.TestRunner/TestExecutor.cs Adds counters and changes which types are executed; now depends on new Executor construction.
Strict.TestRunner.Tests/TestExecutorTests.cs Adds BenchmarkDotNet attributes + manual runner; adds additional slow/manual test variants.
Strict.TestRunner.Tests/Strict.TestRunner.Tests.v3.ncrunchproject Ignores manual/slow benchmark tests in NCrunch.
Strict.TestRunner.Tests/Strict.TestRunner.Tests.csproj Disables generated Program file (but adds a manual Program).
Strict.TestRunner.Tests/Program.cs Manual timing/allocation runner for RunAllTestsInPackage.
Strict.Language/Type.cs Adds type helpers and adjusts dictionary/list type checks; tweaks debug caller info.
Strict.Language/Repositories.cs Conditional compilation for caller-info arguments in DEBUG only.
Strict.Language/Package.cs Conditional compilation for caller-info constructor in DEBUG only; tweaks exception message.
Strict.Language/Context.cs Tracks type-lookup count globally via static counter.
Strict.Language.Tests/ExpressionParserTests.cs Tweaks test parser’s ParseExpression behavior.
Strict.HighLevelRuntime/* Major runtime API refactors: Executor construction + execution flow + statistics tracking.
Strict.HighLevelRuntime.Tests/* Updates tests to use new value creation patterns (but several call sites still reference removed APIs).
Strict.Expressions/* Adds new ValueInstance (struct) and related instance containers; moves EqualsExtensions namespace.
Strict.Expressions.Tests/ValueInstanceTests.cs Reintroduces tests targeting the new Strict.Expressions.ValueInstance.
Comments suppressed due to low confidence (1)

Strict.HighLevelRuntime/Executor.cs:102

  • CreateExecutionContext always sets This = instance, even when the call has no instance (you pass noneInstance for former null). This changes semantics for code paths that rely on ctx.This == null (e.g., member access validation) and can hide errors. Consider keeping This null for "no instance" calls (e.g., when instance is noneInstance).
		var context =
			new ExecutionContext(method.Type, method) { This = instance, Parent = parentContext };
		if (!runOnlyTests)

Comment on lines 14 to 15
private readonly Executor executor = new(TestBehavior.TestRunner);

Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestExecutor constructs Executor with new(TestBehavior.TestRunner), but Executor now requires an initial Package in its constructor. As-is this won't compile and also makes it unclear which package should supply the cached base types (None/Boolean/Number). Update TestExecutor to pass the appropriate package (e.g., TestPackage.Instance) or reintroduce a compatible constructor overload.

Copilot uses AI. Check for mistakes.
Comment on lines 8 to 16
public List(Body bodyForErrorMessage, List<Expression> values, bool isMutable = false) : base(
values[0].ReturnType.GetListImplementationType(
GetCommonBaseType(values.Select(v => v.ReturnType).ToList(), bodyForErrorMessage)),
values, values[0].LineNumber, isMutable) =>
//impossible to know what the ValueInstances would be here: values
new List<ValueInstance>(), values[0].LineNumber, isMutable) =>
Values = values;

public List(Type type, int lineNumber = 0) : base(type, Array.Empty<Expression>(), lineNumber,
public List(Type type, int lineNumber = 0) : base(type, new List<ValueInstance>(), lineNumber,
true) =>
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List now passes an empty List<ValueInstance> into the Value base constructor, which means the constant payload for a list literal no longer contains its elements. Since Executor.RunExpression returns v.Data directly for Value expressions, list literals will evaluate as empty lists. The list value instance should be derived from Values (expressions) during evaluation, or Data must contain the actual list items.

Copilot uses AI. Check for mistakes.
Comment on lines +233 to +236
[Test]
[Benchmark]
public void RunAllTestsInPackage() => executor.RunAllTestsInPackage(TestPackage.Instance);

Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says RunAllTestsInPackage is decorated with both [Benchmark] and [Category("Slow")], but in this file the benchmark method only has [Benchmark] (the [Category("Slow")] attribute is on RunAllTestsInPackageTwice). Either update the attributes or adjust the description so slow tests are categorized consistently.

Copilot uses AI. Check for mistakes.
Comment on lines 47 to 52
using var t = CreateCalcType();
var method = t.Methods.Single(m => m.Name == "Add");
var result = executor.Execute(method, null,
[new ValueInstance(TestPackage.Instance.FindType(Base.Number)!, 5)]);
[ValueInstance.Create(TestPackage.Instance.FindType(Base.Number)!, 5d)]);
Assert.That(Convert.ToDouble(result.Value), Is.EqualTo(6));
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses the old Executor.Execute(method, instance, args) API, but Executor in this PR only exposes Execute(Method) and its constructor signature changed as well. This file’s updated tests won’t compile until they’re migrated to the new execution API (or compatibility overloads are restored).

Copilot uses AI. Check for mistakes.
Comment on lines +7 to +15
/// <summary>
/// Optimized for size, always just contains 2 values, a pointer to the type, string, list,
/// dictionary, or type instance and if it is a primitive (most common, most lines just return
/// None or True) the None, Boolean, or Number data. This is also
/// </summary>
public readonly struct ValueInstance : IEquatable<ValueInstance>
{
public ValueInstance(Type noneReturnType) => value = noneReturnType;
/// <summary>
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title/description focus on adding BenchmarkDotNet benchmarks, but this change set also introduces a new Strict.Expressions.ValueInstance implementation and large API changes to Executor/runtime evaluation. If these refactors are intentional, they should be called out explicitly in the PR description (or ideally split into separate PRs) because they significantly raise review/merge risk beyond benchmarking.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +12
using Strict.TestRunner.Tests;

var tests = new TestExecutorTests();
// Warm up
tests.RunAllTestsInPackage();
var allocatedBefore = GC.GetAllocatedBytesForCurrentThread();
var startTicks = DateTime.UtcNow.Ticks;
tests.RunAllTestsInPackage();
var endTicks = DateTime.UtcNow.Ticks;
var allocatedAfter = GC.GetAllocatedBytesForCurrentThread();
Console.WriteLine("Total execution time: " + TimeSpan.FromTicks(endTicks - startTicks));
Console.WriteLine("Allocated bytes (current thread): " + (allocatedAfter - allocatedBefore));
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This project is a test library, but adding Program.cs with top-level statements typically requires an executable output type; otherwise compilation fails (CS8805). If you want a manual runner, consider moving this into a separate console project, or set the project OutputType appropriately and ensure it doesn't interfere with the test runner.

Copilot uses AI. Check for mistakes.
Comment on lines 7 to 10
<Nullable>enable</Nullable>
<IsPackable>false</IsPackable>
<GenerateProgramFile>false</GenerateProgramFile>
</PropertyGroup>
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<GenerateProgramFile>false</GenerateProgramFile> does not address the main issue of having top-level statements in Program.cs inside a library test project. If Program.cs remains, the project likely needs to be OutputType=Exe (or Program.cs excluded/conditionally compiled) to avoid build errors.

Copilot uses AI. Check for mistakes.
Comment on lines 60 to 62
//If this is Range or Number, we should not call GetIteratorValue, index is what we use!
var value = iterator.IsNumber || iterator.IsRange ? indexInstance : iterator.GetIteratorValue(itemType, index);
if (itemType.Name == Base.Text && value is char character)
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecuteForIteration uses iterator.IsNumber and iterator.IsRange, but ValueInstance (as introduced in this PR) does not define these members, so this won't compile. Either add the corresponding APIs to ValueInstance or change this logic to use the available representation (e.g., check the effective type/name).

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +16
executor = new Executor(TestBehavior.Disabled);
one = new ValueInstance(TestPackage.Instance.GetType(Base.Number), 1);
two = new ValueInstance(TestPackage.Instance.GetType(Base.Number), 2);
var numType = TestPackage.Instance.GetType(Base.Number);
one = ValueInstance.Create(numType, 1d);
two = ValueInstance.Create(numType, 2d);
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixture still constructs Executor via new Executor(TestBehavior.Disabled) and calls executor.Execute(method, instance, args), but in this PR Executor's constructor requires an initial Package and the public Execute overloads were removed/changed. This file won't compile until the test setup and call sites are updated (or compatibility overloads are added).

Copilot uses AI. Check for mistakes.
<s:Boolean x:Key="/Default/UserDictionary/Words/=Dragable/@EntryIndexedValue">True</s:Boolean>
<s:Boolean x:Key="/Default/UserDictionary/Words/=edcba/@EntryIndexedValue">True</s:Boolean>
<s:Boolean x:Key="/Default/UserDictionary/Words/=enu/@EntryIndexedValue">True</s:Boolean>
<s:Boolean x:Key="/Default/UserDictionary/Words/=eturn/@EntryIndexedValue">True</s:Boolean>
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added user-dictionary entry eturn looks like a typo (possibly meant to be return). Adding it will suppress spellchecking for that misspelling across the solution; consider removing it unless it’s intentional.

Copilot uses AI. Check for mistakes.
…r stupid and that is what takes a lot of time to check, don't really want to reenable all code, have bad performance and then clean up from there. instead I will start with optimized code paths, profile and fix complicated tests as needed (for a full run it needs to be green anyway)
…d of doing stupid things the generated code did. also optimized Type to not longer have to check for common primitve or base types, TypeKind gives us an easy fast way to find out.
…ries and main Package handling, huge refactoring coming in to have simpler folder structure that matches better how it works, we will just use folder structure, not namespaces that have to match folders, that is not really needed. logic will stay the same, but it is Strict/Number now, or Strict/Math/Vector2D, this is exactly where the .strict files live and we use the name anyway (Vector2D, List, Number, etc.)
…ed zip downloading and instead just downloading .strict files from github repos (not needed once we have all local strict files here). next up is merging all the strict repos with this one for this new logic
…h and cleaner design, everything Strict related is now in here, no need to clone multiple things, this contains all base libraries, all things to get it running, grammar, parsing, transpiling, compiling, bytecode, vm, runtime, etc.
…, namespace or folder, fixed all compilation errors, but at runtime there are still problems loading too many things (subfolders), which need to be changed now that we don't use repo zip files anymore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants