Write benchmarks by larrytamnjong · Pull Request #630 · G-Research/ParquetSharp

larrytamnjong · 2026-03-26T17:10:59Z

No description provided.

larrytamnjong · 2026-03-26T17:18:34Z

Hi @adamreeve, I've added initial code to read the decompressed file and write it to Parquet. I started with the num_plasma dataset:

./spdp.exe < "num_plasma.sp.spdp" > num_plasma.bin

The binary is then read and saved as Parquet with different configurations. Please let me know if this is on track.
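For illustration only, here is a minimal Python sketch of the first half of that step: interpreting the decompressed bytes as a flat array of values before they are written out. It is not the PR's C# code. It assumes the file holds little-endian single-precision floats (the `.sp` in the file name hints at this, but the thread does not state it), and the name `decode_float32_le` is hypothetical.

```python
import array
import sys


def decode_float32_le(data: bytes) -> array.array:
    """Interpret raw bytes as little-endian 32-bit floats (assumed layout)."""
    values = array.array("f")  # "f" is a C float, 4 bytes on common platforms
    # Assumption: drop any trailing bytes that do not form a whole value.
    usable = len(data) - (len(data) % values.itemsize)
    values.frombytes(data[:usable])
    # array uses native byte order, so swap on big-endian hosts.
    if sys.byteorder != "little":
        values.byteswap()
    return values
```

A caller would pass in the contents of the decompressed file, for example `decode_float32_le(open("num_plasma.bin", "rb").read())`, and hand the resulting values to whichever Parquet writer is being benchmarked.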

adamreeve

Thanks Larry, I've left a couple of suggestions.

larrytamnjong · 2026-04-09T10:05:28Z

Hi @adamreeve, I ran the tests with the three files and drafted some documentation of the findings. Please let me know what you think.

adamreeve

Thanks @larrytamnjong, I've left some suggestions.

larrytamnjong · 2026-04-10T12:10:02Z

@adamreeve I’ve resolved the suggestions you left. Please take another look and let me know what you think.

adamreeve

These latest changes look good, thanks Larry. I just have one small suggested change.

For the next step, it would be good to add measurements of how the write configuration affects the read time, when using consistent read options. Using the logical-chunked configuration would make sense as you found this was optimal in your previous tests. You might want to allow specifying the file to read when using the read benchmark commands to support this.

larrytamnjong · 2026-04-14T10:51:04Z

@adamreeve I’ve added read measurements and moved the PR out of draft. Please let me know if everything looks okay.

larrytamnjong · 2026-04-15T12:11:43Z

@adamreeve as discussed, I have updated the read results to be an average of 5 runs.
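As a rough sketch of what averaging over 5 runs amounts to (in Python rather than the benchmark project's C#, and not taken from the PR), the helper below times a read action several times and reports the mean wall-clock time. The name `average_elapsed_seconds` and the `action` callback are hypothetical.

```python
import statistics
import time


def average_elapsed_seconds(action, runs=5):
    """Call `action` `runs` times and return the mean wall-clock time in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution timer
        action()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)
```

Passing the same read action for each written file keeps the read options consistent, so differences in the averages reflect the write configuration rather than the reader.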

adamreeve · 2026-04-15T19:02:36Z

👍 Thanks for all your work on this @larrytamnjong

larrytamnjong and others added 20 commits February 25, 2026 15:15

Add Memory-Optimized Reading Benchmarks documentation

561d463

Improve clarity in Memory-Optimized Reading Benchmarks documentation

b083908

Add Memory Benchmark Samples guide for ParquetSharp

757b871

Update section numbering in Memory Benchmark Samples guide

19334cf

Remove MemoryBenchmarkSamples.md; add reference in main doc

9550ea4

Add ParquetSharp config benchmarks project

f455d7f

Increase benchmark dataset size and row groups

98160ab

Update and expand memory benchmark results and analysis

f047aad

Add benchmark to documentation

23078d5

Update benchmark results

73d148e

Refactor benchmark program to use System.CommandLine for command handling and add new commands for various reading scenarios

0eb36e4

Update MemoryBenchmarks.md to include results from two run sets and adjust table headers for clarity

16ade66

Update docs/guides/MemoryBenchmarks.md

97d30a4

Co-authored-by: Adam Reeve <adreeve@gmail.com>

Update docs/guides/MemoryBenchmarks.md

a009ff3

Co-authored-by: Adam Reeve <adreeve@gmail.com>

Add note about benchmark test environment

d731972

Update benchmark throughput calculations to use raw data size

966876f

Add results for Logical Column reader 5 runs

e9df720

Summarize logical column reader run results

bd79cc3

Merge branch 'master' of https://github.com/larrytamnjong/ParquetSharp into write-benchmarks

e7a45ef

Add commands for generating Parquet files with various encoding and compression options

8223d8f

adamreeve reviewed Mar 30, 2026

Comment thread csharp.config.benchmarks/Program.cs Outdated

Comment thread csharp.config.benchmarks/ParquetSharpConfigBenchmarks.cs Outdated

larrytamnjong and others added 4 commits March 31, 2026 10:45

Refactor command structure for generating Parquet files with dynamic encoding and compression options

413437d

Merge branch 'master' into write-benchmarks

2be957a

Add Write Benchmarks documentation for encoding and compression strategies

4357080

Merge branch 'master' into write-benchmarks

91a0cbe

adamreeve reviewed Apr 10, 2026

larrytamnjong and others added 2 commits April 10, 2026 07:40

Update csharp.config.benchmarks/ParquetSharpConfigBenchmarks.cs

ffe36fb

Co-authored-by: Adam Reeve <adreeve@gmail.com>

Refactor encoding and compression handling in benchmarks

efd9d31

Enhance WriteBenchmarks documentation with detailed encoding and compression insights

784a5ee

adamreeve reviewed Apr 13, 2026

Comment thread docs/guides/WriteBenchmarks.md Outdated

larrytamnjong and others added 4 commits April 13, 2026 08:14

Update docs/guides/WriteBenchmarks.md

b06da98

Co-authored-by: Adam Reeve <adreeve@gmail.com>

Add logical chunked file reader and file validation methods

9dbcc1e

Add read time measurements and update benchmark results in WriteBenchmarks.md

166e30e

Merge branch 'master' into write-benchmarks

5438c73

larrytamnjong marked this pull request as ready for review April 14, 2026 10:51

Update benchmark results in WriteBenchmarks.md to include average read times for each configuration

ced84bf

adamreeve approved these changes Apr 15, 2026

adamreeve merged commit 44f29d5 into G-Research:master Apr 15, 2026
49 checks passed

larrytamnjong deleted the write-benchmarks branch April 16, 2026 13:27
