Add MXDOTP operation group with FP4/FP6/FP8 source format support #17

Open
gamzeisl wants to merge 16 commits into pulp-platform:pulp from gamzeisl:feature/mxdotp_multi

@gamzeisl

Summary

Introduces the MXDOTP (Microscaling Dot Product) operation group to CVFPU, implementing scaled dot-product-accumulate over low-precision MX element formats as described in the OCP MX specification.

New formats

Three new floating-point formats are added to fpnew_pkg, expanding NUM_FP_FORMATS from 6 to 9:

| Format | Encoding |
|--------|----------|
| FP6    | E3M2     |
| FP6ALT | E2M3     |
| FP4    | E2M1     |
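
As a software cross-check for the encodings above, the following is a minimal Python sketch of how such elements decode to real values. It assumes the OCP MX element conventions (exponent bias of 2^(E-1) - 1, subnormals supported, no Inf or NaN encodings for these three formats). The `MiniFloat` class and its `decode` method are hypothetical names for illustration and are not part of fpnew_pkg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniFloat:
    """Hypothetical helper: a sign/exponent/mantissa format without Inf or NaN."""
    exp_bits: int
    man_bits: int

    @property
    def bias(self) -> int:
        # Assumed bias convention: 2^(E-1) - 1 (3 for E3M2, 1 for E2M3 and E2M1).
        return (1 << (self.exp_bits - 1)) - 1

    def decode(self, bits: int) -> float:
        width = 1 + self.exp_bits + self.man_bits
        if not 0 <= bits < (1 << width):
            raise ValueError(f"encoding does not fit in {width} bits")
        sign = -1.0 if (bits >> (width - 1)) & 1 else 1.0
        exp = (bits >> self.man_bits) & ((1 << self.exp_bits) - 1)
        man = bits & ((1 << self.man_bits) - 1)
        frac = man / (1 << self.man_bits)
        if exp == 0:
            # Zero and subnormals: no implicit leading one.
            return sign * frac * 2.0 ** (1 - self.bias)
        # Every non-zero exponent is a normal number (no Inf/NaN encodings).
        return sign * (1.0 + frac) * 2.0 ** (exp - self.bias)


FP6 = MiniFloat(exp_bits=3, man_bits=2)     # E3M2
FP6ALT = MiniFloat(exp_bits=2, man_bits=3)  # E2M3
FP4 = MiniFloat(exp_bits=2, man_bits=1)     # E2M1
```

Under these assumptions the largest magnitudes are 28.0 for FP6, 7.5 for FP6ALT, and 6.0 for FP4, which is a quick sanity check against the element tables in the MX specification.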

New operation group

A sixth operation group (MXDOTP) is added. It supports two operations:

  • MXDOTPF – Scaled dot-product and accumulate for FP source elements (FP4, FP6, FP6ALT, FP8, FP8ALT)
  • MXDOTPI – Scaled dot-product and accumulate for INT source elements (INT8)

Both produce results in FP32 or FP16ALT and accept two 8-bit shared exponent scale factors (one per operand vector).
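
The arithmetic behind both operations can be sketched as a small Python reference model. This is an illustration only: it assumes the two 8-bit scales use the OCP MX E8M0 encoding (bias 127, 0xFF reserved for NaN) and works on already-decoded element values in double precision. The function names `e8m0_to_float` and `mxdotp_ref` are hypothetical, and the sketch does not model the lane count, rounding modes, or destination-format rounding of the hardware.

```python
import math
from typing import Sequence

E8M0_BIAS = 127   # assumed: OCP MX E8M0 scale bias
E8M0_NAN = 0xFF   # assumed: OCP MX E8M0 NaN encoding


def e8m0_to_float(scale: int) -> float:
    """Decode an 8-bit shared exponent into a power-of-two scale factor."""
    if not 0 <= scale <= 0xFF:
        raise ValueError("scale must fit in 8 bits")
    if scale == E8M0_NAN:
        return math.nan
    return 2.0 ** (scale - E8M0_BIAS)


def mxdotp_ref(a: Sequence[float], b: Sequence[float],
               scale_a: int, scale_b: int, acc: float) -> float:
    """acc + 2^(scale_a - 127) * 2^(scale_b - 127) * sum(a[i] * b[i])."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    dot = math.fsum(x * y for x, y in zip(a, b))
    return acc + e8m0_to_float(scale_a) * e8m0_to_float(scale_b) * dot
```

For example, `mxdotp_ref([1.0, 2.0], [0.5, 0.5], 127, 128, 1.0)` scales the dot product 1.5 by 2^0 * 2^1 and adds the accumulator, giving 4.0. The same model covers MXDOTPI if the INT8 elements are passed in as their integer values.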

New source files

| File | Description |
|------|-------------|
| src/mxdotp/fpnew_mxdotp_multi_pkg.sv | Local parameters derived from base FP8 format |
| src/mxdotp/fpnew_mxdotp_multi_modules.sv | 14 datapath modules (classify, multiply, shift, accumulate, normalize, round) |
| src/fpnew_mxdotp_multi.sv | Top-level MXDOTP unit integrating all modules |
| src/fpnew_mxdotp_multi_wrapper.sv | Operand unpacking, FP6 3-step lane extraction, NaN-boxing, scale extraction |

Changes to existing files

  • fpnew_pkg.sv: New formats, new opgroup, new operations, updated format masks (6→9 bits), updated default FPU configurations
  • fpnew_classifier.sv: MX-specific special cases (FP8ALT: no Inf; FP6/FP6ALT/FP4: no Inf, no NaN)
  • fpnew_opgroup_multifmt_slice.sv: MXDOTP lane generation and wrapper instantiation; elaboration-time checks for mandatory formats
  • fpnew_opgroup_block.sv, fpnew_top.sv: Parameter propagation for the new opgroup
  • fpnew_sdotp_multi_wrapper.sv: Widened FpSrcFmtConfig masks from 6 to 9 bits
  • Bender.yml, src_files.yml: New source files added
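
The classifier special cases listed above can be mirrored in a short Python model, which may be handy when writing testbench checks. It is a sketch under stated assumptions: FP8 is taken to be E5M2 with IEEE-style Inf/NaN, FP8ALT is E4M3 with NaN only when exponent and mantissa are all ones, and FP6/FP6ALT/FP4 have no special values. The `classify` function and `FORMATS` table are hypothetical and do not reflect the port list or internals of fpnew_classifier.sv.

```python
# (exp_bits, man_bits) per format; FP8 = E5M2 is an assumption based on FPnew.
FORMATS = {
    "FP8": (5, 2),
    "FP8ALT": (4, 3),
    "FP6": (3, 2),
    "FP6ALT": (2, 3),
    "FP4": (2, 1),
}
NO_SPECIALS = {"FP6", "FP6ALT", "FP4"}  # no Inf, no NaN


def classify(fmt: str, bits: int) -> str:
    """Return 'nan', 'inf', 'zero', 'subnormal' or 'normal' for an encoding."""
    exp_bits, man_bits = FORMATS[fmt]
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    exp = (bits >> man_bits) & exp_max
    man = bits & man_max

    if fmt in NO_SPECIALS:
        pass  # an all-ones exponent is an ordinary normal number
    elif fmt == "FP8ALT":
        # No infinity; NaN only for exponent and mantissa both all ones.
        if exp == exp_max and man == man_max:
            return "nan"
    elif exp == exp_max:
        # Standard IEEE-754 style classification for the remaining formats.
        return "inf" if man == 0 else "nan"

    if exp == 0:
        return "zero" if man == 0 else "subnormal"
    return "normal"
```

Note that in this model `classify("FP8ALT", 0x78)` is a normal number, whereas the same bit pattern layout in an IEEE-style format would be an infinity.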

Known limitations

  • MXDOTP has been tested with all supported element formats enabled simultaneously; configurations that enable only a subset of these formats have not been tested exhaustively.
  • Additional known limitations are documented as TODO comments in the source files.

Extended fpnew_pkg.sv with new floating-point formats and MXDOTP operation
group for MX dot product operations:

- New formats: FP6(E3M2), FP6ALT(E2M3), FP4(E2M1)
- Increased NUM_FP_FORMATS from 6 to 9
- Added MXDOTP operation group (6th group)
- New operations: MXDOTPF (FP), MXDOTPI (INT)
- Updated all format masks from 6-bit to 9-bit
- Added bias_constant() helper function for MXDOTP
- Updated FPU configurations (DEFAULT_NOREGS, DEFAULT_SNITCH)

Introduces fpnew_mxdotp_multi_pkg.sv with parameterized configuration for
MXDOTP operations supporting mixed-precision arithmetic with low precision
formats.

Configuration:
- Source formats: FP4, FP6, FP6ALT, FP8, FP8ALT, INT8
- Destination formats: FP32, FP16ALT

Add core MXDOTP implementation supporting very low-precision floating-point
formats (FP4, FP6, FP8) and INT8.

New files:
- fpnew_mxdotp_multi_modules.sv: 14 modules implementing
  the MXDOTP datapath (classification, multiplication, shifting,
  accumulation, normalization, rounding)
- fpnew_mxdotp_multi.sv: Top-level MXDOTP unit integrating all modules

New file:
- fpnew_mxdotp_multi_wrapper.sv: Wrapper handling operand unpacking,
  FP6 extended operand processing (3-step with unroll factor), NaN-boxing,
  and scale extraction

Changes to core module:
- Add NumPipeRegs and PipeConfig as module parameters
- Compute NUM_INP_REGS, NUM_MID_REGS, NUM_OUT_REGS from parameters

Add MX parameter and format-specific classification logic to support
low-precision formats used in MXDOTP operations.

Changes:
- Add MX parameter (default 1) to enable MX-specific classification
- FP8ALT (E4M3): No infinity, NaN when exp=all1s and man=all1s
- FP6/FP6ALT/FP4 (E3M2/E2M3/E2M1): No infinity or NaN
- Other formats: Standard IEEE-754 classification

- Add elaboration-time checks: fatal for Width!=64, missing FP32,
  missing FP8/INT8; warnings for inactive FP6/FP6ALT/FP4
- Add NUM_MX_LANES localparam and lane generation for MXDOTP
- Instantiate fpnew_mxdotp_multi_wrapper with FpFmtConfig and IntFmtConfig

- Widen FpSrcFmtConfig bitmasks from 6b to 9b to match the extended
  NUM_FP_FORMATS (FP6, FP6ALT, FP4 added but masked off for SDOTP)

Relax format validation in fpnew_opgroup_multifmt_slice to require only
FP8 and FP8ALT as mandatory base formats, allowing INT8 to be disabled.
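
The elaboration-time checks described in these commit messages can be summarized as a small Python sketch. It reflects one reading of the final state after the relaxation: Width must be 64, FP8 and FP8ALT are mandatory, INT8 is optional, and inactive FP6/FP6ALT/FP4 only produce warnings. That FP32 remains mandatory after the relaxation is an assumption, and `check_mxdotp_config` is a hypothetical name, not a function in the RTL.

```python
from typing import Iterable, List

MANDATORY = ("FP32", "FP8", "FP8ALT")   # FP32 staying mandatory is an assumption
OPTIONAL_MX = ("FP6", "FP6ALT", "FP4")  # inactive formats only trigger warnings


def check_mxdotp_config(width: int, enabled: Iterable[str]) -> List[str]:
    """Raise ValueError for fatal misconfigurations; return warning strings."""
    active = set(enabled)
    if width != 64:
        raise ValueError(f"MXDOTP requires Width == 64, got {width}")
    missing = [fmt for fmt in MANDATORY if fmt not in active]
    if missing:
        raise ValueError(f"missing mandatory formats: {', '.join(missing)}")
    # INT8 is deliberately not checked: it may be disabled.
    return [f"{fmt} is inactive for MXDOTP"
            for fmt in OPTIONAL_MX if fmt not in active]
```

Calling it with all formats enabled returns an empty list, while dropping FP4 from the set returns a single warning and dropping FP8 raises an error.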

Copilot AI left a comment

Pull request overview

This pull request introduces the MXDOTP (Microscaling Dot Product) operation group to FPnew, implementing support for the OCP MX specification's low-precision floating-point formats. The implementation adds three new FP formats (FP6, FP6ALT, FP4) and enables scaled dot-product-accumulate operations over these formats plus existing FP8/FP8ALT/INT8 formats.

Changes:

  • Add FP6 (E3M2), FP6ALT (E2M3), and FP4 (E2M1) floating-point formats, expanding NUM_FP_FORMATS from 6 to 9
  • Introduce MXDOTP operation group with MXDOTPF (FP sources) and MXDOTPI (INT8 sources) operations, producing FP32/FP16ALT results
  • Extend fpnew_classifier with MX parameter to handle MX-specific special cases (no infinity/NaN for FP6/FP6ALT/FP4)

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Summary per file:

| File | Description |
|------|-------------|
| src/fpnew_pkg.sv | Add new formats, operations, opgroup; expand format masks to 9 bits; add MxFpFmtMask/MxIntFmtMask configuration fields |
| src/fpnew_classifier.sv | Add MX parameter for MX-specific special case handling (FP8ALT: no inf; FP6/FP6ALT/FP4: no inf/NaN) |
| src/mxdotp/fpnew_mxdotp_multi_pkg.sv | Define MXDOTP-specific parameters, types, and helper functions |
| src/mxdotp/fpnew_mxdotp_multi_modules.sv | Implement 14 datapath modules for classification, multiplication, shifting, accumulation, normalization, and rounding |
| src/fpnew_mxdotp_multi.sv | Top-level MXDOTP unit with pipeline stages and datapath integration |
| src/fpnew_mxdotp_multi_wrapper.sv | Handle operand unpacking and FP6 3-step lane extraction with stateful counter |
| src/fpnew_opgroup_multifmt_slice.sv | Add MXDOTP lane generation with elaboration-time format validation checks |
| src/fpnew_opgroup_block.sv | Propagate MxFpFmtMask and MxIntFmtMask parameters |
| src/fpnew_top.sv | Pass MX format masks to opgroup blocks |
| src/fpnew_sdotp_multi_wrapper.sv | Update FpSrcFmtConfig masks from 6 to 9 bits |
| Bender.yml, src_files.yml | Add new MXDOTP source files to build system |
| docs/README.md | Document new operations, formats, and operation group |
| docs/CHANGELOG-PULP.md | Record additions and changes for MXDOTP feature |


Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.

@gamzeisl
Author

Verified extended FPU in:

@gamzeisl marked this pull request as ready for review on February 20, 2026, 13:14