Duplicated transaction #45

@0xKD

This is a bug in pdfminer/mupdf, but I thought it would be useful to document, since the implications are fairly critical if you rely on the output of casparser.

If you have pages that look like this across a page boundary, it seems to count the transaction at the start of the second page as part of the previous page as well. For me, it counts the *** Stamp Duty*** transaction at the start of the second page twice: once as part of the previous page (page 4), and again where it is actually first encountered (page 5).

[Screenshot: parsingbug]

My guess is that the page's mediabox (used by pdfminer to determine page boundaries) is larger than necessary and extends into the next page.
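Until the root cause is confirmed, one way to spot affected statements is to check whether the last row parsed from a page is identical to the first row parsed from the next one. The snippet below is only a rough sketch of that check: the `(date, description, amount)` tuple shape, the function name `find_boundary_duplicates`, and the sample rows are all made up for illustration and are not casparser's actual data model or API. A match is only a suspicion, since two genuinely identical transactions can sit on either side of a page break.

```python
from typing import List, Tuple

# Hypothetical record shape: (date, description, amount).
# This is an assumption for illustration, not casparser's real data model.
Txn = Tuple[str, str, str]


def find_boundary_duplicates(pages: List[List[Txn]]) -> List[Tuple[int, Txn]]:
    """Return (page_index, txn) pairs where the last row parsed from a page
    equals the first row parsed from the following page.

    page_index is the 0-based index of the earlier page.
    """
    suspects = []
    for i in range(len(pages) - 1):
        current, following = pages[i], pages[i + 1]
        # Skip empty pages; only compare the rows adjacent to the boundary.
        if current and following and current[-1] == following[0]:
            suspects.append((i, current[-1]))
    return suspects


# Made-up sample rows: the stamp duty row shows up at the end of one page
# and again at the start of the next, as described above.
pages = [
    [
        ("01-Jul-2020", "Purchase", "4999.75"),
        ("01-Jul-2020", "*** Stamp Duty ***", "0.25"),
    ],
    [
        ("01-Jul-2020", "*** Stamp Duty ***", "0.25"),
        ("03-Aug-2020", "Purchase", "4999.75"),
    ],
]

for page_index, txn in find_boundary_duplicates(pages):
    print(f"possible duplicate across pages {page_index}/{page_index + 1}: {txn}")
```

This only flags rows for manual review; it does not drop anything, because silently removing a row could hide a real transaction.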
