This is a bug in pdfminer/mupdf, but I thought it would be useful to document it here, since the implications are fairly serious if you rely on the output of casparser.
If you have pages that look like this across page boundaries, the transaction at the start of the second page seems to be counted as part of the previous page as well. For me, the ***Stamp Duty*** transaction at the start of the second page is counted twice: once as part of the previous page (page 4), and again where it actually first appears (page 5).

My guess is that the mediabox (used by pdfminer to determine page boundaries) of the earlier page is larger than necessary and extends into the following one.
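
Until the underlying extraction issue is fixed, one possible workaround is to drop a line at the end of a page when the same line also opens the next page. The following is only a minimal sketch: it assumes you already have the extracted text as one list of lines per page, and `drop_boundary_duplicates` is a hypothetical helper, not part of casparser, pdfminer, or mupdf. The sample lines are made up for illustration.

```python
from typing import List


def drop_boundary_duplicates(pages: List[List[str]]) -> List[List[str]]:
    """Return a copy of `pages` with page-boundary duplicates removed.

    If the last line of a page is identical (ignoring surrounding
    whitespace) to the first line of the next page, the copy on the
    earlier page is dropped. The input is not modified.
    """
    cleaned = [list(lines) for lines in pages]
    for current, following in zip(cleaned, cleaned[1:]):
        if current and following and current[-1].strip() == following[0].strip():
            current.pop()
    return cleaned


# Made-up sample: the Stamp Duty line leaks into the end of the earlier page.
pages = [
    ["01-Jan-2021 Purchase 1,000.00", "01-Jan-2021 *** Stamp Duty *** 0.05"],
    ["01-Jan-2021 *** Stamp Duty *** 0.05", "01-Feb-2021 Purchase 1,000.00"],
]
cleaned = drop_boundary_duplicates(pages)
```

Note that this only compares a single line on each side of the boundary, and it would also drop a genuine transaction that happens to be repeated verbatim across a page break, so treat it as a stopgap rather than a fix.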