Parser is skipping the first page #748

@datawench

  • PHP Version: 8.2.20
  • PDFParser Version: 2.11.0

Description:

I'm parsing this PDF: https://oag.ca.gov/system/files/Maxar%20-%20Adult%20CA%20Sample%20Ltr_Redacted.pdf

The parser appears to be skipping the first page, and only extracting text from the last two.

PDF input

See link above.

Expected output & actual output

I would expect the output to start with "MAXAR SPACE SYSTEMS", or perhaps "I write on behalf of." Instead, this is what I get:

"not been delayed due to any law enforcement investigation. We are also taking additional actions as required..." with interspersed tabs.

Code

I'm using the simplest possible code:

use Smalot\PdfParser\Parser;

$parser = new Parser();
$pdf = $parser->parseFile($filePath);
// Extract the text of all pages as one string
$text = $pdf->getText();
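A per-page check might help narrow this down. The following is only a sketch: `findEmptyPages` is a hypothetical helper, not part of PDFParser, and the commented usage assumes Composer autoloading and that `$filePath` points at a local copy of the PDF. It reports which pages come back with no text, which would show whether the first page is present but empty or absent from the page list altogether.

```php
<?php
// Hypothetical helper (not part of PDFParser): given the extracted text of
// each page in order, return the 1-based numbers of pages with no text.
function findEmptyPages(array $pageTexts): array
{
    $empty = [];
    foreach (array_values($pageTexts) as $index => $text) {
        // Treat whitespace-only output (spaces, tabs, newlines) as empty.
        if (trim((string) $text) === '') {
            $empty[] = $index + 1;
        }
    }
    return $empty;
}

// Possible usage with PDFParser (assumes vendor/autoload.php is loaded
// and $filePath is set):
// $pdf = (new \Smalot\PdfParser\Parser())->parseFile($filePath);
// $texts = array_map(fn ($page) => $page->getText(), $pdf->getPages());
// echo count($texts) . " pages found\n";
// print_r(findEmptyPages($texts));
```

If the page count is already lower than expected, the page is being lost before text extraction rather than during it.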
