Look at https://github.com/ChrizH/pdfstructure. It implements a pdfminer-based solution that checks the font style of each line and looks for prepended chapter numbers. Here is an article about the solution: https://medium.com/@_chriz_/development-of-a-structure-aware-pdf-parser-7285f3fe41a9
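
To give a rough idea of what "check the font style and the prepended chapter numbers" can look like, here is a minimal sketch built directly on pdfminer.six. It is not the code from pdfstructure. The function names, the numbering regex, the 1.1 size factor and the "bold in font name" test are all assumptions made for illustration, and real documents will likely need tuning.

```python
import re
from collections import Counter

# Assumed pattern for prepended chapter numbers such as "1.", "2.3" or "4.1.2."
NUMBERING = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+\S")


def numbering_depth(text):
    """Return the nesting depth implied by a leading chapter number, or 0."""
    match = NUMBERING.match(text)
    return match.group(1).count(".") + 1 if match else 0


def body_font_size(sizes):
    """Assume the most frequent (rounded) font size is the body text size."""
    return Counter(round(s, 1) for s in sizes).most_common(1)[0][0]


def is_heading(text, font_size, bold, body_size):
    """Heuristic: clearly larger than body text, or bold with a chapter number."""
    return font_size > body_size * 1.1 or (bold and numbering_depth(text) > 0)


def find_headings(lines):
    """lines: iterable of (text, font_size, bold). Returns [(depth, text), ...]."""
    lines = list(lines)
    if not lines:
        return []
    body = body_font_size(size for _, size, _ in lines)
    return [
        (numbering_depth(text), text)
        for text, size, bold in lines
        if is_heading(text, size, bold, body)
    ]


def iter_lines(pdf_path):
    """Yield (text, font_size, bold) for every text line, using pdfminer.six."""
    # Imported lazily so the heuristics above can be tried without pdfminer.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page in extract_pages(pdf_path):
        for element in page:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                chars = [c for c in line if isinstance(c, LTChar)]
                if not chars:
                    continue
                size = max(c.size for c in chars)
                # Assumption: bold fonts carry "bold" in their font name.
                bold = any("bold" in c.fontname.lower() for c in chars)
                yield line.get_text().strip(), size, bold
```

With a PDF at hand, `find_headings(iter_lines("document.pdf"))` (the file name is a placeholder) returns the detected headings with their numbering depth, where depth 0 means a heading without a chapter number. pdfstructure goes further than this and builds a nested structure from the result, so use it if you need more than a flat list.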