A sentence from columns #95
Comments
@vanushkin please look at
@MarcinKosinski I would love to try this solution, but tabulizer has been removed from CRAN, and it depends on a Java JAR whose execution is blocked by default on the computers in my office. There is no chance the sysadmins will unblock it.
Actually this is not stored in the pdf inner markup: https://ropensci.org/blog/2018/12/14/pdftools-20
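
If the reading order really is not available in the file, one workaround is to rebuild it from word positions. Below is a minimal sketch of that idea, written in Python only for illustration. It assumes you already have one bounding box per word (left edge `x`, top edge `y`, `width`, `text`), that words on the same line share the same `y`, and that columns are separated by a horizontal gap of at least `min_gap` units. The `Word` type, the function names and the `min_gap` default are hypothetical and not part of any package mentioned here.

```python
from typing import NamedTuple


class Word(NamedTuple):
    x: float      # left edge of the word box
    y: float      # top edge of the word box
    width: float  # width of the word box
    text: str


def split_columns(words, min_gap=20.0):
    """Group words into columns wherever the covered x-ranges leave a gap wider than min_gap."""
    columns = []
    right_edge = None
    for w in sorted(words, key=lambda w: w.x):
        if right_edge is None or w.x - right_edge > min_gap:
            # Nothing seen so far reaches this word: start a new column.
            columns.append([])
            right_edge = w.x + w.width
        else:
            right_edge = max(right_edge, w.x + w.width)
        columns[-1].append(w)
    return columns


def column_text(words, min_gap=20.0):
    """Read each column top to bottom, left to right, one output line per text line."""
    out = []
    for col in split_columns(words, min_gap):
        lines = {}
        for w in col:
            # Assumption: words on the same text line have an identical y value.
            lines.setdefault(w.y, []).append(w)
        for y in sorted(lines):
            row = sorted(lines[y], key=lambda w: w.x)
            out.append(" ".join(w.text for w in row))
    return "\n".join(out)


# Two columns, two lines each (made-up coordinates).
words = [
    Word(50, 100, 30, "Left"), Word(85, 100, 30, "one"),
    Word(300, 100, 40, "Right"), Word(345, 100, 30, "one"),
    Word(50, 115, 30, "Left"), Word(85, 115, 30, "two"),
    Word(300, 115, 40, "Right"), Word(345, 115, 30, "two"),
]
print(column_text(words))
```

This only handles simple layouts: a heading that spans both columns would bridge the gap and merge them, and real pages usually need some tolerance when grouping words by `y`.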
@jeroen I've tried with a PDF file generated by Illustrator (see attached file). Despite the layout's relative complexity, Acrobat recognizes the order of the frames I've defined. This flow order must be stored somewhere, otherwise this would not be possible. Acrobat cannot just guess this on the fly. Perhaps some inner markup elements specific to Acrobat products?
Dear developers,
I'm having the following issue: when processing PDFs whose text is formatted in columns, I get sentences made up of lines combined from several columns. It makes a mess of the text. Is there any solution to this problem? Or a hint on how I can retain the structure of the original text?