I am very new to the coding world and have been stuck on this one problem for 3 days now, searching everywhere for an answer, so any help will be greatly appreciated. I am needing to extract a small amount of text from a url-located Pdf file. I’m using
sessions.get(chart_PDF)as the driver for locating the URL where
chart_PDFis the example url below.
I know I am able to write it to my local drive but I don’t want to do that, I want to be able to do it remotely, since I only need a couple of numbers from it.
I have tried finding the password from the url page for decrypting, couldn’t find. I’ve tried to use
pikepdf (probably not well).
I only need to retrieve two numbers near the bottom of the PDF that can be used for the rest of my code. Please help, even if it is a simple fix, I’m new to all this and need some help. Thanks.
from io import BytesIO from pikepdf import Pdf as PDF from pdfminer import high_level chart_PDF = https://www.airservicesaustralia.com/aip/pending/dap/PADGN01-166_09SEP2021.pdf retrieve = s.get(chart_PDF) content = retrieve.content response =urllib.request.urlopen(chart_PDF) p = BytesIO(content) p.getbuffer() check = PDFPage.get_pages(p, check_extractable=False) extract = high_level.extract_text(p)
PDFTextExtractionNotAllowedWarning: The PDF <_io.BytesIO object at 0x000001B007ABEC20> contains a metadata field indicating that it should not allow text extraction. Ignoring this field and proceeding.warnings.warn(warning_msg, PDFTextExtractionNotAllowedWarning)
Ten-tools.com may not be responsible for the answers or solutions given to any question asked by the users. All Answers or responses are user generated answers and we do not have proof of its validity or correctness. Please vote for the answer that helped you in order to help others find out which is the most helpful answer. Questions labeled as solved may be solved or may not be solved depending on the type of question and the date posted for some posts may be scheduled to be deleted periodically. Do not hesitate to share your response here to help other visitors like you. Thank you, Ten-tools.