Google PDF Tool

Overview

This tool uses Google's Vision API to extract text from PDF files. Each page of the PDF is processed individually and the extracted text is then consumed by the LLM.

Usage

Authenticate your Google Cloud account by running gcloud auth application-default login.
Run the tool with the PDF file you want to process.
The tool will output the extracted text for each page of the PDF.

gcloud auth application-default login

gptscript eval --tools github.com/gptscript-ai/pdf-tool/google "use /path/to/pdf/file.pdf and report the contents of the file"

Detailed Description

tool.gpt

Name: google_pdf_vision_ocr
Description: Convert PDF to images and use Google Vision OCR to parse out text info.
Params:
- file_path: Path to the PDF file to analyze.
- max_tokens: The maximum number of tokens the LLM should generate. Default: 300.
- key_words: Comma-separated list of key words that you want extracted as key-value pairs.
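
As a rough illustration of how a script might consume these three parameters, the sketch below reads them from an environment-like mapping. It assumes the parameters arrive as environment variables named exactly as in tool.gpt, which this README does not confirm; read_params is a hypothetical helper, not a function from tool.py.

```python
import os


def read_params(env=None):
    """Collect the three tool.gpt parameters from an environment-like mapping.

    Assumption: parameters are exposed under the names used in tool.gpt.
    """
    env = os.environ if env is None else env
    file_path = env.get("file_path", "")
    # Fall back to the documented default of 300 when max_tokens is unset or empty.
    max_tokens = int(env.get("max_tokens") or 300)
    # key_words is a comma-separated list; drop empty entries and stray spaces.
    raw = env.get("key_words", "")
    key_words = [word.strip() for word in raw.split(",") if word.strip()]
    return file_path, max_tokens, key_words
```

Passing a plain dict instead of os.environ makes the parsing easy to try without touching the real environment.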

tool.py

The tool.py script performs the following steps:

Convert PDF to Images: Each page of the PDF is converted to an image using the fitz library (PyMuPDF).
Encode Image: The image is encoded to a base64 string.
Send Image to Google Vision: The base64 image is sent to Google Vision API for analysis.
Extract Text Annotations: The text annotations are extracted from the Vision API response.
Find Key-Value Pairs: Key-value pairs are found in the text annotation based on the provided key words.
Extract Handwritten Responses: Handwritten responses are extracted based on key-value pairs.
Output Extracted Data: The extracted text, key-value pairs, and handwritten responses are printed to the console.
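
The first few steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in tool.py: it assumes pages are rendered to PNG with PyMuPDF, that the image is sent through the Vision REST endpoint images:annotate with the DOCUMENT_TEXT_DETECTION feature, and that key-value pairs are found by simple line matching. All function names here are hypothetical.

```python
import base64
import re


def render_pages_as_png(pdf_path, dpi=150):
    """Yield one PNG byte string per PDF page, rendered with PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the helpers below work without it

    with fitz.open(pdf_path) as doc:
        for page in doc:
            yield page.get_pixmap(dpi=dpi).tobytes("png")


def encode_image(image_bytes):
    """Return the image bytes as an ASCII base64 string."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_vision_request(image_b64):
    """Build a JSON body for the Vision REST endpoint images:annotate.

    Assumption: the tool may use a client library or a different feature type.
    """
    return {
        "requests": [
            {
                "image": {"content": image_b64},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }


def find_key_values(text, key_words):
    """Hypothetical matcher: pair each key word with the rest of its line."""
    pairs = {}
    for key in key_words:
        # Key word at the start of a line, an optional colon, then the value.
        pattern = rf"^[ \t]*{re.escape(key)}\b[ \t]*:?[ \t]*(.*)$"
        match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        if match:
            pairs[key] = match.group(1).strip()
    return pairs
```

For example, find_key_values("Name: Jane Doe\nDate: 2024-01-01", ["Name", "Date"]) pairs each key word with the text that follows it on the same line. Key words that never appear in the text are simply left out of the result.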
