PDF Agent Tool
Key Concepts
The PDF Agent Tool lets your agents read, index, and query content from PDF files stored in your project's file system. It's ideal for use cases like:
- Uploading reports, whitepapers, contracts, or manuals
- Asking questions over academic literature
- Creating knowledge agents that extract answers directly from documents
Once configured, the tool builds a semantic index over the provided PDFs, making them queryable through LLMs.
Key Definitions
Term | Description |
---|---|
PDF Files | The .pdf documents you upload to your workspace for indexing. |
Vector Index | A semantic structure used to embed and retrieve content via similarity search. |
QueryEngineTool | A LlamaIndex interface that routes natural language queries to indexed PDF data. |
Agent Tool | A callable module used by INTELLITHING agents to process specific query types. |
Setup Guide: Using the PDF Agent Tool
To configure and use the PDF agent in your workflow:
1. Upload and Reference Files
- Upload PDF files via the file upload interface in the UI or drag-and-drop.
- Once uploaded, use the file names in the `files` field during configuration (no path needed).
- Example: `["report_q3.pdf", "case_study.pdf"]`
PDF files are expected to reside in the `/data` directory.
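As a rough illustration of how bare file names might map onto the `/data` directory, here is a minimal Python sketch. The helper name `resolve_pdf_paths` and its validation rules are assumptions made for illustration; they are not part of the INTELLITHING API.

```python
from pathlib import PurePosixPath

DATA_DIR = PurePosixPath("/data")  # directory named in this guide


def resolve_pdf_paths(files, data_dir=DATA_DIR):
    """Map bare PDF file names to paths under data_dir (hypothetical helper)."""
    resolved = []
    for name in files:
        # The files field takes names only, so reject anything carrying a path.
        if PurePosixPath(name).name != name:
            raise ValueError(f"expected a bare file name, got: {name!r}")
        if not name.lower().endswith(".pdf"):
            raise ValueError(f"not a PDF file: {name!r}")
        resolved.append(str(data_dir / name))
    return resolved


print(resolve_pdf_paths(["report_q3.pdf", "case_study.pdf"]))
```

Rejecting names that contain a directory component mirrors the "no path needed" rule and catches a common configuration mistake early.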
2. Configure Tool Parameters
Field | Purpose | Example |
---|---|---|
`name` | Internal name of the tool | "Report Reader" |
`description` | Used by the agent router to match the tool to queries | "Answers questions from the uploaded quarterly report" |
`files` | List of PDF file names (must match uploads) | ["report_q3.pdf"] |
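To make the three fields concrete, the sketch below models them as a small Python dataclass with basic checks. The class name `PdfToolConfig`, the `validate` method, and the rule that `files` must be non-empty are all assumptions for illustration; the platform's own configuration format may differ.

```python
from dataclasses import dataclass, field


@dataclass
class PdfToolConfig:
    """Illustrative container for the name, description, and files fields."""
    name: str
    description: str
    files: list = field(default_factory=list)

    def validate(self):
        # These checks are assumptions, not documented platform rules.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not self.description.strip():
            raise ValueError("description must not be empty")
        if not self.files:
            raise ValueError("files must list at least one PDF")
        for file_name in self.files:
            if not file_name.lower().endswith(".pdf"):
                raise ValueError(f"not a PDF file name: {file_name!r}")
        return self


config = PdfToolConfig(
    name="Report Reader",
    description="Answers questions from the uploaded quarterly report",
    files=["report_q3.pdf"],
).validate()
print(config)
```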
How It Works
- The tool reads the file list from the `/data` directory.
- It loads the content using `SimpleDirectoryReader`, handling multiple PDFs.
- A `VectorStoreIndex` is created from the parsed content.
- A `QueryEngineTool` is returned, which enables LLM agents to search the indexed documents.
This allows the agent to answer questions like:
- "What's the annual leave policy?"
- "Summarize the main points of the employee handbook."
- "How do we handle vendor onboarding?"
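The four steps in this section could look roughly like the following when written directly against LlamaIndex. This is a sketch under stated assumptions, not the tool's actual implementation: it assumes a llama-index release that exposes the `llama_index.core` package, an embedding model and LLM already configured for LlamaIndex, and a hypothetical wrapper function named `build_pdf_tool`.

```python
from pathlib import Path


def build_pdf_tool(name, description, files, data_dir="/data"):
    """Build a QueryEngineTool over the given PDFs (illustrative sketch)."""
    # Imported lazily so this module loads even where llama-index is absent.
    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
    from llama_index.core.tools import QueryEngineTool, ToolMetadata

    # 1. Resolve the configured file names against the data directory.
    paths = [str(Path(data_dir) / file_name) for file_name in files]
    # 2. Load and parse all listed PDFs in one pass.
    documents = SimpleDirectoryReader(input_files=paths).load_data()
    # 3. Embed the parsed content into a vector index.
    index = VectorStoreIndex.from_documents(documents)
    # 4. Wrap the index's query engine as a tool an agent can call.
    return QueryEngineTool(
        query_engine=index.as_query_engine(),
        metadata=ToolMetadata(name=name, description=description),
    )
```

The `description` passed to `ToolMetadata` is the text an agent sees when choosing between tools, which is why a specific description tends to improve routing.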
Best Practices
- Name descriptively: Use a meaningful `name` and `description` to help agents route correctly.
- Limit file size: For better performance, avoid uploading excessively large PDFs or scanned, image-only PDFs.
- Scope content: Use focused PDFs (e.g., one document per topic) for higher-quality retrieval.
- Use multi-file config when necessary: You can include multiple PDFs in the `files` list.
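One way to act on the file-size advice is a quick check before uploading. The sketch below is only an illustration: the 20 MiB threshold and the helper name `oversized_pdfs` are hypothetical, since this guide does not state a specific limit.

```python
import os

# Hypothetical threshold for illustration; not a documented platform limit.
MAX_PDF_BYTES = 20 * 1024 * 1024


def oversized_pdfs(paths, max_bytes=MAX_PDF_BYTES):
    """Return the paths whose size on disk exceeds max_bytes."""
    return [path for path in paths if os.path.getsize(path) > max_bytes]
```

Files flagged by a check like this are candidates for splitting into smaller, topic-focused documents.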
Example Use Case
To create a tool that answers policy questions from HR PDFs:
- Upload `hr_policy.pdf` and `leave_policy.pdf` via the UI.
- Configure the tool:
  - Name: "HR Docs Reader"
  - Description: "Fetches answers from internal HR policy documents."
  - Files: ["hr_policy.pdf", "leave_policy.pdf"]
- Ask your agent:
  - "What's our maternity leave policy?"
  - "Do we offer sabbatical leave?"