File & Document Processing Nodes
Work with files, PDFs, and images in your workflows
Work with files, PDFs, and images. Read uploads, generate documents, parse content, transform images. Everything you need to handle files in your workflows.
File Operations Node
When to use: You need to read, write, copy, move, list, or delete files on local disk or in cloud storage.
The File Operations Node gives you filesystem access. Read files for processing, write results to disk, move files around, list directories.
Configuration
operation: read | write | copy | move | delete | list
path: "/data/{{ input.filename }}"
cloud_storage: s3 | gcs | azure_blob # Optional
File Reading Examples
Example 1: Read CSV File
When to use: Process uploaded CSV files for import, analysis, etc.
Incoming data:
{
"filename": "customers.csv"
}
File Operations Configuration:
operation: read
path: "/uploads/{{ input.filename }}"
Output:
{
"content": "name,email,phone\nAlice,alice@example.com,555-0101\nBob,bob@example.com,555-0102\n...",
"file_size_bytes": 1024,
"path": "/uploads/customers.csv"
}
Connect the output to a Data Parser node to convert the CSV string into an array of objects, then use a Loop node to process each row.
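To make the "CSV string to array of objects" step concrete, here is a minimal Python sketch of what that conversion amounts to. It is an illustration only, not the Data Parser node's implementation; the function name `parse_csv` is hypothetical.

```python
import csv
import io


def parse_csv(content: str) -> list[dict[str, str]]:
    """Turn a CSV string with a header row into a list of row dictionaries."""
    # newline="" lets the csv module handle line endings itself.
    reader = csv.DictReader(io.StringIO(content, newline=""))
    return [dict(row) for row in reader]


# Same shape as the "content" field in the read output above.
sample = (
    "name,email,phone\n"
    "Alice,alice@example.com,555-0101\n"
    "Bob,bob@example.com,555-0102\n"
)
rows = parse_csv(sample)
for row in rows:
    print(row["name"], row["email"])
```

Each dictionary corresponds to one item the Loop node would receive.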
Example 2: Read JSON Configuration File
operation: read
path: "/config/{{ input.config_name }}.json"
Perfect for loading environment-specific configs, feature flags, rules, etc.
Example 3: Read File from S3
operation: read
path: "s3://my-bucket/documents/{{ input.document_id }}.pdf"
cloud_storage: s3
Example 4: Read Multiple Files (List + Loop)
operation: list
path: "/data/{{ input.date }}"
pattern: "*.json"
Output:
{
"files": [
"/data/2025-02-10/file1.json",
"/data/2025-02-10/file2.json",
"/data/2025-02-10/file3.json"
],
"count": 3
}
Loop through files and process each one!
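The sketch below shows how a pattern such as "*.json" can narrow a directory listing. It assumes the pattern is a shell-style glob matched against the file name only; the node's actual matching rules are not documented here, and `filter_by_pattern` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def filter_by_pattern(paths: list[str], pattern: str) -> list[str]:
    """Keep paths whose file name matches a shell-style pattern (case-sensitive)."""
    return sorted(p for p in paths if fnmatchcase(PurePosixPath(p).name, pattern))


listing = [
    "/data/2025-02-10/file1.json",
    "/data/2025-02-10/notes.txt",
    "/data/2025-02-10/file2.json",
]
matches = filter_by_pattern(listing, "*.json")
print(matches)
```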
File Writing Examples
Example 1: Generate and Save Report
When to use: Create files for download, archiving, or sharing.
Incoming data:
{
"report_id": "RPT-001",
"content": "Sales Report for Q1 2025\n\nTotal: $500,000\n..."
}
File Operations Configuration:
operation: write
path: "/reports/{{ input.report_id }}.txt"
content: "{{ input.content }}"
Creates /reports/RPT-001.txt for download!
Example 2: Save JSON to File
operation: write
path: "/exports/{{ input.export_id }}.json"
content: "{{ input.data | json }}"
Example 3: Write to Cloud (S3)
operation: write
path: "s3://backups/{{ input.backup_date }}/data.json"
cloud_storage: s3
content: "{{ input.data | json }}"
Perfect for automated backups!
File Movement Examples
Example 1: Move Processed File
operation: move
path: "/inbox/{{ input.filename }}"
destination: "/processed/{{ input.filename }}"
Once a file has been processed, move it out of the inbox and into the processed folder.
Example 2: Copy File
operation: copy
path: "/source/template.docx"
destination: "/output/{{ input.document_id }}.docx"
Create file from template!
Common File Patterns
Pattern 1: Upload, Read, Process
Start (file upload)
↓
File Operations (read)
↓
Data Parser (parse CSV/JSON)
↓
Loop (for each record)
├─ Transform
├─ Validate
├─ Database Insert
└─ Next
↓
File Operations (move to processed folder)
↓
Notification (email summary)
Pattern 2: Generate and Download
Start (user request)
↓
AI Agent or Code (generate content)
↓
File Operations (write to disk)
↓
HTTP Response (send file to user)
PDF Operations Node
When to use: You need to generate, parse, merge, split, or encrypt PDFs, or extract text from them.
The PDF Operations Node handles all PDF workflows—create from templates, extract text, merge multiple PDFs, etc.
Configuration
operation: generate | parse | merge | split | extract | encrypt
template: "invoice_template"
data: "{{ input.invoice_data }}"
PDF Generation Examples
Example 1: Generate Invoice from Template
When to use: Create invoices, receipts, reports, contracts—anything PDF.
Incoming data:
{
"invoice_id": "INV-001",
"customer_name": "Alice Johnson",
"items": [
{ "name": "Widget", "price": 25.00, "qty": 2 },
{ "name": "Gadget", "price": 50.00, "qty": 1 }
],
"total": 100.00,
"due_date": "2025-03-10"
}
PDF Operations Configuration:
operation: generate
template: "invoice_template"
data:
invoice_id: "{{ input.invoice_id }}"
customer_name: "{{ input.customer_name }}"
items: "{{ input.items }}"
total: "{{ input.total }}"
due_date: "{{ input.due_date }}"
output_format: pdf
Output:
{
"pdf_data": "base64-encoded-pdf",
"filename": "invoice_INV-001.pdf"
}
Save to disk or email directly!
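If you handle the output in a Code step, the base64 string has to be decoded before it is a usable PDF. The following is a minimal sketch under that assumption; `save_pdf` is a hypothetical helper, and the stand-in bytes are not a real invoice.

```python
import base64
import tempfile
from pathlib import Path


def save_pdf(pdf_data: str, filename: str, out_dir: str) -> Path:
    """Decode base64 PDF data and write it under out_dir."""
    raw = base64.b64decode(pdf_data, validate=True)
    # Every PDF file starts with the marker "%PDF-".
    if not raw.startswith(b"%PDF-"):
        raise ValueError("decoded data does not look like a PDF")
    # Use only the final path component so the file stays inside out_dir.
    target = Path(out_dir) / Path(filename).name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(raw)
    return target


# Stand-in for the "pdf_data" field; a real value would be a full document.
fake_pdf = base64.b64encode(b"%PDF-1.4\n%%EOF\n").decode("ascii")
with tempfile.TemporaryDirectory() as tmp:
    saved = save_pdf(fake_pdf, "invoice_INV-001.pdf", tmp)
    saved_bytes = saved.read_bytes()
print(saved.name, len(saved_bytes))
```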
Example 2: Generate from HTML
operation: generate
source_type: html
html_content: |
<html>
<h1>{{ invoice_id }}</h1>
<p>Customer: {{ customer_name }}</p>
<table>
{% for item in items %}
<tr><td>{{ item.name }}</td><td>${{ item.price }}</td></tr>
{% endfor %}
</table>
<p>Total: ${{ total }}</p>
</html>
output_format: pdf
PDF Parsing Examples
Example 1: Extract Text from PDF
When to use: Extract information from uploaded documents—resumes, contracts, forms.
Incoming data:
{
"pdf_file": "resume.pdf"
}
PDF Operations Configuration:
operation: extract
source: "/uploads/{{ input.pdf_file }}"
extract_type: text
Output:
{
"text": "Alice Johnson\nphone: 555-0101\nemail: alice@example.com\n...",
"page_count": 1
}
Pass to AI Agent for extraction/analysis!
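For fields with a predictable layout, a plain regular expression can be enough before (or instead of) an AI step. This sketch is a deterministic alternative, not what the AI Agent does, and it assumes the text looks like the sample output above; `extract_contact` is a hypothetical helper.

```python
import re
from typing import Optional


def extract_contact(text: str) -> dict[str, Optional[str]]:
    """Pull a name, email, and phone number out of extracted resume text."""
    email = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)
    phone = re.search(r"phone:\s*([\d-]+)", text, flags=re.IGNORECASE)
    lines = text.strip().splitlines()
    return {
        # Assumption: the first non-empty line is the candidate's name.
        "name": lines[0].strip() if lines else None,
        "email": email.group(0) if email else None,
        "phone": phone.group(1) if phone else None,
    }


contact = extract_contact("Alice Johnson\nphone: 555-0101\nemail: alice@example.com\n")
print(contact)
```

Fields that are not found come back as None, so a later node can decide whether to fall back to the AI Agent.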
Example 2: Parse Structured PDF
operation: parse
source: "/uploads/contract.pdf"
structure: form
extract_fields:
- signature_field
- date_field
- amount_field
Example 3: Merge Multiple PDFs
When to use: Combine multiple PDFs—contracts, documents, reports.
Incoming data:
{
"pdfs": ["cover.pdf", "chapter1.pdf", "chapter2.pdf"]
}
PDF Operations Configuration:
operation: merge
files:
- "/templates/{{ input.pdfs[0] }}"
- "/templates/{{ input.pdfs[1] }}"
- "/templates/{{ input.pdfs[2] }}"
output_format: pdf
Creates one merged PDF!
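The configuration above hard-codes three array indices. If the number of PDFs varies, you could build the file list in a Code step first. A rough sketch, assuming the names in the input array are plain file names under /templates (`merge_sources` is a hypothetical helper):

```python
from pathlib import PurePosixPath


def merge_sources(filenames: list[str], base_dir: str = "/templates") -> list[str]:
    """Build the full path for each PDF to merge, keeping the input order."""
    base = PurePosixPath(base_dir)
    paths = []
    for name in filenames:
        # Reject anything that is not a bare file name (e.g. "../secret.pdf").
        if name in ("", ".", "..") or PurePosixPath(name).name != name:
            raise ValueError(f"not a plain file name: {name!r}")
        paths.append(str(base / name))
    return paths


files = merge_sources(["cover.pdf", "chapter1.pdf", "chapter2.pdf"])
print(files)
```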
Example 4: Split PDF
operation: split
source: "/documents/large_document.pdf"
split_strategy: by_page
output_path: "/documents/split/"
Outputs: split_1.pdf, split_2.pdf, etc.
Example 5: Encrypt PDF
When to use: Protect sensitive documents with passwords.
operation: encrypt
source: "/documents/confidential.pdf"
owner_password: "{{ credentials.pdf_master_password }}"
user_password: "{{ input.user_password }}"
permissions:
- no_print
- no_copy
Common PDF Patterns
Pattern 1: Generate and Email
Start (order confirmed)
↓
PDF Operations (generate invoice)
↓
File Operations (save to S3)
↓
Email (send PDF to customer)
Pattern 2: Extract and Classify
Start (PDF upload)
↓
PDF Operations (extract text)
↓
AI Agent (classify document type)
↓
Switch (route by type)
├─ Contract → Legal team
├─ Invoice → Accounting
└─ Report → Managers
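The Switch step in this pattern is essentially a lookup from document type to destination. A small sketch of that logic follows; the queue names and the "manual_review" fallback are assumptions added for illustration, not part of the Switch node.

```python
# Mapping mirrors the three branches in the pattern above.
ROUTES = {
    "contract": "legal",
    "invoice": "accounting",
    "report": "managers",
}


def route_document(doc_type: str, default: str = "manual_review") -> str:
    """Return the destination for a classified document type."""
    # Normalize the label, since a classifier may return "Invoice" or " invoice ".
    return ROUTES.get(doc_type.strip().lower(), default)


print(route_document("Invoice"))
print(route_document("memo"))
```

A fallback branch is worth having because a classifier can return a label you did not plan for.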
Pattern 3: Multi-Step Document Creation
Start
├─ PDF Generate (cover page)
├─ PDF Generate (chapter 1)
├─ PDF Generate (chapter 2)
└─ PDF Generate (appendix)
↓
Merge (combine all)
↓
Encrypt (add password)
↓
Email
Image Processing Node
When to use: You need to resize, crop, compress, rotate, watermark, or convert images.
The Image Processing Node handles all image transformations—optimize for web, create thumbnails, add branding, etc.
Configuration
operation: resize | crop | compress | rotate | watermark | convert
width: 800
height: 600
format: png | jpg | webp
Image Examples
Example 1: Generate Thumbnail
When to use: Create small previews for galleries, lists, etc.
Incoming data:
{
"image_path": "/uploads/product-photo.jpg"
}
Image Processing Configuration:
operation: resize
source: "{{ input.image_path }}"
width: 300
height: 300
quality: 80
output_format: webp
output_path: "/thumbnails/product-photo-thumb.webp"
Creates optimized thumbnail!
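A 300x300 target raises the question of what happens to non-square images. This page does not say whether the node preserves aspect ratio, so the sketch below only shows one common approach: scale to fit inside the box without distortion or upscaling. `fit_within` is a hypothetical helper.

```python
def fit_within(src_w: int, src_h: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Scale (src_w, src_h) to fit inside (max_w, max_h), keeping the aspect ratio."""
    if min(src_w, src_h, max_w, max_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Use the tighter of the two ratios, and never scale up (cap at 1.0).
    scale = min(max_w / src_w, max_h / src_h, 1.0)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


print(fit_within(1200, 800, 300, 300))
print(fit_within(800, 1200, 300, 300))
```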
Example 2: Batch Image Optimization
Start (with array of images)
↓
Loop (for each image)
├─ Image Processing (resize to 1200px)
├─ Image Processing (compress, quality 75)
├─ Image Processing (convert to WebP)
├─ File Operations (save optimized version)
└─ Next
Perfect for website optimization!
Example 3: Add Watermark
operation: watermark
source: "/photos/event-photo.jpg"
watermark_image: "/assets/logo-watermark.png"
position: bottom-right
opacity: 0.7
output_path: "/watermarked/event-photo.jpg"
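A position such as bottom-right ultimately resolves to pixel coordinates for the watermark's top-left corner. The sketch below illustrates that arithmetic; the 10-pixel margin and the set of supported position names are assumptions, not documented node behavior.

```python
def watermark_origin(
    img_w: int,
    img_h: int,
    mark_w: int,
    mark_h: int,
    position: str = "bottom-right",
    margin: int = 10,
) -> tuple[int, int]:
    """Return the (x, y) of the watermark's top-left corner for a corner position."""
    parts = position.split("-")
    if len(parts) != 2:
        raise ValueError(f"unsupported position: {position!r}")
    vertical, horizontal = parts
    xs = {"left": margin, "right": img_w - mark_w - margin}
    ys = {"top": margin, "bottom": img_h - mark_h - margin}
    if vertical not in ys or horizontal not in xs:
        raise ValueError(f"unsupported position: {position!r}")
    return xs[horizontal], ys[vertical]


print(watermark_origin(1000, 800, 200, 100))
```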
Example 4: Crop Image
operation: crop
source: "/images/full-photo.jpg"
x: 100
y: 50
width: 500
height: 300
output_path: "/images/cropped.jpg"
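A crop rectangle can extend past the edge of a smaller image than you expected. How the node reacts in that case is not stated here, so it may be worth checking the box yourself first. A minimal sketch that clips the box to the image bounds (`clamp_crop` is a hypothetical helper):

```python
def clamp_crop(
    img_w: int, img_h: int, x: int, y: int, width: int, height: int
) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for the crop, clipped to the image."""
    left, top = max(0, x), max(0, y)
    right, bottom = min(img_w, x + width), min(img_h, y + height)
    if right <= left or bottom <= top:
        raise ValueError("crop box lies outside the image")
    return left, top, right, bottom


# Values from the example above on a 1920x1080 image, then on a 400x300 image.
print(clamp_crop(1920, 1080, 100, 50, 500, 300))
print(clamp_crop(400, 300, 100, 50, 500, 300))
```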
Example 5: Rotate and Convert Format
operation: rotate
source: "/images/photo.jpg"
degrees: 90
output_format: png
output_path: "/images/photo-rotated.png"
Image Processing Patterns
Pattern 1: Batch Image Processing
Start (photos uploaded)
↓
Loop (for each photo)
├─ Image Processing (resize to standard size)
├─ Image Processing (compress)
├─ Image Processing (convert to WebP)
├─ File Operations (save)
└─ Database (record new path)
↓
Notification (done)
Pattern 2: Generate Social Media Assets
Start (user uploads photo)
├─ Image Processing (resize to 1080x1080 for Instagram)
├─ Image Processing (resize to 1200x630 for Facebook)
├─ Image Processing (resize to 1024x512 for Twitter)
└─ File Operations (save all versions)
↓
Email (download links)
Pattern 3: Smart Image with OCR
Start (document image)
↓
Image Processing (optimize contrast)
↓
File Operations (save processed image)
↓
AI Agent (read text via OCR)
↓
Data Transform (extract structured data)
↓
Database (store extracted info)
File Processing Best Practices
Security
- Validate file types - check MIME type, not just extension
- Scan for malware - use a third-party scanning service before processing
- Set file size limits - prevent DoS via huge uploads
- Use encryption for sensitive files
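The first and third points above can be sketched in a few lines: check the file's leading bytes rather than its extension, and reject oversized uploads early. This is a minimal illustration covering only three formats; the 10 MB limit and the helper names are assumptions, and a production check would typically rely on a dedicated detection library.

```python
from typing import Optional

# Well-known leading byte signatures ("magic numbers").
SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}
MAX_BYTES = 10 * 1024 * 1024  # hypothetical limit, tune for your workflow


def sniff_mime(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    for magic, mime in SIGNATURES.items():
        if head.startswith(magic):
            return mime
    return None


def validate_upload(data: bytes, allowed: set[str], max_bytes: int = MAX_BYTES) -> str:
    """Return the detected MIME type, or raise ValueError if the upload is rejected."""
    if len(data) > max_bytes:
        raise ValueError("file exceeds size limit")
    mime = sniff_mime(data[:16])
    if mime is None or mime not in allowed:
        raise ValueError(f"file type not allowed: {mime}")
    return mime


print(validate_upload(b"%PDF-1.7\n...", {"application/pdf"}))
```

A file named report.pdf that actually starts with executable bytes would be rejected here even though its extension looks fine.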
Performance
- Process in background - don't block waiting for file operations
- Optimize images - compress before storing (can save roughly 60-80% of space, depending on the source format and quality setting)
- Use cloud storage - S3/GCS for large files instead of local disk
- Cache PDFs - don't regenerate identical PDFs
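One way to avoid regenerating identical PDFs is to derive a cache key from the template name and the input data, then reuse the stored file when the key already exists. The sketch below shows only the key derivation; it assumes the data is JSON-serializable, and `pdf_cache_key` is a hypothetical helper.

```python
import hashlib
import json


def pdf_cache_key(template: str, data: dict) -> str:
    """Return a stable SHA-256 key for a template plus its input data."""
    # sort_keys makes the key independent of dictionary ordering.
    canonical = json.dumps(
        {"template": template, "data": data},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key = pdf_cache_key("invoice_template", {"invoice_id": "INV-001", "total": 100.0})
print(key)
```

The key could then serve as the file name in cloud storage, for example under a cache prefix of your choosing.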
Reliability
- Add retry logic - file operations can fail (disk full, permissions)
- Validate output - check that the file was created and is readable
- Log operations - track what files were created/modified
- Clean up temp files - don't leave processed files lying around
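Retry logic for a flaky file operation can be sketched as a small wrapper with exponential backoff. The attempt count, delays, and the choice to retry only on OSError are assumptions for illustration; `with_retry` is a hypothetical helper, not a built-in feature of these nodes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on the given errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait 0.5s, 1s, 2s, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Usage: wrap a write that may fail transiently.
# with_retry(lambda: open("/reports/RPT-001.txt", "w").close())
```

Errors that will not fix themselves, such as a missing permission, are still raised after the final attempt so the workflow can log them.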
Next Steps
- Need to parse structured files? Use Data Parser Node
- Process file contents with AI? Try AI & Intelligence Nodes
- Store file metadata? Visit Database Nodes
- Send files to users? Check Communication Nodes
- Transform file data? Use Data Processing Nodes