Stirling PDF Tools Installation
Stirling-PDF is a self-hosted web application for PDF manipulation. It provides over 50 operations, including merge, split, compress, OCR, convert, rotate, and watermark, all running locally on your Linux server so that sensitive documents are never uploaded to cloud services. With a Docker-based deployment and a REST API, it integrates into document workflows and can replace paid tools such as Adobe Acrobat for common PDF tasks.
Prerequisites
- Docker and Docker Compose installed
- Root or sudo access
- Minimum 512 MB RAM (more needed for OCR and conversion operations)
- Sufficient temp disk space for processing large PDF files
Installing Stirling-PDF with Docker
Quick start with Docker:
docker run -d \
  --name stirling-pdf \
  -p 8080:8080 \
  -v /opt/stirling-pdf/trainingData:/usr/share/tessdata \
  -v /opt/stirling-pdf/extraConfigs:/configs \
  -v /opt/stirling-pdf/customFiles:/customFiles \
  -v /opt/stirling-pdf/logs:/logs \
  -e DOCKER_ENABLE_SECURITY=false \
  -e INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false \
  -e LANGS="en_GB" \
  --restart unless-stopped \
  frooodle/s-pdf:latest
Docker Compose with full features (recommended):
sudo mkdir -p /opt/stirling-pdf/{trainingData,extraConfigs,customFiles,logs}
sudo tee /opt/stirling-pdf/docker-compose.yml <<'EOF'
version: '3.8'
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /opt/stirling-pdf/trainingData:/usr/share/tessdata
      - /opt/stirling-pdf/extraConfigs:/configs
      - /opt/stirling-pdf/customFiles:/customFiles
      - /opt/stirling-pdf/logs:/logs
    environment:
      # Application name displayed in UI
      - APP_HOME_NAME=PDF Tools
      # Default locale
      - LANGS=en_GB
      # Install additional tools for advanced features
      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=true
      # Disable security (enable if needed)
      - DOCKER_ENABLE_SECURITY=false
      # Log level
      - SYSTEM_DEFAULT_LOGLEVEL=WARN
EOF
cd /opt/stirling-pdf
sudo docker compose up -d
sudo docker compose logs stirling-pdf
Access Stirling-PDF at http://your-server:8080.
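The container can take a little while to start, so the URL above may not answer immediately. The following is a minimal sketch of a readiness check, assuming curl is installed on the host. The function name wait_for_http and its retry defaults are illustrative and not part of Stirling-PDF.

```bash
# Poll a URL until it answers without an HTTP error, or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  local url=$1 attempts=${2:-30} delay=${3:-2} i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_http http://localhost:8080 30 2` keeps retrying for roughly a minute before giving up.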
Available PDF Operations
Stirling-PDF provides operations organized into categories:
Organize:
| Operation | Description |
|---|---|
| Merge PDFs | Combine multiple PDFs into one |
| Split PDF | Split by page number, range, or bookmarks |
| Remove Pages | Delete specific pages |
| Reorder Pages | Drag-and-drop page reordering |
| Rotate PDF | Rotate pages by 90/180/270 degrees |
| Multi-page Layout | N-up printing (2/4 pages per sheet) |
Convert:
| Operation | Description |
|---|---|
| PDF to Image | Convert pages to PNG/JPEG/TIFF |
| Image to PDF | Combine images into a PDF |
| PDF to Word | Convert to .docx |
| HTML to PDF | Convert web pages |
| Markdown to PDF | Convert .md files |
| PDF to Text | Extract plain text |
Security:
| Operation | Description |
|---|---|
| Add Password | Encrypt with user/owner passwords |
| Remove Password | Decrypt PDFs (requires password) |
| Change Permissions | Set printing/copying restrictions |
| Add Watermark | Text or image watermarks |
| Redact | Permanently remove sensitive content |
| Sign | Add digital signatures |
Optimize:
| Operation | Description |
|---|---|
| Compress PDF | Reduce file size (configurable quality) |
| Repair PDF | Fix corrupted PDF structure |
| Linearize PDF | Optimize for web delivery (fast first-page load) |
| Remove Blanks | Delete blank pages automatically |
OCR Configuration
Stirling-PDF uses Tesseract for OCR (adding searchable text layers to scanned PDFs):
Install Tesseract language packs:
# Add language data to the mounted trainingData volume
# Download from: https://github.com/tesseract-ocr/tessdata
# English is included by default
# Add German language support
curl -LO https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata
sudo mv deu.traineddata /opt/stirling-pdf/trainingData/
# Add French
curl -LO https://github.com/tesseract-ocr/tessdata/raw/main/fra.traineddata
sudo mv fra.traineddata /opt/stirling-pdf/trainingData/
# Add Spanish
curl -LO https://github.com/tesseract-ocr/tessdata/raw/main/spa.traineddata
sudo mv spa.traineddata /opt/stirling-pdf/trainingData/
# Restart to reload language data
sudo docker compose restart stirling-pdf
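The per-language download steps above follow one pattern, so they can be folded into a loop. This is a sketch under the same assumptions as the commands above (the tessdata repository URL and the /opt/stirling-pdf/trainingData mount). The names tessdata_url and install_langs are hypothetical helpers.

```bash
TESSDATA_BASE="https://github.com/tesseract-ocr/tessdata/raw/main"
TESSDATA_DIR="/opt/stirling-pdf/trainingData"

# Build the download URL for a Tesseract language code such as "deu".
tessdata_url() {
  printf '%s/%s.traineddata\n' "$TESSDATA_BASE" "$1"
}

# Download each requested language unless a non-empty file already exists.
install_langs() {
  local lang target tmp
  for lang in "$@"; do
    target="$TESSDATA_DIR/$lang.traineddata"
    if [ -s "$target" ]; then
      echo "skip $lang (already installed)"
      continue
    fi
    tmp=$(mktemp)
    # --fail stops curl from saving an HTML error page as language data
    if curl --fail --location --silent --show-error -o "$tmp" "$(tessdata_url "$lang")"; then
      # install copies the file with world-readable permissions
      sudo install -m 644 "$tmp" "$target" && echo "installed $lang"
    else
      echo "failed $lang" >&2
    fi
    rm -f "$tmp"
  done
}
```

Calling `install_langs deu fra spa` and then restarting the container gives the same result as the individual commands, and it can be re-run safely because existing files are skipped.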
Perform OCR on a scanned PDF:
- Go to http://your-server:8080
- Navigate to Convert > PDF to PDF/A or Other > Add Text Layer
- Upload your scanned PDF
- Select language(s)
- Choose OCR mode:
  - Skip text: Only add text where none exists (fastest)
  - Force OCR: Re-OCR all pages (most thorough)
  - PDF/A: Create archival-quality PDF with text layer
- Download the processed PDF
REST API Usage
Stirling-PDF exposes a REST API for programmatic access:
API documentation: http://your-server:8080/swagger-ui/index.html
Common API operations:
BASE="http://localhost:8080/api"
# Merge two PDFs (file1.pdf and file2.pdf are placeholder names)
curl -X POST "${BASE}/v1/general/merge-pdfs" \
  -F "fileInput=@file1.pdf" \
  -F "fileInput=@file2.pdf" \
  -o merged.pdf
# Split PDF by page range (the result is a ZIP archive of the resulting PDFs)
curl -X POST "${BASE}/v1/general/split-pages" \
  -F "fileInput=@input.pdf" \
  -F "pageNumbers=1-5,8,10-15" \
  -o split.zip
# Compress PDF (optimizeLevel ranges from 1 to 5; 5 = most compression)
curl -X POST "${BASE}/v1/misc/compress-pdf" \
  -F "fileInput=@input.pdf" \
  -F "optimizeLevel=3" \
  -o compressed.pdf
# Convert PDF to images
curl -X POST "${BASE}/v1/convert/pdf/img" \
  -F "fileInput=@input.pdf" \
  -F "imageFormat=png" \
  -F "singleOrMultiple=multiple" \
  -F "dpi=150" \
  -o images.zip
# Add watermark
curl -X POST "${BASE}/v1/security/add-watermark" \
  -F "fileInput=@input.pdf" \
  -F "watermarkText=CONFIDENTIAL" \
  -F "fontSize=50" \
  -F "opacity=0.3" \
  -F "rotation=45" \
  -o watermarked.pdf
# OCR a scanned PDF
curl -X POST "${BASE}/v1/misc/ocr-pdf" \
  -F "fileInput=@scanned.pdf" \
  -F "languages=eng" \
  -F "ocrType=skip-text" \
  -o searchable.pdf
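Because curl -o writes whatever the server returns, a failed request can leave an error message saved under a .pdf name. One way to guard against that is to combine curl's --fail flag with a check for the %PDF- signature at the start of the file. This sketch assumes the endpoint returns a single PDF on success; endpoints that return ZIP archives would need a different check. The helper names is_pdf and post_pdf are hypothetical.

```bash
BASE="${BASE:-http://localhost:8080/api}"

# Succeed only if the file starts with the "%PDF-" signature.
is_pdf() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}

# POST to an endpoint and keep the result only if it is a PDF.
# Usage: post_pdf ENDPOINT OUTPUT [extra curl arguments such as -F fields]
post_pdf() {
  local endpoint=$1 out=$2
  shift 2
  if curl --silent --show-error --fail -X POST "${BASE}${endpoint}" "$@" -o "$out" \
    && is_pdf "$out"; then
    echo "ok: $out"
  else
    # Remove partial or non-PDF output so it is not mistaken for a result
    rm -f "$out"
    echo "failed: $endpoint" >&2
    return 1
  fi
}
```

The OCR call above would then become `post_pdf /v1/misc/ocr-pdf searchable.pdf -F "fileInput=@scanned.pdf" -F "languages=eng" -F "ocrType=skip-text"`, where scanned.pdf is a placeholder file name.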
Batch processing script:
#!/bin/bash
# Compress all PDFs in a directory
INPUT_DIR="/docs/incoming"
OUTPUT_DIR="/docs/compressed"
BASE="http://localhost:8080/api"
mkdir -p "$OUTPUT_DIR"
# Skip the loop entirely when no PDFs match
shopt -s nullglob
for pdf in "$INPUT_DIR"/*.pdf; do
  filename=$(basename "$pdf")
  echo "Compressing: $filename"
  # --fail makes curl exit non-zero on HTTP errors
  if curl -s --fail -X POST "${BASE}/v1/misc/compress-pdf" \
      -F "fileInput=@${pdf}" \
      -F "optimizeLevel=2" \
      -o "${OUTPUT_DIR}/${filename}"; then
    echo "Saved to: ${OUTPUT_DIR}/${filename}"
  else
    echo "Failed: $filename" >&2
    rm -f "${OUTPUT_DIR}/${filename}"
  fi
done
Security Configuration
Enable login authentication:
# In docker-compose.yml environment:
- DOCKER_ENABLE_SECURITY=true
- SECURITY_ENABLELOGIN=true
- SECURITY_INITIALLOGIN_USERNAME=admin
- SECURITY_INITIALLOGIN_PASSWORD=change-me-now
# Keep CSRF protection enabled (false means CSRF checks stay on)
- SECURITY_CSRFDISABLED=false
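With login enabled, API requests also need credentials. Stirling-PDF's documentation describes sending a per-user API key in an X-API-KEY header. Treat the header name as an assumption and confirm it against the Swagger UI of your version. The sketch below adds the header only when a key is present. STIRLING_API_KEY and api_curl are illustrative names.

```bash
# Run curl, adding the API key header only when STIRLING_API_KEY is set.
# Assumption: the server accepts the key in an "X-API-KEY" request header.
api_curl() {
  if [ -n "${STIRLING_API_KEY:-}" ]; then
    curl -H "X-API-KEY: ${STIRLING_API_KEY}" "$@"
  else
    curl "$@"
  fi
}
```

After `export STIRLING_API_KEY=...` with a key taken from your account settings, any of the API calls in this guide can be run by replacing curl with api_curl.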
Customize login limits, locale, and branding in settings.yml:
# /opt/stirling-pdf/extraConfigs/settings.yml
security:
  enableLogin: true
  loginAttemptCount: 5
  loginResetTimeMinutes: 10
system:
  defaultLocale: en-US
  googlevisibility: false
  rootUriPath: "/"
ui:
  appName: "My PDF Tools"
  homeDescription: "Internal document processing"
  appNameNavbar: "PDF Tools"
Reverse Proxy with Nginx
# /etc/nginx/sites-available/stirling-pdf
server {
    listen 80;
    server_name pdf.example.com;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name pdf.example.com;
    ssl_certificate /etc/letsencrypt/live/pdf.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/pdf.example.com/privkey.pem;
    # Allow large PDF uploads
    client_max_body_size 500M;
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
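# Obtain the certificate first; the server block above references its files
sudo certbot certonly --nginx -d pdf.example.com
sudo ln -s /etc/nginx/sites-available/stirling-pdf /etc/nginx/sites-enabled/
# Validate the configuration before reloading
sudo nginx -t
sudo systemctl reload nginx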
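Nginx rejects request bodies larger than client_max_body_size with a 413 error, so it can help to check file sizes before uploading through the proxy. The sketch below assumes the 500M limit from the server block above. MAX_UPLOAD_MB, file_size_bytes, and fits_upload_limit are illustrative names. The multipart encoding adds a little overhead to each request, so treat files near the limit as too large.

```bash
MAX_UPLOAD_MB=500  # mirrors client_max_body_size 500M

# Print the size of a file in bytes.
file_size_bytes() {
  wc -c < "$1" | tr -d '[:space:]'
}

# Succeed if the file exists and is no larger than MAX_UPLOAD_MB.
fits_upload_limit() {
  local size limit
  [ -f "$1" ] || return 1
  size=$(file_size_bytes "$1")
  limit=$((MAX_UPLOAD_MB * 1024 * 1024))
  [ "$size" -le "$limit" ]
}
```

A typical use is `fits_upload_limit report.pdf || echo "report.pdf exceeds the proxy upload limit" >&2`, where report.pdf is a placeholder.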
Troubleshooting
OCR not working / "language not found" error:
# Verify language files are in the correct location
ls /opt/stirling-pdf/trainingData/
# Should show: eng.traineddata (and others)
# Check if the container can see the files
sudo docker compose exec stirling-pdf ls /usr/share/tessdata/
# Restart after adding new language files
sudo docker compose restart stirling-pdf
PDF conversion failing:
# Check container logs
sudo docker compose logs stirling-pdf -n 50
# Verify the PDF is not corrupted
# Try with a smaller/simpler PDF first
# Some operations require LibreOffice
# Enable INSTALL_BOOK_AND_ADVANCED_HTML_OPS=true and restart
Application timing out on large files:
# Increase timeout in Nginx proxy settings
# proxy_read_timeout 1800s;
# Or process via API with curl timeout
curl --max-time 600 -X POST ...
# Check available disk space (temp files are written during processing)
df -h /tmp
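The disk space check can also be scripted as a pre-flight test before a large job. This sketch reads the available space from df's POSIX output format. The function names and the 1024 MB threshold in the usage line are illustrative, and where temporary files are actually written depends on your container setup, so point the check at whichever path applies.

```bash
# Print the free space, in kilobytes, on the filesystem holding the given path.
free_kb() {
  # -P forces one line per filesystem; column 4 is the available space
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the path has at least the given number of megabytes free.
# Usage: has_free_space PATH MIN_MB
has_free_space() {
  local path=$1 min_mb=$2 kb
  kb=$(free_kb "$path")
  [ -n "$kb" ] && [ "$kb" -ge $((min_mb * 1024)) ]
}
```

For example, `has_free_space /tmp 1024 || echo "Less than 1 GB free in /tmp" >&2` warns before processing starts.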
Conclusion
Stirling-PDF provides a comprehensive self-hosted PDF manipulation toolkit that eliminates the need for cloud PDF services or paid desktop software for common document operations. With over 50 available operations, Tesseract OCR integration, and a well-documented REST API for workflow automation, it handles everything from simple merges to complex document processing pipelines entirely within your own infrastructure.


