Ollama
Run AI models like Llama, Mistral and Gemma on your server. OpenAI-compatible REST API. Complete privacy without external services.
Getting started
Ollama lets you run large language models (LLMs) like Llama, Mistral, and Gemma directly on your server. It exposes an OpenAI-compatible REST API for easy integration with existing tools.
- Go to https://my.cubepath.com/deploy
- Select a VPS plan (recommended: gp.medium or higher)
- Under Operating System, choose "Ollama"
- Click Deploy
Once deployment is complete, the Ollama API will be available at http://YOUR-IP:11434. You'll need to pull at least one model before you can use it.
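You can pull your first model directly through the REST API. A minimal sketch (llama3.2 is just an example; substitute any model from the Ollama library):
# Pull a model via the Ollama API
curl http://YOUR-IP:11434/api/pull -d '{
  "model": "llama3.2"
}'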
Deploy via CubePath Cloud API
Via CubeCLI
cubecli vps create \
  --name ollama-server \
  --plan gp.medium \
  --template "Ollama" \
  --location us-mia-1 \
  --project <project-id>
Via API
curl -X POST https://api.cubepath.com/vps \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ollama-server",
    "plan": "gp.medium",
    "template": "Ollama",
    "location": "us-mia-1",
    "project_id": "YOUR_PROJECT_ID"
  }'
Technical information
Access:
- API URL: http://YOUR-IP:11434
- No web interface (API only)
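To verify the service is reachable, query the root endpoint, which returns a plain-text status message:
# Quick health check (returns "Ollama is running")
curl http://YOUR-IP:11434/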
Installed software:
- Docker
- Ollama (official ollama/ollama image)
Ports used:
- 11434: Ollama API
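Note that the Ollama API has no built-in authentication, so anyone who can reach port 11434 can use your server. A minimal sketch restricting access with ufw, assuming ufw is installed on the host and YOUR-CLIENT-IP is a placeholder for the machine that should keep access:
# Allow the Ollama port only from a trusted address, block everything else
ufw allow from YOUR-CLIENT-IP to any port 11434 proto tcp
ufw deny 11434/tcp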
File locations:
- Data and models: /opt/ollama/
- Downloaded models: /opt/ollama/models/
System requirements:
- Minimum RAM: 8 GB (plan gp.small)
- Recommended RAM: 16 GB or more (plan gp.medium)
- Disk space: Minimum 20 GB (most models take 2-8 GB each)
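Before pulling large models, it's worth checking what the server actually has available. A quick sketch with standard Linux tools, using the data directory listed above:
# Check available RAM and free disk space for the model directory
free -h
df -h /opt/ollama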
Recommended models:
- 8 GB RAM: llama3.2 (2 GB), gemma (2 GB), mistral (4 GB)
- 16 GB RAM: llama3.1 (5 GB), codellama (4 GB)
- 32 GB+ RAM: llama3.1:70b and other larger models
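Once a model is loaded, ollama ps shows how much memory it is actually using, which helps confirm your plan's RAM is sufficient:
# Show loaded models and their memory usage
docker exec -it ollama ollama ps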
Useful commands
# Download a model
docker exec -it ollama ollama pull llama3.2
# List installed models
docker exec -it ollama ollama list
# Run model in chat mode
docker exec -it ollama ollama run llama3.2
# Remove a model
docker exec -it ollama ollama rm llama3.2
# View logs
docker logs ollama
# Restart Ollama
docker restart ollama
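To update Ollama itself, pull the latest image and recreate the container. This is a sketch that assumes the container was started with the data volume at /opt/ollama and default flags; adjust the docker run options to match your actual setup. Models persist because they live in the mounted volume.
# Update to the latest Ollama image (assumes default container settings)
docker pull ollama/ollama
docker stop ollama && docker rm ollama
docker run -d --name ollama \
  -v /opt/ollama:/root/.ollama \
  -p 11434:11434 \
  --restart unless-stopped \
  ollama/ollama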
Using the API
# List available models
curl http://YOUR-IP:11434/api/tags
# Generate text
curl http://YOUR-IP:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
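Since Ollama also exposes OpenAI-compatible endpoints under /v1, existing OpenAI client libraries can be pointed at the server with just a base-URL change. A minimal sketch using the chat completions endpoint (the model must already be pulled):
# Chat via the OpenAI-compatible endpoint
curl http://YOUR-IP:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [
      {"role": "user", "content": "Why is the sky blue?"}
    ]
  }'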