Find guides, tutorials, and documentation to help you get the most out of CubePath services.

Deploy vLLM on Linux for high-throughput LLM inference with PagedAttention. Learn installation, model loading, OpenAI-compatible API, quantization, and GPU memory optimization.
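As a taste of the OpenAI-compatible API the guide covers, here is a minimal sketch of querying a running vLLM server using only the standard library. The URL, port 8000 (vLLM's usual default), and the model name are assumptions for illustration:

```python
import json
import urllib.request

# Assumption: vLLM's OpenAI-compatible server is running locally on its default port.
VLLM_URL = "http://localhost:8000/v1/completions"

def build_completion_request(model: str, prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build an OpenAI-style /v1/completions request for a vLLM server."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_completion_request("mistralai/Mistral-7B-Instruct-v0.2", "Hello,")
# With a server running: body = urllib.request.urlopen(req).read()
```

Because the API mirrors OpenAI's, existing OpenAI client libraries can also be pointed at the vLLM base URL unchanged.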

Install Open WebUI to create a ChatGPT-like interface for self-hosted LLMs. Covers Docker deployment, Ollama integration, user management, RAG pipelines, and customization.
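To illustrate the Docker deployment with Ollama integration, here is a sketch that composes the commonly documented `docker run` invocation; the host port, volume name, and Ollama URL are adjustable assumptions:

```python
def open_webui_docker_cmd(ollama_url: str = "http://host.docker.internal:11434",
                          host_port: int = 3000) -> list[str]:
    """Compose a `docker run` command for Open WebUI (image and internal
    port 8080 follow the project's documented defaults)."""
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:8080",                       # WebUI listens on 8080 inside the container
        "--add-host=host.docker.internal:host-gateway",  # let the container reach Ollama on the host
        "-e", f"OLLAMA_BASE_URL={ollama_url}",
        "-v", "open-webui:/app/backend/data",            # persist users, chats, and settings
        "--name", "open-webui",
        "ghcr.io/open-webui/open-webui:main",
    ]

cmd = open_webui_docker_cmd()
# To actually launch: subprocess.run(cmd, check=True)
```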

Use MinIO as S3-compatible storage for ML workflows. Learn bucket organization for models and datasets, versioning, Python SDK integration, and high-performance data pipelines.
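The bucket-organization idea can be sketched as plain key-layout helpers; the `models/` and `datasets/` prefixes, bucket name, and filenames below are illustrative conventions, not MinIO requirements:

```python
def model_object_key(name: str, version: int, filename: str) -> str:
    """Versioned model layout: models/<name>/v<version>/<file>."""
    return f"models/{name}/v{version}/{filename}"

def dataset_object_key(name: str, split: str, filename: str) -> str:
    """Dataset layout grouped by split: datasets/<name>/<split>/<file>."""
    return f"datasets/{name}/{split}/{filename}"

key = model_object_key("bert-base", 3, "weights.safetensors")
# With the `minio` package and a running server, an upload would be roughly:
# Minio("localhost:9000", access_key=..., secret_key=..., secure=False) \
#     .fput_object("ml-artifacts", key, "weights.safetensors")
```

Keeping the version in the key makes rollbacks a matter of changing one path component.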

Install NVIDIA GPU drivers on Linux for compute and AI workloads. Covers driver selection, DKMS setup, kernel compatibility, verification, and troubleshooting common installation issues.
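For the verification step, a common check is parsing `nvidia-smi` query output. This sketch separates the parsing (tested on a hardcoded sample line, since no GPU is assumed here) from the actual call:

```python
import csv
import io
import subprocess

def parse_gpu_query(csv_text: str) -> list[dict]:
    """Parse `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader` output."""
    rows = csv.reader(io.StringIO(csv_text))
    return [{"name": name.strip(), "driver": driver.strip()} for name, driver in rows]

def query_gpus() -> list[dict]:
    """Run nvidia-smi (assumes a working driver install) and parse the result."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_query(out)

# Sample output line for illustration only:
gpus = parse_gpu_query("NVIDIA A100-SXM4-80GB, 550.54.14\n")
```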

Install and configure the NVIDIA CUDA Toolkit for GPU computing on Linux. Learn version selection, cuDNN setup, environment variables, multi-version management, and verification.

Deploy a secure Jupyter Notebook/Lab server on Linux for remote data science work. Covers installation, password protection, SSL with Nginx reverse proxy, and kernel management.
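A minimal sketch of the server-side settings such a setup typically uses, as a `jupyter_server_config.py` fragment (not standalone-executable: Jupyter supplies the `c` configuration object when it loads this file; the port and paths are assumptions matching a local Nginx upstream):

```python
# ~/.jupyter/jupyter_server_config.py -- loaded by Jupyter, which provides `c`
c.ServerApp.ip = "127.0.0.1"      # bind locally; Nginx terminates SSL and proxies here
c.ServerApp.port = 8888           # assumed port, matching the Nginx upstream block
c.ServerApp.open_browser = False  # headless server, no local browser
# Set a hashed login password interactively, e.g. with: jupyter server password
```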

Install Ollama on Linux to run large language models locally. Learn model management, GPU acceleration, API usage, Open WebUI integration, and performance optimization.
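As a sketch of the API usage the guide covers, here is a non-streaming request to Ollama's `/api/generate` endpoint using only the standard library; the model name is an assumption and port 11434 is Ollama's default:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming /api/generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("llama3", "Why is the sky blue?")
# With Ollama running: json.loads(urllib.request.urlopen(req).read())["response"]
```

With `"stream": False` the server returns one JSON object instead of a stream of partial chunks.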

Deploy MLflow on Linux for experiment tracking, model registry, and ML lifecycle management. Covers server setup, backend storage, artifact stores, and team collaboration.
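The server-setup pieces fit together in one `mlflow server` invocation; this sketch composes it, with the SQLite backend store, S3/MinIO artifact root, and port 5000 (MLflow's default) as illustrative choices:

```python
def mlflow_server_cmd(backend_uri: str = "sqlite:///mlflow.db",
                      artifact_root: str = "s3://mlflow-artifacts",
                      host: str = "0.0.0.0", port: int = 5000) -> list[str]:
    """Compose an `mlflow server` command: the backend store holds experiment,
    run, and registry metadata; the artifact root holds models and files."""
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", artifact_root,
        "--host", host, "--port", str(port),
    ]

cmd = mlflow_server_cmd()
# subprocess.run(cmd) would start the tracking server for the whole team.
```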

Deploy TensorFlow Serving for production model inference on Linux. Learn installation methods, model versioning, REST and gRPC APIs, GPU support, and performance tuning.

Install Stable Diffusion WebUI (AUTOMATIC1111) on a Linux server for AI image generation. Covers dependency setup, GPU configuration, model downloads, and remote access.
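For remote use, the WebUI exposes an HTTP API when launched with the `--api` flag; this sketch builds a `txt2img` request against it, with port 7860 (the usual default) and all generation parameters as illustrative assumptions:

```python
import json
import urllib.request

# Assumption: AUTOMATIC1111 WebUI launched with --api, on its default port.
SD_URL = "http://localhost:7860/sdapi/v1/txt2img"

def build_txt2img_request(prompt: str, steps: int = 20,
                          width: int = 512, height: int = 512) -> urllib.request.Request:
    """Build a txt2img request for a running Stable Diffusion WebUI API."""
    payload = {"prompt": prompt, "steps": steps, "width": width, "height": height}
    return urllib.request.Request(
        SD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_txt2img_request("a lighthouse at dusk, oil painting")
# The response JSON carries base64-encoded images under the "images" key.
```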