Apache Superset Data Visualization Installation
Apache Superset is an open-source data exploration and visualization platform that connects to dozens of databases and lets you build interactive dashboards without writing code. This guide covers installing Superset on Linux using Docker Compose, configuring database connections, creating charts and dashboards, using SQL Lab, and managing user roles.
Prerequisites
- Ubuntu 20.04+ or CentOS 8+ / Rocky Linux 8+
- Docker and Docker Compose, or Python 3.9+
- 4 GB RAM minimum (8 GB recommended for production)
- A target database to visualize (PostgreSQL, MySQL, Trino, etc.)
Installing Superset with Docker Compose
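The Docker Compose approach is the quickest way to get a working instance. The Superset project describes its Compose files as intended for local development and evaluation rather than high-availability production use, so plan accordingly: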
# Clone the Superset repository
git clone https://github.com/apache/superset.git
cd superset
# Check out the latest stable release tag
git checkout $(git tag | grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -1)
# Copy example environment file
cp docker/.env-non-dev docker/.env
# Edit the environment file and set a strong secret key
nano docker/.env
# Change: SUPERSET_SECRET_KEY=your_very_long_random_secret_key_here
# Generate one with: openssl rand -base64 42
# Start Superset (first start downloads images and runs migrations - takes ~5 min)
docker compose -f docker-compose-non-dev.yml up -d
# Check status
docker compose -f docker-compose-non-dev.yml ps
# Default admin credentials: admin / admin (change the password after first login)
# Access at http://your-server:8088
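If openssl is not available, a key of the same shape can be generated with Python's standard library. This is a minimal sketch: the function name generate_secret_key is made up for illustration, and the variable name SUPERSET_SECRET_KEY assumes a recent Superset release that reads the key from that environment variable.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return num_bytes of cryptographically secure randomness, base64-encoded.

    Mirrors `openssl rand -base64 42`: 42 random bytes encode to 56 characters.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed line into docker/.env (variable name is an assumption).
    print(f"SUPERSET_SECRET_KEY={generate_secret_key()}")
```

Generate the key once and keep it: Superset uses it to encrypt stored database credentials, so replacing it later makes existing connections unreadable.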
Installing Superset with pip
For a non-Docker installation on Ubuntu:
# Install system dependencies
sudo apt-get update
sudo apt-get install -y build-essential libssl-dev libffi-dev python3-dev \
python3-pip libsasl2-dev libldap2-dev default-libmysqlclient-dev
# Create virtual environment
python3 -m venv /opt/superset-venv
source /opt/superset-venv/bin/activate
# Install Superset
pip install apache-superset
# Install database drivers (add what you need)
pip install psycopg2-binary # PostgreSQL
pip install mysqlclient # MySQL
pip install pydruid # Druid
# Set environment variables
export FLASK_APP=superset
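# Generate the key once and persist it; Superset reads SUPERSET_SECRET_KEY from the environment
export SUPERSET_SECRET_KEY=$(openssl rand -base64 42)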
# Initialize the database
superset db upgrade
# Create admin user
superset fab create-admin \
--username admin \
--firstname Admin \
--lastname User \
--email admin@example.com \
--password adminpassword
# Load example data (optional)
superset load_examples
# Initialize default roles and permissions
superset init
# Start the development server (use gunicorn for production)
superset run -p 8088 --with-threads --reload --debugger
For production, run with Gunicorn:
# Install Gunicorn, Celery, and the Redis client library
pip install gunicorn celery redis
# Start with gunicorn
gunicorn \
--bind 0.0.0.0:8088 \
--workers 4 \
--timeout 120 \
--limit-request-line 0 \
--limit-request-field_size 0 \
"superset.app:create_app()"
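The same flags can be kept in a Gunicorn configuration file, which is itself plain Python. The sketch below mirrors the command above one setting per flag; the file name gunicorn.conf.py and the use of the wsgi_app setting (available in Gunicorn 20.1 and later) are assumptions, not something the original command requires.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags above.
# Load it with: gunicorn -c gunicorn.conf.py

# Listen on all interfaces, port 8088 (same as --bind 0.0.0.0:8088)
bind = "0.0.0.0:8088"

# Number of worker processes (same as --workers 4)
workers = 4

# Seconds before a silent worker is killed and restarted (same as --timeout 120)
timeout = 120

# 0 disables the limits on request line and header field size,
# which long chart URLs and large cookies can otherwise exceed
limit_request_line = 0
limit_request_field_size = 0

# Application factory to serve (assumes Gunicorn 20.1+ for this setting)
wsgi_app = "superset.app:create_app()"
```

Keeping the settings in a file makes them easier to version and to reuse from a systemd unit.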
Connecting Databases
Superset supports 40+ databases through SQLAlchemy. To add a database:
- Go to Settings → Database Connections → + Database
- Select your database type from the dropdown
- Enter the SQLAlchemy URI or use the form fields
Common connection strings:
# PostgreSQL
postgresql+psycopg2://user:password@host:5432/dbname
# MySQL (uses the mysqlclient driver installed above)
mysql+mysqldb://user:password@host:3306/dbname
# SQLite (for testing)
sqlite:////path/to/database.db
# Amazon Redshift (requires the sqlalchemy-redshift pip package)
redshift+psycopg2://user:password@host:5439/dbname
# BigQuery (requires the sqlalchemy-bigquery pip package)
bigquery://project-id
Enable Allow DML and Expose in SQL Lab as needed, then click Test Connection before saving.
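Passwords containing characters such as @, / or : break a hand-written URI, because those characters are also URI delimiters. A small helper can percent-encode the credentials before you paste the string into Superset. This is a sketch using only the standard library; the function name build_sqlalchemy_uri and the sample credentials are made up for illustration.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_uri(dialect: str, user: str, password: str,
                         host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy URI, percent-encoding the user name and password."""
    return (
        f"{dialect}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Hypothetical credentials; the '@' and '/' in the password get encoded.
    print(build_sqlalchemy_uri(
        "postgresql+psycopg2", "user", "p@ss/word", "host", 5432, "dbname"
    ))
```

With the sample values, the password is emitted as p%40ss%2Fword, so the only unescaped @ in the URI is the one separating credentials from the host.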
Creating Charts and Dashboards
Create a chart:
- Go to Charts → + Chart
- Choose your dataset (table or saved SQL query)
- Select a chart type (Bar Chart, Line Chart, Table, Map, etc.)
- Configure the chart in the Data tab:
- Set Dimensions (X-axis / group by)
- Set Metrics (COUNT, SUM, AVG, etc.)
- Add Filters as needed
- Customize in the Customize tab (colors, labels, legends)
- Click Save
Create a dashboard:
- Go to Dashboards → + Dashboard
- Name your dashboard
- Click Edit dashboard
- Drag charts from the right panel onto the canvas
- Resize and rearrange cards
- Add Filters using the filter icon to link charts
- Click Save
For cross-filtering (click one chart to filter others), enable it in Dashboard properties → Cross-filtering.
Using SQL Lab
SQL Lab is Superset's SQL IDE for ad-hoc analysis:
- Go to SQL → SQL Lab
- Select a database and schema from the dropdowns
- Write your query:
-- Example: cohort analysis
SELECT
date_trunc('week', first_order_date)::date AS cohort_week,
count(DISTINCT customer_id) AS cohort_size,
sum(revenue) AS total_revenue
FROM (
SELECT
customer_id,
min(created_at) AS first_order_date,
sum(amount) AS revenue
FROM orders
WHERE created_at >= '2024-01-01'
GROUP BY customer_id
) sub
GROUP BY 1
ORDER BY 1;
- Press Ctrl+Enter or click Run
- Click Save to save as a query or Explore to create a chart from results
- Use Create dataset to make results available as a reusable dataset
Query history is saved automatically. Use Search to find previous queries.
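To sanity-check what the cohort query computes, the same grouping can be reproduced on a handful of rows in plain Python. This is an illustrative sketch, not part of Superset: the function name weekly_cohorts and the sample orders are invented, and the week is assumed to start on Monday, as PostgreSQL's date_trunc('week', ...) does.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_cohorts(orders, since=date(2024, 1, 1)):
    """Group customers by the week of their first order.

    orders: iterable of (customer_id, created_at: date, amount).
    Returns a sorted list of (cohort_week, cohort_size, total_revenue).
    """
    first_order = {}
    revenue = defaultdict(float)
    # Inner query: per-customer first order date and summed revenue
    for customer_id, created_at, amount in orders:
        if created_at < since:  # WHERE created_at >= '2024-01-01'
            continue
        if customer_id not in first_order or created_at < first_order[customer_id]:
            first_order[customer_id] = created_at
        revenue[customer_id] += amount

    # Outer query: bucket customers by the Monday of their first-order week
    cohorts = defaultdict(lambda: [0, 0.0])
    for customer_id, first in first_order.items():
        week_start = first - timedelta(days=first.weekday())
        cohorts[week_start][0] += 1
        cohorts[week_start][1] += revenue[customer_id]

    return [(week, size, total) for week, (size, total) in sorted(cohorts.items())]


if __name__ == "__main__":
    sample = [
        ("a", date(2024, 1, 2), 10.0),
        ("a", date(2024, 1, 20), 5.0),
        ("b", date(2024, 1, 4), 7.0),
        ("c", date(2024, 1, 10), 3.0),
        ("d", date(2023, 12, 30), 100.0),  # filtered out by the date cutoff
    ]
    for row in weekly_cohorts(sample):
        print(row)
```

In the sample, customers a and b both land in the week of 2024-01-01 with a combined revenue of 22.0, and customer c forms a one-person cohort in the following week.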
Caching Configuration
Configure Redis caching to avoid re-running expensive queries:
# superset_config.py (mount into container or set in config)
from cachelib.redis import RedisCache
CACHE_CONFIG = {
"CACHE_TYPE": "RedisCache",
"CACHE_DEFAULT_TIMEOUT": 300, # 5 minutes
"CACHE_KEY_PREFIX": "superset_",
"CACHE_REDIS_URL": "redis://redis:6379/0",
}
DATA_CACHE_CONFIG = {
"CACHE_TYPE": "RedisCache",
"CACHE_DEFAULT_TIMEOUT": 3600, # 1 hour for query results
"CACHE_KEY_PREFIX": "superset_data_",
"CACHE_REDIS_URL": "redis://redis:6379/1",
}
# Async query execution via Celery
RESULTS_BACKEND = RedisCache(
host="redis",
port=6379,
key_prefix="superset_results_"
)
Set chart-level cache timeout in chart settings under the Data tab.
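The results backend above only stores query results; asynchronous execution also needs a Celery broker. A minimal sketch of that part of superset_config.py follows, modeled on the pattern in the Superset documentation. The Redis host name redis and database number 2 are assumptions chosen to avoid colliding with the cache databases used above.

```python
# superset_config.py (continued): Celery settings for async SQL Lab queries.
# Host name and Redis database number below are assumptions for this sketch.

class CeleryConfig:
    # Queue that the web app writes tasks to and workers read from
    broker_url = "redis://redis:6379/2"
    # Task modules the worker must import to find SQL Lab tasks
    imports = ("superset.sql_lab",)
    # Where Celery stores task state
    result_backend = "redis://redis:6379/2"
    # Fetch one task at a time so long queries do not starve short ones
    worker_prefetch_multiplier = 1
    task_acks_late = False


# Superset looks for this name in the config module
CELERY_CONFIG = CeleryConfig
```

Async execution typically also has to be switched on per database connection in the database's advanced settings before SQL Lab hands queries to the workers.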
User Roles and Row-Level Security
Superset uses Flask-AppBuilder roles:
| Role | Access Level |
|---|---|
| Admin | Full access |
| Alpha | Can create charts/dashboards, manage own data |
| Gamma | View-only, sees what's explicitly granted |
| Public | Anonymous access (if enabled) |
Create a custom role:
- Settings → List Roles → +
- Name the role and add specific permissions
Row-Level Security (RLS) restricts which rows users see:
- Security → Row Level Security
- Click + and configure:
- Table: the dataset to restrict
- Roles: who the filter applies to
- Group Key: optional grouping
- Clause: SQL WHERE clause fragment
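-- Example RLS clause: each user sees only rows whose department matches their username
-- (Jinja macros such as current_username() require the ENABLE_TEMPLATE_PROCESSING feature flag)
department = '{{ current_username() }}'
-- Or use a lookup table
region IN (SELECT region FROM user_regions WHERE username = '{{ current_username() }}')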
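The Group Key decides how several RLS filters on the same dataset combine: per the Superset documentation, clauses that share a group key are ORed together, and the resulting groups (plus any ungrouped clauses) are ANDed. The sketch below only illustrates that rule; combine_rls_clauses is a made-up helper, not Superset's internal code.

```python
def combine_rls_clauses(filters):
    """Combine RLS filters into one WHERE fragment.

    filters: list of (group_key or None, clause) pairs.
    Same group key -> OR within the group; everything else -> AND.
    """
    grouped = {}
    ungrouped = []
    for group_key, clause in filters:
        if group_key:
            grouped.setdefault(group_key, []).append(clause)
        else:
            # Filters without a group key each stand alone
            ungrouped.append(clause)

    parts = [f"({clause})" for clause in ungrouped]
    for clauses in grouped.values():
        parts.append("(" + " OR ".join(f"({c})" for c in clauses) + ")")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(combine_rls_clauses([
        ("region", "region = 'EU'"),
        ("region", "region = 'US'"),
        (None, "is_active = true"),
    ]))
```

In this example a user covered by all three filters sees active rows from either region, because the two region clauses share a group key.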
Troubleshooting
Superset container keeps restarting:
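# "superset" is the Compose service name; "superset_app" is the container name
docker compose -f docker-compose-non-dev.yml logs superset | tail -30
# Common causes: SUPERSET_SECRET_KEY unset or left at its default, or a failed DB migration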
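Because the first start can take several minutes, it helps to poll Superset's /health endpoint rather than guess when the app is ready. The following is a standard-library sketch; the function name wait_for_superset, the timeout values, and the example URL are assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_superset(base_url: str, timeout: float = 300.0,
                      interval: float = 5.0) -> bool:
    """Poll <base_url>/health until it answers HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not up yet (connection refused, timeout, or an HTTP error status)
            pass
        time.sleep(interval)
    return False


if __name__ == "__main__":
    # Hypothetical address; replace with your server's host name.
    ok = wait_for_superset("http://localhost:8088")
    print("Superset is up" if ok else "Superset did not become healthy in time")
```

If the function keeps returning False while the container restarts in a loop, the logs command above is the next place to look.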
Database connection fails:
# Test the SQLAlchemy URI directly
docker exec -it superset_app python3 -c "
from sqlalchemy import create_engine
e = create_engine('postgresql+psycopg2://user:pass@host/db')
print(e.connect())
"
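If the SQLAlchemy test fails, it is worth separating network problems from driver or credential problems by checking that the database port is reachable at all. A minimal standard-library sketch follows; the function name can_reach and the example host are hypothetical. Run it from inside the Superset container so the check uses the same network the app does.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures
        return False


if __name__ == "__main__":
    # Hypothetical database host and the default PostgreSQL port
    print(can_reach("db.example.internal", 5432))
```

A False result points to DNS, firewall, or Docker networking; a True result with a failing SQLAlchemy connection points to the driver, credentials, or database name.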
Charts load slowly:
- Enable Redis caching (see above)
- Set an appropriate Cache Timeout on the dataset
- Add indexes to your database on GROUP BY and WHERE columns
- Use Async execution for long queries
"Unknown database" error after adding driver:
# Rebuild container with new pip packages
docker compose -f docker-compose-non-dev.yml build --no-cache superset
Celery workers not processing async queries:
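# "superset-worker" is the Compose service name; "superset_worker" is the container name
docker compose -f docker-compose-non-dev.yml logs superset-worker
# Ensure Redis is running and the Celery broker_url in CELERY_CONFIG is correct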
Conclusion
Apache Superset delivers a full-featured self-hosted BI platform with SQL Lab for ad-hoc analysis, a rich chart library, and granular role-based access control. Docker Compose is the quickest way to get a working instance running, and Redis caching keeps dashboards responsive even against large datasets. With row-level security, you can safely expose the same dashboard to users who should only see their own data slice.


