Slow Queries: Analysis and Optimization - Complete Performance Guide

Introduction

Database query performance is a critical factor in application responsiveness, user satisfaction, and overall system efficiency. Slow queries can significantly impact application performance, leading to poor user experience, increased server resource consumption, and potential system failures under high load. A single poorly optimized query can bring an entire application to its knees, affecting thousands of users and costing an organization substantial revenue and reputational damage.

Query optimization is both an art and a science, requiring deep understanding of database internals, indexing strategies, query execution plans, and data access patterns. While modern database systems like MySQL, PostgreSQL, and MariaDB have sophisticated query optimizers, they rely on accurate statistics, proper indexing, and well-structured queries to generate efficient execution plans. Even with automatic optimization, understanding how to identify, analyze, and fix slow queries remains an essential skill for developers, DBAs, and DevOps engineers.

This comprehensive guide covers everything from identifying slow queries using various monitoring tools to understanding query execution plans, implementing effective indexing strategies, and applying advanced optimization techniques. Whether you're troubleshooting a production performance issue or proactively optimizing your database, this guide provides the knowledge and practical techniques needed to significantly improve query performance and maintain a responsive, scalable database system.

Prerequisites

Before diving into query optimization, ensure you have:

System Requirements

  • MySQL 5.7 or later (8.0+ recommended), PostgreSQL 12+, or MariaDB 10.3+
  • Access to database server with sufficient resources
  • Administrative privileges on the database
  • Ability to modify database configuration files

Database Access

  • Root or administrative database credentials
  • Ability to enable slow query logging
  • Permission to create and modify indexes
  • Access to performance schema or system catalogs

Knowledge Requirements

  • Solid understanding of SQL syntax
  • Basic knowledge of database indexing concepts
  • Familiarity with database architecture
  • Understanding of query execution concepts
  • Basic performance monitoring skills

Tools and Access

  • Database command-line client (mysql, psql)
  • Access to database configuration files
  • Log file reading permissions
  • Optional: Query analysis tools (pt-query-digest, pgBadger)

Understanding Query Performance

Query Execution Lifecycle

Understanding how databases execute queries is crucial for optimization:

  1. Parsing: Query syntax is validated and parsed into an internal structure
  2. Optimization: Query optimizer generates execution plan
  3. Execution: Database engine executes the plan
  4. Result Return: Filtered and formatted results are returned to client
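
To see the parse and optimize stages at work, MySQL exposes an optimizer trace that records the plans it considered while planning a statement. A minimal sketch, reusing the orders table from later examples:

SET optimizer_trace = 'enabled=on';
SELECT * FROM orders WHERE customer_id = 12345;

-- The trace lists candidate plans, index choices, and cost estimates
SELECT query, trace FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';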

Common Performance Bottlenecks

Missing Indexes: Most common cause of slow queries. Full table scans read entire tables instead of using indexes to quickly locate data.

Inefficient Indexes: Wrong index type, column order, or unused indexes that slow down writes without improving reads.

Large Result Sets: Returning millions of rows when only hundreds are needed.

Complex Joins: Multiple table joins without proper indexes or inefficient join order.

Suboptimal Query Structure: Using subqueries instead of joins, or inefficient WHERE clauses.

Table Locks: Long-running queries holding locks and blocking other operations.

Resource Constraints: Insufficient memory, slow disk I/O, or CPU limitations.

Poor Schema Design: Denormalization issues, incorrect data types, or missing foreign keys.

Key Performance Metrics

Query Execution Time: Total time from query submission to result return.

Rows Examined: Number of rows the database had to scan to produce results.

Rows Returned: Number of rows actually returned to the client.

Index Usage: Whether and which indexes were used.

Lock Wait Time: Time spent waiting for locks to be released.

Temporary Table Creation: Whether temporary tables were created in memory or on disk.

Identifying Slow Queries

Method 1: Enable MySQL Slow Query Log

Configure MySQL to log slow queries:

sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

Add or modify:

[mysqld]
# Enable slow query log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log

# Log queries taking longer than 2 seconds
long_query_time = 2

# Log queries not using indexes
log_queries_not_using_indexes = 1

# Throttle logging of such queries to 10 per minute
log_throttle_queries_not_using_indexes = 10

# Log slow administrative statements
log_slow_admin_statements = 1

# Log slow replica statements (renamed log_slow_replica_statements in MySQL 8.0.26+)
log_slow_slave_statements = 1

# Only log queries that examined at least this many rows
min_examined_row_limit = 1000

Restart MySQL:

sudo systemctl restart mysql

Enable at runtime (no restart):

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;  -- applies to connections opened after this change
SET GLOBAL log_queries_not_using_indexes = 'ON';

View slow query log:

sudo tail -f /var/log/mysql/mysql-slow.log
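
For a quick summary without extra tooling, mysqldumpslow (bundled with the MySQL server packages) aggregates similar entries from the log:

# Top 10 queries sorted by total query time
mysqldumpslow -s t -t 10 /var/log/mysql/mysql-slow.log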

Method 2: Enable PostgreSQL Slow Query Logging

Configure PostgreSQL to log slow queries:

sudo nano /etc/postgresql/14/main/postgresql.conf

Add or modify:

# Log queries slower than 1000ms (1 second)
log_min_duration_statement = 1000

# Log more query details
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_statement = 'none'
log_duration = off

# Log query plans for slow queries (auto_explain must be preloaded)
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 1000
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_timing = on

Reload the configuration (note that changing shared_preload_libraries takes effect only after a full restart, not a reload):

sudo systemctl reload postgresql
sudo systemctl restart postgresql  # only needed when shared_preload_libraries changes

Or enable at runtime:

ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

View PostgreSQL logs:

sudo tail -f /var/log/postgresql/postgresql-14-main.log
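
The threshold can also be scoped to a single database or role instead of the whole server; new sessions pick the setting up automatically. The names below are hypothetical:

-- Log statements over 500ms for one database only
ALTER DATABASE appdb SET log_min_duration_statement = 500;

-- Log every statement issued by one role
ALTER ROLE reporting_user SET log_min_duration_statement = 0;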

Method 3: Query Performance Schema (MySQL 8.0+)

Use Performance Schema to identify slow queries:

-- Enable Performance Schema (usually enabled by default in MySQL 8.0+)
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement/%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE '%statements%';

-- Find slowest queries by total time
SELECT
    DIGEST_TEXT as query,
    COUNT_STAR as exec_count,
    AVG_TIMER_WAIT / 1000000000000 as avg_time_sec,
    SUM_TIMER_WAIT / 1000000000000 as total_time_sec,
    MAX_TIMER_WAIT / 1000000000000 as max_time_sec,
    SUM_ROWS_EXAMINED as rows_examined,
    SUM_ROWS_SENT as rows_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 20;

-- Find queries with most rows examined
SELECT
    DIGEST_TEXT as query,
    COUNT_STAR as exec_count,
    AVG_TIMER_WAIT / 1000000000000 as avg_time_sec,
    SUM_ROWS_EXAMINED as rows_examined,
    SUM_ROWS_SENT as rows_sent,
    SUM_ROWS_EXAMINED / SUM_ROWS_SENT as examined_sent_ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_ROWS_SENT > 0
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 20;

-- Find queries with full table scans
SELECT
    DIGEST_TEXT as query,
    COUNT_STAR as exec_count,
    SUM_NO_INDEX_USED as full_scans,
    SUM_NO_GOOD_INDEX_USED as bad_index
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_NO_INDEX_USED > 0 OR SUM_NO_GOOD_INDEX_USED > 0
ORDER BY SUM_NO_INDEX_USED DESC
LIMIT 20;
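
MySQL 8.0 also bundles the sys schema, which wraps these Performance Schema tables in friendlier views. A quick alternative to the digest queries above (the view should already be sorted by total latency, descending):

SELECT query, db, exec_count, total_latency, rows_examined
FROM sys.statement_analysis
LIMIT 10;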

Method 4: Query pg_stat_statements (PostgreSQL)

Install and use pg_stat_statements:

-- Requires shared_preload_libraries = 'pg_stat_statements' in postgresql.conf
-- (restart PostgreSQL after changing it), then install the extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- View slowest queries
SELECT
    query,
    calls,
    total_exec_time / 1000 as total_time_sec,
    mean_exec_time / 1000 as avg_time_sec,
    max_exec_time / 1000 as max_time_sec,
    rows,
    100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0) AS cache_hit_ratio
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

-- Find queries with low cache hit ratio
SELECT
    query,
    calls,
    shared_blks_hit,
    shared_blks_read,
    100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0) AS cache_hit_ratio
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY cache_hit_ratio ASC
LIMIT 20;

-- Reset statistics
SELECT pg_stat_statements_reset();

Method 5: Using pt-query-digest (Percona Toolkit)

Analyze MySQL slow query log:

# Install Percona Toolkit
sudo apt-get install percona-toolkit

# Analyze slow query log
pt-query-digest /var/log/mysql/mysql-slow.log > slow_query_analysis.txt

# Analyze with specific options
pt-query-digest \
    --limit 95%:20 \
    --filter '$event->{fingerprint} =~ m/^select/i' \
    /var/log/mysql/mysql-slow.log

# Capture and analyze live traffic when no slow log is available
sudo tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txt
pt-query-digest --type tcpdump mysql.tcp.txt

View the analysis report:

less slow_query_analysis.txt

Method 6: Using pgBadger (PostgreSQL)

Analyze PostgreSQL log files:

# Install pgBadger
sudo apt-get install pgbadger

# Analyze log file
pgbadger /var/log/postgresql/postgresql-14-main.log -o report.html

# Analyze with specific options
pgbadger \
    --prefix '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ' \
    --outfile report.html \
    /var/log/postgresql/postgresql-14-main.log

# Open report in browser
xdg-open report.html

Analyzing Query Execution Plans

MySQL EXPLAIN Analysis

Use EXPLAIN to understand query execution:

-- Basic EXPLAIN
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;

-- Extended EXPLAIN with more details
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE customer_id = 12345;

-- Analyze actual execution statistics (MySQL 8.0.18+)
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 12345;

-- Visual tree format (MySQL 8.0.16+)
EXPLAIN FORMAT=TREE SELECT * FROM orders WHERE customer_id = 12345;

Understanding EXPLAIN output:

EXPLAIN SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
  AND o.created_at > '2024-01-01'
ORDER BY o.created_at DESC
LIMIT 100;

Key columns in EXPLAIN output:

  • id: SELECT identifier; rows sharing an id execute top to bottom, while higher ids (subqueries) execute first
  • select_type: Type of SELECT (SIMPLE, SUBQUERY, UNION, etc.)
  • table: Table being accessed
  • type: Join type (system, const, eq_ref, ref, range, index, ALL)
    • ALL = Full table scan (worst)
    • index = Full index scan
    • range = Index range scan
    • ref = Non-unique index lookup
    • eq_ref = Unique index lookup
    • const = Single row match (best)
  • possible_keys: Indexes that could be used
  • key: Index actually used
  • key_len: Length of index used
  • ref: Columns compared to index
  • rows: Estimated rows to examine
  • Extra: Additional information (Using where, Using index, Using temporary, Using filesort)

PostgreSQL EXPLAIN Analysis

Use EXPLAIN in PostgreSQL:

-- Basic EXPLAIN
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;

-- EXPLAIN with execution statistics
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 12345;

-- Detailed buffer usage information
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 12345;

-- Verbose output with full detail
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS, TIMING)
SELECT * FROM orders WHERE customer_id = 12345;

Understanding PostgreSQL EXPLAIN output:

EXPLAIN (ANALYZE, BUFFERS) SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
  AND o.created_at > '2024-01-01'
ORDER BY o.created_at DESC
LIMIT 100;

Key node types:

  • Seq Scan: Sequential scan (full table scan) - slow for large tables
  • Index Scan: Index scan with table lookup
  • Index Only Scan: Index contains all needed columns (fastest)
  • Bitmap Index Scan: Scan index and build bitmap of matching rows
  • Nested Loop: Join by iterating inner table for each outer row
  • Hash Join: Build hash table and probe for matches
  • Merge Join: Sort both tables and merge
  • Sort: Explicit sorting operation
  • Aggregate: Grouping and aggregation

Indexing Strategies for Performance

Understanding Index Types

B-Tree Index (Default): Best for equality and range queries. Works for >, <, =, BETWEEN, IN, LIKE 'prefix%' (in PostgreSQL, prefix LIKE uses the index only with the C collation or a text_pattern_ops operator class).

-- MySQL
CREATE INDEX idx_customer_email ON customers(email);

-- PostgreSQL
CREATE INDEX idx_customer_email ON customers(email);

Hash Index: Only for equality comparisons. Not supported by MySQL's InnoDB engine (the MEMORY engine offers it); in PostgreSQL, hash indexes are crash-safe from version 10 onward but remain limited to equality lookups.

-- PostgreSQL only
CREATE INDEX idx_customer_email_hash ON customers USING HASH (email);

Full-Text Index: For full-text search operations.

-- MySQL
CREATE FULLTEXT INDEX idx_product_description ON products(description);

-- PostgreSQL (using GIN)
CREATE INDEX idx_product_description ON products USING GIN (to_tsvector('english', description));

Spatial Index: For geographic data.

-- MySQL
CREATE SPATIAL INDEX idx_location ON stores(location);

-- PostgreSQL (requires PostGIS)
CREATE INDEX idx_location ON stores USING GIST (location);

Creating Effective Indexes

Single-column indexes:

-- Index for frequent WHERE conditions
CREATE INDEX idx_orders_status ON orders(status);

-- Index for JOIN conditions
CREATE INDEX idx_orders_customer_id ON orders(customer_id);

-- Index for ORDER BY columns
CREATE INDEX idx_orders_created_at ON orders(created_at);

Composite (multi-column) indexes:

-- Index for queries filtering on multiple columns
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Column order matters - put equality-filtered, highly selective columns first
CREATE INDEX idx_users_country_city ON users(country, city, registration_date);

-- Covering index includes all query columns
CREATE INDEX idx_orders_covering ON orders(customer_id, status, total_amount, created_at);

Partial indexes (PostgreSQL):

-- Index only active orders
CREATE INDEX idx_orders_active ON orders(created_at)
WHERE status IN ('pending', 'processing');

-- Index only recent data
CREATE INDEX idx_orders_recent ON orders(customer_id)
WHERE created_at > '2024-01-01';
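
Expression indexes (PostgreSQL) and functional key parts (MySQL 8.0.13+) index the result of a function, useful when queries must filter on an expression rather than the raw column. A short sketch:

-- PostgreSQL: expression index
CREATE INDEX idx_customers_email_lower ON customers (LOWER(email));

-- MySQL 8.0.13+: functional key part (note the double parentheses)
CREATE INDEX idx_customers_email_lower ON customers ((LOWER(email)));

-- Either lets this query use the index
SELECT * FROM customers WHERE LOWER(email) = 'alice@example.com';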

Index Best Practices

Create indexes for:

  • Foreign key columns used in JOINs
  • Columns in WHERE clauses
  • Columns in ORDER BY clauses
  • Columns in GROUP BY clauses
  • Columns in DISTINCT operations

Index column ordering (for composite indexes):

  1. Equality conditions first (=)
  2. Range conditions second (>, <, BETWEEN)
  3. Sort conditions last (ORDER BY)

-- Good composite index for this query
-- WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY updated_at
CREATE INDEX idx_optimized ON orders(status, created_at, updated_at);

Step-by-Step Query Optimization

Step 1: Identify the Slow Query

Use slow query log or monitoring tools:

-- MySQL: Check slow queries
SELECT
    DIGEST_TEXT as query,
    COUNT_STAR as exec_count,
    AVG_TIMER_WAIT / 1000000000000 as avg_time_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;

-- PostgreSQL: Check slow queries
SELECT query, calls, mean_exec_time / 1000 as avg_time_sec
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

Step 2: Analyze the Query Execution Plan

-- Example slow query
EXPLAIN ANALYZE
SELECT o.*, c.name, c.email, p.name as product_name
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL 30 DAY
  AND c.country = 'US'
ORDER BY o.created_at DESC
LIMIT 100;

Look for:

  • Full table scans (type: ALL or Seq Scan)
  • High row counts examined
  • Missing index usage
  • Temporary table creation
  • Filesort operations

Step 3: Create Necessary Indexes

Based on EXPLAIN analysis:

-- Index for orders filtering
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Index for customer country filter
CREATE INDEX idx_customers_country ON customers(country);

-- Indexes for JOIN columns (if not already present)
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);

Step 4: Rewrite Inefficient Queries

Original slow query:

-- Potentially slow: older optimizers re-execute the subquery per row
SELECT *
FROM orders
WHERE customer_id IN (
    SELECT id FROM customers WHERE country = 'US'
)
AND status = 'pending';

Optimized version:

-- Fast: JOIN with proper indexes
SELECT o.*
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.country = 'US'
  AND o.status = 'pending';

Another example - avoiding functions on indexed columns:

-- Slow: Function prevents index usage
SELECT * FROM orders
WHERE DATE(created_at) = '2024-01-15';

-- Fast: Range query uses index
SELECT * FROM orders
WHERE created_at >= '2024-01-15'
  AND created_at < '2024-01-16';

Step 5: Optimize JOIN Operations

Ensure JOIN columns are indexed:

-- Check indexes on JOIN columns
SHOW INDEX FROM orders WHERE Column_name = 'customer_id';
SHOW INDEX FROM customers WHERE Column_name = 'id';

-- Create missing indexes
CREATE INDEX idx_orders_customer_id ON orders(customer_id);

Use appropriate JOIN types:

-- INNER JOIN when you need matching rows from both tables
SELECT o.*, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;

-- LEFT JOIN only when you need all rows from left table
SELECT c.name, COUNT(o.id) as order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Step 6: Limit Result Sets

Always use LIMIT when appropriate:

-- Add LIMIT to prevent returning millions of rows
SELECT * FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 100;

-- Use pagination for large result sets
SELECT * FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 100 OFFSET 0;  -- First page
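
Large OFFSET values still force the database to walk past every skipped row, so deep pages get progressively slower. Keyset (seek) pagination filters past the last row of the previous page instead; the sketch below assumes an index on (created_at, id), and the literal values stand in for that last row. MySQL may not drive the row-value comparison through the index, in which case the condition can be expanded into an equivalent OR form:

-- Keyset pagination: fetch the page after the last row previously seen
SELECT id, status, created_at
FROM orders
WHERE status = 'pending'
  AND (created_at, id) < ('2024-03-01 12:00:00', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 100;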

Step 7: Use Covering Indexes

Create indexes that include all query columns:

-- Query filters on status and created_at and reads customer_id and total_amount
CREATE INDEX idx_orders_covering ON orders(
    status,
    created_at,
    customer_id,
    total_amount
);

-- Now this query can use index-only scan
SELECT customer_id, status, total_amount
FROM orders
WHERE status = 'pending'
  AND created_at > '2024-01-01';
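
To confirm the index-only access path, check the plan (exact output varies by version and data distribution):

-- MySQL: look for "Using index" in the Extra column
-- PostgreSQL: look for an "Index Only Scan" node
EXPLAIN SELECT customer_id, status, total_amount
FROM orders
WHERE status = 'pending'
  AND created_at > '2024-01-01';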

Step 8: Optimize Aggregations

Use proper indexing for GROUP BY:

-- Add index for GROUP BY columns
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);

-- Now this query is much faster
SELECT customer_id, status, COUNT(*) as order_count
FROM orders
GROUP BY customer_id, status;

Use summary tables for complex aggregations:

-- Create summary table
CREATE TABLE daily_order_stats (
    stat_date DATE PRIMARY KEY,
    total_orders INT,
    total_amount DECIMAL(15,2),
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Update summary daily
INSERT INTO daily_order_stats (stat_date, total_orders, total_amount)
SELECT DATE(created_at), COUNT(*), SUM(total_amount)
FROM orders
WHERE created_at >= CURDATE() AND created_at < CURDATE() + INTERVAL 1 DAY
GROUP BY DATE(created_at)
ON DUPLICATE KEY UPDATE
    total_orders = VALUES(total_orders),
    total_amount = VALUES(total_amount),
    updated_at = CURRENT_TIMESTAMP;
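
One way to schedule the refresh inside MySQL itself is the event scheduler (the event name here is hypothetical; event_scheduler must be ON):

CREATE EVENT IF NOT EXISTS ev_refresh_daily_order_stats
ON SCHEDULE EVERY 1 HOUR
DO
    INSERT INTO daily_order_stats (stat_date, total_orders, total_amount)
    SELECT DATE(created_at), COUNT(*), SUM(total_amount)
    FROM orders
    WHERE created_at >= CURDATE() AND created_at < CURDATE() + INTERVAL 1 DAY
    GROUP BY DATE(created_at)
    ON DUPLICATE KEY UPDATE
        total_orders = VALUES(total_orders),
        total_amount = VALUES(total_amount),
        updated_at = CURRENT_TIMESTAMP;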

Step 9: Avoid SELECT *

Select only needed columns:

-- Slow: Returns all columns
SELECT * FROM orders WHERE customer_id = 12345;

-- Fast: Returns only needed columns
SELECT id, order_number, status, total_amount, created_at
FROM orders
WHERE customer_id = 12345;

Step 10: Update Statistics

Keep statistics current for optimizer:

-- MySQL: Analyze tables
ANALYZE TABLE orders, customers, products;

-- PostgreSQL: Analyze tables
ANALYZE orders;
ANALYZE customers;
ANALYZE products;

-- Or analyze entire database
ANALYZE;

Advanced Optimization Techniques

Query Caching

MySQL query cache (deprecated in MySQL 5.7.20 and removed in MySQL 8.0):

-- Check if query cache is enabled (MySQL 5.7)
SHOW VARIABLES LIKE 'query_cache%';

-- For MySQL 8.0+, use application-level caching (Redis, Memcached)

PostgreSQL prepared statements:

-- Prepare statement for reuse
PREPARE order_lookup (int) AS
    SELECT * FROM orders WHERE id = $1;

-- Execute prepared statement
EXECUTE order_lookup(12345);

-- Deallocate when done
DEALLOCATE order_lookup;

Partitioning Large Tables

MySQL table partitioning:

-- Partition orders by date range
CREATE TABLE orders (
    id INT NOT NULL AUTO_INCREMENT,
    customer_id INT,
    status VARCHAR(20),
    total_amount DECIMAL(10,2),
    created_at DATETIME,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p2025 VALUES LESS THAN (2026),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

PostgreSQL declarative partitioning:

-- Create partitioned table
CREATE TABLE orders (
    id SERIAL,
    customer_id INT,
    status VARCHAR(20),
    total_amount DECIMAL(10,2),
    created_at TIMESTAMP
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Create indexes on partitions
CREATE INDEX idx_orders_2024_q1_customer ON orders_2024_q1(customer_id);
CREATE INDEX idx_orders_2024_q2_customer ON orders_2024_q2(customer_id);
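
Queries that bound the partition key should touch only the matching partitions (partition pruning). A quick check:

-- The plan should scan only orders_2024_q1, not every partition
EXPLAIN SELECT COUNT(*) FROM orders
WHERE created_at >= '2024-02-01' AND created_at < '2024-03-01';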

Materialized Views

PostgreSQL materialized views:

-- Create materialized view for complex aggregation
CREATE MATERIALIZED VIEW customer_order_summary AS
SELECT
    c.id,
    c.name,
    c.email,
    COUNT(o.id) as total_orders,
    SUM(o.total_amount) as total_spent,
    MAX(o.created_at) as last_order_date
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name, c.email;

-- Create a unique index (required for REFRESH ... CONCURRENTLY below)
CREATE UNIQUE INDEX idx_customer_summary_id ON customer_order_summary(id);

-- Query materialized view (very fast)
SELECT * FROM customer_order_summary WHERE id = 12345;

-- Refresh materialized view
REFRESH MATERIALIZED VIEW customer_order_summary;

-- Refresh without blocking reads (PostgreSQL 9.4+; requires the unique index above)
REFRESH MATERIALIZED VIEW CONCURRENTLY customer_order_summary;

Query Optimization Hints

MySQL optimizer hints (8.0+):

-- Force index usage (the INDEX hint requires MySQL 8.0.20+)
SELECT /*+ INDEX(orders idx_orders_status) */
    * FROM orders
WHERE status = 'pending';

-- Force JOIN order
SELECT /*+ JOIN_ORDER(o, c) */
    o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Use a specific JOIN algorithm (HASH_JOIN hint honored only in MySQL 8.0.18;
-- later versions control hash joins with BNL/NO_BNL)
SELECT /*+ HASH_JOIN(o, c) */
    o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;

Monitoring and Maintenance

Create Performance Monitoring Script

MySQL monitoring script:

#!/bin/bash
# MySQL Performance Monitor

mysql -e "
SELECT 'Slow Queries' as Metric, variable_value as Value
FROM performance_schema.global_status WHERE variable_name = 'Slow_queries'
UNION ALL
SELECT 'Questions', variable_value
FROM performance_schema.global_status WHERE variable_name = 'Questions'
UNION ALL
SELECT 'Threads Connected', variable_value
FROM performance_schema.global_status WHERE variable_name = 'Threads_connected'
UNION ALL
SELECT 'Table Scans', variable_value
FROM performance_schema.global_status WHERE variable_name = 'Select_scan';

SELECT
    SUBSTRING(DIGEST_TEXT, 1, 80) as query,
    COUNT_STAR as executions,
    ROUND(AVG_TIMER_WAIT / 1000000000, 2) as avg_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST_TEXT IS NOT NULL
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 5;
"

PostgreSQL monitoring script:

#!/bin/bash
# PostgreSQL Performance Monitor

sudo -u postgres psql << 'EOF'
\echo '=== Database Statistics ==='
SELECT
    'Active Connections' as metric,
    COUNT(*) as value
FROM pg_stat_activity
WHERE state = 'active'
UNION ALL
SELECT
    'Cache Hit Ratio',
    ROUND(100.0 * sum(blks_hit) / NULLIF(sum(blks_hit + blks_read), 0), 2)
FROM pg_stat_database;

\echo ''
\echo '=== Slowest Queries ==='
SELECT
    SUBSTRING(query, 1, 60) as query,
    calls,
    ROUND(mean_exec_time::numeric, 2) as avg_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;
EOF

Regular Maintenance Tasks

Schedule regular database maintenance:

#!/bin/bash
# MySQL maintenance script
echo "Starting MySQL maintenance: $(date)"

# Optimize tables
mysql -e "OPTIMIZE TABLE orders, customers, products;"

# Update statistics
mysql -e "ANALYZE TABLE orders, customers, products;"

# Check for unused indexes
mysql << 'EOF'
SELECT
    object_schema,
    object_name,
    index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
  AND count_star = 0
  AND object_schema NOT IN ('mysql', 'performance_schema');
EOF

echo "MySQL maintenance completed: $(date)"

#!/bin/bash
# PostgreSQL maintenance script
echo "Starting PostgreSQL maintenance: $(date)"

# Vacuum and analyze
sudo -u postgres psql << 'EOF'
VACUUM ANALYZE orders;
VACUUM ANALYZE customers;
VACUUM ANALYZE products;

-- List the largest tables (candidates for bloat review)
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;

-- Check for unused indexes
SELECT
    schemaname,
    tablename,
    indexname,
    idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexname NOT LIKE '%_pkey';
EOF

echo "PostgreSQL maintenance completed: $(date)"

Troubleshooting

High CPU Usage

Issue: Database consuming excessive CPU

Solution: Identify CPU-intensive queries:

-- MySQL
SELECT
    DIGEST_TEXT as query,
    COUNT_STAR as exec_count,
    SUM_TIMER_WAIT / 1000000000000 as total_time_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

-- PostgreSQL
SELECT query, calls, total_exec_time / 1000 as total_sec
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Slow JOIN Performance

Issue: Queries with JOINs are very slow

Solution: Check and create indexes on JOIN columns:

-- Verify indexes exist
SHOW INDEX FROM orders WHERE Column_name = 'customer_id';

-- Create missing indexes
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);

-- Use EXPLAIN to verify index usage
EXPLAIN SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id;

Table Lock Contention

Issue: Queries waiting for table locks

Solution: Identify and resolve locks:

-- MySQL: Show locked tables
SHOW OPEN TABLES WHERE In_use > 0;

-- Show processlist for blocking queries
SHOW FULL PROCESSLIST;

-- Kill blocking query
KILL <process_id>;

-- PostgreSQL: Show locks
SELECT * FROM pg_locks WHERE NOT granted;

-- Find blocking queries
SELECT
    blocked_activity.pid AS blocked_pid,
    blocked_activity.query AS blocked_query,
    blocking_activity.pid AS blocking_pid,
    blocking_activity.query AS blocking_query
FROM pg_stat_activity AS blocked_activity
JOIN pg_stat_activity AS blocking_activity
    ON blocking_activity.pid = ANY(pg_blocking_pids(blocked_activity.pid));
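
Once the blocking PID is known, PostgreSQL's equivalents of KILL are (superuser or pg_signal_backend membership required):

-- Politely cancel the blocking query
SELECT pg_cancel_backend(<blocking_pid>);

-- Forcefully terminate the blocking connection
SELECT pg_terminate_backend(<blocking_pid>);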

Missing Index Detection

Issue: Unsure which indexes to create

Solution: Identify missing indexes:

-- MySQL: Queries not using indexes
SELECT
    DIGEST_TEXT,
    SUM_NO_INDEX_USED,
    SUM_NO_GOOD_INDEX_USED
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_NO_INDEX_USED > 0 OR SUM_NO_GOOD_INDEX_USED > 0
ORDER BY SUM_NO_INDEX_USED DESC;

-- PostgreSQL: Sequential scans on large tables
SELECT
    schemaname,
    tablename,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_tup_read / seq_scan as avg_seq_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;

Best Practices Summary

1. Index Strategically

  • Create indexes on foreign keys and JOIN columns
  • Index columns in WHERE, ORDER BY, and GROUP BY clauses
  • Use composite indexes for multi-column queries
  • Don't over-index - each index has write overhead

2. Write Efficient Queries

  • Select only needed columns (avoid SELECT *)
  • Use LIMIT to restrict result sets
  • Avoid functions on indexed columns in WHERE
  • Use JOINs instead of subqueries when possible

3. Monitor Continuously

  • Enable and review slow query logs
  • Use Performance Schema / pg_stat_statements
  • Track query execution time trends
  • Set up alerts for performance degradation

4. Regular Maintenance

  • Update table statistics weekly
  • Optimize/vacuum tables monthly
  • Review and remove unused indexes
  • Archive old data to keep tables manageable

5. Test Before Production

  • Test all optimizations in staging environment
  • Use production-like data volumes
  • Benchmark performance improvements
  • Plan rollback procedures

6. Document Changes

  • Record all index creations
  • Document query optimizations
  • Maintain baseline performance metrics
  • Track schema changes

Conclusion

Query optimization is an ongoing process that requires continuous monitoring, analysis, and refinement. By following this comprehensive guide, you've learned how to identify slow queries using various monitoring tools, analyze query execution plans to understand performance bottlenecks, create effective indexes that dramatically improve query performance, rewrite queries for better efficiency, implement advanced optimization techniques, and maintain optimal database performance over time.

The key to maintaining high-performance databases is establishing a systematic approach to query optimization, regularly monitoring query performance and resource usage, proactively identifying and addressing bottlenecks before they impact users, maintaining up-to-date statistics and well-designed indexes, and continuously reviewing and optimizing as data volumes and access patterns change.

Remember that premature optimization can be counterproductive. Focus on optimizing queries that have the greatest impact on user experience or system resources. Use actual performance data to guide optimization efforts, test all changes thoroughly before deploying to production, document your optimizations and their impact, and maintain a balance between read and write performance.

As your database grows and evolves, revisit your optimization strategies regularly. New data patterns may require different indexes, growing data volumes may necessitate partitioning, changing access patterns may obsolete existing optimizations, and new database features may offer better solutions. Stay informed about database best practices and new features, continue to measure and monitor performance, and maintain a culture of performance awareness throughout your development team. With the foundation built from this guide, you're well-equipped to maintain fast, efficient database systems that scale with your application's needs.