Fluent Bit: A Lightweight Log Processor and Forwarder
Fluent Bit is a fast, lightweight log processor and forwarder written in C, designed for high-throughput log collection with minimal CPU and memory usage. This guide covers deploying Fluent Bit on Linux, configuring input plugins, parsing rules, filters, output destinations, and Kubernetes integration for memory-efficient log forwarding.
Prerequisites
- Ubuntu 20.04+ / Debian 11+ or CentOS 8+ / Rocky Linux 8+
- 64 MB RAM (typical production usage is under 5 MB)
- Root or sudo access
- Log sources: systemd, files, TCP/UDP syslog, etc.
Installing Fluent Bit
# Ubuntu 22.04/24.04
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
# Or manually via APT
curl -fsSL https://packages.fluentbit.io/fluentbit.key | sudo gpg --dearmor -o /usr/share/keyrings/fluentbit-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/ubuntu/$(lsb_release -cs) $(lsb_release -cs) main" \
| sudo tee /etc/apt/sources.list.d/fluent-bit.list
sudo apt-get update && sudo apt-get install -y fluent-bit
# CentOS/Rocky Linux
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
# or:
sudo rpm --import https://packages.fluentbit.io/fluentbit.key
sudo tee /etc/yum.repos.d/fluent-bit.repo > /dev/null << 'EOF'
[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/$releasever/$basearch/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
EOF
sudo dnf install -y fluent-bit
# Verify installation
fluent-bit --version
# Enable and start
sudo systemctl enable --now fluent-bit
sudo systemctl status fluent-bit
Configuration Structure
Fluent Bit uses a classic INI-like format (.conf) or YAML. In classic mode, keys must be indented under their section header, and comments (#) should start their own line. Config files live in /etc/fluent-bit/:
/etc/fluent-bit/
├── fluent-bit.conf # Main config
├── parsers.conf # Parser definitions
└── plugins.conf # External plugins
The main config has sections: [SERVICE], [INPUT], [FILTER], [OUTPUT]. Data flows from INPUT through FILTERs to OUTPUT.
# /etc/fluent-bit/fluent-bit.conf
[SERVICE]
    # Flush buffered records every 5 seconds
    Flush        5
    # One of: debug, info, warn, error
    Log_Level    info
    # systemd manages the process, so don't daemonize
    Daemon       Off
    Parsers_File parsers.conf
    # Built-in HTTP server exposing metrics on port 2020
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020

[INPUT]
    Name   tail
    Path   /var/log/nginx/access.log
    Tag    nginx.access
    Parser nginx

[OUTPUT]
    Name  stdout
    Match *
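The same pipeline can also be written in Fluent Bit's YAML format. A sketch of the equivalent config (the filename is an example; key names mirror the classic ones in lowercase):

# /etc/fluent-bit/fluent-bit.yaml
service:
  flush: 5
  log_level: info
  parsers_file: parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020

pipeline:
  inputs:
    - name: tail
      path: /var/log/nginx/access.log
      tag: nginx.access
      parser: nginx
  outputs:
    - name: stdout
      match: '*'

Start it with fluent-bit -c /etc/fluent-bit/fluent-bit.yaml; the file extension selects the format.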
Input Plugins
Tail (follow log files):
[INPUT]
    Name             tail
    Path             /var/log/app/*.log
    # Add the source filename to each record
    Path_Key         filename
    Tag              app.*
    Parser           json
    Refresh_Interval 10
    # Seconds to keep monitoring a file after rotation
    Rotate_Wait      30
    Skip_Long_Lines  On
    Mem_Buf_Limit    50MB
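By default, tail re-reads according to its start position on every restart. To persist read offsets across restarts, point the DB option at an SQLite file (a sketch; the path is an example):

[INPUT]
    Name       tail
    Path       /var/log/app/*.log
    Tag        app.*
    # Persist file offsets across restarts
    DB         /var/lib/fluent-bit/tail-app.db
    DB.locking true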
Systemd journal:
[INPUT]
    Name              systemd
    Tag               systemd.*
    Systemd_Filter    _SYSTEMD_UNIT=nginx.service
    Systemd_Filter    _SYSTEMD_UNIT=postgresql.service
    Read_From_Tail    On
    # Remove the leading _ from journal field names
    Strip_Underscores On
TCP/UDP Syslog:
[INPUT]
    Name   syslog
    Parser syslog-rfc5424
    Listen 0.0.0.0
    Port   5140
    Mode   tcp
    Tag    syslog
Forward (receive from Fluentd/other Fluent Bit):
[INPUT]
    Name              forward
    Listen            0.0.0.0
    Port              24224
    Buffer_Chunk_Size 1M
    Buffer_Max_Size   6M
Parsing Logs
Define parsers in /etc/fluent-bit/parsers.conf:
[PARSER]
    Name        nginx
    Format      regex
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name        json
    Format      json
    Time_Key    timestamp
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep   On

[PARSER]
    Name        syslog-rfc5424
    Format      regex
    Regex       ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
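For stack traces and other events that span several lines, newer Fluent Bit releases support [MULTILINE_PARSER] definitions in the same parsers file. A sketch for Java-style stack traces (the name and regexes are illustrative):

[MULTILINE_PARSER]
    name          java_multiline
    type          regex
    flush_timeout 1000
    # rules: current state, regex, next state
    rule "start_state" "/^\d{4}-\d{2}-\d{2}/"    "cont"
    rule "cont"        "/^\s+(at|Caused by)/"    "cont"

Reference it from a tail input with multiline.parser java_multiline so continuation lines are folded into the preceding record.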
Filtering
Grep filter (include/exclude records):
# Keep only error-level logs
[FILTER]
    Name  grep
    Match app.*
    Regex level (error|critical|fatal)

# Exclude health check endpoints
[FILTER]
    Name    grep
    Match   nginx.*
    Exclude path /health
Record Modifier (add/remove fields):
[FILTER]
    Name       record_modifier
    Match      *
    Record     hostname ${HOSTNAME}
    Record     environment production
    Remove_key password
    Remove_key secret
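The modify filter covers similar add/remove operations and also renames fields; a sketch (the field names are examples):

[FILTER]
    Name   modify
    Match  app.*
    # Rename, add, and drop fields in one pass
    Rename msg message
    Add    source fluent-bit
    Remove debug_info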
Lua filter (custom logic):
[FILTER]
    Name   lua
    Match  app.*
    Script /etc/fluent-bit/transform.lua
    Call   transform_record
-- /etc/fluent-bit/transform.lua
function transform_record(tag, timestamp, record)
    -- Normalize log level
    if record["level"] then
        record["level"] = string.upper(record["level"])
    end
    -- Add derived fields
    if record["duration_ms"] and record["duration_ms"] > 1000 then
        record["slow_request"] = true
    end
    return 1, timestamp, record
end
Throttle filter (rate limiting):
[FILTER]
    Name         throttle
    Match        *
    # Allow an average of 100 records per 1-second interval,
    # measured over a sliding window of 5 intervals
    Rate         100
    Interval     1s
    Window       5
    Print_Status On
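To route only a subset of records to a different destination, the rewrite_tag filter re-emits matching records under a new tag; a sketch (the key and tag names are examples):

[FILTER]
    Name  rewrite_tag
    Match app.*
    # Rule: KEY REGEX NEW_TAG KEEP
    Rule  $level ^(error|fatal)$ errors.$TAG false

An [OUTPUT] with Match errors.* then receives only those re-tagged records, while KEEP false drops them from the original stream.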
Output Destinations
Elasticsearch / OpenSearch:
[OUTPUT]
    Name               es
    Match              *
    Host               elasticsearch.example.com
    Port               9200
    Index              logs
    # Elasticsearch 8+ rejects the legacy _type field
    Suppress_Type_Name On
    Logstash_Format    On
    Logstash_Prefix    fluentbit
    Time_Key           @timestamp
    HTTP_User          elastic
    HTTP_Passwd        yourpassword
    tls                On
    tls.verify         On
Grafana Loki:
[OUTPUT]
    Name        loki
    Match       *
    Host        loki.example.com
    Port        3100
    Labels      job=fluent-bit,env=production,host=${HOSTNAME}
    Label_Keys  $app,$level
    Line_Format json
S3 (archive):
[OUTPUT]
    Name            s3
    Match           *
    Region          us-east-1
    Bucket          my-log-archive
    Total_File_Size 100M
    Upload_Timeout  10m
    S3_Key_Format   /logs/%Y/%m/%d/%H/$UUID.gz
    Compression     gzip
    Content_Type    application/json
Forward to another Fluent Bit or Fluentd:
[OUTPUT]
    Name          forward
    Match         *
    Host          aggregator.internal
    Port          24224
    Shared_Key    mysecretkey
    Self_Hostname ${HOSTNAME}
    tls           On
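Delivery retries are configurable per output; combined with a cap on buffered chunks (when filesystem storage is enabled), this bounds disk use during a long downstream outage. A sketch extending the forward output above:

[OUTPUT]
    Name          forward
    Match         *
    Host          aggregator.internal
    Port          24224
    # Retry a failed chunk up to 5 times before dropping it
    Retry_Limit   5
    # Cap on-disk buffered data destined for this output
    storage.total_limit_size 500M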
Kubernetes Integration
Deploy Fluent Bit as a DaemonSet in Kubernetes using the official Helm chart:
# Add Helm repo
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# Create values override
cat > fluent-bit-values.yaml << 'EOF'
config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        Tag               kube.*
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
  outputs: |
    [OUTPUT]
        Name   loki
        Match  kube.*
        Host   loki.monitoring.svc.cluster.local
        Port   3100
        Labels job=fluent-bit,namespace=$kubernetes['namespace_name'],pod=$kubernetes['pod_name']
tolerations:
  # Schedule on all nodes, including control-plane nodes
  - operator: Exists
EOF
helm install fluent-bit fluent/fluent-bit \
--namespace monitoring \
--create-namespace \
--values fluent-bit-values.yaml
Troubleshooting
Check metrics and pipeline health:
# JSON internal metrics
curl http://localhost:2020/api/v1/metrics
# Prometheus-format metrics (for scraping)
curl http://localhost:2020/api/v1/metrics/prometheus
curl http://localhost:2020/api/v1/uptime
# Output health (requires Health_Check On in [SERVICE])
curl http://localhost:2020/api/v1/health
Debug log output:
# Run interactively with debug logging
fluent-bit -c /etc/fluent-bit/fluent-bit.conf -v
# Or change Log_Level in [SERVICE] to debug
sudo systemctl restart fluent-bit && journalctl -u fluent-bit -f
Logs not being picked up (tail input):
# When the tail input has a DB option set, Fluent Bit tracks file
# offsets in an SQLite file; delete it to re-read from the beginning
# (already-shipped lines will be sent again)
sudo find /var/lib/fluent-bit/ -name "*.db" -delete
sudo systemctl restart fluent-bit
High memory usage:
# Reduce Mem_Buf_Limit on inputs and enable filesystem buffering
# for backpressure (requires storage.path in [SERVICE])
[SERVICE]
    storage.path /var/lib/fluent-bit/storage

[INPUT]
    Name          tail
    # Buffer chunks on disk instead of RAM
    storage.type  filesystem
    Mem_Buf_Limit 5MB
Parser not matching:
# Test a parser against a sample line. The stdin input expects JSON,
# so wrap the raw line in a record under a "log" key, and load the
# parser definitions with -R
echo '{"log": "127.0.0.1 - - [01/Jan/2024:12:00:00 +0000] \"GET / HTTP/1.1\" 200 1234"}' \
  | fluent-bit -R /etc/fluent-bit/parsers.conf -i stdin -p tag=test \
      -F parser -p key_name=log -p parser=nginx \
      -o stdout -f 1
Conclusion
Fluent Bit's C implementation makes it ideal for resource-constrained environments where heavier agents like Logstash would add unacceptable overhead. With its plugin system covering dozens of inputs and outputs, Lua scripting for custom transforms, and native Kubernetes metadata enrichment, Fluent Bit handles the full log forwarding pipeline for both small VPS deployments and large Kubernetes clusters. Configure filesystem buffering and appropriate memory limits to ensure reliable delivery even during downstream outages.


