Agent
#1. Global
#1.1 Enabled
Tags:
hot_update
FQCN:
global.enabled
Upgrade from old version: enabled
Default value:
global:
  enabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Disables or enables the deepflow-agent.
#1.2 Limits
Resource limitations
#1.2.1 CPU Limit
Tags:
hot_update
FQCN:
global.limits.max_millicpus
Upgrade from old version: max_millicpus
Default value:
global:
  limits:
    max_millicpus: 1000
Schema:
Key | Value |
---|---|
Type | int |
Unit | Logical Milli Cores |
Range | [1, 100000] |
Description:
deepflow-agent uses cgroups to limit CPU usage. 1 millicpu = 1 millicore = 0.001 core.
#1.2.2 CPU Limit (Cores)
Tags:
deprecated
FQCN:
global.limits.max_cpus
Upgrade from old version: max_cpus
Default value:
global:
  limits:
    max_cpus: 1
Schema:
Key | Value |
---|---|
Type | int |
#1.2.3 Memory Limit
Tags:
hot_update
FQCN:
global.limits.max_memory
Upgrade from old version: max_memory
Default value:
global:
  limits:
    max_memory: 768
Schema:
Key | Value |
---|---|
Type | int |
Unit | MiB |
Range | [128, 100000] |
Description:
deepflow-agent uses cgroups to limit memory usage.
#1.2.4 Maximum Log Backhaul Rate
Tags:
hot_update
FQCN:
global.limits.max_log_backhaul_rate
Upgrade from old version: log_threshold
Default value:
global:
  limits:
    max_log_backhaul_rate: 300
Schema:
Key | Value |
---|---|
Type | int |
Unit | Lines/Hour |
Range | [0, 10000] |
Description:
The maximum rate at which deepflow-agent sends logs to deepflow-server. 0 means no limit.
#1.2.5 Maximum Local Log File Size
Tags:
hot_update
FQCN:
global.limits.max_local_log_file_size
Upgrade from old version: log_file_size
Default value:
global:
  limits:
    max_local_log_file_size: 1000
Schema:
Key | Value |
---|---|
Type | int |
Unit | MiB |
Range | [10, 10000] |
Description:
The maximum disk space allowed for deepflow-agent log files.
#1.2.6 Local Log Retention
Tags:
hot_update
FQCN:
global.limits.local_log_retention
Upgrade from old version: log_retention
Default value:
global:
  limits:
    local_log_retention: 300d
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10d', '10000d'] |
Description:
The retention time for deepflow-agent log files.
#1.3 Alerts
#1.3.1 Thread Limit
Tags:
hot_update
FQCN:
global.alerts.thread_threshold
Upgrade from old version: thread_threshold
Default value:
global:
  alerts:
    thread_threshold: 500
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 1000] |
Description:
Maximum number of threads that deepflow-agent is allowed to launch.
#1.3.2 Process Limit
Tags:
hot_update
FQCN:
global.alerts.process_threshold
Upgrade from old version: process_threshold
Default value:
global:
  alerts:
    process_threshold: 10
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 100] |
Description:
Maximum number of processes that deepflow-agent is allowed to launch.
#1.3.3 Core File Checker
Tags:
agent_restart deprecated
FQCN:
global.alerts.check_core_file_disabled
Upgrade from old version: static_config.check-core-file-disabled
Default value:
global:
  alerts:
    check_core_file_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When the host has an invalid NFS file system, or Docker is running, the program sometimes hangs while checking the core file. This switch disables the core file check to prevent the process from hanging. Additional links:
- https://serverfault.com/questions/367438/ls-hangs-for-a-certain-directory
- https://unix.stackexchange.com/questions/495854/processes-hanging-when-trying-to-access-a-file
#1.4 Circuit Breakers
#1.4.1 System Free Memory Percentage
Calculation Method: (free_memory / total_memory) * 100%
#1.4.1.1 Trigger Threshold
Tags:
hot_update
FQCN:
global.circuit_breakers.sys_memory_percentage.trigger_threshold
Upgrade from old version: sys_free_memory_limit
Default value:
global:
  circuit_breakers:
    sys_memory_percentage:
      trigger_threshold: 0
Schema:
Key | Value |
---|---|
Type | int |
Unit | % |
Range | [0, 100] |
Description:
Setting trigger_threshold (formerly sys_free_memory_limit) to 0 indicates that the system free memory ratio is not checked.
- When the current system free memory ratio is below trigger_threshold * 70%, the agent will automatically restart.
- When the current system free memory ratio is below trigger_threshold but above trigger_threshold * 70%, the agent enters the disabled state.
- When the current system free memory ratio remains above trigger_threshold * 110%, the agent recovers from the disabled state.
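The three rules above can be sketched as a small decision function. This is only an illustration of the documented thresholds, not the agent's implementation: the function name and return labels are hypothetical, and the sketch ignores how long the ratio must "remain" above the recovery level.

```python
def memory_breaker_action(free_pct: float, trigger_threshold: int, disabled: bool) -> str:
    """Map a free-memory percentage to the action described in the text.

    free_pct          -- (free_memory / total_memory) * 100
    trigger_threshold -- configured threshold in percent; 0 disables the check
    disabled          -- whether the agent is currently in the disabled state
    """
    if trigger_threshold == 0:
        return "none"  # the free memory ratio is not checked
    # Integer-scaled comparisons avoid floating-point rounding at the boundaries.
    if free_pct * 100 < trigger_threshold * 70:
        return "restart"  # below 70% of the threshold
    if free_pct < trigger_threshold:
        return "disable"  # below the threshold, but not below 70% of it
    if disabled and free_pct * 100 > trigger_threshold * 110:
        return "recover"  # above 110% of the threshold
    return "none"
```

With a threshold of 10, a free-memory ratio of 5% maps to a restart, 8% to the disabled state, and 12% (while disabled) to recovery.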
#1.4.1.2 Metric
Tags:
hot_update
FQCN:
global.circuit_breakers.sys_memory_percentage.metric
Upgrade from old version: sys_free_memory_metric
Default value:
global:
  circuit_breakers:
    sys_memory_percentage:
      metric: free
Schema:
Key | Value |
---|---|
Type | string |
Description:
deepflow-agent observes the percentage of this memory metric.
#1.4.2 Relative System Load
Calculation Method: system_load / total_cpu_cores
#1.4.2.1 Trigger Threshold
Tags:
hot_update
FQCN:
global.circuit_breakers.relative_sys_load.trigger_threshold
Upgrade from old version: system_load_circuit_breaker_threshold
Default value:
global:
  circuit_breakers:
    relative_sys_load:
      trigger_threshold: 1.0
Schema:
Key | Value |
---|---|
Type | float |
Range | [0, 10] |
Description:
When the load of the Linux system divided by the number of CPU cores exceeds this value, the agent automatically enters the disabled state. It automatically recovers once the value stays below 90% of this threshold for 5 consecutive minutes. Setting it to 0 disables this feature.
#1.4.2.2 Recovery Threshold
Tags:
hot_update
FQCN:
global.circuit_breakers.relative_sys_load.recovery_threshold
Upgrade from old version: system_load_circuit_breaker_recover
Default value:
global:
  circuit_breakers:
    relative_sys_load:
      recovery_threshold: 0.9
Schema:
Key | Value |
---|---|
Type | float |
Range | [0, 10] |
Description:
When the system load of the Linux system divided by the number of CPU cores stays below this value for 5 consecutive minutes, the agent recovers from the circuit breaker disabled state. Setting it to 0 turns off the circuit breaker feature.
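The interplay of the trigger and recovery thresholds can be sketched as a small state machine. This is a hedged illustration only: the class name, the `recovery_samples` parameter, and the assumption that 5 minutes corresponds to 30 samples at a 10-second check interval are all hypothetical and not taken from the agent's source.

```python
from dataclasses import dataclass


@dataclass
class LoadBreaker:
    trigger_threshold: float = 1.0
    recovery_threshold: float = 0.9
    recovery_samples: int = 30  # assumption: 5 minutes / 10-second interval
    disabled: bool = False
    below_count: int = 0  # consecutive samples below recovery_threshold

    def observe(self, system_load: float, cpu_cores: int) -> bool:
        """Feed one load sample; return True while the agent is disabled."""
        if self.trigger_threshold == 0 or self.recovery_threshold == 0:
            self.disabled = False  # a threshold of 0 turns the feature off
            return False
        relative = system_load / cpu_cores
        if not self.disabled:
            if relative > self.trigger_threshold:
                self.disabled = True
                self.below_count = 0
        elif relative < self.recovery_threshold:
            self.below_count += 1
            if self.below_count >= self.recovery_samples:
                self.disabled = False
        else:
            self.below_count = 0  # the streak must be continuous
        return self.disabled
```

On an 8-core host, a load of 9.0 gives a relative load of 1.125 and trips the breaker; recovery then requires an unbroken run of samples below 0.9.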
#1.4.2.3 Metric
Tags:
hot_update
FQCN:
global.circuit_breakers.relative_sys_load.metric
Upgrade from old version: system_load_circuit_breaker_metric
Default value:
global:
  circuit_breakers:
    relative_sys_load:
      metric: load15
Enum options:
Value | Note |
---|---|
load1 | |
load5 | |
load15 | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
The system load circuit breaker mechanism uses this metric, and the agent will check this metric every 10 seconds by default.
#1.4.3 Tx Throughput
#1.4.3.1 Trigger Threshold
Tags:
hot_update
ee_feature
FQCN:
global.circuit_breakers.tx_throughput.trigger_threshold
Upgrade from old version: max_tx_bandwidth
Default value:
global:
  circuit_breakers:
    tx_throughput:
      trigger_threshold: 0
Schema:
Key | Value |
---|---|
Type | int |
Unit | Mbps |
Range | [0, 100000] |
Description:
When the outbound throughput of the NPB interface reaches or exceeds the threshold, the broker will be stopped. After that, the broker will be resumed if the throughput stays lower than (trigger_threshold - outputs.npb.max_npb_throughput)*90% for 5 consecutive monitoring intervals.
Attention: When configuring this value, it must be greater than outputs.npb.max_npb_throughput. Setting it to 0 disables this feature.
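As a worked example of the resume formula, the following sketch computes the throughput below which the broker would be resumed. The function name is hypothetical and the sample values are illustrative, not defaults taken from this document.

```python
from typing import Optional


def npb_resume_threshold_mbps(trigger_threshold: int, max_npb_throughput: int) -> Optional[float]:
    """Return (trigger_threshold - max_npb_throughput) * 90%, in Mbps.

    Returns None when trigger_threshold is 0, i.e. the feature is disabled.
    """
    if trigger_threshold == 0:
        return None
    if trigger_threshold <= max_npb_throughput:
        # The text requires trigger_threshold > outputs.npb.max_npb_throughput.
        raise ValueError("trigger_threshold must be greater than max_npb_throughput")
    return (trigger_threshold - max_npb_throughput) * 90 / 100
```

For instance, with a trigger threshold of 2000 Mbps and a maximum NPB throughput of 1000 Mbps, the broker would resume once throughput stays below 900 Mbps for 5 consecutive monitoring intervals.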
#1.4.3.2 Throughput Monitoring Interval
Tags:
hot_update
ee_feature
FQCN:
global.circuit_breakers.tx_throughput.throughput_monitoring_interval
Upgrade from old version: bandwidth_probe_interval
Default value:
global:
  circuit_breakers:
    tx_throughput:
      throughput_monitoring_interval: 10s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '60s'] |
Description:
Monitoring interval for outbound traffic rate of NPB interface.
#1.5 Tuning
#1.5.1 CPU Affinity
Tags:
agent_restart
FQCN:
global.tunning.cpu_affinity
Upgrade from old version: static_config.cpu-affinity
Default value:
global:
  tunning:
    cpu_affinity: []
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 65536] |
Description:
CPU affinity is the tendency of a process to run on a given CPU for as long as possible without being migrated to other processors. Example:
global:
  tunning:
    cpu_affinity: [1, 3, 5, 7, 9]
#1.5.2 Process Scheduling Priority
Tags:
agent_restart
FQCN:
global.tunning.process_scheduling_priority
Upgrade from old version: static_config.process-scheduling-priority
Default value:
global:
  tunning:
    process_scheduling_priority: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [-20, 19] |
Description:
The smaller the value of process scheduling priority, the higher the priority of the deepflow-agent process, and the larger the value, the lower the priority.
#1.5.3 Idle Memory Trimming
Tags:
agent_restart
FQCN:
global.tunning.idle_memory_trimming
Upgrade from old version: static_config.memory-trim-disabled
Default value:
global:
  tunning:
    idle_memory_trimming: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Proactive memory trimming can effectively reduce memory usage, but there may be performance loss.
#1.5.4 Resource Monitoring Interval
Tags:
agent_restart
FQCN:
global.tunning.resource_monitoring_interval
Upgrade from old version: static_config.guard-interval
Default value:
global:
  tunning:
    resource_monitoring_interval: 10s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '3600s'] |
Description:
The agent will monitor:
- System free memory
- Number of threads of the agent itself (obtained by reading the file information under the /proc directory)
- Size and number of log files generated by the agent
- System load
- Agent memory usage (to check whether memory trimming is needed)
#1.6 NTP Clock Synchronization
This synchronization mechanism does not alter the host's clock; it is only used internally by the deepflow-agent process.
#1.6.1 Enabled
Tags:
hot_update
FQCN:
global.ntp.enabled
Upgrade from old version: ntp_enabled
Default value:
global:
  ntp:
    enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to synchronize the clock with deepflow-server. This behavior does not change the time of the environment in which deepflow-agent runs.
#1.6.2 Maximum Drift
Tags:
agent_restart
FQCN:
global.ntp.max_drift
Upgrade from old version: static_config.ntp-max-interval
Default value:
global:
  ntp:
    max_drift: 300s
Schema:
Key | Value |
---|---|
Type | duration |
Range | [0, '365d'] |
Description:
When the clock drift exceeds this value, the agent will restart.
#1.6.3 Minimal Drift
Tags:
agent_restart
FQCN:
global.ntp.min_drift
Upgrade from old version: static_config.ntp-min-interval
Default value:
global:
  ntp:
    min_drift: 10s
Schema:
Key | Value |
---|---|
Type | duration |
Range | [0, '365d'] |
Description:
When the clock drift exceeds this value, the timestamp will be corrected.
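Taken together, min_drift and max_drift define three bands of clock drift. The sketch below illustrates that reading; the function name and return labels are hypothetical, and comparing the absolute value of the drift is an assumption rather than something this document states.

```python
def ntp_drift_action(drift_seconds: float, min_drift: float = 10, max_drift: float = 300) -> str:
    """Classify a measured clock drift against min_drift and max_drift (seconds)."""
    drift = abs(drift_seconds)  # assumption: the sign of the drift does not matter
    if drift > max_drift:
        return "restart"  # drift exceeds max_drift: the agent restarts
    if drift > min_drift:
        return "correct_timestamp"  # drift exceeds min_drift: timestamps are corrected
    return "none"  # drift is within tolerance
```

With the defaults shown above, a drift of 30 seconds leads to timestamp correction, while a drift of 10 minutes leads to a restart.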
#1.7 Communication
#1.7.1 Proactive Request Interval
Tags:
hot_update
FQCN:
global.communication.proactive_request_interval
Upgrade from old version: sync_interval
Default value:
global:
  communication:
    proactive_request_interval: 60s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10s', '3600s'] |
Description:
The interval at which deepflow-agent proactively requests configuration and tag information from deepflow-server.
#1.7.2 Maximum Escape Duration
Tags:
hot_update
FQCN:
global.communication.max_escape_duration
Upgrade from old version: max_escape_seconds
Default value:
global:
  communication:
    max_escape_duration: 3600s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['600s', '30d'] |
Description:
The maximum time that the agent is allowed to work normally when it cannot connect to the server. After the timeout, the agent automatically enters the disabled state.
#1.7.3 Controller IP Address
Tags:
hot_update
FQCN:
global.communication.controller_ip
Upgrade from old version: proxy_controller_ip
Default value:
global:
  communication:
    controller_ip: ''
Schema:
Key | Value |
---|---|
Type | ip |
Description:
When this value is set, deepflow-agent will use this IP to access the control plane port of deepflow-server, which is usually used when deepflow-server uses an external load balancer.
#1.7.4 Controller Port
Tags:
hot_update
FQCN:
global.communication.controller_port
Upgrade from old version: proxy_controller_port
Default value:
global:
  communication:
    controller_port: 30035
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
The control plane port used by deepflow-agent to access deepflow-server. The default port within the same K8s cluster is 20035, and the default port of deepflow-agent outside the cluster is 30035.
#1.7.5 Ingester IP Address
Tags:
hot_update
FQCN:
global.communication.ingester_ip
Upgrade from old version: analyzer_ip
Default value:
global:
  communication:
    ingester_ip: ''
Schema:
Key | Value |
---|---|
Type | ip |
Description:
When this value is set, deepflow-agent will use this IP to access the data plane port of deepflow-server, which is usually used when deepflow-server uses an external load balancer.
#1.7.6 Ingester Port
Tags:
hot_update
FQCN:
global.communication.ingester_port
Upgrade from old version: analyzer_port
Default value:
global:
  communication:
    ingester_port: 30033
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
The data plane port used by deepflow-agent to access deepflow-server. The default port within the same K8s cluster is 20033, and the default port of deepflow-agent outside the cluster is 30033.
#1.7.7 gRPC Socket Buffer Size
Tags:
agent_restart
FQCN:
global.communication.grpc_buffer_size
Upgrade from old version: static_config.grpc-buffer-size
Default value:
global:
  communication:
    grpc_buffer_size: 5
Schema:
Key | Value |
---|---|
Type | int |
Unit | MiB |
Range | [5, 1024] |
Description:
gRPC socket buffer size.
#1.7.8 Request via NAT IP Address
Tags:
hot_update
FQCN:
global.communication.request_via_nat_ip
Upgrade from old version: nat_ip_enabled
Default value:
global:
  communication:
    request_via_nat_ip: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Used when deepflow-agent uses an external IP address to access deepflow-server. For example, when deepflow-server is behind a NAT gateway, or the host where deepflow-server is located has multiple node IP addresses and different deepflow-agents need to access different node IPs, you can set an additional NAT IP for each deepflow-server address, and modify this value to true.
#1.8 Self Monitoring
#1.8.1 Log
#1.8.1.1 Log Level
Tags:
hot_update
FQCN:
global.self_monitoring.log.log_level
Upgrade from old version: log_level
Default value:
global:
  self_monitoring:
    log:
      log_level: INFO
Enum options:
Value | Note |
---|---|
DEBUG | |
INFO | |
WARNING | |
ERROR | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Log level of deepflow-agent.
#1.8.1.2 Log File
Tags:
agent_restart
FQCN:
global.self_monitoring.log.log_file
Upgrade from old version: static_config.log-file
Default value:
global:
  self_monitoring:
    log:
      log_file: /var/log/deepflow_agent/deepflow_agent.log
Schema:
Key | Value |
---|---|
Type | string |
Description:
Note that this configuration is only used in standalone mode.
#1.8.1.3 Log Backhaul Enabled
Tags:
hot_update
FQCN:
global.self_monitoring.log.log_backhaul_enabled
Upgrade from old version: rsyslog_enabled
Default value:
global:
  self_monitoring:
    log:
      log_backhaul_enabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When enabled, deepflow-agent will send its own logs to deepflow-server.
#1.8.2 Profile
#1.8.2.1 Enabled
Tags:
agent_restart deprecated
FQCN:
global.self_monitoring.profile.enabled
Upgrade from old version: static_config.profiler
Default value:
global:
  self_monitoring:
    profile:
      enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Only available for Trident (Golang version of Agent).
#1.8.3 Debug
#1.8.3.1 Enabled
Tags:
hot_update
FQCN:
global.self_monitoring.debug.enabled
Upgrade from old version: debug_enabled
Default value:
global:
  self_monitoring:
    debug:
      enabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Disables or enables the debug function of the deepflow-agent.
#1.8.3.2 Local UDP Port
Tags:
agent_restart
FQCN:
global.self_monitoring.debug.local_udp_port
Upgrade from old version: static_config.debug-listen-port
Default value:
global:
  self_monitoring:
    debug:
      local_udp_port: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 65535] |
Description:
The default value 0 means that a random client port number is used.
Only available for Trident (Golang version of Agent).
#1.8.3.3 Debug Metrics Enabled
Tags:
agent_restart deprecated
FQCN:
global.self_monitoring.debug.debug_metrics_enabled
Upgrade from old version: static_config.enable-debug-stats
Default value:
global:
  self_monitoring:
    debug:
      debug_metrics_enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Only available for Trident (Golang version of Agent).
#1.8.4 Hostname
Tags:
hot_update
FQCN:
global.self_monitoring.hostname
Upgrade from old version: host
Default value:
global:
  self_monitoring:
    hostname: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
Override statsd host tag.
#1.9 Standalone Mode
#1.9.1 Maximum Data File Size
Tags:
agent_restart
FQCN:
global.standalone_mode.max_data_file_size
Upgrade from old version: static_config.standalone-data-file-size
Default value:
global:
  standalone_mode:
    max_data_file_size: 200
Schema:
Key | Value |
---|---|
Type | int |
Unit | MiB |
Range | [1, 1000000] |
Description:
When deepflow-agent runs in standalone mode, it is not controlled by deepflow-server, and the collected data is only written to local files. The data types currently supported for writing are l4_flow_log and l7_flow_log, and each type is written to a separate file. This configuration specifies the maximum size of a data file; the file is rotated when it exceeds this size. A maximum of two files are kept for each type of data.
#1.9.2 Data File Directory
Tags:
agent_restart
FQCN:
global.standalone_mode.data_file_dir
Upgrade from old version: static_config.standalone-data-file-dir
Default value:
global:
  standalone_mode:
    data_file_dir: /var/log/deepflow_agent/
Schema:
Key | Value |
---|---|
Type | string |
Description:
Directory where data files are written to.
#1.10 Tags
Tags related to deepflow-agent.
#1.10.1 Region ID
Tags:
hot_update
FQCN:
global.tags.region_id
Upgrade from old version: region_id
Default value:
global:
  tags:
    region_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
Region ID of the deepflow-agent or Region ID of the data node.
#1.10.2 Pod cluster ID
Tags:
hot_update
FQCN:
global.tags.pod_cluster_id
Upgrade from old version: pod_cluster_id
Default value:
global:
  tags:
    pod_cluster_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
Cluster ID of the container where the deepflow-agent is located.
#1.10.3 VPC ID
Tags:
hot_update
FQCN:
global.tags.vpc_id
Upgrade from old version: epc_id
Default value:
global:
  tags:
    vpc_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
The ID of the VPC where the deepflow-agent is located. It is meaningful only for Workload-V/P and pod-V/P types.
#1.10.4 Agent ID
Tags:
hot_update
FQCN:
global.tags.agent_id
Upgrade from old version: vtap_id
Default value:
global:
  tags:
    agent_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 64000] |
Description:
Agent ID.
#1.10.5 Agent Type
Tags:
hot_update
FQCN:
global.tags.agent_type
Upgrade from old version: trident_type
Default value:
global:
  tags:
    agent_type: 0
Enum options:
Value | Note |
---|---|
0 | TT_UNKNOWN |
1 | TT_PROCESS, Agent in KVM |
2 | TT_VM, Agent in a dedicated VM on ESXi |
3 | TT_PUBLIC_CLOUD, Agent in Cloud host (VM) |
5 | TT_PHYSICAL_MACHINE, Agent in Cloud host (BM), or legacy host |
6 | TT_DEDICATED_PHYSICAL_MACHINE, Agent in a dedicated host to receive mirror traffic |
7 | TT_HOST_POD, Agent in K8s Node (Cloud BM, or legacy host) |
8 | TT_VM_POD, Agent in K8s Node (Cloud VM) |
9 | TT_TUNNEL_DECAPSULATION, Agent in a dedicated host to decap tunnel traffic |
10 | TT_HYPER_V_COMPUTE, Agent in Hyper-V Compute Node |
11 | TT_HYPER_V_NETWORK, Agent in Hyper-V Network Node |
12 | TT_K8S_SIDECAR, Agent in K8s POD |
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 12] |
Description:
Agent Type.
#1.10.6 Team ID
Tags:
hot_update
FQCN:
global.tags.team_id
Upgrade from old version: team_id
Default value:
global:
  tags:
    team_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
The ID of the team where the deepflow-agent is located.
#1.10.7 Organize ID
Tags:
hot_update
FQCN:
global.tags.organize_id
Upgrade from old version: organize_id
Default value:
global:
  tags:
    organize_id: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
The ID of the organization where the deepflow-agent is located.
#2. Inputs
#2.1 Proc
#2.1.1 Enabled
Tags:
agent_restart
FQCN:
inputs.proc.enabled
Upgrade from old version: static_config.os-proc-sync-enabled
Default value:
inputs:
  proc:
    enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Only makes sense when the agent type is one of CHOST_VM, CHOST_BM, K8S_VM, K8S_BM.
#2.1.2 Directory of /proc
Tags:
agent_restart
FQCN:
inputs.proc.proc_dir_path
Upgrade from old version: static_config.os-proc-root
Default value:
inputs:
  proc:
    proc_dir_path: /proc
Schema:
Key | Value |
---|---|
Type | string |
Description:
The /proc fs mount path.
#2.1.3 Synchronization Interval
Tags:
agent_restart
FQCN:
inputs.proc.sync_interval
Upgrade from old version: static_config.os-proc-socket-sync-interval
Default value:
inputs:
  proc:
    sync_interval: 10s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1h'] |
Description:
The interval of socket info sync.
#2.1.4 Minimal Lifetime
Tags:
agent_restart
FQCN:
inputs.proc.min_lifetime
Upgrade from old version: static_config.os-proc-socket-min-lifetime
Default value:
inputs:
  proc:
    min_lifetime: 3s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1h'] |
Description:
Socket and Process uptime threshold
#2.1.5 Tag Extraction
#2.1.5.1 Script Command
Tags:
agent_restart
FQCN:
inputs.proc.tag_extraction.script_command
Upgrade from old version: static_config.os-app-tag-exec
Default value:
inputs:
  proc:
    tag_extraction:
      script_command: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
The command is executed every time the processes are scanned. It is expected to print the process tags to stdout in YAML format, for example:
- pid: 1
  tags:
  - key: xxx
    value: xxx
- pid: 2
  tags:
  - key: xxx
    value: xxx
Example configuration:
inputs:
  proc:
    tag_extraction:
      script_command: ["cat", "/tmp/tag.yaml"]
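Instead of printing a static file with cat, the command could be a small script that generates the expected YAML. The following is a minimal sketch under stated assumptions: the function name and the sample tags are hypothetical, and keys and values are assumed to be plain scalars that need no YAML quoting.

```python
from typing import Dict


def render_process_tags(tags_by_pid: Dict[int, Dict[str, str]]) -> str:
    """Render {pid: {key: value}} in the list-of-pids YAML layout shown above."""
    lines = []
    for pid, tags in sorted(tags_by_pid.items()):
        lines.append(f"- pid: {pid}")
        lines.append("  tags:")
        for key, value in tags.items():
            # Assumption: key and value contain no characters requiring quoting.
            lines.append(f"  - key: {key}")
            lines.append(f"    value: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical tags; a real script would derive them from the environment.
    print(render_process_tags({1: {"app": "init"}, 2: {"app": "demo"}}), end="")
```

Such a script would be referenced from script_command in the same way as the cat example, by listing the interpreter and the script path as array elements.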
#2.1.5.2 Execution Username
Tags:
agent_restart
FQCN:
inputs.proc.tag_extraction.exec_username
Upgrade from old version: static_config.os-app-tag-exec-user
Default value:
inputs:
  proc:
    tag_extraction:
      exec_username: deepflow
Schema:
Key | Value |
---|---|
Type | string |
Description:
The user who should execute the os-app-tag-exec command.
#2.1.6 Process Matcher
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher
Upgrade from old version: static_config.os-proc-regex
Default value:
inputs:
  proc:
    process_matcher:
    - enabled_features:
      - ebpf.profile.on_cpu
      - ebpf.profile.off_cpu
      match_regex: deepflow-*
      only_in_container: false
Schema:
Key | Value |
---|---|
Type | dict |
Description:
The entire array is traversed in order, so earlier entries are matched first. When match_type is parent_process_name, the matcher recursively matches the parent process names, and the rewrite_name field is ignored. rewrite_name can reference regexp capture groups and Windows-style environment variables; for example, $1-py-script-%HOSTNAME% substitutes regexp capture group 1 and the HOSTNAME environment variable. A process that does not match any regexp is accepted (essentially, - match_regex: .* is automatically appended at the end).
Configuration Item:
- match_regex: The regexp used to match the process, default value is .*
- match_type: The field the regexp is matched against, default value is process_name, options are [process_name, cmdline, cmdline_with_args, parent_process_name, tag]
- ignore: Whether to ignore the process when the regex matches, default value is false
- rewrite_name: The name that replaces the process name or cmd via regexp replacement. The default value "" means no replacement.
Example:
inputs:
  proc:
    process_matcher:
    - match_regex: python3 (.*)\.py
      match_type: cmdline
      match_languages: []
      match_usernames: []
      only_in_container: true
      only_with_tag: false
      ignore: false
      rewrite_name: $1-py-script
      enabled_features: [ebpf.socket.uprobe.golang, ebpf.profile.on_cpu]
    - match_regex: (?P<PROC_NAME>nginx)
      match_type: process_name
      rewrite_name: ${PROC_NAME}-%HOSTNAME%
    - match_regex: "nginx"
      match_type: parent_process_name
      ignore: true
    - match_regex: .*sleep.*
      match_type: process_name
      ignore: true
    - match_regex: .+ # match after concatenating a tag key and value pair using colon,
                      # i.e., an regex `app:.+` can match all processes has a `app` tag
      match_type: tag
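To make the rewrite_name substitution concrete, the sketch below emulates it with Python's re module. It is an approximation for illustration only: the agent does not use Python's regex engine, the function name is hypothetical, and treating match_regex as an unanchored search is an assumption.

```python
import os
import re
from typing import Mapping, Optional


def rewrite_name(match_regex: str, rewrite: str, text: str,
                 env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the rewritten name, or None when match_regex does not match text."""
    env = os.environ if env is None else env
    m = re.search(match_regex, text)  # assumption: unanchored search
    if m is None:
        return None
    if rewrite == "":
        return text  # the default "" means no replacement

    def group_ref(ref):
        # ${NAME} and ${1} land in group 1 of the pattern below, $1 in group 2.
        name = ref.group(1) or ref.group(2)
        value = m.group(int(name)) if name.isdigit() else m.group(name)
        return value or ""

    out = re.sub(r"\$\{(\w+)\}|\$(\d+)", group_ref, rewrite)
    # %VAR% is replaced by the environment variable VAR (empty if unset).
    return re.sub(r"%(\w+)%", lambda v: env.get(v.group(1), ""), out)
```

Applied to the first entry of the example, the command line `python3 server.py` would be rewritten to `server-py-script`; for the second entry, `nginx` on a host whose HOSTNAME is `node-1` would become `nginx-node-1`.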
#2.1.6.1 Match Regex
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.match_regex
Upgrade from old version: static_config.os-proc-regex.match-regex
Default value:
inputs:
  proc:
    process_matcher:
    - match_regex: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
The regex of matcher.
#2.1.6.2 Match Type
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.match_type
Upgrade from old version: static_config.os-proc-regex.match-type
Default value:
inputs:
  proc:
    process_matcher:
    - match_type: ''
Enum options:
Value | Note |
---|---|
process_name | |
cmdline | |
parent_process_name | |
tag | |
cmdline_with_args | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
The type of matcher.
#2.1.6.3 Match Languages
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.match_languages
Default value:
inputs:
  proc:
    process_matcher:
    - match_languages: []
Enum options:
Value | Note |
---|---|
java | |
golang | |
python | |
nodejs | |
dotnet | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
The default value [] matches all languages.
#2.1.6.4 Match Usernames
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.match_usernames
Default value:
inputs:
  proc:
    process_matcher:
    - match_usernames: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
The default value [] matches all usernames.
#2.1.6.5 Only in Container
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.only_in_container
Default value:
inputs:
  proc:
    process_matcher:
    - only_in_container: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
The default value true means that only processes running in containers are matched.
#2.1.6.6 Only with Tag
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.only_with_tag
Upgrade from old version: static_config.os-proc-sync-tagged-only
Default value:
inputs:
  proc:
    process_matcher:
    - only_with_tag: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
The default value false means that processes are matched whether or not they have tags.
#2.1.6.7 Ignore
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.ignore
Upgrade from old version: static_config.os-proc-regex.action
Default value:
inputs:
  proc:
    process_matcher:
    - ignore: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to ignore matched processes.
#2.1.6.8 Rewrite Name
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.rewrite_name
Upgrade from old version: static_config.os-proc-regex.rewrite-name
Default value:
inputs:
  proc:
    process_matcher:
    - rewrite_name: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
The new name applied to a matched process.
#2.1.6.9 Enabled Features
Tags:
agent_restart
FQCN:
inputs.proc.process_matcher.enabled_features
Upgrade from old version: static_config.ebpf.on-cpu-profile.regex, static_config.ebpf.off-cpu-profile.regex
Default value:
inputs:
  proc:
    process_matcher:
    - enabled_features: []
Enum options:
Value | Note |
---|---|
proc.socket_list | |
proc.symbol_table | |
proc.proc_event | |
ebpf.socket.uprobe.golang | |
ebpf.socket.uprobe.tls | |
ebpf.socket.uprobe.rdma | |
ebpf.file.io_event | |
ebpf.file.management_event | |
ebpf.profile.on_cpu | |
ebpf.profile.off_cpu | |
ebpf.profile.memory | |
ebpf.profile.cuda | |
ebpf.profile.hbm | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Enabled feature list.
#2.1.7 Symbol Table
#2.1.7.1 Golang-specific
#2.1.7.1.1 Enabled
Tags:
agent_restart
FQCN:
inputs.proc.symbol_table.golang_specific.enabled
Upgrade from old version: static_config.ebpf.uprobe-process-name-regexs.golang-symbol
Default value:
inputs:
  proc:
    symbol_table:
      golang_specific:
        enabled: false
Schema:
Key | Value |
---|---|
Type | string |
Description:
Whether to enable Golang-specific symbol table parsing.
This feature acts on Golang processes that have trimmed the standard symbol table. When this feature is enabled, for processes with Golang version >= 1.13 and < 1.18, when the standard symbol table is missing, the Golang-specific symbol table will be parsed to complete uprobe data collection. Note that enabling this feature may cause the eBPF initialization process to take ten minutes. The golang-symbol configuration item depends on the golang configuration item; golang-symbol is a subset of the golang configuration item.
Example:
- Ensure that the regular expression matching for the 'golang' configuration item is enabled, for example: golang: .*
- Suppose there is a Golang process with a process ID of 1946, and you've encountered the following warning log:
  [eBPF] WARNING: func resolve_bin_file() [user/go_tracer.c:558] Go process pid 1946 [path: /proc/1946/root/usr/local/bin/kube-controller-manager] (version: go1.16). Not find any symbols!
- To initially confirm whether the executable file for this process has a symbol table:
  - Retrieve the executable file's path using the process ID:
    # ls -al /proc/1946/exe
    /proc/1946/exe -> /usr/local/bin/kube-controller-manager
  - Check if there is a symbol table:
    # nm /proc/1946/root/usr/local/bin/kube-controller-manager
    nm: /proc/1946/root/usr/local/bin/kube-controller-manager: no symbols
- If "no symbols" is encountered, it indicates the absence of a symbol table. In such a scenario, we need to configure the "golang-symbol" setting.
- During the agent startup process, you will observe the following log information (the entry address for the function crypto/tls.(*Conn).Write has already been resolved, i.e., entry:0x25fca0):
  [eBPF] INFO Uprobe [/proc/1946/root/usr/local/bin/kube-controller-manager] pid:1946 go1.16.0 entry:0x25fca0 size:1952 symname:crypto/tls.(*Conn).Write probe_func:uprobe_go_tls_write_enter rets_count:0
  The logs indicate that the Golang program has been successfully hooked.
#2.1.7.2 Java
#2.1.7.2.1 Refresh Defer Duration
Tags:
agent_restart
FQCN:
inputs.proc.symbol_table.java.refresh_defer_duration
Upgrade from old version: static_config.ebpf.java-symbol-file-refresh-defer-interval
Default value:
inputs:
  proc:
    symbol_table:
      java:
        refresh_defer_duration: 60s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['5s', '3600s'] |
Description:
When deepflow-agent finds that an unresolved function name appears in the function call stack of a Java process, it will trigger the regeneration of the symbol file of the process. Because Java utilizes the Just-In-Time (JIT) compilation mechanism, to obtain more symbols for Java processes, the regeneration will be deferred for a period of time.
At the startup of a Java program, the JVM and JIT compiler are in a "warm-up" phase. During this period, symbols typically change frequently due to dynamic compilation and optimization. Therefore, by default deepflow-agent delays symbol collection for one minute after the Java program starts, allowing the JVM and JIT to "warm up" and symbol churn to settle before proceeding with the collection.
#2.1.7.2.2 Maximum Symbol File Size
Tags:
agent_restart
FQCN:
inputs.proc.symbol_table.java.max_symbol_file_size
Upgrade from old version: static_config.ebpf.java-symbol-file-max-space-limit
Default value:
inputs:
  proc:
    symbol_table:
      java:
        max_symbol_file_size: 10
Schema:
Key | Value |
---|---|
Type | int |
Unit | MiB |
Range | [2, 100] |
Description:
All Java symbol files are stored in the '/tmp' directory mounted by the deepflow-agent. To prevent excessive occupation of host node space due to large Java symbol files, a maximum size limit is set for each generated Java symbol file.
#2.2 cBPF
#2.2.1 Common
#2.2.1.1 Packet Capture Mode
Tags:
hot_update
FQCN:
inputs.cbpf.common.capture_mode
Upgrade from old version: tap_mode
Default value:
inputs:
  cbpf:
    common:
      capture_mode: 0
Enum options:
Value | Note |
---|---|
0 | Local |
1 | Virtual Mirror |
2 | Physical Mirror |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Mirror mode is used when deepflow-agent cannot directly capture the traffic from the source. For example:
- in the K8s macvlan environment, capture the Pod traffic through the Node NIC
- in the Hyper-V environment, capture the VM traffic through the Hypervisor NIC
- in the ESXi environment, capture traffic through VDS/VSS local SPAN
- in the DPDK environment, capture traffic through DPDK ring buffer
Use Physical Mirror mode when deepflow-agent captures traffic through physical switch mirroring.
Physical Mirror is only supported in the Enterprise Edition.
#2.2.2 Capture via AF_PACKET
#2.2.2.1 Interface Regex
Tags:
hot_update
FQCN:
inputs.cbpf.af_packet.interface_regex
Upgrade from old version: tap_interface_regex
Default value:
inputs:
  cbpf:
    af_packet:
      interface_regex: ^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|lo|[0-9a-f]+_h)$
Schema:
Key | Value |
---|---|
Type | string |
Range | [0, 65535] |
Description:
Regular expression of NIC name for collecting traffic.
Explanation of the default configuration:
Localhost: lo
Common NIC: eth.*|en[osipx].*
QEMU VM NIC: tap.*
Flannel: veth.*
Calico: cali.*
Cilium: lxc.*
Kube-OVN: [0-9a-f]+_h$
When interface_regex (formerly tap_interface_regex) is not configured, no NIC traffic is collected.
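To see which NICs the default pattern would select, the expression can be tried against a list of interface names. The sketch below is only an illustration: it uses Python's `re` module as a stand-in for the agent's own regex engine, and the sample interface names are made up.

```python
import re

# Default value of inputs.cbpf.af_packet.interface_regex, copied from this section.
DEFAULT_INTERFACE_REGEX = r"^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|lo|[0-9a-f]+_h)$"


def matching_interfaces(names, pattern=DEFAULT_INTERFACE_REGEX):
    """Return the interface names accepted by the pattern, in input order."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.match(name)]


# Hypothetical interface names, for illustration only.
SAMPLE_NAMES = ["lo", "eth0", "ens192", "cali1a2b3c", "lxc12ab", "3f9a_h", "docker0", "br-int"]

if __name__ == "__main__":
    print(matching_interfaces(SAMPLE_NAMES))
```

With this pattern, names such as `docker0` or `br-int` are not selected, so bridges like these would need to be added to the expression explicitly if their traffic should be captured.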
#2.2.2.2 Bond Interfaces
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.bond_interfaces
Upgrade from old version: static_config.tap-interface-bond-groups
Default value:
inputs:
  cbpf:
    af_packet:
      bond_interfaces: []
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Packets from interfaces in the same group can be aggregated together. Only effective when capture_mode is 0.
Example:
inputs:
  cbpf:
    af_packet:
      bond_interfaces:
        - slave_interfaces: [eth0, eth1]
        - slave_interfaces: [eth2, eth3]
#2.2.2.2.1 Slave Interfaces
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.bond_interfaces.slave_interfaces
Upgrade from old version: static_config.tap-interface-bond-groups.tap-interfaces
Default value:
inputs:
  cbpf:
    af_packet:
      bond_interfaces:
        - slave_interfaces: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
The slave interfaces of one bond interface.
#2.2.2.3 Extra Network Namespace Regex
Tags:
hot_update
ee_feature
FQCN:
inputs.cbpf.af_packet.extra_netns_regex
Upgrade from old version: extra_netns_regex
Default value:
inputs:
  cbpf:
    af_packet:
      extra_netns_regex: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
Packets are captured in the namespaces matched by this regex, in addition to the default namespace. NICs in the extra namespaces are also filtered by interface_regex (formerly tap_interface_regex).
The default value "" means no extra network namespace (default namespace only).
#2.2.2.4 Extra BPF Filter
Tags:
hot_update
FQCN:
inputs.cbpf.af_packet.extra_bpf_filter
Upgrade from old version: capture_bpf
Default value:
inputs:
  cbpf:
    af_packet:
      extra_bpf_filter: ''
Schema:
Key | Value |
---|---|
Type | string |
Range | [0, 512] |
Description:
If not configured, all traffic will be collected. Please refer to BPF syntax: https://biot.com/capstats/bpf.html
#2.2.2.5 TAP Interfaces
Tags:
deprecated
FQCN:
inputs.cbpf.af_packet.src_interfaces
Upgrade from old version: static_config.src-interfaces
Default value:
inputs:
  cbpf:
    af_packet:
      src_interfaces: []
Schema:
Key | Value |
---|---|
Type | string |
#2.2.2.6 VLAN PCP in Physical Mirror Traffic
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.af_packet.vlan_pcp_in_physical_mirror_traffic
Upgrade from old version: static_config.mirror-traffic-pcp
Default value:
inputs:
  cbpf:
    af_packet:
      vlan_pcp_in_physical_mirror_traffic: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 8] |
Description:
How the TAP value is calculated depends on this value (formerly mirror-traffic-pcp):
- When the value is <= 7, the TAP value is calculated from the VLAN tag only if the VLAN PCP matches this value.
- When the value is 8, the TAP value is calculated from the outer VLAN tag.
- When the value is 9, the TAP value is calculated from the inner VLAN tag.
#2.2.2.7 BPF Filter Disabled
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.bpf_filter_disabled
Upgrade from old version: static_config.bpf-disabled
Default value:
inputs:
  cbpf:
    af_packet:
      bpf_filter_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
It is found that there may be bugs in BPF traffic filtering under some versions of Linux Kernel. After this configuration is enabled, deepflow-agent will not use the filtering capabilities of BPF, and will filter by itself after capturing full traffic. Note that this may significantly increase the resource overhead of deepflow-agent.
#2.2.2.8 Tuning
#2.2.2.8.1 Socket Version
Tags:
hot_update
FQCN:
inputs.cbpf.af_packet.tunning.socket_version
Upgrade from old version: capture_socket_type
Default value:
inputs:
  cbpf:
    af_packet:
      tunning:
        socket_version: 0
Enum options:
Value | Note |
---|---|
0 | Adaptive |
2 | AF_PACKET V2 |
3 | AF_PACKET V3 |
Schema:
Key | Value |
---|---|
Type | int |
Description:
AF_PACKET socket version in Linux environment.
#2.2.2.8.2 Ring Blocks Config Enabled
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.tunning.ring_blocks_enabled
Upgrade from old version: static_config.afpacket-blocks-enabled
Default value:
inputs:
  cbpf:
    af_packet:
      tunning:
        ring_blocks_enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When capture_mode != 2, you need to explicitly turn on this switch to configure ring_blocks (formerly 'afpacket-blocks').
#2.2.2.8.3 Ring Blocks
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.tunning.ring_blocks
Upgrade from old version: static_config.afpacket-blocks
Default value:
inputs:
  cbpf:
    af_packet:
      tunning:
        ring_blocks: 128
Schema:
Key | Value |
---|---|
Type | int |
Range | [8, 1000000] |
Description:
deepflow-agent will automatically calculate the number of blocks used by AF_PACKET according to max_memory, which can also be specified using this configuration item. The size of each block is fixed at 1MB.
#2.2.2.8.4 Packet Fanout Count
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.tunning.packet_fanout_count
Upgrade from old version: static_config.local-dispatcher-count
Default value:
inputs:
  cbpf:
    af_packet:
      tunning:
        packet_fanout_count: 1
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 64] |
Description:
This configuration takes effect when capture_mode is 0 and extra_netns_regex is empty. PACKET_FANOUT enables load balancing and parallel processing, which can improve the performance and scalability of network applications. When packet_fanout_count (formerly local-dispatcher-count) is greater than 1, multiple dispatcher threads are launched, consuming more CPU and memory. Increasing packet_fanout_count helps to reduce the operating system's software interrupts on multi-core CPU servers.
Attention: only valid when capture_mode is Local (0).
#2.2.2.8.5 Packet Fanout Mode
Tags:
agent_restart
FQCN:
inputs.cbpf.af_packet.tunning.packet_fanout_mode
Upgrade from old version: static_config.packet-fanout-mode
Default value:
inputs:
  cbpf:
    af_packet:
      tunning:
        packet_fanout_mode: 0
Enum options:
Value | Note |
---|---|
0 | PACKET_FANOUT_HASH |
1 | PACKET_FANOUT_LB |
2 | PACKET_FANOUT_CPU |
3 | PACKET_FANOUT_ROLLOVER |
4 | PACKET_FANOUT_RND |
5 | PACKET_FANOUT_QM |
6 | PACKET_FANOUT_CBPF |
7 | PACKET_FANOUT_EBPF |
Schema:
Key | Value |
---|---|
Type | int |
Description:
The configuration is a parameter used with the PACKET_FANOUT feature in the Linux kernel to specify the desired packet distribution algorithm. Refer to:
- https://github.com/torvalds/linux/blob/afcd48134c58d6af45fb3fdb648f1260b20f2326/include/uapi/linux/if_packet.h#L71
- https://www.stackpath.com/blog/bpf-hook-points-part-1/
#2.2.3 Special Network
#2.2.3.1 DPDK
#2.2.3.1.1 Enabled
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.special_network.dpdk.enabled
Upgrade from old version: static_config.dpdk-enabled
Default value:
inputs:
  cbpf:
    special_network:
      dpdk:
        enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
The DPDK RecvEngine is only started when this configuration item is turned on. Note that you also need to set capture_mode to 1. Please refer to https://dpdk-docs.readthedocs.io/en/latest/prog_guide/multi_proc_support.html
#2.2.3.2 Libpcap
#2.2.3.2.1 Enabled
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.special_network.libpcap.enabled
Upgrade from old version: static_config.libpcap-enabled
Default value:
inputs:
  cbpf:
    special_network:
      libpcap:
        enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Supports running on Windows and Linux. Performance is low when multiple interfaces are used. Defaults to true on Windows and false on Linux.
#2.2.3.3 vHost User
#2.2.3.3.1 vHost Socket Path
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.special_network.vhost_user.vhost_socket_path
Upgrade from old version: static_config.vhost-socket-path
Default value:
inputs:
  cbpf:
    special_network:
      vhost_user:
        vhost_socket_path: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
Supports running on Linux with mirror mode.
#2.2.3.4 Physical Switch
#2.2.3.4.1 sFlow Receiving Ports
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.special_network.physical_switch.sflow_ports
Upgrade from old version: static_config.xflow-collector.sflow-ports
Default value:
inputs:
  cbpf:
    special_network:
      physical_switch:
        sflow_ports: []
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
This feature is only supported by the Enterprise Edition of Trident.
In general, sFlow uses port 6343. The default value [] means that no sFlow data will be collected.
#2.2.3.4.2 NetFlow Receiving Ports
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.special_network.physical_switch.netflow_ports
Upgrade from old version: static_config.xflow-collector.netflow-ports
Default value:
inputs:
  cbpf:
    special_network:
      physical_switch:
        netflow_ports: []
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
This feature is only supported by the Enterprise Edition of Trident.
Additionally, only NetFlow v5 is currently supported. In general, NetFlow uses port 2055. The default value [] means that no NetFlow data will be collected.
#2.2.4 Tuning
#2.2.4.1 Dispatcher Queue Enabled
Tags:
agent_restart
FQCN:
inputs.cbpf.tunning.dispatcher_queue_enabled
Upgrade from old version: static_config.dispatcher-queue
Default value:
inputs:
  cbpf:
    tunning:
      dispatcher_queue_enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
This configuration takes effect when capture_mode is 0 or 2. When capture_mode is 2, the dispatcher queue is always enabled.
Available for all recv_engines.
#2.2.4.2 Maximum Capture Packet Size
Tags:
hot_update
FQCN:
inputs.cbpf.tunning.max_capture_packet_size
Upgrade from old version: capture_packet_size
Default value:
inputs:
  cbpf:
    tunning:
      max_capture_packet_size: 65535
Schema:
Key | Value |
---|---|
Type | int |
Unit | byte |
Range | [128, 65535] |
Description:
DPDK environment does not support this configuration.
#2.2.4.3 Raw Packet Buffer Block Size
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.tunning.raw_packet_buffer_block_size
Upgrade from old version: static_config.analyzer-raw-packet-block-size
Default value:
inputs:
  cbpf:
    tunning:
      raw_packet_buffer_block_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 16000000] |
Description:
A larger value reduces memory allocations for raw packets, but also delays the release of memory.
#2.2.4.4 Raw Packet Queue Size
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.tunning.raw_packet_queue_size
Upgrade from old version: static_config.analyzer-queue-size
Default value:
inputs:
  cbpf:
    tunning:
      raw_packet_queue_size: 131072
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues (only for capture_mode = 2):
- 0.1-bytes-to-parse
- 0.2-packet-to-flowgenerator
- 0.3-packet-to-pipeline
#2.2.4.5 Max Capture PPS
Tags:
hot_update
FQCN:
inputs.cbpf.tunning.max_capture_pps
Upgrade from old version: max_collect_pps
Default value:
inputs:
  cbpf:
    tunning:
      max_capture_pps: 200
Schema:
Key | Value |
---|---|
Type | int |
Unit | Kpps |
Range | [1, 1000000] |
Description:
Maximum packet rate allowed for collection.
Available for all recv_engines.
#2.2.5 Preprocess
#2.2.5.1 Tunnel Decap Protocols
Tags:
hot_update
FQCN:
inputs.cbpf.preprocess.tunnel_decap_protocols
Upgrade from old version: decap_type
Default value:
inputs:
  cbpf:
    preprocess:
      tunnel_decap_protocols:
        - 1
        - 2
Enum options:
Value | Note |
---|---|
1 | VXLAN |
2 | IPIP |
3 | GRE |
4 | Geneve |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Decapsulation tunnel protocols.
#2.2.5.2 Tunnel Trim Protocols
Tags:
agent_restart
FQCN:
inputs.cbpf.preprocess.tunnel_trim_protocols
Upgrade from old version: static_config.trim-tunnel-types
Default value:
inputs:
  cbpf:
    preprocess:
      tunnel_trim_protocols: []
Enum options:
Value | Note |
---|---|
ERSPAN | |
VXLAN | |
TEB | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Whether to remove the tunnel header in mirrored traffic.
#2.2.6 Physical Mirror Traffic
#2.2.6.1 Default Capture Network Type
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.physical_mirror.default_capture_network_type
Upgrade from old version: static_config.default-tap-type
Default value:
inputs:
  cbpf:
    physical_mirror:
      default_capture_network_type: 3
Enum options:
Value | Note |
---|---|
3 | Cloud Network |
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | int |
Description:
deepflow-agent marks the TAP (Traffic Access Point) location according to the outer VLAN tag in the mirrored traffic of the physical switch. When the VLAN tag has no corresponding TAP value, or the VLAN PCP does not match vlan_pcp_in_physical_mirror_traffic (formerly 'mirror-traffic-pcp'), the TAP value is set to the value of this configuration item. The default value 3 means Cloud Network.
#2.2.6.2 Packet Dedup Disabled
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.physical_mirror.packet_dedup_disabled
Upgrade from old version: static_config.analyzer-dedup-disabled
Default value:
inputs:
  cbpf:
    physical_mirror:
      packet_dedup_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to disable mirror traffic deduplication when capture_mode = 2. The default false keeps deduplication enabled.
#2.2.6.3 Gateway Traffic of Private Cloud
Tags:
agent_restart ee_feature
FQCN:
inputs.cbpf.physical_mirror.private_cloud_gateway_traffic
Upgrade from old version: static_config.cloud-gateway-traffic
Default value:
inputs:
  cbpf:
    physical_mirror:
      private_cloud_gateway_traffic: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether it is the mirrored traffic of NFVGW (cloud gateway).
#2.3 eBPF
#2.3.1 Disabled
Tags:
agent_restart
FQCN:
inputs.ebpf.disabled
Upgrade from old version: static_config.ebpf.disabled
Default value:
inputs:
  ebpf:
    disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to disable eBPF features. The default false keeps eBPF features enabled.
#2.3.2 Socket
#2.3.2.1 Uprobe
#2.3.2.1.1 Golang
#2.3.2.1.1.1 Enabled
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.golang.enabled
Upgrade from old version: static_config.ebpf.uprobe-process-name-regexs.golang
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        golang:
          enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to enable HTTP2/HTTPS protocol data collection and auto-tracing for Golang processes. Go auto-tracing also depends on tracing_timeout (formerly go-tracing-timeout).
#2.3.2.1.1.2 Tracing Timeout
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.golang.tracing_timeout
Upgrade from old version: static_config.ebpf.go-tracing-timeout
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        golang:
          tracing_timeout: 120s
Schema:
Key | Value |
---|---|
Type | duration |
Range | [0, '1d'] |
Description:
The expected maximum time interval between the server receiving the request and returning the response. If the value is 0, this feature is disabled. Tracing only considers the thread number.
#2.3.2.1.2 TLS
#2.3.2.1.2.1 Enabled
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.tls.enabled
Upgrade from old version: static_config.ebpf.uprobe-process-name-regexs.openssl
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        tls:
          enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to enable HTTPS protocol data collection for processes that use the openssl library.
To determine whether an application process can have its openssl library hooked by Uprobe to access encrypted data, use the command cat /proc/<PID>/maps | grep "libssl.so" to check whether the output contains information about openssl. If it does, the process is using the openssl library. After the openssl options are configured, deepflow-agent retrieves the processes that match the regular expression and hooks the corresponding encryption/decryption interfaces of the openssl library.
In the logs, you will see a message similar to the following:
[eBPF] INFO openssl uprobe, pid:1005, path:/proc/1005/root/usr/lib64/libssl.so.1.0.2k
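The `/proc/<PID>/maps` check described above can also be scripted. The following is a minimal sketch, not part of deepflow-agent: the function names are hypothetical, and the sample text is made up in the usual `maps` layout (address, permissions, offset, device, inode, path).

```python
def libssl_paths(maps_text):
    """Return the distinct mapped file paths that contain 'libssl.so'."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]; anonymous mappings have no path.
        fields = line.split(None, 5)
        if len(fields) == 6 and "libssl.so" in fields[5]:
            path = fields[5].strip()
            if path not in paths:
                paths.append(path)
    return paths


def read_maps(pid):
    """Read /proc/<pid>/maps (Linux only; needs permission to inspect the process)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()


# Made-up sample in the /proc/<PID>/maps layout, for illustration only.
SAMPLE_MAPS = """\
7f1c2a000000-7f1c2a067000 r-xp 00000000 fd:01 123456 /usr/lib64/libssl.so.1.0.2k
7f1c2a067000-7f1c2a266000 ---p 00067000 fd:01 123456 /usr/lib64/libssl.so.1.0.2k
7f1c2b000000-7f1c2b1c3000 r-xp 00000000 fd:01 123789 /usr/lib64/libc-2.17.so
7ffd3c1e0000-7ffd3c201000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    print(libssl_paths(SAMPLE_MAPS))
```

On a live host, `libssl_paths(read_maps(pid))` returning a non-empty list corresponds to the `grep` command finding a match.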
#2.3.2.1.3 DPDK
#2.3.2.1.3.1 Enabled
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.dpdk.enabled
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
The toggle for enabling DPDK packet capture feature.
#2.3.2.1.3.2 Command
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.dpdk.command
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          command: ""
Schema:
Key | Value |
---|---|
Type | string |
Description:
Set the command name of the DPDK application, eBPF will automatically locate and trace packets for data collection. Example: In the command line '/usr/bin/mydpdk', it can be set as "command: mydpdk"
#2.3.2.1.3.3 RX Hooks
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.dpdk.rx_hooks
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          rx_hooks: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
Fill in the appropriate packet reception hook point according to the actual network card driver. You can use the command 'lspci -vmmk' to find the network card driver type. For example:
Slot: 04:00.0
Class: Ethernet controller
Vendor: Intel Corporation
Device: Ethernet Controller XL710 for 40GbE QSFP+
SVendor: Unknown vendor 1e18
SDevice: Device 4712
Rev: 02
Driver: igb_uio
Module: i40e
In the example above, "Driver: igb_uio" indicates a DPDK-managed device (other options include "vfio-pci" and "uio_pci_generic", which are also managed by DPDK). The actual driver is 'i40e' (derived from 'Module: i40e').
Below are some common interface names for different drivers, for reference only:
1. Physical NIC Drivers:
  - Intel Drivers:
    - ixgbe: Supports Intel 82598/82599/X520/X540/X550 series NICs.
      - rx: ixgbe_recv_pkts
      - tx: ixgbe_xmit_pkts
    - i40e: Supports Intel X710, XL710 series NICs.
      - rx: i40e_recv_pkts
      - tx: i40e_xmit_pkts
    - ice: Supports Intel E810 series NICs.
      - rx: ice_recv_pkts
      - tx: ice_xmit_pkts
  - Mellanox Drivers:
    - mlx4: Supports Mellanox ConnectX-3 series NICs.
      - rx: mlx4_rx_burst
      - tx: mlx4_tx_burst
    - mlx5: Supports Mellanox ConnectX-4, ConnectX-5, ConnectX-6 series NICs.
      - rx: mlx5_rx_burst, mlx5_rx_burst_vec, mlx5_rx_burst_mprq
      - tx: Pending confirmation
  - Broadcom Drivers:
    - bnxt: Supports Broadcom NetXtreme series NICs.
      - rx: bnxt_recv_pkts, bnxt_recv_pkts_vec (x86, Vector mode receive)
      - tx: bnxt_xmit_pkts, bnxt_xmit_pkts_vec (x86, Vector mode transmit)
2. Virtual NIC Drivers:
  - Virtio Driver:
    - virtio: Supports Virtio-based virtual network interfaces.
      - rx: virtio_recv_pkts, virtio_recv_mergeable_pkts_packed, virtio_recv_pkts_packed, virtio_recv_pkts_vec, virtio_recv_pkts_inorder, virtio_recv_mergeable_pkts
      - tx: virtio_xmit_pkts_packed, virtio_xmit_pkts
  - VMXNET3 Driver:
    - vmxnet3: Supports VMware's VMXNET3 virtual NICs.
      - rx: vmxnet3_recv_pkts
      - tx: vmxnet3_xmit_pkts
Example: "rx_hooks: [ixgbe_recv_pkts, i40e_recv_pkts, virtio_recv_pkts, virtio_recv_mergeable_pkts]"
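Reading the driver out of `lspci -vmmk` output by hand gets tedious on hosts with many NICs. As a rough sketch (the helper names are hypothetical and the sample text simply repeats the record shown above), the records can be parsed and filtered for DPDK-managed devices like this:

```python
# Kernel drivers that indicate a DPDK-managed device, as listed in this section.
DPDK_BINDINGS = {"igb_uio", "vfio-pci", "uio_pci_generic"}


def parse_lspci_vmmk(text):
    """Parse `lspci -vmmk` output into a list of dicts, one per device.

    Records are separated by blank lines and each line is 'Key: Value'.
    If a key repeats within one record, the last value wins in this sketch.
    """
    devices, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                devices.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")  # split at the first colon only
        current[key.strip()] = value.strip()
    if current:
        devices.append(current)
    return devices


def dpdk_devices(text):
    """Return (slot, module) pairs for devices bound to a DPDK-style driver."""
    return [
        (dev.get("Slot"), dev.get("Module"))
        for dev in parse_lspci_vmmk(text)
        if dev.get("Driver") in DPDK_BINDINGS
    ]


SAMPLE_LSPCI = """\
Slot: 04:00.0
Class: Ethernet controller
Vendor: Intel Corporation
Device: Ethernet Controller XL710 for 40GbE QSFP+
SVendor: Unknown vendor 1e18
SDevice: Device 4712
Rev: 02
Driver: igb_uio
Module: i40e
"""

if __name__ == "__main__":
    print(dpdk_devices(SAMPLE_LSPCI))
```

The module name found this way (`i40e` in the sample) is what you would look up in the driver list above to pick the hook functions.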
#2.3.2.1.3.4 TX Hooks
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.uprobe.dpdk.tx_hooks
Default value:
inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          tx_hooks: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
Specify the appropriate packet transmission hook point according to the actual network card driver. To obtain the driver method and configure the transmission hook point, refer to the description of 'rx_hooks'.
Example: "tx_hooks: [i40e_xmit_pkts, virtio_xmit_pkts_packed, virtio_xmit_pkts]"
#2.3.2.2 Kprobe
#2.3.2.2.1 Blacklist
#2.3.2.2.1.1 Port Numbers
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.kprobe.blacklist.ports
Upgrade from old version: static_config.ebpf.kprobe-blacklist.port-list
Default value:
inputs:
  ebpf:
    socket:
      kprobe:
        blacklist:
          ports: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
TCP & UDP port blacklist. It takes priority over the kprobe whitelist.
Example: ports: 80,1000-2000
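The port list format in the example mixes single ports and inclusive ranges. The sketch below shows one way such a string could be expanded; it is an illustration only, and the validation rules (ports 1 to 65535, start not greater than end) are assumptions rather than the agent's documented behavior.

```python
def parse_port_list(spec):
    """Expand a string such as '80,1000-2000' into a sorted list of ports."""
    ports = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate the empty default '' and stray commas
        start, sep, end = item.partition("-")
        low = int(start)
        high = int(end) if sep else low
        # Assumed validation: valid ports are 1..65535 and ranges are ascending.
        if not (0 < low <= high <= 65535):
            raise ValueError(f"invalid port or range: {item!r}")
        ports.update(range(low, high + 1))
    return sorted(ports)


if __name__ == "__main__":
    expanded = parse_port_list("80,1000-2000")
    print(len(expanded), expanded[:3], expanded[-1])
```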
#2.3.2.2.2 Whitelist
#2.3.2.2.2.1 Port Numbers
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.kprobe.whitelist.port
Upgrade from old version: static_config.ebpf.kprobe-whitelist.port-list
Default value:
inputs:
  ebpf:
    socket:
      kprobe:
        whitelist:
          port: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
TCP & UDP port whitelist. It has lower priority than the kprobe blacklist.
Example: ports: 80,1000-2000
#2.3.2.3 Tuning
#2.3.2.3.1 Max Capture Rate
Tags:
hot_update
FQCN:
inputs.ebpf.socket.tunning.max_capture_rate
Upgrade from old version: static_config.ebpf.global-ebpf-pps-threshold
Default value:
inputs:
  ebpf:
    socket:
      tunning:
        max_capture_rate: 0
Schema:
Key | Value |
---|---|
Type | int |
Unit | Per Second |
Range | [0, 64000000] |
Description:
The default value 0 means no limitation.
#2.3.2.3.2 Syscall_trace_id Disabled
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.tunning.syscall_trace_id_disabled
Default value:
inputs:
  ebpf:
    socket:
      tunning:
        syscall_trace_id_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When the trace_id is injected into all requests, the computation logic for all syscall_trace_id can be turned off. This will significantly reduce the impact of the eBPF hook on the CPU consumption of the application process.
#2.3.2.3.3 Disable Pre-allocating Memory
Tags:
agent_restart
FQCN:
inputs.ebpf.socket.tunning.map_prealloc_disabled
Default value:
inputs:
  ebpf:
    socket:
      tunning:
        map_prealloc_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When full map preallocation is too expensive, setting 'map_prealloc_disabled' to true will prevent memory pre-allocation during map definition, but it may result in some performance degradation. This configuration only applies to maps of type 'BPF_MAP_TYPE_HASH'. Currently applicable to socket trace and uprobe Golang/OpenSSL trace functionalities. Disabling memory preallocation will approximately reduce memory usage by 45MB.
#2.3.2.4 Preprocess
#2.3.2.4.1 OOOR Cache Size
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.socket.preprocess.out_of_order_reassembly_cache_size
Upgrade from old version: static_config.ebpf.syscall-out-of-order-cache-size
Default value:
inputs:
  ebpf:
    socket:
      preprocess:
        out_of_order_reassembly_cache_size: 16
Schema:
Key | Value |
---|---|
Type | int |
Range | [8, 1024] |
Description:
OOOR: Out Of Order Reassembly
When out_of_order_reassembly_protocols (formerly syscall-out-of-order-reassembly) is enabled, up to out_of_order_reassembly_cache_size (formerly syscall-out-of-order-cache-size) eBPF socket events, each consuming up to l7_log_packet_size bytes, are cached in each TCP/UDP flow to prevent out-of-order events from impacting application protocol parsing. Because eBPF socket events are sent to user space in batches, out-of-order scenarios mainly occur when the request and response of a single session are processed by different CPUs, causing the response to reach user space before the request.
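Since each flow may cache up to the configured number of events, and each event may take up to l7_log_packet_size bytes, a worst-case memory bound follows by multiplication. The sketch below works through that arithmetic; the flow count and the l7_log_packet_size value of 1024 are assumptions chosen for illustration, not values taken from this section.

```python
def ooor_cache_upper_bound_bytes(flows, cache_size=16, l7_log_packet_size=1024):
    """Worst-case cached payload: flows x events per flow x bytes per event.

    cache_size defaults to the documented default of 16; l7_log_packet_size
    is an assumed value for this example.
    """
    return flows * cache_size * l7_log_packet_size


if __name__ == "__main__":
    # Hypothetical load: 10,000 concurrent TCP/UDP flows.
    bound = ooor_cache_upper_bound_bytes(10_000)
    print(f"{bound} bytes = {bound / 2**20:.2f} MiB")
```

This is an upper bound on payload only; actual usage depends on how many flows really see out-of-order events and on per-event bookkeeping that is not modeled here.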
#2.3.2.4.2 OOOR Protocols
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.socket.preprocess.out_of_order_reassembly_protocols
Upgrade from old version: static_config.ebpf.syscall-out-of-order-reassembly
Default value:
inputs:
  ebpf:
    socket:
      preprocess:
        out_of_order_reassembly_protocols: []
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | string |
Description:
OOOR: Out Of Order Reassembly
When this capability is enabled for a specific application protocol, the agent will add out-of-order-reassembly processing for it. Note that the agent will consume more memory in this case, so please adjust out_of_order_reassembly_cache_size (formerly syscall-out-of-order-cache-size) accordingly and monitor the agent's memory usage.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Attention: use HTTP2 for the gRPC protocol.
#2.3.2.4.3 SR Protocols
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.socket.preprocess.segmentation_reassembly_protocols
Upgrade from old version: static_config.ebpf.syscall-segmentation-reassembly
Default value:
inputs:
  ebpf:
    socket:
      preprocess:
        segmentation_reassembly_protocols: []
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | string |
Description:
SR: Segmentation Reassembly
When this capability is enabled for a specific application protocol, the agent will add segmentation-reassembly processing to merge application protocol content spread across multiple syscalls before parsing it. This enhances the success rate of application protocol parsing. Note that out_of_order_reassembly_protocols (formerly syscall-out-of-order-reassembly) must also be enabled for this feature to be effective.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Attention: use HTTP2 for the gRPC protocol.
#2.3.3 File
#2.3.3.1 IO Event
#2.3.3.1.1 Collect Mode
Tags:
agent_restart
FQCN:
inputs.ebpf.file.io_event.collect_mode
Upgrade from old version: static_config.ebpf.io-event-collect-mode
Default value:
inputs:
  ebpf:
    file:
      io_event:
        collect_mode: 1
Enum options:
Value | Note |
---|---|
0 | Disabled |
1 | Request Life Cycle |
2 | All |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Collection modes:
- 0: Indicates that no IO events are collected.
- 1: Indicates that only IO events within the request life cycle are collected.
- 2: Indicates that all IO events are collected.
#2.3.3.1.2 Minimal Duration
Tags:
agent_restart
FQCN:
inputs.ebpf.file.io_event.minimal_duration
Upgrade from old version: static_config.ebpf.io-event-minimal-duration
Default value:
inputs:
  ebpf:
    file:
      io_event:
        minimal_duration: 1ms
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1ns', '1s'] |
Description:
Only collect IO events with delay exceeding this threshold.
#2.3.4 Profile
#2.3.4.1 On-CPU
#2.3.4.1.1 Disabled
Tags:
agent_restart
FQCN:
inputs.ebpf.profile.on_cpu.disabled
Upgrade from old version: static_config.ebpf.on-cpu-profile.disabled
Default value:
inputs:
  ebpf:
    profile:
      on_cpu:
        disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
eBPF On-CPU profile switch.
#2.3.4.1.2 Sampling Frequency
Tags:
agent_restart
FQCN:
inputs.ebpf.profile.on_cpu.sampling_frequency
Upgrade from old version: static_config.ebpf.on-cpu-profile.frequency
Default value:
inputs:
  ebpf:
    profile:
      on_cpu:
        sampling_frequency: 99
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 1000] |
Description:
eBPF On-CPU profile sampling frequency.
#2.3.4.1.3 Aggregate by CPU
Tags:
agent_restart
FQCN:
inputs.ebpf.profile.on_cpu.aggregate_by_cpu
Upgrade from old version: static_config.ebpf.on-cpu-profile.cpu
Default value:
inputs:
  ebpf:
    profile:
      on_cpu:
        aggregate_by_cpu: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to obtain the value of CPUID and decide whether to participate in aggregation.
- Set to 1: The value of CPUID is obtained and included in the aggregation of stack trace data.
- Set to 0: The value of CPUID is not included in the aggregation. Any other value is considered invalid. The CPU value reported with stack trace data is the special value CPU_INVALID: 0xfff, which indicates that it is an invalid value.
#2.3.4.2 Off-CPU
#2.3.4.2.1 Disabled
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.profile.off_cpu.disabled
Upgrade from old version: static_config.ebpf.off-cpu-profile.disabled
Default value:
inputs:
  ebpf:
    profile:
      off_cpu:
        disabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
eBPF Off-CPU profile switch.
#2.3.4.2.2 Aggregate by CPU
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.profile.off_cpu.aggregate_by_cpu
Upgrade from old version: static_config.ebpf.off-cpu-profile.cpu
Default value:
inputs:
ebpf:
profile:
off_cpu:
aggregate_by_cpu: false
2
3
4
5
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to obtain the value of CPUID and decide whether to participate in aggregation.
- Set to 1: The value of CPUID is obtained and included in the aggregation of stack trace data.
- Set to 0: The value of CPUID is not included in the aggregation. Any other value is considered invalid. The CPU value reported with stack trace data is the special value CPU_INVALID: 0xfff, which indicates that it is an invalid value.
#2.3.4.2.3 Minimum Blocking Time
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.profile.off_cpu.min_blocking_time
Upgrade from old version: static_config.ebpf.off-cpu-profile.minblock
Default value:
inputs:
  ebpf:
    profile:
      off_cpu:
        min_blocking_time: 50us
Schema:
Key | Value |
---|---|
Type | duration |
Range | [0, '1h'] |
Description:
If set to 0, there is no minimum value limitation. Scheduler events are high-frequency events whose rate may exceed 1 million events per second, so caution should still be exercised.
If overhead remains an issue, you can raise min_blocking_time (the 'minblock' tunable in older versions). If the off-CPU time is less than the value configured in this item, the data will be discarded. If your goal is to trace longer blocking events, increasing this parameter filters out shorter blocking events and further reduces overhead. Additionally, events with a blocking time exceeding 1 hour are not collected.
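The filtering rule described above amounts to a simple window check on each off-CPU duration. The following sketch models it in user-space Python purely for illustration; the function name is hypothetical, and the real filtering happens inside the agent's eBPF programs, which this does not reproduce.

```python
from datetime import timedelta

# Events blocked for longer than this are not collected, per the description above.
MAX_BLOCKING_TIME = timedelta(hours=1)


def keep_off_cpu_event(blocked, min_blocking_time=timedelta(microseconds=50)):
    """Return True if an off-CPU event lasting `blocked` would be kept.

    The default threshold mirrors the documented default of 50us.
    A threshold of 0 disables the lower bound.
    """
    if blocked > MAX_BLOCKING_TIME:
        return False
    return blocked >= min_blocking_time


if __name__ == "__main__":
    for us in (10, 50, 5_000):
        print(us, keep_off_cpu_event(timedelta(microseconds=us)))
```

Raising the threshold (for example to 1ms) keeps only longer blocking events, which is the trade-off the description recommends when overhead is a concern.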
#2.3.4.3 Memory
#2.3.4.3.1 Disabled
Tags:
agent_restart ee_feature
FQCN:
inputs.ebpf.profile.memory.disabled
Upgrade from old version: static_config.ebpf.memory-profile.disabled
Default value:
inputs:
  ebpf:
    profile:
      memory:
        disabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
eBPF memory profile switch.
#2.3.4.4 Preprocess
#2.3.4.4.1 Stack Compression
Tags:
agent_restart
FQCN:
inputs.ebpf.profile.preprocess.stack_compression
Default value:
inputs:
  ebpf:
    profile:
      preprocess:
        stack_compression: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Compress the call stack before sending data. Compression can effectively reduce the agent's memory usage, data transmission bandwidth consumption, and ingester's CPU overhead. However, it also increases the CPU usage of the agent. Tests have shown that compressing the on-cpu function call stack of the deepflow-agent can reduce bandwidth consumption by x times, but it will result in an additional y% CPU usage for the agent.
#2.3.5 Tuning
#2.3.5.1 Collector Queue Size
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.collector_queue_size
Upgrade from old version: static_config.ebpf-collector-queue-size
Default value:
inputs:
  ebpf:
    tunning:
      collector_queue_size: 65535
Schema:
Key | Value |
---|---|
Type | int |
Range | [4096, 64000000] |
Description:
The length of the following queues:
- 0-ebpf-to-ebpf-collector
- 1-proc-event-to-sender
- 1-profile-to-sender
#2.3.5.2 Userspace Worker Threads
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.userspace_worker_threads
Upgrade from old version: static_config.ebpf.thread-num
Default value:
inputs:
  ebpf:
    tunning:
      userspace_worker_threads: 1
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 1024] |
Description:
The number of worker threads that participate in data processing in user space. The effective maximum is the number of logical CPU cores on the host.
#2.3.5.3 Perf Pages Count
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.perf_pages_count
Upgrade from old version: static_config.ebpf.perf-pages-count
Default value:
inputs:
  ebpf:
    tunning:
      perf_pages_count: 128
Schema:
Key | Value |
---|---|
Type | int |
Range | [32, 8192] |
Description:
The number of pages occupied by the shared memory of the kernel, used for perf data transfer. The value is 2^n (5 <= n <= 13). If the value is between 2^n and 2^(n+1), it will be automatically adjusted by the ebpf configurator down to the lower bound 2^n.
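To illustrate the adjustment rule with a hypothetical value: 300 lies between 2^8 = 256 and 2^9 = 512, so a configuration like the following would be expected to take effect as 256 pages:

```yaml
inputs:
  ebpf:
    tunning:
      # Hypothetical value; by the rule above it is adjusted down to 256 (2^8).
      perf_pages_count: 300
```

Setting a power of two directly avoids the implicit adjustment.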
#2.3.5.4 Kernel Ring Size
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.kernel_ring_size
Upgrade from old version: static_config.ebpf.ring-size
Default value:
inputs:
  ebpf:
    tunning:
      kernel_ring_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [8192, 131072] |
Description:
The size of the ring cache queue. The value is 2^n (13 <= n <= 17). If the value is between 2^n and 2^(n+1), it will be automatically adjusted by the ebpf configurator down to the lower bound 2^n.
#2.3.5.5 Maximum Socket Entries
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.max_socket_entries
Upgrade from old version: static_config.ebpf.max-socket-entries
Default value:
inputs:
  ebpf:
    tunning:
      max_socket_entries: 131072
Schema:
Key | Value |
---|---|
Type | int |
Range | [10000, 2000000] |
Description:
Set the maximum number of hash table entries for socket tracking, depending on the number of concurrent requests in the actual scenario.
#2.3.5.6 Socket Map Reclaim Threshold
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.socket_map_reclaim_threshold
Upgrade from old version: static_config.ebpf.socket-map-max-reclaim
Default value:
inputs:
  ebpf:
    tunning:
      socket_map_reclaim_threshold: 120000
Schema:
Key | Value |
---|---|
Type | int |
Range | [8000, 2000000] |
Description:
The threshold for cleaning socket map table entries.
#2.3.5.7 Maximum Trace Entries
Tags:
agent_restart
FQCN:
inputs.ebpf.tunning.max_trace_entries
Upgrade from old version: static_config.ebpf.max-trace-entries
Default value:
inputs:
  ebpf:
    tunning:
      max_trace_entries: 131072
Schema:
Key | Value |
---|---|
Type | int |
Range | [10000, 2000000] |
Description:
Set the maximum number of hash table entries for thread/coroutine tracking sessions.
#2.4 Resources
#2.4.1 Push Interval
Tags:
hot_update
FQCN:
inputs.resources.push_interval
Upgrade from old version: platform_sync_interval
Default value:
inputs:
  resources:
    push_interval: 10s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10s', '3600s'] |
Description:
The interval at which deepflow-agent actively reports resource information to deepflow-server.
#2.4.2 Collect Private Cloud Resource
#2.4.2.1 Hypervisor Resource Enabled
Tags:
hot_update
FQCN:
inputs.resources.private_cloud.hypervisor_resource_enabled
Upgrade from old version: platform_enabled
Default value:
inputs:
  resources:
    private_cloud:
      hypervisor_resource_enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When enabled, deepflow-agent will automatically synchronize virtual machine and network information on the KVM (or Host) to deepflow-server.
#2.4.2.2 VM MAC Source
Tags:
hot_update
FQCN:
inputs.resources.private_cloud.vm_mac_source
Upgrade from old version: if_mac_source
Default value:
inputs:
  resources:
    private_cloud:
      vm_mac_source: 0
Enum options:
Value | Note |
---|---|
0 | Interface MAC Address |
1 | Interface Name |
2 | Qemu XML File |
Schema:
Key | Value |
---|---|
Type | int |
Description:
How to extract the real MAC address of the virtual machine when the agent runs on the KVM host.
Explanation of the options:
- 0: extracted from tap interface MAC address
- 1: extracted from tap interface name
- 2: extracted from the XML file of the virtual machine
#2.4.2.3 VM XML Directory
Tags:
hot_update
FQCN:
inputs.resources.private_cloud.vm_xml_directory
Upgrade from old version: vm_xml_path
Default value:
inputs:
  resources:
    private_cloud:
      vm_xml_directory: /etc/libvirt/qemu/
Schema:
Key | Value |
---|---|
Type | string |
Range | [0, 100] |
Description:
VM XML file directory.
#2.4.2.4 VM MAC Mapping Script
Tags:
agent_restart
FQCN:
inputs.resources.private_cloud.vm_mac_mapping_script
Upgrade from old version: static_config.tap-mac-script
Default value:
inputs:
  resources:
    private_cloud:
      vm_mac_mapping_script: ''
Schema:
Key | Value |
---|---|
Type | string |
Range | [0, 100] |
Description:
The MAC address mapping of TAP NICs in complex environments can be constructed by writing a script. The following conditions must be met to use this script:
- if_mac_source = 2 (vm_mac_source in the current configuration)
- tap_mode = 0
- The name of the TAP NIC is the same as in the virtual machine XML file
- The format of the script output is as follows:
  - tap2d283dfe,11:22:33:44:55:66
  - tap2d283223,aa:bb:cc:dd:ee:ff
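A minimal sketch of a configuration that satisfies the first condition and points at a mapping script. The script path is a hypothetical placeholder, and the script itself is assumed to print name,MAC pairs in the output format listed above:

```yaml
inputs:
  resources:
    private_cloud:
      vm_mac_source: 2  # Qemu XML File (formerly if_mac_source = 2)
      vm_xml_directory: /etc/libvirt/qemu/
      # Hypothetical path; replace with your own script.
      vm_mac_mapping_script: /usr/local/bin/tap-mac-map.sh
```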
#2.4.3 Collect K8s Resource
#2.4.3.1 Enabled
Tags:
hot_update
FQCN:
inputs.resources.kubernetes.enabled
Upgrade from old version: kubernetes_api_enabled
Default value:
inputs:
  resources:
    kubernetes:
      enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When there are multiple deepflow-agents in the same K8s cluster, only one deepflow-agent will be enabled to collect K8s resources.
#2.4.3.2 K8s Namespace
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.kubernetes_namespace
Upgrade from old version: static_config.kubernetes-namespace
Default value:
inputs:
  resources:
    kubernetes:
      kubernetes_namespace: null
Schema:
Key | Value |
---|---|
Type | string |
Description:
TODO
#2.4.3.3 K8s API Resources
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources
Upgrade from old version: static_config.kubernetes-resources
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: namespaces
      - name: nodes
      - name: pods
      - name: replicationcontrollers
      - name: services
      - name: daemonsets
      - name: deployments
      - name: replicasets
      - name: statefulsets
      - name: ingresses
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Specify kubernetes resources to watch.
To disable a resource, add an entry to the list with disabled: true:
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: services
        disabled: true
To enable a resource, add an entry for this resource to the list. Be advised that this setting overrides the default of the same resource. For example, enabling statefulsets in both group apps (the default) and apps.kruise.io requires two entries:
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: statefulsets
        group: apps
      - name: statefulsets
        group: apps.kruise.io
        version: v1beta1
To watch routes in OpenShift, you can use the following settings:
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: ingresses
        disabled: true
      - name: routes
#2.4.3.3.1 Name
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources.name
Upgrade from old version: static_config.kubernetes-resources.name
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: ''
Enum options:
Value | Note |
---|---|
namespaces | |
nodes | |
pods | |
replicationcontrollers | |
services | |
daemonsets | |
deployments | |
replicasets | |
statefulsets | |
ingresses | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
K8s API resource name.
#2.4.3.3.2 Group
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources.group
Upgrade from old version: static_config.kubernetes-resources.group
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - group: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
K8s API resource group.
#2.4.3.3.3 Version
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources.version
Upgrade from old version: static_config.kubernetes-resources.version
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - version: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
K8s API version.
#2.4.3.3.4 Disabled
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources.disabled
Upgrade from old version: static_config.kubernetes-resources.disabled
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
K8s API resource disabled.
#2.4.3.3.5 Field Selector
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_resources.field_selector
Upgrade from old version: static_config.kubernetes-resources.field-selector
Default value:
inputs:
  resources:
    kubernetes:
      api_resources:
      - field_selector: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
K8s API resource field selector.
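As a hedged illustration, a field selector can narrow what is watched for a single resource. The sketch below assumes the value is passed through as a standard Kubernetes field selector; the node name node-1 is a placeholder:

```yaml
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: pods
        # Assumption: standard Kubernetes field selector syntax.
        field_selector: spec.nodeName=node-1
```

As noted for api_resources, an entry for a resource overrides the default of the same resource.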
#2.4.3.4 K8s API List Page Size
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_list_page_size
Upgrade from old version: static_config.kubernetes-api-list-limit
Default value:
inputs:
  resources:
    kubernetes:
      api_list_page_size: 1000
Schema:
Key | Value |
---|---|
Type | int |
Range | [10, 4294967295] |
Description:
The maximum number of entries per page when listing resources through the K8s API.
#2.4.3.5 K8s API List Maximum Interval
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.api_list_max_interval
Upgrade from old version: static_config.kubernetes-api-list-interval
Default value:
inputs:
  resources:
    kubernetes:
      api_list_max_interval: 10m
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10m', '30d'] |
Description:
The interval at which resources are listed while the watcher is idle.
#2.4.3.6 Ingress Flavour
Tags:
deprecated
FQCN:
inputs.resources.kubernetes.ingress_flavour
Upgrade from old version: static_config.ingress-flavour
Default value:
inputs:
  resources:
    kubernetes:
      ingress_flavour: kubernetes
Schema:
Key | Value |
---|---|
Type | string |
#2.4.3.7 Pod MAC Collection Method
Tags:
agent_restart
FQCN:
inputs.resources.kubernetes.pod_mac_collection_method
Upgrade from old version: static_config.kubernetes-poller-type
Default value:
inputs:
  resources:
    kubernetes:
      pod_mac_collection_method: adaptive
Enum options:
Value | Note |
---|---|
adaptive | |
active | |
passive | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
In active mode, deepflow-agent enters the netns of other Pods through the setns syscall to query the MAC and IP addresses. In this mode, the setns operation requires the SYS_ADMIN permission. In passive mode, deepflow-agent calculates the MAC and IP addresses used by Pods by capturing ARP/ND traffic. When set to adaptive, active mode is used first.
#2.4.4 Pull Resource From Controller
#2.4.4.1 Domain Filter
Tags:
hot_update
FQCN:
inputs.resources.pull_resource_from_controller.domain_filter
Upgrade from old version: domains
Default value:
inputs:
  resources:
    pull_resource_from_controller:
      domain_filter:
      - 0
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | int |
Description:
The default value 0 means all domains. It can also be set to a list of domain lcuuids, which you can obtain through 'deepflow-ctl domain list'.
Note: The list of MAC and IP addresses is used by deepflow-agent to inject tags into data. This configuration can reduce the number and frequency of MAC and IP addresses delivered by deepflow-server to deepflow-agent. When there is no cross-domain service request, deepflow-server can be configured to only deliver the information in the local domain to deepflow-agent.
#2.4.4.2 Only K8s Pod IP in Local Cluster
Tags:
hot_update
FQCN:
inputs.resources.pull_resource_from_controller.only_kubernetes_pod_ip_in_local_cluster
Upgrade from old version: pod_cluster_internal_ip
Default value:
inputs:
  resources:
    pull_resource_from_controller:
      only_kubernetes_pod_ip_in_local_cluster: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
The list of MAC and IP addresses is used by deepflow-agent to inject tags into data. This configuration can reduce the number and frequency of MAC and IP addresses delivered by deepflow-server to deepflow-agent. When the Pod IP is not used for direct communication between the K8s cluster and the outside world, deepflow-server can be configured to only deliver the information in the local K8s cluster to deepflow-agent.
#2.5 Integration
#2.5.1 Enabled
Tags:
hot_update
FQCN:
inputs.integration.enabled
Upgrade from old version: external_agent_http_proxy_enabled
Default value:
inputs:
  integration:
    enabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to enable receiving external data sources such as Prometheus, Telegraf, OpenTelemetry, and SkyWalking.
#2.5.2 Listen Port
Tags:
hot_update
FQCN:
inputs.integration.listen_port
Upgrade from old version: external_agent_http_proxy_port
Default value:
inputs:
  integration:
    listen_port: 38086
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
Listen port of the data integration socket.
#2.5.3 Compression
#2.5.3.1 Trace
Tags:
agent_restart
FQCN:
inputs.integration.compression.trace
Upgrade from old version: static_config.external-agent-http-proxy-compressed
Default value:
inputs:
  integration:
    compression:
      trace: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to compress the integrated trace data received by deepflow-agent. The compression ratio is about 5:1~10:1. Turning on this feature will result in higher CPU consumption of deepflow-agent.
#2.5.3.2 Profile
Tags:
agent_restart
FQCN:
inputs.integration.compression.profile
Upgrade from old version: static_config.external-agent-http-proxy-compressed
Default value:
inputs:
  integration:
    compression:
      profile: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to compress the integrated profile data received by deepflow-agent. The compression ratio is about 5:1~10:1. Turning on this feature will result in higher CPU consumption of deepflow-agent.
#2.5.4 Prometheus Extra Labels
Support for getting extra labels from headers in http requests from RemoteWrite.
#2.5.4.1 Enabled
Tags:
agent_restart
FQCN:
inputs.integration.prometheus_extra_labels.enabled
Upgrade from old version: static_config.prometheus-extra-config.enabled
Default value:
inputs:
  integration:
    prometheus_extra_labels:
      enabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Prometheus extra labels switch.
#2.5.4.2 Extra Labels
Tags:
agent_restart
FQCN:
inputs.integration.prometheus_extra_labels.extra_labels
Upgrade from old version: static_config.prometheus-extra-config.labels
Default value:
inputs:
  integration:
    prometheus_extra_labels:
      extra_labels: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
List of labels. Labels in this list are sent. Each label is a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*
#2.5.4.3 Label Length Limit
Tags:
agent_restart
FQCN:
inputs.integration.prometheus_extra_labels.label_length
Upgrade from old version: static_config.prometheus-extra-config.labels-limit
Default value:
inputs:
  integration:
    prometheus_extra_labels:
      label_length: 1024
Schema:
Key | Value |
---|---|
Type | int |
Unit | byte |
Range | [1024, 1048576] |
Description:
The size limit of the parsed key.
#2.5.4.4 Value Length Limit
Tags:
agent_restart
FQCN:
inputs.integration.prometheus_extra_labels.value_length
Upgrade from old version: static_config.prometheus-extra-config.values-limit
Default value:
inputs:
  integration:
    prometheus_extra_labels:
      value_length: 4096
Schema:
Key | Value |
---|---|
Type | int |
Unit | byte |
Range | [4096, 4194304] |
Description:
The size limit of the parsed value.
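Putting the four settings of this section together, a sketch of an enabled configuration might look as follows. The label names cluster and tenant_id are hypothetical; the two length limits are simply the documented defaults:

```yaml
inputs:
  integration:
    prometheus_extra_labels:
      enabled: true
      # Hypothetical label names; each must match [a-zA-Z_][a-zA-Z0-9_]*
      extra_labels:
      - cluster
      - tenant_id
      label_length: 1024  # bytes (default)
      value_length: 4096  # bytes (default)
```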
#2.5.5 Feature Control
#2.5.5.1 Profile Integration Disabled
Tags:
agent_restart
FQCN:
inputs.integration.feature_control.profile_integration_disabled
Upgrade from old version: static_config.external-profile-integration-disabled
Default value:
inputs:
  integration:
    feature_control:
      profile_integration_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
#2.5.5.2 Trace Integration Disabled
Tags:
agent_restart
FQCN:
inputs.integration.feature_control.trace_integration_disabled
Upgrade from old version: static_config.external-trace-integration-disabled
Default value:
inputs:
  integration:
    feature_control:
      trace_integration_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
#2.5.5.3 Metric Integration Disabled
Tags:
agent_restart
FQCN:
inputs.integration.feature_control.metric_integration_disabled
Upgrade from old version: static_config.external-metric-integration-disabled
Default value:
inputs:
  integration:
    feature_control:
      metric_integration_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
#2.5.5.4 Log Integration Disabled
Tags:
agent_restart
FQCN:
inputs.integration.feature_control.log_integration_disabled
Upgrade from old version: static_config.external-log-integration-disabled
Default value:
inputs:
  integration:
    feature_control:
      log_integration_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
#3. Processors
#3.1 Packet
#3.1.1 Policy
#3.1.1.1 Fast-path Map Size
Tags:
agent_restart
FQCN:
processors.packet.policy.fast_path_map_size
Upgrade from old version: static_config.fast-path-map-size
Default value:
processors:
  packet:
    policy:
      fast_path_map_size: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
When set to 0, deepflow-agent will automatically adjust the map size according to max_memory.
#3.1.1.2 Fast-path Disabled
Tags:
agent_restart
FQCN:
processors.packet.policy.fast_path_disabled
Upgrade from old version: static_config.fast-path-disabled
Default value:
processors:
  packet:
    policy:
      fast_path_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When set to true, deepflow-agent will not use fast path.
#3.1.1.3 Forward Table Capacity
Tags:
agent_restart
FQCN:
processors.packet.policy.forward_table_capacity
Upgrade from old version: static_config.forward-capacity
Default value:
processors:
  packet:
    policy:
      forward_table_capacity: 16384
Schema:
Key | Value |
---|---|
Type | int |
Range | [16384, 64000000] |
Description:
The larger this value, the more memory may be used.
#3.1.1.4 Max First-path Level
Tags:
agent_restart
FQCN:
processors.packet.policy.max_first_path_level
Upgrade from old version: static_config.first-path-level
Default value:
processors:
  packet:
    policy:
      max_first_path_level: 8
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 16] |
Description:
When this value is larger, the memory overhead is smaller, but the performance of policy matching is worse.
#3.1.2 TCP Header
#3.1.2.1 Block Size
Tags:
agent_restart ee_feature
FQCN:
processors.packet.tcp_header.block_size
Upgrade from old version: static_config.packet-sequence-block-size
Default value:
processors:
  packet:
    tcp_header:
      block_size: 256
Schema:
Key | Value |
---|---|
Type | int |
Range | [16, 8192] |
Description:
When generating TCP header data, each flow uses one block to compress and store multiple TCP headers, and the block size can be set here.
#3.1.2.2 Sender Queue Size
Tags:
agent_restart ee_feature
FQCN:
processors.packet.tcp_header.sender_queue_size
Upgrade from old version: static_config.packet-sequence-queue-size
Default value:
processors:
  packet:
    tcp_header:
      sender_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues (to UniformCollectSender):
- 1-packet-sequence-block-to-uniform-collect-sender
#3.1.2.3 Sender Queue Count
Tags:
agent_restart ee_feature
FQCN:
processors.packet.tcp_header.sender_queue_count
Upgrade from old version: static_config.packet-sequence-queue-count
Default value:
processors:
  packet:
    tcp_header:
      sender_queue_count: 1
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 64] |
Description:
The number of replicas for each output queue of the PacketSequence.
#3.1.2.4 Header Fields Flag
Tags:
agent_restart ee_feature
FQCN:
processors.packet.tcp_header.header_fields_flag
Upgrade from old version: static_config.packet-sequence-flag
Default value:
processors:
  packet:
    tcp_header:
      header_fields_flag: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 255] |
Description:
header_fields_flag (formerly packet-sequence-flag) determines which fields are reported. The default value 0 disables the feature, and 255 reports all fields. The field corresponding to each bit:
Bit | Field |
---|---|
7 | FLAG |
6 | SEQ |
5 | ACK |
4 | PAYLOAD_SIZE |
3 | WINDOW_SIZE |
2 | OPT_MSS |
1 | OPT_WS |
0 | OPT_SACK |
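As a worked example of the bit layout (FLAG at bit 7 down to OPT_SACK at bit 0), reporting only SEQ (bit 6) and ACK (bit 5) gives 2^6 + 2^5 = 64 + 32 = 96. A sketch of that setting:

```yaml
processors:
  packet:
    tcp_header:
      # 0b01100000 = 96: SEQ (bit 6) + ACK (bit 5)
      header_fields_flag: 96
```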
#3.1.3 PCAP Stream
#3.1.3.1 Receiver Queue Size
Tags:
agent_restart ee_feature
FQCN:
processors.packet.pcap_stream.receiver_queue_size
Upgrade from old version: static_config.pcap.queue-size
Default value:
processors:
  packet:
    pcap_stream:
      receiver_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues:
- 1-mini-meta-packet-to-pcap
#3.1.3.2 Buffer Size Per Flow
Tags:
agent_restart ee_feature
FQCN:
processors.packet.pcap_stream.buffer_size_per_flow
Upgrade from old version: static_config.pcap.flow-buffer-size
Default value:
processors:
  packet:
    pcap_stream:
      buffer_size_per_flow: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [64, 64000000] |
Description:
The buffer is flushed when any one flow reaches this limit.
#3.1.3.3 Total Buffer Size
Tags:
agent_restart ee_feature
FQCN:
processors.packet.pcap_stream.total_buffer_size
Upgrade from old version: static_config.pcap.buffer-size
Default value:
processors:
  packet:
    pcap_stream:
      total_buffer_size: 88304
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The buffer is flushed when the total data size reaches this limit. The value cannot exceed the sender buffer size of 128K.
#3.1.3.4 Flush Interval
Tags:
agent_restart ee_feature
FQCN:
processors.packet.pcap_stream.flush_interval
Upgrade from old version: static_config.pcap.flush-interval
Default value:
processors:
  packet:
    pcap_stream:
      flush_interval: 1m
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '10m'] |
Description:
A flow is flushed if its first packet is older than this interval.
#3.1.4 TOA (TCP Option Address)
#3.1.4.1 Sender Queue Size
Tags:
agent_restart
FQCN:
processors.packet.toa.sender_queue_size
Upgrade from old version: static_config.toa-sender-queue-size
Default value:
processors:
  packet:
    toa:
      sender_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
TODO
#3.1.4.2 Cache Size
Tags:
agent_restart
FQCN:
processors.packet.toa.cache_size
Upgrade from old version: static_config.toa-lru-cache-size
Default value:
processors:
  packet:
    toa:
      cache_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 64000000] |
Description:
Size of the TCP option address info cache.
#3.2 Request Log
#3.2.1 Application Protocol Inference
#3.2.1.1 Inference Maximum Retries
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.inference_max_retries
Upgrade from old version: static_config.l7-protocol-inference-max-fail-count
Default value:
processors:
  request_log:
    application_protocol_inference:
      inference_max_retries: 5
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 10000] |
Description:
deepflow-agent marks the long-lived stream and application protocol for each <vpc, ip, protocol, port> tuple. When the traffic corresponding to a tuple fails to be identified many times (across multiple packets, Socket Data, or Function Data), the tuple is marked as an unknown type. This stops deepflow-agent from retrying (which would incur significant computational overhead) until the duration exceeds inference_result_ttl (formerly l7-protocol-inference-ttl).
#3.2.1.2 Inference Result TTL
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.inference_result_ttl
Upgrade from old version: static_config.l7-protocol-inference-ttl
Default value:
processors:
  request_log:
    application_protocol_inference:
      inference_result_ttl: 60
Schema:
Key | Value |
---|---|
Type | duration |
Range | [0, '1d'] |
Description:
deepflow-agent will mark the application protocol for each <vpc, ip, protocol, port> tuple. In order to avoid misidentification caused by IP changes, the validity period after successfully identifying the protocol will be limited to this value.
#3.2.1.3 Enabled Protocols
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.enabled_protocols
Upgrade from old version: static_config.l7-protocol-enabled
Default value:
processors:
  request_log:
    application_protocol_inference:
      enabled_protocols:
      - HTTP
      - HTTP2
      - Dubbo
      - SofaRPC
      - FastCGI
      - bRPC
      - MySQL
      - PostgreSQL
      - Oracle
      - Redis
      - MongoDB
      - Kafka
      - MQTT
      - AMQP
      - OpenWire
      - NATS
      - Pulsar
      - ZMTP
      - DNS
      - TLS
      - Custom
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Turning off some protocol identification can reduce deepflow-agent resource consumption.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Oracle and TLS are only supported in the Enterprise Edition.
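A sketch of a trimmed-down list for an environment that, hypothetically, only carries HTTP, MySQL, Redis, and DNS traffic. Protocols left out of the list are assumed to be skipped during identification, which is where the resource saving would come from:

```yaml
processors:
  request_log:
    application_protocol_inference:
      enabled_protocols:
      - HTTP
      - MySQL
      - Redis
      - DNS
```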
#3.2.1.4 Protocol Special Config
#3.2.1.4.1 Oracle
#3.2.1.4.1.1 Integer Byte Order
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.protocol_special_config.oracle.is_be
Upgrade from old version: static_config.oracle-parse-config.is-be
Default value:
processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          is_be: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether Oracle integers are encoded in big-endian byte order.
#3.2.1.4.1.2 Integer Compressed
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.protocol_special_config.oracle.int_compressed
Upgrade from old version: static_config.oracle-parse-config.int-compress
Default value:
processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          int_compressed: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether the Oracle integer encoding is compressed.
#3.2.1.4.1.3 Response 0x04 with Extra Byte
Tags:
agent_restart
FQCN:
processors.request_log.application_protocol_inference.protocol_special_config.oracle.resp_0x04_extra_byte
Upgrade from old version: static_config.oracle-parse-config.resp-0x04-extra-byte
Default value:
processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          resp_0x04_extra_byte: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Because the response with data ID 0x04 has a different structure in different versions, it may contain one extra byte before the affected-rows field.
#3.2.2 Filters
#3.2.2.1 Port Number Pre-filters
Tags:
agent_restart
FQCN:
processors.request_log.filters.port_number_prefilters
Upgrade from old version: static_config.l7-protocol-ports
Default value:
processors:
  request_log:
    filters:
      port_number_prefilters:
        AMQP: 1-65535
        Custom: 1-65535
        DNS: 53,5353
        Dubbo: 1-65535
        FastCGI: 1-65535
        HTTP: 1-65535
        HTTP2: 1-65535
        Kafka: 1-65535
        MQTT: 1-65535
        MongoDB: 1-65535
        MySQL: 1-65535
        NATS: 1-65535
        OpenWire: 1-65535
        Oracle: 1521
        PostgreSQL: 1-65535
        Pulsar: 1-65535
        Redis: 1-65535
        SofaRPC: 1-65535
        TLS: 443,6443
        ZMTP: 1-65535
        bRPC: 1-65535
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | |
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Port-list example: 80,1000-2000
HTTP2 and TLS are only used for kprobe, not applicable to uprobe. All data obtained through uprobe is not subject to port restrictions.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Oracle and TLS are only supported in the Enterprise Edition.
Attention: use HTTP2 for the gRPC protocol.
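A short sketch of the port-list syntax (comma-separated ports and ranges) applied to a few protocols. All port numbers here are hypothetical choices for illustration:

```yaml
processors:
  request_log:
    filters:
      port_number_prefilters:
        HTTP: 80,8000-9000  # hypothetical ports: one port plus a range
        MySQL: 3306
        HTTP2: 50051        # gRPC traffic is configured under HTTP2
```

Whether protocols omitted from this map keep their defaults is not stated here, so listing every protocol you rely on is the cautious choice.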
#3.2.2.2 Tag Filters
Tags:
agent_restart
FQCN:
processors.request_log.filters.tag_filters
Upgrade from old version: static_config.l7-log-blacklist
Default value:
processors:
  request_log:
    filters:
      tag_filters:
        AMQP: []
        DNS: []
        Dubbo: []
        FastCGI: []
        HTTP: []
        HTTP2: []
        Kafka: []
        MQTT: []
        MongoDB: []
        MySQL: []
        NATS: []
        OpenWire: []
        Oracle: []
        PostgreSQL: []
        Pulsar: []
        Redis: []
        SOFARPC: []
        TLS: []
        ZMTP: []
        bRPC: []
        gRPC: []
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | |
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Tag filter example:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_name: request_resource  # endpoint, request_type, request_domain, request_resource
          operator: equal  # equal, prefix
          field_value: somevalue
        HTTP2: []
        # other protocols
An l7_flow_log blacklist can be configured for each protocol, preventing request logs that match the blacklist from being collected by the agent or included in application performance metrics. It is recommended to place only non-business request logs, such as heartbeats or health checks, in this blacklist. Including business request logs might lead to breaks in the distributed tracing tree.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Oracle and TLS are only supported in the Enterprise Edition.
#3.2.2.2.1 $HTTP Tag Filters
Tags:
agent_restart
FQCN:
processors.request_log.filters.tag_filters.HTTP
Upgrade from old version: static_config.l7-log-blacklist.$protocol
Default value:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP: []
Schema:
Key | Value |
---|---|
Type | dict |
Description:
HTTP Tag filter example:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_name: request_resource  # endpoint, request_type, request_domain, request_resource
          operator: equal  # equal, prefix
          field_value: somevalue
An l7_flow_log tag_filter can be configured for each protocol, preventing request logs that match the filter from being collected by the agent or included in application performance metrics. It is recommended to place only non-business request logs, such as heartbeats or health checks, in this filter. Including business request logs might lead to breaks in the distributed tracing tree.
Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
Oracle and TLS are only supported in the Enterprise Edition.
#3.2.2.2.1.1 Field Name
Tags:
agent_restart
FQCN:
processors.request_log.filters.tag_filters.HTTP.field_name
Upgrade from old version: static_config.l7-log-blacklist.$protocol.field-name
Default value:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_name: ''
Enum options:
Value | Note |
---|---|
endpoint | |
request_type | |
request_domain | |
request_resource | |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Match field name.
#3.2.2.2.1.2 Operator
Tags:
agent_restart
FQCN:
processors.request_log.filters.tag_filters.HTTP.operator
Upgrade from old version: static_config.l7-log-blacklist.$protocol.operator
Default value:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - operator: ''
Enum options:
Value | Note |
---|---|
equal | |
prefix |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Match operator.
#3.2.2.2.1.3 Field Value
Tags:
agent_restart
FQCN:
processors.request_log.filters.tag_filters.HTTP.field_value
Upgrade from old version: static_config.l7-log-blacklist.$protocol.value
Default value:
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_value: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
Match field value.
#3.2.2.3 Unconcerned DNS NXDOMAIN
Tags:
agent_restart
FQCN:
processors.request_log.filters.unconcerned_dns_nxdomain_response_suffixes
Upgrade from old version: static_config.l7-protocol-advanced-features.unconcerned-dns-nxdomain-response-suffixes
Default value:
processors:
  request_log:
    filters:
      unconcerned_dns_nxdomain_response_suffixes: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
You might not be concerned about certain DNS NXDOMAIN errors and may wish to ignore them. For example, when a K8s Pod tries to resolve an external domain name, it first concatenates it with the internal domain suffix of the cluster and attempts to resolve it. All these attempts will receive an NXDOMAIN reply before it finally requests the original domain name directly, and these errors may not be of concern to you. In such cases, you can configure their response_result suffix here, so that the corresponding response_status in the l7_flow_log is forcibly set to Success.
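The suffix rule above can be sketched in Python as follows. This is only an illustration under assumptions: the function names, the status strings, and the choice to compare the suffix against a plain name string are hypothetical and not taken from the agent source.

```python
from typing import List


def is_unconcerned(name: str, suffixes: List[str]) -> bool:
    """True if the compared name ends with any configured suffix."""
    return any(name.endswith(suffix) for suffix in suffixes)


def adjust_status(name: str, is_nxdomain: bool, original_status: str,
                  suffixes: List[str]) -> str:
    """Force the status to Success only for NXDOMAIN replies with a listed suffix."""
    if is_nxdomain and is_unconcerned(name, suffixes):
        return "Success"
    return original_status


# Example suffix list for a Kubernetes cluster (illustrative values).
K8S_SUFFIXES = [".svc.cluster.local", ".cluster.local"]
```

An NXDOMAIN reply for example.com.svc.cluster.local would then be reported as Success, while an NXDOMAIN reply for example.com keeps its original status.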
#3.2.3 Timeouts
#3.2.3.1 TCP Request Timeout
Tags:
agent_restart
FQCN:
processors.request_log.timeouts.tcp_request_timeout
Upgrade from old version: static_config.rrt-tcp-timeout
Default value:
processors:
  request_log:
    timeouts:
      tcp_request_timeout: 1800s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10s', '3600s'] |
Description:
The timeout used when calculating the rrt of l7 logs over TCP. When the rrt exceeds this value, the request is treated as timed out: it is excluded from the rrt sum and average, and the request and response are not merged during session aggregation. The value must be greater than the session aggregate SLOT_TIME (constant, 10s) and less than 3600s.
#3.2.3.2 UDP Request Timeout
Tags:
agent_restart
FQCN:
processors.request_log.timeouts.udp_request_timeout
Upgrade from old version: static_config.rrt-udp-timeout
Default value:
processors:
  request_log:
    timeouts:
      udp_request_timeout: 150s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['10s', '300s'] |
Description:
The timeout used when calculating the rrt of l7 logs over UDP. When the rrt exceeds this value, the request is treated as timed out: it is excluded from the rrt sum and average, and the request and response are not merged during session aggregation. The value must be greater than the session aggregate SLOT_TIME (constant, 10s) and less than 300s.
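The effect of the TCP and UDP request timeouts on the rrt statistics can be sketched as below. This is a simplified illustration, not the agent's code: the function name, the millisecond unit, and the returned tuple layout are assumptions.

```python
from typing import List, Tuple


def summarize_rrt(rrts_ms: List[int], timeout_ms: int) -> Tuple[int, float, int]:
    """Return (rrt sum, rrt average, timeout count) for one protocol.

    Samples whose rrt exceeds the timeout are counted as timeouts and
    left out of both the sum and the average.
    """
    valid = [rrt for rrt in rrts_ms if rrt <= timeout_ms]
    timeouts = len(rrts_ms) - len(valid)
    total = sum(valid)
    average = total / len(valid) if valid else 0.0
    return total, average, timeouts
```

For example, with the default tcp_request_timeout of 1800s (1,800,000 ms), samples of 200 ms, 400 ms and 2,000,000 ms give a sum of 600, an average of 300.0 and one timeout.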
#3.2.3.3 Session Aggregate Window Duration
Tags:
agent_restart
FQCN:
processors.request_log.timeouts.session_aggregate_window_duration
Upgrade from old version: static_config.l7-log-session-aggr-timeout
Default value:
processors:
  request_log:
    timeouts:
      session_aggregate_window_duration: 120s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['20s', '300s'] |
Description:
l7_flow_log aggregate window.
#3.2.4 Tag Extraction
#3.2.4.1 Tracing Tag
#3.2.4.1.1 HTTP Real Client
Tags:
hot_update
FQCN:
processors.request_log.tag_extraction.tracing_tag.http_real_client
Upgrade from old version: http_log_proxy_client
Default value:
processors:
  request_log:
    tag_extraction:
      tracing_tag:
        http_real_client: X_Forwarded_For
Schema:
Key | Value |
---|---|
Type | string |
Description:
It is used to extract the real client IP field in the HTTP header, such as X-Forwarded-For, etc. Leave it empty to disable this feature.
#3.2.4.1.2 X-Request-ID
Tags:
hot_update
FQCN:
processors.request_log.tag_extraction.tracing_tag.x_request_id
Upgrade from old version: http_log_x_request_id
Default value:
processors:
  request_log:
    tag_extraction:
      tracing_tag:
        x_request_id: X_Request_ID
Schema:
Key | Value |
---|---|
Type | string |
Description:
It is used to extract the fields in the HTTP header that are used to uniquely identify the same request before and after the gateway, such as X-Request-ID, etc. This feature can be turned off by setting it to empty.
#3.2.4.1.3 APM TraceID
Tags:
hot_update
FQCN:
processors.request_log.tag_extraction.tracing_tag.apm_trace_id
Upgrade from old version: http_log_trace_id
Default value:
processors:
  request_log:
    tag_extraction:
      tracing_tag:
        apm_trace_id:
        - traceparent
        - sw8
Schema:
Key | Value |
---|---|
Type | string |
Description:
Used to extract the TraceID field in HTTP and RPC headers, supports filling in multiple values separated by commas. This feature can be turned off by setting it to empty.
#3.2.4.1.4 APM SpanID
Tags:
hot_update
FQCN:
processors.request_log.tag_extraction.tracing_tag.apm_span_id
Upgrade from old version: http_log_span_id
Default value:
processors:
  request_log:
    tag_extraction:
      tracing_tag:
        apm_span_id:
        - traceparent
        - sw8
Schema:
Key | Value |
---|---|
Type | string |
Description:
Used to extract the SpanID field in HTTP and RPC headers, supports filling in multiple values separated by commas. This feature can be turned off by setting it to empty.
#3.2.4.2 HTTP Endpoint
#3.2.4.2.1 Extraction Disabled
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.http_endpoint.extraction_disabled
Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.disabled
Default value:
processors:
  request_log:
    tag_extraction:
      http_endpoint:
        extraction_disabled: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
HTTP endpoint extraction is enabled by default.
#3.2.4.2.2 Match Rules
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.http_endpoint.match_rules
Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules
Default value:
processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - keep_segments: 2
          url_prefix: ''
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Extract the endpoint according to the following rules:
- Find the longest matching prefix, following the principle of "longest prefix matching"
- Take the first few segments of the URL (the content between two / is regarded as one segment) as the endpoint
By default, two segments are extracted from the URL. For example, the URL /a/b/c?query=xxx has 3 segments, and /a/b is extracted as the endpoint.
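The two rules above can be sketched in Python as follows. This is a minimal illustration rather than the agent's implementation: the function name is hypothetical, and it assumes that keep_segments counts segments from the start of the path (including the matched prefix) and that the query string is ignored.

```python
from typing import Dict, List, Optional


def extract_endpoint(url: str, match_rules: List[Dict], default_keep: int = 2) -> str:
    """Pick the rule with the longest matching url_prefix, then keep N segments."""
    path = url.split("?", 1)[0]  # drop the query string

    best: Optional[Dict] = None
    for rule in match_rules:
        prefix = rule.get("url_prefix", "")
        if path.startswith(prefix):
            if best is None or len(prefix) > len(best.get("url_prefix", "")):
                best = rule

    keep = best["keep_segments"] if best is not None else default_keep
    segments = [segment for segment in path.split("/") if segment]
    return "/" + "/".join(segments[:keep])


# Illustrative rules: the default rule plus a hypothetical rule for /api/v1.
RULES = [
    {"url_prefix": "", "keep_segments": 2},
    {"url_prefix": "/api/v1", "keep_segments": 3},
]
```

With these rules, /a/b/c?query=xxx yields /a/b through the default rule, and /api/v1/users/42 yields /api/v1/users through the longer prefix.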
#3.2.4.2.2.1 URL Prefix
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.http_endpoint.match_rules.url_prefix
Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules.prefix
Default value:
processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - url_prefix: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
HTTP URL prefix.
#3.2.4.2.2.2 Keep Segments
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.http_endpoint.match_rules.keep_segments
Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules.keep-segments
Default value:
processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - keep_segments: 0
Schema:
Key | Value |
---|---|
Type | int |
Description:
Keep how many segments.
#3.2.4.3 Custom Fields
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.custom_fields
Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields
Default value:
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP: []
        HTTP2: []
Enum options:
Value | Note |
---|---|
HTTP | |
HTTP2 |
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Configuration to extract the customized header fields of HTTP, HTTP2, gRPC protocol etc.
Example:
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: "user-agent"
        - field_name: "cookie"
Attention: use HTTP2 for the gRPC protocol.
#3.2.4.3.1 $HTTP Custom Fields
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.custom_fields.HTTP
Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields.$protocol
Default value:
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP: []
Schema:
Key | Value |
---|---|
Type | dict |
Description:
Configuration to extract the customized header fields of HTTP, HTTP2, gRPC protocol etc.
Example:
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: "user-agent"
        - field_name: "cookie"
Attention: use HTTP2 for the gRPC protocol.
#3.2.4.3.1.1 Field Name
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.custom_fields.HTTP.field_name
Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields.$protocol.field-name
Default value:
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: ''
Schema:
Key | Value |
---|---|
Type | string |
Description:
Field name.
#3.2.4.4 Obfuscate Protocols
Tags:
agent_restart
FQCN:
processors.request_log.tag_extraction.obfuscate_protocols
Upgrade from old version: static_config.l7-protocol-advanced-features.obfuscate-enabled-protocols
Default value:
processors:
  request_log:
    tag_extraction:
      obfuscate_protocols:
      - Redis
Enum options:
Value | Note |
---|---|
MySQL | |
PostgreSQL | |
HTTP | |
HTTP2 | |
Redis |
Schema:
Key | Value |
---|---|
Type | string |
Description:
For data security, configure here the protocols whose data needs to be desensitized (obfuscated). Data of protocols that are not listed here is not processed.
#3.2.5 Tuning
#3.2.5.1 Payload Truncation
Tags:
hot_update
FQCN:
processors.request_log.tunning.payload_truncation
Upgrade from old version: l7_log_packet_size
Default value:
processors:
  request_log:
    tunning:
      payload_truncation: 1024
Schema:
Key | Value |
---|---|
Type | int |
Unit | byte |
Range | [256, 65535] |
Description:
The maximum data length used for application protocol identification, note that the effective value is less than or equal to the value of capture_packet_size.
NOTE: For eBPF data, the largest valid value is 16384.
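The interaction between payload_truncation, capture_packet_size and the eBPF limit can be sketched as below. This is a hedged illustration: the function name is hypothetical, and it assumes the capture_packet_size bound applies to packet capture data while only the 16384 cap applies to eBPF data.

```python
EBPF_MAX_PAYLOAD = 16384  # largest valid value for eBPF data, per the note above


def effective_payload_length(payload_truncation: int,
                             capture_packet_size: int,
                             from_ebpf: bool) -> int:
    """Return the payload length that would actually take effect."""
    if from_ebpf:
        # Assumption: eBPF data is bounded only by the eBPF-specific maximum.
        return min(payload_truncation, EBPF_MAX_PAYLOAD)
    # The effective value never exceeds capture_packet_size.
    return min(payload_truncation, capture_packet_size)
```

For example, setting payload_truncation to 4096 while capture_packet_size is 2048 would leave an effective length of 2048 for captured packets.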
#3.2.5.2 Session Aggregate Slot Capacity
Tags:
agent_restart
FQCN:
processors.request_log.tunning.session_aggregate_slot_capacity
Upgrade from old version: static_config.l7-log-session-slot-capacity
Default value:
processors:
  request_log:
    tunning:
      session_aggregate_slot_capacity: 1024
Schema:
Key | Value |
---|---|
Type | int |
Range | [1024, 1000000] |
Description:
By default, unidirectional l7_flow_log is aggregated into bidirectional request_log (session) with a caching time window of 2 minutes. During this period, every 5 seconds is treated as one time slot (i.e., an LRU). This configuration specifies the maximum number of unidirectional l7_flow_log entries that can be cached in each time slot.
If the number of l7_flow_log entries cached in a time slot exceeds this configuration, 10% of the data in that time slot will be evicted based on the LRU strategy to reduce memory consumption. Note that the evicted data will not be discarded; instead, they will be sent to the deepflow-server as unidirectional request_log.
The following metrics can be used as reference data for adjusting this configuration:
- Metric deepflow_system.deepflow_agent_l7_session_aggr.cached-request-resource: records the total memory, in bytes, occupied by the request_resource field of the unidirectional l7_flow_log cached in all time slots at the current moment.
- Metric deepflow_system.deepflow_agent_l7_session_aggr.over-limit: records the number of times eviction is triggered due to reaching the LRU capacity limit.
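The per-slot eviction behaviour described above can be sketched with a small LRU class. This is an illustrative model only: the class name, the use of OrderedDict, and evicting one tenth of the current entries (at least one) are assumptions about how "10% of the data" might be chosen.

```python
from collections import OrderedDict
from typing import Any, List, Tuple


class TimeSlot:
    """One 5-second slot caching unidirectional l7_flow_log entries."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[Any, Any]" = OrderedDict()
        self.over_limit = 0  # mirrors the over-limit counter described above

    def insert(self, key: Any, log: Any) -> List[Tuple[Any, Any]]:
        """Cache one entry; return entries evicted to be sent unidirectionally."""
        self.entries[key] = log
        self.entries.move_to_end(key)  # most recently used goes last
        evicted: List[Tuple[Any, Any]] = []
        if len(self.entries) > self.capacity:
            self.over_limit += 1
            for _ in range(max(1, len(self.entries) // 10)):
                # Pop from the least recently used end.
                evicted.append(self.entries.popitem(last=False))
        return evicted
```

With the default 2-minute window and 5-second slots there are 24 slots, so the default capacity of 1024 allows roughly 24 x 1024 cached entries before eviction starts.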
#3.2.5.3 Consistent Timestamp in L7 Metrics
Tags:
agent_restart
FQCN:
processors.request_log.tunning.consistent_timestamp_in_l7_metrics
Default value:
processors:
  request_log:
    tunning:
      consistent_timestamp_in_l7_metrics: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When this configuration is enabled, for the same session, response-related metrics (such as response count, latency, exceptions) are recorded in the time slot corresponding to when the request occurred, rather than the time slot of the response itself. This means that when calculating metrics for requests and responses within a session, a consistent timestamp based on the time of the request occurrence is used.
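The slot selection described above can be sketched as a single function. This is a simplified illustration: the function name and the slot width parameter are hypothetical, and timestamps are plain integer seconds.

```python
def response_metric_slot(request_ts: int, response_ts: int,
                         consistent_timestamp: bool, slot_seconds: int = 60) -> int:
    """Return the start of the time slot in which response metrics are recorded."""
    # With the option enabled, the request time decides the slot.
    ts = request_ts if consistent_timestamp else response_ts
    return ts - ts % slot_seconds
```

For a request at second 118 and a response at second 121 with 60-second slots, the response metrics land in the slot starting at 60 when the option is enabled, and in the slot starting at 120 otherwise.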
#3.3 Flow Log
#3.3.1 Time Window
#3.3.1.1 Maximum Tolerable Packet Delay
Tags:
agent_restart
FQCN:
processors.flow_log.time_window.max_tolerable_packet_delay
Upgrade from old version: static_config.packet-delay
Default value:
processors:
  flow_log:
    time_window:
      max_tolerable_packet_delay: 1s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '10s'] |
Description:
Extra tolerance for QuadrupleGenerator receiving 1s-FlowLog.
#3.3.1.2 Extra Tolerable Flow Delay
Tags:
agent_restart
FQCN:
processors.flow_log.time_window.extra_tolerable_flow_delay
Upgrade from old version: static_config.second-flow-extra-delay-second
Default value:
processors:
  flow_log:
    time_window:
      extra_tolerable_flow_delay: 0s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '10s'] |
Description:
Extra tolerance for QuadrupleGenerator receiving 1s-FlowLog.
#3.3.2 Conntrack (a.k.a. Flow Map)
#3.3.2.1 Flow Flush Interval
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.flow_flush_interval
Upgrade from old version: static_config.flow.flush-interval
Default value:
processors:
  flow_log:
    conntrack:
      flow_flush_interval: 1s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1m'] |
Description:
Flush interval of the queue connected to the collector.
#3.3.2.2 Flow Generation
#3.3.2.2.1 Server Ports
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.flow_generation.server_ports
Upgrade from old version: static_config.server-ports
Default value:
processors:
  flow_log:
    conntrack:
      flow_generation:
        server_ports: []
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
Service port list, priority lower than TCP SYN flags.
#3.3.2.2.2 Cloud Traffic Ignore MAC
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.flow_generation.cloud_traffic_ignore_mac
Upgrade from old version: static_config.flow.ignore-tor-mac
Default value:
processors:
  flow_log:
    conntrack:
      flow_generation:
        cloud_traffic_ignore_mac: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When the MAC addresses of the two-way traffic collected at the same location are asymmetrical, the traffic cannot be aggregated into a Flow. In that case, set this value to true. Only valid for Cloud (not IDC) traffic.
#3.3.2.2.3 Ignore L2End
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.flow_generation.ignore_l2_end
Upgrade from old version: static_config.flow.ignore-l2-end
Default value:
processors:
  flow_log:
    conntrack:
      flow_generation:
        ignore_l2_end: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
For Cloud traffic, only the MAC address corresponding to the side with L2End = true is matched when generating the flow. Set this value to true to force a double-sided MAC address match and only aggregate traffic with exactly equal MAC addresses.
#3.3.2.2.4 IDC Traffic Ignore VLAN
Tags:
agent_restart ee_feature
FQCN:
processors.flow_log.conntrack.flow_generation.idc_traffic_ignore_vlan
Upgrade from old version: static_config.flow.ignore-idc-vlan
Default value:
processors:
  flow_log:
    conntrack:
      flow_generation:
        idc_traffic_ignore_vlan: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When the VLANs of the two-way traffic collected at the same location are asymmetrical, the traffic cannot be aggregated into a Flow. In that case, set this value to true. Only valid for IDC (not Cloud) traffic.
#3.3.2.3 Timeouts
#3.3.2.3.1 Established
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.timeouts.established
Upgrade from old version: static_config.flow.established-timeout
Default value:
processors:
  flow_log:
    conntrack:
      timeouts:
        established: 300s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1d'] |
Description:
Timeouts for TCP State Machine - Established.
#3.3.2.3.2 Closing RST
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.timeouts.closing_rst
Upgrade from old version: static_config.flow.closing-rst-timeout
Default value:
processors:
  flow_log:
    conntrack:
      timeouts:
        closing_rst: 35s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1d'] |
Description:
Timeouts for TCP State Machine - Closing Reset.
#3.3.2.3.3 Opening RST
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.timeouts.opening_rst
Upgrade from old version: static_config.flow.opening-rst-timeout
Default value:
processors:
  flow_log:
    conntrack:
      timeouts:
        opening_rst: 1s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1d'] |
Description:
Timeouts for TCP State Machine - Opening Reset.
#3.3.2.3.4 Others
Tags:
agent_restart
FQCN:
processors.flow_log.conntrack.timeouts.others
Upgrade from old version: static_config.flow.others-timeout
Default value:
processors:
  flow_log:
    conntrack:
      timeouts:
        others: 5s
Schema:
Key | Value |
---|---|
Type | duration |
Range | ['1s', '1d'] |
Description:
Timeouts for TCP State Machine - Others.
#3.3.3 Tuning
#3.3.3.1 FlowMap Hash Slots
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.flow_map_hash_slots
Upgrade from old version: static_config.flow.flow-slots-size
Default value:
processors:
  flow_log:
    tunning:
      flow_map_hash_slots: 131072
Schema:
Key | Value |
---|---|
Type | int |
Range | [1024, 64000000] |
Description:
Since FlowAggregator is the first step in all processing, this value is also widely used in other hash tables such as QuadrupleGenerator, Collector, etc.
#3.3.3.2 Concurrent Flow Limit
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.concurrent_flow_limit
Upgrade from old version: static_config.flow.flow-count-limit
Default value:
processors:
  flow_log:
    tunning:
      concurrent_flow_limit: 65535
Schema:
Key | Value |
---|---|
Type | int |
Range | [1024, 64000000] |
Description:
Maximum number of flows that can be stored in FlowMap. It also affects the capacity of the RRT cache, for example rrt-cache-capacity = flow-count-limit. When rrt-cache-capacity is not enough, the rrt of l7 cannot be calculated.
#3.3.3.3 Memory Pool Size
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.memory_pool_size
Upgrade from old version: static_config.flow.memory-pool-size
Default value:
processors:
  flow_log:
    tunning:
      memory_pool_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [1024, 64000000] |
Description:
This value sets the maximum length of the memory pool in FlowMap. Memory pools are used for frequently created and destroyed objects such as FlowNode, FlowLog, etc.
#3.3.3.4 Maximum Size of Batched Buffer
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.max_batched_buffer_size
Upgrade from old version: static_config.batched-buffer-size-limit
Default value:
processors:
  flow_log:
    tunning:
      max_batched_buffer_size: 131072
Schema:
Key | Value |
---|---|
Type | int |
Range | [1024, 64000000] |
Description:
Only TaggedFlow allocation is affected at the moment. Structs are allocated in batches to minimize malloc calls. The total memory size of a batch will not exceed this limit. A value larger than 128K is not recommended because the default MMAP_THRESHOLD is 128K; allocating chunks larger than 128K results in calls to mmap and more page faults.
#3.3.3.5 FlowAggregator Queue Size
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.flow_aggregator_queue_size
Upgrade from old version: static_config.flow.flow-aggr-queue-size
Default value:
processors:
  flow_log:
    tunning:
      flow_aggregator_queue_size: 65535
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues:
- 2-second-flow-to-minute-aggrer
#3.3.3.6 FlowGenerator Queue Size
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.flow_generator_queue_size
Upgrade from old version: static_config.flow-queue-size
Default value:
processors:
  flow_log:
    tunning:
      flow_generator_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues:
- 1-tagged-flow-to-quadruple-generator
- 1-tagged-flow-to-app-protocol-logs
- 0-{flow_type}-{port}-packet-to-tagged-flow (flow_type: sflow, netflow)
#3.3.3.7 QuadrupleGenerator Queue Size
Tags:
agent_restart
FQCN:
processors.flow_log.tunning.quadruple_generator_queue_size
Upgrade from old version: static_config.quadruple-queue-size
Default value:
processors:
  flow_log:
    tunning:
      quadruple_generator_queue_size: 262144
Schema:
Key | Value |
---|---|
Type | int |
Range | [262144, 64000000] |
Description:
The length of the following queues:
- 2-flow-with-meter-to-second-collector
- 2-flow-with-meter-to-minute-collector
#4. Outputs
#4.1 Socket
#4.1.1 Data Socket Type
Tags:
hot_update
FQCN:
outputs.socket.data_socket_type
Upgrade from old version: collector_socket_type
Default value:
outputs:
  socket:
    data_socket_type: TCP
Enum options:
Value | Note |
---|---|
TCP | |
UDP | |
FILE |
Schema:
Key | Value |
---|---|
Type | string |
Description:
It can only be set to FILE in standalone mode, in which case l4_flow_log and l7_flow_log will be written to local files.
#4.1.2 PCAP Socket Type
Tags:
hot_update
FQCN:
outputs.socket.pcap_socket_type
Upgrade from old version: compressor_socket_type
Default value:
outputs:
  socket:
    pcap_socket_type: TCP
Enum options:
Value | Note |
---|---|
TCP | |
UDP | |
RAW_UDP |
Schema:
Key | Value |
---|---|
Type | string |
Description:
RAW_UDP uses RawSocket to send UDP packets, which has the highest performance, but there may be compatibility issues in some environments.
#4.1.3 NPB Socket Type
Tags:
hot_update
ee_feature
FQCN:
outputs.socket.npb_socket_type
Upgrade from old version: npb_socket_type
Default value:
outputs:
  socket:
    npb_socket_type: RAW_UDP
Enum options:
Value | Note |
---|---|
UDP | |
RAW_UDP | |
TCP | |
ZMQ |
Schema:
Key | Value |
---|---|
Type | string |
Description:
RAW_UDP uses RawSocket to send UDP packets, which has the highest performance, but there may be compatibility issues in some environments.
#4.1.4 RAW_UDP QoS Bypass
Tags:
agent_restart
FQCN:
outputs.socket.raw_udp_qos_bypass
Upgrade from old version: static_config.enable-qos-bypass
Default value:
outputs:
  socket:
    raw_udp_qos_bypass: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When sender uses RAW_UDP to send data, this feature can be enabled to improve performance. Linux Kernel >= 3.14 is required. Note that the data sent when this feature is enabled cannot be captured by tcpdump.
#4.1.5 Multiple Sockets To Ingester
Tags:
hot_update
FQCN:
outputs.socket.multiple_sockets_to_ingester
Upgrade from old version: static_config.multiple-sockets-to-ingester
Default value:
outputs:
  socket:
    multiple_sockets_to_ingester: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When set to true, deepflow-agent will send data with multiple sockets to Ingester, which has higher performance, but will bring more impact to the firewall.
#4.2 Flow Log and Request Log
#4.2.1 Filters
#4.2.1.1 Capture Network Types for L4
Tags:
hot_update
FQCN:
outputs.flow_log.filters.l4_capture_network_types
Upgrade from old version: l4_log_tap_types
Default value:
outputs:
  flow_log:
    filters:
      l4_capture_network_types:
      - 0
Enum options:
Value | Note |
---|---|
-1 | Disabled |
0 | All TAPs |
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | int |
Description:
The list of TAPs from which l4_flow_log is collected. Besides the special values listed above, a list of specific TAPs can also be set.
#4.2.1.2 Capture Network Types for L7
Tags:
hot_update
FQCN:
outputs.flow_log.filters.l7_capture_network_types
Upgrade from old version: l7_log_store_tap_types
Default value:
outputs:
  flow_log:
    filters:
      l7_capture_network_types: []
Enum options:
Value | Note |
---|---|
-1 | Disabled |
0 | All TAPs |
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | int |
Description:
The list of TAPs from which l7_flow_log is collected. Besides the special values listed above, a list of specific TAPs can also be set.
#4.2.1.3 Ignored Observation Points for L4
Tags:
hot_update
FQCN:
outputs.flow_log.filters.l4_ignored_observation_points
Upgrade from old version: l4_log_ignore_tap_sides
Default value:
outputs:
  flow_log:
    filters:
      l4_ignored_observation_points: []
Enum options:
Value | Note |
---|---|
0 | rest, Other NIC |
1 | c, Client NIC |
2 | s, Server NIC |
4 | local, Local NIC |
9 | c-nd, Client K8s Node |
10 | s-nd, Server K8s Node |
17 | c-hv, Client VM Hypervisor |
18 | s-hv, Server VM Hypervisor |
25 | c-gw-hv, Client-side Gateway Hypervisor |
26 | s-gw-hv, Server-side Gateway Hypervisor |
33 | c-gw, Client-side Gateway |
34 | s-gw, Server-side Gateway |
41 | c-p, Client Process |
42 | s-p, Server Process |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Use the value of tap_side to control which l4_flow_log should be ignored for collection. This configuration also applies to tcp_sequence and pcap data in the Enterprise Edition. Default value [] means store everything.
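The filtering by tap_side can be sketched as follows. This is an illustration only: the function name and the dictionary form of a flow log are hypothetical, while the numeric values come from the enum table above (1 = c, 2 = s, 41 = c-p).

```python
from typing import Dict, Iterable, List


def keep_flow_logs(logs: List[Dict], ignored_points: Iterable[int]) -> List[Dict]:
    """Drop logs whose tap_side is listed in the ignored observation points."""
    ignored = set(ignored_points)
    # An empty list ignores nothing, so every log is kept.
    return [log for log in logs if log["tap_side"] not in ignored]
```

For example, configuring [1, 41] would drop logs observed at the client NIC and the client process while keeping logs from the server NIC.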
#4.2.1.4 Ignored Observation Points for L7
Tags:
hot_update
FQCN:
outputs.flow_log.filters.l7_ignored_observation_points
Upgrade from old version: l7_log_ignore_tap_sides
Default value:
outputs:
  flow_log:
    filters:
      l7_ignored_observation_points: []
Enum options:
Value | Note |
---|---|
0 | rest, Other NIC |
1 | c, Client NIC |
2 | s, Server NIC |
4 | local, Local NIC |
9 | c-nd, Client K8s Node |
10 | s-nd, Server K8s Node |
17 | c-hv, Client VM Hypervisor |
18 | s-hv, Server VM Hypervisor |
25 | c-gw-hv, Client-side Gateway Hypervisor |
26 | s-gw-hv, Server-side Gateway Hypervisor |
33 | c-gw, Client-side Gateway |
34 | s-gw, Server-side Gateway |
41 | c-p, Client Process |
42 | s-p, Server Process |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Use the value of tap_side to control which l7_flow_log should be ignored for collection.
#4.2.2 Throttles
#4.2.2.1 L4 Throttle
Tags:
hot_update
FQCN:
outputs.flow_log.throttles.l4_throttle
Upgrade from old version: l4_log_collect_nps_threshold
Default value:
outputs:
  flow_log:
    throttles:
      l4_throttle: 10000
Schema:
Key | Value |
---|---|
Type | int |
Unit | Per Second |
Range | [100, 1000000] |
Description:
The maximum number of rows of l4_flow_log sent per second. When the actual number of rows exceeds this value, sampling is triggered.
#4.2.2.2 L7 Throttle
Tags:
hot_update
FQCN:
outputs.flow_log.throttles.l7_throttle
Upgrade from old version: l7_log_collect_nps_threshold
Default value:
outputs:
  flow_log:
    throttles:
      l7_throttle: 10000
Schema:
Key | Value |
---|---|
Type | int |
Unit | Per Second |
Range | [100, 1000000] |
Description:
The maximum number of rows of l7_flow_log sent per second. When the actual number of rows exceeds this value, sampling is triggered.
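The text above only states that sampling is triggered once the throttle is exceeded. One possible way to sample at most a fixed number of rows per second is reservoir sampling, sketched below; treating this as the technique in use is an assumption, and the function name and parameters are hypothetical.

```python
import random
from typing import Any, List, Sequence


def throttle_one_second(rows: Sequence[Any], limit: int, rng: random.Random) -> List[Any]:
    """Reservoir-sample at most `limit` rows from the rows of one second."""
    reservoir: List[Any] = []
    for index, row in enumerate(rows):
        if index < limit:
            reservoir.append(row)  # below the throttle: keep everything
        else:
            # Replace a kept row with probability limit / (index + 1).
            slot = rng.randint(0, index)
            if slot < limit:
                reservoir[slot] = row
    return reservoir
```

When fewer rows than the limit arrive in a second, all of them are kept unchanged; otherwise each row has an equal chance of being among the rows that are sent.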
#4.2.3 Tuning
#4.2.3.1 Collector Queue Size
Tags:
agent_restart
FQCN:
outputs.flow_log.tunning.collector_queue_size
Upgrade from old version: static_config.flow-sender-queue-size
Default value:
outputs:
  flow_log:
    tunning:
      collector_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues:
- 3-flow-to-collector-sender
- 3-protolog-to-collector-sender
#4.2.3.2 Collector Queue Count
Tags:
agent_restart
FQCN:
outputs.flow_log.tunning.collector_queue_count
Upgrade from old version: static_config.flow-sender-queue-count
Default value:
outputs:
  flow_log:
    tunning:
      collector_queue_count: 1
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 64] |
Description:
The number of replicas for each output queue of the FlowAggregator/SessionAggregator.
#4.3 Flow Metrics
#4.3.1 Enabled
Tags:
hot_update
FQCN:
outputs.flow_metrics.enabled
Upgrade from old version: collector_enabled
Default value:
outputs:
  flow_metrics:
    enabled: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When disabled, deepflow-agent will not send metrics and logging data collected using eBPF and cBPF.
Attention: setting this to false also disables l4_flow_log and l7_flow_log.
#4.3.2 Filters
#4.3.2.1 Inactive Server Port Aggregation
Tags:
hot_update
FQCN:
outputs.flow_metrics.filters.inactive_server_port_aggregation
Upgrade from old version: inactive_server_port_enabled
Default value:
outputs:
  flow_metrics:
    filters:
      inactive_server_port_aggregation: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When enabled, deepflow-agent will not generate detailed metrics for each inactive port (ports that only receive data, not send data), and the data of all inactive ports will be aggregated into the metrics with a tag 'server_port = 0'.
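The aggregation of inactive ports into server_port = 0 can be sketched as below. This is a simplified model, not the agent's code: the function name and the record fields (server_port, rx_bytes, tx_bytes) are hypothetical, and a port is treated as inactive when it has sent no bytes.

```python
from collections import defaultdict
from typing import Dict, List


def aggregate_by_server_port(records: List[Dict[str, int]],
                             aggregate_inactive: bool) -> Dict[int, int]:
    """Sum received bytes per server port, folding inactive ports into port 0."""
    totals: Dict[int, int] = defaultdict(int)
    for record in records:
        port = record["server_port"]
        # Inactive: the port only receives data and never sends any.
        if aggregate_inactive and record["tx_bytes"] == 0:
            port = 0
        totals[port] += record["rx_bytes"]
    return dict(totals)
```

With the option enabled, two ports that only receive data would be merged into one series tagged server_port = 0, which reduces the number of distinct metric series.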
#4.3.2.2 Inactive IP Aggregation
Tags:
hot_update
FQCN:
outputs.flow_metrics.filters.inactive_ip_aggregation
Upgrade from old version: inactive_ip_enabled
Default value:
outputs:
  flow_metrics:
    filters:
      inactive_ip_aggregation: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When enabled, deepflow-agent will not generate detailed metrics for each inactive IP address (IP addresses that only receive data, not send data), and the data of all inactive IP addresses will be aggregated into the metrics with a tag 'ip = 0'.
#4.3.2.3 NPM Metrics
Tags:
hot_update
FQCN:
outputs.flow_metrics.filters.npm_metrics
Upgrade from old version: l4_performance_enabled
Default value:
outputs:
  flow_metrics:
    filters:
      npm_metrics: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When disabled, deepflow-agent only collects some basic throughput metrics.
#4.3.2.4 APM Metrics
Tags:
hot_update
FQCN:
outputs.flow_metrics.filters.apm_metrics
Upgrade from old version: l7_metrics_enabled
Default value:
outputs:
  flow_metrics:
    filters:
      apm_metrics: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
When disabled, deepflow-agent will not collect RED (request/error/delay) metrics.
#4.3.2.5 Second Metrics
Tags:
hot_update
FQCN:
outputs.flow_metrics.filters.second_metrics
Upgrade from old version: vtap_flow_1s_enabled
Default value:
outputs:
  flow_metrics:
    filters:
      second_metrics: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to output metrics at one-second granularity.
#4.3.3 Tuning
#4.3.3.1 Sender Queue Size
Tags:
agent_restart
FQCN:
outputs.flow_metrics.tunning.sender_queue_size
Upgrade from old version: static_config.collector-sender-queue-size
Default value:
outputs:
  flow_metrics:
    tunning:
      sender_queue_size: 65536
Schema:
Key | Value |
---|---|
Type | int |
Range | [65536, 64000000] |
Description:
The length of the following queues:
- 2-doc-to-collector-sender
#4.3.3.2 Sender Queue Count
Tags:
agent_restart
FQCN:
outputs.flow_metrics.tunning.sender_queue_count
Upgrade from old version: static_config.collector-sender-queue-count
Default value:
outputs:
  flow_metrics:
    tunning:
      sender_queue_count: 1
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 64] |
Description:
The number of replicas for each output queue of the collector.
#4.4 NPB (Network Packet Broker)
#4.4.1 Maximum MTU
Tags:
hot_update
ee_feature
FQCN:
outputs.npb.max_mtu
Upgrade from old version: mtu
Default value:
outputs:
  npb:
    max_mtu: 1500
Schema:
Key | Value |
---|---|
Type | int |
Unit | byte |
Range | [500, 10000] |
Description:
Maximum MTU allowed when using UDP to transfer data.
Attention: Public cloud service providers may modify the content of the tail of the UDP packet whose packet length is close to 1500 bytes. When using UDP transmission, it is recommended to set a slightly smaller value.
#4.4.2 RAW_UDP VLAN Tag
Tags:
hot_update
ee_feature
FQCN:
outputs.npb.raw_udp_vlan_tag
Upgrade from old version: output_vlan
Default value:
outputs:
  npb:
    raw_udp_vlan_tag: 0
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 4095] |
Description:
When a RAW_UDP socket is used to transmit UDP data, this value sets the VLAN tag. The default value 0 means no VLAN tag.
#4.4.3 Extra VLAN Header
Tags:
hot_update
ee_feature
FQCN:
outputs.npb.extra_vlan_header
Upgrade from old version: npb_vlan_mode
Default value:
outputs:
  npb:
    extra_vlan_header: 0
Enum options:
Value | Note |
---|---|
0 | None |
1 | 802.1Q |
2 | QinQ |
Schema:
Key | Value |
---|---|
Type | int |
Description:
Whether to add an extra 802.1Q header to NPB traffic. When this value is set, deepflow-agent inserts a VLAN tag into the NPB traffic header; the tag value is the lower 12 bits of the TunnelID in the VXLAN header.
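For example, selecting the 802.1Q option from the enum table above could look like the following sketch. The comments restate the documented enum values.

```yaml
outputs:
  npb:
    # 0 = None (default), 1 = 802.1Q, 2 = QinQ
    extra_vlan_header: 1
```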
#4.4.4 Traffic Global Dedup
Tags:
hot_update
ee_feature
FQCN:
outputs.npb.traffic_global_dedup
Upgrade from old version: npb_dedup_enabled
Default value:
outputs:
  npb:
    traffic_global_dedup: true
Schema:
Key | Value |
---|---|
Type | bool |
Description:
Whether to enable global (distributed) traffic deduplication for the NPB feature.
#4.4.5 Target Port
Tags:
agent_restart
ee_feature
FQCN:
outputs.npb.target_port
Upgrade from old version: static_config.npb-port
Default value:
outputs:
  npb:
    target_port: 4789
Schema:
Key | Value |
---|---|
Type | int |
Range | [1, 65535] |
Description:
Server port for NPB.
#4.4.6 Custom VXLAN Flags
Tags:
agent_restart
ee_feature
FQCN:
outputs.npb.custom_vxlan_flags
Upgrade from old version: static_config.vxlan-flags
Default value:
outputs:
  npb:
    custom_vxlan_flags: 255
Schema:
Key | Value |
---|---|
Type | int |
Range | [0, 255] |
Description:
NPB uses the first byte of the VXLAN flags to mark the traffic it sends, which prevents traffic sent by NPB from being collected again by deepflow-agent.
Attention: To ensure that the VNI bit is set, the value configured here is OR-ed with 0b1000_0000 before use (value |= 0b1000_0000). Therefore, this value cannot be directly configured as 0b1000_0000.
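The bitwise OR described above can be worked through on a concrete value. The configured value 1 is an assumption picked for illustration; the comments show the arithmetic only and do not describe anything beyond the rule stated in this section.

```yaml
outputs:
  npb:
    # Configured value: 1            = 0b0000_0001
    # Value used:       1 | 128      = 0b1000_0001 (129)
    # The default 255 (0b1111_1111) is unchanged by the OR.
    # 128 (0b1000_0000) itself must not be configured.
    custom_vxlan_flags: 1
```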
#4.4.7 Overlay VLAN Header Trimming
Tags:
agent_restart
ee_feature
FQCN:
outputs.npb.overlay_vlan_header_trimming
Upgrade from old version: static_config.ignore-overlay-vlan
Default value:
outputs:
  npb:
    overlay_vlan_header_trimming: false
Schema:
Key | Value |
---|---|
Type | bool |
Description:
This setting only ignores the VLAN header in the captured original packet. It does not affect the configuration item npb_vlan_mode (outputs.npb.extra_vlan_header in the current version).
#4.4.8 Maximum Tx Throughput
Tags:
hot_update
ee_feature
FQCN:
outputs.npb.max_tx_throughput
Upgrade from old version: max_npb_bps
Default value:
outputs:
  npb:
    max_tx_throughput: 1000
Schema:
Key | Value |
---|---|
Type | int |
Unit | Mbps |
Range | [1, 100000] |
Description:
Maximum traffic rate allowed for the NPB sender.
#5. Plugins
#5.1 Wasm Plugins
Tags:
hot_update
FQCN:
plugins.wasm_plugins
Upgrade from old version: wasm_plugins
Default value:
plugins:
  wasm_plugins: []
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | string |
Description:
Wasm plugins to be loaded by the agent.
#5.2 SO Plugins
Tags:
hot_update
FQCN:
plugins.so_plugins
Upgrade from old version: so_plugins
Default value:
plugins:
  so_plugins: []
Enum options:
Value | Note |
---|---|
DYNAMIC_OPTIONS | DYNAMIC_OPTIONS |
Schema:
Key | Value |
---|---|
Type | string |
Description:
SO plugins to be loaded by the agent. Each plugin is opened with dlopen using the RTLD_LOCAL and RTLD_LAZY flags, which means the .so file must resolve its own link dependencies.
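A sketch of what a populated plugin list might look like, assuming each entry is a plugin name given as a string. The names my-wasm-plugin and my-so-plugin are hypothetical placeholders; the valid values are the dynamic options available in a given deployment.

```yaml
plugins:
  # Hypothetical plugin names, for illustration only.
  wasm_plugins:
    - my-wasm-plugin
  so_plugins:
    - my-so-plugin
```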
#6. Dev
#6.1 Feature Flags
Tags:
agent_restart
FQCN:
dev.feature_flags
Upgrade from old version: static_config.feature-flags
Default value:
dev:
  feature_flags: []
Schema:
Key | Value |
---|---|
Type | string |
Description:
Unreleased deepflow-agent features can be turned on by setting this switch.