Configuration

Agent

#1. Global

#1.1 Limits

Resource limitations

#1.1.1 CPU Limit

Tags:

hot_update

FQCN:

global.limits.max_millicpus

Upgrade from old version: max_millicpus

Default value:

global:
  limits:
    max_millicpus: 1000

Schema:

Key	Value
Type	int
Unit	Logical Milli Cores
Range	[1, 100000]

Description:

deepflow-agent uses cgroups to limit CPU usage. 1 millicpu = 1 millicore = 0.001 core.
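
As a rough illustration of the unit conversion, the fragment below caps the agent at two logical cores. The value 2000 is an arbitrary example, not a recommendation.

```yaml
global:
  limits:
    # 2000 millicpus * 0.001 core per millicpu = 2 logical cores (example value)
    max_millicpus: 2000
```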

#1.1.2 Memory Limit

Tags:

hot_update

FQCN:

global.limits.max_memory

Upgrade from old version: max_memory

Default value:

global:
  limits:
    max_memory: 768

Schema:

Key	Value
Type	int
Unit	MiB
Range	[128, 100000]

Description:

deepflow-agent uses cgroups to limit memory usage.

#1.1.3 Maximum Log Backhaul Rate

Tags:

hot_update

FQCN:

global.limits.max_log_backhaul_rate

Upgrade from old version: log_threshold

Default value:

global:
  limits:
    max_log_backhaul_rate: 36000

Schema:

Key	Value
Type	int
Unit	Lines/Hour
Range	[0, 1000000]

Description:

The maximum rate at which deepflow-agent sends its own logs to deepflow-server. Setting it to 0 means no limit.

#1.1.4 Maximum Local Log File Size

Tags:

hot_update

FQCN:

global.limits.max_local_log_file_size

Upgrade from old version: log_file_size

Default value:

global:
  limits:
    max_local_log_file_size: 1000

Schema:

Key	Value
Type	int
Unit	MiB
Range	[10, 10000]

Description:

The maximum disk space allowed for deepflow-agent log files.

#1.1.5 Local Log Retention

Tags:

hot_update

FQCN:

global.limits.local_log_retention

Upgrade from old version: log_retention

Default value:

global:
  limits:
    local_log_retention: 300d

Schema:

Key	Value
Type	duration
Range	['10d', '10000d']

Description:

The retention time for deepflow-agent log files.

#1.1.6 Maximum Socket Count

Tags:

hot_update

FQCN:

global.limits.max_sockets

Upgrade from old version: static_config.max-sockets

Default value:

global:
  limits:
    max_sockets: 1024

Schema:

Key	Value
Type	int
Unit	count
Range	[16, 4096]

Description:

The maximum number of sockets that the agent can open. The agent restarts if the socket count exceeds this value.

#1.1.7 Maximum Socket Count Tolerate Interval

Tags:

hot_update

FQCN:

global.limits.max_sockets_tolerate_interval

Upgrade from old version: static_config.max-sockets-tolerate-interval

Default value:

global:
  limits:
    max_sockets_tolerate_interval: 60s

Schema:

Key	Value
Type	duration
Range	['0s', '3600s']

Description:

How long the socket count may exceed max_sockets before the agent restarts. The agent restarts only if the socket count stays above max_sockets for this entire duration. Restarts are triggered by the guard module, so setting this value lower than the guard interval (global.tunning.resource_monitoring_interval, formerly guard-interval) causes the agent to restart immediately.
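
A minimal sketch of how the socket limit and its tolerance interval might be combined, assuming the default 10s resource monitoring interval. The values 2048 and 120s are illustrative only.

```yaml
global:
  limits:
    max_sockets: 2048                    # example value within [16, 4096]
    max_sockets_tolerate_interval: 120s  # longer than the 10s monitoring interval below
  tunning:
    resource_monitoring_interval: 10s    # default; the guard module checks at this interval
```

With these values, the agent would restart only after the socket count has stayed above 2048 for 120 seconds.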

#1.2 Alerts

#1.2.1 Thread Limit

Tags:

hot_update

FQCN:

global.alerts.thread_threshold

Upgrade from old version: thread_threshold

Default value:

global:
  alerts:
    thread_threshold: 500

Schema:

Key	Value
Type	int
Range	[1, 1000]

Description:

The maximum number of threads deepflow-agent is allowed to create.

When the number of threads exceeds this limit, an exception alert will be triggered.
When the number of threads exceeds twice this limit value, a deepflow-agent restart will be triggered.
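
To make the two levels concrete, the fragment below uses an example threshold of 300 and spells out the resulting alert and restart points in comments.

```yaml
global:
  alerts:
    # Example value. Following the rules above:
    #   more than 300 threads           -> exception alert
    #   more than 600 threads (2 * 300) -> agent restart
    thread_threshold: 300
```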

#1.2.2 Process Limit

Tags:

hot_update

FQCN:

global.alerts.process_threshold

Upgrade from old version: process_threshold

Default value:

global:
  alerts:
    process_threshold: 10

Schema:

Key	Value
Type	int
Range	[1, 100]

Description:

The maximum number of processes named deepflow-agent that are allowed to run. If the number of processes named deepflow-agent in the current system reaches this limit, subsequent processes named deepflow-agent will fail to start.

#1.2.3 Core File Checker

Tags:

agent_restart deprecated

FQCN:

global.alerts.check_core_file_disabled

Upgrade from old version: static_config.check-core-file-disabled

Default value:

global:
  alerts:
    check_core_file_disabled: false

Schema:

Key	Value
Type	bool

Description:

When the host has an invalid NFS file system or Docker is running, the program sometimes hangs while checking for core files. This switch disables the core file check to prevent the process from hanging.

#1.3 Circuit Breakers

Control deepflow-agent to stop running or stop some functions under certain environmental conditions.

#1.3.1 System Free Memory Percentage

Calculation Method: (free_memory / total_memory) * 100%

#1.3.1.1 Trigger Threshold

Tags:

hot_update

FQCN:

global.circuit_breakers.sys_memory_percentage.trigger_threshold

Upgrade from old version: sys_free_memory_limit

Default value:

global:
  circuit_breakers:
    sys_memory_percentage:
      trigger_threshold: 0

Schema:

Key	Value
Type	int
Unit	%
Range	[0, 100]

Description:

Setting it to 0 disables the system memory ratio check. The observed memory ratio is determined by global.circuit_breakers.sys_memory_percentage.metric.

When the observed memory ratio is below trigger_threshold * 70%, the agent automatically restarts.
When the observed memory ratio is below trigger_threshold but above trigger_threshold * 70%, the agent enters the FREE_MEM_EXCEEDED abnormal state and reports an alarm.
When the observed memory ratio remains above trigger_threshold * 110%, the agent recovers from the abnormal state.
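
The thresholds can be worked through with a concrete number. This is a sketch with an example trigger_threshold of 20, and it assumes the alarm band is bounded below by trigger_threshold * 70%.

```yaml
global:
  circuit_breakers:
    sys_memory_percentage:
      metric: available      # observe "available" instead of the default "free"
      trigger_threshold: 20  # example value, in percent
# With trigger_threshold = 20, the documented rules work out to:
#   restart:  observed ratio < 20 * 70%  = 14%
#   alarm:    14% < observed ratio < 20%   (FREE_MEM_EXCEEDED)
#   recovery: observed ratio > 20 * 110% = 22%
```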

#1.3.1.2 Metric

Tags:

hot_update

FQCN:

global.circuit_breakers.sys_memory_percentage.metric

Upgrade from old version: sys_free_memory_metric

Default value:

global:
  circuit_breakers:
    sys_memory_percentage:
      metric: free

Enum options:

Value	Note
free
available

Schema:

Key	Value
Type	string

Description:

The memory metric whose percentage deepflow-agent observes.

#1.3.2 Relative System Load

Calculation Method: system_load / total_cpu_cores

#1.3.2.1 Trigger Threshold

Tags:

hot_update

FQCN:

global.circuit_breakers.relative_sys_load.trigger_threshold

Upgrade from old version: system_load_circuit_breaker_threshold

Default value:

global:
  circuit_breakers:
    relative_sys_load:
      trigger_threshold: 1.0

Schema:

Key	Value
Type	float
Range	[0, 10]

Description:

When Linux system load divided by the number of CPU cores exceeds this value, the agent automatically enters the disabled state. Setting it or recovery_threshold to 0 disables this feature.

#1.3.2.2 Recovery Threshold

Tags:

hot_update

FQCN:

global.circuit_breakers.relative_sys_load.recovery_threshold

Upgrade from old version: system_load_circuit_breaker_recover

Default value:

global:
  circuit_breakers:
    relative_sys_load:
      recovery_threshold: 0.9

Schema:

Key	Value
Type	float
Range	[0, 10]

Description:

After deepflow-agent enters the disabled state, it recovers from the circuit breaker disabled state once the Linux system load divided by the number of CPU cores has stayed continuously below this value for 5 minutes. Setting it or trigger_threshold to 0 disables this feature.

#1.3.2.3 Metric

Tags:

hot_update

FQCN:

global.circuit_breakers.relative_sys_load.metric

Upgrade from old version: system_load_circuit_breaker_metric

Default value:

global:
  circuit_breakers:
    relative_sys_load:
      metric: load15

Enum options:

Value	Note
load1
load5
load15

Schema:

Key	Value
Type	string

Description:

The system load circuit breaker mechanism uses this metric, and the agent will check this metric every 10 seconds by default.
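
As a sketch of how the three relative system load settings interact, the fragment below uses example thresholds and works them through for a hypothetical host with 8 CPU cores.

```yaml
global:
  circuit_breakers:
    relative_sys_load:
      metric: load5            # use the 5-minute load average instead of load15
      trigger_threshold: 1.5   # example value
      recovery_threshold: 1.0  # example value, kept below trigger_threshold
# On a hypothetical host with 8 CPU cores:
#   disabled state when load5 / 8 > 1.5, i.e. load5 > 12
#   recovery when load5 / 8 stays below 1.0 (load5 < 8) for 5 minutes
```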

#1.3.3 Tx Throughput

#1.3.3.1 Trigger Threshold

Tags:

hot_update ee_feature

FQCN:

global.circuit_breakers.tx_throughput.trigger_threshold

Upgrade from old version: max_tx_bandwidth

Default value:

global:
  circuit_breakers:
    tx_throughput:
      trigger_threshold: 0

Schema:

Key	Value
Type	int
Unit	Mbps
Range	[0, 100000]

Description:

When the outbound throughput of the NPB interface reaches or exceeds this threshold, the broker is stopped. The broker resumes once the throughput stays below (trigger_threshold - outputs.npb.max_tx_throughput) * 90% for 5 consecutive monitoring intervals.

Attention: This value must be greater than outputs.npb.max_tx_throughput. Setting it to 0 disables this feature.
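
The resume level is easiest to see with numbers. The fragment below is a sketch with example values; the nesting of outputs.npb.max_tx_throughput is inferred from its dotted name and should be treated as an assumption.

```yaml
global:
  circuit_breakers:
    tx_throughput:
      trigger_threshold: 1500              # Mbps, must exceed outputs.npb.max_tx_throughput
      throughput_monitoring_interval: 10s  # default
outputs:
  npb:
    max_tx_throughput: 1000                # Mbps, example value
# Resume level: (1500 - 1000) * 90% = 450 Mbps
```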

#1.3.3.2 Throughput Monitoring Interval

Tags:

hot_update ee_feature

FQCN:

global.circuit_breakers.tx_throughput.throughput_monitoring_interval

Upgrade from old version: bandwidth_probe_interval

Default value:

global:
  circuit_breakers:
    tx_throughput:
      throughput_monitoring_interval: 10s

Schema:

Key	Value
Type	duration
Range	['1s', '60s']

Description:

Monitoring interval for outbound traffic rate of NPB interface.

#1.3.4 Free Disk

#1.3.4.1 Percentage Trigger Threshold

Tags:

hot_update

FQCN:

global.circuit_breakers.free_disk.percentage_trigger_threshold

Default value:

global:
  circuit_breakers:
    free_disk:
      percentage_trigger_threshold: 15

Schema:

Key	Value
Type	int
Unit	%
Range	[0, 100]

Description:

This configuration is only valid when the agent runs in a non-container environment. Setting it to 0 disables the threshold. The observed disks are those on which the directories listed in global.circuit_breakers.free_disk.directories are located.

When the system free disk ratio is lower than this threshold, the agent enters the circuit breaker disabled state, sets the FREE_DISK_CIRCUIT_BREAKER abnormal state, and reports an agent abnormal alarm.
When the system free disk ratio is higher than this threshold * 110%, the agent recovers from the abnormal state.

#1.3.4.2 Absolute Trigger Threshold

Tags:

hot_update

FQCN:

global.circuit_breakers.free_disk.absolute_trigger_threshold

Default value:

global:
  circuit_breakers:
    free_disk:
      absolute_trigger_threshold: 10

Schema:

Key	Value
Type	int
Unit	GB
Range	[0, 100000]

Description:

When the system free disk size is lower than this threshold, the agent enters the circuit breaker disabled state, sets the FREE_DISK_CIRCUIT_BREAKER abnormal state, and reports an agent abnormal alarm.
When the system free disk size is higher than this threshold * 110%, the agent recovers from the abnormal state.

#1.3.4.3 Directories

Tags:

hot_update

FQCN:

global.circuit_breakers.free_disk.directories

Default value:

global:
  circuit_breakers:
    free_disk:
      directories:
      - /

Schema:

Key	Value
Type	string

Description:

The agent observes the free space of the disks on which these directories are located. On Windows, the default value is c:\.
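
A minimal sketch that combines the three free disk settings with example values. The second directory is a hypothetical addition, and the source does not state how the percentage and absolute thresholds combine, so each comment only restates its own rule.

```yaml
global:
  circuit_breakers:
    free_disk:
      percentage_trigger_threshold: 10  # example: trip below 10% free, recover above 11%
      absolute_trigger_threshold: 20    # example: trip below 20 GB free, recover above 22 GB
      directories:
      - /
      - /var/log/deepflow-agent/        # hypothetical extra directory to watch
```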

#1.4 Tuning

Tune the runtime of deepflow-agent.

#1.4.1 CPU Affinity

Tags:

agent_restart

FQCN:

global.tunning.cpu_affinity

Upgrade from old version: static_config.cpu-affinity

Default value:

global:
  tunning:
    cpu_affinity: []

Schema:

Key	Value
Type	int
Range	[0, 65536]

Description:

CPU affinity is the tendency of a process to run on a given CPU for as long as possible without being migrated to other processors. Invalid IDs are ignored. This currently only works for dispatcher threads. Example:

global:
  tunning:
    cpu_affinity: [1, 3, 5, 7, 9]

#1.4.2 Process Scheduling Priority

Tags:

agent_restart

FQCN:

global.tunning.process_scheduling_priority

Upgrade from old version: static_config.process-scheduling-priority

Default value:

global:
  tunning:
    process_scheduling_priority: 0

Schema:

Key	Value
Type	int
Range	[-20, 19]

Description:

The smaller the value of process scheduling priority, the higher the priority of the deepflow-agent process, and the larger the value, the lower the priority.

#1.4.3 Idle Memory Trimming

Tags:

agent_restart

FQCN:

global.tunning.idle_memory_trimming

Upgrade from old version: static_config.memory-trim-disabled

Default value:

global:
  tunning:
    idle_memory_trimming: true

Schema:

Key	Value
Type	bool

Description:

Proactive memory trimming can effectively reduce memory usage, but there may be performance loss.

#1.4.4 Turn off swap memory

Tags:

agent_restart

FQCN:

global.tunning.swap_disabled

Default value:

global:
  tunning:
    swap_disabled: false

Schema:

Key	Value
Type	bool

Description:

Disabling swap memory requires root and the CAP_IPC_LOCK capability. It may improve performance and reduce CPU usage, but memory usage will increase.

#1.4.5 Page Cache Reclaim Percentage

Tags:

hot_update

FQCN:

global.tunning.page_cache_reclaim_percentage

Upgrade from old version: static_config.page-cache-reclaim-percentage

Default value:

global:
  tunning:
    page_cache_reclaim_percentage: 100

Schema:

Key	Value
Type	int
Range	[0, 100]

Description:

A page cache reclaim is triggered when the percentage of page cache relative to the cgroup's memory.limit_in_bytes exceeds this value. Both anonymous memory and file page cache are accounted for in the cgroup's memory usage. Under some circumstances, page cache alone can cause the cgroup to OOM-kill the agent process. To avoid this, the agent can reclaim page cache periodically. Although reclaiming may not cause performance issues for an agent that does not perform much I/O, other processes in the same cgroup may be affected. Very low values are not recommended. Note:

This feature is available for cgroups v1 only.
This feature is disabled if the agent memory cgroup path is "/".
The minimal interval between reclaims is 1 minute.

#1.4.6 Resource Monitoring Interval

Tags:

agent_restart

FQCN:

global.tunning.resource_monitoring_interval

Upgrade from old version: static_config.guard-interval

Default value:

global:
  tunning:
    resource_monitoring_interval: 10s

Schema:

Key	Value
Type	duration
Range	['1s', '3600s']

Description:

The agent will monitor:

System free memory
The number of threads of the agent itself (read from the /proc directory)
The size and number of log files generated by the agent
System load
Agent memory usage (to check whether memory trimming is needed)

#1.5 NTP Clock Synchronization

This synchronization mechanism does not alter the host's clock; it is only used internally by the deepflow-agent process.

#1.5.1 Enabled

Tags:

hot_update

FQCN:

global.ntp.enabled

Upgrade from old version: ntp_enabled

Default value:

global:
  ntp:
    enabled: false

Schema:

Key	Value
Type	bool

Description:

Whether to synchronize the clock with deepflow-server. This does not change the time of the environment in which deepflow-agent runs.

#1.5.2 Maximum Drift

Tags:

agent_restart

FQCN:

global.ntp.max_drift

Upgrade from old version: static_config.ntp-max-interval

Default value:

global:
  ntp:
    max_drift: 300s

Schema:

Key	Value
Type	duration
Range	['0ns', '365d']

Description:

When the clock drift exceeds this value, the agent will restart.

#1.5.3 Minimal Drift

Tags:

agent_restart

FQCN:

global.ntp.min_drift

Upgrade from old version: static_config.ntp-min-interval

Default value:

global:
  ntp:
    min_drift: 10s

Schema:

Key	Value
Type	duration
Range	['0ns', '365d']

Description:

When the clock drift exceeds this value, the timestamp will be corrected.
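
To show how the two drift thresholds relate, the fragment below enables NTP with the default thresholds and lists a few example drift readings in comments. The readings are made up for illustration.

```yaml
global:
  ntp:
    enabled: true
    min_drift: 10s   # default: drift above 10s  -> timestamps are corrected
    max_drift: 300s  # default: drift above 300s -> agent restarts
# Example readings under these values:
#   drift of 4s   -> no action
#   drift of 45s  -> timestamp correction
#   drift of 400s -> agent restart
```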

#1.6 Communication

Configuration of deepflow-agent communication.

#1.6.1 Proactive Request Interval

Tags:

hot_update

FQCN:

global.communication.proactive_request_interval

Upgrade from old version: sync_interval

Default value:

global:
  communication:
    proactive_request_interval: 60s

Schema:

Key	Value
Type	duration
Range	['10s', '3600s']

Description:

The interval at which deepflow-agent proactively requests configuration and tag information from deepflow-server.

#1.6.2 Maximum Escape Duration

Tags:

hot_update

FQCN:

global.communication.max_escape_duration

Upgrade from old version: max_escape_seconds

Default value:

global:
  communication:
    max_escape_duration: 3600s

Schema:

Key	Value
Type	duration
Range	['600s', '30d']

Description:

The maximum time that the agent is allowed to work normally when it cannot connect to the server. After the timeout, the agent automatically enters the disabled state.

#1.6.3 Controller IP Address

Tags:

hot_update

FQCN:

global.communication.proxy_controller_ip

Upgrade from old version: proxy_controller_ip

Default value:

global:
  communication:
    proxy_controller_ip: 127.0.0.1

Schema:

Key	Value
Type	ip

Description:

When this value is set, deepflow-agent will use this IP to access the control plane port of deepflow-server, otherwise, the server will use its own node IP as the control plane communication IP. This parameter is usually used when the server uses a load balancer or a virtual IP to provide services externally.

#1.6.4 Controller Port

Tags:

hot_update

FQCN:

global.communication.proxy_controller_port

Upgrade from old version: proxy_controller_port

Default value:

global:
  communication:
    proxy_controller_port: 30035

Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

The control plane port used by deepflow-agent to access deepflow-server. The default port within the same K8s cluster is 20035, and the default port of deepflow-agent outside the cluster is 30035.

#1.6.5 Ingester IP Address

Tags:

hot_update

FQCN:

global.communication.ingester_ip

Upgrade from old version: analyzer_ip

Default value:

global:
  communication:
    ingester_ip: ''

Schema:

Key	Value
Type	ip

Description:

When this value is set, deepflow-agent will use this IP to access the data plane port of deepflow-server, which is usually used when deepflow-server uses an external load balancer.

#1.6.6 Ingester Port

Tags:

hot_update

FQCN:

global.communication.ingester_port

Upgrade from old version: analyzer_port

Default value:

global:
  communication:
    ingester_port: 30033

Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

The data plane port used by deepflow-agent to access deepflow-server. The default port within the same K8s cluster is 20033, and the default port of deepflow-agent outside the cluster is 30033.
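
As a sketch of the four address settings used together, the fragment below points an agent outside the K8s cluster at a load balancer. The address 192.0.2.10 is a placeholder from the documentation address range, not a real endpoint.

```yaml
global:
  communication:
    proxy_controller_ip: 192.0.2.10  # hypothetical load balancer address
    proxy_controller_port: 30035     # default control plane port from outside the cluster
    ingester_ip: 192.0.2.10          # hypothetical; same load balancer for the data plane
    ingester_port: 30033             # default data plane port from outside the cluster
```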

#1.6.7 gRPC Socket Buffer Size

Tags:

hot_update

FQCN:

global.communication.grpc_buffer_size

Upgrade from old version: static_config.grpc-buffer-size

Default value:

global:
  communication:
    grpc_buffer_size: 5

Schema:

Key	Value
Type	int
Unit	MiB
Range	[5, 1024]

Description:

gRPC socket buffer size.

#1.6.8 Max Throughput To Ingester

Tags:

hot_update

FQCN:

global.communication.max_throughput_to_ingester

Default value:

global:
  communication:
    max_throughput_to_ingester: 100

Schema:

Key	Value
Type	int
Unit	Mbps
Range	[0, 10000]

Description:

The maximum allowed flow rate for sending observability data to the server-side Ingester module. For the overflow action, refer to the ingester_traffic_overflow_action configuration description. Setting it to 0 means no speed limit.

#1.6.9 Action when the Ingester traffic exceeds the limit

Tags:

hot_update

FQCN:

global.communication.ingester_traffic_overflow_action

Default value:

global:
  communication:
    ingester_traffic_overflow_action: WAIT

Enum options:

Value	Note
WAIT
DROP

Schema:

Key	Value
Type	string

Description:

Action when the Ingester traffic exceeds the limit

WAIT: pause sending, cache data into queue, and wait for next sending
DROP: the data is discarded directly and the Agent DATA_BPS_THRESHOLD_EXCEEDED exception is triggered
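
The rate limit and its overflow action are configured side by side. The fragment below is a sketch with an example limit of 200 Mbps that discards excess data instead of queueing it.

```yaml
global:
  communication:
    max_throughput_to_ingester: 200         # Mbps, example value; 0 would mean no limit
    ingester_traffic_overflow_action: DROP  # discard excess data and raise DATA_BPS_THRESHOLD_EXCEEDED
```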

#1.6.10 Request via NAT IP Address

Tags:

hot_update

FQCN:

global.communication.request_via_nat_ip

Upgrade from old version: nat_ip_enabled

Default value:

global:
  communication:
    request_via_nat_ip: false

Schema:

Key	Value
Type	bool

Description:

Used when deepflow-agent uses an external IP address to access deepflow-server. For example, when deepflow-server is behind a NAT gateway, or the host where deepflow-server is located has multiple node IP addresses and different deepflow-agents need to access different node IPs, you can set an additional NAT IP for each deepflow-server address, and modify this value to true.

#1.7 Self Monitoring

Configuration of deepflow-agent's own diagnosis.

#1.7.1 Log

Configuration of deepflow-agent's own logs.

#1.7.1.1 Log Level

Tags:

hot_update

FQCN:

global.self_monitoring.log.log_level

Upgrade from old version: log_level

Default value:

global:
  self_monitoring:
    log:
      log_level: INFO

Enum options:

Value	Note
DEBUG
INFO
WARN
ERROR

Schema:

Key	Value
Type	string

Description:

Log level of deepflow-agent.

It is also possible to specify the log level for specific modules with advanced configuration in the following format:

<log_level_spec> ::= single_log_level_spec[{,single_log_level_spec}][/<text_filter>]
<single_log_level_spec> ::= <path_to_module>|<log_level>|<path_to_module>=<log_level>
<text_filter> ::= <regex>

For example:

log_level: info,deepflow_agent::rpc::session=debug

will set the log level to INFO for all modules and DEBUG for the rpc::session module.

#1.7.1.2 Log File

Tags:

agent_restart

FQCN:

global.self_monitoring.log.log_file

Upgrade from old version: static_config.log-file

Default value:

global:
  self_monitoring:
    log:
      log_file: /var/log/deepflow-agent/deepflow-agent.log

Schema:

Key	Value
Type	string

Description:

The file where deepflow-agent logs are written. Note that this configuration is only used in standalone mode.

#1.7.1.3 Log Backhaul Enabled

Tags:

hot_update

FQCN:

global.self_monitoring.log.log_backhaul_enabled

Upgrade from old version: rsyslog_enabled

Default value:

global:
  self_monitoring:
    log:
      log_backhaul_enabled: true

Schema:

Key	Value
Type	bool

Description:

When enabled, deepflow-agent will send its own logs to deepflow-server.

#1.7.2 Profile

#1.7.2.1 Enabled

Tags:

agent_restart deprecated

FQCN:

global.self_monitoring.profile.enabled

Upgrade from old version: static_config.profiler

Default value:

global:
  self_monitoring:
    profile:
      enabled: false

Schema:

Key	Value
Type	bool

Description:

Only available for Trident (Golang version of Agent).

#1.7.3 Debug

#1.7.3.1 Enabled

Tags:

hot_update

FQCN:

global.self_monitoring.debug.enabled

Upgrade from old version: debug_enabled

Default value:

global:
  self_monitoring:
    debug:
      enabled: true

Schema:

Key	Value
Type	bool

Description:

Enables or disables the debug function of deepflow-agent.

#1.7.3.2 Local UDP Port

Tags:

agent_restart

FQCN:

global.self_monitoring.debug.local_udp_port

Upgrade from old version: static_config.debug-listen-port

Default value:

global:
  self_monitoring:
    debug:
      local_udp_port: 0

Schema:

Key	Value
Type	int
Range	[0, 65535]

Description:

The default value 0 means that a random listen port number is used. Only available for Trident (Golang version of Agent).

#1.7.3.3 Debug Metrics Enabled

Tags:

agent_restart deprecated

FQCN:

global.self_monitoring.debug.debug_metrics_enabled

Upgrade from old version: static_config.enable-debug-stats

Default value:

global:
  self_monitoring:
    debug:
      debug_metrics_enabled: false

Schema:

Key	Value
Type	bool

Description:

Only available for Trident (Golang version of Agent).

#1.7.4 Interval

Tags:

hot_update

FQCN:

global.self_monitoring.interval

Upgrade from old version: stats_interval

Default value:

global:
  self_monitoring:
    interval: 10s

Schema:

Key	Value
Type	duration
Range	['1s', '3600s']

Description:

The statsd interval used for deepflow-agent's self-monitoring statistics.

#1.8 Standalone Mode

Configuration of deepflow-agent standalone mode.

#1.8.1 Maximum Data File Size

Tags:

agent_restart

FQCN:

global.standalone_mode.max_data_file_size

Upgrade from old version: static_config.standalone-data-file-size

Default value:

global:
  standalone_mode:
    max_data_file_size: 200

Schema:

Key	Value
Type	int
Unit	MiB
Range	[1, 1000000]

Description:

When deepflow-agent runs in standalone mode, it will not be controlled by deepflow-server, and the collected data will only be written to the local file. Currently supported data types for writing are l4_flow_log and l7_flow_log. Each type of data is written to a separate file. This configuration can be used to specify the maximum size of the data file, and rotate when it exceeds this size. A maximum of two files are kept for each type of data.

#1.8.2 Data File Directory

Tags:

agent_restart

FQCN:

global.standalone_mode.data_file_dir

Upgrade from old version: static_config.standalone-data-file-dir

Default value:

global:
  standalone_mode:
    data_file_dir: /var/log/deepflow-agent/

Schema:

Key	Value
Type	string

Description:

Directory where data files are written to.
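
A rough sizing sketch for standalone mode, assuming an example file size of 500 MiB. The total is approximate, since a file is rotated only after it exceeds the configured size.

```yaml
global:
  standalone_mode:
    max_data_file_size: 500                  # MiB, example value
    data_file_dir: /var/log/deepflow-agent/  # default
# Approximate worst-case disk usage for data files under this sketch:
#   2 data types (l4_flow_log, l7_flow_log) * 2 files each * 500 MiB = 2000 MiB
```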

#2. Inputs

#2.1 Proc

#2.1.1 Enabled

Tags:

agent_restart

FQCN:

inputs.proc.enabled

Upgrade from old version: static_config.os-proc-sync-enabled

Default value:

inputs:
  proc:
    enabled: true

Schema:

Key	Value
Type	bool

Description:

After enabling this configuration, deepflow-agent will periodically report the process information specified in inputs.proc.process_matcher to deepflow-server. After process information is synchronized, all eBPF observability data is automatically tagged with the global process ID (gprocess_id).

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., proc.gprocess_info must be included in inputs.proc.process_matcher.[*].enabled_features.

This configuration only applies to agents of cloud server types (CHOST_VM, CHOST_BM) and container types (K8S_VM, K8S_BM). Use the command deepflow-ctl agent list to determine the specific agent type in CLI environments.

#2.1.2 Directory of /proc

Tags:

agent_restart

FQCN:

inputs.proc.proc_dir_path

Upgrade from old version: static_config.os-proc-root

Default value:

inputs:
  proc:
    proc_dir_path: /proc

Schema:

Key	Value
Type	string

Description:

The /proc fs mount path.

#2.1.3 Socket Information Synchronization Interval

Tags:

agent_restart

FQCN:

inputs.proc.socket_info_sync_interval

Upgrade from old version: static_config.os-proc-socket-sync-interval

Default value:

inputs:
  proc:
    socket_info_sync_interval: 0ns

Schema:

Key	Value
Type	duration
Range	['0ns', '1h']

Description:

Synchronization interval for process Socket information.

'0ns' means disabled. Do not configure a value less than 1s other than 0.

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., proc.socket_list must be included in inputs.proc.process_matcher.[*].enabled_features. Additionally, ensure inputs.proc.enabled is configured to true.
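
The prerequisites above can be sketched in one fragment. The process name nginx and the 10s interval are hypothetical examples, and proc.socket_list is the feature name listed under the process matcher's enabled_features options.

```yaml
inputs:
  proc:
    enabled: true
    socket_info_sync_interval: 10s  # example value; must be 0 or at least 1s
    process_matcher:
    - match_regex: ^nginx$          # hypothetical process name
      only_in_container: false
      enabled_features:
      - proc.gprocess_info
      - proc.socket_list
```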

#2.1.4 Minimal Lifetime

Tags:

agent_restart

FQCN:

inputs.proc.min_lifetime

Upgrade from old version: static_config.os-proc-socket-min-lifetime

Default value:

inputs:
  proc:
    min_lifetime: 3s

Schema:

Key	Value
Type	duration
Range	['1s', '1h']

Description:

Sockets and processes are not reported if their uptime is lower than this threshold.

#2.1.5 Tag Extraction

#2.1.5.1 Script Command

Tags:

agent_restart

FQCN:

inputs.proc.tag_extraction.script_command

Upgrade from old version: static_config.os-app-tag-exec

Default value:

inputs:
  proc:
    tag_extraction:
      script_command: []

Schema:

Key	Value
Type	string

Description:

The command is executed on every process scan and is expected to print the process tags to stdout in YAML format. An example of the expected output:

- pid: 1
  tags:
  - key: xxx
    value: xxx
- pid: 2
  tags:
  - key: xxx
    value: xxx

Example configuration:

inputs:
  proc:
    tag_extraction:
      script_command: ["cat", "/tmp/tag.yaml"]

#2.1.5.2 Execution Username

Tags:

agent_restart

FQCN:

inputs.proc.tag_extraction.exec_username

Upgrade from old version: static_config.os-app-tag-exec-user

Default value:

inputs:
  proc:
    tag_extraction:
      exec_username: deepflow

Schema:

Key	Value
Type	string

Description:

The user who should execute the script_command command.

#2.1.6 Process Matcher

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher

Upgrade from old version: static_config.os-proc-regex

Default value:

inputs:
  proc:
    process_matcher:
    - enabled_features:
      - proc.gprocess_info
      ignore: true
      match_regex: ^(sleep|sh|bash|pause|runc)$
      only_in_container: false
    - enabled_features:
      - ebpf.profile.on_cpu
      - proc.gprocess_info
      match_regex: \bjava( +\S+)* +-jar +(\S*/)*([^ /]+\.jar)
      match_type: cmdline_with_args
      only_in_container: false
      rewrite_name: $3
    - enabled_features:
      - ebpf.profile.on_cpu
      - proc.gprocess_info
      match_regex: \bpython(\S)*( +-\S+)* +(\S*/)*([^ /]+)
      match_type: cmdline_with_args
      only_in_container: false
      rewrite_name: $4
    - enabled_features:
      - ebpf.profile.on_cpu
      - proc.gprocess_info
      match_regex: ^deepflow-
      only_in_container: false
    - enabled_features:
      - proc.gprocess_info
      match_regex: .*

Schema:

Key	Value
Type	dict

Description:

List of advanced features enabled for specific processes.

The array is traversed in order, so earlier entries are matched first. When match_type is parent_process_name, the matcher recursively matches the names of parent processes, and the rewrite_name field is ignored. rewrite_name can reference regexp capture groups and Windows-style environment variables; for example, $1-py-script-%HOSTNAME% substitutes regexp capture group 1 and the HOSTNAME environment variable.

Configuration items:

match_regex: The regexp used to match the process. Default value is "".
match_type: The field the regexp is matched against. Default value is process_name; options are [process_name, cmdline, cmdline_with_args, parent_process_name, tag].
ignore: Whether to ignore matched processes. Default value is false.
rewrite_name: The replacement for the process name or command line, applied as a regexp replacement. The default value "" means no replacement.
enabled_features: List of features enabled for matched processes. Available options:
- proc.gprocess_info (Ensure inputs.proc.enabled is configured to true)
- proc.golang_symbol_table (Ensure inputs.proc.symbol_table.golang_specific.enabled is configured to true)
- proc.socket_list (Ensure inputs.proc.socket_info_sync_interval is configured to a number > 0)
- ebpf.socket.uprobe.golang (Ensure inputs.ebpf.socket.uprobe.golang.enabled is configured to true)
- ebpf.socket.uprobe.tls (Ensure inputs.ebpf.socket.uprobe.tls.enabled is configured to true)
- ebpf.profile.on_cpu (Ensure inputs.ebpf.profile.on_cpu.disabled is configured to false)
- ebpf.profile.off_cpu (Ensure inputs.ebpf.profile.off_cpu.disabled is configured to false)
- ebpf.profile.memory (Ensure inputs.ebpf.profile.memory.disabled is configured to false)

Example:

inputs:
  proc:
    process_matcher:
    - match_regex: python3 (.*)\.py
      match_type: cmdline
      match_languages: []
      match_usernames: []
      only_in_container: true
      only_with_tag: false
      ignore: false
      rewrite_name: $1-py-script
      enabled_features: [ebpf.socket.uprobe.golang, ebpf.profile.on_cpu]
    - match_regex: (?P<PROC_NAME>nginx)
      match_type: process_name
      rewrite_name: ${PROC_NAME}-%HOSTNAME%
    - match_regex: "nginx"
      match_type: parent_process_name
      ignore: true
    - match_regex: .*sleep.*
      match_type: process_name
      ignore: true
    - match_regex: .+ # matched against each tag's key and value joined by a colon,
                      # i.e., a regex `app:.+` matches all processes that have an `app` tag
      match_type: tag

#2.1.6.1 Match Regex

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.match_regex

Upgrade from old version: static_config.os-proc-regex.match-regex

Default value:

inputs:
  proc:
    process_matcher:
    - match_regex: ''


Schema:

Key	Value
Type	string

Description:

The regular expression used by the matcher.

#2.1.6.2 Match Type

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.match_type

Upgrade from old version: static_config.os-proc-regex.match-type

Default value:

inputs:
  proc:
    process_matcher:
    - match_type: process_name


Enum options:

Value	Note
process_name
cmdline
cmdline_with_args
parent_process_name
tag

Schema:

Key	Value
Type	string

Description:

The type of the matcher, i.e., which process attribute match_regex is applied to.

#2.1.6.3 Match Languages

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.match_languages

Default value:

inputs:
  proc:
    process_matcher:
    - match_languages: []


Enum options:

Value	Note
java
golang
python
nodejs
dotnet

Schema:

Key	Value
Type	string

Description:

The default value [] matches all languages.

#2.1.6.4 Match Usernames

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.match_usernames

Default value:

inputs:
  proc:
    process_matcher:
    - match_usernames: []


Schema:

Key	Value
Type	string

Description:

The default value [] matches all usernames.

#2.1.6.5 Only in Container

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.only_in_container

Default value:

inputs:
  proc:
    process_matcher:
    - only_in_container: true


Schema:

Key	Value
Type	bool

Description:

The default value true means that only processes running in containers are matched.

#2.1.6.6 Only with Tag

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.only_with_tag

Upgrade from old version: static_config.os-proc-sync-tagged-only

Default value:

inputs:
  proc:
    process_matcher:
    - only_with_tag: false


Schema:

Key	Value
Type	bool

Description:

The default value false means that processes are matched whether or not they have tags.

#2.1.6.7 Ignore

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.ignore

Upgrade from old version: static_config.os-proc-regex.action

Default value:

inputs:
  proc:
    process_matcher:
    - ignore: false


Schema:

Key	Value
Type	bool

Description:

Whether to ignore matched processes.

#2.1.6.8 Rewrite Name

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.rewrite_name

Upgrade from old version: static_config.os-proc-regex.rewrite-name

Default value:

inputs:
  proc:
    process_matcher:
    - rewrite_name: ''


Schema:

Key	Value
Type	string

Description:

The new name given to a process after it is matched.

#2.1.6.9 Enabled Features

Tags:

agent_restart

FQCN:

inputs.proc.process_matcher.enabled_features

Upgrade from old version: static_config.ebpf.on-cpu-profile.regex, static_config.ebpf.off-cpu-profile.regex

Default value:

inputs:
  proc:
    process_matcher:
    - enabled_features: []


Enum options:

Value	Note
proc.gprocess_info
proc.golang_symbol_table
proc.socket_list
ebpf.socket.uprobe.golang
ebpf.socket.uprobe.tls
ebpf.profile.on_cpu
ebpf.profile.off_cpu
ebpf.profile.memory

Schema:

Key	Value
Type	string

Description:

Also ensure the global configuration parameters for related features are enabled:

proc.gprocess_info (Ensure inputs.proc.enabled is configured to true)
proc.golang_symbol_table (Ensure inputs.proc.symbol_table.golang_specific.enabled is configured to true)
proc.socket_list (Ensure inputs.proc.socket_info_sync_interval is configured to a number > 0)
ebpf.socket.uprobe.golang (Ensure inputs.ebpf.socket.uprobe.golang.enabled is configured to true)
ebpf.socket.uprobe.tls (Ensure inputs.ebpf.socket.uprobe.tls.enabled is configured to true)
ebpf.profile.on_cpu (Ensure inputs.ebpf.profile.on_cpu.disabled is configured to false)
ebpf.profile.off_cpu (Ensure inputs.ebpf.profile.off_cpu.disabled is configured to false)
ebpf.profile.memory (Ensure inputs.ebpf.profile.memory.disabled is configured to false)
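
As a minimal sketch of how the per-process feature list and the global switch fit together, the fragment below enables on-CPU profiling for matched Java processes. The regular expression, the choice of `java`, and `only_in_container: false` are illustrative assumptions, not recommended values; only the keys come from this section.

```yaml
inputs:
  proc:
    process_matcher:
    # Illustrative matcher (assumption): any Java process, in or out of containers.
    - match_regex: '.*'
      match_type: process_name
      match_languages: [java]
      only_in_container: false
      enabled_features: [ebpf.profile.on_cpu]
  ebpf:
    profile:
      on_cpu:
        # Global switch that ebpf.profile.on_cpu in enabled_features depends on.
        disabled: false
```

If the global switch is left disabled, listing the feature in enabled_features alone is not expected to have any effect.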

#2.1.7 Symbol Table

#2.1.7.1 Golang-specific

#2.1.7.1.1 Enabled

Tags:

agent_restart

FQCN:

inputs.proc.symbol_table.golang_specific.enabled

Upgrade from old version: static_config.ebpf.uprobe-process-name-regexs.golang-symbol

Default value:

inputs:
  proc:
    symbol_table:
      golang_specific:
        enabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to enable Golang-specific symbol table parsing.

This feature applies to Golang processes whose standard symbol table has been stripped. When it is enabled, for processes built with Golang version >= 1.13 and < 1.18 that lack the standard symbol table, the Golang-specific symbol table is parsed to complete uprobe data collection. Note that enabling this feature may cause eBPF initialization to take as long as ten minutes.

Example:

You've encountered the following warning log:

[eBPF] WARNING: func resolve_bin_file() [user/go_tracer.c:558] Go process pid 1946
[path: /proc/1946/root/usr/local/bin/kube-controller-manager] (version: go1.16). Not find any symbols!


Suppose there is a Golang process with process ID 1946.

To initially confirm whether the executable file for this process has a symbol table:

Retrieve the executable file's path using the process ID:

# ls -al /proc/1946/exe
/proc/1946/exe -> /usr/local/bin/kube-controller-manager


Check if there is a symbol table:

# nm /proc/1946/root/usr/local/bin/kube-controller-manager
nm: /proc/1946/root/usr/local/bin/kube-controller-manager: no symbols


If "no symbols" is reported, the executable has no symbol table. In that case, enable this setting (named "golang-symbol" in old versions).
During agent startup, you will then see log output like the following (the entry address of the function crypto/tls.(*Conn).Write has been resolved, i.e., entry:0x25fca0):
```
[eBPF] INFO Uprobe [/proc/1946/root/usr/local/bin/kube-controller-manager] pid:1946 go1.16.0
entry:0x25fca0 size:1952 symname:crypto/tls.(*Conn).Write probe_func:uprobe_go_tls_write_enter rets_count:0
```
The logs indicate that the Golang program has been successfully hooked.

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., proc.golang_symbol_table must be included in inputs.proc.process_matcher.[*].enabled_features.
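
The note above can be sketched as the following fragment, which turns on Golang-specific symbol table parsing and selects the process from the log example. The matcher regex is a hypothetical choice, and the exact form of the process name the agent compares against is an assumption; adjust it to your environment.

```yaml
inputs:
  proc:
    symbol_table:
      golang_specific:
        enabled: true
    process_matcher:
    # Hypothetical matcher (assumption) for the kube-controller-manager process.
    - match_regex: 'kube-controller.*'
      match_type: process_name
      enabled_features: [proc.golang_symbol_table, ebpf.socket.uprobe.golang]
  ebpf:
    socket:
      uprobe:
        golang:
          # Global switch that ebpf.socket.uprobe.golang in enabled_features depends on.
          enabled: true
```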

#2.1.7.2 Java

#2.1.7.2.1 Refresh Defer Duration

Tags:

agent_restart

FQCN:

inputs.proc.symbol_table.java.refresh_defer_duration

Upgrade from old version: static_config.ebpf.java-symbol-file-refresh-defer-interval

Default value:

inputs:
  proc:
    symbol_table:
      java:
        refresh_defer_duration: 60s


Schema:

Key	Value
Type	duration
Range	['5s', '3600s']

Description:

When the deepflow-agent detects unresolved function names in the Java process call stack, it triggers the generation of the process function symbol table and updates the symbol cache. Currently, the Java symbol file is continuously updated, and the duration is used to control the delay in updating the symbol cache with the new symbol file. This delay is necessary because Java uses a JIT (Just-In-Time) compilation mechanism, which requires a warm-up phase for symbol generation. To obtain more complete Java symbols, the update of the Java symbol cache is deferred. This approach also helps avoid frequent symbol cache refreshes due to missing symbols, which could otherwise result in significant CPU resource consumption.

#2.1.7.2.2 Maximum Symbol File Size

Tags:

agent_restart deprecated

FQCN:

inputs.proc.symbol_table.java.max_symbol_file_size

Upgrade from old version: static_config.ebpf.java-symbol-file-max-space-limit

Default value:

inputs:
  proc:
    symbol_table:
      java:
        max_symbol_file_size: 10


Schema:

Key	Value
Type	int
Unit	MiB
Range	[2, 100]

Description:

All Java symbol files are stored in the '/tmp' directory mounted by the deepflow-agent. To prevent excessive occupation of host node space due to large Java symbol files, a maximum size limit is set for each generated Java symbol file.

#2.2 cBPF

#2.2.1 Common

#2.2.1.1 Packet Capture Mode

Tags:

hot_update

FQCN:

inputs.cbpf.common.capture_mode

Upgrade from old version: tap_mode

Default value:

inputs:
  cbpf:
    common:
      capture_mode: 0


Enum options:

Value	Note
0	Local
1	Virtual Mirror
2	Physical Mirror

Schema:

Key	Value
Type	int

Description:

Virtual Mirror mode is used when deepflow-agent cannot directly capture the traffic from the source. For example:

in the K8s macvlan environment, capture the Pod traffic through the Node NIC
in the Hyper-V environment, capture the VM traffic through the Hypervisor NIC
in the ESXi environment, capture traffic through VDS/VSS local SPAN
in the DPDK environment, capture traffic through DPDK ring buffer

Use Physical Mirror mode when deepflow-agent captures traffic through physical switch mirroring.

Physical Mirror is only supported in the Enterprise Edition.

#2.2.2 Capture via AF_PACKET

#2.2.2.1 Interface Regex

Tags:

hot_update

FQCN:

inputs.cbpf.af_packet.interface_regex

Upgrade from old version: tap_interface_regex

Default value:

inputs:
  cbpf:
    af_packet:
      interface_regex: ^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|lo|[0-9a-f]+_h)$


Schema:

Key	Value
Type	string
Range	[0, 65535]

Description:

Regular expression of NIC name for collecting traffic.

Explanation of the default configuration:

Localhost:     lo
Common NIC:    eth.*|en[osipx].*
QEMU VM NIC:   tap.*
Flannel:       veth.*
Calico:        cali.*
Cilium:        lxc.*
Kube-OVN:      [0-9a-f]+_h$


When this value is left empty, no NIC traffic is collected.
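
For illustration, a narrowed-down regex might look like the fragment below. The selection of interface families is an assumption made for this sketch (common NICs plus Calico); it reuses only patterns that already appear in the default value.

```yaml
inputs:
  cbpf:
    af_packet:
      # Illustrative value (assumption): capture only common NICs and Calico interfaces.
      interface_regex: '^(eth.*|en[osipx].*|cali.*)$'
```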

#2.2.2.2 Inner Net Namespace Capture Enabled

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.inner_interface_capture_enabled

Default value:

inputs:
  cbpf:
    af_packet:
      inner_interface_capture_enabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to collect traffic in sub net namespaces. When enabled, the agent spawns recv engine threads to capture traffic in different namespaces, which consumes additional memory for each namespace captured. The default value of inputs.cbpf.af_packet.tunning.ring_blocks is 128, which means 128 * 1MB of memory per namespace. For example, a node with 20 pods requires 20 * 128 * 1MB = 2.56GB for the dispatcher. Estimate this memory consumption before enabling this feature. To reduce memory consumption, enable inputs.cbpf.af_packet.tunning.ring_blocks_enabled and lower inputs.cbpf.af_packet.tunning.ring_blocks.
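
A sketch of the memory trade-off described above, assuming the same 20-namespace node. The value 32 is an arbitrary illustration within the documented range, not a recommendation.

```yaml
inputs:
  cbpf:
    af_packet:
      inner_interface_capture_enabled: true
      tunning:
        # Must be switched on explicitly for ring_blocks to be configurable.
        ring_blocks_enabled: true
        # Illustrative value (assumption): 20 namespaces * 32 blocks * 1MB = 640MB,
        # compared with 20 * 128 * 1MB = 2.56GB at the default of 128.
        ring_blocks: 32
```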

#2.2.2.3 Inner Net Namespace Interface Regex

Tags:

hot_update

FQCN:

inputs.cbpf.af_packet.inner_interface_regex

Default value:

inputs:
  cbpf:
    af_packet:
      inner_interface_regex: ^eth\d+$


Schema:

Key	Value
Type	string
Range	[0, 65535]

Description:

Regular expression of NIC name for collecting traffic in sub net namespaces.

#2.2.2.4 Bond Interfaces

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.bond_interfaces

Upgrade from old version: static_config.tap-interface-bond-groups

Default value:

inputs:
  cbpf:
    af_packet:
      bond_interfaces: []


Schema:

Key	Value
Type	dict

Description:

Packets from interfaces in the same group can be aggregated together. Only effective when inputs.cbpf.common.capture_mode is 0 (Local).

Example:

inputs:
  cbpf:
    af_packet:
      bond_interfaces:
      - slave_interfaces: [eth0, eth1]
      - slave_interfaces: [eth2, eth3]


#2.2.2.4.1 Slave Interfaces

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.bond_interfaces.slave_interfaces

Upgrade from old version: static_config.tap-interface-bond-groups.tap-interfaces

Default value:

inputs:
  cbpf:
    af_packet:
      bond_interfaces:
      - slave_interfaces: []


Schema:

Key	Value
Type	string

Description:

The slave interfaces of one bond interface.

#2.2.2.5 Extra Network Namespace Regex

Tags:

hot_update ee_feature

FQCN:

inputs.cbpf.af_packet.extra_netns_regex

Upgrade from old version: extra_netns_regex

Default value:

inputs:
  cbpf:
    af_packet:
      extra_netns_regex: ''


Schema:

Key	Value
Type	string

Description:

Packets are captured in the namespaces matched by this regex, in addition to the default namespace. NICs captured in the extra namespaces are also filtered by inputs.cbpf.af_packet.interface_regex.

Default value "" means no extra network namespace (default namespace only).

#2.2.2.6 Extra BPF Filter

Tags:

hot_update

FQCN:

inputs.cbpf.af_packet.extra_bpf_filter

Upgrade from old version: capture_bpf

Default value:

inputs:
  cbpf:
    af_packet:
      extra_bpf_filter: ''


Schema:

Key	Value
Type	string
Range	[0, 512]

Description:

If not configured, all traffic is collected. For BPF syntax, refer to: https://biot.com/capstats/bpf.html
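
A minimal sketch, assuming the filter expression follows the standard BPF syntax from the reference above. The expression itself (excluding SSH traffic) is an illustrative assumption.

```yaml
inputs:
  cbpf:
    af_packet:
      # Illustrative filter (assumption): skip SSH traffic on port 22.
      # The expression must fit within the documented length range of [0, 512].
      extra_bpf_filter: 'not port 22'
```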

#2.2.2.7 TAP Interfaces

Tags:

deprecated

FQCN:

inputs.cbpf.af_packet.src_interfaces

Upgrade from old version: static_config.src-interfaces

Default value:

inputs:
  cbpf:
    af_packet:
      src_interfaces: []


Schema:

Key	Value
Type	string

#2.2.2.8 VLAN PCP in Physical Mirror Traffic

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.af_packet.vlan_pcp_in_physical_mirror_traffic

Upgrade from old version: static_config.mirror-traffic-pcp

Default value:

inputs:
  cbpf:
    af_packet:
      vlan_pcp_in_physical_mirror_traffic: 0


Schema:

Key	Value
Type	int
Range	[0, 9]

Description:

When this value is <= 7, the TAP value is calculated from the VLAN tag only if the VLAN PCP matches this value.
When this value is 8, the TAP value is calculated from the outer VLAN tag.
When this value is 9, the TAP value is calculated from the inner VLAN tag.

#2.2.2.9 BPF Filter Disabled

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.bpf_filter_disabled

Upgrade from old version: static_config.bpf-disabled

Default value:

inputs:
  cbpf:
    af_packet:
      bpf_filter_disabled: false


Schema:

Key	Value
Type	bool

Description:

BPF traffic filtering may be buggy on some Linux kernel versions. When this configuration is enabled, deepflow-agent does not use the filtering capabilities of BPF and instead filters packets itself after capturing all traffic. Note that this may significantly increase the resource overhead of deepflow-agent.

#2.2.2.10 Skip NPB BPF

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.skip_npb_bpf

Upgrade from old version: static_config.skip-npb-bpf

Default value:

inputs:
  cbpf:
    af_packet:
      skip_npb_bpf: false


Schema:

Key	Value
Type	bool

Description:

If the NIC on the data plane carries ERSPAN tunnel traffic but no NPB traffic, enable this switch to collect the ERSPAN traffic.

#2.2.2.11 Tuning

#2.2.2.11.1 Socket Version

Tags:

hot_update

FQCN:

inputs.cbpf.af_packet.tunning.socket_version

Upgrade from old version: capture_socket_type

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        socket_version: 0


Enum options:

Value	Note
0	Adaptive
2	AF_PACKET V2
3	AF_PACKET V3

Schema:

Key	Value
Type	int

Description:

AF_PACKET socket version in Linux environment.

#2.2.2.11.2 Ring Blocks Config Enabled

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.tunning.ring_blocks_enabled

Upgrade from old version: static_config.afpacket-blocks-enabled

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        ring_blocks_enabled: false


Schema:

Key	Value
Type	bool

Description:

When inputs.cbpf.common.capture_mode != Physical Mirror, you need to explicitly turn on this switch to configure 'inputs.cbpf.af_packet.tunning.ring_blocks'.

#2.2.2.11.3 Ring Blocks

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.tunning.ring_blocks

Upgrade from old version: static_config.afpacket-blocks

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        ring_blocks: 128


Schema:

Key	Value
Type	int
Range	[8, 1000000]

Description:

deepflow-agent will automatically calculate the number of blocks used by AF_PACKET according to max_memory, which can also be specified using this configuration item. The size of each block is fixed at 1MB.

#2.2.2.11.4 Packet Fanout Count

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.tunning.packet_fanout_count

Upgrade from old version: static_config.local-dispatcher-count

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        packet_fanout_count: 1


Schema:

Key	Value
Type	int
Range	[1, 64]

Description:

This configuration takes effect when inputs.cbpf.common.capture_mode is Local and inputs.cbpf.af_packet.extra_netns_regex is empty. PACKET_FANOUT enables load balancing and parallel processing, scaling the dispatcher to handle network traffic with better performance. When packet_fanout_count is greater than 1, multiple dispatcher threads are launched, consuming more CPU and memory. Increasing packet_fanout_count helps reduce the operating system's software interrupts on multi-core CPU servers.

Attention:

Only valid for inputs.cbpf.common.capture_mode = Local.
When inputs.cbpf.special_network.dpdk.source is eBPF, this value is forced to inputs.ebpf.tunning.userspace_worker_threads.
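
The conditions above can be sketched as one fragment. The fanout count of 4 is an illustrative assumption; the other values restate the documented preconditions and defaults.

```yaml
inputs:
  cbpf:
    common:
      capture_mode: 0            # Local, required for fanout to take effect
    af_packet:
      extra_netns_regex: ''      # must stay empty for fanout to take effect
      tunning:
        packet_fanout_count: 4   # illustrative (assumption); launches multiple dispatcher threads
        packet_fanout_mode: 0    # PACKET_FANOUT_HASH (the default)
```

Because each additional dispatcher thread costs CPU and memory, it may be worth checking global.limits.max_millicpus and global.limits.max_memory when raising the count.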

#2.2.2.11.5 Packet Fanout Mode

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.tunning.packet_fanout_mode

Upgrade from old version: static_config.packet-fanout-mode

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        packet_fanout_mode: 0


Enum options:

Value	Note
0	PACKET_FANOUT_HASH
1	PACKET_FANOUT_LB
2	PACKET_FANOUT_CPU
3	PACKET_FANOUT_ROLLOVER
4	PACKET_FANOUT_RND
5	PACKET_FANOUT_QM
6	PACKET_FANOUT_CBPF
7	PACKET_FANOUT_EBPF

Schema:

Key	Value
Type	int

Description:

The configuration is a parameter used with the PACKET_FANOUT feature in the Linux kernel to specify the desired packet distribution algorithm. Refer to:

#2.2.2.11.6 Interface Promisc Enabled

Tags:

agent_restart

FQCN:

inputs.cbpf.af_packet.tunning.interface_promisc_enabled

Default value:

inputs:
  cbpf:
    af_packet:
      tunning:
        interface_promisc_enabled: false


Schema:

Key	Value
Type	bool

Description:

The following scenarios require promiscuous mode to be enabled:

inputs.cbpf.common.capture_mode is Virtual Mirror or Physical Mirror
inputs.cbpf.common.capture_mode is Local and traffic to the virtual machine cannot be collected

Note: After promiscuous mode is enabled on the NIC, more traffic will be collected, resulting in lower performance.
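
As a sketch of the first scenario, the fragment below pairs Virtual Mirror mode with promiscuous mode. It only combines keys documented on this page; whether your environment needs it is an assumption to verify.

```yaml
inputs:
  cbpf:
    common:
      capture_mode: 1   # Virtual Mirror
    af_packet:
      tunning:
        interface_promisc_enabled: true
```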

#2.2.3 Special Network

#2.2.3.1 DPDK

#2.2.3.1.1 Source

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.special_network.dpdk.source

Default value:

inputs:
  cbpf:
    special_network:
      dpdk:
        source: None


Enum options:

Value	Note
None
eBPF
pdump

Schema:

Key	Value
Type	string

Description:

Currently, there are two ways to collect DPDK traffic:

pdump: For details, see https://dpdk-docs.readthedocs.io/en/latest/prog_guide/multi_proc_support.html
eBPF: Use eBPF Uprobe to obtain DPDK traffic

#2.2.3.1.2 Reorder Cache Window Size

Tags:

agent_restart

FQCN:

inputs.cbpf.special_network.dpdk.reorder_cache_window_size

Default value:

inputs:
  cbpf:
    special_network:
      dpdk:
        reorder_cache_window_size: 60ms


Schema:

Key	Value
Type	duration
Range	['60ms', '100ms']

Description:

Takes effect when inputs.cbpf.special_network.dpdk.source is eBPF. A larger time window causes the agent to use more memory.

#2.2.3.2 Libpcap

#2.2.3.2.1 Enabled

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.special_network.libpcap.enabled

Upgrade from old version: static_config.libpcap-enabled

Default value:

inputs:
  cbpf:
    special_network:
      libpcap:
        enabled: false


Schema:

Key	Value
Type	bool

Description:

Supported on Windows and Linux. Performance is low when capturing on multiple interfaces. Defaults to true on Windows and false on Linux.

#2.2.3.3 vHost User

#2.2.3.3.1 vHost Socket Path

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.special_network.vhost_user.vhost_socket_path

Upgrade from old version: static_config.vhost-socket-path

Default value:

inputs:
  cbpf:
    special_network:
      vhost_user:
        vhost_socket_path: ''


Schema:

Key	Value
Type	string

Description:

Supports running on Linux with mirror mode.

#2.2.3.4 Physical Switch

#2.2.3.4.1 sFlow Receiving Ports

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.special_network.physical_switch.sflow_ports

Upgrade from old version: static_config.xflow-collector.sflow-ports

Default value:

inputs:
  cbpf:
    special_network:
      physical_switch:
        sflow_ports: []


Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

This feature is only supported by the Enterprise Edition of Trident. In general, sFlow uses port 6343. Default value [] means that no sFlow data will be collected.

#2.2.3.4.2 NetFlow Receiving Ports

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.special_network.physical_switch.netflow_ports

Upgrade from old version: static_config.xflow-collector.netflow-ports

Default value:

inputs:
  cbpf:
    special_network:
      physical_switch:
        netflow_ports: []


Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

This feature is only supported by the Enterprise Edition of Trident. Additionally, only NetFlow v5 is currently supported. In general, NetFlow uses port 2055. Default value [] means that no NetFlow data will be collected.
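
A minimal sketch of enabling both receivers on the commonly used ports mentioned above. The port numbers must match what the physical switch actually exports to, which is an assumption here.

```yaml
inputs:
  cbpf:
    special_network:
      physical_switch:
        sflow_ports: [6343]     # commonly used sFlow port
        netflow_ports: [2055]   # commonly used NetFlow port (only NetFlow v5 is supported)
```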

#2.2.4 Tuning

#2.2.4.1 Dispatcher Queue Enabled

Tags:

agent_restart

FQCN:

inputs.cbpf.tunning.dispatcher_queue_enabled

Upgrade from old version: static_config.dispatcher-queue

Default value:

inputs:
  cbpf:
    tunning:
      dispatcher_queue_enabled: false


Schema:

Key	Value
Type	bool

Description:

This configuration takes effect when inputs.cbpf.common.capture_mode is Local or Virtual Mirror. When inputs.cbpf.common.capture_mode is Physical Mirror, the dispatcher queue is always enabled.

Available for all recv_engines.

#2.2.4.2 Maximum Capture Packet Size

Tags:

hot_update

FQCN:

inputs.cbpf.tunning.max_capture_packet_size

Upgrade from old version: capture_packet_size

Default value:

inputs:
  cbpf:
    tunning:
      max_capture_packet_size: 65535


Schema:

Key	Value
Type	int
Unit	byte
Range	[128, 65535]

Description:

DPDK environment does not support this configuration.

#2.2.4.3 Raw Packet Buffer Block Size

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.tunning.raw_packet_buffer_block_size

Upgrade from old version: static_config.analyzer-raw-packet-block-size

Default value:

inputs:
  cbpf:
    tunning:
      raw_packet_buffer_block_size: 65536


Schema:

Key	Value
Type	int
Range	[65536, 16000000]

Description:

In certain modes, raw packets go through a queue before being processed. To avoid a memory allocation for each packet, a memory block of size raw_packet_buffer_block_size is allocated and shared by multiple packets. A larger value reduces the number of memory allocations for raw packets, but also delays freeing the memory. This configuration takes effect in the following inputs.cbpf.common.capture_mode settings:

analyzer mode
local mode with inputs.cbpf.af_packet.inner_interface_capture_enabled = true
local mode with inputs.cbpf.tunning.dispatcher_queue_enabled = true
mirror mode with inputs.cbpf.tunning.dispatcher_queue_enabled = true

#2.2.4.4 Raw Packet Queue Size

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.tunning.raw_packet_queue_size

Upgrade from old version: static_config.analyzer-queue-size

Default value:

inputs:
  cbpf:
    tunning:
      raw_packet_queue_size: 131072


Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues (only for inputs.cbpf.common.capture_mode = Physical Mirror):

0.1-bytes-to-parse
0.2-packet-to-flowgenerator
0.3-packet-to-pipeline

#2.2.4.5 Max Capture PPS

Tags:

hot_update

FQCN:

inputs.cbpf.tunning.max_capture_pps

Upgrade from old version: max_collect_pps

Default value:

inputs:
  cbpf:
    tunning:
      max_capture_pps: 1048576


Schema:

Key	Value
Type	int
Unit	pps
Range	[1, 10000000]

Description:

Maximum packet rate allowed for collection.

Available for all recv_engines.

#2.2.5 Preprocess

#2.2.5.1 Tunnel Decap Protocols

Tags:

hot_update

FQCN:

inputs.cbpf.preprocess.tunnel_decap_protocols

Upgrade from old version: decap_type

Default value:

inputs:
  cbpf:
    preprocess:
      tunnel_decap_protocols:
      - 1
      - 2


Enum options:

Value	Note
1	VXLAN
2	IPIP
3	GRE
4	Geneve
5	VXLAN-NSH

Schema:

Key	Value
Type	int

Description:

The tunnel protocols to decapsulate. Only the Enterprise Edition supports decapsulating GRE and VXLAN-NSH.
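
For illustration, the fragment below extends the default list with Geneve, using the numeric values from the enum table. Choosing Geneve is an assumption made for this sketch.

```yaml
inputs:
  cbpf:
    preprocess:
      tunnel_decap_protocols:
      - 1   # VXLAN (default)
      - 2   # IPIP (default)
      - 4   # Geneve (added for illustration)
```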

#2.2.5.2 Tunnel Trim Protocols

Tags:

agent_restart

FQCN:

inputs.cbpf.preprocess.tunnel_trim_protocols

Upgrade from old version: static_config.trim-tunnel-types

Default value:

inputs:
  cbpf:
    preprocess:
      tunnel_trim_protocols: []


Enum options:

Value	Note
ERSPAN
VXLAN
TEB

Schema:

Key	Value
Type	string

Description:

The tunnel protocols whose headers are removed from mirrored traffic. Only the Enterprise Edition supports ERSPAN and TEB.

#2.2.5.3 Packet Segmentation Reassembly Ports

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.preprocess.packet_segmentation_reassembly

Upgrade from old version: static_config.packet-segmentation-reassembly

Default value:

inputs:
  cbpf:
    preprocess:
      packet_segmentation_reassembly: []


Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

For the specified ports, consecutive TCP packets will be aggregated together for application log parsing.

#2.2.6 Physical Mirror Traffic

#2.2.6.1 Default Capture Network Type

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.physical_mirror.default_capture_network_type

Upgrade from old version: static_config.default-tap-type

Default value:

inputs:
  cbpf:
    physical_mirror:
      default_capture_network_type: 3


Enum options:

Value	Note
3	Cloud Network
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	int

Description:

deepflow-agent marks the TAP (Traffic Access Point) location according to the outer VLAN tag in the mirrored traffic of the physical switch. When the VLAN tag has no corresponding TAP value, or the VLAN PCP does not match inputs.cbpf.af_packet.vlan_pcp_in_physical_mirror_traffic, the TAP value is set to the value of this configuration item. The default value 3 means Cloud Network.

#2.2.6.2 Packet Dedup Disabled

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.physical_mirror.packet_dedup_disabled

Upgrade from old version: static_config.analyzer-dedup-disabled

Default value:

inputs:
  cbpf:
    physical_mirror:
      packet_dedup_disabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to disable mirror traffic deduplication when inputs.cbpf.common.capture_mode = Physical Mirror. The default value false keeps deduplication enabled.

#2.2.6.3 Gateway Traffic of Private Cloud

Tags:

agent_restart ee_feature

FQCN:

inputs.cbpf.physical_mirror.private_cloud_gateway_traffic

Upgrade from old version: static_config.cloud-gateway-traffic

Default value:

inputs:
  cbpf:
    physical_mirror:
      private_cloud_gateway_traffic: false


Schema:

Key	Value
Type	bool

Description:

Whether it is the mirrored traffic of NFVGW (cloud gateway) when inputs.cbpf.common.capture_mode = Physical Mirror.

#2.3 eBPF

#2.3.1 Disabled

Tags:

agent_restart

FQCN:

inputs.ebpf.disabled

Upgrade from old version: static_config.ebpf.disabled

Default value:

inputs:
  ebpf:
    disabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to disable eBPF features. The default value false keeps eBPF features enabled.

#2.3.2 Socket

#2.3.2.1 Uprobe

#2.3.2.1.1 Golang

#2.3.2.1.1.1 Enabled

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.uprobe.golang.enabled

Upgrade from old version: static_config.ebpf.uprobe-golang-trace-enabled, static_config.ebpf.uprobe-process-name-regexs.golang

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        golang:
          enabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to enable HTTP2/HTTPS protocol data collection and auto-tracing for Golang processes. Go auto-tracing also depends on inputs.ebpf.socket.uprobe.golang.tracing_timeout (go-tracing-timeout in old versions).

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., ebpf.socket.uprobe.golang must be included in inputs.proc.process_matcher.[*].enabled_features.

#2.3.2.1.1.2 Tracing Timeout

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.uprobe.golang.tracing_timeout

Upgrade from old version: static_config.ebpf.go-tracing-timeout

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        golang:
          tracing_timeout: 120s


Schema:

Key	Value
Type	duration
Range	['0ns', '1d']

Description:

The expected maximum time interval between the server receiving the request and returning the response. If the value is '0ns', this feature is disabled. Tracing only considers the thread number.

#2.3.2.1.2 TLS

#2.3.2.1.2.1 Enabled

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.uprobe.tls.enabled

Upgrade from old version: static_config.ebpf.uprobe-openssl-trace-enabled, static_config.ebpf.uprobe-process-name-regexs.openssl

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        tls:
          enabled: false


Schema:

Key	Value
Type	bool

Description:

Whether to enable HTTPS protocol data collection for processes that use the openssl library.

To determine whether an application process can have its openssl library hooked with Uprobe to access encrypted data:

Run cat /proc/<PID>/maps | grep "libssl.so" and check whether the output contains openssl information. If it does, the process is using the openssl library.

Once enabled, deepflow-agent retrieves the processes that match the regular expression and hooks the corresponding encryption/decryption interfaces of the openssl library. The logs will contain a message similar to the following:

[eBPF] INFO openssl uprobe, pid:1005, path:/proc/1005/root/usr/lib64/libssl.so.1.0.2k

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., ebpf.socket.uprobe.tls must be included in inputs.proc.process_matcher.[*].enabled_features.
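
The note above can be sketched as follows. The nginx matcher and `only_in_container: false` are hypothetical choices for a host process that links against libssl; substitute the processes you actually want to trace.

```yaml
inputs:
  ebpf:
    socket:
      uprobe:
        tls:
          # Global switch that ebpf.socket.uprobe.tls in enabled_features depends on.
          enabled: true
  proc:
    process_matcher:
    # Hypothetical matcher (assumption): nginx processes using the openssl library.
    - match_regex: nginx
      match_type: process_name
      only_in_container: false
      enabled_features: [ebpf.socket.uprobe.tls]
```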

#2.3.2.1.3 DPDK

#2.3.2.1.3.1 DPDK Application Command Name

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.uprobe.dpdk.command

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          command: ''


Schema:

Key	Value
Type	string

Description:

Set the command name of the DPDK application, eBPF will automatically locate and trace packets for data collection.

Example: In the command line /usr/bin/mydpdk, it can be set as command: mydpdk

In scenarios where DPDK acts as the vhost-user backend, data exchange between the virtual machine and the DPDK application occurs through virtqueues (vrings). eBPF can automatically hook into the vring interface without requiring any modifications to DPDK or the virtual machine, enabling packet capture and traffic observability with zero additional configuration. In contrast, capturing packets on physical NICs requires explicit configuration of the corresponding DPDK driver interfaces.
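
A minimal sketch for the /usr/bin/mydpdk example above. It assumes that eBPF-based DPDK capture is also selected through inputs.cbpf.special_network.dpdk.source; treat the pairing of the two keys as an assumption to confirm for your deployment.

```yaml
inputs:
  cbpf:
    special_network:
      dpdk:
        source: eBPF      # assumption: pairs with the uprobe settings below
  ebpf:
    socket:
      uprobe:
        dpdk:
          command: mydpdk # command name taken from /usr/bin/mydpdk
```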

#2.3.2.1.3.2 DPDK Application RX Hooks Configuration

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.uprobe.dpdk.rx_hooks

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          rx_hooks: []


Schema:

Key	Value
Type	string

Description:

Fill in the appropriate packet reception hook point according to the actual network card driver. You can use the command 'lspci -vmmk' to find the network card driver type. For example:

Slot:   04:00.0
Class:  Ethernet controller
Vendor: Intel Corporation
Device: Ethernet Controller XL710 for 40GbE QSFP+
SVendor:        Unknown vendor 1e18
SDevice:        Device 4712
Rev:    02
Driver: igb_uio
Module: i40e

In the example above, "Driver: igb_uio" indicates a DPDK-managed device (other options include "vfio-pci" and "uio_pci_generic", which are also managed by DPDK). The actual driver is 'i40e' (derived from 'Module: i40e').

You can use the continuous profiling feature provided by DeepFlow to profile the functions of the DPDK application and check the specific interface names. Alternatively, run perf on the node where the agent is located: perf record -F97 -a -g -p <DPDK application PID> -- sleep 30, then use perf script | grep -E 'recv|xmit|rx|tx' | grep <driver_name> (driver_name may be ixgbe/i40e/mlx5) to confirm the driver interfaces.

Below are some common interface names for different drivers, for reference only:

Physical NIC Drivers:
- Intel Drivers:
  - ixgbe: Supports Intel 82598/82599/X520/X540/X550 series NICs.
    - rx: ixgbe_recv_pkts, ixgbe_recv_pkts_vec
    - tx: ixgbe_xmit_pkts, ixgbe_xmit_fixed_burst_vec, ixgbe_xmit_pkts_vec
  - i40e: Supports Intel X710, XL710 series NICs.
    - rx: i40e_recv_pkts
    - tx: i40e_xmit_pkts
  - ice: Supports Intel E810 series NICs.
    - rx: ice_recv_pkts
    - tx: ice_xmit_pkts
- Mellanox Drivers:
  - mlx4: Supports Mellanox ConnectX-3 series NICs.
    - rx: mlx4_rx_burst
    - tx: mlx4_tx_burst
  - mlx5: Supports Mellanox ConnectX-4, ConnectX-5, ConnectX-6 series NICs.
    - rx: mlx5_rx_burst, mlx5_rx_burst_vec, mlx5_rx_burst_mprq
    - tx: Pending confirmation
- Broadcom Drivers:
  - bnxt: Supports Broadcom NetXtreme series NICs.
    - rx: bnxt_recv_pkts, bnxt_recv_pkts_vec (x86, Vector mode receive)
    - tx: bnxt_xmit_pkts, bnxt_xmit_pkts_vec (x86, Vector mode transmit)
Virtual NIC Drivers:
- Virtio Driver:
  - virtio: Supports Virtio-based virtual network interfaces.
    - rx: virtio_recv_pkts, virtio_recv_mergeable_pkts_packed, virtio_recv_pkts_packed, virtio_recv_pkts_vec, virtio_recv_pkts_inorder, virtio_recv_mergeable_pkts
    - tx: virtio_xmit_pkts_packed, virtio_xmit_pkts
- VMXNET3 Driver:
  - vmxnet3: Supports VMware's VMXNET3 virtual NICs.
    - rx: vmxnet3_recv_pkts
    - tx: vmxnet3_xmit_pkts

Example: rx_hooks: [ixgbe_recv_pkts, i40e_recv_pkts, virtio_recv_pkts, virtio_recv_mergeable_pkts]

Note: DPDK driver interfaces send and receive packets in bursts. On Linux kernels older than 5.2, eBPF programs are limited to 4096 instructions, so at most 16 packets can be captured per burst, and packets may be lost if the burst size exceeds 16. On Linux 5.2 and later, up to 32 packets can be captured per burst (32 is typically the default DPDK burst size).

#2.3.2.1.3.3 DPDK Application TX Hooks Configuration

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.uprobe.dpdk.tx_hooks

Default value:

inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          tx_hooks: []

Schema:

Key	Value
Type	string

Description:

Specify the appropriate packet transmission hook point according to the actual network card driver. For how to find the driver interfaces and configure the transmission hook points, as well as the related precautions, refer to the description of inputs.ebpf.socket.uprobe.dpdk.rx_hooks.

Example: tx_hooks: [i40e_xmit_pkts, virtio_xmit_pkts_packed, virtio_xmit_pkts]
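
As a combined illustration of the three DPDK settings, the following sketch assumes a DPDK application started as /usr/bin/mydpdk on an Intel XL710 NIC (driver i40e, as in the lspci example in the rx_hooks description). The hook names are taken from the reference list in the rx_hooks description and should be confirmed with profiling or perf on the actual host before use.

```yaml
inputs:
  ebpf:
    socket:
      uprobe:
        dpdk:
          # Command name of the DPDK application (assumed: /usr/bin/mydpdk)
          command: mydpdk
          # Receive hook for the i40e driver; verify on your host
          rx_hooks: [i40e_recv_pkts]
          # Transmit hook for the i40e driver; verify on your host
          tx_hooks: [i40e_xmit_pkts]
```

All three options are tagged agent_restart, so the agent has to be restarted for a change to take effect.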

#2.3.2.2 Kprobe

#2.3.2.2.1 kprobe disabled

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.kprobe.disabled

Default value:

inputs:
  ebpf:
    socket:
      kprobe:
        disabled: false

Schema:

Key	Value
Type	bool

Description:

When set to true, kprobe will be disabled.

#2.3.2.2.2 Unix Socket Enabled

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.kprobe.enable_unix_socket

Default value:

inputs:
  ebpf:
    socket:
      kprobe:
        enable_unix_socket: false

Schema:

Key	Value
Type	bool

Description:

When set to true, enable tracing of Unix domain sockets.

#2.3.2.2.3 Blacklist

#2.3.2.2.3.1 Port Numbers

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.kprobe.blacklist.ports

Upgrade from old version: static_config.ebpf.kprobe-blacklist.port-list

Default value:

inputs:
  ebpf:
    socket:
      kprobe:
        blacklist:
          ports: ''

Schema:

Key	Value
Type	string

Description:

TCP and UDP port blacklist. It takes priority over the kprobe whitelist.

Example: ports: 80,1000-2000

#2.3.2.2.4 Whitelist

#2.3.2.2.4.1 Port Numbers

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.kprobe.whitelist.ports

Upgrade from old version: static_config.ebpf.kprobe-whitelist.port-list

Default value:

inputs:
  ebpf:
    socket:
      kprobe:
        whitelist:
          ports: ''

Schema:

Key	Value
Type	string

Description:

TCP and UDP port whitelist. It has lower priority than the kprobe blacklist. kprobe is used to collect data on ports that are in neither the blacklist nor the whitelist.

Example: ports: 80,1000-2000
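
A minimal sketch of how the two lists might be combined, with purely illustrative port numbers. Because the blacklist has the higher priority, port 8080 below is treated as blacklisted even though it also falls inside the whitelisted range.

```yaml
inputs:
  ebpf:
    socket:
      kprobe:
        blacklist:
          # Single ports and ranges, comma separated (illustrative values)
          ports: '22,8080'
        whitelist:
          # 8080 is inside this range, but the blacklist wins
          ports: '8000-9000'
```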

#2.3.2.3 Tuning

#2.3.2.3.1 Max Capture Rate

Tags:

hot_update

FQCN:

inputs.ebpf.socket.tunning.max_capture_rate

Upgrade from old version: static_config.ebpf.global-ebpf-pps-threshold

Default value:

inputs:
  ebpf:
    socket:
      tunning:
        max_capture_rate: 0

Schema:

Key	Value
Type	int
Unit	Per Second
Range	[0, 64000000]

Description:

Default value 0 means no limitation.

#2.3.2.3.2 Syscall_trace_id Disabled

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.tunning.syscall_trace_id_disabled

Default value:

inputs:
  ebpf:
    socket:
      tunning:
        syscall_trace_id_disabled: false

Schema:

Key	Value
Type	bool

Description:

When the trace_id is injected into all requests, the computation logic for all syscall_trace_id can be turned off. This will significantly reduce the impact of the eBPF hook on the CPU consumption of the application process.

#2.3.2.3.3 Disable Pre-allocating Memory

Tags:

agent_restart

FQCN:

inputs.ebpf.socket.tunning.map_prealloc_disabled

Upgrade from old version: static_config.ebpf.map-prealloc-disabled

Default value:

inputs:
  ebpf:
    socket:
      tunning:
        map_prealloc_disabled: false

Schema:

Key	Value
Type	bool

Description:

When full map preallocation is too expensive, setting this option to true prevents memory pre-allocation during map definition, at the cost of some performance degradation. This option only applies to maps of type 'BPF_MAP_TYPE_HASH' and currently affects the socket trace and uprobe Golang/OpenSSL trace functionalities. Disabling memory preallocation reduces memory usage by approximately 45MB.

#2.3.2.4 Preprocess

#2.3.2.4.1 OOOR Cache Size

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.preprocess.out_of_order_reassembly_cache_size

Upgrade from old version: static_config.ebpf.syscall-out-of-order-cache-size

Default value:

inputs:
  ebpf:
    socket:
      preprocess:
        out_of_order_reassembly_cache_size: 16

Schema:

Key	Value
Type	int
Range	[8, 1024]

Description:

OOOR: Out Of Order Reassembly

When out_of_order_reassembly_protocols is enabled, up to out_of_order_reassembly_cache_size eBPF socket events (each event consuming up to processors.request_log.tunning.payload_truncation bytes) will be cached in each TCP/UDP flow to prevent out-of-order events from impacting application protocol parsing. Since eBPF socket events are sent to user space in batches, out-of-order scenarios mainly occur when requests and responses within a single session are processed by different CPUs, causing the response to reach user space before the request.

#2.3.2.4.2 OOOR Protocols

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.preprocess.out_of_order_reassembly_protocols

Upgrade from old version: static_config.ebpf.syscall-out-of-order-reassembly

Default value:

inputs:
  ebpf:
    socket:
      preprocess:
        out_of_order_reassembly_protocols: []

Enum options:

Value	Note
DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

OOOR: Out Of Order Reassembly

When this capability is enabled for a specific application protocol, the agent will add out-of-order-reassembly processing for it. Note that the agent will consume more memory in this case, so please adjust out_of_order_reassembly_cache_size (formerly syscall-out-of-order-cache-size) accordingly and monitor the agent's memory usage.

Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/

Attention: use HTTP2 for gRPC Protocol.

#2.3.2.4.3 SR Protocols

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.socket.preprocess.segmentation_reassembly_protocols

Upgrade from old version: static_config.ebpf.syscall-segmentation-reassembly

Default value:

inputs:
  ebpf:
    socket:
      preprocess:
        segmentation_reassembly_protocols: []

Enum options:

Value	Note
DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

SR: Segmentation Reassembly

When this capability is enabled for a specific application protocol, the agent will add segmentation-reassembly processing to merge application protocol content spread across multiple syscalls before parsing it. This enhances the success rate of application protocol parsing. Note that out_of_order_reassembly_protocols must also be enabled for this feature to be effective.

Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/

Attention: use HTTP2 for gRPC Protocol.
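
To illustrate how the three preprocess options fit together, the sketch below enables both reassembly features for HTTP2 (which, per the note in this section, is also the value to use for gRPC). The cache size of 32 is an arbitrary illustrative value within the documented range, not a recommendation.

```yaml
inputs:
  ebpf:
    socket:
      preprocess:
        # Cache up to 32 eBPF socket events per TCP/UDP flow (illustrative)
        out_of_order_reassembly_cache_size: 32
        # The protocol must be listed here for segmentation reassembly to work
        out_of_order_reassembly_protocols: [HTTP2]
        segmentation_reassembly_protocols: [HTTP2]
```

Since each cached event can hold up to processors.request_log.tunning.payload_truncation bytes, a larger cache raises the worst-case memory per flow proportionally, so monitoring the agent's memory after such a change is advisable.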

#2.3.3 File

#2.3.3.1 IO Event

#2.3.3.1.1 Collect Mode

Tags:

agent_restart

FQCN:

inputs.ebpf.file.io_event.collect_mode

Upgrade from old version: static_config.ebpf.io-event-collect-mode

Default value:

inputs:
  ebpf:
    file:
      io_event:
        collect_mode: 1

Enum options:

Value	Note
0	Disabled
1	Request Life Cycle
2	All

Schema:

Key	Value
Type	int

Description:

Collection modes:

Disabled: Indicates that no IO events are collected.
Request Life Cycle: Indicates that only IO events within the request life cycle are collected.
All: Indicates that all IO events are collected.

#2.3.3.1.2 Minimal Duration

Tags:

agent_restart

FQCN:

inputs.ebpf.file.io_event.minimal_duration

Upgrade from old version: static_config.ebpf.io-event-minimal-duration

Default value:

inputs:
  ebpf:
    file:
      io_event:
        minimal_duration: 1ms

Schema:

Key	Value
Type	duration
Range	['1ns', '1s']

Description:

Only collect IO events with delay exceeding this threshold.
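
For example, a configuration that collects every IO event slower than 10ms might look like the sketch below. The threshold is an illustrative value, chosen only to show the duration syntax.

```yaml
inputs:
  ebpf:
    file:
      io_event:
        # 2 = All: collect IO events inside and outside request life cycles
        collect_mode: 2
        # Skip IO events whose delay does not exceed 10ms (illustrative value)
        minimal_duration: 10ms
```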

#2.3.4 Profile

#2.3.4.1 Unwinding

#2.3.4.1.1 DWARF unwinding disabled

Tags:

hot_update

FQCN:

inputs.ebpf.profile.unwinding.dwarf_disabled

Upgrade from old version: static_config.ebpf.dwarf-disabled

Default value:

inputs:
  ebpf:
    profile:
      unwinding:
        dwarf_disabled: true

Schema:

Key	Value
Type	bool

Description:

The default setting is true, in which case the agent uses frame pointer based unwinding for all processes. If a process does not contain frame pointers, its stack cannot be displayed correctly. Setting it to false enables DWARF based stack unwinding for all processes that do not contain frame pointers; the agent uses a heuristic algorithm to determine whether the process being analyzed contains frame pointers. Additionally, dwarf_regex can be set to force DWARF based stack unwinding for certain processes.

#2.3.4.1.2 DWARF unwinding process matching regular expression

Tags:

hot_update

FQCN:

inputs.ebpf.profile.unwinding.dwarf_regex

Upgrade from old version: static_config.ebpf.dwarf-regex

Default value:

inputs:
  ebpf:
    profile:
      unwinding:
        dwarf_regex: ''

Schema:

Key	Value
Type	string

Description:

If set to empty, the agent will use a heuristic algorithm to determine whether the process being analyzed contains frame pointers, and will use DWARF based stack unwinding for processes that do not contain frame pointers. If set to a valid regular expression, the agent will no longer infer whether a process contains frame pointers but will instead use the provided regular expression to match process names, applying DWARF based stack unwinding only to the matching processes.
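
A minimal sketch of enabling DWARF unwinding for selected processes only. The regular expression is a hypothetical example that matches processes named nginx or envoy; replace it with the names of the processes you profile.

```yaml
inputs:
  ebpf:
    profile:
      unwinding:
        # false enables DWARF based unwinding
        dwarf_disabled: false
        # Hypothetical pattern: only matching processes use DWARF unwinding,
        # and the frame pointer heuristic is skipped
        dwarf_regex: '^(nginx|envoy)$'
```

Leaving dwarf_regex empty instead lets the agent decide per process with its heuristic.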

#2.3.4.1.3 DWARF unwinding process map size

Tags:

hot_update

FQCN:

inputs.ebpf.profile.unwinding.dwarf_process_map_size

Upgrade from old version: static_config.ebpf.dwarf-process-map-size

Default value:

inputs:
  ebpf:
    profile:
      unwinding:
        dwarf_process_map_size: 1024

Schema:

Key	Value
Type	int
Range	[1, 131072]

Description:

Each process using DWARF unwinding has an entry in this map, relating the process ID to its DWARF unwind entries. Each entry is around 8K in size, so the default setting allocates around 8M of kernel memory. This is a hash map, so its size can be lower than the maximum process ID. The configuration is only effective if DWARF is enabled.

#2.3.4.1.4 DWARF unwinding shard map size

Tags:

hot_update

FQCN:

inputs.ebpf.profile.unwinding.dwarf_shard_map_size

Upgrade from old version: static_config.ebpf.dwarf-shard-map-size

Default value:

inputs:
  ebpf:
    profile:
      unwinding:
        dwarf_shard_map_size: 128

Schema:

Key	Value
Type	int
Range	[1, 4096]

Description:

The number of unwind entry shards for DWARF unwinding. The size of each one of these entries is 1M, the default setting will allocate around 128M kernel memory. The configuration is only effective if DWARF is enabled.
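
As a worked example of the memory estimates given for the two DWARF maps, the sketch below raises both map sizes and notes the approximate kernel memory each would reserve, using the stated per-entry sizes (about 8K per process entry, 1M per shard). The chosen values are illustrative.

```yaml
inputs:
  ebpf:
    profile:
      unwinding:
        # DWARF must be enabled for the map sizes to take effect
        dwarf_disabled: false
        # 4096 entries x ~8K each = ~32M of kernel memory
        dwarf_process_map_size: 4096
        # 256 shards x 1M each = ~256M of kernel memory
        dwarf_shard_map_size: 256
```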

#2.3.4.2 On-CPU

#2.3.4.2.1 Disabled

Tags:

agent_restart

FQCN:

inputs.ebpf.profile.on_cpu.disabled

Upgrade from old version: static_config.ebpf.on-cpu-profile.disabled

Default value:

inputs:
  ebpf:
    profile:
      on_cpu:
        disabled: false

Schema:

Key	Value
Type	bool

Description:

eBPF On-CPU profile switch.

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., ebpf.profile.on_cpu must be included in inputs.proc.process_matcher.[*].enabled_features.

#2.3.4.2.2 Sampling Frequency

Tags:

agent_restart

FQCN:

inputs.ebpf.profile.on_cpu.sampling_frequency

Upgrade from old version: static_config.ebpf.on-cpu-profile.frequency

Default value:

inputs:
  ebpf:
    profile:
      on_cpu:
        sampling_frequency: 99

Schema:

Key	Value
Type	int
Range	[1, 1000]

Description:

eBPF On-CPU profile sampling frequency.

#2.3.4.2.3 Aggregate by CPU

Tags:

agent_restart

FQCN:

inputs.ebpf.profile.on_cpu.aggregate_by_cpu

Upgrade from old version: static_config.ebpf.on-cpu-profile.cpu

Default value:

inputs:
  ebpf:
    profile:
      on_cpu:
        aggregate_by_cpu: false

Schema:

Key	Value
Type	bool

Description:

Whether to obtain the CPU ID and include it in the aggregation of stack trace data.

true: The CPU ID is obtained and included in the aggregation of stack trace data.
false: The CPU ID is not included in the aggregation. Any other setting is considered invalid. When the CPU ID is not included, the CPU value reported with the stack trace data is the special value CPU_INVALID (0xfff), which marks it as invalid.

#2.3.4.3 Off-CPU

#2.3.4.3.1 Disabled

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.profile.off_cpu.disabled

Upgrade from old version: static_config.ebpf.off-cpu-profile.disabled

Default value:

inputs:
  ebpf:
    profile:
      off_cpu:
        disabled: true

Schema:

Key	Value
Type	bool

Description:

eBPF Off-CPU profile switch.

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., ebpf.profile.off_cpu must be included in inputs.proc.process_matcher.[*].enabled_features.

#2.3.4.3.2 Aggregate by CPU

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.profile.off_cpu.aggregate_by_cpu

Upgrade from old version: static_config.ebpf.off-cpu-profile.cpu

Default value:

inputs:
  ebpf:
    profile:
      off_cpu:
        aggregate_by_cpu: false

Schema:

Key	Value
Type	bool

Description:

Whether to obtain the CPU ID and include it in the aggregation of stack trace data.

true: The CPU ID is obtained and included in the aggregation of stack trace data.
false: The CPU ID is not included in the aggregation. Any other setting is considered invalid. When the CPU ID is not included, the CPU value reported with the stack trace data is the special value CPU_INVALID (0xfff), which marks it as invalid.

#2.3.4.3.3 Minimum Blocking Time

Tags:

agent_restart ee_feature

FQCN:

inputs.ebpf.profile.off_cpu.min_blocking_time

Upgrade from old version: static_config.ebpf.off-cpu-profile.minblock

Default value:

inputs:
  ebpf:
    profile:
      off_cpu:
        min_blocking_time: 50us

Schema:

Key	Value
Type	duration
Range	['0ns', '1h']

Description:

If set to '0ns', there will be no minimum value limitation. Scheduler events are still high-frequency events, as their rate may exceed 1 million events per second, so caution should still be exercised.

If overhead remains an issue, you can configure the 'minblock' tunable parameter here. If the off-CPU time is less than the value configured in this item, the data will be discarded. If your goal is to trace longer blocking events, increasing this parameter can filter out shorter blocking events, further reducing overhead. Additionally, we will not collect events with a blocking time exceeding 1 hour.
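
The sketch below shows one way the Off-CPU options might be combined to trace only longer blocking events. The 1ms threshold is illustrative. The process_matcher entry is an assumption: the match_regex key and its value are not described in this section and should be checked against the inputs.proc.process_matcher documentation.

```yaml
inputs:
  ebpf:
    profile:
      off_cpu:
        disabled: false
        # Discard blocking events shorter than 1ms (illustrative value)
        min_blocking_time: 1ms
  proc:
    process_matcher:
    # Assumed entry layout; match_regex and its value are hypothetical here
    - match_regex: my-service
      enabled_features:
      - ebpf.profile.off_cpu
```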

#2.3.4.4 Memory

#2.3.4.4.1 Disabled

Tags:

hot_update ee_feature

FQCN:

inputs.ebpf.profile.memory.disabled

Upgrade from old version: static_config.ebpf.memory-profile.disabled

Default value:

inputs:
  ebpf:
    profile:
      memory:
        disabled: true

Schema:

Key	Value
Type	bool

Description:

eBPF memory profile switch.

Note: When enabling this feature, the specific process list must also be specified in inputs.proc.process_matcher, i.e., ebpf.profile.memory must be included in inputs.proc.process_matcher.[*].enabled_features.

#2.3.4.4.2 Memory profile report interval

Tags:

hot_update ee_feature

FQCN:

inputs.ebpf.profile.memory.report_interval

Upgrade from old version: static_config.ebpf.memory-profile.report-interval

Default value:

inputs:
  ebpf:
    profile:
      memory:
        report_interval: 10s

Schema:

Key	Value
Type	duration
Range	['1s', '60s']

Description:

The interval at which deepflow-agent aggregates and reports memory profile data.

#2.3.4.4.3 LRU length for process allocated addresses

Tags:

hot_update ee_feature

FQCN:

inputs.ebpf.profile.memory.allocated_addresses_lru_len

Default value:

inputs:
  ebpf:
    profile:
      memory:
        allocated_addresses_lru_len: 131072

Schema:

Key	Value
Type	int
Range	[1024, 4194704]

Description:

Agent uses LRU cache to record process allocated addresses to avoid uncontrolled memory usage. Each record in this LRU is about 80B.
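
As a rough worked estimate based on the figure of about 80B per record, the default length corresponds to roughly 10 MiB of records, and doubling the length doubles that figure. The sketch below shows the doubled value; it is illustrative and not a tuning recommendation.

```yaml
inputs:
  ebpf:
    profile:
      memory:
        # Default: 131072 records x ~80B = ~10 MiB
        # Doubled: 262144 records x ~80B = ~20 MiB
        allocated_addresses_lru_len: 262144
```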

#2.3.4.5 Preprocess

#2.3.4.5.1 Stack Compression

Tags:

agent_restart

FQCN:

inputs.ebpf.profile.preprocess.stack_compression

Upgrade from old version: static_config.ebpf.preprocess.stack-compression

Default value:

inputs:
  ebpf:
    profile:
      preprocess:
        stack_compression: true

Schema:

Key	Value
Type	bool

Description:

Compress the call stack before sending data. Compression can effectively reduce the agent's memory usage, data transmission bandwidth consumption, and ingester's CPU overhead. However, it also increases the CPU usage of the agent. Tests have shown that compressing the on-cpu function call stack of the deepflow-agent can reduce bandwidth consumption by x times, but it will result in an additional y% CPU usage for the agent.

#2.3.5 Tuning

#2.3.5.1 Collector Queue Size

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.collector_queue_size

Upgrade from old version: static_config.ebpf-collector-queue-size

Default value:

inputs:
  ebpf:
    tunning:
      collector_queue_size: 65535

Schema:

Key	Value
Type	int
Range	[4096, 64000000]

Description:

The length of the following queues:

0-ebpf-to-ebpf-collector
1-proc-event-to-sender
1-profile-to-sender

#2.3.5.2 Userspace Worker Threads

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.userspace_worker_threads

Upgrade from old version: static_config.ebpf.thread-num

Default value:

inputs:
  ebpf:
    tunning:
      userspace_worker_threads: 1

Schema:

Key	Value
Type	int
Range	[1, 1024]

Description:

The number of worker threads refers to how many threads participate in data processing in user-space. The actual maximal value is the number of CPU logical cores on the host.

#2.3.5.3 Perf Pages Count

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.perf_pages_count

Upgrade from old version: static_config.ebpf.perf-pages-count

Default value:

inputs:
  ebpf:
    tunning:
      perf_pages_count: 128

Schema:

Key	Value
Type	int
Range	[32, 8192]

Description:

The number of pages occupied by the kernel shared memory used for perf data transfer. The value should be 2^n (5 <= n <= 13). If the value is between 2^n and 2^(n+1), the ebpf configurator automatically adjusts it down to 2^n.

#2.3.5.4 Kernel Ring Size

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.kernel_ring_size

Upgrade from old version: static_config.ebpf.ring-size

Default value:

inputs:
  ebpf:
    tunning:
      kernel_ring_size: 65536

Schema:

Key	Value
Type	int
Range	[8192, 131072]

Description:

The size of the ring cache queue. The value should be 2^n (13 <= n <= 17). If the value is between 2^n and 2^(n+1), the ebpf configurator automatically adjusts it down to 2^n.
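
To make the power-of-two adjustment concrete, the sketch below uses two values that are not powers of two and notes, in comments, the value each would be adjusted to under the rule described for perf_pages_count and kernel_ring_size. The input values are illustrative.

```yaml
inputs:
  ebpf:
    tunning:
      # 2^7 = 128 <= 200 < 256 = 2^8, so this is adjusted to 128
      perf_pages_count: 200
      # 2^16 = 65536 <= 100000 < 131072 = 2^17, so this is adjusted to 65536
      kernel_ring_size: 100000
```

Setting a power of two directly avoids relying on the adjustment.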

#2.3.5.5 Maximum Socket Entries

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.max_socket_entries

Upgrade from old version: static_config.ebpf.max-socket-entries

Default value:

inputs:
  ebpf:
    tunning:
      max_socket_entries: 131072

Schema:

Key	Value
Type	int
Range	[10000, 2000000]

Description:

Set the maximum number of hash table entries for socket tracking, depending on the number of concurrent requests in the actual scenario.

#2.3.5.6 Socket Map Reclaim Threshold

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.socket_map_reclaim_threshold

Upgrade from old version: static_config.ebpf.socket-map-max-reclaim

Default value:

inputs:
  ebpf:
    tunning:
      socket_map_reclaim_threshold: 120000

Schema:

Key	Value
Type	int
Range	[8000, 2000000]

Description:

The threshold for cleaning socket map table entries.

#2.3.5.7 Maximum Trace Entries

Tags:

agent_restart

FQCN:

inputs.ebpf.tunning.max_trace_entries

Upgrade from old version: static_config.ebpf.max-trace-entries

Default value:

inputs:
  ebpf:
    tunning:
      max_trace_entries: 131072

Schema:

Key	Value
Type	int
Range	[10000, 2000000]

Description:

Set the maximum value of hash table entries for thread/coroutine tracking sessions.

#2.4 Resources

#2.4.1 Push Interval

Tags:

hot_update

FQCN:

inputs.resources.push_interval

Upgrade from old version: platform_sync_interval

Default value:

inputs:
  resources:
    push_interval: 10s

Schema:

Key	Value
Type	duration
Range	['10s', '3600s']

Description:

The interval at which deepflow-agent actively reports resource information to deepflow-server.

#2.4.2 Workload Resource Sync Enabled

Tags:

hot_update

FQCN:

inputs.resources.workload_resource_sync_enabled

Default value:

inputs:
  resources:
    workload_resource_sync_enabled: false

Schema:

Key	Value
Type	bool

Description:

When enabled, deepflow-server will abstract VM based on the runtime environment information reported by deepflow-agent.

#2.4.3 Collect Private Cloud Resource

#2.4.3.1 Hypervisor Resource Enabled

Tags:

hot_update

FQCN:

inputs.resources.private_cloud.hypervisor_resource_enabled

Upgrade from old version: platform_enabled

Default value:

inputs:
  resources:
    private_cloud:
      hypervisor_resource_enabled: false

Schema:

Key	Value
Type	bool

Description:

When enabled, deepflow-agent will automatically synchronize virtual machine and network information on KVM or Linux Host to deepflow-server. Information collected includes:

raw_all_vm_xml
raw_vm_states
raw_ovs_interfaces
raw_ovs_ports
raw_brctl_show
raw_vlan_config

#2.4.3.2 VM MAC Source

Tags:

hot_update

FQCN:

inputs.resources.private_cloud.vm_mac_source

Upgrade from old version: if_mac_source

Default value:

inputs:
  resources:
    private_cloud:
      vm_mac_source: 0

Enum options:

Value	Note
0	Interface MAC Address
1	Interface Name
2	Qemu XML File

Schema:

Key	Value
Type	int

Description:

How to extract the real MAC address of the virtual machine when the agent runs on the KVM host.

Explanation of the options:

Interface MAC Address: extracted from tap interface MAC address
Interface Name: extracted from tap interface name
Qemu XML File: extracted from the XML file of the virtual machine

#2.4.3.3 VM XML Directory

Tags:

hot_update

FQCN:

inputs.resources.private_cloud.vm_xml_directory

Upgrade from old version: vm_xml_path

Default value:

inputs:
  resources:
    private_cloud:
      vm_xml_directory: /etc/libvirt/qemu/

Schema:

Key	Value
Type	string
Range	[0, 100]

Description:

VM XML file directory.

#2.4.3.4 VM MAC Mapping Script

Tags:

agent_restart

FQCN:

inputs.resources.private_cloud.vm_mac_mapping_script

Upgrade from old version: static_config.tap-mac-script

Default value:

inputs:
  resources:
    private_cloud:
      vm_mac_mapping_script: ''

Schema:

Key	Value
Type	string
Range	[0, 100]

Description:

In complex environments, the MAC address mapping of TAP NICs can be constructed by writing a script. The following conditions must be met to use this script:

- vm_mac_source (formerly if_mac_source) = 2
- tap_mode = 0
- The name of the TAP NIC is the same as in the virtual machine XML file

The format of the script output is as follows:
- tap2d283dfe,11:22:33:44:55:66
- tap2d283223,aa:bb:cc:dd:ee:ff
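
A minimal sketch of the agent-side settings for this script. The script path is hypothetical; the script itself is expected to print one TAP name and MAC address pair per line in the output format shown in this section.

```yaml
inputs:
  resources:
    private_cloud:
      # 2 = Qemu XML File, one of the conditions for using the script
      vm_mac_source: 2
      # Hypothetical path to a user-provided script
      vm_mac_mapping_script: /usr/local/bin/tap-mac-mapping.sh
```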

#2.4.4 Collect K8s Resource

#2.4.4.1 K8s Namespace

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.kubernetes_namespace

Upgrade from old version: static_config.kubernetes-namespace

Default value:

inputs:
  resources:
    kubernetes:
      kubernetes_namespace: null

Schema:

Key	Value
Type	string

Description:

Specify the namespace for agent to query K8s resources.

#2.4.4.2 K8s API Resources

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources

Upgrade from old version: static_config.kubernetes-resources

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - name: namespaces
      - name: nodes
      - name: pods
      - name: replicationcontrollers
      - name: services
      - name: daemonsets
      - name: deployments
      - name: replicasets
      - name: statefulsets
      - name: ingresses
      - name: configmaps

Schema:

Key	Value
Type	dict

Description:

Specify kubernetes resources to watch.

The schema of each entry in the list is:

{
  name: string
  group: string
  version: string
  disabled: bool
  field_selector: string
}

Agent will watch the following resources by default:

namespaces
nodes
pods
replicationcontrollers
services
daemonsets
deployments
replicasets
statefulsets
ingresses
configmaps

To disable a resource, add an entry to the list with disabled: true:

inputs:
  resources:
    kubernetes:
      api_resources:
      - name: services
        disabled: true

To enable a resource, add an entry of this resource to the list. Be advised that this setting overrides the default of the same resource. For example, to enable statefulsets in both group apps (the default) and apps.kruise.io will require two entries:

inputs:
  resources:
    kubernetes:
      api_resources:
      - name: statefulsets
        group: apps
      - name: statefulsets
        group: apps.kruise.io
        version: v1beta1

To watch routes in OpenShift, you can use the following settings:

inputs:
  resources:
    kubernetes:
      api_resources:
      - name: ingresses
        disabled: true
      - name: routes

#2.4.4.2.1 Name

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources.name

Upgrade from old version: static_config.kubernetes-resources.name

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - name: ''

Enum options:

Value	Note
namespaces
nodes
pods
replicationcontrollers
services
daemonsets
deployments
replicasets
statefulsets
ingresses
routes
servicerules
clonesets
ippools
opengaussclusters
configmaps

Schema:

Key	Value
Type	string

Description:

K8s API resource name.

#2.4.4.2.2 Group

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources.group

Upgrade from old version: static_config.kubernetes-resources.group

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - group: ''

Schema:

Key	Value
Type	string

Description:

K8s API resource group.

#2.4.4.2.3 Version

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources.version

Upgrade from old version: static_config.kubernetes-resources.version

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - version: ''

Schema:

Key	Value
Type	string

Description:

K8s API version.

#2.4.4.2.4 Disabled

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources.disabled

Upgrade from old version: static_config.kubernetes-resources.disabled

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - disabled: false

Schema:

Key	Value
Type	bool

Description:

Set to true to disable watching this K8s API resource.

#2.4.4.2.5 Field Selector

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_resources.field_selector

Upgrade from old version: static_config.kubernetes-resources.field-selector

Default value:

inputs:
  resources:
    kubernetes:
      api_resources:
      - field_selector: ''

Schema:

Key	Value
Type	string

Description:

K8s API resource field selector.
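
As an illustration of the per-entry keys, the sketch below restricts the pods watch with a field selector. It assumes the value is passed through as a standard Kubernetes field selector, which this section does not state explicitly; the namespace name is hypothetical.

```yaml
inputs:
  resources:
    kubernetes:
      api_resources:
      - name: pods
        # Standard Kubernetes field selector syntax (assumed); hypothetical namespace
        field_selector: metadata.namespace=production
```

As described for api_resources, an entry overrides only the default of the same resource, so the other default resources are unaffected by this entry.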

#2.4.4.3 K8s API List Page Size

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_list_page_size

Upgrade from old version: static_config.kubernetes-api-list-limit

Default value:

inputs:
  resources:
    kubernetes:
      api_list_page_size: 1000

Schema:

Key	Value
Type	int
Range	[10, 4294967295]

Description:

The page size limit applied when listing K8s API resources, i.e. the maximum number of entries requested per list call.

#2.4.4.4 K8s API List Maximum Interval

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.api_list_max_interval

Upgrade from old version: static_config.kubernetes-api-list-interval

Default value:

inputs:
  resources:
    kubernetes:
      api_list_max_interval: 10m

Schema:

Key	Value
Type	duration
Range	['10m', '30d']

Description:

The interval at which resources are listed while the watcher is idle.

#2.4.4.5 Ingress Flavour

Tags:

deprecated

FQCN:

inputs.resources.kubernetes.ingress_flavour

Upgrade from old version: static_config.ingress-flavour

Default value:

inputs:
  resources:
    kubernetes:
      ingress_flavour: kubernetes

Schema:

Key	Value
Type	string

#2.4.4.6 Pod MAC Collection Method

Tags:

agent_restart

FQCN:

inputs.resources.kubernetes.pod_mac_collection_method

Upgrade from old version: static_config.kubernetes-poller-type

Default value:

inputs:
  resources:
    kubernetes:
      pod_mac_collection_method: adaptive

Enum options:

Value	Note
adaptive
active
passive

Schema:

Key	Value
Type	string

Description:

In active mode, deepflow-agent enters the netns of other Pods through setns syscall to query the MAC and IP addresses. In this mode, the setns operation requires the SYS_ADMIN permission. In passive mode deepflow-agent calculates the MAC and IP addresses used by Pods by capturing ARP/ND traffic. When set to adaptive, active mode will be used first.
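
For instance, where granting the SYS_ADMIN permission to the agent is not acceptable, passive collection could be selected explicitly, as in this sketch:

```yaml
inputs:
  resources:
    kubernetes:
      # Derive Pod MAC and IP addresses from captured ARP/ND traffic
      # instead of entering Pod network namespaces via setns
      pod_mac_collection_method: passive
```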

#2.4.5 Pull Resource From Controller

Configuration for how deepflow-server pulls resources from the controller. deepflow-agent does not read this section.

#2.4.5.1 Domain Filter

Tags:

hot_update

FQCN:

inputs.resources.pull_resource_from_controller.domain_filter

Upgrade from old version: domains

Default value:

inputs:
  resources:
    pull_resource_from_controller:
      domain_filter:
      - '0'

Enum options:

Value	Note
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

The default value 0 means all domains. It can also be set to a list of domain lcuuids, which you can obtain through 'deepflow-ctl domain list'.

Note: The list of MAC and IP addresses is used by deepflow-agent to inject tags into data. This configuration can reduce the number and frequency of MAC and IP addresses delivered by deepflow-server to deepflow-agent. When there is no cross-domain service request, deepflow-server can be configured to only deliver the information in the local domain to deepflow-agent.
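
A sketch of restricting delivery to two domains is shown below. The two quoted strings are placeholders, not real identifiers; they stand for lcuuid values copied from the output of 'deepflow-ctl domain list'.

```yaml
inputs:
  resources:
    pull_resource_from_controller:
      domain_filter:
      # Placeholders only: replace with real lcuuid values.
      - '<lcuuid-of-domain-a>'
      - '<lcuuid-of-domain-b>'
```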

#2.4.5.2 Only K8s Pod IP in Local Cluster

Tags:

hot_update

FQCN:

inputs.resources.pull_resource_from_controller.only_kubernetes_pod_ip_in_local_cluster

Upgrade from old version: pod_cluster_internal_ip

Default value:

inputs:
  resources:
    pull_resource_from_controller:
      only_kubernetes_pod_ip_in_local_cluster: false

Schema:

Key	Value
Type	bool

Description:

The list of MAC and IP addresses is used by deepflow-agent to inject tags into data. This configuration can reduce the number and frequency of MAC and IP addresses delivered by deepflow-server to deepflow-agent. When the Pod IP is not used for direct communication between the K8s cluster and the outside world, deepflow-server can be configured to only deliver the information in the local K8s cluster to deepflow-agent.
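
A minimal sketch of enabling this restriction follows. It assumes that Pod IPs are never used directly for traffic between the K8s cluster and the outside world, which is the precondition stated above.

```yaml
inputs:
  resources:
    pull_resource_from_controller:
      # Only deliver Pod IP information of the local K8s cluster (default: false).
      only_kubernetes_pod_ip_in_local_cluster: true
```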

#2.5 Integration

#2.5.1 Enabled

Tags:

hot_update

FQCN:

inputs.integration.enabled

Upgrade from old version: external_agent_http_proxy_enabled

Default value:

inputs:
  integration:
    enabled: true

Schema:

Key	Value
Type	bool

Description:

Whether to enable receiving external data sources such as Prometheus, Telegraf, OpenTelemetry, and SkyWalking.

#2.5.2 Listen Port

Tags:

hot_update

FQCN:

inputs.integration.listen_port

Upgrade from old version: external_agent_http_proxy_port

Default value:

inputs:
  integration:
    listen_port: 38086

Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

Listen port of the data integration socket.
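
The following sketch keeps integration enabled but moves the listener to another port. The port 48086 is a hypothetical value chosen only for illustration.

```yaml
inputs:
  integration:
    enabled: true
    # Hypothetical alternative port; the default is 38086.
    listen_port: 48086
```

Any sender that targets the default port, such as an endpoint of the form http://127.0.0.1:38086/api/v1/prometheus, would need to be updated to match.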

#2.5.3 Compression

#2.5.3.1 Trace

Tags:

agent_restart

FQCN:

inputs.integration.compression.trace

Upgrade from old version: static_config.external-agent-http-proxy-compressed

Default value:

inputs:
  integration:
    compression:
      trace: true

Schema:

Key	Value
Type	bool

Description:

Whether to compress the integrated trace data received by deepflow-agent. The compression ratio is about 5:1~10:1. Turning on this feature will result in higher CPU consumption of deepflow-agent.

#2.5.3.2 Profile

Tags:

agent_restart

FQCN:

inputs.integration.compression.profile

Upgrade from old version: static_config.external-agent-http-proxy-profile-compressed

Default value:

inputs:
  integration:
    compression:
      profile: true

Schema:

Key	Value
Type	bool

Description:

Whether to compress the integrated profile data received by deepflow-agent. The compression ratio is about 5:1~10:1. Turning on this feature will result in higher CPU consumption of deepflow-agent.
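
Where deepflow-agent CPU is a tighter constraint than bandwidth, both compression switches could be turned off. This is a sketch of that trade-off rather than a recommendation.

```yaml
inputs:
  integration:
    compression:
      # Both default to true; false lowers agent CPU usage
      # but gives up the compression of the received data.
      trace: false
      profile: false
```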

#2.5.4 Prometheus Extra Labels

Supports extracting extra labels from the headers of RemoteWrite HTTP requests.

#2.5.4.1 Enabled

Tags:

agent_restart

FQCN:

inputs.integration.prometheus_extra_labels.enabled

Upgrade from old version: static_config.prometheus-extra-config.enabled

Default value:

inputs:
  integration:
    prometheus_extra_labels:
      enabled: false

Schema:

Key	Value
Type	bool

Description:

Whether to enable Prometheus extra labels.

#2.5.4.2 Extra Labels

Tags:

agent_restart

FQCN:

inputs.integration.prometheus_extra_labels.extra_labels

Upgrade from old version: static_config.prometheus-extra-config.labels

Default value:

inputs:
  integration:
    prometheus_extra_labels:
      extra_labels: []

Schema:

Key	Value
Type	string

Description:

The list of labels to send. Each label is a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*

#2.5.4.3 Label Key Total Length Limit

Tags:

agent_restart

FQCN:

inputs.integration.prometheus_extra_labels.label_length

Upgrade from old version: static_config.prometheus-extra-config.labels-limit

Default value:

inputs:
  integration:
    prometheus_extra_labels:
      label_length: 1024

Schema:

Key	Value
Type	int
Unit	byte
Range	[1024, 1048576]

Description:

The limit of the total length of parsed extra Prometheus label keys.

#2.5.4.4 Value Total Length Limit

Tags:

agent_restart

FQCN:

inputs.integration.prometheus_extra_labels.value_length

Upgrade from old version: static_config.prometheus-extra-config.values-limit

Default value:

inputs:
  integration:
    prometheus_extra_labels:
      value_length: 4096

Schema:

Key	Value
Type	int
Unit	byte
Range	[4096, 4194304]

Description:

The limit of the total length of parsed extra Prometheus label values.
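
Putting the four Prometheus extra-label settings together, a configuration might look like the sketch below. The label names tenant and cluster_name are hypothetical; the two length limits are simply the documented defaults.

```yaml
inputs:
  integration:
    prometheus_extra_labels:
      enabled: true
      # Hypothetical label names; each must match [a-zA-Z_][a-zA-Z0-9_]*.
      extra_labels:
      - tenant
      - cluster_name
      # Documented defaults, in bytes.
      label_length: 1024
      value_length: 4096
```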

#2.5.5 Feature Control

#2.5.5.1 Profile Integration Disabled

Tags:

agent_restart

FQCN:

inputs.integration.feature_control.profile_integration_disabled

Upgrade from old version: static_config.external-profile-integration-disabled

Default value:

inputs:
  integration:
    feature_control:
      profile_integration_disabled: false

Schema:

Key	Value
Type	bool

#2.5.5.2 Trace Integration Disabled

Tags:

agent_restart

FQCN:

inputs.integration.feature_control.trace_integration_disabled

Upgrade from old version: static_config.external-trace-integration-disabled

Default value:

inputs:
  integration:
    feature_control:
      trace_integration_disabled: false

Schema:

Key	Value
Type	bool

#2.5.5.3 Metric Integration Disabled

Tags:

agent_restart

FQCN:

inputs.integration.feature_control.metric_integration_disabled

Upgrade from old version: static_config.external-metric-integration-disabled

Default value:

inputs:
  integration:
    feature_control:
      metric_integration_disabled: false

Schema:

Key	Value
Type	bool

#2.5.5.4 Log Integration Disabled

Tags:

agent_restart

FQCN:

inputs.integration.feature_control.log_integration_disabled

Upgrade from old version: static_config.external-log-integration-disabled

Default value:

inputs:
  integration:
    feature_control:
      log_integration_disabled: false

Schema:

Key	Value
Type	bool

#2.6 Vector

#2.6.1 Vector Component Enabled

Tags:

hot_update ee_feature

FQCN:

inputs.vector.enabled

Default value:

inputs:
  vector:
    enabled: false

Schema:

Key	Value
Type	bool

Description:

Whether to run the Vector component.

#2.6.2 Vector Component Config

Tags:

hot_update ee_feature

FQCN:

inputs.vector.config

Default value:

inputs:
  vector:
    config: null

Schema:

Key	Value
Type	dict

Description:

The detailed configuration for the Vector component. All available config keys can be found at vector.dev. The following examples show how to capture Kubernetes logs, host metrics on a virtual machine, and kubelet metrics in Kubernetes, and send them to deepflow-agent as output.

Scrape host metrics:

sources:
  host_metrics:
    type: host_metrics
    scrape_interval_secs: 10
    namespace: node
transforms:
  host_metrics_relabel:
    type: remap
    inputs:
    - host_metrics
    source: |
      .tags.instance = "${K8S_NODE_IP_FOR_DEEPFLOW}"
      .tags.host = "${K8S_NODE_NAME_FOR_DEEPFLOW}"
      metrics_map = {
        "boot_time": "boot_time_seconds",
        "memory_active_bytes": "memory_Active_bytes",
        "memory_available_bytes": "memory_MemAvailable_bytes",
        "memory_buffers_bytes": "memory_Buffers_bytes",
        "memory_cached_bytes": "memory_Cached_bytes",
        "memory_free_bytes": "memory_MemFree_bytes",
        "memory_swap_free_bytes": "memory_SwapFree_bytes",
        "memory_swap_total_bytes": "memory_SwapTotal_bytes",
        "memory_swap_used_bytes": "memory_SwapCached_bytes",
        "memory_total_bytes": "memory_MemTotal_bytes",
        "network_transmit_packets_drop_total": "network_transmit_drop_total",
        "uptime": "uname_info",
        "filesystem_total_bytes": "filesystem_size_bytes",
      }
      metric_name = get!(value: metrics_map, path: [.name])
      if !is_null(metric_name) {
        .name = metric_name
      }
      if .tags.collector == "filesystem" {
        .tags.fstype = .tags.filesystem
        del(.tags.filesystem)
      }
sinks:
  prometheus_remote_write:
    type: prometheus_remote_write
    inputs:
    - host_metrics_relabel
    endpoint: http://127.0.0.1:38086/api/v1/prometheus
    healthcheck:
      enabled: false

Scrape Kubernetes metrics:

secret:
  kube_token:
    type: directory
    path: /var/run/secrets/kubernetes.io/serviceaccount
sources:
  cadvisor_metrics:
    type: prometheus_scrape
    endpoints:
    - https://${K8S_NODE_IP_FOR_DEEPFLOW}:10250/metrics/cadvisor
    auth:
      strategy: bearer
      token: SECRET[kube_token.token]
    scrape_interval_secs: 10
    scrape_timeout_secs: 10
    honor_labels: true
    instance_tag: instance
    endpoint_tag: metrics_endpoint
    tls:
      verify_certificate: false
  kubelet_metrics:
    type: prometheus_scrape
    endpoints:
    - https://${K8S_NODE_IP_FOR_DEEPFLOW}:10250/metrics
    auth:
      strategy: bearer
      token: SECRET[kube_token.token]
    scrape_interval_secs: 10
    scrape_timeout_secs: 10
    honor_labels: true
    instance_tag: instance
    endpoint_tag: metrics_endpoint
    tls:
      verify_certificate: false
  kube_state_metrics:
    type: prometheus_scrape
    endpoints:
    - http://opensource-kube-state-metrics:8080/metrics
    scrape_interval_secs: 10
    scrape_timeout_secs: 10
    honor_labels: true
    instance_tag: instance
    endpoint_tag: metrics_endpoint
transforms:
  cadvisor_relabel_filter:
    type: filter
    inputs:
    - cadvisor_metrics
    condition: "!match(string!(.name), r'container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)|container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)|container_memory_(mapped_file|swap)|container_(file_descriptors|tasks_state|threads_max)|container_spec.*')"
  kubelet_relabel_filter:
    type: filter
    inputs:
    - kubelet_metrics
    condition: "match(string!(.name), r'kubelet_cgroup_(manager_duration_seconds_bucket|manager_duration_seconds_count)|kubelet_node_(config_error|node_name)|kubelet_pleg_relist_(duration_seconds_bucket|duration_seconds_count|interval_seconds_bucket)|kubelet_pod_(start_duration_seconds_count|worker_duration_seconds_bucket|worker_duration_seconds_count)|kubelet_running_(container_count|containers|pod_count|pods)|kubelet_runtime_(operations_duration_seconds_bucket|operations_errors_total|operations_total)|kubelet_volume_stats_(available_bytes|capacity_bytes|inodes|inodes_used)|process_(cpu_seconds_total|resident_memory_bytes)|rest_client_(request_duration_seconds_bucket|requests_total)|storage_operation_(duration_seconds_bucket|duration_seconds_count|errors_total)|up|volume_manager_total_volumes')"
  kube_state_relabel_filter:
    type: filter
    inputs:
    - kube_state_metrics
    condition: "!match(string!(.name), r'kube_endpoint_address_not_ready|kube_endpoint_address_available')"
  common_relabel_config:
    type: remap
    inputs:
    - cadvisor_relabel_filter
    - kubelet_relabel_filter
    - kube_state_relabel_filter
    source: |-
      if !is_null(.tags) && is_string(.tags.metrics_endpoint) {
      .tags.metrics_path = parse_regex!(.tags.metrics_endpoint, r'https?:\/\/[^\/]+(?<path>\/.*)$').path
      }
sinks:
  prometheus_remote_write:
    type: prometheus_remote_write
    inputs:
    - common_relabel_config
    endpoint: http://127.0.0.1:38086/api/v1/prometheus
    healthcheck:
      enabled: false

Scrape Kubernetes logs (this example captures DeepFlow Pod logs; to collect logs from other Pods, update extra_label_selector and add custom filters):

data_dir: /vector-log-checkpoint
sources:
  kubernetes_logs:
    self_node_name: ${K8S_NODE_NAME_FOR_DEEPFLOW}
    type: kubernetes_logs
    namespace_annotation_fields:
      namespace_labels: ""
    node_annotation_fields:
      node_labels: ""
    pod_annotation_fields:
      pod_annotations: ""
      pod_labels: ""
    extra_label_selector: "app=deepflow,component!=front-end"
  kubernetes_logs_frontend:
    self_node_name: ${K8S_NODE_NAME_FOR_DEEPFLOW}
    type: kubernetes_logs
    namespace_annotation_fields:
      namespace_labels: ""
    node_annotation_fields:
      node_labels: ""
    pod_annotation_fields:
      pod_annotations: ""
      pod_labels: ""
    extra_label_selector: "app=deepflow,component=front-end"
transforms:
  multiline_kubernetes_logs:
    type: reduce
    inputs:
      - kubernetes_logs
    group_by:
      - file
      - stream
    merge_strategies:
      message: concat_newline
    starts_when: match(string!(.message), r'^(.+=|\[|\[?\u001B\[[0-9;]*m|\[mysql\]\s|\{\".+\"|(::ffff:)?([0-9]{1,3}.){3}[0-9]{1,3}[\s\-]+(\[)?)?\d{4}[-\/\.]?\d{2}[-\/\.]?\d{2}[T\s]?\d{2}:\d{2}:\d{2}')
    expire_after_ms: 2000
    flush_period_ms: 500
  flush_kubernetes_logs:
   type: remap
   inputs:
     - multiline_kubernetes_logs
   source: |-
       .message = replace(string!(.message), r'\u001B\[([0-9]{1,3}(;[0-9]{1,3})*)?m', "")
  remap_kubernetes_logs:
    type: remap
    inputs:
    - flush_kubernetes_logs
    - kubernetes_logs_frontend
    source: |-
        if is_string(.message) && is_json(string!(.message)) {
            tags = parse_json(.message) ?? {}
            ._df_log_type = tags._df_log_type
            .org_id = to_int(tags.org_id) ?? 0
            .user_id = to_int(tags.user_id) ?? 0
            .message = tags.message || tags.msg
            del(tags._df_log_type)
            del(tags.org_id)
            del(tags.user_id)
            del(tags.message)
            del(tags.msg)
            .json = tags
        }
        if !exists(.level) {
           if exists(.json) {
              .level = to_string!(.json.level)
              del(.json.level)
           } else {
             level_tags = parse_regex(.message, r'[\[\\<](?<level>(?i)INFOR?(MATION)?|WARN(ING)?|DEBUG?|ERROR?|TRACE|FATAL|CRIT(ICAL)?)[\]\\>]') ?? {}
             if !exists(level_tags.level) {
                level_tags = parse_regex(.message, r'[\s](?<level>INFOR?(MATION)?|WARN(ING)?|DEBUG?|ERROR?|TRACE|FATAL|CRIT(ICAL)?)[\s]') ?? {}
             }
             if exists(level_tags.level) {
                level_tags.level = upcase(string!(level_tags.level))
                if level_tags.level == "INFORMATION" || level_tags.level == "INFOMATION" {
                    level_tags.level = "INFO"
                }
                if level_tags.level == "WARNING" {
                    level_tags.level = "WARN"
                }
                if level_tags.level == "DEBU" {
                    level_tags.level = "DEBUG"
                }
                if level_tags.level == "ERRO" {
                    level_tags.level = "ERROR"
                }
                if level_tags.level == "CRIT" || level_tags.level == "CRITICAL" {
                    level_tags.level = "FATAL"
                }
                .level = level_tags.level
             }
           }
        }
        if !exists(._df_log_type) {
            ._df_log_type = "system"
        }
        if !exists(.app_service) {
            .app_service = .kubernetes.container_name
        }
sinks:
  http:
    type: http
    inputs: [remap_kubernetes_logs]
    uri: http://127.0.0.1:38086/api/v1/log
    encoding:
      codec: json

Use http_client or socket to dial a remote server for testing:

sources:
  http_client_dial:
    type: http_client
    endpoint: http://$HOST:$PORT
    method: GET
    scrape_interval_secs: 10
    scrape_timeout_secs: 5
  internal_metrics:
    type: internal_metrics
    scrape_interval_secs: 10
    namespace: ${K8S_NAMESPACE_FOR_DEEPFLOW}
  socket_dial_input:
    type: demo_logs
    interval: 10
    format: shuffle
    lines: [""]
transforms:
  internal_metrics_relabel:
    type: remap
    inputs:
    - internal_metrics
    source: |-
      .tags.instance = "${K8S_NODE_IP_FOR_DEEPFLOW}"
  internal_metrics_dispatch:
    type: route
    inputs:
    - internal_metrics_relabel
    route:
      http_client_dial_metrics: '.tags.component_id == "http_client_dial"'
      socket_dial_metrics: '.tags.component_id == "socket_dial"'
  http_client_dial_metrics:
    type: filter
    inputs:
    - internal_metrics_dispatch.http_client_dial_metrics
    condition: "match(string!(.name),r'http_client_.*')"
  socket_dial_metrics:
    type: filter
    inputs:
    - internal_metrics_dispatch.socket_dial_metrics
    condition: "match(string!(.name),r'buffer.*')"
sinks:
  socket_dial:
    type: socket
    inputs:
    - socket_dial_input
    address: $HOST:$PORT
    mode: tcp
    encoding:
      codec: raw_message
  prometheus_remote_write:
    type: prometheus_remote_write
    inputs:
    - http_client_dial_metrics
    - socket_dial_metrics
    endpoint: http://127.0.0.1:38086/api/v1/prometheus
    healthcheck:
      enabled: false
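
The Vector examples in this section are written as standalone Vector configurations. Since inputs.vector.config has type dict, one plausible way to apply them is to nest the Vector keys directly under config, as in the sketch below. This nesting is an assumption based on the schema, and the fragment reuses only a reduced form of the host metrics example (the relabel transform is omitted).

```yaml
inputs:
  vector:
    # Run the Vector component (default: false).
    enabled: true
    # Assumption: the Vector configuration is embedded as a nested dict.
    config:
      sources:
        host_metrics:
          type: host_metrics
          scrape_interval_secs: 10
          namespace: node
      sinks:
        prometheus_remote_write:
          type: prometheus_remote_write
          inputs:
          - host_metrics
          # deepflow-agent integration listener on its default port.
          endpoint: http://127.0.0.1:38086/api/v1/prometheus
          healthcheck:
            enabled: false
```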

#3. Processors

#3.1 Packet

#3.1.1 Policy

#3.1.1.1 Fast-path Map Size

Tags:

agent_restart

FQCN:

processors.packet.policy.fast_path_map_size

Upgrade from old version: static_config.fast-path-map-size

Default value:

processors:
  packet:
    policy:
      fast_path_map_size: 0

Schema:

Key	Value
Type	int
Range	[0, 10000000]

Description:

When set to 0, deepflow-agent will automatically adjust the map size according to global.limits.max_memory. Note: In practice, it should not be set to less than 8000.
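
If automatic sizing is not wanted, a fixed size can be set instead. The value 16000 below is an illustrative assumption that merely respects the advice not to go below 8000.

```yaml
processors:
  packet:
    policy:
      # 0 (default) sizes the map from global.limits.max_memory.
      # Illustrative fixed size; avoid values below 8000.
      fast_path_map_size: 16000
```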

#3.1.1.2 Fast-path Disabled

Tags:

agent_restart

FQCN:

processors.packet.policy.fast_path_disabled

Upgrade from old version: static_config.fast-path-disabled

Default value:

processors:
  packet:
    policy:
      fast_path_disabled: false

Schema:

Key	Value
Type	bool

Description:

When set to true, deepflow-agent will not use fast path.

#3.1.1.3 Forward Table Capacity

Tags:

agent_restart

FQCN:

processors.packet.policy.forward_table_capacity

Upgrade from old version: static_config.forward-capacity

Default value:

processors:
  packet:
    policy:
      forward_table_capacity: 16384

Schema:

Key	Value
Type	int
Range	[16384, 64000000]

Description:

The size of the forwarding table, which is used to store MAC-IP information. The larger this value, the higher the memory usage may be.

#3.1.1.4 Max First-path Level

Tags:

agent_restart

FQCN:

processors.packet.policy.max_first_path_level

Upgrade from old version: static_config.first-path-level

Default value:

processors:
  packet:
    policy:
      max_first_path_level: 8

Schema:

Key	Value
Type	int
Range	[1, 16]

Description:

DDBS algorithm level.

When this value is larger, the memory overhead is smaller, but the performance of policy matching is worse.

#3.1.2 TCP Header

#3.1.2.1 Block Size

Tags:

agent_restart ee_feature

FQCN:

processors.packet.tcp_header.block_size

Upgrade from old version: static_config.packet-sequence-block-size

Default value:

processors:
  packet:
    tcp_header:
      block_size: 256

Schema:

Key	Value
Type	int
Range	[16, 8192]

Description:

When generating TCP header data, each flow uses one block to compress and store multiple TCP headers, and the block size can be set here.

#3.1.2.2 Sender Queue Size

Tags:

agent_restart ee_feature

FQCN:

processors.packet.tcp_header.sender_queue_size

Upgrade from old version: static_config.packet-sequence-queue-size

Default value:

processors:
  packet:
    tcp_header:
      sender_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues (to UniformCollectSender):

1-packet-sequence-block-to-uniform-collect-sender

#3.1.2.3 Header Fields Flag

Tags:

agent_restart ee_feature

FQCN:

processors.packet.tcp_header.header_fields_flag

Upgrade from old version: static_config.packet-sequence-flag

Default value:

processors:
  packet:
    tcp_header:
      header_fields_flag: 0

Schema:

Key	Value
Type	int
Range	[0, 255]

Description:

header_fields_flag (formerly packet-sequence-flag) determines which fields are reported. The default value 0 means the feature is disabled, and 255 means all fields are reported. Each bit corresponds to one field:

Field	Bit
FLAG	7
SEQ	6
ACK	5
PAYLOAD_SIZE	4
WINDOW_SIZE	3
OPT_MSS	2
OPT_WS	1
OPT_SACK	0
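
The value is the sum of the weights of the selected bits, where bit n has weight 2^n. As a worked example (the choice of fields is arbitrary), reporting only SEQ (bit 6), ACK (bit 5) and PAYLOAD_SIZE (bit 4) gives 64 + 32 + 16 = 112:

```yaml
processors:
  packet:
    tcp_header:
      # SEQ (bit 6) + ACK (bit 5) + PAYLOAD_SIZE (bit 4)
      # = 64 + 32 + 16 = 112
      header_fields_flag: 112
```

By the same arithmetic, enabling all eight bits yields 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255, which matches the documented "all fields" value.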

#3.1.3 PCAP Stream

#3.1.3.1 Receiver Queue Size

Tags:

agent_restart ee_feature

FQCN:

processors.packet.pcap_stream.receiver_queue_size

Upgrade from old version: static_config.pcap.queue-size

Default value:

processors:
  packet:
    pcap_stream:
      receiver_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

1-mini-meta-packet-to-pcap

#3.1.3.2 Buffer Size Per Flow

Tags:

agent_restart ee_feature

FQCN:

processors.packet.pcap_stream.buffer_size_per_flow

Upgrade from old version: static_config.pcap.flow-buffer-size

Default value:

processors:
  packet:
    pcap_stream:
      buffer_size_per_flow: 65536

Schema:

Key	Value
Type	int
Range	[64, 64000000]

Description:

PCAP buffer size per flow. A flow is flushed when its buffer reaches this limit.

#3.1.3.3 Total Buffer Size

Tags:

agent_restart ee_feature

FQCN:

processors.packet.pcap_stream.total_buffer_size

Upgrade from old version: static_config.pcap.buffer-size

Default value:

processors:
  packet:
    pcap_stream:
      total_buffer_size: 88304

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

Total PCAP buffer size. All flows are flushed when the total reaches this limit.

#3.1.3.4 Flush Interval

Tags:

agent_restart ee_feature

FQCN:

processors.packet.pcap_stream.flush_interval

Upgrade from old version: static_config.pcap.flush-interval

Default value:

processors:
  packet:
    pcap_stream:
      flush_interval: 1m

Schema:

Key	Value
Type	duration
Range	['1s', '10m']

Description:

Flushes the PCAP buffer of a flow if it has not been flushed within this interval.
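
The three flush triggers of the PCAP stream (per-flow size, total size, and time) can be tuned together. The values below are illustrative assumptions picked from within the documented ranges.

```yaml
processors:
  packet:
    pcap_stream:
      # Flush a single flow once its buffer reaches this size.
      buffer_size_per_flow: 131072
      # Flush all flows once the total buffered size reaches this limit.
      total_buffer_size: 1048576
      # Flush a flow that has not been flushed for this long.
      flush_interval: 30s
```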

#3.1.4 TOA (TCP Option Address)

#3.1.4.1 Sender Queue Size

Tags:

agent_restart

FQCN:

processors.packet.toa.sender_queue_size

Upgrade from old version: static_config.toa-sender-queue-size

Default value:

processors:
  packet:
    toa:
      sender_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

1-socket-sync-toa-info-queue

#3.1.4.2 Cache Size

Tags:

agent_restart

FQCN:

processors.packet.toa.cache_size

Upgrade from old version: static_config.toa-lru-cache-size

Default value:

processors:
  packet:
    toa:
      cache_size: 65536

Schema:

Key	Value
Type	int
Range	[1, 64000000]

Description:

Size of the TCP Option Address (TOA) info cache.

#3.2 Request Log

#3.2.1 Application Protocol Inference

#3.2.1.1 Inference Maximum Retries

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.inference_max_retries

Upgrade from old version: static_config.l7-protocol-inference-max-fail-count

Default value:

processors:
  request_log:
    application_protocol_inference:
      inference_max_retries: 128

Schema:

Key	Value
Type	int
Range	[0, 10000]

Description:

deepflow-agent records the application protocol inference result of each server in a hash table, including the protocol, the number of consecutive inference failures, and the time of the last inference.

When the application protocol of a Flow has never been successfully inferred, the hash table is used to decide which protocols to try:

- If no result is found in the hash table, or the result is not usable (the protocol is unknown, the number of failures exceeds the limit, or the result is older than inference_result_ttl):
  - If the number of failures exceeds the limit, the Flow is marked as prohibited from inference for a period of inference_result_ttl.
  - Otherwise, iterate through all enabled application protocols and try to parse with each of them:
    - When parsing succeeds, the protocol, parsing time, and number of failures (0) are written to the hash table to keep the successful result fresh.
    - When parsing fails, the parsing time and number of failures (+1) are written to the hash table so that failed attempts accumulate, and subsequent attempts are prohibited once the count exceeds the threshold.
- If a specific, usable protocol is found in the hash table, parsing is attempted with that protocol:
  - When parsing succeeds, the protocol, parsing time, and number of failures (0) are written to the hash table to keep the successful result fresh.
  - When parsing fails, the parsing time and number of failures (+1) are written to the hash table so that failed attempts accumulate, and subsequent attempts are prohibited once the count exceeds the threshold.

Once a Flow has been successfully parsed, later parsing only tries that protocol and no longer queries the hash table. Each time parsing succeeds, the protocol (which may need to be updated for HTTP2/gRPC), the parsing time, and the number of failures in the hash table are updated.

#3.2.1.2 Inference Result TTL

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.inference_result_ttl

Upgrade from old version: static_config.l7-protocol-inference-ttl

Default value:

processors:
  request_log:
    application_protocol_inference:
      inference_result_ttl: 60s

Schema:

Key	Value
Type	duration
Range	['0ns', '1d']

Description:

deepflow-agent will mark the application protocol for each <vpc, ip, protocol, port> tuple. In order to avoid misidentification caused by IP changes, the validity period after successfully identifying the protocol will be limited to this value.
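
The two inference settings work together: inference_max_retries is the failure limit, and inference_result_ttl bounds how long a recorded result or a prohibition stays in effect. The fragment below is a sketch with illustrative values from within the documented ranges, not tuned recommendations.

```yaml
processors:
  request_log:
    application_protocol_inference:
      # Illustrative: lower the failure limit (default 128, range [0, 10000]).
      inference_max_retries: 16
      # Illustrative: extend the validity period (default 60s, range ['0ns', '1d']).
      inference_result_ttl: 5m
```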

#3.2.1.3 Enabled Protocols

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.enabled_protocols

Upgrade from old version: static_config.l7-protocol-enabled

Default value:

processors:
  request_log:
    application_protocol_inference:
      enabled_protocols:
      - HTTP
      - HTTP2
      - MySQL
      - Redis
      - Kafka
      - DNS
      - TLS

Enum options:

Value	Note
DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

Turning off identification of some protocols can reduce deepflow-agent resource consumption. Oracle and TLS are only supported in the Enterprise Edition. Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
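
As a sketch of reducing the work done by protocol identification, the list can be cut down to the protocols that are actually expected in the environment. The subset below is an arbitrary illustration taken from the default list.

```yaml
processors:
  request_log:
    application_protocol_inference:
      # Illustrative subset of the default list; protocols left out
      # (here Redis, Kafka and TLS) are no longer identified.
      enabled_protocols:
      - HTTP
      - HTTP2
      - MySQL
      - DNS
```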

#3.2.1.4 Protocol Special Config

#3.2.1.4.1 Oracle

#3.2.1.4.1.1 Integer Byte Order

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.protocol_special_config.oracle.is_be

Upgrade from old version: static_config.oracle-parse-config.is-be

Default value:

processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          is_be: true

Schema:

Key	Value
Type	bool

Description:

Whether Oracle integers are encoded in big-endian byte order.

#3.2.1.4.1.2 Integer Compressed

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.protocol_special_config.oracle.int_compressed

Upgrade from old version: static_config.oracle-parse-config.int-compress

Default value:

processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          int_compressed: true

Schema:

Key	Value
Type	bool

Description:

Whether Oracle integers are encoded in compressed form.

#3.2.1.4.1.3 Response 0x04 with Extra Byte

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.protocol_special_config.oracle.resp_0x04_extra_byte

Upgrade from old version: static_config.oracle-parse-config.resp-0x04-extra-byte

Default value:

processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        oracle:
          resp_0x04_extra_byte: false

Schema:

Key	Value
Type	bool

Description:

The response with data ID 0x04 has a different structure in different versions: it may have one extra byte before the affected row count.

#3.2.1.4.2 MySQL

#3.2.1.4.2.1 Decompress MySQL Payload

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.protocol_special_config.mysql.decompress_payload

Default value:

processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        mysql:
          decompress_payload: true

Schema:

Key	Value
Type	bool

Description:

Some MySQL packets have their payload compressed with the LZ77 algorithm. Enable this option to decompress the payload when parsing. Set it to false to disable decompression for better performance. Reference: MySQL Source Code Documentation.

#3.2.1.4.3 Grpc

#3.2.1.4.3.1 Enable gRPC stream data

Tags:

agent_restart

FQCN:

processors.request_log.application_protocol_inference.protocol_special_config.grpc.streaming_data_enabled

Default value:

processors:
  request_log:
    application_protocol_inference:
      protocol_special_config:
        grpc:
          streaming_data_enabled: false

Schema:

Key	Value
Type	bool

Description:

When enabled, all gRPC packets are treated as the stream type and their data is reported, and the RRT of the response is calculated using the grpc-status field.

#3.2.2 Filters

#3.2.2.1 Port Number Pre-filters

Tags:

agent_restart

FQCN:

processors.request_log.filters.port_number_prefilters

Upgrade from old version: static_config.l7-protocol-ports

Default value:

processors:
  request_log:
    filters:
      port_number_prefilters:
        AMQP: 1-65535
        Custom: 1-65535
        DNS: 53,5353
        Dubbo: 1-65535
        FastCGI: 1-65535
        HTTP: 1-65535
        HTTP2: 1-65535
        Kafka: 1-65535
        MQTT: 1-65535
        Memcached: 11211
        MongoDB: 1-65535
        MySQL: 1-65535
        NATS: 1-65535
        OpenWire: 1-65535
        Oracle: 1521
        PING: 1-65535
        PostgreSQL: 1-65535
        Pulsar: 1-65535
        Redis: 1-65535
        RocketMQ: 1-65535
        SofaRPC: 1-65535
        SomeIP: 1-65535
        TLS: 443,6443
        Tars: 1-65535
        ZMTP: 1-65535
        bRPC: 1-65535

Enum options:

Value	Note
DYNAMIC_OPTIONS

Schema:

Key	Value
Type	dict

Description:

Port-list example:

HTTP: 80,1000-2000
HTTP2: 1-65535

NOTE:

- HTTP2 and TLS are only used for Kprobe and do not apply to Uprobe. Data obtained through Uprobe is not subject to port restrictions.
- Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/
- Oracle and TLS are only supported in the Enterprise Edition.
- Attention: use HTTP2 for the gRPC protocol.
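
As a hedged illustration of the notes above, the fragment below limits HTTP parsing to a few ports and configures gRPC through the HTTP2 key. The port numbers (80, 8080, 50051) are assumptions chosen for illustration, not recommended values.

```yaml
processors:
  request_log:
    filters:
      port_number_prefilters:
        # Illustrative ports only; adjust to the ports your services listen on.
        HTTP: 80,8080
        # gRPC traffic is configured under the HTTP2 key.
        HTTP2: 50051
        # DNS keeps its documented default value.
        DNS: 53,5353
        # other protocols
```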

#3.2.2.2 Tag Filters

Tags:

agent_restart

FQCN:

processors.request_log.filters.tag_filters

Upgrade from old version: static_config.l7-log-blacklist

Default value:

processors:
  request_log:
    filters:
      tag_filters:
        AMQP: []
        Custom: []
        DNS: []
        Dubbo: []
        FastCGI: []
        HTTP: []
        HTTP2: []
        Kafka: []
        MQTT: []
        Memcached: []
        MongoDB: []
        MySQL: []
        NATS: []
        OpenWire: []
        Oracle: []
        PING: []
        PostgreSQL: []
        Pulsar: []
        Redis: []
        RocketMQ: []
        SOFARPC: []
        SomeIP: []
        TLS: []
        Tars: []
        ZMTP: []
        bRPC: []
        gRPC: []

Enum options:

Value	Note
DYNAMIC_OPTIONS

Schema:

Key	Value
Type	dict

Description:

Tag filter example:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
          - field_name: request_resource  # endpoint, request_type, request_domain, request_resource
            operator: equal               # equal, prefix
            value: somevalue
        HTTP2: []
        # other protocols

An l7_flow_log blacklist can be configured for each protocol, preventing request logs that match the blacklist from being collected by the agent or included in application performance metrics. It is recommended to place only non-business request logs, such as heartbeats or health checks, in this blacklist. Including business request logs might break the distributed tracing tree.

Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/

Oracle and TLS are only supported in the Enterprise Edition.
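
A minimal sketch of the recommendation above, assuming health checks are served under a /healthz path (a hypothetical path used only for illustration). The value key follows the tag filter example above; note that the per-field reference below lists this key as field_value.

```yaml
processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
          # Drop request logs whose request_resource starts with /healthz
          # (hypothetical health-check path).
          - field_name: request_resource
            operator: prefix
            value: /healthz
        # other protocols
```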

#3.2.2.2.1 $HTTP Tag Filters

Tags:

agent_restart

FQCN:

processors.request_log.filters.tag_filters.HTTP

Upgrade from old version: static_config.l7-log-blacklist.$protocol

Default value:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP: []

Schema:

Key	Value
Type	dict

Description:

HTTP Tag filter example:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
          - field_name: request_resource  # endpoint, request_type, request_domain, request_resource
            operator: equal               # equal, prefix
            value: somevalue

An l7_flow_log tag_filter can be configured for each protocol, preventing request logs that match the blacklist from being collected by the agent or included in application performance metrics. It is recommended to place only non-business request logs, such as heartbeats or health checks, in this blacklist. Including business request logs might break the distributed tracing tree.

Supported protocols: https://www.deepflow.io/docs/features/l7-protocols/overview/

Oracle and TLS are only supported in the Enterprise Edition.

#3.2.2.2.1.1 Field Name

Tags:

agent_restart

FQCN:

processors.request_log.filters.tag_filters.HTTP.field_name

Upgrade from old version: static_config.l7-log-blacklist.$protocol.field-name

Default value:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_name: ''

Enum options:

Value	Note
endpoint
request_type
request_domain
request_resource

Schema:

Key	Value
Type	string

Description:

Match field name.

#3.2.2.2.1.2 Operator

Tags:

agent_restart

FQCN:

processors.request_log.filters.tag_filters.HTTP.operator

Upgrade from old version: static_config.l7-log-blacklist.$protocol.operator

Default value:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - operator: ''

Enum options:

Value	Note
equal
prefix

Schema:

Key	Value
Type	string

Description:

Match operator.

#3.2.2.2.1.3 Field Value

Tags:

agent_restart

FQCN:

processors.request_log.filters.tag_filters.HTTP.field_value

Upgrade from old version: static_config.l7-log-blacklist.$protocol.value

Default value:

processors:
  request_log:
    filters:
      tag_filters:
        HTTP:
        - field_value: ''

Schema:

Key	Value
Type	string

Description:

Match field value.

#3.2.2.3 Unconcerned DNS NXDOMAIN

Tags:

agent_restart

FQCN:

processors.request_log.filters.unconcerned_dns_nxdomain_response_suffixes

Upgrade from old version: static_config.l7-protocol-advanced-features.unconcerned-dns-nxdomain-response-suffixes

Default value:

processors:
  request_log:
    filters:
      unconcerned_dns_nxdomain_response_suffixes: []

Schema:

Key	Value
Type	string

Description:

You might not be concerned about certain DNS NXDOMAIN errors and may wish to ignore them. For example, when a K8s Pod tries to resolve an external domain name, it first concatenates it with the internal domain suffix of the cluster and attempts to resolve it. All these attempts will receive an NXDOMAIN reply before it finally requests the original domain name directly, and these errors may not be of concern to you. In such cases, you can configure their response_result suffix here, so that the corresponding response_status in the l7_flow_log is forcibly set to Success.
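
For instance, assuming a Kubernetes cluster whose internal search domains end in cluster.local (an assumption; the actual search list can be checked in the Pod's /etc/resolv.conf), the suffixes might be configured as in this sketch:

```yaml
processors:
  request_log:
    filters:
      unconcerned_dns_nxdomain_response_suffixes:
      # Hypothetical suffixes appended by the cluster's DNS search list.
      - .svc.cluster.local
      - .cluster.local
```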

#3.2.2.4 cBPF data disabled

Tags:

hot_update

FQCN:

processors.request_log.filters.cbpf_disabled

Default value:

processors:
  request_log:
    filters:
      cbpf_disabled: false

Schema:

Key	Value
Type	bool

Description:

When set to true, deepflow-agent will not generate request_log from packet data.

#3.2.3 Timeouts

#3.2.3.1 TCP Request Timeout

Tags:

agent_restart

FQCN:

processors.request_log.timeouts.tcp_request_timeout

Upgrade from old version: static_config.rrt-tcp-timeout

Default value:

processors:
  request_log:
    timeouts:
      tcp_request_timeout: 1800s

Schema:

Key	Value
Type	duration
Range	['10s', '3600s']

Description:

The timeout used when calculating the RRT of l7 logs. When the RRT exceeds this value, the request is treated as timed out: it is excluded from the sum and average calculations, and the request and response are not merged during session aggregation. For TCP, the value must be greater than the session aggregate SLOT_TIME (a constant 10s) and less than 3600s.

#3.2.3.2 UDP Request Timeout

Tags:

agent_restart

FQCN:

processors.request_log.timeouts.udp_request_timeout

Upgrade from old version: static_config.rrt-udp-timeout

Default value:

processors:
  request_log:
    timeouts:
      udp_request_timeout: 150s

Schema:

Key	Value
Type	duration
Range	['10s', '300s']

Description:

#3.2.3.3 Session Aggregate Window Duration

Tags:

agent_restart deprecated

FQCN:

processors.request_log.timeouts.session_aggregate_window_duration

Upgrade from old version: static_config.l7-log-session-aggr-timeout

Default value:

processors:
  request_log:
    timeouts:
      session_aggregate_window_duration: 120s

Schema:

Key	Value
Type	duration
Range	['20s', '300s']

Description:

l7_flow_log aggregate window.

#3.2.3.4 Application Session Aggregate Timeouts

Tags:

hot_update

FQCN:

processors.request_log.timeouts.session_aggregate

Default value:

processors:
  request_log:
    timeouts:
      session_aggregate: []

Schema:

Key	Value
Type	dict

Description:

Set the aggregation timeout for each application. The default value is 15s for DNS and TLS, and 120s for all other protocols.

Example:

processors:
  request_log:
    timeouts:
      session_aggregate:
      - protocol: DNS
        timeout: 15s
      - protocol: HTTP2
        timeout: 120s

#3.2.3.4.1 Protocol

Tags:

hot_update

FQCN:

processors.request_log.timeouts.session_aggregate.protocol

Default value:

processors:
  request_log:
    timeouts:
      session_aggregate:
      - protocol: ''

Schema:

Key	Value
Type	string

Description:

Protocol Name for timeout setting.

#3.2.3.4.2 Timeout

Tags:

agent_restart

FQCN:

processors.request_log.timeouts.session_aggregate.timeout

Default value:

processors:
  request_log:
    timeouts:
      session_aggregate:
      - timeout: 0

Schema:

Key	Value
Type	duration

Description:

Set the timeout for the application.

#3.2.4 Tag Extraction

#3.2.4.1 Tracing Tag

#3.2.4.1.1 HTTP Real Client

Tags:

hot_update

FQCN:

processors.request_log.tag_extraction.tracing_tag.http_real_client

Upgrade from old version: http_log_proxy_client

Default value:

processors:
  request_log:
    tag_extraction:
      tracing_tag:
        http_real_client:
        - X_Forwarded_For

Schema:

Key	Value
Type	string

Description:

It is used to extract the real client IP field in the HTTP header, such as X-Forwarded-For, etc. Leave it empty to disable this feature. If multiple values are specified, the first match will be used. Fields rewritten by plugins have the highest priority.

#3.2.4.1.2 X-Request-ID

Tags:

hot_update

FQCN:

processors.request_log.tag_extraction.tracing_tag.x_request_id

Upgrade from old version: http_log_x_request_id

Default value:

processors:
  request_log:
    tag_extraction:
      tracing_tag:
        x_request_id:
        - X_Request_ID

Schema:

Key	Value
Type	string

Description:

It is used to extract the fields in the HTTP header that are used to uniquely identify the same request before and after the gateway, such as X-Request-ID, etc. This feature can be turned off by setting it to empty. If multiple values are specified, the first match will be used. Fields rewritten by plugins have the highest priority.
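
A small sketch of turning the feature off, assuming that "empty" corresponds to an empty YAML list for this list-valued option:

```yaml
processors:
  request_log:
    tag_extraction:
      tracing_tag:
        # Assumption: an empty list disables X-Request-ID extraction.
        x_request_id: []
```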

#3.2.4.1.3 APM TraceID

Tags:

hot_update

FQCN:

processors.request_log.tag_extraction.tracing_tag.apm_trace_id

Upgrade from old version: http_log_trace_id

Default value:

processors:
  request_log:
    tag_extraction:
      tracing_tag:
        apm_trace_id:
        - traceparent
        - sw8

Schema:

Key	Value
Type	string

Description:

Used to extract the TraceID field from HTTP and RPC headers. Multiple values can be specified, separated by commas. This feature can be turned off by setting it to empty. If multiple values are specified, the first match will be used. Fields rewritten by plugins have the highest priority.

#3.2.4.1.4 APM SpanID

Tags:

hot_update

FQCN:

processors.request_log.tag_extraction.tracing_tag.apm_span_id

Upgrade from old version: http_log_span_id

Default value:

processors:
  request_log:
    tag_extraction:
      tracing_tag:
        apm_span_id:
        - traceparent
        - sw8

Schema:

Key	Value
Type	string

Description:

Used to extract the SpanID field from HTTP and RPC headers. Multiple values can be specified, separated by commas. This feature can be turned off by setting it to empty. If multiple values are specified, the first match will be used. Fields rewritten by plugins have the highest priority.

#3.2.4.2 HTTP Endpoint

#3.2.4.2.1 Extraction Disabled

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.http_endpoint.extraction_disabled

Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.disabled

Default value:

processors:
  request_log:
    tag_extraction:
      http_endpoint:
        extraction_disabled: false

Schema:

Key	Value
Type	bool

Description:

HTTP endpoint extraction is enabled by default.

#3.2.4.2.2 Match Rules

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.http_endpoint.match_rules

Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules

Default value:

processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - keep_segments: 2
          url_prefix: ''

Schema:

Key	Value
Type	dict

Description:

Extract the endpoint according to the following rules:

- Find the longest matching url_prefix, following the principle of "longest prefix matching".
- Take the first few segments of the URL as the endpoint (the content between two / characters counts as one segment); the number of segments is given by keep_segments.

By default, two segments are extracted from the URL. For example, the URL /a/b/c?query=xxx has three segments, and /a/b is extracted as the endpoint.
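
The rules above might be combined as in the following sketch. The /api/v1 prefix and the sample URLs in the comments are hypothetical, and the resulting endpoints assume that segments are counted from the start of the URL, as the /a/b/c example suggests.

```yaml
processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        # URLs starting with /api/v1 keep three segments,
        # e.g. /api/v1/users/42 would yield /api/v1/users.
        - url_prefix: /api/v1
          keep_segments: 3
        # All other URLs match the empty prefix and keep two segments,
        # e.g. /a/b/c?query=xxx yields /a/b.
        - url_prefix: ''
          keep_segments: 2
```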

#3.2.4.2.2.1 URL Prefix

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.http_endpoint.match_rules.url_prefix

Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules.prefix

Default value:

processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - url_prefix: ''

Schema:

Key	Value
Type	string

Description:

HTTP URL prefix.

#3.2.4.2.2.2 Keep Segments

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.http_endpoint.match_rules.keep_segments

Upgrade from old version: static_config.l7-protocol-advanced-features.http-endpoint-extraction.match-rules.keep-segments

Default value:

processors:
  request_log:
    tag_extraction:
      http_endpoint:
        match_rules:
        - keep_segments: 0

Schema:

Key	Value
Type	int

Description:

The number of URL segments to keep.

#3.2.4.3 Custom Fields

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.custom_fields

Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields

Default value:

processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP: []
        HTTP2: []

Enum options:

Value	Note
HTTP
HTTP2

Schema:

Key	Value
Type	dict

Description:

Configuration to extract the customized header fields of HTTP, HTTP2, gRPC protocol etc.

Example:

processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: "user-agent"
        - field_name: "cookie"

Attention: use HTTP2 for gRPC Protocol.
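
Following the note above, a gRPC header would be listed under the HTTP2 key. The header name x-tenant-id is a hypothetical custom field used only for illustration.

```yaml
processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP: []
        HTTP2:
        # Applies to gRPC as well, since gRPC is configured through HTTP2.
        - field_name: "x-tenant-id"
```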

#3.2.4.3.1 $HTTP Custom Fields

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.custom_fields.HTTP

Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields.$protocol

Default value:

processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP: []

Schema:

Key	Value
Type	dict

Description:

Configuration to extract the customized header fields of HTTP, HTTP2, gRPC protocol etc.

Example:

processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: "user-agent"
        - field_name: "cookie"

Attention: use HTTP2 for gRPC Protocol.

#3.2.4.3.1.1 Field Name

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.custom_fields.HTTP.field_name

Upgrade from old version: static_config.l7-protocol-advanced-features.extra-log-fields.$protocol.field-name

Default value:

processors:
  request_log:
    tag_extraction:
      custom_fields:
        HTTP:
        - field_name: ''

Schema:

Key	Value
Type	string

Description:

Field name.

#3.2.4.4 Obfuscate Protocols

Tags:

agent_restart

FQCN:

processors.request_log.tag_extraction.obfuscate_protocols

Upgrade from old version: static_config.l7-protocol-advanced-features.obfuscate-enabled-protocols

Default value:

processors:
  request_log:
    tag_extraction:
      obfuscate_protocols:
      - Redis

Enum options:

Value	Note
MySQL
PostgreSQL
HTTP
HTTP2
Redis

Schema:

Key	Value
Type	string

Description:

For data security, configure here the protocols whose data needs to be obfuscated (desensitized); protocols that are not listed are not processed. Obfuscated fields mainly include:

Authorization information
Value information in various statements
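
As an illustration, the following fragment enables obfuscation for the SQL protocols in addition to the default Redis entry. The selection of protocols is only an example drawn from the enum options above.

```yaml
processors:
  request_log:
    tag_extraction:
      obfuscate_protocols:
      - Redis
      - MySQL
      - PostgreSQL
```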

#3.2.5 Tuning

#3.2.5.1 Payload Truncation

Tags:

hot_update

FQCN:

processors.request_log.tunning.payload_truncation

Upgrade from old version: l7_log_packet_size

Default value:

processors:
  request_log:
    tunning:
      payload_truncation: 1024

Schema:

Key	Value
Type	int
Unit	byte
Range	[256, 65535]

Description:

The maximum data length used for application protocol identification. Note that the effective value is less than the value of inputs.cbpf.tunning.max_capture_packet_size.

NOTE: For eBPF data, the largest valid value is 16384.

#3.2.5.2 Session Aggregate Slot Capacity

Tags:

agent_restart deprecated

FQCN:

processors.request_log.tunning.session_aggregate_slot_capacity

Upgrade from old version: static_config.l7-log-session-slot-capacity

Default value:

processors:
  request_log:
    tunning:
      session_aggregate_slot_capacity: 1024

Schema:

Key	Value
Type	int
Range	[1024, 1000000]

Description:

By default, unidirectional l7_flow_log is aggregated into bidirectional request_log (session) with a caching time window of 2 minutes. During this period, every 5 seconds is considered as a time slot (i.e., a LRU). This configuration is used to specify the maximum number of unidirectional l7_flow_log entries that can be cached in each time slot.

If the number of l7_flow_log entries cached in a time slot exceeds this configuration, 10% of the data in that time slot will be evicted based on the LRU strategy to reduce memory consumption. Note that the evicted data will not be discarded; instead, they will be sent to the deepflow-server as unidirectional request_log.

The following metrics can be used as reference data for adjusting this configuration:

Metric deepflow_system.deepflow_agent_l7_session_aggr.cached-request-resource: records the total memory, in bytes, occupied by the request_resource field of the unidirectional l7_flow_log entries cached in all time slots at the current moment.
Metric deepflow_system.deepflow_agent_l7_session_aggr.over-limit: records the number of times eviction is triggered because the LRU capacity limit is reached.

#3.2.5.3 Session Aggregate Max Entries

Tags:

hot_update

FQCN:

processors.request_log.tunning.session_aggregate_max_entries

Default value:

processors:
  request_log:
    tunning:
      session_aggregate_max_entries: 65536

Schema:

Key	Value
Type	int
Range	[16384, 10000000]

Description:

The maximum number of l7_flow_log entries cached for merging into a session. If the total number of l7_flow_log entries exceeds this configuration, the oldest entry will be sent without merging, setting its response status to Unknown.

#3.2.5.4 Consistent Timestamp in L7 Metrics

Tags:

agent_restart

FQCN:

processors.request_log.tunning.consistent_timestamp_in_l7_metrics

Default value:

processors:
  request_log:
    tunning:
      consistent_timestamp_in_l7_metrics: false

Schema:

Key	Value
Type	bool

Description:

When this configuration is enabled, for the same session, response-related metrics (such as response count, latency, exceptions) are recorded in the time slot corresponding to when the request occurred, rather than the time slot of the response itself. This means that when calculating metrics for requests and responses within a session, a consistent timestamp based on the time of the request occurrence is used.

#3.3 Flow Log

#3.3.1 Time Window

#3.3.1.1 Maximum Tolerable Packet Delay

Tags:

agent_restart

FQCN:

processors.flow_log.time_window.max_tolerable_packet_delay

Upgrade from old version: static_config.packet-delay

Default value:

processors:
  flow_log:
    time_window:
      max_tolerable_packet_delay: 1s

Schema:

Key	Value
Type	duration
Range	['1s', '20s']

Description:

The timestamp carried by a packet captured via AF_PACKET may lag behind the current clock, especially under heavy traffic, where the delay may reach nearly 10s. This value also affects the FlowMap aggregation window size.
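
For a heavily loaded capture point where packet delays approach 10s, the tolerance could be raised as sketched below. The chosen value is an assumption for illustration; it must stay within the documented range of 1s to 20s.

```yaml
processors:
  flow_log:
    time_window:
      # Illustrative value for heavy-traffic scenarios; the default is 1s.
      max_tolerable_packet_delay: 10s
```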

#3.3.1.2 Extra Tolerable Flow Delay

Tags:

agent_restart

FQCN:

processors.flow_log.time_window.extra_tolerable_flow_delay

Upgrade from old version: static_config.second-flow-extra-delay-second

Default value:

processors:
  flow_log:
    time_window:
      extra_tolerable_flow_delay: 0s

Schema:

Key	Value
Type	duration
Range	['0s', '20s']

Description:

Extra tolerance for QuadrupleGenerator receiving flows. Affects 1s/1m QuadrupleGenerator aggregation window size.

#3.3.2 Conntrack (a.k.a. Flow Map)

#3.3.2.1 Flow Flush Interval

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.flow_flush_interval

Upgrade from old version: static_config.flow.flush-interval

Default value:

processors:
  flow_log:
    conntrack:
      flow_flush_interval: 1s

Schema:

Key	Value
Type	duration
Range	['1s', '1m']

Description:

Flow generation delay time in FlowMap, used to increase the window size in downstream processing units to avoid pushing the window too fast.

#3.3.2.2 Flow Generation

#3.3.2.2.1 Server Ports

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.flow_generation.server_ports

Upgrade from old version: static_config.server-ports

Default value:

processors:
  flow_log:
    conntrack:
      flow_generation:
        server_ports: []

Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

List of service (server) ports. Its priority is lower than that of TCP SYN flags.

#3.3.2.2.2 Cloud Traffic Ignore MAC

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.flow_generation.cloud_traffic_ignore_mac

Upgrade from old version: static_config.flow.ignore-tor-mac

Default value:

processors:
  flow_log:
    conntrack:
      flow_generation:
        cloud_traffic_ignore_mac: false

Schema:

Key	Value
Type	bool

Description:

When the MAC addresses of the two directions of traffic collected at the same location are asymmetrical, the traffic cannot be aggregated into a Flow. In that case, set this value to true. Only valid for Cloud (not IDC) traffic.

#3.3.2.2.3 Ignore L2End

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.flow_generation.ignore_l2_end

Upgrade from old version: static_config.flow.ignore-l2-end

Default value:

processors:
  flow_log:
    conntrack:
      flow_generation:
        ignore_l2_end: false

Schema:

Key	Value
Type	bool

Description:

For Cloud traffic, only the MAC address corresponding to the side with L2End = true is matched when generating the flow. Set this value to true to force a double-sided MAC address match and only aggregate traffic with exactly equal MAC addresses.

#3.3.2.2.4 IDC Traffic Ignore VLAN

Tags:

agent_restart ee_feature

FQCN:

processors.flow_log.conntrack.flow_generation.idc_traffic_ignore_vlan

Upgrade from old version: static_config.flow.ignore-idc-vlan

Default value:

processors:
  flow_log:
    conntrack:
      flow_generation:
        idc_traffic_ignore_vlan: false

Schema:

Key	Value
Type	bool

Description:

When the VLANs of the two directions of traffic collected at the same location are asymmetrical, the traffic cannot be aggregated into a Flow. In that case, set this value to true. Only valid for IDC (not Cloud) traffic.

#3.3.2.3 Timeouts

#3.3.2.3.1 Established

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.timeouts.established

Upgrade from old version: static_config.flow.established-timeout

Default value:

processors:
  flow_log:
    conntrack:
      timeouts:
        established: 300s

Schema:

Key	Value
Type	duration
Range	['1s', '1d']

Description:

Timeouts for TCP State Machine - Established.

#3.3.2.3.2 Closing RST

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.timeouts.closing_rst

Upgrade from old version: static_config.flow.closing-rst-timeout

Default value:

processors:
  flow_log:
    conntrack:
      timeouts:
        closing_rst: 35s

Schema:

Key	Value
Type	duration
Range	['1s', '1d']

Description:

Timeouts for TCP State Machine - Closing Reset.

#3.3.2.3.3 Opening RST

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.timeouts.opening_rst

Upgrade from old version: static_config.flow.opening-rst-timeout

Default value:

processors:
  flow_log:
    conntrack:
      timeouts:
        opening_rst: 1s

Schema:

Key	Value
Type	duration
Range	['1s', '1d']

Description:

Timeouts for TCP State Machine - Opening Reset.

#3.3.2.3.4 Others

Tags:

agent_restart

FQCN:

processors.flow_log.conntrack.timeouts.others

Upgrade from old version: static_config.flow.others-timeout

Default value:

processors:
  flow_log:
    conntrack:
      timeouts:
        others: 5s

Schema:

Key	Value
Type	duration
Range	['1s', '1d']

Description:

Timeouts for TCP State Machine - Others.
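
For reference, the four TCP state machine timeouts above live under the same timeouts key and can be set together. The fragment below simply restates the documented defaults.

```yaml
processors:
  flow_log:
    conntrack:
      timeouts:
        established: 300s
        closing_rst: 35s
        opening_rst: 1s
        others: 5s
```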

#3.3.3 Tuning

#3.3.3.1 FlowMap Hash Slots

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.flow_map_hash_slots

Upgrade from old version: static_config.flow.flow-slots-size

Default value:

processors:
  flow_log:
    tunning:
      flow_map_hash_slots: 131072

Schema:

Key	Value
Type	int
Range	[1024, 64000000]

Description:

Since FlowAggregator is the first step in all processing, this value is also widely used in other hash tables such as QuadrupleGenerator, Collector, etc.

#3.3.3.2 Concurrent Flow Limit

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.concurrent_flow_limit

Upgrade from old version: static_config.flow.flow-count-limit

Default value:

processors:
  flow_log:
    tunning:
      concurrent_flow_limit: 65535

Schema:

Key	Value
Type	int
Range	[1024, 64000000]

Description:

The maximum number of flows that can be stored in FlowMap. It also determines the capacity of the RRT cache (rrt-cache-capacity = flow-count-limit). When the RRT cache capacity is insufficient, the RRT of l7 cannot be calculated. When inputs.cbpf.common.capture_mode is Physical Mirror and concurrent_flow_limit is less than or equal to 65535, it is forced to u32::MAX.

#3.3.3.3 Memory Pool Size

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.memory_pool_size

Upgrade from old version: static_config.flow.memory-pool-size

Default value:

processors:
  flow_log:
    tunning:
      memory_pool_size: 65536

Schema:

Key	Value
Type	int
Range	[1024, 64000000]

Description:

This value sets the maximum length of the memory pool in FlowMap. Memory pools are used for frequently created and destroyed objects such as FlowNode and FlowLog.

#3.3.3.4 Maximum Size of Batched Buffer

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.max_batched_buffer_size

Upgrade from old version: static_config.batched-buffer-size-limit

Default value:

processors:
  flow_log:
    tunning:
      max_batched_buffer_size: 131072

Schema:

Key	Value
Type	int
Range	[1024, 64000000]

Description:

Only TaggedFlow allocation is affected at the moment. Structs are allocated in batches to minimize malloc calls. The total memory size of a batch will not exceed this limit. A value larger than 128K is not recommended: the default MMAP_THRESHOLD is 128K, so allocating chunks larger than 128K results in mmap calls and more page faults.

#3.3.3.5 FlowAggregator Queue Size

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.flow_aggregator_queue_size

Upgrade from old version: static_config.flow.flow-aggr-queue-size

Default value:

processors:
  flow_log:
    tunning:
      flow_aggregator_queue_size: 65535

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

2-second-flow-to-minute-aggrer

#3.3.3.6 FlowGenerator Queue Size

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.flow_generator_queue_size

Upgrade from old version: static_config.flow-queue-size

Default value:

processors:
  flow_log:
    tunning:
      flow_generator_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

1-tagged-flow-to-quadruple-generator
1-tagged-flow-to-app-protocol-logs
0-{flow_type}-{port}-packet-to-tagged-flow (flow_type: sflow, netflow)

#3.3.3.7 QuadrupleGenerator Queue Size

Tags:

agent_restart

FQCN:

processors.flow_log.tunning.quadruple_generator_queue_size

Upgrade from old version: static_config.quadruple-queue-size

Default value:

processors:
  flow_log:
    tunning:
      quadruple_generator_queue_size: 262144

Schema:

Key	Value
Type	int
Range	[262144, 64000000]

Description:

The length of the following queues:

2-flow-with-meter-to-second-collector
2-flow-with-meter-to-minute-collector

#4. Outputs

#4.1 Socket

#4.1.1 Data Socket Type

Tags:

hot_update

FQCN:

outputs.socket.data_socket_type

Upgrade from old version: collector_socket_type

Default value:

outputs:
  socket:
    data_socket_type: TCP

Enum options:

Value	Note
TCP
UDP
FILE

Schema:

Key	Value
Type	string

Description:

It can only be set to FILE in standalone mode, in which case l4_flow_log and l7_flow_log will be written to local files.

#4.1.2 NPB Socket Type

Tags:

hot_update ee_feature

FQCN:

outputs.socket.npb_socket_type

Upgrade from old version: npb_socket_type

Default value:

outputs:
  socket:
    npb_socket_type: RAW_UDP

Enum options:

Value	Note
UDP
RAW_UDP
TCP
ZMQ

Schema:

Key	Value
Type	string

Description:

RAW_UDP uses RawSocket to send UDP packets, which has the highest performance, but there may be compatibility issues in some environments.

#4.1.3 RAW_UDP QoS Bypass

Tags:

agent_restart

FQCN:

outputs.socket.raw_udp_qos_bypass

Upgrade from old version: static_config.enable-qos-bypass

Default value:

outputs:
  socket:
    raw_udp_qos_bypass: false

Schema:

Key	Value
Type	bool

Description:

When sender uses RAW_UDP to send data, this feature can be enabled to improve performance. Linux Kernel >= 3.14 is required. Note that the data sent when this feature is enabled cannot be captured by tcpdump.
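
A sketch of how the two RAW_UDP-related options might be combined, assuming an environment with Linux Kernel >= 3.14 in which RAW_UDP has no compatibility issues:

```yaml
outputs:
  socket:
    npb_socket_type: RAW_UDP
    # Data sent with this enabled cannot be captured by tcpdump.
    raw_udp_qos_bypass: true
```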

#4.1.4 Multiple Sockets To Ingester

Tags:

hot_update

FQCN:

outputs.socket.multiple_sockets_to_ingester

Upgrade from old version: static_config.multiple-sockets-to-ingester

Default value:

outputs:
  socket:
    multiple_sockets_to_ingester: false

Schema:

Key	Value
Type	bool

Description:

When set to true, deepflow-agent will send data to Ingester over multiple sockets, which offers higher performance but has a greater impact on the firewall.

#4.2 Flow Log and Request Log

#4.2.1 Filters

#4.2.1.1 Capture Network Types for L4

Tags:

hot_update

FQCN:

outputs.flow_log.filters.l4_capture_network_types

Upgrade from old version: l4_log_tap_types

Default value:

outputs:
  flow_log:
    filters:
      l4_capture_network_types:
      - 0

Enum options:

Value	Note
-1	Disabled
0	All TAPs
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	int

Description:

The list of TAPs from which l4_flow_log is collected. Set it to 0 to collect from all TAPs, to -1 to disable collection, or to a list of specific TAPs.

#4.2.1.2 Capture Network Types for L7

Tags:

hot_update

FQCN:

outputs.flow_log.filters.l7_capture_network_types

Upgrade from old version: l7_log_store_tap_types

Default value:

outputs:
  flow_log:
    filters:
      l7_capture_network_types:
      - 0

Enum options:

Value	Note
-1	Disabled
0	All TAPs
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	int

Description:

The list of TAPs from which l7_flow_log is collected. Set it to 0 to collect from all TAPs, to -1 to disable collection, or to a list of specific TAPs.
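
Based on the enum options listed for these two settings, the following sketch disables l4_flow_log collection while still collecting l7_flow_log from all TAPs. This combination is only an illustration, not a recommendation.

```yaml
outputs:
  flow_log:
    filters:
      l4_capture_network_types:
      - -1   # Disabled
      l7_capture_network_types:
      - 0    # All TAPs
```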

#4.2.1.3 Ignored Observation Points for L4

Tags:

hot_update

FQCN:

outputs.flow_log.filters.l4_ignored_observation_points

Upgrade from old version: l4_log_ignore_tap_sides

Default value:

outputs:
  flow_log:
    filters:
      l4_ignored_observation_points: []

Enum options:

Value	Note
0	rest, Other NIC
1	c, Client NIC
2	s, Server NIC
4	local, Local NIC
9	c-nd, Client K8s Node
10	s-nd, Server K8s Node
17	c-hv, Client VM Hypervisor
18	s-hv, Server VM Hypervisor
25	c-gw-hv, Client-side Gateway Hypervisor
26	s-gw-hv, Server-side Gateway Hypervisor
33	c-gw, Client-side Gateway
34	s-gw, Server-side Gateway
41	c-p, Client Process
42	s-p, Server Process

Schema:

Key	Value
Type	int

Description:

Use tap_side values to specify the observation points whose l4_flow_log is not collected. This configuration also applies to tcp_sequence and pcap data in the Enterprise Edition. The default value [] means that nothing is ignored and everything is stored.
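
For illustration, a sketch of an override that ignores l4_flow_log at the two process-level observation points. The choice of points is an assumption made for the example; the numeric values come from the enum table above.

```yaml
outputs:
  flow_log:
    filters:
      l4_ignored_observation_points:
      - 41  # c-p, Client Process
      - 42  # s-p, Server Process
```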

#4.2.1.4 Ignored Observation Points for L7

Tags:

hot_update

FQCN:

outputs.flow_log.filters.l7_ignored_observation_points

Upgrade from old version: l7_log_ignore_tap_sides

Default value:

outputs:
  flow_log:
    filters:
      l7_ignored_observation_points: []

Enum options:

Value	Note
0	rest, Other NIC
1	c, Client NIC
2	s, Server NIC
4	local, Local NIC
9	c-nd, Client K8s Node
10	s-nd, Server K8s Node
17	c-hv, Client VM Hypervisor
18	s-hv, Server VM Hypervisor
25	c-gw-hv, Client-side Gateway Hypervisor
26	s-gw-hv, Server-side Gateway Hypervisor
33	c-gw, Client-side Gateway
34	s-gw, Server-side Gateway
41	c-p, Client Process
42	s-p, Server Process

Schema:

Key	Value
Type	int

Description:

Use observation point values to specify which l7_flow_log is not collected. The default value [] means that all observation points are collected.
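
A comparable sketch for L7, assuming (purely as an example) that request logs observed at the gateways are not needed. The values are taken from the enum table above.

```yaml
outputs:
  flow_log:
    filters:
      l7_ignored_observation_points:
      - 33  # c-gw, Client-side Gateway
      - 34  # s-gw, Server-side Gateway
```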

#4.2.2 Aggregators

#4.2.2.1 Health Check Flow Log Aggregation

Tags:

hot_update

FQCN:

outputs.flow_log.aggregators.aggregate_health_check_l4_flow_log

Default value:

outputs:
  flow_log:
    aggregators:
      aggregate_health_check_l4_flow_log: true

Schema:

Key	Value
Type	bool

Description:

Agent will mark the following types of flows as close_type = normal end-client reset:

Client sends SYN, server replies SYN-ACK, client sends RST
Client sends SYN, server replies SYN-ACK, client sends ACK, client sends RST

This is typical health check traffic from a load balancer to its backend hosts and does not carry any meaningful application layer payload.

When this configuration item is set to true, Agent will reset the client port number of the flow log to 0 before aggregating the output, thereby reducing bandwidth and storage overhead.
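
If the per-connection client ports of health check flows are needed, for example while debugging a load balancer, the aggregation can presumably be switched off. This is a sketch under that assumption and trades the lower overhead for more detail:

```yaml
outputs:
  flow_log:
    aggregators:
      # false: assumed to keep the original client port of each health check flow
      aggregate_health_check_l4_flow_log: false
```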

#4.2.3 Throttles

#4.2.3.1 L4 Throttle

Tags:

hot_update

FQCN:

outputs.flow_log.throttles.l4_throttle

Upgrade from old version: l4_log_collect_nps_threshold

Default value:

outputs:
  flow_log:
    throttles:
      l4_throttle: 10000

Schema:

Key	Value
Type	int
Unit	Per Second
Range	[100, 1000000]

Description:

The maximum number of l4_flow_log rows sent per second. When the actual number of upstream rows exceeds this value, reservoir sampling is applied to limit the number of rows sent.

#4.2.3.2 L7 Throttle

Tags:

hot_update

FQCN:

outputs.flow_log.throttles.l7_throttle

Upgrade from old version: l7_log_collect_nps_threshold

Default value:

outputs:
  flow_log:
    throttles:
      l7_throttle: 10000

Schema:

Key	Value
Type	int
Unit	Per Second
Range	[100, 1000000]

Description:

The maximum number of l7_flow_log rows sent per second. When the actual number of rows exceeds this value, sampling is triggered.
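
As a sketch, both throttles can be lowered together to reduce the volume sent by a busy agent. The value 5000 is an arbitrary illustration within the documented range [100, 1000000], not a recommendation.

```yaml
outputs:
  flow_log:
    throttles:
      l4_throttle: 5000  # rows of l4_flow_log per second
      l7_throttle: 5000  # rows of l7_flow_log per second
```

Rows above the limit are sampled rather than sent in full, so a lower value reduces the detail available in the logs.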

#4.2.4 Tuning

#4.2.4.1 Collector Queue Size

Tags:

agent_restart

FQCN:

outputs.flow_log.tunning.collector_queue_size

Upgrade from old version: static_config.flow-sender-queue-size

Default value:

outputs:
  flow_log:
    tunning:
      collector_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

3-flow-to-collector-sender
3-protolog-to-collector-sender
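
A minimal sketch of enlarging these queues. The value 131072 is an illustrative assumption within the range [65536, 64000000], and because the item is tagged agent_restart, the agent must be restarted for it to take effect. Note that the key is spelled `tunning`, as in the FQCN.

```yaml
outputs:
  flow_log:
    tunning:
      collector_queue_size: 131072  # illustrative; twice the default of 65536
```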

#4.3 Flow Metrics

#4.3.1 Enabled

Tags:

hot_update

FQCN:

outputs.flow_metrics.enabled

Upgrade from old version: collector_enabled

Default value:

outputs:
  flow_metrics:
    enabled: true

Schema:

Key	Value
Type	bool

Description:

When disabled, deepflow-agent will not send metrics and logging data collected using eBPF and cBPF.

Attention: setting this to false also disables l4_flow_log and l7_flow_log.

#4.3.2 Filters

#4.3.2.1 Inactive Server Port Aggregation

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.inactive_server_port_aggregation

Upgrade from old version: inactive_server_port_enabled

Default value:

outputs:
  flow_metrics:
    filters:
      inactive_server_port_aggregation: false

Schema:

Key	Value
Type	bool

Description:

When enabled, deepflow-agent does not generate detailed metrics for each inactive port (a port that only receives data and never sends any). Instead, the data of all inactive ports is aggregated into metrics tagged 'server_port = 0'.

#4.3.2.2 Inactive IP Aggregation

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.inactive_ip_aggregation

Upgrade from old version: inactive_ip_enabled

Default value:

outputs:
  flow_metrics:
    filters:
      inactive_ip_aggregation: false

Schema:

Key	Value
Type	bool

Description:

When enabled, deepflow-agent does not generate detailed metrics for each inactive IP address (an IP address that only receives data and never sends any). Instead, the data of all inactive IP addresses is aggregated into metrics tagged 'ip = 0'.
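
For illustration, a sketch that enables both inactive-endpoint aggregations at once. This might suit an environment with heavy port or address scanning, which is an assumption about the use case rather than a documented recommendation.

```yaml
outputs:
  flow_metrics:
    filters:
      inactive_server_port_aggregation: true  # inactive ports -> server_port = 0
      inactive_ip_aggregation: true           # inactive IP addresses -> ip = 0
```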

#4.3.2.3 NPM Metrics

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.npm_metrics

Upgrade from old version: l4_performance_enabled

Default value:

outputs:
  flow_metrics:
    filters:
      npm_metrics: true

Schema:

Key	Value
Type	bool

Description:

When disabled, deepflow-agent collects only some basic throughput metrics.

#4.3.2.4 NPM Concurrent Metrics

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.npm_metrics_concurrent

Default value:

outputs:
  flow_metrics:
    filters:
      npm_metrics_concurrent: true

Schema:

Key	Value
Type	bool

Description:

When disabled, deepflow-agent does not calculate concurrency metrics.

#4.3.2.5 APM Metrics

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.apm_metrics

Upgrade from old version: l7_metrics_enabled

Default value:

outputs:
  flow_metrics:
    filters:
      apm_metrics: true

Schema:

Key	Value
Type	bool

Description:

When disabled, deepflow-agent does not collect RED (request/error/delay) metrics.

#4.3.2.6 Second Metrics

Tags:

hot_update

FQCN:

outputs.flow_metrics.filters.second_metrics

Upgrade from old version: vtap_flow_1s_enabled

Default value:

outputs:
  flow_metrics:
    filters:
      second_metrics: true

Schema:

Key	Value
Type	bool

Description:

Whether to generate metrics at one-second granularity.

#4.3.3 Tuning

#4.3.3.1 Sender Queue Size

Tags:

agent_restart

FQCN:

outputs.flow_metrics.tunning.sender_queue_size

Upgrade from old version: static_config.collector-sender-queue-size

Default value:

outputs:
  flow_metrics:
    tunning:
      sender_queue_size: 65536

Schema:

Key	Value
Type	int
Range	[65536, 64000000]

Description:

The length of the following queues:

3-doc-to-collector-sender

#4.4 NPB (Network Packet Broker)

#4.4.1 Maximum MTU

Tags:

hot_update ee_feature

FQCN:

outputs.npb.max_mtu

Upgrade from old version: mtu

Default value:

outputs:
  npb:
    max_mtu: 1500

Schema:

Key	Value
Type	int
Unit	byte
Range	[500, 10000]

Description:

Maximum MTU allowed when using UDP for NPB.

Attention: Public cloud service providers may modify the tail of UDP packets whose length is close to 1500 bytes. When using UDP transmission, a slightly smaller value is recommended.
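
A sketch of following that recommendation. The value 1450 is an illustrative assumption; the appropriate margin depends on the cloud provider and is not specified here.

```yaml
outputs:
  npb:
    max_mtu: 1450  # illustrative value slightly below 1500 bytes
```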

#4.4.2 RAW_UDP VLAN Tag

Tags:

hot_update ee_feature

FQCN:

outputs.npb.raw_udp_vlan_tag

Upgrade from old version: output_vlan

Default value:

outputs:
  npb:
    raw_udp_vlan_tag: 0

Schema:

Key	Value
Type	int
Range	[0, 4095]

Description:

When using RAW_UDP Socket to transmit UDP data, this value can be used to set the VLAN tag. Default value 0 means no VLAN tag.

#4.4.3 Extra VLAN Header

Tags:

hot_update ee_feature

FQCN:

outputs.npb.extra_vlan_header

Upgrade from old version: npb_vlan_mode

Default value:

outputs:
  npb:
    extra_vlan_header: 0

Enum options:

Value	Note
0	None
1	802.1Q
2	QinQ

Schema:

Key	Value
Type	int

Description:

Whether to add an extra 802.1Q header to NPB traffic. When this value is set, deepflow-agent inserts a VLAN tag into the NPB traffic header; its value is the lower 12 bits of the TunnelID in the VXLAN header.
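
As a minimal sketch, selecting the single-tag mode from the enum table looks like this:

```yaml
outputs:
  npb:
    extra_vlan_header: 1  # 1 = 802.1Q (0 = None, 2 = QinQ)
```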

#4.4.4 Traffic Global Dedup

Tags:

hot_update ee_feature

FQCN:

outputs.npb.traffic_global_dedup

Upgrade from old version: npb_dedup_enabled

Default value:

outputs:
  npb:
    traffic_global_dedup: true

Schema:

Key	Value
Type	bool

Description:

Whether to enable global (distributed) traffic deduplication for the NPB feature.

#4.4.5 Target Port

Tags:

agent_restart ee_feature

FQCN:

outputs.npb.target_port

Upgrade from old version: static_config.npb-port

Default value:

outputs:
  npb:
    target_port: 4789

Schema:

Key	Value
Type	int
Range	[1, 65535]

Description:

Server port for NPB.

#4.4.6 Custom VXLAN Flags

Tags:

agent_restart ee_feature

FQCN:

outputs.npb.custom_vxlan_flags

Upgrade from old version: static_config.vxlan-flags

Default value:

outputs:
  npb:
    custom_vxlan_flags: 255

Schema:

Key	Value
Type	int
Range	[0, 255]

Description:

NPB uses the first byte of the VXLAN Flag to mark the traffic it sends, so that traffic sent by NPB is not collected by deepflow-agent.

Attention: To ensure that the VNI bit is set, the value configured here is ORed with 0b1000_0000 (|= 0b1000_0000) before use. Therefore, this value cannot be configured as exactly 0b1000_0000.
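
To make the arithmetic concrete, here is a sketch with an arbitrarily chosen value. The value 1 is only an illustration of the OR operation described above, not a recommended setting.

```yaml
outputs:
  npb:
    # Configured: 1 = 0b0000_0001
    # Used:       0b0000_0001 | 0b1000_0000 = 0b1000_0001 = 129
    custom_vxlan_flags: 1
```

With the default of 255 (0b1111_1111), the OR leaves the value unchanged.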

#4.4.7 Overlay VLAN Header Trimming

Tags:

agent_restart ee_feature

FQCN:

outputs.npb.overlay_vlan_header_trimming

Upgrade from old version: static_config.ignore-overlay-vlan

Default value:

outputs:
  npb:
    overlay_vlan_header_trimming: false

Schema:

Key	Value
Type	bool

Description:

This configuration only ignores the VLAN header in the captured original packet and does not affect the configuration item extra_vlan_header (npb_vlan_mode in old versions).

#4.4.8 Maximum Tx Throughput

Tags:

hot_update ee_feature

FQCN:

outputs.npb.max_tx_throughput

Upgrade from old version: max_npb_bps

Default value:

outputs:
  npb:
    max_tx_throughput: 1000

Schema:

Key	Value
Type	int
Unit	Mbps
Range	[1, 100000]

Description:

Maximum traffic rate allowed for the NPB sender.

#4.5 Compression

#4.5.1 Application_Log

Tags:

agent_restart

FQCN:

outputs.compression.application_log

Default value:

outputs:
  compression:
    application_log: true

Schema:

Key	Value
Type	bool

Description:

Whether to compress the integrated application log data received by deepflow-agent. The compression ratio is about 5:1 to 20:1. Enabling this feature increases the CPU consumption of deepflow-agent.

#4.5.2 Pcap

Tags:

agent_restart

FQCN:

outputs.compression.pcap

Default value:

outputs:
  compression:
    pcap: true

Schema:

Key	Value
Type	bool

Description:

Whether to compress the captured pcap data received by deepflow-agent. The compression ratio is about 5:1 to 10:1. Enabling this feature increases the CPU consumption of deepflow-agent.

#4.5.3 Request Log

Tags:

agent_restart

FQCN:

outputs.compression.l7_flow_log

Default value:

outputs:
  compression:
    l7_flow_log: true

Schema:

Key	Value
Type	bool

Description:

Whether to compress the l7 flow log. The compression ratio is about 8:1. Turning on this feature will result in higher CPU consumption of deepflow-agent.

#4.5.4 Flow Log

Tags:

agent_restart

FQCN:

outputs.compression.l4_flow_log

Default value:

outputs:
  compression:
    l4_flow_log: false

Schema:

Key	Value
Type	bool

Description:

Whether to compress the l4 flow log.
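
Since l4_flow_log is the only compression switch that defaults to false, a sketch of turning it on is shown below. By analogy with the other compression options, this presumably costs additional CPU, and because the item is tagged agent_restart, it takes effect only after the agent restarts.

```yaml
outputs:
  compression:
    l4_flow_log: true  # default is false
```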

#5. Plugins

#5.1 Wasm Plugins

Tags:

hot_update

FQCN:

plugins.wasm_plugins

Upgrade from old version: wasm_plugins

Default value:

plugins:
  wasm_plugins: []

Enum options:

Value	Note
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

The Wasm plugins to be loaded by the agent.

#5.2 SO Plugins

Tags:

hot_update

FQCN:

plugins.so_plugins

Upgrade from old version: so_plugins

Default value:

plugins:
  so_plugins: []

Enum options:

Value	Note
DYNAMIC_OPTIONS	DYNAMIC_OPTIONS

Schema:

Key	Value
Type	string

Description:

The shared object (.so) plugins to be loaded by the agent. Each plugin is opened with dlopen using the RTLD_LOCAL and RTLD_LAZY flags, which means the .so file must resolve its own link dependencies.

#6. Dev

#6.1 Feature Flags

Tags:

agent_restart

FQCN:

dev.feature_flags

Upgrade from old version: static_config.feature-flags

Default value:

dev:
  feature_flags: []

Schema:

Key	Value
Type	string

Description:

Unreleased deepflow-agent features can be turned on by setting this switch.