v6.4 EE Release Notes

Created:2024-11-05 Last Modified:2024-11-05

This document was translated by ChatGPT

#1. Universal Map, Applications, Network, Infrastructure, Events

  • AutoMetrics
    • Support for parsing MongoDB protocol (Documentation)
    • Support for parsing TLS protocol, and obtaining information such as access domain, key algorithm, TLS version, certificate expiration time, etc.
    • Support for parsing all encrypted application protocols over TLS, not limited to HTTP
    • Enhanced ability to parse HTTP2 compressed headers, cBPF, eBPF kprobe data support compression restoration
    • MySQL, PostgreSQL, Redis protocol data support obfuscation, can be enabled through Agent's obfuscate-enabled-protocols
    • Support for parsing Oracle protocol
    • Optimized default parsing port for DNS traffic, added 5353, see Agent's l7-protocol-ports configuration item for details
    • In flow logs and network performance metrics, system latency (srt, srt_max) supports recording ICMP traffic latency and correcting ICMP traffic direction using ICMP Echo messages
    • Streamlined connection metrics in flow logs, removed redundant rtt_client_avg and rtt_server_avg
    • Support for parsing Geneve tunnel encapsulation (adapted to Kube-OVN)
  • AutoTracing
    • Support for extracting trace_id from SofaRPC Payload (Hessian encoding, TreeMap structure)
    • Support for extracting TraceID fields injected by SkyWalking and OpenTelemetry in Kafka messages
    • Support for parsing SkyWalking sw3 Header
  • AutoProfiling
    • Support for displaying a universal CPU flame graph of all processes on a server, down to the thread level
    • Generate Java process symbol tables in an interleaved manner with random delays to avoid high load due to clustering, support modifying Agent's java-symbol-file-refresh-defer-interval configuration item to adjust the base interval of delay
    • No need to generate symbol table files in the Pod where the Java process is located when enabling Profiling
  • AutoTagging
    • Support for synchronizing tag information of cloud servers in Tencent Cloud and Huawei Cloud, and resource set information of cloud servers in Alibaba Cloud
    • Support for synchronizing MAC addresses of NICs in bare metal servers in QingCloud private cloud through Agent
    • Support for synchronizing RDS and Redis resources in Huawei Cloud
    • Support for synchronizing Redis resources in Baidu Cloud
    • Support for injecting custom auto-grouping tags into all observability data, achieving the ability to automatically take the first non-empty tag and ignore subsequent tag columns by combining multiple tag fields
    • Added pod_group_type (K8s workload type) tag field to all observability data, expanded auto_service_type values to represent K8s workload types
    • Added coroutine ID fields syscall_coroutine_0, syscall_coroutine_1 to call logs
    • Support for extracting topic_name field from Kafka messages and assigning it to request_resource in call logs
    • Support for extracting endpoints from HTTP URLs and assigning them to call logs and application performance metrics data, support configuring http-endpoint-extraction extraction rules for Agent
    • Support for accurately setting Pod tags for all eBPF observability data of HostNetwork Pods
    • Added gprocess process information tags to continuous profiling data
    • Deprecated l7_protocol (application protocol) type Others, merged HTTP_TLS, HTTP2_TLS into HTTP and HTTP2, added is_tls in call logs to indicate whether it is encrypted traffic
  • Universal Map
    • Service list supports clicking to pop up a right-slide page to display associated data of the service
    • Service topology supports manual adjustment of node positions
    • Support for setting the metrics displayed in the service list and service topology
    • Right-slide page automatically associates infrastructure metrics data
    • Support for sharing business with other accounts
  • Network
    • Support for inserting double-layer VLAN (802.1Q / QinQ) in distributed traffic to express 24bit traffic tags
    • Support for specifying 12bit traffic tags for the inner and outer layers of QinQ in distribution policies
    • Support for associating and displaying all call logs from the flow log details page
    • Support for distributing unidirectional traffic only, such as distributing SQL request traffic only
  • Infrastructure
    • Default deployment of grafana-agent to collect infrastructure metrics of the K8s cluster where DeepFlow Server is located
  • Usability Improvements
    • Search Box
      • Support for custom search bar, providing two simple search modes by default: container search and process search
      • Added bidirectional path search mode, ignoring client and server directions
      • Removed "path filter" condition echo in the search bar, indicated by ICON status
      • Optimized performance of the dropdown input box, supporting prompts and filtering for tens of thousands of candidates
      • Support for setting the trigger behavior of the search box when the page is first loaded and during the input process of search conditions (automatic/manual)
      • Aligned search conditions on the continuous profiling page with other pages
      • Support for setting dynamic value ranges for template variables, such as supporting setting cloud server candidates through VPC
    • Universal Map
      • Support for batch defining paths in business definitions
      • Support for directly clicking to edit paths on the service topology
      • Topology graph in the right-slide page of the service topology supports marking nodes and connections in red based on thresholds
      • Usability improvements in the service topology editing page, standard operation capabilities aligned for Panel, topology display optimization
      • Optimized node positions when the service topology page is first loaded, centered and displayed in full view
      • Optimized business definition details page to display more information about the business
      • Optimized service topology operation experience
    • TCP Sequence Diagram
      • Optimized display style of special packets (SYN, FIN, RST)
      • Added date column to display human-readable time information
    • Right-slide Page
      • In the right-slide page, flow logs, call logs, and event tabs support quick filter box on the left
      • Added time control on the network path page
      • Overview graph supports modifying metrics
    • Distributed Tracing
      • Optimized the order of network Spans in the distributed tracing flame graph
      • Quick filter box on the left supports filtering by application protocol
      • Fixed the Tab page below the flame graph to avoid page jitter
    • Trend Analysis: Optimized the display of trend analysis graphs in flow logs, PCAP download, call logs, distributed tracing, events, and continuous profiling pages
    • Knowledge Graph: Optimized the display of deleted resource names in the knowledge graph: $name (deleted)
    • Screen Adaptation: Adjusted page chart sizes to optimize display effects on small screens
    • Optimized default columns displayed in the flow log table, added flow end type
    • Optimized column selection operations in tables to reduce mouse clicks
    • Resource names are copied without resource type names
    • Optimized display of the quick filter box on the left
    • Optimized page menu and topology graph styles

#2. Dashboard, Metrics, Alerts, Reports

  • Dashboard
    • Panel supports multiple query conditions
    • Support for creating Panel directly in the Dashboard
    • Predefined SQL, Redis, DNS, Ingress, Dubbo Dashboards
    • Panel supports setting colors
    • Overview graph supports setting colors, font sizes, adding hourly and daily synchronization display of metric values, and displaying historical trends of metric values in the background
    • Bar chart supports adjusting sorting order (ascending, descending)
    • Table columns are allocated by content by default
    • Support for switching the query area of all Panels in the entire Dashboard
  • Usability Improvements
    • Optimized layout of the Dashboard list page, added quick filter box on the left
    • Optimized default layout position when adding Panel
    • Fixed header row when scrolling down the Dashboard page
    • Optimized search condition editing area of Panel

#3. Resources, System

  • Integration
    • PromQL operators offloaded to ClickHouse, improving PromQL query performance
    • Server supports exporting metrics through Prometheus RemoteWrite protocol (thanks to chenjiandongx: PR (opens new window))
    • Support for extensible Exporter interface
  • Agent
    • eBPF functionality of Agent adapted to Linux 3.10 kernel (Detailed Documentation)
    • Support for collecting traffic in DPDK environment
    • Use TCP protocol to transmit Agent's own logs
    • Reduced Agent memory consumption by 60% in scenarios with a large number of new TCP flows
    • Optimized HTTP2 parsing performance, reducing CPU usage by 60%
  • Wasm
    • Wasm Plugin supports dynamic loading in Agent
  • Server
    • Optimized K8s Label, Annotation, Env synchronization mechanism, support for setting regular expression filters for interested tags, support for limiting the maximum length of tag values
    • Support for balancing data sending Server based on the amount of data sent by Agent, improving data volume balance of ClickHouse
    • ClickHouse uses Array(LowCardinality(String)) instead of Array(String) to optimize read and write performance of low cardinality fields, such as tag_names, metrics_names, etc.
    • DeepFlow's self-monitoring metrics are merged into one table deepflow_system.deepflow_system in ClickHouse
    • Simplified host list CIDR configuration on the Server side when synchronizing resource information through Agent
    • Modified Avg operator logic, Avg represents weighted average algorithm, AAvg represents arithmetic average algorithm
    • Added trace_id_index integer column in ClickHouse as an index column for trace_id field, and supports extracting Timestamp from it, accelerating TraceID search
    • Profiling data supports plaintext (non-compressed) storage in ClickHouse
    • ClickHouse upgraded to v23.8 (LTS)
    • Deprecated Telegraf component
  • API
    • Added Derivative pre-operator to SQL API, allowing calculation of differences for Prometheus Counter type metrics to calculate rates
    • Added TopK and Any operators to SQL API to get high-frequency or any values of specified Tags
    • Support for deleting cloud platforms by name
    • Optimized API performance of resource pages
  • CLI
    • Support for debugging eBPF Socket Data using deepflow-ctl
    • Added list command: deepflow-ctl domain additional-resource list --type <resource_type> --name <resource_name>
    • Provided deepflow-ctl for MacOS
  • Usability Improvements
    • Dropdown boxes on system resource pages support search filtering