v6.3 EE Release Notes

Created: 2024-07-05 Last Modified: 2024-07-05

This document was translated by ChatGPT

#1. Universal Map, Applications, Network, Infrastructure, Events

  • AutoMetrics
    • Support for parsing FastCGI protocol
    • Support for parsing Hang Seng T3 protocol
    • Dubbo protocol supports parsing event and serialization_id fields
    • Recognize RST disconnections in SLB health checks as normal behavior
    • Added endpoint field in the application aggregation metrics table
    • Support for collecting the statement-id of MySQL calls to correlate COM_STMT_PREPARE and COM_STMT_EXECUTE, thereby tracing SQL statements
  • AutoTracing
    • Support for parsing Tingyun's Tracing field X-Tingyun
    • Support for tracing calls before and after managed ALB/SLB services
    • Support for parsing TraceID in MySQL statements
    • Added allow_multiple_trace_ids_in_tracing_result configuration item to allow multiple TraceIDs in tracing results
    • Support for calling APM's Trace API to supplement tracing data
  • AutoTagging
    • Support for automatically associating K8s Annotations and Env tags
    • Added the ability to automatically associate K8s containers via PID, so that eBPF data from HostNetwork Pods can be tagged with container resource information
  • SQL
    • Added count operator to metrics to calculate the number of rows in raw data
  • Universal Map
    • Optimized service list
      • Only display performance metrics when the service is called
      • Support for switching signal sources via signal_source
      • Adjusted default displayed metrics and added bar charts to show relative sizes of metrics
    • Added service topology
      • Support for defining integrated business access topology for cloud and on-premises
  • Applications
    • Distributed Tracing
      • Adjusting the split ratio of the right-slide page no longer scales the flame graph
      • Support for tracing starting from network Span
      • Flame graph supports displaying network Span's collection NIC information
      • Left quick filter box supports quick filtering and switching of signal sources
    • Continuous Profiling
      • Support for linked display of tables and flame graphs
      • Optimized merging logic of Function Stack in flame graph
      • Compressed storage of Function Stack in ClickHouse
      • Support for eBPF collection of On-CPU Profile data for compiled (Golang/Rust, etc.) and interpreted (Java, etc.) languages
  • Infrastructure
    • Added host and container pages, displayed based on Prometheus metrics
  • Events
    • Split the events page into three pages: resource changes, file read/write, and alert practices
  • GUI
    • Enhanced snapshot search capabilities: support for sorting, condition copying, etc.
    • Search bar supports pasting Key: value search conditions
    • Search bar supports modifying operators of existing conditions
    • Simplified page search conditions and conditions carried by the right-slide page to improve usability
    • Comprehensive UI optimization of the page
    • Optimized left quick filter box: support for search filtering, displaying matched data volume, filtering metric value ranges, switching query areas, and switching data tables
    • Support for continuously opening multiple right-slide pages and jumping between different right-slide pages
    • Event data displayed in the right-slide page can be switched to view client-side or server-side events separately
    • Popup page for viewing database fields supports displaying table names
    • Support for viewing search conditions of the current Panel
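
The statement-id item under AutoMetrics can be illustrated with a short sketch. In the MySQL protocol, the server's reply to a prepare request carries a statement ID, and later execute requests carry only that ID rather than the SQL text. The class and method names below are hypothetical and are not taken from DeepFlow's code; this only shows the general idea of the correlation for a single connection.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StatementTracker:
    """Hypothetical per-connection map from statement ID to SQL text."""
    statements: Dict[int, str] = field(default_factory=dict)
    pending_sql: Optional[str] = None

    def on_prepare_request(self, sql: str) -> None:
        # The client sends the SQL text; the statement ID is not known yet.
        self.pending_sql = sql

    def on_prepare_response(self, statement_id: int) -> None:
        # The server assigns an ID to the SQL text that is still pending.
        if self.pending_sql is not None:
            self.statements[statement_id] = self.pending_sql
            self.pending_sql = None

    def on_execute(self, statement_id: int) -> Optional[str]:
        # An execute request carries only the ID; recover the SQL it refers to.
        return self.statements.get(statement_id)

    def on_close(self, statement_id: int) -> None:
        # Forget the statement once the client closes it.
        self.statements.pop(statement_id, None)


tracker = StatementTracker()
tracker.on_prepare_request("SELECT name FROM users WHERE id = ?")
tracker.on_prepare_response(1)
resolved = tracker.on_execute(1)  # the SQL text prepared above
```

Without such a mapping, an execute call can only be reported by its numeric ID, which is why collecting the statement-id makes the underlying SQL statement traceable.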

#2. Dashboard, Metrics, Alerts, Reports

  • Dashboard
    • Support for dragging to modify table size
    • Optimized display details of bar charts
    • Added new Panel type: overview map
    • Refactored Panel editing page
    • Optimized layout of Panel buttons
    • Merged line chart and Top line chart
    • Optimized display when template variable names are too long
    • Template variable list supports drag-and-drop sorting
  • Metrics
    • Support for inputting PromQL to query data
  • Alerts
    • Support for directly creating alert policies (no need to create a Dashboard)
    • HTTP push endpoints support using Tags to render push content
    • Email push titles support using variables
    • Optimized display of system alert events
  • GUI
    • Unified search condition input boxes for Panel editing page, metrics search page, and alert policy editing page
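
As a generic illustration of what rendering push content from Tags means, the sketch below fills placeholders in a message template with tag values using Python's standard `string.Template`. The `${...}` placeholder syntax, the tag names, and the `render_push_body` helper are all assumptions made for this example; the release notes do not describe the actual template syntax.

```python
from string import Template
from typing import Dict


def render_push_body(template_text: str, tags: Dict[str, str]) -> str:
    """Fill ${tag} placeholders with tag values (hypothetical helper)."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template_text).safe_substitute(tags)


body = render_push_body(
    "Alert on ${pod_name} in ${namespace}: ${message}",
    {"pod_name": "web-0", "namespace": "prod", "message": "latency high"},
)
```

Leaving unknown placeholders in place, rather than failing, keeps a push from being dropped when an alert lacks one of the expected tags.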

#3. Resources, System

  • Resources
    • Optimized display of Pod list, VPC list, availability zone list, and region list; added container node count and collector status columns
    • Support for synchronizing CloneSet and Advanced StatefulSet workloads in OpenKruise
    • Support for independently configuring synchronization intervals for different cloud platforms
    • Support for synchronizing IP addresses on loopback interfaces (usually VIPs)
  • Integration
    • Prometheus Integration
      • PromQL supports topk and bottomk functions
      • PromQL API supports RFC3339 time format
      • Support for obtaining HTTP Headers in RemoteWrite as additional Labels
      • Optimized storage performance of RemoteWrite, and query performance of RemoteRead and PromQL
    • OpenTelemetry Integration
      • Support for running without ClickHouse
  • Agent
    • Plugin
      • Added support for .so (shared object) plugins, with a C SDK provided
      • Wasm Demo: Parse error codes in HTTP Payload and reassign response_code and response_exception
      • Wasm Demo: Parse Protobuf messages in Payload
    • Changed the periodic reporting time of long flows from absolute 0 seconds (the start of each minute) to relative 0 seconds (every 60 seconds, counted from the flow's start time)
      • Advantage: Reduces the burst of flow logs sent at absolute 0 seconds, and avoids splitting flows with a lifecycle of less than 60 seconds into two flow logs
    • Configuration
      • Support for configuring CPU affinity and priority
      • Added kprobe-blacklist configuration item to set port number blacklist for eBPF data collection, avoiding collection loops
      • Added options to ignore specified statistics positions in flow logs (l4_log_ignore_tap_sides) and call logs (l7_log_ignore_tap_sides), reducing data collection volume
    • Adaptation
      • Support for running on DPDK host machines in Tencent TCE
      • Removed HostNetwork requirement for container collectors
      • Support for environments where the number of matching results for collection NICs (tap_interface_regex) exceeds 255
      • Support for running in business Pods in Sidecar mode
      • Support for deployment as a BlueKing Plugin
  • Server
    • Support for configuring system alert mailboxes on the page
    • Default value for automatic deletion time of abnormal controllers and data nodes set to 30 days
    • Unified storage of alert events in ClickHouse
    • Real-time push to Agent when resource information changes are detected
    • Support for disabling K8s cluster auto-discovery, allowing the cluster to be synchronized as a K8s cluster affiliated with a public cloud
    • Support for specifying (pinning) the Agent used for K8s resource information synchronization
    • Added deepflow identifier to all HostPath paths required for deployment
  • CLI
    • Released deepflow-ctl for macOS
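
The Agent change to periodic reporting of long flows (streams) can be made concrete with a small sketch. It assumes a 60-second reporting interval, as the remark about flows shorter than 60 seconds suggests, and it assumes a final log is always emitted when the flow ends. The function names are hypothetical, and times are plain seconds on an arbitrary clock.

```python
from typing import List

REPORT_INTERVAL = 60  # seconds; assumed from the "less than 60 seconds" remark


def absolute_report_times(start: int, end: int) -> List[int]:
    """Old scheme: report at each minute boundary crossed, then at flow end."""
    first_boundary = (start // REPORT_INTERVAL + 1) * REPORT_INTERVAL
    times = list(range(first_boundary, end, REPORT_INTERVAL))
    times.append(end)
    return times


def relative_report_times(start: int, end: int) -> List[int]:
    """New scheme: report every interval counted from flow start, then at end."""
    times = list(range(start + REPORT_INTERVAL, end, REPORT_INTERVAL))
    times.append(end)
    return times


# A 30-second flow that happens to cross a minute boundary (t=50 to t=80).
old = absolute_report_times(50, 80)  # [60, 80]: split into two logs
new = relative_report_times(50, 80)  # [80]: a single log
```

Under the old scheme every active flow reports at the same instant, which concentrates sending at the start of each minute; under the new scheme report times are spread according to when each flow began.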