feature: Expand ai-statistics plugins to enhance tracing capacity #1246

Merged
merged 4 commits into from
Aug 27, 2024

shalldid
Contributor

Ⅰ. Describe what this PR did

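Extends the statistics plugin with tracing capability. Tracing labels can be set dynamically from the configuration file, which makes trace data easier to correlate and more informative.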

Ⅱ. Does this pull request fix one issue?

fixes #1234

Ⅲ. Special notes for reviews

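Main changes:

  • Extend the AIStatisticsConfig struct with a new TracingLabel array for configuring tracing labels.
  • Add the onHttpRequestHeaders and onHttpResponseHeaders functions, which set tracing labels from request and response headers.
  • Adjust the parseConfig function to parse and store the TracingLabel configuration.
  • Implement the fetchTracingLabelValue function, which reads the value from the label's configured value source and sets the tracing label.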
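
As a rough illustration of the last item, the sketch below resolves each configured label from its value source. It is not the PR's code: the source names (`request_header`, `response_header`, `property`), the `requestData` type, and `resolveLabels` are hypothetical stand-ins, and plain maps replace the proxy calls the plugin would actually make.

```go
package main

import "fmt"

// TracingLabel mirrors the three fields that appear in the PR's config entries.
type TracingLabel struct {
	Key         string
	ValueSource string
	Value       string
}

// Hypothetical source names; the PR text only says that request headers,
// response headers, and properties can act as value sources.
const (
	sourceRequestHeader  = "request_header"
	sourceResponseHeader = "response_header"
	sourceProperty       = "property"
)

// requestData is a stand-in for values the plugin would read from the proxy.
type requestData struct {
	requestHeaders  map[string]string
	responseHeaders map[string]string
	properties      map[string]string
}

// resolveLabels returns one span tag per label whose source holds a value.
// Labels with an unknown source or a missing value are skipped.
func resolveLabels(labels []TracingLabel, data requestData) map[string]string {
	tags := make(map[string]string)
	for _, l := range labels {
		var src map[string]string
		switch l.ValueSource {
		case sourceRequestHeader:
			src = data.requestHeaders
		case sourceResponseHeader:
			src = data.responseHeaders
		case sourceProperty:
			src = data.properties
		default:
			continue
		}
		// Reading from a nil map is safe in Go and simply reports "not found".
		if v, ok := src[l.Value]; ok {
			tags[l.Key] = v
		}
	}
	return tags
}

func main() {
	labels := []TracingLabel{
		{Key: "model", ValueSource: sourceRequestHeader, Value: "x-model"},
		{Key: "input_tokens", ValueSource: sourceProperty, Value: "input_tokens"},
	}
	data := requestData{
		requestHeaders: map[string]string{"x-model": "demo"},
		properties:     map[string]string{"input_tokens": "12"},
	}
	fmt.Println(resolveLabels(labels, data))
}
```

Keeping the lookup in one function means a new value source would only need one more `case`.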

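…value sources (such as request headers, response headers, and properties) to set tracing labels. In addition, request and response header values are read during request processing and sent to the tracing system as labels.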

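This update makes it possible to set tracing labels dynamically from request and response header values, which makes trace data easier to correlate and more informative.
@shalldid shalldid changed the title from "Extend the statistics plugin with tracing capability" to "feature: Expand ai-statistics plugins to enhance tracing capacity" Aug 23, 2024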
config.Enable = configJson.Get("enable").Bool()

// Parse tracing label.
traceLabelConfigArray := configJson.Get("tracing_label").Array()
Collaborator

Rename the field to tracing_span.

ValueSource: traceLabel.Get("value_source").String(),
Value: traceLabel.Get("value").String(),
}
config.TracingLabel[traceLabel.ValueSource] = traceLabel
Collaborator

A single source type should support multiple key-value pairs. Why is this designed so that each type can hold only one pair?

This is an array. The configuration we designed can take the following format; it is not grouped by source at the top level:

  • key: "input_tokens"
    value_source: "property"
    value: "input_tokens"
  • key: "output_tokens"
    value_source: "property"
    value: "output_tokens"

Contributor Author

@shalldid shalldid Aug 26, 2024

@johnlanni This approach also lets a single source type support multiple key-value pairs.
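
To make the concern in this thread concrete: if entries were stored in a map keyed by ValueSource, as the reviewed line `config.TracingLabel[traceLabel.ValueSource] = traceLabel` appears to do, two entries with the same source would collide. The sketch below contrasts that with keeping every entry. The helper names `indexBySource` and `groupBySource` are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// TracingLabel mirrors the fields of one entry in the config array.
type TracingLabel struct {
	Key         string
	ValueSource string
	Value       string
}

// indexBySource keeps one slot per source, so a later entry with the same
// ValueSource silently overwrites an earlier one.
func indexBySource(labels []TracingLabel) map[string]TracingLabel {
	m := make(map[string]TracingLabel)
	for _, l := range labels {
		m[l.ValueSource] = l
	}
	return m
}

// groupBySource keeps every entry, so one source can carry many key-value pairs.
func groupBySource(labels []TracingLabel) map[string][]TracingLabel {
	m := make(map[string][]TracingLabel)
	for _, l := range labels {
		m[l.ValueSource] = append(m[l.ValueSource], l)
	}
	return m
}

func main() {
	labels := []TracingLabel{
		{Key: "input_tokens", ValueSource: "property", Value: "input_tokens"},
		{Key: "output_tokens", ValueSource: "property", Value: "output_tokens"},
	}
	fmt.Println(indexBySource(labels)["property"].Key)  // prints "output_tokens"
	fmt.Println(len(groupBySource(labels)["property"])) // prints 2
}
```

Iterating over the flat array directly, without any map, would preserve all entries just as well.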


// sets the input_tokens and output_tokens tags in the tracing span.
func setTracingTokenCostTag(config AIStatisticsConfig, inputToken int64, outputToken int64, log wrapper.Log) {
fetchTracingLabelValue(config, "input_tokens", nil, fmt.Sprintf("%d", inputToken), log)
Collaborator

Shouldn't input token and output token belong to the property type?

endTime, _ := strconv.ParseInt(endTimeStr, 10, 64)
totalTime := endTime - startTime

fetchTracingLabelValue(config, "total_time", nil, fmt.Sprintf("%d", totalTime), log)
Collaborator

total time is not a value source, so reusing this function for it looks odd.

ctx.DontReadResponseBody()
return types.ActionContinue
}
contentType, _ := proxywasm.GetHttpResponseHeader("content-type")
if !strings.Contains(contentType, "text/event-stream") {
ctx.BufferResponseBody()
}

// calculate total cost time and set tracing span tag.
startTimeStr, _ := proxywasm.GetHttpResponseHeader("req-arrive-time")
Collaborator

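Don't rely on this specific header to parse the time. These headers are specific to Higress, and this plugin should be fairly generic.
Record a timestamp in the onHttpRequestHeader phase, then record one at the first chunk and one at the last chunk in onHttpStreamingBody. That gives you the first-token response time and the full response time.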
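
A minimal sketch of the timing approach suggested here, assuming timestamps are millisecond integers kept in per-request state. The `timing` type and its methods are hypothetical. In a real plugin the values would be stored in the request context and taken from a clock such as `time.Now().UnixMilli()`.

```go
package main

import "fmt"

// timing holds the per-request timestamps, in milliseconds.
type timing struct {
	startMs       int64
	firstChunkMs  int64
	lastChunkMs   int64
	sawFirstChunk bool
}

// onRequestHeaders records when the request headers arrived.
func (t *timing) onRequestHeaders(nowMs int64) {
	t.startMs = nowMs
}

// onStreamingChunk records the first chunk once and the last chunk when
// the caller signals the end of the stream.
func (t *timing) onStreamingChunk(nowMs int64, isLast bool) {
	if !t.sawFirstChunk {
		t.firstChunkMs = nowMs
		t.sawFirstChunk = true
	}
	if isLast {
		t.lastChunkMs = nowMs
	}
}

// firstTokenMs is the delay between the request headers and the first chunk.
func (t *timing) firstTokenMs() int64 { return t.firstChunkMs - t.startMs }

// totalMs is the delay between the request headers and the last chunk.
func (t *timing) totalMs() int64 { return t.lastChunkMs - t.startMs }

func main() {
	var tm timing
	tm.onRequestHeaders(1000)
	tm.onStreamingChunk(1120, false)
	tm.onStreamingChunk(1300, false)
	tm.onStreamingChunk(1450, true)
	fmt.Println(tm.firstTokenMs(), tm.totalMs()) // prints "120 450"
}
```

Because nothing is read from a header, the same logic would work behind any gateway that delivers the streaming callbacks.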

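1. Refine the custom statistics config field: trace_label->trace_span
2. Improve the elapsed-time calculation: remove the dependency on the httpResponseHeader header definition and compute elapsed time from custom context attributes
3. Improve the code structure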
@shalldid
Contributor Author

@johnlanni I have made the suggested changes and pushed them.

@codecov-commenter

codecov-commenter commented Aug 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 44.32%. Comparing base (ef31e09) to head (c3ab551).
Report is 64 commits behind head on main.

@@            Coverage Diff             @@
##             main    #1246      +/-   ##
==========================================
+ Coverage   35.91%   44.32%   +8.41%     
==========================================
  Files          69       75       +6     
  Lines       11576     9821    -1755     
==========================================
+ Hits         4157     4353     +196     
+ Misses       7104     5140    -1964     
- Partials      315      328      +13     

see 90 files with indirect coverage changes

1. Update README.md
2. Fix setTracingSpanValueBySource() not validating the input parameter tracingSource and tracingSpanEle.ValueSource
Collaborator

@johnlanni johnlanni left a comment

LGTM

@rinfx rinfx merged commit 6701a86 into alibaba:main Aug 27, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

Higress AI observability: ai-statistics plugin extension design
5 participants