
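An incident caused by high Feign CPU usage

A record of how one application incident was analyzed, investigated, and resolved.

Background

On the morning of the 9th we received CPU alerts, and at the same time the business side reported that the interfaces of upstream services depending on this service were responding far too slowly.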

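Application alert - CPU usage: alert status change
[WARNING] Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-m47jz: Pod instance CPU request usage >= 90.000000%, current value 138.4971051199925%
Occurred at: 2024/10/09 11:17:33

Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-28pvs: Pod instance CPU request usage >= 90.000000%, current value 157.7076205766934%. Alert recovered.
Occurred at: 2024/10/09 11:06:33
Recovered at: 2024/10/09 12:24:33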

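Service traffic

Peak QPS per instance was around 100.

Why look at QPS? Because 100 QPS should not consume this much CPU, especially since neither the request nor the response bodies are large.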


Pod monitoring

Pod quota

  • CPU request: 2 cores; CPU limit: 3 cores
  • Memory request: 7 GiB; memory limit: 9 GiB
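
To put the QPS observation in numbers, here is a rough back-of-the-envelope estimate of CPU time per request. It is only a sketch: it assumes the alert percentage is measured against the 2-core CPU request, that the peak of roughly 100 QPS coincided with the alert, and that all CPU time went to request handling. The class and method names are made up for illustration.

```java
import java.util.Locale;

class CpuPerRequestEstimate {

    /** Cores in use, given the CPU request and the "usage relative to request" percentage. */
    static double coresUsed(double cpuRequestCores, double requestUsagePercent) {
        return cpuRequestCores * requestUsagePercent / 100.0;
    }

    /** CPU milliseconds spent per request if all used cores serve the given QPS. */
    static double cpuMillisPerRequest(double coresUsed, double qps) {
        return coresUsed * 1000.0 / qps;
    }

    public static void main(String[] args) {
        // Values taken from the alert (138.4971051199925%) and the Pod quota (2 cores).
        double cores = coresUsed(2.0, 138.4971051199925);
        // Assumption: about 100 QPS on this instance at the time of the alert.
        double millis = cpuMillisPerRequest(cores, 100.0);
        System.out.println(String.format(Locale.ROOT,
                "cores used: %.2f, CPU per request: %.1f ms", cores, millis));
    }
}
```

Under these assumptions each request costs about 27.7 ms of CPU time, which is hard to explain with small JSON payloads alone.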



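The monitoring charts show:

  • CPU load stayed high throughout
  • TCP connections and thread count rose steeply starting at 11:40

Arms

The Trace monitoring showed that most of the latency came from customer calling external interfaces through Feign.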

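Temporary fix

As a stopgap, we scaled out the instances and increased the CPU allocation.

Root cause analysis

The process of checking the third-party interface and the open platform gateway is skipped here; the conclusion was that neither of them had any problem.
We checked them first because the Trace suggested that the calls to the third-party interface were what took too long.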


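The Arms profile shows:

  • CPU time is concentrated in the Decoder and Encoder used by the Feign calls
  • Within both the Decoder and the Encoder, the time is concentrated in
    • HttpMessageConverters#getDefaultConverters() =>
    • WebMvcConfigurationSupport#addDefaultHttpMessageConverters =>
    • ... (see the excerpt below for the full call chain)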
feign.ReflectiveFeign$BuildTemplateByResolvingArgs.create(Object[]) (14.37%, 1.43 minutes)
feign.ReflectiveFeign$BuildEncodedTemplateFromArgs.resolve(Object[], RequestTemplate, Map) (14.37%, 1.43 minutes)
org.springframework.cloud.openfeign.support.SpringEncoder.encode(Object, Type, RequestTemplate) (14.28%, 1.42 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig$$Lambda$938.56729293.getObject() (13.98%, 1.39 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig.lambda$feignEncoder$2() (13.98%, 1.39 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(HttpMessageConverter[]) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(boolean, Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.getDefaultConverters() (12.02%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters$1.defaultMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.getMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.addDefaultHttpMessageConverters(List) (12.02%, 1
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.build() (5.93%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.configure(ObjectMapper) (5.91%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.registerWellKnownModulesIfAvailable(Map) (5.89%, 0.58 min
org.springframework.util.ClassUtils.forName(String, ClassLoader) (5.84%, 0.58 minutes)
java.lang.Class.forName(String, boolean, ClassLoader) (5.83%, 0.58 minutes)
java.lang.Class.forName0(String, boolean, ClassLoader, Class) (5.83%, 0.58 minutes)
......

Custom Encoder and Decoder

Encoder

Here is the Encoder in com.jiankunking.common.core.feign.FeignClientsConfig:

    // The lambda below is what SpringEncoder calls to obtain the converters.
    public Encoder feignEncoder() {
        ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(new RMappingJackson2HttpMessageConverter());
        return new SpringEncoder(objectFactory);
    }

    public class RMappingJackson2HttpMessageConverter extends MappingJackson2HttpMessageConverter {

        public RMappingJackson2HttpMessageConverter(ObjectMapper objectMapper) {
            super(objectMapper);
            // Accept JSON as well as text/html responses that actually carry JSON
            List<MediaType> mediaTypes = new ArrayList<>();
            mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
            mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
            setSupportedMediaTypes(mediaTypes);
        }

        RMappingJackson2HttpMessageConverter() {
            List<MediaType> mediaTypes = new ArrayList<>();
            mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
            mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
            setSupportedMediaTypes(mediaTypes);
        }
    }

Decoder

Here is the Decoder in com.jiankunking.common.core.feign.FeignClientsConfig:

    public Decoder feignDecoder() {
        HttpMessageConverter<?> jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
        // This constructor also adds all default converters each time the lambda runs
        ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(jacksonConverter);
        return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
    }

    public ObjectMapper customObjectMapper() {
        ObjectMapper objectMapper = new ObjectMapper();

        objectMapper.registerModule(new StringToDateModule());
        // Lenient parsing options for non-standard JSON
        objectMapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
        objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

        return objectMapper;
    }

Googling 'spring feign encode jackson cpu usage high' turned up:
=> https://segmentfault.com/a/1190000043037032
=> https://mp.weixin.qq.com/s/RuqltkN9VdVQ1K3GKuJ-Gw
=> https://meantobe.github.io/2019/12/21/ClassLoader/

Source code analysis

Here is the code of registerWellKnownModulesIfAvailable:


    @SuppressWarnings("unchecked")
    private void registerWellKnownModulesIfAvailable(Map<Object, Module> modulesToRegister) {
        try {
            Class<? extends Module> jdk8ModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.datatype.jdk8.Jdk8Module", this.moduleClassLoader);
            Module jdk8Module = BeanUtils.instantiateClass(jdk8ModuleClass);
            modulesToRegister.put(jdk8Module.getTypeId(), jdk8Module);
        }
        catch (ClassNotFoundException ex) {
            // jackson-datatype-jdk8 not available
        }

        try {
            Class<? extends Module> javaTimeModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.datatype.jsr310.JavaTimeModule", this.moduleClassLoader);
            Module javaTimeModule = BeanUtils.instantiateClass(javaTimeModuleClass);
            modulesToRegister.put(javaTimeModule.getTypeId(), javaTimeModule);
        }
        catch (ClassNotFoundException ex) {
            // jackson-datatype-jsr310 not available
        }

        // Joda-Time present?
        if (ClassUtils.isPresent("org.joda.time.LocalDate", this.moduleClassLoader)) {
            try {
                Class<? extends Module> jodaModuleClass = (Class<? extends Module>)
                        ClassUtils.forName("com.fasterxml.jackson.datatype.joda.JodaModule", this.moduleClassLoader);
                Module jodaModule = BeanUtils.instantiateClass(jodaModuleClass);
                modulesToRegister.put(jodaModule.getTypeId(), jodaModule);
            }
            catch (ClassNotFoundException ex) {
                // jackson-datatype-joda not available
            }
        }

        // Kotlin present?
        if (KotlinDetector.isKotlinPresent()) {
            try {
                Class<? extends Module> kotlinModuleClass = (Class<? extends Module>)
                        ClassUtils.forName("com.fasterxml.jackson.module.kotlin.KotlinModule", this.moduleClassLoader);
                Module kotlinModule = BeanUtils.instantiateClass(kotlinModuleClass);
                modulesToRegister.put(kotlinModule.getTypeId(), kotlinModule);
            }
            catch (ClassNotFoundException ex) {
                if (!kotlinWarningLogged) {
                    kotlinWarningLogged = true;
                    logger.warn("For Jackson Kotlin classes support please add " +
                            "\"com.fasterxml.jackson.module:jackson-module-kotlin\" to the classpath");
                }
            }
        }
    }

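As the code shows, if Joda-Time's LocalDate is on the classpath, Spring tries to load Jackson's corresponding JodaModule, and that lookup goes through the application's class loader (LaunchedURLClassLoader). In this project jackson-datatype-joda was not on the classpath, so the lookup ended in ClassNotFoundException; a failed lookup leaves nothing behind to reuse, so the whole search is repeated every time the converters are rebuilt.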

Why not suspect the jdk8ModuleClass and javaTimeModuleClass lookups? Because the common package already depends on the following two packages:

compile "com.fasterxml.jackson.datatype:jackson-datatype-jdk8:${v.jacksonDatatype}"
compile "com.fasterxml.jackson.datatype:jackson-datatype-jsr310:${v.jacksonDatatype}"
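
The difference between a missing module and a present one can be reproduced with the JDK alone. The sketch below is not Spring's code: CountingClassLoader and the class name demo.missing.NoSuchModule are hypothetical, and it only assumes the standard parent-delegation behavior of java.lang.ClassLoader. It counts how many times a lookup falls all the way through to findClass.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MissingClassLookupDemo {

    /** Class loader that counts how often a lookup falls through to findClass. */
    static class CountingClassLoader extends ClassLoader {
        final AtomicInteger findClassCalls = new AtomicInteger();

        CountingClassLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Reached only when neither findLoadedClass nor the parent could supply the class.
            findClassCalls.incrementAndGet();
            throw new ClassNotFoundException(name);
        }
    }

    /** Tries to load className the given number of times; returns the number of findClass calls. */
    static int fallThroughLookups(String className, int attempts) {
        CountingClassLoader loader =
                new CountingClassLoader(MissingClassLookupDemo.class.getClassLoader());
        for (int i = 0; i < attempts; i++) {
            try {
                Class.forName(className, false, loader);
            } catch (ClassNotFoundException ignored) {
                // Expected for the missing class; the failure is not remembered by the loader.
            }
        }
        return loader.findClassCalls.get();
    }

    public static void main(String[] args) {
        // Hypothetical class name that is not on the classpath
        System.out.println("missing class: " + fallThroughLookups("demo.missing.NoSuchModule", 100));
        // A class that exists is resolved by the parent loader
        System.out.println("present class: " + fallThroughLookups("java.lang.String", 100));
    }
}
```

A class that cannot be found falls through to findClass on every attempt, since there is nothing for findLoadedClass to return, whereas a class that exists is resolved before findClass is ever reached.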

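With that, the fix is clear.

Solution

Avoid repeated ClassLoader loading

Add the jackson-datatype-joda dependency to the project. Once the class has been loaded, later calls can obtain it through findLoadedClass, which cuts the resource cost of class loading.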

<dependency>
    <groupId>com.fasterxml.jackson.datatype</groupId>
    <artifactId>jackson-datatype-joda</artifactId>
    <version>x.x.x</version>
</dependency>

Avoid re-initializing HttpMessageConverters

    public Decoder feignDecoder() {
        HttpMessageConverter<?> jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
        // First argument false: do not add the default converters
        ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
        return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
    }


    public Encoder feignEncoder() {
        HttpMessageConverter<?> jacksonConverter = new RMappingJackson2HttpMessageConverter(customObjectMapper());
        // First argument false: do not add the default converters
        ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
        return new SpringEncoder(objectFactory);
    }
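
The profile shows the factory being called from SpringEncoder.encode, so whatever the factory constructs is constructed per request. The JDK-only sketch below illustrates that effect, with java.util.function.Supplier standing in for ObjectFactory and a hypothetical Converters class standing in for HttpMessageConverters. Building the object once and handing out the same instance is shown as an assumed alternative, not as what the code above does.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class ConverterReuseDemo {

    /** Hypothetical stand-in for HttpMessageConverters; records each construction. */
    static final class Converters {
        Converters(AtomicInteger constructions) {
            constructions.incrementAndGet();
        }
    }

    /** Simulates an encoder that asks the factory for converters on every request. */
    static void serve(Supplier<Converters> factory, int requests) {
        for (int i = 0; i < requests; i++) {
            factory.get();
        }
    }

    /** Factory that builds a new Converters each time it is called. */
    static int constructionsPerCall(int requests) {
        AtomicInteger counter = new AtomicInteger();
        serve(() -> new Converters(counter), requests);
        return counter.get();
    }

    /** Factory that hands out one instance built up front. */
    static int constructionsShared(int requests) {
        AtomicInteger counter = new AtomicInteger();
        Converters shared = new Converters(counter);
        serve(() -> shared, requests);
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("per call: " + constructionsPerCall(100)); // prints "per call: 100"
        System.out.println("shared: " + constructionsShared(100));    // prints "shared: 1"
    }
}
```

Whether sharing one instance is appropriate in a real Feign configuration depends on the converters being safe to reuse across threads, which this sketch does not check.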

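Summary

When you customize Feign's encoder or decoder and use SpringEncoder / SpringDecoder, avoid initializing HttpMessageConverters over and over. If you do not need the default HttpMessageConverter instances, pass false as the first argument when constructing HttpMessageConverters so that the defaults are not initialized at all.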
