1. Test environment
- Knative Serving v0.14.3, Istio 1.3, Kubernetes 1.16.15
- Uses the Knative Serving and Istio bundled with Kubeflow 1.2
2. Preparation
a. Adjust the log level to view concurrency metrics
- To see the autoscaling stable/panic concurrency metrics for a Knative service, the Autoscaler's log level must be changed to debug.
$ k edit cm config-logging -n knative-serving
apiVersion: v1
data:
  loglevel.autoscaler: debug
...
$ k rollout restart deployment autoscaler -n knative-serving
$
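- To confirm the new log level was applied, the ConfigMap value can be read back; a quick check (the jsonpath below escapes the dot in the key name):
$ k get cm config-logging -n knative-serving -o jsonpath='{.data.loglevel\.autoscaler}'
debug
$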
- Look up the revision of the test 'autoscale-go' Knative service. In this example it is autoscale-go-hnfpq.
$ k get ksvc -n yoosung-jeon
NAME URL LATESTCREATED LATESTREADY READY REASON
autoscale-go http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr autoscale-go-hnfpq autoscale-go-hnfpq True
bert-large-predictor-default http://bert-large-predictor-default.yoosung-jeon.kf-serv.acp.kt.co.kr bert-large-predictor-default-4phdz bert-large-predictor-default-4phdz True
bert-large-transformer-default http://bert-large-transformer-default.yoosung-jeon.kf-serv.acp.kt.co.kr bert-large-transformer-default-sm2dp bert-large-transformer-default-sm2dp True
flowers-sample-predictor-default http://flowers-sample-predictor-default.yoosung-jeon.kf-serv.acp.kt.co.kr flowers-sample-predictor-default-zwwvl flowers-sample-predictor-default-zwwvl True
helloworld-python http://helloworld-python.yoosung-jeon.kf-serv.acp.kt.co.kr helloworld-python-tsnqc False RevisionMissing
$
- Tail the autoscale-go-hnfpq logs from the Autoscaler Pod.
$ export REVISION=autoscale-go-hnfpq
$ k logs -n knative-serving -l app=autoscaler -f | grep ${REVISION}
{"level":"debug","ts":"2021-10-12T04:41:15",...,"msg":"DesiredStablePodCount = 0.000, DesiredPanicPodCount = 0.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T04:41:15",...,"msg":"Observed average scaling metric value: 0.000, targeting 7.000.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency","mode":"stable"}
{"level":"debug","ts":"2021-10-12T04:41:15",...,"msg":"Observed average scaling metric value: 0.000, targeting 7.000.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency","mode":"panic"}
{"level":"debug","ts":"2021-10-12T04:41:15",...,"msg":"Operating in stable mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T04:41:15",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=0 ObsPanicValue=0 TargetBC=200 ExcessBC=-190 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
b. Install hey
- hey is a tiny program that sends some load to a web application
https://github.com/rakyll/hey
$ wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
$ chmod 744 hey_linux_amd64 && mv hey_linux_amd64 ~/bin/hey
$ hey --help
…
-c Number of workers to run concurrently. Total number of requests cannot
be smaller than the concurrency level. Default is 50.
-z Duration of application to send requests. When duration is reached,
application stops and exits. If duration is specified, n is ignored.
…
$
3. Knative Autoscaling test
- Deploy the sample Knative service
✓ Uses the KPA (Knative Pod Autoscaler); the metric is concurrency (soft limit) with a target of 10, and the minimum number of Pods is 1.
$ git clone -b "release-0.26" https://github.com/knative/docs knative-docs
$ cd knative-docs
$ cat docs/serving/autoscaling/autoscale-go/service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: yoosung-jeon
spec:
  template:
    metadata:
      annotations:
        # Target 10 in-flight-requests per pod.
        autoscaling.knative.dev/target: "10"
        autoscaling.knative.dev/minScale: "1"
    spec:
      containers:
      - image: gcr.io/knative-samples/autoscale-go:0.1
$ k apply -f docs/serving/autoscaling/autoscale-go/service.yaml
$ k get pod -n yoosung-jeon -l serving.knative.dev/service=autoscale-go
NAME READY STATUS RESTARTS AGE
autoscale-go-hm69r-deployment-7956b95556-zwf4j 2/2 Running 0 14d
$
✓ The Knative custom domain was changed from 'example.com' to 'kf-serv.acp.kt.co.kr', so the URL is as follows.
$ k get ksvc autoscale-go -n yoosung-jeon
NAME URL LATESTCREATED LATESTREADY READY REASON
autoscale-go http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr autoscale-go-hm69r autoscale-go-hm69r True
$
- Make a request to the autoscale app to see it consume some resources
If the custom domain (e.g. *.kf-serv.acp.kt.co.kr) is not registered in DNS, the Host header must be added to the request:
curl -H "Host: autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr" http://${IP_Address}
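The ingress IP can be looked up from the Istio ingress gateway Service, for example (this assumes the gateway is exposed as a LoadBalancer Service in the istio-system namespace; a NodePort setup would use a node IP and port instead):
$ export IP_Address=$(k get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')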
$ curl "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=100&prime=10000&bloat=5"
Allocated 5 Mb of memory.
The largest prime less than 10000 is 9973.
Slept for 100.17 milliseconds.
$
- Stress test: Send 30 seconds of traffic maintaining 50 in-flight requests
$ hey -z 30s -c 50 "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=100&prime=10000&bloat=5"
Summary:
Total: 30.1095 secs
Slowest: 0.2362 secs
Fastest: 0.1077 secs
Average: 0.1164 secs
Requests/sec: 428.9671
Total data: 1291505 bytes
Size/request: 99 bytes
Response time histogram:
0.108 [1] |
0.121 [10736] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.133 [1905] |■■■■■■■
0.146 [182] |■
0.159 [35] |
0.172 [17] |
0.185 [4] |
0.198 [7] |
0.211 [0] |
0.223 [13] |
0.236 [16] |
Latency distribution:
10% in 0.1109 secs
25% in 0.1121 secs
50% in 0.1143 secs
75% in 0.1182 secs
90% in 0.1237 secs
95% in 0.1275 secs
99% in 0.1402 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0001 secs, 0.1077 secs, 0.2362 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0030 secs
req write: 0.0000 secs, 0.0000 secs, 0.0003 secs
resp wait: 0.1161 secs, 0.1076 secs, 0.2165 secs
resp read: 0.0001 secs, 0.0000 secs, 0.0050 secs
Status code distribution:
[200] 12916 responses
$
- Autoscaling test results
✓ As load on the Knative service increases, the autoscale-go-hm69r-deployment deployment scales out automatically (more Pods).
autoscaling.knative.dev/target: "10"
✓ When no requests arrive at the Knative service for a period of time, it automatically scales back down to 1 (fewer Pods).
autoscaling.knative.dev/minScale: "1"
scale-to-zero-grace-period: "30s" ⇠ set in the config-autoscaler ConfigMap
$ k get deployments.apps autoscale-go-hm69r-deployment -n yoosung-jeon
NAME READY UP-TO-DATE AVAILABLE AGE
autoscale-go-hm69r-deployment 7/7 7 7 39d
$ k get pod -n yoosung-jeon -l serving.knative.dev/service=autoscale-go
NAME READY STATUS RESTARTS AGE
autoscale-go-hm69r-deployment-7956b95556-4fbwj 2/2 Running 0 18s
autoscale-go-hm69r-deployment-7956b95556-6bdpl 2/2 Running 0 18s
autoscale-go-hm69r-deployment-7956b95556-8klcr 2/2 Running 0 12s
autoscale-go-hm69r-deployment-7956b95556-hw5j5 2/2 Running 0 20s
autoscale-go-hm69r-deployment-7956b95556-s2j9s 2/2 Running 0 20s
autoscale-go-hm69r-deployment-7956b95556-zh7jk 2/2 Running 0 20s
autoscale-go-hm69r-deployment-7956b95556-zwf4j 2/2 Running 0 14d
...<after some time passes>
$ k get deployments.apps autoscale-go-hm69r-deployment -n yoosung-jeon
NAME READY UP-TO-DATE AVAILABLE AGE
autoscale-go-hm69r-deployment 1/1 1 1 39d
$ k get configmap config-autoscaler -n knative-serving -o yaml | grep scale-to-zero-grace-period | head -n 1
scale-to-zero-grace-period: "30s"
$
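- If a different idle period is needed, the same ConfigMap can be patched; a minimal sketch (the 60s value is only an illustration):
$ k patch configmap config-autoscaler -n knative-serving --type merge -p '{"data":{"scale-to-zero-grace-period":"60s"}}'
$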
4. Understanding the concurrency target
- The meaning of the concurrency values in the stable and panic windows is defined as follows.
✓ ObsStableValue: the average number of requests received by a pod within a stable window of 60 seconds.
✓ ObsPanicValue: the average number of requests received by a pod within a panic window of 6 seconds.
✓ The measured concurrency is compared against the configured target (autoscaling.knative.dev/target) to decide whether the Pods should be autoscaled (see the sketch after the diagram below).
                                                       |
                                  Panic Target--->  +--| 20
                                                    |  |
                                                    | <------Panic Window
                                                    |  |
       Stable Target--->  +-------------------------|--| 10   CONCURRENCY
                          |                         |  |
                          |                 <-----------Stable Window
                          |                         |  |
--------------------------+-------------------------+--+ 0
120                       60                           0
                               TIME
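- As a rough sketch of that comparison (not the exact Autoscaler code): the desired Pod count is roughly the observed concurrency divided by the effective per-pod target, rounded up. With the default container-concurrency-target-percentage of 70%, a target of 10 becomes an effective target of 7, which is why the logs above show "targeting 7.000". Using an ObsPanicValue taken from the section 5 logs:
$ awk 'BEGIN {
    target = 10 * 0.7                            # effective per-pod target: 10 * 70% = 7
    obs    = 19.999999                           # ObsPanicValue seen in the section 5 logs
    pods   = obs / target
    print int(pods) + (pods > int(pods) ? 1 : 0) # ceil -> 3 Pods, matching DesiredPanicPodCount = 3.000
  }'
3
$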
- Test conditions
✓ KPA settings
autoscaling.knative.dev/target: "10"
autoscaling.knative.dev/minScale: "1"
✓ Load conditions
5 concurrent requests for 120 seconds; the service response time is 100 ms in run 1 and 1,000 ms in run 2.
Run 1: hey -z 120s -c 5 "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=100&prime=10000&bloat=5"
Run 2: hey -z 120s -c 5 "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=1000&prime=10000&bloat=5"
- Test results (graph)
✓ ObsStableValue and ObsPanicValue are computed every 2 seconds, as defined above.
✓ Because the panic concurrency reacts sooner than the stable concurrency, autoscaling can respond quickly to a sudden burst of load (see section 5).
✓ The Knative service's response time did not affect the measured concurrency (hey keeps the number of in-flight requests constant).
- Test results (log)
The graph above was drawn manually from the Autoscaler debug logs (a CSV-extraction sketch follows the log excerpt below).
$ k logs -n knative-serving -l app=autoscaler | grep autoscale-go-hnfpq | grep ObsStableValue | awk -F',' '{print $2, $5}'
...
"ts":"2021-10-12T07:07:05.296Z" "msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=3.65 ObsPanicValue=4.666666 TargetBC=200 ExcessBC=-194 NumActivators=3"
"ts":"2021-10-12T07:07:07.296Z" "msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=3.816666 ObsPanicValue=4.833333 TargetBC=200 ExcessBC=-194 NumActivators=3"
"ts":"2021-10-12T07:07:09.296Z" "msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=3.966666 ObsPanicValue=4.833333 TargetBC=200 ExcessBC=-194 NumActivators=3"
"ts":"2021-10-12T07:07:11.406Z" "msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=4.099999 ObsPanicValue=4.5 TargetBC=200 ExcessBC=-195 NumActivators=3"
...
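The same values can be exported to CSV for plotting; a minimal sketch, assuming the log line format shown above:
$ k logs -n knative-serving -l app=autoscaler | grep ${REVISION} | grep ObsStableValue \
    | sed -E 's/.*"ts":"([^"]+)".*ObsStableValue=([0-9.e+-]+) ObsPanicValue=([0-9.e+-]+).*/\1,\2,\3/' > concurrency.csv
$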
- Rather than relying on the Autoscaler logs, a monitoring environment for Knative should be built with Prometheus and Grafana.
5. Panic mode transition test
- KPA provides two auto scaling modes: stable and panic.
✓ KPA calculates the average number of concurrent requests per pod within a stable window of 60 seconds.
✓ If the number of concurrent requests reaches twice the concurrency target, KPA switches to the panic mode (the threshold and window are configurable; see below).
✓ In panic mode, KPA scales pods within a shorter time window than in stable mode.
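- The panic threshold (200% of the target by default) and the panic window (10% of the stable window by default) come from the config-autoscaler ConfigMap; depending on the installation they may only appear in its commented _example block:
$ k get configmap config-autoscaler -n knative-serving -o yaml | egrep "panic-threshold-percentage|panic-window-percentage"
...
$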
- Create the Knative service used for the test.
Uses the KPA (Knative Pod Autoscaler); the metric is concurrency (soft limit) with a target of 10, and the minimum number of Pods is 1.
$ vi autoscale-go.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: yoosung-jeon
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/target: "10"
    spec:
      containerConcurrency: 0
      containers:
      - image: gcr.io/knative-samples/autoscale-go:0.1
$ k apply -f autoscale-go.yaml
$
- Generate 100 concurrent requests a single time.
$ cat stress.sh
#!/bin/bash
for ((i=1;i<=100;i++))
do
curl "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=1000&prime=10000&bloat=5" &
done
$ ./stress.sh
- The number of Pods scales up and down automatically with the load.
$ watch -n 1 kubectl get pod -l app=autoscale-go-hnfpq -n yoosung-jeon
Every 1.0s: kubectl get pod -l app=autoscale-go-hnfpq -n yoosung-jeon ysjeon-Dev.local: Tue Oct 12 14:13:00 2021
NAME READY STATUS RESTARTS AGE
autoscale-go-hnfpq-deployment-74c9fc5454-6vppp 2/2 Running 0 45s
autoscale-go-hnfpq-deployment-74c9fc5454-sgvc4 2/2 Running 0 178m
autoscale-go-hnfpq-deployment-74c9fc5454-t8mfn 2/2 Running 0 45s
- The Autoscaler logs show whether panic mode is active and the concurrency (concurrent requests per Pod) in stable/panic mode.
✓ Panic mode period (05:15:05 ~ 05:16:07)
"2021-10-12T05:15:05.296Z",...,"Operating in panic mode.",...
"2021-10-12T05:16:07.296Z",...,"Un-panicking.",...
✓ Pod count change during panic mode
The log at 2021-10-12T05:15:07 shows DesiredPanicPodCount rising from 0 to 3, and the number of Pods increased accordingly.
{...,"2021-10-12T05:15:07",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 3.000,...",...
At that point the stable concurrency is 1.69 and the panic concurrency is 19.99.
{...,"2021-10-12T05:15:07",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=1.694915 ObsPanicValue=19.999999 ...",...
$ export REVISION=autoscale-go-hnfpq
$ k logs -n knative-serving -l app=autoscaler -f | grep ${REVISION} | egrep "ObsStableValue|DesiredStablePodCount|Operating in panic mode|Un-panicking"
...
{"level":"debug","ts":"2021-10-12T05:15:03",...,"msg":"DesiredStablePodCount = 0.000, DesiredPanicPodCount = 0.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:03",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=0 ObsPanicValue=0 TargetBC=200 ExcessBC=-190 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:03",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 1.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:03",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=0.016666 ObsPanicValue=0.166666 TargetBC=200 ExcessBC=-191 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:05",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 3.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:05",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:05",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=1.666666 ObsPanicValue=16.666666 TargetBC=200 ExcessBC=-192 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:07",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 3.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:07",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:07",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=1.694915 ObsPanicValue=19.999999 TargetBC=200 ExcessBC=-192 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:09",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 3.000, ReadyEndpointCount = 1, MaxScaleUp = 1000.000, MaxScaleDown = 0.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:09",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:09",...,"msg":"PodCount=1 Total1PodCapacity=10 ObsStableValue=1.666666 ObsPanicValue=16.499999 TargetBC=200 ExcessBC=-192 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:11",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = 0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:11",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:11",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=1.724137 ObsPanicValue=0 TargetBC=200 ExcessBC=-172 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:13",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = -0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:13",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:13",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=1.694915 ObsPanicValue=-1e-06 TargetBC=200 ExcessBC=-172 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:15",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = -0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:15",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:15:15",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=1.694915 ObsPanicValue=-1e-06 TargetBC=200 ExcessBC=-172 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
...
{"level":"debug","ts":"2021-10-12T05:16:03",...,"msg":"DesiredStablePodCount = 1.000, DesiredPanicPodCount = -0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:03",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:03",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=1.677966 ObsPanicValue=-1e-06 TargetBC=200 ExcessBC=-172 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:05",...,"msg":"DesiredStablePodCount = 0.000, DesiredPanicPodCount = -0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:05",...,"msg":"Operating in panic mode.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:05",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=0 ObsPanicValue=-1e-06 TargetBC=200 ExcessBC=-170 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:07",...,"msg":"DesiredStablePodCount = -0.000, DesiredPanicPodCount = -0.000, ReadyEndpointCount = 3, MaxScaleUp = 3000.000, MaxScaleDown = 1.000","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"info", "ts":"2021-10-12T05:16:07",...,"msg":"Un-panicking.","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
{"level":"debug","ts":"2021-10-12T05:16:07",...,"msg":"PodCount=3 Total1PodCapacity=10 ObsStableValue=-1e-06 ObsPanicValue=-1e-06 TargetBC=200 ExcessBC=-170 NumActivators=3","commit":"bcda051","knative.dev/key":"yoosung-jeon/autoscale-go-hnfpq","metric":"concurrency"}
...
6. Tests by concurrency type (hard limit vs. soft limit)
- Metric for KPA
✓ The default KPA Autoscaler supports the concurrency and rps (request per second) metrics.
a. concurrency
▷ concurrency type
i. Soft limit (autoscaling.knative.dev/target)
The soft limit is a targeted limit rather than a strictly enforced bound. In some situations, particularly if there is a sudden burst of requests, this value can be exceeded.
ii. Hard limit (containerConcurrency)
The hard limit is an enforced upper bound. If concurrency reaches the hard limit, surplus requests will be buffered and must wait until enough capacity is free to execute the requests.
Using a hard limit configuration is only recommended if there is a clear use case for it with your application.
Having a low hard limit specified may have a negative impact on the throughput and latency of an application, and may cause additional cold starts.
- This test was run to understand the characteristics of each concurrency type.
✓ Hard limit: when more requests arrive than the concurrency limit, each Pod is handed at most 'concurrency' requests and the surplus is buffered.
✓ Soft limit: when more requests arrive than the concurrency target, requests beyond the target can still be delivered to and processed by the Pod.
a. Concurrency (hard limit)
- Knative service configuration
The hard limit for concurrency is set with the containerConcurrency key in the spec.
$ k edit ksvc -n yoosung-jeon autoscale-go
...
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/minScale: "1"
      creationTimestamp: null
    spec:
      containerConcurrency: 10
...
$
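- The applied value can be verified on the Knative service spec, for example:
$ k get ksvc autoscale-go -n yoosung-jeon -o jsonpath='{.spec.template.spec.containerConcurrency}'
10
$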
- List the Pods of the 'autoscale-go' Knative service
$ k get pod -n yoosung-jeon -l app=autoscale-go-6qn5q -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
autoscale-go-6qn5q-deployment-fb67ff5bd-5qzbj 2/2 Running 0 41m 10.244.3.126 iap11 <none> <none>
$
- How the number of concurrent requests per Pod was measured
✓ The autoscale-go application container does not ship a shell, so it cannot be entered with 'k exec {pod-name} -it -- sh'.
✓ Instead, log in directly to the node running the container and count the TCP sessions of the autoscale-go process to determine the number of concurrent requests.
$ k exec autoscale-go-6qn5q-deployment-fb67ff5bd-5qzbj -n yoosung-jeon -it -c user-container -- sh
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown
command terminated with exit code 126
$
## Get the Docker container ID
$ k describe pod -n yoosung-jeon -l app=autoscale-go-6qn5q | grep "user-container:" -A1
user-container:
Container ID: docker://03b18d7f780b179effe0ea0e03c6373a62505c182b6aa59ef8dd2e8750d8d82f
$
## On the node running the Docker container, count the application's TCP sessions
[iap@iap01 ~]$ iap11
Last login: Thu Oct 7 10:17:46 2021 from iap01
[root@iap11 ~]# ps -ef | grep 03b18d7f780b179effe0ea0e03c6373a62505c182b6aa59ef8dd2e8750d8d82f | grep -v grep
root 8966 2930 0 09:44 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/03b18d7f780b179effe0ea0e03c6373a62505c182b6aa59ef8dd2e8750d8d82f -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-nvidia
[root@iap11 ~]# ps -ef | grep 8966 | grep -v grep
root 8966 2930 0 09:44 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/03b18d7f780b179effe0ea0e03c6373a62505c182b6aa59ef8dd2e8750d8d82f -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-nvidia
root 9013 8966 0 09:44 ? 00:00:03 /sample
[root@iap11 ~]# lsof -p 9013 | grep sock
sample 9013 root 3u sock 0,7 0t0 2971741143 protocol: TCPv6
[root@iap11 ~]# lsof -p 9013 | grep sock | wc -l | awk '{print $1-1}'
0
[root@iap11 ~]#
- Generate load
Concurrent requests: 13, response time per request: 5,000 ms, Pod count: 1
#!/bin/bash
for ((i=1;i<=13;i++))
do
curl "http://autoscale-go.yoosung-jeon.kf-serv.acp.kt.co.kr?sleep=5000&prime=10000&bloat=5" -A "" &
done
- Test result
At most 10 requests at a time were delivered to the single Pod.
[root@iap11 ~]# while true; do echo `date` : `lsof -p 9013 | grep sock | wc -l | awk '{print $1-1}'`; sleep 1; done
Thu Oct 7 14:07:14 KST 2021 : 0
Thu Oct 7 14:07:15 KST 2021 : 10
Thu Oct 7 14:07:17 KST 2021 : 10
Thu Oct 7 14:07:18 KST 2021 : 10
Thu Oct 7 14:07:19 KST 2021 : 10
Thu Oct 7 14:07:20 KST 2021 : 10
Thu Oct 7 14:07:22 KST 2021 : 10
Thu Oct 7 14:07:23 KST 2021 : 10
Thu Oct 7 14:07:24 KST 2021 : 10
Thu Oct 7 14:07:25 KST 2021 : 3
Thu Oct 7 14:07:26 KST 2021 : 3
Thu Oct 7 14:07:28 KST 2021 : 3
Thu Oct 7 14:07:29 KST 2021 : 3
Thu Oct 7 14:07:30 KST 2021 : 3
Thu Oct 7 14:07:31 KST 2021 : 0
b. Concurrency (soft limit)
- Knative service configuration
The soft limit for concurrency is set with the annotation key autoscaling.knative.dev/target.
$ k edit ksvc -n yoosung-jeon autoscale-go
...
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: "10"
        autoscaling.knative.dev/minScale: "1"
      creationTimestamp: null
    spec:
      containerConcurrency: 0
...
$
- Generate load
Concurrent requests: 33, response time per request: 5,000 ms, Pod count: 1
- Test result
Although the concurrency target was set to 10, all 33 requests were delivered to and processed by the single Pod. The panic mode condition was also met, so three new Pods were started, as shown below.
Thu Oct 7 10:56:01 KST 2021 : 0
Thu Oct 7 10:56:02 KST 2021 : 0
Thu Oct 7 10:56:03 KST 2021 : 33
Thu Oct 7 10:56:04 KST 2021 : 33
Thu Oct 7 10:56:05 KST 2021 : 33
Thu Oct 7 10:56:07 KST 2021 : 33
Thu Oct 7 10:56:08 KST 2021 : 33
Thu Oct 7 10:56:09 KST 2021 : 33
Thu Oct 7 10:56:10 KST 2021 : 33
Thu Oct 7 10:56:11 KST 2021 : 33
Thu Oct 7 10:56:13 KST 2021 : 0
Thu Oct 7 10:56:14 KST 2021 : 0
...
Every 1.0s: kubectl get pod -n yoosung-jeon | grep autoscale-go ysjeon-Dev.local: Thu Oct 7 10:53:45 2021
autoscale-go-ql4mq-deployment-6559787f76-2nvbx 2/2 Running 0 22s
autoscale-go-ql4mq-deployment-6559787f76-mxjr9 2/2 Running 0 22s
autoscale-go-ql4mq-deployment-6559787f76-rpf6k 2/2 Running 0 4m20s
autoscale-go-ql4mq-deployment-6559787f76-v65q6 2/2 Running 0 22s