support k8s demo (Imagenet) #716

Open · wants to merge 6 commits into base: develop
10 changes: 10 additions & 0 deletions python/examples/imagenet/README.md
@@ -47,3 +47,13 @@ client sends inference request
python resnet50_rpc_client.py ResNet50_vd_client_config/serving_client_conf.prototxt
```
*the server port in this example is 9696

### Launch Paddle Serving on Kubernetes

Paddle Serving supports deployment on Kubernetes (K8S) clusters. `imagenet_k8s_rpc.yaml` defines Serving as a K8S Deployment and Service, so users can run Serving in containers and expose the service inside or outside the cluster.

We strongly recommend a [Baidu Cloud CCE cluster](https://cloud.baidu.com/search.html?q=CCE)

```
kubectl apply -f imagenet_k8s_rpc.yaml
```
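
After applying, a quick way to confirm the Deployment is up and to reach the service for a local smoke test (the resource names below come from `imagenet_k8s_rpc.yaml`):

```
kubectl get pods -l app=paddleserving
kubectl get service paddleserving
# forward the Serving port to the local machine for a quick test
kubectl port-forward service/paddleserving 9696:9696
```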
10 changes: 10 additions & 0 deletions python/examples/imagenet/README_CN.md
@@ -47,3 +47,13 @@ the client runs prediction
python resnet50_rpc_client.py ResNet50_vd_client_config/serving_client_conf.prototxt
```
*the server port in this example is 9696

### Launching on K8S

You can also launch with K8S. `imagenet_k8s_rpc.yaml` defines Serving's Deployment and Service; users can start the Deployment on a K8S cluster, expose the service through the Service, and build further development on top of this.

We recommend Baidu Cloud's [CCE (K8S) cluster](https://cloud.baidu.com/search.html?q=CCE)

```
kubectl apply -f imagenet_k8s_rpc.yaml
```
55 changes: 55 additions & 0 deletions python/examples/imagenet/imagenet_k8s_rpc.yaml
@@ -0,0 +1,55 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: paddleserving
  labels:
    app: paddleserving
spec:
  replicas: 1
  # apps/v1 (apps/v1beta1 is removed in Kubernetes 1.16+) requires an explicit selector
  selector:
    matchLabels:
      app: paddleserving
  template:
    metadata:
      name: paddleserving
      labels:
        app: paddleserving
    spec:
      containers:
      - name: paddleserving
        image: hub.baidubce.com/paddlepaddle/serving:latest
        imagePullPolicy: Always
        workingDir: /
        command: ['/bin/bash', '-c']
        # install Serving, fetch the ResNet50 model, and start the RPC server on 9696;
        # a plain folded scalar avoids stray backslashes ending up in the shell string
        args:
          - pip install -U paddle-serving-server paddle-serving-client paddle-serving-app &&
            python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet &&
            tar xf resnet_v2_50_imagenet.tar.gz &&
            python -m paddle_serving_server.serve --model resnet_v2_50_imagenet_model/ --port 9696
        ports:
        - containerPort: 9696
          name: serving

---

apiVersion: v1
kind: Service
metadata:
  name: paddleserving
spec:
  ports:
  - name: paddleserving
    port: 9696
    targetPort: 9696
  selector:
    app: paddleserving
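
Note that the Service above sets no `type`, so it defaults to `ClusterIP` and is reachable only inside the cluster. For access from outside, one option (a sketch, not part of this PR) is to switch it to `NodePort`:

```
# patch the existing Service to NodePort (assumes the manifest above is applied)
kubectl patch service paddleserving -p '{"spec": {"type": "NodePort"}}'
# the node port assigned by Kubernetes appears in the PORT(S) column
kubectl get service paddleserving
```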
107 changes: 107 additions & 0 deletions python/examples/pipeline/faster_rcnn/benchmark.py
@@ -0,0 +1,107 @@
import sys
import os
import yaml
import requests
import time
import json
import base64
try:
    from paddle_serving_server_gpu.pipeline import PipelineClient
except ImportError:
    from paddle_serving_server.pipeline import PipelineClient
import numpy as np
from paddle_serving_client.utils import MultiThreadRunner
from paddle_serving_client.utils import benchmark_args, show_latency


def cv2_to_base64(image):
    return base64.b64encode(image).decode('utf8')


def parse_benchmark(filein, fileout):
    # strip the per-call entries from the tracer dump, keeping only the summary
    with open(filein, "r") as fin:
        res = yaml.safe_load(fin)
    del_list = []
    for key in res["DAG"].keys():
        if "call" in key:
            del_list.append(key)
    for key in del_list:
        del res["DAG"][key]
    with open(fileout, "w") as fout:
        yaml.dump(res, fout, default_flow_style=False)


def gen_yml(device):
    # rewrite config.yml into config2.yml, enabling the tracer and (optionally) GPU;
    # the op key is faster_rcnn, matching config.yml (not bert, as originally copied)
    with open("config.yml", "r") as fin:
        config = yaml.safe_load(fin)
    config["dag"]["tracer"] = {"interval_s": 10}
    if device == "gpu":
        config["op"]["faster_rcnn"]["local_service_conf"]["device_type"] = 1
        config["op"]["faster_rcnn"]["local_service_conf"]["devices"] = "2"
    with open("config2.yml", "w") as fout:
        yaml.dump(config, fout, default_flow_style=False)


def run_http(idx, batch_size):
    print("start thread ({})".format(idx))
    url = "http://127.0.0.1:18082/faster_rcnn/prediction"
    with open(os.path.join(".", "000000570688.jpg"), 'rb') as file:
        image_data1 = file.read()
    image = cv2_to_base64(image_data1)

    start = time.time()
    for i in range(10):
        data = {"key": [], "value": []}
        for j in range(batch_size):
            data["key"].append("image_" + str(j))
            data["value"].append(image)
        r = requests.post(url=url, data=json.dumps(data))
    print("done")
    end = time.time()
    return [[end - start]]


def multithread_http(thread, batch_size):
    multi_thread_runner = MultiThreadRunner()
    result = multi_thread_runner.run(run_http, thread, batch_size)


def run_rpc(idx, batch_size):
    client = PipelineClient()
    client.connect(['127.0.0.1:9998'])
    # NOTE: feeds text lines from data-c.txt, carried over from the bert
    # pipeline benchmark; for faster_rcnn an image feed would be expected
    with open("data-c.txt", 'r') as fin:
        start = time.time()
        lines = fin.readlines()
        start_idx = 0
        while start_idx < len(lines):
            end_idx = min(len(lines), start_idx + batch_size)
            feed = {}
            for i in range(start_idx, end_idx):
                feed[str(i - start_idx)] = lines[i]
            ret = client.predict(feed_dict=feed, fetch=["res"])
            start_idx += batch_size
            if start_idx > 1000:
                break
        end = time.time()
    return [[end - start]]


def multithread_rpc(thread, batch_size):
    multi_thread_runner = MultiThreadRunner()
    result = multi_thread_runner.run(run_rpc, thread, batch_size)


if __name__ == "__main__":
    if sys.argv[1] == "yaml":
        mode = sys.argv[2]  # brpc / local predictor
        thread = int(sys.argv[3])
        device = sys.argv[4]
        gen_yml(device)
    elif sys.argv[1] == "run":
        mode = sys.argv[2]  # http / rpc
        thread = int(sys.argv[3])
        batch_size = int(sys.argv[4])
        if mode == "http":
            multithread_http(thread, batch_size)
        elif mode == "rpc":
            multithread_rpc(thread, batch_size)
    elif sys.argv[1] == "dump":
        filein = sys.argv[2]
        fileout = sys.argv[3]
        parse_benchmark(filein, fileout)

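For reference, a typical driver sequence for this script (a sketch; it assumes the pipeline server from `web_service.py` is already listening on the ports configured in `config.yml`):

```
# rewrite config.yml into config2.yml with the tracer enabled (GPU variant)
python3 benchmark.py yaml local_predictor 1 gpu
# run the HTTP benchmark with 4 threads and batch size 2
python3 benchmark.py run http 4 2
# condense the tracer output into a cleaned summary
python3 benchmark.py dump benchmark.log benchmark.tmp
```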
60 changes: 60 additions & 0 deletions python/examples/pipeline/faster_rcnn/benchmark.sh
@@ -0,0 +1,60 @@
#!/bin/bash
export FLAGS_profile_pipeline=1
alias python3="python3.6"
modelname="faster_rcnn"
# HTTP
ps -ef | grep web_service | awk '{print $2}' | xargs kill -9
sleep 3
python3 benchmark.py yaml local_predictor 1 cpu
rm -rf profile_log_$modelname
for thread_num in 1
do
    for batch_size in 1 2
    do
        echo "----FasterRCNN thread num: $thread_num batch size: $batch_size mode:http ----" >>profile_log_$modelname
        rm -rf PipelineServingLogs
        rm -rf cpu_utilization.py
        python3 web_service.py >web.log 2>&1 &
        sleep 3
        nvidia-smi --id=2 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
        nvidia-smi --id=2 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
        # -e makes echo expand the \n escapes into real newlines
        echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
        python3 benchmark.py run http $thread_num $batch_size
        python3 cpu_utilization.py >>profile_log_$modelname
        ps -ef | grep web_service | awk '{print $2}' | xargs kill -9
        python3 benchmark.py dump benchmark.log benchmark.tmp
        mv benchmark.tmp benchmark.log
        awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$modelname
        awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$modelname
        cat benchmark.log >> profile_log_$modelname
        #rm -rf gpu_use.log gpu_utilization.log
    done
done
# RPC (currently disabled; remove the exit below to run it)
exit
ps -ef | grep web_service | awk '{print $2}' | xargs kill -9
sleep 3
python3 benchmark.py yaml local_predictor 1 gpu

for thread_num in 1 8 16
do
    for batch_size in 1 10 100
    do
        echo "----FasterRCNN thread num: $thread_num batch size: $batch_size mode:rpc ----" >>profile_log_$modelname
        rm -rf PipelineServingLogs
        rm -rf cpu_utilization.py
        python3 web_service.py >web.log 2>&1 &
        sleep 3
        nvidia-smi --id=2 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
        nvidia-smi --id=2 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
        echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
        python3 benchmark.py run rpc $thread_num $batch_size
        python3 cpu_utilization.py >>profile_log_$modelname
        ps -ef | grep web_service | awk '{print $2}' | xargs kill -9
        python3 benchmark.py dump benchmark.log benchmark.tmp
        mv benchmark.tmp benchmark.log
        awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$modelname
        awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$modelname
        #rm -rf gpu_use.log gpu_utilization.log
        cat benchmark.log >> profile_log_$modelname
    done
done
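
To launch the whole HTTP benchmark and inspect the aggregated report (assuming a GPU with index 2 is present, as hard-coded in the `nvidia-smi --id=2` calls above):

```
bash benchmark.sh
cat profile_log_faster_rcnn
```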
17 changes: 17 additions & 0 deletions python/examples/pipeline/faster_rcnn/config.yml
@@ -0,0 +1,17 @@
dag:
  is_thread_op: false
  tracer:
    interval_s: 10
http_port: 18082
op:
  faster_rcnn:
    local_service_conf:
      client_type: local_predictor
      concurrency: 2
      device_type: 1
      devices: '2'
      fetch_list:
      - save_infer_model/scale_0.tmp_1
      model_config: serving_server/
rpc_port: 9998
worker_num: 20
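
For a machine without a GPU, a CPU-only variant of the `local_service_conf` block might look like the sketch below; the assumption that `device_type: 0` selects CPU mirrors the way `benchmark.py` sets `device_type: 1` for GPU.

```
    local_service_conf:
      client_type: local_predictor
      concurrency: 2
      device_type: 0   # assumption: 0 selects CPU, 1 selects GPU
      fetch_list:
      - save_infer_model/scale_0.tmp_1
      model_config: serving_server/
```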
80 changes: 80 additions & 0 deletions python/examples/pipeline/faster_rcnn/label_list.txt
@@ -0,0 +1,80 @@
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
35 changes: 35 additions & 0 deletions python/examples/pipeline/faster_rcnn/pipeline_http_client.py
@@ -0,0 +1,35 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# from paddle_serving_server.pipeline import PipelineClient
import numpy as np
import requests
import json
import cv2
import base64
import os


def cv2_to_base64(image):
    return base64.b64encode(image).decode('utf8')


url = "http://127.0.0.1:18082/faster_rcnn/prediction"
with open(os.path.join(".", "000000570688.jpg"), 'rb') as file:
    image_data1 = file.read()
image = cv2_to_base64(image_data1)

for i in range(1):
    data = {"key": ["image"], "value": [image]}
    r = requests.post(url=url, data=json.dumps(data))
    print(r.json())
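
To exercise this client end to end (a sketch; it assumes the pipeline server referenced by `benchmark.sh` is up on port 18082 and the test image `000000570688.jpg` sits next to the script):

```
# start the pipeline server in the background (web_service.py is part of this example)
python3 web_service.py > web.log 2>&1 &
sleep 3
# send one prediction request with the image base64-encoded
python3 pipeline_http_client.py
```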