Serverless-sample-apps

Sample applications for serverless testing.

Quick Start

Deploy the HTTP server

make docker-build
kubectl apply -f install/http_server.yaml

Test the HTTP server

  1. Get the pod name
kubectl get pods
  2. Forward the port in another terminal
kubectl port-forward <pod name> 8080:8080
  3. Test
curl localhost:8080
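
Equivalently, once the port-forward is running, you can hit the server from Python. This is a minimal sketch; the actual response body depends on the server implementation.

import urllib.request

# Assumes kubectl port-forward <pod name> 8080:8080 is active in another terminal.
with urllib.request.urlopen("http://localhost:8080") as resp:
    print(resp.status)
    print(resp.read().decode())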

Model Test

This section shows how to download models from Hugging Face using the Transformers library.

Prerequisites
  1. Create a virtual environment.
  2. Install PyTorch and Transformers.
Download the model

Note that, by default, the following line downloads 'opt-125m', which contains 125 million parameters, and tries to distribute the model across all available devices.

python python/opt/download.py
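
download.py itself is not reproduced here; a minimal sketch of the behavior described above, assuming the standard Transformers API (device_map="auto" additionally requires the accelerate package), might look like:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 'facebook/opt-125m' is the Hugging Face Hub id of the default model.
model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" shards the weights across all visible devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)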

To limit the devices used, set CUDA_VISIBLE_DEVICES. For example, the following lines run two opt-13b models, one on GPUs 0 and 1 and the other on GPUs 2 and 3, as shown in the figure below.

CUDA_VISIBLE_DEVICES=0,1 python python/opt/download.py --model-name opt-13b &
CUDA_VISIBLE_DEVICES=2,3 python python/opt/download.py --model-name opt-13b &
[Figure: two opt-13b models, one distributed across GPUs 0 and 1 and the other across GPUs 2 and 3]
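
CUDA_VISIBLE_DEVICES works by restricting which GPUs the process can see at all; the visible devices are renumbered from 0, and automatic placement then uses only those. A quick way to confirm the mask (a hedged sketch; check.py is an illustrative name, run as e.g. CUDA_VISIBLE_DEVICES=0,1 python check.py):

import os
import torch

# With CUDA_VISIBLE_DEVICES=0,1 the process sees exactly two GPUs,
# exposed to PyTorch as cuda:0 and cuda:1.
print("mask:", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("visible GPU count:", torch.cuda.device_count())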

C++ Model Test

This section shows how to run model inference with C++. Make sure you have completed the steps above.

Prerequisites
  1. Download and install the PyTorch C++ library (LibTorch) according to this guide. Please make sure you can successfully run the example.
  2. Transform the model to TorchScript. The following lines transform the model to TorchScript and save it to python/opt/[model_name].pt; a sketch of what the script might do follows.
cd python/opt
CUDA_VISIBLE_DEVICES=0 python download_and_trace.py --model-name opt-1.3b
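
download_and_trace.py is not reproduced here; a minimal sketch of the tracing step, assuming the standard Transformers and TorchScript APIs (the example prompt is illustrative), could be:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torchscript=True makes the model return plain tuples, which is required for tracing.
model = AutoModelForCausalLM.from_pretrained(model_name, torchscript=True)
model.eval()

inputs = tokenizer("Hello, world", return_tensors="pt")
# Tracing records the operations for this input and saves them as TorchScript.
traced = torch.jit.trace(model, (inputs["input_ids"],))
traced.save("opt-1.3b.pt")

Note that a traced model is specialized to the input shape it was recorded with, so the C++ side should feed inputs of a compatible shape.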
Build and run
mkdir build
cd build
cmake ..
cmake --build . -j $(nproc)
./inference ../opt-1.3b.pt
TODO

Please compare the results with the Python version.
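
One hedged way to do the comparison is to load the same traced module from Python and print the logits for the same token ids as the C++ run (the ids below are arbitrary placeholders):

import torch

# Adjust the path to wherever the traced model was saved.
traced = torch.jit.load("opt-1.3b.pt")
input_ids = torch.tensor([[2, 100, 200, 300]])  # placeholder ids; use the same ids as in C++
with torch.no_grad():
    out = traced(input_ids)
# Models exported with torchscript=True return a tuple; element 0 is the logits tensor.
print(out[0])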
