AI-assisted annotation
Tator allows for registration of applets that can be launched from the annotation view, enabling developers to deploy computer vision models to speed up annotation. This guide will show you how to set up Nvidia Triton Inference Server on a remote server, then use an annotation menu applet in Tator to make calls to it for an object detection task. For convenience, we will use an AWS EC2 instance proxied by Amazon API Gateway to set up the standing inference server, and rely on pre-trained object detection models from TensorFlow Model Zoo. Note that these could easily be replaced by on-premise infrastructure or proprietary algorithm models.
Set up an EC2 node
To begin, we will create an EC2 node with a GPU suitable for inference. Open the AWS EC2 console and click Launch instances
.
Click AWS Marketplace
and then search for nvidia deep learning
and press enter. The Nvidia Deep Learning AMI should be at the top of the list. Click Select
.
You will be shown pricing details for various instances. Click Continue
. You will be prompted to select an instance type. Choose a g4dn.2xlarge
. These instances have a single Nvidia T4 GPU suitable for inference with relatively low cost. Click Review and Launch
.
Click Launch
. When prompted for a key pair, choose Create a new key pair
, name it triton-example
and download it. Click Launch Instances
.
We need to allow inbound traffic on port 8000. Go to the main EC2 console and click Security Groups
. Select the security group with name NVIDIA Deep Learning AMI-...
. Click Edit inbound rules
, then Add rule
.
Set the Type
to Custom TCP
, Port range
to 8000
, and Source
to Anywhere
. Click Save rules
.
Go back to the main EC2 console. When the instance is ready, select the instance and click Actions > Connect
.
Click Connect
to start a web-based terminal, or configure SSH following the steps outlined using the key downloaded earlier. You should now have a terminal open to the EC2 instance.
Configure Triton Inference Server
First create a model repository that includes Faster R-CNN with Inception v2:
apt install -y python3-pip
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist --upgrade nvidia-dali-cuda110
docker pull nvcr.io/nvidia/tritonserver:22.02-py3
git clone -b stable https://github.com/cvisionai/tator
cd tator/doc/examples/faster_rcnn_applet
./setup_model_repo.sh
The model repository includes a preprocessing model that uses Nvidia DALI and an ensemble model that combines preprocessing with Faster R-CNN.
Start the inference server.
docker run --gpus 1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $(pwd)/model_repo:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models
Proxy the inference server
Triton Inference Server accepts unencrypted HTTP traffic, but to make it accessible to Tator we want to securely expose our model to the internet. For simplicity, we will use Amazon API Gateway, a managed service we can use as a reverse proxy.
Go to the console for API Gateway and click Build
under HTTP API
.
Give the gateway a name and click Review and Create
.
Click Create
.
Click Routes
from the left menu, then click Create
. Enter /{proxy+}
to cover all routes, then click Create
.
Click Integrations
from the left menu. Select the route we just created, then click Create and attach an integration
.
Choose HTTP URI
and then enter the public DNS name for your EC2 instance, followed by :8000/{proxy}
. This is necessary to correctly serve all routes.
Click CORS
from the left menu. Type *
into Access-Control-Allow-Origin
, click Add
, then click Save
.
Now test the API server by constructing a request to the inference server with the Invoke URL
of your API gateway and a REST URI for Triton Inference Server.
Register annotation applet
Create a project using the object detection template, then go to project settings. Click Localization Types
then click Boxes
. Click New Attribute
.
Set the Name
to Score
, Data Type
to float
, and Visible
to Yes
. Click Save
.
Download the example applet file and edit it, replacing the domain YOUR_API_DOMAIN
with your domain (the one assigned by API Gateway above). Save it somewhere locally so you can upload it.
Click Applets
then + Add new
. Call the applet Faster R-CNN
. Add a category called annotator-menu
. For the HTML file, upload the modified example file. Click Save
.
Now upload some media and open it in the annotation view.
On any frame, right click and select Faster R-CNN
.
Click Detect
. If the algorithm finds any objects, new boxes will be created on the frame. Note that the first inference takes longer than subsequent inferences.