
Deployment Modes

There are three deployment modes: off, test, and production. You control them from the deployment tab on the project page. You can switch freely between modes, although a change may take a few minutes to complete.

note

Your project needs to be in a ready state before the deployment tab is available. Hang on tight while our system verifies and prepares your project for deployment!

This section describes the purpose and use cases of each deployment mode. For information on connecting to a deployment, please refer to the inference API documentation.

Off

This is the default mode for all projects. No infrastructure is set up, so no inference requests can be made. You will not be billed while your project is in this mode.

Test

Use this mode to test and verify your connection to the inference server. In this mode, a single instance should serve all of your requests.

Note that you are billed hourly in this mode. To keep costs low, verify your connection in test mode before moving to production mode.

Production

In this mode, we deploy your model on reliable, auto-scaling infrastructure. As the request load increases or decreases, we allocate sufficient resources to meet your needs.

As in test mode, you are billed hourly while your project is in this mode.
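
To make the differences between the three modes easier to compare, the sketch below records them in a small client-side lookup table. It is only an illustration of the descriptions above: the names `DeploymentMode`, `ModeTraits`, `MODE_TRAITS` and `can_send_requests` are hypothetical and are not part of the platform or its inference API.

```python
from dataclasses import dataclass
from enum import Enum


class DeploymentMode(Enum):
    """The three modes described on this page (hypothetical client-side names)."""
    OFF = "off"
    TEST = "test"
    PRODUCTION = "production"


@dataclass(frozen=True)
class ModeTraits:
    accepts_requests: bool  # can inference requests be made in this mode?
    billed_hourly: bool     # does the project accrue hourly charges?
    scaling: str            # how serving capacity is provisioned


# Values are taken from the mode descriptions on this page.
MODE_TRAITS = {
    DeploymentMode.OFF: ModeTraits(False, False, "no infrastructure"),
    DeploymentMode.TEST: ModeTraits(True, True, "single instance"),
    DeploymentMode.PRODUCTION: ModeTraits(True, True, "auto-scaling"),
}


def can_send_requests(mode: DeploymentMode) -> bool:
    """Return True if the given mode has infrastructure serving requests."""
    return MODE_TRAITS[mode].accepts_requests
```

A client could call `can_send_requests(DeploymentMode.OFF)` before attempting a connection, which avoids sending requests to a project that has no infrastructure set up.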

Non-Use Timings

Non-Use Timings are time periods during which you do not expect any requests to be sent. Setting them helps you reduce cost.

warning

Requests made during the Non-Use Timings can be expected to be rejected or to fail.
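
Because requests sent during Non-Use Timings are expected to fail, a client may want to check the schedule before sending. The following is a minimal sketch of such a guard. It assumes that the timings are daily windows given as start and end times of day, and that the client clock uses the same time zone as the configured schedule. This page does not specify either detail, and `NON_USE_WINDOWS`, `in_window` and `in_non_use_timing` are hypothetical names.

```python
from datetime import datetime, time


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` is in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def in_non_use_timing(moment: datetime, windows: list[tuple[time, time]]) -> bool:
    """Return True if `moment` falls inside any configured non-use window."""
    return any(in_window(moment.time(), start, end) for start, end in windows)


# Hypothetical schedule: no requests expected between 22:00 and 06:00.
NON_USE_WINDOWS = [(time(22, 0), time(6, 0))]
```

A caller could skip or queue a request when `in_non_use_timing(datetime.now(), NON_USE_WINDOWS)` returns `True`, rather than sending it and handling the failure.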