# Introduction
DANGER
The Operator program is currently invite-only. If you're interested in running an inference node, please reach out to us at hello@inferencegrid.ai.
So you want to run an inference node?
An inference node is a server that can process requests from the Inference Grid. It can be self-hosted on your own hardware or cloud provider and connects to a relay node. The inference node and relay node work together to:
- Route user requests to inference nodes that meet the user's requirements.
- Return the model output and a Lightning invoice to the user.
- Validate the node's responses.
- Validate the user's payment.
- Update reputation scores across the network.
But you don't need to worry about any of this: the node software we provide handles all of it automatically!
# Docker
Start by creating a `config.json` file with the following:
```json
{
  "private_key": "...",
  "spark_mnemonic": "...",
  "display_name": "My Awesome Node (optional)",
  "website": "https://inferencegrid.ai (optional)",
  "models": [
    "dolphin-mixtral-8x7b",
    "llama-3.2-11b-vision-instruct"
  ]
}
```
The private key identifies your node, and the Spark mnemonic is used to receive funds. To set these up, check out the configuration section.
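If you just want something to experiment with, here is a minimal sketch for bootstrapping `config.json`. The 32-byte hex private key is an assumption about the expected key format, and the mnemonic placeholder must be replaced with a real Spark mnemonic; see the configuration section for the exact requirements.

```bash
# Sketch only: assumes a 32-byte hex-encoded private key is accepted.
PRIVATE_KEY=$(openssl rand -hex 32)

cat > config.json <<EOF
{
  "private_key": "${PRIVATE_KEY}",
  "spark_mnemonic": "<your Spark mnemonic - see the configuration section>",
  "display_name": "My Awesome Node",
  "website": "https://inferencegrid.ai",
  "models": [
    "dolphin-mixtral-8x7b",
    "llama-3.2-11b-vision-instruct"
  ]
}
EOF
```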
Then run the following command to start your node:
```bash
docker run --gpus all \
  -v $(pwd)/config.json:/app/config.json \
  -p 8080:8080 inference-grid/inference-node
```
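For long-running operation you may prefer to run the container in the background with an automatic restart policy. A sketch using the same image and flags as above:

```bash
# Run detached so the node survives terminal close and daemon restarts.
docker run -d --restart unless-stopped --gpus all \
  -v $(pwd)/config.json:/app/config.json \
  -p 8080:8080 inference-grid/inference-node
```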
Assuming you have sufficient GPU resources, your node will start processing requests, and you'll be able to view your node's performance at http://localhost:8080. Once you've served some requests and gotten paid, use the wallet UI to withdraw your funds.
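To confirm everything is up before waiting for traffic, a quick check along these lines should work (assuming the image name and port mapping above):

```bash
# Is the container running?
docker ps --filter "ancestor=inference-grid/inference-node"

# Tail recent logs from the node.
docker logs --tail 20 $(docker ps -q --filter "ancestor=inference-grid/inference-node")

# Does the dashboard respond on port 8080?
curl -sI http://localhost:8080
```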
# Serverless
We also support serverless deployment via Modal. This lets you deploy many different models and configurations without managing the underlying infrastructure, and you only pay for the resources you use.
DANGER
Coming soon!