# Model Selector
The Inference Grid supports a variety of models. Therefore, instead of specifying a model directly, we encourage you to use a selector config to specify your requirements.
# Example
For example, you can pass the following JSON string to the `model` parameter in the OpenAI-compatible API:
```python
import json

# Serialize the selector config; the resulting string is used in place of a model name.
model_name = json.dumps({
    "tier_selector": ">=3",
    "min_context_length": 10_000,
    "flags": ["vision", "uncensored"],
})
```
This will match any model that is in tier 3 or above, supports vision, is uncensored, and has a context length of at least 10,000 tokens. When routing your request, the Inference Grid will attempt to find the cheapest provider that meets your requirements.
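For instance, with the OpenAI Python client, the serialized selector simply goes wherever a model name is normally expected. The base URL and API key below are placeholders, not official endpoints:

```python
import json
from openai import OpenAI

# Placeholder endpoint and key -- substitute the values for your Inference Grid gateway.
client = OpenAI(base_url="https://example-gateway/v1", api_key="YOUR_KEY")

selector = json.dumps({
    "tier_selector": ">=3",
    "min_context_length": 10_000,
    "flags": ["vision", "uncensored"],
})

response = client.chat.completions.create(
    model=selector,  # the selector JSON replaces the usual model name
    messages=[{"role": "user", "content": "Describe this image."}],
)
print(response.choices[0].message.content)
```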
# Configuration
Here are the configuration options you can use to specify your requirements.
- `tier_selector`: A selector for the tier of the model. It can be a single value, an inequality, or a set of values. For example, `3`, `>=2`, and `{1,2,3}` are all valid selectors (see the sketch after this list).
- `flags`: Flags indicating capabilities that you require:
  - `vision`: Whether the model has vision capabilities.
  - `tool-use`: Whether the model supports tool use.
  - `uncensored`: Whether the model is uncensored.
- `min_context_length`: The minimum context length, in tokens.
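To make the selector grammar concrete, here is a minimal client-side sketch of how a tier selector string could be matched against a model's tier. This is illustrative only, not the Grid's actual routing code:

```python
import re

def tier_matches(selector: str, tier: int) -> bool:
    """Return True if `tier` satisfies a tier selector string.

    Illustrative only; supports the three documented forms: a single
    value ("3"), an inequality (">=2"), or a set ("{1,2,3}").
    """
    selector = selector.strip()
    if selector.startswith("{"):  # set form: {1,2,3}
        return tier in {int(v) for v in selector.strip("{}").split(",")}
    match = re.fullmatch(r"(>=|<=|>|<)?\s*(\d+)", selector)
    if match is None:
        raise ValueError(f"unrecognized tier selector: {selector!r}")
    op, value = match.group(1) or "==", int(match.group(2))
    return {
        "==": tier == value,
        ">=": tier >= value,
        "<=": tier <= value,
        ">": tier > value,
        "<": tier < value,
    }[op]
```

For example, `tier_matches(">=3", 4)` returns `True`, while `tier_matches("{1,2}", 4)` returns `False`.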
When providers join the Inference Grid, we run validators to ensure that they support the capabilities that they claim to provide and run benchmarks to classify them into tiers.
# Tier Definitions
The Inference Grid classifies models into tiers based on their capabilities. As new models are released, additional tiers will be added, but as of early 2025, the reference points are:
- Tier 1: `phi3-mini`, `tinyllama-v1.1`
- Tier 2: `llama-3.2-11b-vision-instruct`, `pixtral-12b`, `dolphin-mixtral-8x7b`
- Tier 3: `llama-3.2-90b-vision-instruct`, `pixtral-large-2411`, `sao10k/l3.3-euryale-70b`
- Tier 4: `sonnet-3.5`, `gpt-4o`, `deepseek-v3`
We are constantly validating the performance of models on the grid by comparing them against these reference points and updating their tier accordingly. For example, if a new provider joins the grid with a new model, we will take their model, generate a set of outputs, and compare the quality against models from each of these tiers. The results will be used to set the tier of the new model.
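The benchmarking pipeline itself is internal to the Inference Grid, but the idea can be sketched as follows. Assume a hypothetical judge function `win_rate(candidate, reference)` that returns the fraction of prompts on which the candidate's output is preferred over a reference model's:

```python
# Hypothetical sketch of tier assignment; the actual pipeline and
# thresholds used by the Inference Grid are not public.

REFERENCE_MODELS = {
    1: ["phi3-mini", "tinyllama-v1.1"],
    2: ["llama-3.2-11b-vision-instruct", "pixtral-12b", "dolphin-mixtral-8x7b"],
    3: ["llama-3.2-90b-vision-instruct", "pixtral-large-2411", "sao10k/l3.3-euryale-70b"],
    4: ["sonnet-3.5", "gpt-4o", "deepseek-v3"],
}

def assign_tier(candidate: str, win_rate) -> int:
    """Return the highest tier whose reference models the candidate matches.

    `win_rate(candidate, reference)` is an assumed judge returning the
    fraction of prompts (0.0 to 1.0) on which the candidate's output is
    preferred over the reference model's.
    """
    assigned = 0  # 0 means the candidate did not even match tier 1
    for tier in sorted(REFERENCE_MODELS):
        rates = [win_rate(candidate, ref) for ref in REFERENCE_MODELS[tier]]
        if sum(rates) / len(rates) < 0.5:  # assumed "comparable" threshold
            break
        assigned = tier
    return assigned
```

A candidate is assigned the highest consecutive tier whose reference models it roughly matches; the 0.5 threshold here is an assumption for illustration, not a documented cutoff.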
This system ensures that regardless of what the underlying model is, when you specify that you want a tier 3 model, you will get an LLM that offers comparable performance to the reference models.