MARA Cloud Models

MARA Cloud provides access to high-performance open-source models optimized for enterprise inference workloads. Models are deployed across dedicated node clusters for consistent, low-latency performance.

Production models

Production models are intended for use in production environments and meet our high standards for speed, quality, and reliability.
| Model | Model ID | Price per 1M tokens | Context window | Hugging Face |
| --- | --- | --- | --- | --- |
| GPT OSS 120B | gpt-oss-120B | $0.15 input / $0.75 output | 128K | Model card |
| DeepSeek V3.1 | DeepSeek-V3.1 | $0.60 input / $1.70 output | 128K | Model card |
| MiniMax M2.5 | MiniMax-M2.5 | $0.30 input / $1.20 output | 160K | Model card |

Note: Pricing is per 1 million tokens. Input and output tokens are priced separately.
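
To make the per-token arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the prices listed above. The helper name `estimate_cost` and the token counts are illustrative assumptions, not part of the MARA Cloud API.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the table above: (input, output).
PRICES_PER_MILLION = {
    "gpt-oss-120B": (Decimal("0.15"), Decimal("0.75")),
    "DeepSeek-V3.1": (Decimal("0.60"), Decimal("1.70")),
    "MiniMax-M2.5": (Decimal("0.30"), Decimal("1.20")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model_id]
    # Input and output tokens are billed at separate rates, then summed.
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_MILLION


# Example: 200,000 input tokens and 50,000 output tokens on MiniMax-M2.5.
cost = estimate_cost("MiniMax-M2.5", 200_000, 50_000)
print(f"${cost:.2f}")  # prints "$0.12"
```

`Decimal` is used instead of `float` so that small currency amounts do not pick up binary rounding noise.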

Usage example

Use the model ID when making API requests:
python
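# Assumes `client` is an already-configured, OpenAI-compatible client (setup not shown here).
completion = client.chat.completions.create(
    model="MiniMax-M2.5",  # model ID from the table above
    messages=[
        {"role": "user", "content": "Hello, world!"}
    ],
)

# Print the text of the first returned choice.
print(completion.choices[0].message.content)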
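
The snippet above does not show how the model ID travels in the request. The following sketch builds (but does not send) an equivalent HTTP request using only the standard library, assuming the service exposes an OpenAI-style `/chat/completions` endpoint. The base URL, the `MARA_API_KEY` environment variable, and the `build_chat_request` helper are placeholders chosen for illustration; substitute the values from your own account.

```python
import json
import os
import urllib.request

# Placeholder values (assumptions): use the endpoint and key for your account.
BASE_URL = "https://example.invalid/v1"  # hypothetical, not a real endpoint
API_KEY = os.environ.get("MARA_API_KEY", "")  # hypothetical variable name


def build_chat_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request without sending it (hypothetical helper)."""
    payload = {
        "model": model_id,  # the model ID from the table, e.g. "MiniMax-M2.5"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_chat_request("MiniMax-M2.5", "Hello, world!")
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out because the real endpoint is not stated here.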

Next steps