NVIDIA Launches Free GPU Endpoint for MiniMax M3 Long-Context Multimodal Model

NVIDIA offers accelerated inference for MiniMax M3, supporting text, image and video reasoning with long context. Available immediately via free GPU endpoint.

NVIDIAMiniMaxMultimodalInferenceEnterprise AI

At a glance

NVIDIA has partnered with MiniMax to provide a free GPU-accelerated inference endpoint for the newly released MiniMax M3 long-context multimodal model.

What changed

MiniMax released M3, a model capable of unified reasoning over text, image, and video with extended context length. NVIDIA immediately made the model available through a free, GPU-accelerated endpoint on its platform, allowing direct access without local infrastructure.

Why it matters

Teams can integrate long-context multimodal capabilities with near-zero setup time, reducing infrastructure cost and accelerating workflow deployment. Commercially, organisations gain faster access to competitive multimodal performance without incurring additional model-hosting expenses. For compliance-aware teams the NVIDIA-hosted endpoint provides a governed, auditable inference path that aligns with enterprise procurement and security policies.

Key details

Model handles text, image, and video reasoning in a single system.
Long-context design supports extended input sequences.
Free GPU endpoint removes the need for dedicated hardware allocation.
Integration available immediately through NVIDIA's AI platform.

Sources

Notes for citation

Publication date reflects source timestamps of 11–12 June 2026. Endpoint availability confirmed in official NVIDIA announcement. All operational details limited to information supplied in the referenced posts.

Want to discuss how this affects your workflows? Book a call →