Modal

Free Trial

Run generative AI models, large-scale batch jobs, job queues, and much more.

Modal deploys and scales a variety of AI models, including language models such as Llama 2 and Mistral for text generation and Stable Diffusion models for image generation. It also supports custom fine-tuning of models such as Flan-T5, so the platform covers text and image workloads as well as specialized model optimization.

Pricing: Per compute

Hosting: Cloud

Pricing: Usage-based, with $30/mo in free credits

HQ: 🇺🇸 United States

Founded: 2021

License: Proprietary

Compliance: SOC 2 · HIPAA · GDPR · SSO

Resources

Erik Bernhardsson of Modal.com

Creating our Own Kubernetes & Docker to Run Our Data...

Modal Alternatives

Explore 79 products in the Inference APIs category.

OpenRouter

Unified API for 400+ AI models across 60+ providers, OpenAI SDK-compatible, pay-as-you-go

Free Trial · From Free (25+ free models)

OurToken

Unified OpenAI-compatible API gateway that routes requests across multiple LLM providers

WAYSCloud

Norwegian cloud platform with an OpenAI-compatible LLM API running open-weight models in Oslo

IONOS AI Model Hub

OpenAI-compatible API for open-weight LLMs and image models, hosted in IONOS EU data centers

From $0.17/1M tokens (Llama 3.1 8B)

Opper

EU-hosted AI gateway serving 300+ models through one OpenAI-compatible API

From Provider token rates + 3% credit fee

CheapestInference

Flat-rate unlimited inference on open-weight models, sold in daily 8-hour windows

From $6.99/mo (Core pool)

Compare

Modal vs RunPod

Also listed in

🧠 Fine-tuning