Inference UI
The Open Innovation Platform offers a streamlined interface for running inference on deployed models. To begin, go to the Deployments section of your workspace, select the desired deployment, and open the Inference tab. From there, you can provide input data and request predictions from your model.
1. LLM Inference
1.1 Chat Inference
Interact with a language model by providing conversation-style input: a list of messages, each with a role (e.g., user, assistant) and content.
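For illustration, chat input typically follows a messages schema along these lines; the exact field names are assumptions, not a documented platform contract:

```python
# Hypothetical chat input; field names follow the common messages convention
# and may differ from your deployment's actual schema.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
    ],
    "max_tokens": 128,   # illustrative generation parameters
    "temperature": 0.7,
}
```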
1.2 Completion Inference
Submit a single prompt or partial text for the model to complete.
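A minimal completion input might look like this (field names assumed):

```python
# Hypothetical completion input: the model continues the given text.
payload = {
    "prompt": "The three primary colors are",
    "max_tokens": 32,
}
```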
1.3 Sequence Classification
Obtain classification labels or sentiment scores from short text inputs.
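An illustrative classification input, with a typical label/score response shape shown in the comment (both assumed, not a documented contract):

```python
# Hypothetical sequence-classification input.
payload = {"text": "The battery life on this laptop is fantastic."}

# A typical response pairs labels with confidence scores, e.g.:
# [{"label": "POSITIVE", "score": 0.98}, {"label": "NEGATIVE", "score": 0.02}]
```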
1.4 Automatic Speech Recognition (ASR)
Convert spoken audio into text.
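In the Inference tab, ASR input is an audio file upload. A programmatic equivalent might look like the following sketch, assuming a hypothetical REST endpoint and form field name:

```python
import requests

# Hypothetical endpoint and upload field; adjust to your deployment's actual API.
with open("meeting.wav", "rb") as f:
    resp = requests.post(
        "https://<your-platform-host>/v1/deployments/<deployment-id>/infer",
        files={"audio": f},
    )
print(resp.json())  # e.g. {"text": "Good morning everyone ..."}
```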
1.5 Text to Speech (TTS)
Generate spoken audio from textual prompts.
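An illustrative TTS input; voice selection and output handling depend on your deployment:

```python
# Hypothetical TTS input.
payload = {"text": "Welcome to the Open Innovation Platform.", "voice": "default"}

# The response is typically audio bytes (e.g. WAV) that you can save to a file:
# with open("welcome.wav", "wb") as f:
#     f.write(resp.content)
```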
1.6 Text to Image
Produce images based on textual descriptions.
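An illustrative text-to-image input (parameter names assumed):

```python
# Hypothetical text-to-image input with common generation parameters.
payload = {
    "prompt": "A watercolor painting of a lighthouse at dusk",
    "width": 512,
    "height": 512,
}
```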
1.7 Translation
Translate text from one language to another.
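An illustrative translation input using ISO 639-1 language codes (field names assumed):

```python
# Hypothetical translation input.
payload = {
    "text": "Bonjour tout le monde",
    "source_language": "fr",
    "target_language": "en",
}
# Expected output along the lines of: {"translation": "Hello everyone"}
```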
1.8 Reranking
Reorder a list of text items (e.g., search results) based on their relevance to a query.
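An illustrative reranking input pairing a query with candidate documents (schema assumed):

```python
# Hypothetical reranking input: a query plus the candidates to reorder.
payload = {
    "query": "how to reset my password",
    "documents": [
        "Resetting your password from the login page",
        "Quarterly earnings report",
        "Account security and password recovery",
    ],
}
# A typical response returns the documents (or their indices) with relevance scores.
```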
1.9 Embedding
Obtain vector embeddings from text for similarity search or clustering tasks.
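Once you have embeddings, a common downstream step is comparing them with cosine similarity. A minimal, self-contained sketch using made-up vectors:

```python
import numpy as np

# Hypothetical embeddings for two input texts (dimension shortened for brevity).
embeddings = np.array([
    [0.12, -0.48, 0.33, 0.91],
    [0.10, -0.52, 0.35, 0.88],
])

# Cosine similarity, as used in similarity search or clustering:
a, b = embeddings
score = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(score)  # a value close to 1.0 indicates semantically similar texts
```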
2. Classical ML Models
For classical ML deployments, the Inference tab provides a form for submitting numerical or structured data. Predictions are returned in real time, so you can quickly validate model outputs.
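For example, structured input for a tabular model might look like this (the feature names are purely illustrative):

```python
# Hypothetical structured input: one record per prediction.
payload = {
    "inputs": [
        {"age": 42, "income": 58000, "tenure_months": 18},
    ]
}
# Expected response: {"predictions": [0]} or class probabilities, depending on the model.
```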
Next Steps
- Model Inference Overview – Learn about the input formats and response structures for different model types.
- Deployments UI – Discover how to manage and monitor your deployed models in detail.
- Performance Benchmark – Evaluate your model’s performance under simulated load conditions.