Ollama Cloud
Ollama Cloud lets you generate natural‑language text by sending your prompts to the Ollama Turbo models that run in the cloud. It’s a quick way to add powerful AI text generation to your Nappai workflows without having to host any models yourself.
How it Works
When you add the component to a workflow, Nappai takes the values you enter (or map from other components) and builds a request to the Ollama Cloud API. The request is authenticated with a stored Ollama Cloud API credential that you set up in the Nappai credentials section. The API processes the prompt using the selected model and returns the generated text. If you enable Stream, the response is sent back piece‑by‑piece so you can show it in real time. The component also returns a reference to the model that was used, which can be passed to other components that need a language‑model object.
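The request the component builds can be pictured as a standard Ollama chat call. The sketch below is illustrative only, assuming the public Ollama /api/chat endpoint and Bearer-token authentication; the exact request Nappai builds may differ:

```python
import json

# Hypothetical sketch of the request Nappai assembles for Ollama Cloud.
# Endpoint and field names follow the public Ollama chat API; the API key
# and model name here are placeholders.
def build_chat_request(api_key: str, model: str, prompt: str, stream: bool = False):
    """Return (url, headers, body) for an Ollama Cloud chat call."""
    url = "https://ollama.com/api/chat"
    headers = {
        "Authorization": f"Bearer {api_key}",  # credential from Nappai's vault
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # True -> response arrives piece by piece
    })
    return url, headers, body

# Sending it would require a real key, e.g. with urllib.request or httpx;
# that step is omitted here.
```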
Inputs
Mapping Mode
This component has a special mode called “Mapping Mode”. When you enable this mode using the toggle switch, an additional input called Mapping Data is activated, and each input field offers you three different ways to provide data:
- Fixed – You type the value directly into the field.
- Mapped – You connect the output of another component to use its result as the value.
- Javascript – You write Javascript code to dynamically calculate the value.
This flexibility allows you to create more dynamic and connected workflows.
Input Fields
- Base URL – The endpoint of the Ollama Cloud API. Defaults to https://ollama.com for Turbo mode.
- Credential – Select a stored credential for Ollama Cloud. This is required for authentication.
- Format – Specify the format of the output (e.g., json).
- Input – The prompt or text you want the model to process.
- Mapping Mode – Toggle to enable batch processing of multiple records.
- Metadata – Add custom metadata to the run trace for easier debugging and logging.
- Mirostat – Enable or disable Mirostat sampling, which helps control the model’s perplexity.
- Mirostat Eta – Learning rate for the Mirostat algorithm (default: 0.1).
- Mirostat Tau – Balance between coherence and diversity of the output (default: 5.0).
- Model Name – Choose from the list of available Ollama Cloud models. Click the refresh button to update the list.
- Context Window Size – Size of the context window for generating tokens (default: 2048).
- Number of GPUs – Number of GPUs to use for computation (default: 1 on macOS, 0 to disable).
- Number of Threads – Number of threads to use during computation (default: detected for optimal performance).
- Repeat Last N – How far back the model looks to prevent repetition (default: 64, 0 = disabled, -1 = num_ctx).
- Repeat Penalty – Penalty for repetitions in generated text (default: 1.1).
- Stop Tokens – Comma‑separated list of tokens that signal the model to stop generating text.
- Stream – Stream the response from the model. Streaming works only in chat mode.
- System – The system prompt to use for generating text.
- System Message – System message to pass to the model.
- Tags – Comma‑separated list of tags to add to the run trace.
- Temperature – Controls the creativity of model responses.
- Template – Template to use for generating text.
- TFS Z – Tail free sampling value (default: 1).
- Timeout – Timeout for the request stream.
- Top K – Limits token selection to top K (default: 40).
- Top P – Nucleus sampling threshold; works together with Top K (default: 0.9).
- Verbose – Whether to print out response text.
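Most of the sampling inputs above correspond to entries in an Ollama-style "options" object. The helper below is a sketch of that mapping, using the defaults listed above; the field names follow the public Ollama API and may not match Nappai's internals exactly:

```python
# Sketch: map the sampling inputs above onto an Ollama-style "options"
# object. Defaults mirror the values documented in the input list.
def build_options(temperature=0.8, top_k=40, top_p=0.9,
                  repeat_last_n=64, repeat_penalty=1.1,
                  mirostat=0, mirostat_eta=0.1, mirostat_tau=5.0,
                  num_ctx=2048, tfs_z=1.0, stop=None):
    opts = {
        "temperature": temperature,    # creativity of responses
        "top_k": top_k,                # limit sampling to the K most likely tokens
        "top_p": top_p,                # nucleus sampling, applied with top_k
        "repeat_last_n": repeat_last_n,  # 0 = disabled, -1 = num_ctx
        "repeat_penalty": repeat_penalty,
        "mirostat": mirostat,          # 0 = off
        "mirostat_eta": mirostat_eta,  # Mirostat learning rate
        "mirostat_tau": mirostat_tau,  # coherence/diversity balance
        "num_ctx": num_ctx,            # context window size
        "tfs_z": tfs_z,                # tail free sampling
    }
    if stop:
        # "Stop Tokens" input: comma-separated string -> list of tokens
        opts["stop"] = [t.strip() for t in stop.split(",")]
    return opts
```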
Credential Setup
- Create the credential – In the Nappai credentials section, add a new credential of type Ollama Cloud API. You’ll need your Ollama Cloud API Key, which you can obtain from the Ollama Cloud Console.
- Select the credential – In the component’s Credential field, choose the credential you just created.
Outputs
- Text – The generated text from the model. It’s returned as a Message object and can be used as input to other components or displayed in dashboards.
- Model – A reference to the language model that was used. It’s returned as a LanguageModel object and can be passed to components that require a model instance.
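As an illustration, a non-streaming chat response from the Ollama API looks roughly like the JSON below; the component's Text output corresponds to the message content, and the Model output identifies the model that produced it. This shape is an assumption based on the public Ollama API, not a guarantee of Nappai's internal format:

```python
import json

# Hypothetical (simplified) shape of a non-streaming Ollama chat response.
raw = json.dumps({
    "model": "llama3.1",
    "message": {"role": "assistant", "content": "Q: What is it?\nA: A new product."},
    "done": True,
})

resp = json.loads(raw)
text = resp["message"]["content"]  # -> roughly the component's Text output
model_ref = resp["model"]          # -> identifies the model behind the Model output
```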
Usage Example
Suppose you want to create a quick FAQ generator:
- Add the Ollama Cloud component to your workflow.
- Set the Model Name to llama3.1 (or any other available model).
- Enter a prompt in the Input field, e.g., “Generate a short FAQ for a new product launch.”
- Enable Stream if you want the answer to appear in real time.
- Connect the Text output to a Display component to show the generated FAQ in your dashboard.
If you have a list of product descriptions and want to generate FAQs for each one, enable Mapping Mode, connect the list to Mapping Data, and map the Input field to the description field. The component will process each record and output a list of FAQs.
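Conceptually, Mapping Mode evaluates the mapped Input once per record and collects the results. The sketch below illustrates that idea with a stubbed-out model call (run_mapped, make_prompt, and generate are illustrative names, not Nappai APIs):

```python
# Conceptual sketch of Mapping Mode: the mapped Input is evaluated once
# per record in Mapping Data, and the outputs are collected into a list.
def run_mapped(records, make_prompt, generate):
    return [generate(make_prompt(rec)) for rec in records]

products = [
    {"name": "Widget", "description": "A compact widget."},
    {"name": "Gadget", "description": "A smart gadget."},
]

faqs = run_mapped(
    products,
    make_prompt=lambda r: f"Generate a short FAQ for: {r['description']}",
    generate=lambda prompt: f"FAQ based on -> {prompt}",  # stub for the model call
)
```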
Related Components
- ChatOllamaTurboBase – The base component that provides common functionality for all Ollama Turbo models.
- Text Formatter – Use this after the Ollama Cloud component to clean up or format the generated text.
- Data Store – Store the generated text in a database or file for later use.
Tips and Best Practices
- Use a small temperature (e.g., 0.2) for factual or deterministic responses; increase it for more creative output.
- Enable Mirostat if you notice the model producing overly repetitive or nonsensical text.
- Set a reasonable timeout to avoid hanging requests, especially when processing large batches.
- Refresh the Model Name list after installing new models in Ollama Cloud to keep the dropdown up to date.
- Test with a single prompt first before enabling Mapping Mode to ensure your prompt and settings work as expected.
Security Considerations
- The Credential field stores your Ollama Cloud API key securely in Nappai’s credential vault.
- Never expose the API key in the workflow or share the workflow file without removing the credential reference.
- Use Metadata and Tags to track which runs used which credentials, helping with audit and compliance.