ComfyUI-Gemini
Documentation for ComfyUI Node: Gemini_API_Chat_Zho
Overview
The Gemini_API_Chat_Zho node is part of the ComfyUI-Gemini project, which integrates Google Gemini capabilities into the ComfyUI platform. This node sends prompts to a Gemini chat model and returns its replies, keeping the conversation history so that follow-up prompts are answered in context.
Node Functionality
The Gemini_API_Chat_Zho node works with two Gemini chat models, gemini-pro and gemini-1.5-pro-latest, and carries the conversation history forward between prompts. It is designed to act as a chatbot within your ComfyUI workflow, answering each prompt in the context of the earlier turns.
Inputs
This node accepts the following inputs:
- prompt: A string input where the user provides the prompt or question they wish to pose to the Gemini chat model. This input can be multiline, allowing for more complex queries.
- model_name: The model choice input, where users select one of the available models: gemini-pro or gemini-1.5-pro-latest.
- api_key: A string input for the user's Gemini API Key. This input is necessary for authenticating the requests to the Gemini service.
Optional Inputs
- image: An optional image input. It is not necessary for the gemini-pro model, but it can be used with the gemini-1.5-pro-latest model to enhance the conversation with visual information.
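As a rough illustration, the inputs above could be declared with ComfyUI's usual custom-node conventions (an INPUT_TYPES classmethod plus RETURN_TYPES and FUNCTION attributes). This is a sketch only, not the project's source: the class name GeminiChatSketch, the method name chat, the category label, and the placeholder body are all hypothetical.

```python
# Hypothetical sketch of a ComfyUI node declaration; NOT the project's actual code.
class GeminiChatSketch:
    """Illustrative node exposing the inputs described above."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Multiline text box for the prompt.
                "prompt": ("STRING", {"default": "", "multiline": True}),
                # A list as the first tuple element is shown as a dropdown.
                "model_name": (["gemini-pro", "gemini-1.5-pro-latest"],),
                "api_key": ("STRING", {"default": ""}),
            },
            "optional": {
                # Only meaningful for the multimodal model.
                "image": ("IMAGE",),
            },
        }

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("response",)
    FUNCTION = "chat"            # name of the method ComfyUI calls
    CATEGORY = "sketch/gemini"   # assumed category label

    def chat(self, prompt, model_name, api_key, image=None):
        # Placeholder body: a real node would call the Gemini API here.
        return (f"[{model_name}] {prompt}",)
```

The optional section is what lets the image socket stay unconnected when a text-only model is selected.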
Outputs
The node produces the following output:
- response: A string output containing the chat response from the model, formatted as a conversational exchange. It includes the chat history to provide contextual insight into the ongoing dialogue.
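To make the idea of a response that carries the chat history concrete, here is a minimal sketch of accumulating turns and rendering them as one string. The ChatTranscript class and the "role: text" layout are assumptions for illustration; the node's actual output format may differ.

```python
# Hypothetical helper; the real node's transcript format is not documented here.
class ChatTranscript:
    """Collects conversation turns and renders them as a single string."""

    def __init__(self):
        self.turns = []  # list of (role, text) tuples, oldest first

    def add(self, role, text):
        self.turns.append((role, text))

    def render(self):
        # One line per turn, in the order the turns were added.
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


transcript = ChatTranscript()
transcript.add("user", "What is ComfyUI?")
transcript.add("model", "A node-based interface for diffusion workflows.")
print(transcript.render())
```

Because every earlier turn is included in the rendered string, a downstream text node sees the whole exchange rather than only the latest reply.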
Usage in ComfyUI Workflows
In ComfyUI workflows, the Gemini_API_Chat_Zho node is primarily utilized for conversational purposes. Here’s an example of how it might be used:
- User Input: A user begins a conversation by feeding a prompt into the node.
- Model Selection: The workflow specifies which Gemini model to use, based on the requirements for the conversation.
- API Key: The user provides an API key to authenticate the interaction with the Gemini API.
- Chat Execution: The node processes the input and delivers a contextual response, simulating a chat with a virtual assistant or AI companion.
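The steps above can be sketched outside ComfyUI with the google-generativeai Python SDK. Whether the node uses this SDK internally is an assumption, as are the helper names build_message and chat_once. The image, if given, is assumed to be a PIL image already; a ComfyUI image tensor would need converting first.

```python
def build_message(prompt, image=None):
    """Return the content to send: plain text, or [text, image] when an image is given."""
    return prompt if image is None else [prompt, image]


def chat_once(prompt, model_name, api_key, image=None):
    """Send one prompt to a Gemini chat session and return the reply text."""
    # Imported lazily so build_message works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)            # API key step
    model = genai.GenerativeModel(model_name)   # model selection step
    chat = model.start_chat(history=[])         # fresh conversation
    response = chat.send_message(build_message(prompt, image))
    return response.text
```

A call such as chat_once("Describe ComfyUI in one sentence.", "gemini-pro", api_key) needs network access and a valid key. Reusing the same chat object across several send_message calls is what would preserve context between turns.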
This node can be particularly useful for applications requiring natural language understanding and generation, such as virtual customer assistants, interactive storytelling, or educational tools that guide users through conceptual topics.
Special Features
- Contextual Understanding: The node maintains a history of the conversation, which allows it to deliver responses that are contextually aware and coherent across multiple interactions.
- Image Integration: While primarily a text-based interaction tool, the node can incorporate images, providing a richer context for conversations in scenarios where visual input is beneficial.
Considerations
- API Key Security: It is important to handle the API Key with care. In scenarios where the workflow is shared, avoid exposing the key publicly to prevent unauthorized access to the Gemini service.
- Model Capabilities: Different models have different strengths. Choose the model to match the interaction: gemini-pro for text-only conversations, gemini-1.5-pro-latest when an image is part of the input.
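Both considerations can be handled with small helpers in a script that drives the workflow. The sketch below is illustrative only: the function names and the GEMINI_API_KEY environment variable are assumptions, not part of the node. The key is read from the environment when the field is left empty, so it never has to be saved inside a shared workflow file.

```python
import os


def resolve_api_key(widget_value="", env_var="GEMINI_API_KEY"):
    """Prefer an explicitly supplied key; otherwise fall back to the environment."""
    key = widget_value.strip() or os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(f"No API key supplied; set {env_var} or fill in api_key.")
    return key


def pick_model(has_image):
    """Use the multimodal model only when an image is connected."""
    return "gemini-1.5-pro-latest" if has_image else "gemini-pro"
```

Before sharing a workflow, clearing the api_key field and relying on the environment variable keeps the key out of the exported file.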
For further details or to contribute, visit the ComfyUI-Gemini GitHub repository.