Objective: To provide a standardized way to integrate various Text-to-Image generation services (APIs, local models, etc.) into the Lollms ecosystem.
Core Idea: Lollms uses a modular approach. Each specific TTI implementation (like Stable Diffusion via Automatic1111, ComfyUI, an online API like DALL-E, Midjourney, or Google Imagen/Gemini) is contained within its own binding or module. These modules all inherit from a common base class, LollmsTTI, ensuring they provide a consistent interface for the main Lollms application.
1. The LollmsTTI Base Class
The lollms.tti.LollmsTTI class serves as the foundation for all TTI modules. It defines the essential methods and properties that Lollms expects any TTI service to provide.
- Inheritance: LollmsTTI inherits from lollms.service.LollmsSERVICE. This is important because LollmsSERVICE provides the basic framework for any Lollms service, including:
  - Access to the main LollmsApplication instance (self.app).
  - Handling of configuration via TypedConfig (self.service_config).
  - A unique name for the service.
  - Standard logging methods via self.app (e.g., self.app.info, self.app.error).
- Key Purpose: To define a contract. Any class inheriting from LollmsTTI must implement (or override) certain methods to be compatible with Lollms’ image generation features.
- Initialization (__init__):
  - name: str: A unique identifier for this specific TTI service (e.g., “google_gemini”, “automatic1111”, “comfyui”).
  - app: LollmsApplication: The main Lollms application instance, providing access to global settings, paths, logging, etc.
  - service_config: TypedConfig: An object holding the specific configuration settings for this TTI service (e.g., API keys, model paths, default parameters). This is usually created from a ConfigTemplate.
  - output_folder: str | Path | None: Specifies where generated images should be saved. If None, it defaults to a standard location within the Lollms personal outputs path (lollms_paths.personal_outputs_path / name). The constructor ensures this folder exists.
- Core Generation Methods:
  - paint(...): The primary method for standard text-to-image generation.
    - Input: positive_prompt, negative_prompt, and common Stable Diffusion-style parameters (sampler_name, seed, scale, steps, width, height). It also takes optional output_folder and output_file_name overrides.
    - Implementation Detail: Not all TTI backends support all of these parameters. A specific module (like LollmsGoogleGemini) might ignore parameters like seed, steps, sampler_name, or derive width/height from an aspect_ratio setting, as seen in the example. The implementation should handle the parameters relevant to its backend API or model.
    - Output: Tuple[Path | None, Dict | None]. A tuple containing the Path to the first successfully generated image and a Dict with its corresponding metadata, or (None, error_dict) on failure.
  - paint_from_images(...): Intended for image-to-image, inpainting, or image variation tasks.
    - Input: positive_prompt, a List[str] of input image file paths (images), negative_prompt, and the same optional parameters as paint.
    - Implementation Detail: Similar to paint, the specific module must adapt this to its backend’s capabilities. Some backends might only use the first image, others might support multiple. The LollmsGoogleGemini example shows using the first input image with Gemini for image+text prompting.
    - Output: Same format as paint: Tuple[Path | None, Dict | None]. Returns the path to the first successfully generated output image and its metadata, or (None, error_dict) on failure or if the backend doesn’t support image-to-image.
- Static Utility Methods: These methods may be called by Lollms before any instance of the TTI module has been created. They operate at the class level.
  - verify(app: LollmsApplication) -> bool: Checks if the necessary prerequisites for this TTI module are met. This usually involves checking if required libraries are installed (e.g., google-generativeai for the Gemini example) or if essential configuration (like an API key placeholder) exists. It often uses helper libraries like PackageManager. Should return True if usable, False otherwise.
  - install(app: LollmsApplication) -> bool: Attempts to install any missing prerequisites identified by verify. Typically uses PackageManager.install_package. Should return True on success, False on failure.
  - get(app: LollmsApplication, config: Optional[dict]=None, lollms_paths: Optional[LollmsPaths]=None) -> 'LollmsTTI': A factory method. Lollms calls this static method to get an actual instance of the specific TTI class (e.g., LollmsGoogleGemini). It passes the app instance and potentially pre-loaded configuration (config) and paths (lollms_paths).
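The contract described in this section can be pictured with a small stand-in class. This is only a sketch under stated assumptions: TTIContractSketch is a hypothetical name, it does not inherit from LollmsSERVICE, the default parameter values are placeholders, and a local outputs folder stands in for the Lollms personal outputs path.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union

# Return shape shared by paint and paint_from_images, as described above.
PaintResult = Tuple[Optional[Path], Optional[Dict]]


class TTIContractSketch:
    """Hypothetical stand-in for the interface described for LollmsTTI."""

    def __init__(self, name: str, app, service_config,
                 output_folder: Union[str, Path, None] = None):
        self.name = name
        self.app = app
        self.service_config = service_config
        # Assumption: "outputs/<name>" stands in for
        # lollms_paths.personal_outputs_path / name.
        self.output_folder = Path(output_folder) if output_folder else Path("outputs") / name
        self.output_folder.mkdir(parents=True, exist_ok=True)

    def paint(self, positive_prompt: str, negative_prompt: str = "",
              sampler_name: str = "", seed: Optional[int] = None,
              scale: Optional[float] = None, steps: Optional[int] = None,
              width: Optional[int] = None, height: Optional[int] = None,
              output_folder=None, output_file_name=None) -> PaintResult:
        # A concrete module replaces this with a real backend call.
        return None, {"error": f"{self.name} does not implement paint"}

    def paint_from_images(self, positive_prompt: str, images: List[str],
                          negative_prompt: str = "", **kwargs) -> PaintResult:
        # Default answer for backends without image-to-image support.
        return None, {"error": f"{self.name} does not support image-to-image"}

    @staticmethod
    def verify(app) -> bool:
        return True  # nothing to check in this sketch

    @staticmethod
    def install(app) -> bool:
        return True  # nothing to install in this sketch

    @staticmethod
    def get(app, config: Optional[dict] = None, lollms_paths=None) -> "TTIContractSketch":
        return TTIContractSketch("sketch", app, config or {})
```

A concrete module would override paint (and paint_from_images where the backend allows it) while keeping the same return shape, so callers never need to know which backend produced the image.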
2. How TTI Modules are Used by Lollms
- Discovery: Lollms scans designated directories for bindings/modules.
- Verification & Installation: For each potential TTI module found, Lollms might call its static verify method. If verification fails and installation is requested or automatic, it calls the static install method.
- Listing: Verified modules are presented to the user as available TTI services.
- Selection & Configuration: The user selects a TTI service. Lollms loads its configuration template (ConfigTemplate) and presents the settings (like API key, model choice, etc.) to the user. User inputs are saved.
- Instantiation: When the user wants to generate an image using the selected service, Lollms calls the static get method of the chosen module’s class, passing the app instance and the loaded service config. This returns an active instance (e.g., an instance of LollmsGoogleGemini).
- Generation Request: When the user provides prompts (and potentially other parameters or input images), Lollms calls the appropriate method on the instantiated object:
  - For text-to-image: instance.paint(...)
  - For image-to-image/variation: instance.paint_from_images(...)
- Execution: The module’s paint or paint_from_images method executes:
  - It potentially initializes its specific client or loads its model if not already done (as seen in _initialize_client in the example).
  - It translates the Lollms parameters (prompts, dimensions, etc.) into the format required by its backend (API call parameters, model inference arguments). It may ignore unsupported Lollms parameters.
  - It communicates with the backend (makes the API call, runs the inference).
  - It receives the image data (e.g., bytes, base64). Even if the backend generates multiple images, the implementation should focus on processing and saving the first one for the standard return value.
  - It saves the first image’s data to a file in the designated output folder (using helpers like find_next_available_filename if no specific name is given). PIL (Pillow) is commonly used for saving.
  - It constructs the metadata dictionary for the saved image.
  - It returns the tuple (saved_path, metadata_dict) or (None, error_dict) back to Lollms.
- Display: Lollms receives the tuple. If the first element (image_path) is not None, it displays the generated image and potentially the metadata. If it’s None, it displays the error message from the second element (error_dict).
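The sequence above (verify, install if needed, get, paint, display) can be sketched from the caller's side. This is an illustration rather than actual Lollms code: FakeTTI and run_generation are hypothetical names, and the fake service returns a path without writing any file.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


class FakeTTI:
    """Hypothetical service exposing the static and instance methods described above."""

    @staticmethod
    def verify(app) -> bool:
        return True

    @staticmethod
    def install(app) -> bool:
        return True

    @staticmethod
    def get(app, config: Optional[dict] = None, lollms_paths=None) -> "FakeTTI":
        return FakeTTI()

    def paint(self, positive_prompt: str, negative_prompt: str = "",
              **kwargs) -> Tuple[Optional[Path], Optional[Dict]]:
        if not positive_prompt:
            return None, {"error": "Empty prompt"}
        # No file is written here; the path is a placeholder.
        return Path("outputs") / "fake" / "img_0.png", {"prompt": positive_prompt}


def run_generation(service_cls, app, config: dict, prompt: str) -> str:
    """Walk through verification, instantiation, generation and display."""
    if not service_cls.verify(app):
        if not service_cls.install(app):
            return "error: prerequisites could not be installed"
    instance = service_cls.get(app, config)
    image_path, info = instance.paint(prompt)
    if image_path is None:
        # On failure the second element carries the error message.
        return f"error: {info['error']}"
    return f"image: {image_path.as_posix()}"
```

The caller only branches on whether the first element of the tuple is None, which is why every module is expected to honor that return convention.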
3. Developing a New TTI Module
Here’s a step-by-step guide based on the LollmsGoogleGemini example:
- Create the Module File: Create a Python file in the appropriate Lollms bindings directory (e.g., bindings/google_gemini/binding.py). Add header comments detailing the binding (Title, Author, Licence, Description, Requirements).
- Import Necessary Libraries: Import LollmsTTI, LollmsApplication, Path, List, Dict, Optional, Tuple, configuration classes (TypedConfig, ConfigTemplate), utility functions (find_next_available_filename, ASCIIColors, trace_exception), and any libraries required for your specific TTI backend (e.g., requests, google.generativeai, PIL, diffusers, etc.).
- Handle Dependencies:
  - Use a try...except ImportError block to check if the required backend libraries are installed.
  - If missing, use ASCIIColors.info to inform the user and PackageManager.install_package to install them. Re-import after installation or raise an error if it fails. This ensures the binding doesn’t crash Lollms if dependencies are missing initially.
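The dependency-handling pattern can be sketched with the standard library alone. In this sketch, import_or_install is a hypothetical helper, print stands in for ASCIIColors.info, and the installer argument is a stand-in for a call such as PackageManager.install_package; it returns None instead of raising when the re-import fails.

```python
import importlib
from types import ModuleType
from typing import Callable, Optional


def import_or_install(module_name: str, package_name: str,
                      installer: Callable[[str], bool]) -> Optional[ModuleType]:
    """Import a module, trying one installation attempt if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Stand-in for ASCIIColors.info.
        print(f"{package_name} is missing, attempting installation...")
    if not installer(package_name):
        return None
    # Make freshly installed packages visible to the import system.
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None
```

Returning None lets the binding load in a degraded state and report the problem later from verify or paint, rather than failing at import time.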
- Define the Class: Create a class that inherits from LollmsTTI.

      from lollms.tti import LollmsTTI
      # ... other imports

      class LollmsMyTTIService(LollmsTTI):
          # ... implementation ...

- Implement __init__:
  - Define the __init__ method accepting app, service_config, and optional lollms_paths (if needed beyond what app provides) or config dict.
  - Define the ConfigTemplate for your service’s settings (API keys, models, defaults).
  - Create the TypedConfig instance: service_config = TypedConfig(service_config_template, config or {}).
  - Crucially, call the parent constructor: super().__init__("my_tti_service_name", app, service_config, output_folder). Ensure the name is unique. Handle the output_folder logic as in the base class or example.
  - Initialize any state needed for your service (e.g., set API clients to None, load default values from service_config).
  - Optionally, attempt to initialize the backend client immediately if configuration allows (like the Gemini example checking for an API key).
- Implement paint:
  - Define the paint method with the -> Tuple[Path | None, Dict | None] signature.
  - Add logic to ensure your backend client/model is initialized (call an internal _initialize_client or similar helper if needed). Handle initialization failures by returning (None, {"error": ...}).
  - Retrieve necessary parameters from self.service_config (e.g., model name, specific quality settings).
  - Translate the input parameters (positive_prompt, negative_prompt, etc.) into the format expected by your backend API or model. Handle unsupported parameters and combine prompts as needed.
  - Make the call to your TTI backend within a try...except block. Catch specific API/model errors and generic exceptions. In except blocks, log the error using self.app.error and trace_exception, then return (None, {"error": f"Descriptive error message: {e}"}).
  - Process the response. Extract the image data for the first image received.
  - Determine the output filename (use output_file_name if provided, otherwise generate one using find_next_available_filename).
  - Save the image data to a file using PIL or another library. Ensure the output folder exists (self._ensure_output_folder). If saving fails, return (None, {"error": "Failed to save image"}).
  - Create the metadata dictionary for the saved image.
  - Return (saved_file_path, metadata). If processing fails after the API call but before saving, return (None, {"error": "Failed to process image data"}).
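The paint steps above can be sketched as a single function. Several things here are assumptions made for illustration: paint_sketch and next_free_path are hypothetical names (the latter stands in for find_next_available_filename), the backend is any callable returning a list of raw image bytes, and the bytes are written directly instead of going through PIL.

```python
from pathlib import Path
from typing import Callable, Dict, List, Optional, Tuple


def next_free_path(folder: Path, prefix: str = "img_", ext: str = "png") -> Path:
    """Hypothetical stand-in for find_next_available_filename."""
    index = 0
    while (folder / f"{prefix}{index}.{ext}").exists():
        index += 1
    return folder / f"{prefix}{index}.{ext}"


def paint_sketch(backend: Callable[[Dict], List[bytes]],
                 positive_prompt: str, negative_prompt: str = "",
                 width: int = 512, height: int = 512,
                 output_folder: Path = Path("outputs"),
                 output_file_name: Optional[str] = None
                 ) -> Tuple[Optional[Path], Optional[Dict]]:
    # Translate the Lollms-style parameters into a backend request.
    request = {"prompt": positive_prompt, "negative_prompt": negative_prompt,
               "width": width, "height": height}
    try:
        images = backend(request)
    except Exception as e:
        return None, {"error": f"Backend call failed: {e}"}
    if not images:
        return None, {"error": "Failed to process image data"}
    try:
        output_folder.mkdir(parents=True, exist_ok=True)
        if output_file_name:
            target = output_folder / output_file_name
        else:
            target = next_free_path(output_folder)
        # Only the first image is saved, matching the return contract.
        target.write_bytes(images[0])
    except OSError as e:
        return None, {"error": f"Failed to save image: {e}"}
    metadata = {"positive_prompt": positive_prompt,
                "negative_prompt": negative_prompt,
                "width": width, "height": height, "path": str(target)}
    return target, metadata
```

In a real module the same flow would live in the paint method, with logging through self.app.error and trace_exception inside the except blocks.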
- Implement paint_from_images:
  - Define the paint_from_images method with the -> Tuple[Path | None, Dict | None] signature.
  - Check if your backend supports image inputs. If not, log an error and return (None, {"error": "Backend does not support image-to-image"}).
  - Load the first input image from images[0]. Handle errors (file not found, load errors) by returning (None, {"error": ...}).
  - Follow similar steps as paint: initialize client, translate parameters, make the backend call (providing the image data), handle errors (returning (None, error_dict)), process the response for the first output image, save it, create metadata (include input image info), and return (saved_file_path, metadata).
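Loading the first input image is the step most likely to fail before the backend is even contacted. A minimal sketch, assuming a hypothetical helper named load_first_image that reads raw bytes (a real module might open the file with PIL instead):

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def load_first_image(images: List[str]) -> Tuple[Optional[bytes], Optional[Dict]]:
    """Return (image_bytes, None) on success or (None, error_dict) on failure."""
    if not images:
        return None, {"error": "No input image provided"}
    first = Path(images[0])
    if not first.is_file():
        return None, {"error": f"Input image not found: {first}"}
    try:
        return first.read_bytes(), None
    except OSError as e:
        return None, {"error": f"Could not read input image: {e}"}
```

paint_from_images can return the error dictionary unchanged as the second element of its own result, so the failure reaches the user in the same shape as any other error.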
- Implement Static Methods:
  - verify(app: LollmsApplication) -> bool: Implement the check for dependencies (e.g., return PackageManager.check_package_installed("my_required_library")).
  - install(app: LollmsApplication) -> bool: Implement the installation command (e.g., return PackageManager.install_package("my_required_library")).
  - get(app: LollmsApplication, config: Optional[dict]=None, lollms_paths: Optional[LollmsPaths]=None) -> 'LollmsMyTTIService': Implement the factory method: return LollmsMyTTIService(app=app, config=config, lollms_paths=lollms_paths).
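The three static methods might look like the following when written without any Lollms imports. This is a sketch under assumptions: MyTTIStaticsSketch is a hypothetical class that does not inherit from LollmsTTI, importlib.util.find_spec replaces PackageManager.check_package_installed, a pip subprocess replaces PackageManager.install_package, and the json module is only a placeholder for a real backend library.

```python
import importlib.util
import subprocess
import sys
from typing import Optional


class MyTTIStaticsSketch:
    """Hypothetical class showing only the static entry points."""

    # Placeholder: a real binding would name its backend library here.
    REQUIRED_MODULE = "json"

    def __init__(self, app, config: Optional[dict] = None, lollms_paths=None):
        self.app = app
        self.config = config or {}
        self.lollms_paths = lollms_paths

    @staticmethod
    def verify(app) -> bool:
        # True when the required top-level module can be located.
        return importlib.util.find_spec(MyTTIStaticsSketch.REQUIRED_MODULE) is not None

    @staticmethod
    def install(app) -> bool:
        # Install with the interpreter that is running this code.
        result = subprocess.run(
            [sys.executable, "-m", "pip", "install", MyTTIStaticsSketch.REQUIRED_MODULE],
            capture_output=True,
        )
        return result.returncode == 0

    @staticmethod
    def get(app, config: Optional[dict] = None, lollms_paths=None) -> "MyTTIStaticsSketch":
        # Factory: the caller never invokes the constructor directly.
        return MyTTIStaticsSketch(app=app, config=config, lollms_paths=lollms_paths)
```

Keeping verify free of side effects matters because it may run for every discovered module, including ones the user never selects.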
- Helper Methods (Optional but Recommended):
  - Create private helper methods like _initialize_client, _ensure_output_folder, etc., to keep paint and paint_from_images cleaner, as shown in the Gemini example.
  - Consider a settings_updated method if client re-initialization is needed when settings (like API key) change via the UI.
- Testing: Test thoroughly with different prompts, negative prompts, parameters, error conditions (invalid API key, network down, invalid input image), and ensure the correct tuple (Path, Dict) or (None, Dict) is returned in all scenarios.
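A small checker can make the return-shape requirement easy to assert in tests. The helper below is a hypothetical sketch: it encodes the convention described in this guide, (Path, Dict) on success and (None, Dict) with an "error" key on failure, and is not part of Lollms.

```python
from pathlib import Path
from typing import Any


def is_valid_paint_result(result: Any) -> bool:
    """Check a paint / paint_from_images result against the documented shape."""
    if not (isinstance(result, tuple) and len(result) == 2):
        return False
    path, info = result
    if path is None:
        # Failure: the second element must explain what went wrong.
        return isinstance(info, dict) and "error" in info
    # Success: a Path to the saved image plus its metadata.
    return isinstance(path, Path) and isinstance(info, dict)
```

Each scenario in a test suite (invalid API key, network failure, bad input image) can then end with a single assertion on is_valid_paint_result(result) before checking scenario-specific details.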