HomeStaging AI - Image Generation/Inpaint/Upscale by AI
AI
PyTorch + CUDA 12.8, Diffusers, Transformers, Stable Diffusion, ControlNet, OpenCV, Docker, xFormers, GPU, Python

KandyAI HomeStaging — Project Description (EN)

Distributed GPU worker that orchestrates 4+ AI models in a pipeline to transform interior photographs into realistic staged renders. Built as the core processing engine of a SaaS platform for AI-assisted interior design.
Multi-Model Pipeline Orchestration
- Chains Stable Diffusion 1.5, ControlNet (depth-guided), Apple DepthPro, and Meta Segment Anything in a dependency-aware execution graph
- Tasks declare their prerequisites (depth map, segmentation masks); the runtime resolves them and caches intermediate assets automatically, so a depth map generated once is reused across all subsequent jobs for the same source image
- Each pipeline stage can fail independently without cascading — graceful degradation with per-stage error checkpoints and structured traceback logging
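
The prerequisite resolution and asset caching described above can be sketched roughly as follows. This is an illustration only, not the project's actual runtime: the `asset` decorator, `REGISTRY`, `CACHE`, `resolve`, `run_task` and the placeholder producers are all hypothetical names, and real producers would run DepthPro or Segment Anything instead of returning strings.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical registry: asset name -> (prerequisite names, producer function).
Producer = Callable[[str, Dict[str, object]], object]
REGISTRY: Dict[str, Tuple[Tuple[str, ...], Producer]] = {}

# Cache keyed by (source image id, asset name) so assets are reused across jobs.
CACHE: Dict[Tuple[str, str], object] = {}
CALLS: list = []  # records which producers actually ran (illustration only)


def asset(name: str, requires: Sequence[str] = ()):
    """Register a producer together with the assets it needs."""
    def decorator(fn: Producer) -> Producer:
        REGISTRY[name] = (tuple(requires), fn)
        return fn
    return decorator


@asset("depth_map")
def make_depth(image_id, deps):
    CALLS.append("depth_map")
    return f"depth({image_id})"  # placeholder for a real depth model


@asset("segmentation")
def make_segmentation(image_id, deps):
    CALLS.append("segmentation")
    return f"seg({image_id})"  # placeholder for a real segmentation model


@asset("wall_mask", requires=("segmentation",))
def make_wall_mask(image_id, deps):
    CALLS.append("wall_mask")
    return f"mask({deps['segmentation']})"


def resolve(image_id: str, name: str, _stack: Tuple[str, ...] = ()):
    """Build an asset depth-first, caching every intermediate result."""
    key = (image_id, name)
    if key in CACHE:
        return CACHE[key]
    if name in _stack:
        raise ValueError(f"dependency cycle at {name!r}")
    requires, producer = REGISTRY[name]
    deps = {r: resolve(image_id, r, _stack + (name,)) for r in requires}
    CACHE[key] = producer(image_id, deps)
    return CACHE[key]


def run_task(image_id: str, requires: Sequence[str], fn: Producer):
    """Resolve a task's declared prerequisites, then run the task itself."""
    deps = {r: resolve(image_id, r) for r in requires}
    return fn(image_id, deps)
```

In this sketch, a second job on the same image that only needs `depth_map` hits the cache and runs no producer at all, while a job on a different image rebuilds its own assets.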
Generation (Depth-Guided Image Synthesis)

- Generates photorealistic interior scenes from a source photograph, preserving room geometry via ControlNet depth conditioning
- 10 design styles (Scandinavian, Minimalist, Art Deco, Japanese…) x 5 room types, driven by a structured prompt catalog (250+ prompt pairs with positive/negative guidance)
- LoRA adapter blending — loads multiple Low-Rank Adaptation models with individual weights and merges them at inference time for fine-grained style control
- Aspect ratio preservation with intelligent resizing: calculates optimal dimensions for a 1M-pixel budget, rounds to multiples of 8 for VAE compatibility, handles both upscaling and downscaling
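
The pixel-budget resizing can be illustrated with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function name `fit_to_pixel_budget` is hypothetical, and rounding down to the nearest multiple of 8 (so the budget is never exceeded) is one possible choice, since the description does not say which rounding direction is used.

```python
import math


def fit_to_pixel_budget(width: int, height: int,
                        budget: int = 1_000_000, multiple: int = 8):
    """Scale (width, height) to roughly `budget` pixels, keeping aspect ratio.

    The same scale factor handles upscaling (factor > 1) and downscaling
    (factor < 1). Each side is rounded down to a multiple of `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = math.sqrt(budget / (width * height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


print(fit_to_pixel_budget(4000, 3000))  # downscale a 12 MP photo
print(fit_to_pixel_budget(640, 480))    # upscale a small 4:3 image
```

Both calls share a 4:3 aspect ratio, so they land on the same target size; rounding to a grid means the final aspect ratio can differ from the source by a fraction of a percent.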
Inpainting (Selective Region Modification)

- Full pipeline: semantic segmentation → mask extraction → color preprocessing → depth-guided inpainting → soft blending
- Color preprocessing applies desaturation (Rec. 601 luma), white lift, gamma correction, and alpha tinting per material preset (wall, parquet, tiles) — normalizes lighting before diffusion to improve coherence

- Gaussian-feathered soft masking with gamma-controlled falloff eliminates hard seams between inpainted and original regions
- Mask extraction from segmentation tensors by semantic class label, enabling targeted modification of walls, floors, or ceilings independently
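
Two of the steps above, Rec. 601 desaturation and Gaussian-feathered blending, can be sketched in NumPy. The function names, the default `sigma` and `gamma` values, and the order of blur then gamma are assumptions made for illustration; the project may well use OpenCV's blur and different parameters.

```python
import numpy as np

# Rec. 601 luma weights for R, G, B.
REC601 = np.array([0.299, 0.587, 0.114])


def desaturate(rgb: np.ndarray, amount: float) -> np.ndarray:
    """Blend each pixel toward its Rec. 601 luma.

    rgb is a float array in [0, 1] with shape (H, W, 3);
    amount=0 leaves the image unchanged, amount=1 gives grayscale.
    """
    luma = rgb @ REC601
    return rgb + amount * (luma[..., None] - rgb)


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def feather_mask(mask: np.ndarray, sigma: float = 2.0,
                 gamma: float = 1.0) -> np.ndarray:
    """Blur a binary (H, W) mask separably, then shape the falloff with gamma."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(mask.astype(float), pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    soft = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    return np.clip(soft, 0.0, 1.0) ** gamma


def soft_blend(original: np.ndarray, inpainted: np.ndarray, mask: np.ndarray,
               sigma: float = 2.0, gamma: float = 1.0) -> np.ndarray:
    """Composite inpainted pixels over the original using the feathered mask."""
    alpha = feather_mask(mask, sigma, gamma)[..., None]
    return alpha * inpainted + (1.0 - alpha) * original
```

In this sketch a `gamma` above 1 pulls the transition toward the inside of the mask, and a value below 1 pushes it outward.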
Architecture & Reliability
- Stateless multi-worker design with PostgreSQL row-level locking — horizontal scaling by adding GPU instances, no shared state between workers
- Exponential backoff with jitter on all network calls (configurable retries, separate connect/read timeouts, thundering-herd prevention)
- Pydantic V2 discriminated unions for type-safe job dispatch — the API response is parsed into the correct task type at deserialization time, with automatic model/prompt injection from configuration
- Per-job contextual logging via Python context managers — each job gets its own timestamped log file with dual output (file + console)
- Idempotent model management: HuggingFace snapshot_download with local cache, existence checks before every download, extensible registry pattern for adding new models
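
The retry behaviour in the list above can be illustrated with a small standard-library sketch. It assumes the "full jitter" variant (each delay drawn uniformly between zero and the capped exponential bound); the function names, default values and the set of retryable exceptions are hypothetical, not taken from the project.

```python
import random
import time


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0,
                   rng=random.random):
    """Yield one randomized delay per retry: rng() * min(cap, base * 2**attempt)."""
    for attempt in range(retries):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retries(fn, retries: int = 5, base: float = 0.5, cap: float = 30.0,
                      retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn(); on a retryable error, wait a jittered delay and try again.

    fn is attempted at most retries + 1 times; the last error is re-raised.
    """
    delays = backoff_delays(retries, base, cap)
    while True:
        try:
            return fn()
        except retry_on:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the original error
            sleep(delay)
```

Randomizing the delay is what keeps many workers from retrying in lockstep after a shared outage. Separate connect and read timeouts would be set inside `fn` itself; with the `requests` library, for example, that is a `(connect, read)` tuple passed as `timeout`.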
Containerization & Deployment
- Docker with NVIDIA CUDA 12.8 base, in-container venv isolation, layer ordering optimized by change frequency
- Cross-platform subprocess management for the upscaler binary (platform-specific encoding, path normalization)
- Discord webhook notifications with image attachments for real-time monitoring
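
For the upscaler subprocess handling, one possible shape is sketched below. The wrapper name `run_binary` and the specific encoding choice are assumptions, since the description only says that encoding is platform-specific and paths are normalized.

```python
import locale
import os
import subprocess
import sys


def run_binary(binary, args, timeout=None):
    """Run an external binary and capture its output as text.

    Assumptions for this sketch: on Windows the console output is decoded
    with the locale's preferred encoding, elsewhere with UTF-8. Undecodable
    bytes are replaced rather than raising.
    """
    # Normalize separators and redundant segments in the executable path.
    cmd = [os.path.normpath(os.fspath(binary)), *(str(a) for a in args)]
    if sys.platform == "win32":
        encoding = locale.getpreferredencoding(False)
    else:
        encoding = "utf-8"
    # Passing a list (no shell) avoids platform-specific quoting rules.
    return subprocess.run(cmd, capture_output=True, text=True,
                          encoding=encoding, errors="replace",
                          timeout=timeout, check=False)


result = run_binary(sys.executable, ["-c", "print('ok')"])
print(result.returncode, result.stdout.strip())
```

The usage lines call the Python interpreter only as a stand-in for the real upscaler binary, whose name and flags are not given here.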
Tech Stack
Python 3, PyTorch + CUDA 12.8, Diffusers, Transformers, Stable Diffusion 1.5, ControlNet, LoRA, DepthPro, Segment Anything, xformers, Pydantic V2, NumPy, OpenCV, Pillow, Docker (NVIDIA CUDA), PostgreSQL