Vistoria
My Role
Solo Designer, UX Researcher
Description
A B2B SaaS multimodal co-editing platform that unifies fragmented AI workflows on a single canvas: it links narrative and visuals, structures messy ideas into assets, and ships production-ready creative content.
Scope
Concept → Research → Interaction model → High-fi deployed prototype → Field test
Timeline
12 weeks / May 2025 - Sep 2025
Tools
Design (Figma), Research (Survey, Interview, Wizard-of-Oz, Controlled Study)
Platform
Web-based
Context and Problem
Where Creativity Breaks Today
We target B2B creative production teams that need fast, contextualized iterations from concept to prototype.
We transform divergent thinking into a structured, reusable, and auditable creative pipeline: text and visuals stay in sync in real time on the same canvas, branches are traceable, and exports can be delivered immediately.
Product Strategy and Value Loop
+01 Studio SaaS: Seat-based annual subscription for the co-editing canvas, two-way text↔visual sync, Instruments, one-click exports, etc.
+02 Enterprise SaaS: SSO, VPC/private models, governance (Style Tokens, brand/IP policy), provenance/audit, SLAs
+03 SDK/API: Read/write cards, export packs, embed the canvas into DAM/MAM/game build tools
Vistoria is a multimodal authoring-and-control platform built for the rapidly growing video-led creator economy and its need for auditable AI workflows.
Market Size
Go-to-Market Strategy
Design Thinking
Research to Design
We designed a Wizard-of-Oz (WoZ) co-design study, positioning 8 participants as active co-designers and treating text, sketches, and images as shared design materials.
Result
Solution
Architecture
Input: Multimodal Free Canvas -> Output: Setting Cards
Solving Pain Point #1: Difficult to express and summarize fragments of ideas
Live Multimodality Edit and Sync
Solving Pain Point #2: Flow interrupted and reworked due to fragmented and linear tools
Output Cluster, Instant Revision, One-Click Output
Difficult to quickly deliver usable content from massive storyboards
Final Design
Validation and Results
Reflection
What Vistoria Achieved
Validated Multimodal Interplay
Formative studies (N=10) confirmed that text and images play distinct, complementary roles: text facilitates abstract imagination, while images provide concrete grounding and inspiration.
Instrumental Interaction Implementation
Successfully reified abstract creative intents into four instrumental operations (Lasso, Collage, Perspective Shift, Filter) that synchronize edits across text and image simultaneously.
Enhanced Expressiveness & Immersion
A controlled study (N=12) demonstrated significantly higher expressiveness and immersion compared to text-only LLM baselines, fostering divergent narrative exploration.
Preservation of Agency
Despite higher reported cognitive workload ($M=5.16$) due to modality switching, users maintained a strong sense of ownership and agency, viewing the system as a creative partner rather than a replacement.
Next Steps
Longitudinal Field Deployment
Conduct studies with diverse expertise levels to assess Vistoria's scalability for long-form writing and its consistency over extended creative cycles.
Observe how writers spontaneously capture and cluster dispersed inspirational fragments in authentic contexts.
Adaptive Mixed-Initiative Workflows
Implement meta-instruments to regulate the division of labor, allowing smooth transitions between model-led suggestions (for "blank slate" moments) and user-led instrumental refinement.
Expanded Multimodality
Integrate audio cues and ambient sound effects to support atmosphere building and provide an additional channel for creative inspiration.
Rigorous Evaluation Metrics
Incorporate construct-grounded measures and validated scales (e.g., Mixed-Initiative CSI) to move beyond self-reporting and better evaluate objective story quality and originality.