Ji Kim

Visual, Interaction, Experience Design

Ji Kim is a visual experience & interaction designer with a solid foundation in graphic design. She specializes in crafting engaging and playful visual experiences in digital media and has extensive experience collaborating with multidisciplinary teams. Most recently, Ji was part of a design R&D team at a global brand, where she focused on creating bespoke retail experiences.
Currently open to new projects

narrative.ji@gmail.com
LinkedIn


Terrarium Spatial Video Player Case Study

2024
Personal Project 
Final Project at UXD for the ARVR Certificate, NYU Tandon

Spatial Video UI Exploration

Role
UX/UI, Interaction Design, Visual Design, Prototyping in Unity
Exploring Spatial Video Browsing Beyond Players

Overview
Terrarium is a spatial video gallery exploration that investigates how video browsing behavior changes when freed from flat screens. Rather than treating VR as a backdrop for 2D players, this project treats videos as spatial objects, with depth, proximity, and movement serving as interaction primitives.

This work began with a 2D web-based exploration and evolved into an XR prototype built and tested in Unity. The goal was not to build a feature-complete product, but to understand how short-form videos behave when browsing becomes spatial.


Prototype v1.3 Screenshot


01. Problem: Video Browsing is Still Flat


Most VR video players replicate familiar 2D metaphors:

    • Rectangular panels floating in space
    • Theater-like environments
    • One video at a time, enlarged for focus


This approach works well for long-form media.
But it breaks down for short-form, memory-driven videos, where users want to:

      • Browse many clips
      • Compare moments
      • Mentally recreate the atmosphere rather than focus on a single narrative


    The core question quickly became:

    “What if video browsing in XR was spatial instead of sequential?”


    Current UX Problems & Pain Points
    Too Big, Too Fuzzy, Too Dizzy

    Most spatial video players enlarge the video to mimic a theater setting, which often leaves users too close to the video surface. A 360 video player stretches the media to texture-map it onto a sphere, which degrades the asset’s apparent resolution and can cause motion sickness.
    Stuck in Design Convention

    Most spatial video players adhere to traditional 2D video player design principles. The bottom line is that most existing video players are designed to accommodate long-form media like movies.
    Mimic Real World

    Most spatial video player apps use a realistic or skeuomorphic screening environment as their base space, such as a movie theater. This reproduction of the physical world limits users to watching one video at a time.

    User Behavior Analysis by Video Content Type
    Do users behave differently depending on what they watch? What are the differences?
    Movies, TV Shows (10 mins ~)

    Browse the contents briefly. Watching individual content takes longer.
    Occupy the space by sitting or lying back. 
    Experience is semi-public or public.
    Short-form videos (~3 mins)

    Browse content longer. Watching individual content takes less time.
    The viewing device is handheld.
    Experience is personal.


    02. Precursor Exploration: When 2D Hit Limits


    Before moving into XR, I explored this concept through a web-based experiment, www.sceneries.site
    The site allowed multiple videos to coexist on a single plane, emphasizing simultaneity rather than playlists.

    Although visually compelling, this experiment revealed the hard limits of 2D interaction.

        • Hierarchy had to be flattened into scale and overlap
        • Browsing and focus competed within the same plane
        • Navigation relied entirely on scrolling and clicking


    This clarified that the challenge was not visual—it was structural.
    The experience required depth, proximity, and embodied movement, which directly led to the exploration of the concept in XR.

    www.sceneries.site


    03. Core Insight: Short-form Video Behaves Differently in Space


    User behavior research revealed a key distinction.

        • Long-form Video: Sit, settle, focus
        • Short-form Video: Browse, skim, loop, compare


    Short-form video naturally supports:

        • Simultaneous playback
        • Non-linear attention
        • Ambient presence rather than full focus


    This insight justified a spatial approach in which multiple videos coexist and are available for individual attention upon request.



    UI Analysis of Different 2D Video Platforms
    Video Player Platform UI and Key Interactions
    YouTube Website

    Click/Touch GUI
    Focus on watching and interacting with the selected content
    TikTok, Snapchat

    Gesture-forward, fewer GUI elements
    Focus on navigating through content


    Competitive Analysis of Video Player Layouts



    Spatial Video Player Key Interaction Sketches


    04. Spatial Design Principles

    Based on early exploration and testing, the project was guided by the following principles.

    • Multiplicity Over Sequence
      Multiple videos are played simultaneously to simulate memory density.

    • Depth as Hierarchy
      Near videos invite focus; distant videos remain ambient.

    • Surface Shape Matters
      Elliptical video surfaces reduce edge distortion and feel portal-like rather than screen-like.

    • Environment as Context
      Passthrough grounds digital content in the user’s physical space, improving comfort and presence.

    • Physical Movement Over Controller Input
      Room-scale movement replaces joystick rotation to reduce disorientation and increase embodiment.
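
The "Depth as Hierarchy" principle can be illustrated with a minimal sketch that maps a viewer's distance from a video to a focus weight. This is not the prototype's Unity code. The function name, the two radii, and the linear falloff are all hypothetical choices made for illustration.

```python
import math

# Hypothetical thresholds in meters; not measured values from the prototype.
FOCUS_RADIUS = 1.0    # within this distance a video gets full focus
AMBIENT_RADIUS = 4.0  # beyond this distance a video is purely ambient


def focus_weight(viewer, video):
    """Return 1.0 for a near video, 0.0 for a distant one, linear in between."""
    distance = math.dist(viewer, video)  # Euclidean distance between two points
    if distance <= FOCUS_RADIUS:
        return 1.0
    if distance >= AMBIENT_RADIUS:
        return 0.0
    return (AMBIENT_RADIUS - distance) / (AMBIENT_RADIUS - FOCUS_RADIUS)
```

A weight like this could drive scale, opacity, or playback detail, so that walking toward a video is what brings it into focus rather than a button press.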



    User Flowchart




    05. Prototyping & Iteration in Unity

    The project evolved through four rounds of Unity builds, each driven by qualitative user testing.

    MVP: Flat Metaphors

    • Rectangular video surfaces
    • Skybox environment
    • Stationary boundary

    Outcome 
    Felt detached and screen-like. Reinforced 2D habits.


    v1.2: Spatial Surfaces

    • Elliptical video players
    • Reduced environmental framing
    • Simultaneous playback

    Outcome
    Stronger spatial pull. Videos felt like portals, but sensory overload became an issue.



    v1.3: Grounded Interaction

    • Passthrough background (not captured in the screen recording above)
    • Room-scale boundary
    • Interactable video objects
    • Sound muting for sensory control

    Outcome 
    Users felt grounded, playful, and more willing to explore. Spatial interaction became intuitive rather than novel.




    06. User Testing Highlights

    Qualitative testing surfaced several key insights.

        • Elliptical shapes invited users to approach and explore
        • Black void environments felt detached and flattened the space
        • Passthrough increased emotional connection and comfort
        • Audio needed a hierarchy to avoid overwhelming users
        • Users wanted focus modes without breaking simultaneity


    These insights directly informed the final prototype decisions.
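
One way to picture an audio hierarchy is a rule that keeps every video playing visually while limiting how many are audible. The sketch below is an assumption-based illustration, not the prototype's implementation. The function name, the cap of two audible sources, and the ambient gain value are hypothetical.

```python
def audio_mix(distances, max_audible=2, ambient_gain=0.2):
    """Map each video's distance (meters) to a volume between 0.0 and 1.0.

    The nearest video plays at full volume, the next nearest ones play
    quietly, and everything else is muted.
    """
    # Indices of the videos, sorted from nearest to farthest.
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    volumes = [0.0] * len(distances)
    for rank, index in enumerate(order[:max_audible]):
        volumes[index] = 1.0 if rank == 0 else ambient_gain
    return volumes
```

With four videos at 3.0, 0.8, 1.5, and 6.0 meters, this rule gives the second video full volume, the third a quiet ambient level, and mutes the other two, while all four keep playing visually.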





    07. Conclusion

    Learnings

    • Spatial UI requires restraint more than features
    • Depth, distance, and scale communicate hierarchy better than an explicit 2D GUI
    • Audio is as important as visual hierarchy in spatial systems
    • Familiar metaphors slow down spatial understanding
    • Prototyping in XR is essential—many issues only appear in motion


    Next Steps

    • Introduce focus modes without collapsing simultaneity
    • Explore hand-gesture input for more intimate interaction
    • Support annotation through voice and text
    • Extend the system to Apple Vision Pro