Development of automatic audio panning system for immersive sound through stage size-aware model and object-tracking

We investigated a novel AI-assisted automated 'immersive' audio panning system designed to track audio-related objects within a video clip. The system comprises four sequential steps: Object-Tracking, Stage Dimension Estimation, XY-Coordinate Calculation, and Object Audio Rendering. It addresses the challenges that arise from the rapid and frequent movement of target objects by employing a pre-trained object-tracking model and integrating depth information, which keeps the subsequent steps stable. Additionally, we introduce a stage size-aware model that estimates stage dimensions; it is trained on our manually collected dataset, formatted as (Image, Width, Depth). The system then calculates XY-coordinate pairs, which serve as panning values for conventional audio mixers or decoders to enable immersive audio reproduction. We anticipate that this video- and space-aware automatic panning system will be valuable for the rapid production of new media.
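
To make the XY-Coordinate Calculation step more concrete, the following is a minimal Python sketch of how a tracked object's position could be turned into panning values. It is not the authors' implementation. Everything in it is an assumption made for illustration: the `TrackedObject` type, the normalized inputs `u` (horizontal position in the frame) and `depth` (relative depth), the linear mapping onto the estimated stage width and depth, the [-1, 1] pan range, and the exponential smoothing used to damp rapid movement.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    """Hypothetical tracker output for one frame (assumed, not from the paper)."""
    u: float      # horizontal centre of the bounding box, 0.0 (left) to 1.0 (right)
    depth: float  # relative depth, 0.0 (front of stage) to 1.0 (back of stage)


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def to_stage_xy(obj: TrackedObject, stage_width_m: float, stage_depth_m: float):
    """Map normalized image position and depth to stage coordinates in metres.

    x is centred on the stage (negative is left), y runs from front to back.
    The linear mapping is an assumption of this sketch.
    """
    if stage_width_m <= 0 or stage_depth_m <= 0:
        raise ValueError("stage dimensions must be positive")
    x = (clamp(obj.u, 0.0, 1.0) - 0.5) * stage_width_m
    y = clamp(obj.depth, 0.0, 1.0) * stage_depth_m
    return x, y


def to_pan_values(x: float, y: float, stage_width_m: float, stage_depth_m: float):
    """Normalize stage coordinates to pan values in [-1, 1] on both axes."""
    pan_x = clamp(2.0 * x / stage_width_m, -1.0, 1.0)
    pan_y = clamp(2.0 * y / stage_depth_m - 1.0, -1.0, 1.0)
    return pan_x, pan_y


def smooth(previous, current, alpha: float = 0.2):
    """Exponential smoothing of pan values to damp frame-to-frame jitter."""
    return tuple(alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current))


# Example: an object right of centre, halfway back on a 10 m x 6 m stage.
obj = TrackedObject(u=0.75, depth=0.5)
x, y = to_stage_xy(obj, stage_width_m=10.0, stage_depth_m=6.0)
pan = to_pan_values(x, y, stage_width_m=10.0, stage_depth_m=6.0)
smoothed = smooth((0.0, 0.0), pan)
```

In this sketch the stage width and depth would come from the stage size-aware model, and the per-frame `u` and `depth` values from the object tracker and depth estimate. The smoothing factor `alpha` is a free parameter; a real mixer or decoder may expect a different coordinate range or axis convention.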

Document Type: Research Article

Affiliations: Korea Advanced Institute of Science and Technology (KAIST)

Publication date: 04 October 2024

More about this publication?
  • The Noise-Con conference proceedings are sponsored by INCE/USA and the Inter-Noise proceedings by I-INCE. NOVEM (Noise and Vibration Emerging Methods) conference proceedings are included. All Noise-Con proceedings one year or older are free to download. Inter-Noise proceedings from outside the USA older than 10 years are free to download. Others are free to INCE/USA members and member societies of I-INCE.
