
Immersive Sound Engine (ISE)

Immerse yourself in soundscapes.

Why spatial audio tools are overcomplicated


The spatial audio industry has fragmented into incompatible camps, each with its own algorithms, parameters, and limitations. VBAP (Vector Base Amplitude Panning) works well for discrete source positioning but creates holes in the sound field between speakers. Ambisonics offers elegant mathematical foundations but requires high orders for precise localization, demanding channel counts that most systems can't handle. Wave Field Synthesis promises physical reconstruction of sound fields but needs hundreds of speakers and milliseconds of latency.


For practitioners, this fragmentation creates constant friction. A theatre sound designer learns VBAP for one venue, then faces an Ambisonic system at the next. An installation artist masters one tool, only to find it incompatible with the hardware at their exhibition space. A game audio developer implements one spatialization approach, then discovers it doesn't translate to consumer playback systems.


Beyond the learning curve, these algorithms behave differently. A source positioned at 45 degrees in VBAP doesn't sound the same as a source at 45 degrees in Ambisonics. Moving sources reveals discontinuities in some algorithms and smooth transitions in others. Creative intent gets lost in translation between tools.


What practitioners actually need isn't more algorithms — it's fewer. One tool that adapts to different scenarios. One set of parameters that make sense. One behavior that translates across systems.

How ISE works


ISE unifies gain-based and delay-based spatialization into a single, coherent engine built on acoustic physics. Every source is positioned using the inverse square law for amplitude — the same law that governs how sound actually propagates through air. Timing differences between speakers follow the speed of sound, creating natural interaural cues that your brain interprets instinctively.
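As a rough sketch (this is an illustration of the stated physics, not ISE's actual implementation), per-speaker amplitude and timing can be derived directly from the source-to-speaker distance:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, ISE's default (adjustable)

def speaker_gain_and_delay(source, speaker, sample_rate=48000):
    """Per-speaker amplitude and delay from plain acoustics.

    Gain follows the inverse square law (energy ~ 1/d^2, so
    amplitude ~ 1/d); delay follows the speed of sound.
    """
    dx, dy, dz = (s - p for s, p in zip(source, speaker))
    distance = max(math.sqrt(dx*dx + dy*dy + dz*dz), 1e-3)  # avoid div-by-zero
    gain = 1.0 / distance                          # amplitude ~ 1/d
    delay_samples = distance / SPEED_OF_SOUND * sample_rate
    return gain, delay_samples

g, d = speaker_gain_and_delay((2.0, 0.0, 0.0), (0.0, 0.0, 0.0))
# at 2 m: amplitude 0.5, delay ~280 samples at 48 kHz
```

Doubling the distance halves the amplitude (quartering the energy), and every extra meter of path adds about 2.9 ms of delay, which is the interaural timing cue described above.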


But physics alone doesn't make a creative tool. The breakthrough in ISE is the Focus parameter: a single control that morphs the algorithm's behavior from tight, discrete rendering to wide, distributed spatialization. Low Focus spreads energy across many speakers, creating diffuse, enveloping sound fields reminiscent of Wave Field Synthesis. High Focus concentrates energy on the nearest speakers, producing precise, localized sources like a nearest-neighbor approach. Medium Focus balances these extremes, delivering natural panning behavior similar to VBAP.


One algorithm. One parameter. Three worlds of spatial behavior.

This isn't a compromise or a lowest-common-denominator approach. By varying the mathematical curve that maps distance to gain — linear, exponential, or logarithmic — ISE genuinely reproduces the characteristics of different spatialization philosophies. You're not simulating VBAP; you're getting VBAP-like behavior from the same physics-based engine that also delivers WFS-like behavior when you need it.
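One way to picture this morphing is a single weight function whose curve shape changes with Focus. The curve shapes and blending below are illustrative assumptions; ISE's actual coefficients are not published:

```python
import math

def focus_weight(distance, focus, max_dist=10.0):
    """Gain weight for one speaker under a single Focus control (0..1).

    Illustrative sketch only. Low Focus uses a logarithmic curve
    (gentle rolloff, WFS-like spread), medium a linear curve
    (VBAP-like panning), high an exponential curve (nearest-neighbour
    localisation).
    """
    d = min(max(distance / max_dist, 0.0), 1.0)           # normalise 0..1
    log_w = math.log1p(9.0 * (1.0 - d)) / math.log(10.0)  # gentle rolloff
    lin_w = 1.0 - d                                       # straight rolloff
    exp_w = math.exp(-8.0 * d)                            # sharp rolloff
    if focus < 0.5:                                       # blend log -> linear
        t = focus * 2.0
        return (1.0 - t) * log_w + t * lin_w
    t = (focus - 0.5) * 2.0                               # blend linear -> exp
    return (1.0 - t) * lin_w + t * exp_w
```

For a speaker halfway across the array, low Focus keeps it loud, medium gives it moderate level, and high Focus nearly silences it: the same position, three spatial behaviors.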


The practical implications are profound. You don't need to learn multiple systems. You don't need to re-create your spatial design when moving to different hardware. You adjust one parameter and the algorithm adapts. Sources that need precise localization get high Focus; ambient elements that should envelop the listener get low Focus. The same project, the same session, the same tool.


What sets ISE apart


One algorithm for all behaviors


The Focus parameter isn't a blend between presets; it's a continuous control over the fundamental mathematics of spatial rendering. At any Focus setting, ISE delivers coherent, physically grounded spatialization. You're not choosing between algorithms; you're choosing where on a continuum of spatial behaviors your source should live.



Real physics foundation


Sound in the real world follows the inverse square law: double the distance, quarter the energy. ISE's gain calculations honor this relationship, producing natural distance perception without artificial processing. The speed of sound determines timing differences between speakers, creating interaural cues that your auditory system interprets automatically.
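The same relationship expressed in decibels (a worked check, not ISE code): quartering the energy at double the distance is a drop of about 6 dB.

```python
import math

def level_change_db(d1, d2):
    """Level change in dB when a source moves from distance d1 to d2,
    under the inverse square law (amplitude ~ 1/d, so 20*log10)."""
    return 20.0 * math.log10(d1 / d2)

level_change_db(1.0, 2.0)   # doubling the distance costs about 6 dB
level_change_db(1.0, 10.0)  # ten times the distance costs 20 dB
```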



Intelligent normalization


When positioning sources at different distances, raw inverse square law calculations would produce huge level differences: a source 1 meter away would carry 100 times the energy of a source 10 meters away, a 20 dB gap. ISE's normalization system subtracts the smallest attenuation from all channels, preserving relative distance relationships while maintaining consistent headroom.
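In linear amplitude terms, subtracting the smallest attenuation in dB is the same as dividing every channel gain by the largest one. A minimal sketch of that normalization (illustrative, not ISE's source):

```python
def normalize_gains(distances):
    """Inverse-square gains, normalised so the loudest channel sits at 0 dB.

    Dividing by the peak gain is equivalent to subtracting the smallest
    attenuation (in dB) from every channel: relative distance
    relationships survive while headroom stays fixed.
    """
    raw = [1.0 / max(d, 1e-3) for d in distances]  # amplitude ~ 1/d
    peak = max(raw)
    return [g / peak for g in raw]

print(normalize_gains([1.0, 2.0, 10.0]))  # → [1.0, 0.5, 0.1]
```

The nearest channel always lands at unity gain, so the engine never clips regardless of how close a source gets, yet a source twice as far still renders at half the amplitude.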



Delay compensation


Real-world speaker arrays have real-world dimensions. A source positioned near one speaker might be 5 meters from another. ISE calculates these timing differences accurately, but also offers a "tight" mode that subtracts the minimum delay from all channels. This preserves relative timing for spatial impression while minimizing absolute latency for applications where every millisecond matters.
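The "tight" mode described above amounts to removing the common component of the propagation delays. A minimal sketch, under the same assumptions as before (not ISE's actual code):

```python
def speaker_delays(distances, sample_rate=48000, tight=True, c=343.0):
    """Per-channel delays (in samples) from propagation distance.

    In "tight" mode the minimum delay is subtracted from every channel:
    relative timing, and thus the spatial impression, is preserved,
    while absolute latency drops to zero on the nearest speaker.
    """
    delays = [d / c * sample_rate for d in distances]
    if tight:
        base = min(delays)
        delays = [t - base for t in delays]
    return delays

speaker_delays([3.43, 6.86])  # nearest speaker at 0 samples, farther at ~480
```

Interaural and inter-speaker cues come from timing *differences*, so the subtracted constant is inaudible spatially but directly reduces end-to-end latency.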



Massive scalability


128 simultaneous sources. 128 output speakers. These aren't theoretical limits — they're tested, working configurations. Need more? The architecture extends further. ISE handles intimate 4-speaker installations and massive 100+ speaker arrays with the same codebase, the same parameters, the same behavior.



Ultra-low latency


No FFT. No block processing. Just gain calculations and delays. ISE adds only a few samples of processing latency at any sample rate, making it suitable for live performance, interactive installations, and real-time applications where perceptible delay destroys the experience.




The Focus Parameter

One knob, three worlds


Understanding Focus is understanding ISE. This single parameter controls how sound energy distributes across your speaker array when rendering a source at any given position.



Low Focus (Logarithmic curve)


Energy spreads gradually across many speakers. A source at 45 degrees isn't just in the two nearest speakers — it's present across the entire front array with smooth, gentle rolloff. This creates diffuse, enveloping spatial impressions. Ambient sounds feel like they exist in the space rather than emanating from points. Moving sources transition smoothly without discrete jumps. The sound field feels continuous, immersive, physical.


This behavior resembles Wave Field Synthesis without the extreme speaker counts and latency. It's ideal for environmental audio, room tone, atmospheric elements, and any content that should surround rather than localize.



Medium Focus (Linear curve)


Energy concentrates on nearer speakers but maintains presence in neighbors. This delivers natural panning behavior — the kind you expect when you turn a pan knob. Sources localize clearly while maintaining spatial continuity. Movement sounds smooth and natural. The sweet spot is wide enough for multiple listeners.


This is your everyday spatialization mode. Dialogue, music, sound effects, Foley — content that needs clear positioning without aggressive localization. It's the behavior most similar to traditional VBAP, but with smoother transitions and more forgiving off-axis response.



High Focus (Exponential curve)


Energy concentrates sharply on the nearest speakers. A source at 45 degrees lives almost entirely in the single closest speaker, with minimal spillover to neighbors. This produces precise, point-source localization. Sources feel like they're actually at the speaker positions.

Use High Focus when you need pinpoint accuracy: spotlit performers, precisely located sound effects, objects that must track visuals exactly. The tradeoff is a smaller sweet spot and more obvious speaker discretization — but when precision matters, nothing else delivers.



Specifications


  • Input: Mono per source (up to 128 sources)
  • Output: Up to 128 speakers (extensible)
  • Latency: A few samples at the system sample rate (no FFT)
  • Sample rates: 44.1 / 48 / 96 / 192 kHz
  • Bit depth: 16-bit, 24-bit, 32-bit float
  • Coordinate system: Cartesian (x, y, z) or spherical (azimuth, elevation, distance)
  • Processing: Gain law + delays (parametric)
  • Speed of sound: Adjustable (default 343 m/s)
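The two coordinate systems describe the same position. A common conversion between them, under an assumed convention (azimuth 0° straight ahead on +y, positive azimuth to the right, elevation positive upward; ISE's own convention may differ):

```python
import math

def spherical_to_cartesian(azimuth_deg, elevation_deg, distance):
    """Convert spherical (azimuth, elevation, distance) to Cartesian (x, y, z).

    Convention assumed for this sketch: azimuth 0 deg is straight ahead
    (+y), positive azimuth to the right (+x), elevation positive up (+z).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.cos(el) * math.cos(az)
    z = distance * math.sin(el)
    return x, y, z

spherical_to_cartesian(90.0, 0.0, 2.0)  # 2 m directly to the right
```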


Per-source parameters


  • Position: x, y, z coordinates or azimuth, elevation, distance
  • Gain: Individual source level
  • Focus: Spatial distribution curve (logarithmic → linear → exponential)
  • Delay mode: Absolute timing or tight (minimum-subtracted)



Applications


Live Sound & Theatre


Theatre sound design has entered the immersive era. Audiences expect sound to move through space, to surround them, to participate in the storytelling. But theatrical productions face constraints that studio-based immersive formats ignore: different venues every night, labor costs that limit setup time, repertory schedules that demand quick changeovers between productions.


ISE addresses these realities. A sound designer creates their spatial score once, using ISE's straightforward position and Focus parameters. The design travels with the production, adapting to each venue's speaker configuration automatically. A thrust stage with wrap-around speakers becomes a proscenium with a frontal array — same ISE session, different speaker mapping, coherent spatial result.


The Focus parameter proves particularly valuable for theatrical applications. Environmental sounds — rain, wind, forest ambience — get low Focus settings, surrounding the audience in atmosphere. Dialogue and sound effects get medium Focus, localizing clearly without aggressive directionality. Spot effects that must precisely track performers get high Focus for pinpoint accuracy.


Real-time operation means automation follows cues without rendering delays. A sound cue fired from the show control system spatializes instantly. Moving sources track physical staging in real-time. Interactive elements respond to performer positions without perceptible latency.


For touring productions, ISE's single-codebase approach eliminates the "which venue has which system" problem. No need for separate VBAP, Ambisonic, and matrix versions of your spatial design. ISE adapts. 

Immersive Installations


Museums, galleries, theme parks, exhibitions — immersive installations have become expected rather than exceptional. Visitors experience spatial audio at major attractions worldwide, raising expectations for every subsequent encounter. But installation audio presents unique challenges: content that runs continuously for months or years, hardware that varies wildly between venues, maintenance by staff who aren't audio specialists.


ISE's simplicity makes it installation-friendly. Position your sources. Set your Focus values. Done. There's no arcane parameter tweaking, no mysterious coefficient adjustments, no behaviors that change unexpectedly with content. Curators and exhibit designers can understand and modify spatial behavior without calling in audio specialists.


The scalability handles installation realities. A temporary exhibition in a small gallery might have 8 speakers. A permanent installation in a major museum might have 64. The same ISE project serves both, with only the speaker mapping changed. Content created for one venue deploys to another without re-authoring.


For interactive installations, ISE's low latency enables real-time response to visitor position, gesture, or input. A visitor's tracked position controls source locations in real-time. Gesture recognition triggers spatial events instantly. The installation feels responsive because it actually responds — no buffering, no prediction, no latency hiding.


Spatial Music Production


Immersive music formats have moved from novelty to expectation. Apple Music delivers Spatial Audio to hundreds of millions of listeners. Dolby Atmos Music is a checkbox on every major release. But many producers find immersive mixing intimidating — too many parameters, unfamiliar tools, behaviors that don't match their musical intuitions.


ISE offers an alternative approach. Position your elements in space using coordinates or angles — the same conceptual framework as panning in stereo, extended to three dimensions. Adjust Focus to control whether elements localize precisely or spread across the sound field. No objects versus beds debate. No channel-count anxiety. No renderer compatibility concerns.


The Focus parameter maps to musical intent. A lead vocal needs to localize clearly — medium to high Focus. A synth pad should envelop the listener — low Focus. A drum kit might have the kick and snare at high Focus for punch while overheads spread at low Focus for room. These are musical decisions expressed through a single parameter.


For electronic music production, ISE integrates with any DAW that outputs to a multichannel interface. Automate positions and Focus in your DAW's timeline. Render to whatever speaker count your delivery format requires. The same creative intent translates to 5.1, 7.1.4, or custom speaker arrays.


VR/AR & Gaming


Interactive media demands real-time spatial audio that responds to user input without perceptible delay. A game character moves, and their audio moves with them. A VR user turns their head, and the sound field rotates appropriately. Any latency destroys the illusion; any discontinuity breaks immersion.


ISE's few-sample latency makes it suitable for the most demanding interactive applications. Position updates from the game engine translate to spatial changes within microseconds. The gap between visual and audio spatial information stays imperceptibly small.
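An integration at that latency can be as simple as pushing world positions into the spatializer once per frame. The sketch below is hypothetical: `IseRenderer` and `set_position` are illustrative names, not ISE's published API.

```python
# Hypothetical integration sketch; class and method names are
# illustrative, not ISE's actual API.
class IseRenderer:
    def __init__(self):
        self.positions = {}

    def set_position(self, source_id, x, y, z):
        # A real engine would recompute gains and delays before the
        # next audio callback; here we just record the position.
        self.positions[source_id] = (x, y, z)

renderer = IseRenderer()

def on_game_tick(entities):
    # Called every frame: push each entity's world position straight
    # into the spatializer. At few-sample latency there is no need
    # for interpolation or motion prediction on the audio side.
    for eid, (x, y, z) in entities.items():
        renderer.set_position(eid, x, y, z)

on_game_tick({"footsteps": (1.0, 2.0, 0.0)})
```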


The Focus parameter serves game audio design well. Ambient environment sounds get low Focus, filling the virtual space with atmosphere that doesn't localize to obvious point sources. Character dialogue and important game events get medium Focus, directing player attention without aggressive directionality. UI sounds and critical alerts might get high Focus for unmistakable positioning.


For VR applications, ISE's physics-based foundation ensures that spatial rendering matches visual expectations. A sound at 3 meters looks like it's at 3 meters because ISE's distance modeling follows real acoustic behavior. The inverse square law that determines visual brightness also determines audio loudness. The brain accepts the illusion because the physics are consistent.

OEM licensing


ISE is sold as a one-time payment for one brand. The licence gives you access to the ISE source code and DSP code, to use as you like.


If you plan to sell parts of the ISE code in B2B solutions, you will need one licence per brand you sell to. For this type of use we offer a discount on multi-licence purchases.


If you want to test ISE, please contact us and we will send you a demo.





Contact us | Get an appointment