Overview
SlamPunk is a fast-paced competitive couch-sports game combining parkour, tag, and basketball in a futuristic arena environment. The primary challenge in the role of Lead Audio Systems Designer and Music Producer was building an audio system that functioned as a reactive participant in the match rather than a static background layer.
The core problem with most competitive game audio is that static loops cause auditory fatigue, and the music frequently masks the gameplay cues players need to make decisions. The solution was a modular 15-stem dynamic music engine that scales arrangement complexity in real time based on live match variables, paired with a submix hierarchy that surgically clears frequency space for competitive feedback events.
This document covers the full production process across all three phases (preproduction, full production, and postproduction), totalling approximately 600 hours of work and resulting in a 28-page Audio Design Document submitted alongside the project.
Design Pillars
Two core design pillars governed every decision made throughout the project. They were not abstract goals; they were active filters applied to every technical and creative choice.
The audio must drive persistent forward momentum. The soundtrack is engineered to increase in intensity as the match progresses, evolving alongside the kinetic movement of the gameplay. Each standard 16-bar loop wraps seamlessly back to its own beginning, then hands over to a more built-up version as the match escalates, so the loops read as one continuous, rising journey rather than a repeated block.
Every player action must have immediate auditory consequence. Movement, scoring events, and critical interactions must cut through the music instantly. To ensure this, scoring mechanics and movement cues aggressively duck competing musical and ambient frequencies, ensuring every hit feels deliberate and impactful.
These pillars created a productive constraint: the music needed to be as energetic as possible, while simultaneously being engineered to get out of its own way the moment the gameplay demanded it.
Aesthetic Pivot
Production started with a dark, industrial Cyberpunk aesthetic in REAPER: five prototype tracks built around aggressive wavetable synthesis and deep sub-frequency kicks. Early playtests flagged the tone as too oppressive for a couch-sports party environment. The industrial direction contradicted both the game's visual identity and the team's brief.
Two decisions were made simultaneously. First, production migrated from REAPER to FL Studio; the piano roll and sequencing workflow in FL Studio allowed for faster iteration on melodic material, which was now the priority. Second, the aesthetic pivoted to a Synth-Funk and Electro-Swing hybrid based on client feedback that the game should feel "fun and silly" while maintaining competitive intensity.
The reference points that shaped the final direction were: Knockout City and SpeedRunners for the tightly syncopated drum breaks and brass stabs that create a funk aesthetic; Ghostrunner for droning synth textures in the outdoor level; and Paradise Killer for the vaporwave-inspired airy textures used in the City in the Sky level.
The final direction was a Synth-Funk / Electro-Swing hybrid with a locked 140 BPM grid across the entire soundtrack. Every arena theme shares the same underlying rhythmic pulse, making transitions between gameplay states feel seamless and intentional.
Composition & Synthesis
The 140 BPM Foundation
Locking the entire project to a rigid 140 BPM grid was a deliberate engineering decision as much as a musical one. All subsequent MetaSound implementation depends on stems being sample-accurate at loop boundaries; any tempo inconsistency would cause rhythmic drift between layers during intensity transitions. Establishing this grid in week four and never deviating from it eliminated an entire category of integration problems later in production.
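The sample-accuracy requirement can be checked with simple arithmetic. The sketch below is illustrative only: it assumes 4/4 time and compares two common sample rates, neither of which is stated in this document. It computes the exact length of a 16-bar loop at 140 BPM and reports whether that length lands on a whole sample.

```python
from fractions import Fraction

def loop_length_samples(bpm, bars, sample_rate, beats_per_bar=4):
    """Exact loop length in samples, kept as a Fraction to avoid float rounding."""
    beats = bars * beats_per_bar          # assumes 4/4 unless overridden
    seconds = Fraction(beats * 60, bpm)   # 60 / bpm seconds per beat
    return seconds * sample_rate

# The two sample rates are assumptions chosen for comparison.
for rate in (44_100, 48_000):
    length = loop_length_samples(bpm=140, bars=16, sample_rate=rate)
    is_whole = length.denominator == 1
    print(f"{rate} Hz: {float(length):.2f} samples, whole number: {is_whole}")
```

Under these assumptions the loop lasts 192/7 seconds (roughly 27.43 s). That is exactly 1,209,600 samples at 44.1 kHz, but not a whole number of samples at 48 kHz, so the render rate is worth fixing alongside the tempo.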
The rhythmic heartbeat of each track was built from CYMATICS percussion samples programmed into syncopated drum breaks. These patterns were engineered to be sharp and punchy to sit clearly above the melodic layers without cluttering the mid-range where the gameplay SFX live.
VITAL for synthesis
VITAL served as the primary synthesiser for bass, arpeggios, atmospheric drones, and lead textures. Its strength for this project was procedural filter automation; LFO modulation applied directly within the instrument engine kept digital textures shifting continuously, preventing the 16-bar loops from sounding static across extended play sessions. For Intensity 3 specifically, macro controls were pushed into grit and saturation within the synth itself rather than relying on downstream processing, giving the lead sounds enough character to cut through a fully dense mix.
Crucially, VITAL patches were edited between intensity levels. For the transition from Int1 to Int2, LFO rates were increased and filter cutoffs opened, making the same patches feel physically faster and more urgent without changing the underlying notes. This gave the escalation a procedural quality rather than simply adding more instruments.
FLEX for organic grounding
FLEX was used to integrate real-world instrument samples into arrangements that would otherwise feel entirely synthetic. Sharp brass stabs and trumpet leads reinforced the funk aesthetic in the Runners Theme; deep sampled basslines anchored the harmonic foundation of specific arena tracks. The result was a hybrid sound that maintained musical richness without losing the synthetic, mechanical drive the game required.
Arrangement per intensity level
15-Stem Architecture & Export
The 15-stem architecture is the structural backbone of the entire audio system. Each arena track is split into five instrument groups (Bass, Data, Harmony, Rhythm, Topline) across three intensity levels, producing 15 individually mixed stems per theme. These stems are what MetaSound interpolates between in real time during gameplay.
Mixing process
Each intensity level was mixed as its own session within FL Studio, with constant reference back to the master prototype loop to ensure balance was maintained across the full progression. The key constraint was surgical frequency slotting: every instrument group needed to occupy a defined spectral space so that when all 15 stems eventually played together at Intensity 3, the mix remained coherent without masking. The Data Bus, for example, was high-passed at 200Hz in Int1 to preserve low-end headroom for bass and kick transients.
Export pipeline
Stems were batch-exported using FL Studio's Split Mixer Tracks function, processed sequentially (all five Int1 groups first, then Int2, then Int3) to maintain control over the mix progression at each stage. The export used Wrap Remainder tail configuration to ensure reverb and delay tails looped seamlessly within MetaSounds; 512-point sinc resampling guaranteed that procedural LFOs and complex synthesis remained artifact-free once integrated into Unreal Engine.
The sequential export order (Int1 complete before Int2 begins) was deliberate; each intensity was balanced against the previous one rather than in isolation. This ensured the escalation felt proportional rather than producing jarring volume jumps at transition points.
Strict naming convention
Every stem followed an identical naming pattern: Theme_Int[n]_BUS_[GROUP].wav. This wasn't aesthetic; MetaSound construction depends on being able to identify instrument groups and intensity levels at a glance across the 15 files per theme. A single naming inconsistency would break the Wave Player mapping and cause silent stems in-game.
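A naming rule like this is easy to verify automatically before import. The following is a minimal sketch, not part of the original pipeline: it assumes that "Theme" is a placeholder for the theme name and that group names are written in upper case, and the helper names are hypothetical.

```python
import re

GROUPS = ("BASS", "DATA", "HARMONY", "RHYTHM", "TOPLINE")  # upper case is an assumption
INTENSITIES = (1, 2, 3)
STEM_PATTERN = re.compile(
    r"^(?P<theme>[A-Za-z0-9]+)_Int(?P<intensity>[1-3])_BUS_(?P<group>[A-Z]+)\.wav$"
)

def parse_stem(filename):
    """Return (theme, intensity, group), or None if the name breaks the convention."""
    match = STEM_PATTERN.match(filename)
    if match is None or match["group"] not in GROUPS:
        return None
    return match["theme"], int(match["intensity"]), match["group"]

def expected_stems(theme):
    """The 15 filenames one theme should export: 5 groups x 3 intensity levels."""
    return {f"{theme}_Int{n}_BUS_{g}.wav" for n in INTENSITIES for g in GROUPS}

def check_theme(theme, filenames):
    """Return (missing, unexpected) filename sets for one theme's export folder."""
    found = set(filenames)
    expected = expected_stems(theme)
    return expected - found, found - expected
```

Running check_theme against an export folder listing would flag a misnamed stem as both one missing and one unexpected file, before it could surface as a silent layer in-game.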
Sound Effects Pipeline
A hybrid approach was taken for the SFX library: 50% sourced from Freesound and Pixabay, 50% self-produced through foley recording and synthesis. The sourced assets provided high-fidelity foundations; the self-produced recordings added a physical texture that purely synthetic assets lack.
Tag.wav: the hybrid foley process
The Tag mechanic is one of the highest-priority competitive cues in the game; it needed to be unmissable at any music intensity. The asset was built by layering a sourced explosion kick (sub-frequency foundation) with a self-recorded clap (high-frequency snap).
The layering process in REAPER was built around precise transient alignment rather than fade-based blending. Both signals were aligned at the sample level so the sub-frequency energy and the high-frequency snap reached the listener's ear at the exact same millisecond. This produced a cohesive single hit rather than two separate sounds playing simultaneously.
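The alignment itself was done by hand in REAPER; the sketch below only illustrates the idea in code. It assumes a simple threshold-based onset detector and plain lists of samples, and the function names and threshold value are hypothetical.

```python
def onset_index(samples, threshold=0.1):
    """Index of the first sample whose magnitude reaches the threshold."""
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            return i
    raise ValueError("no sample reaches the threshold")

def align_and_sum(kick, clap, threshold=0.1):
    """Delay whichever layer starts earlier so both onsets share one sample, then mix."""
    shift = onset_index(kick, threshold) - onset_index(clap, threshold)
    if shift >= 0:
        clap = [0.0] * shift + list(clap)     # clap hits first: pad its start
        kick = list(kick)
    else:
        kick = [0.0] * (-shift) + list(kick)  # kick hits first: pad its start
        clap = list(clap)
    length = max(len(kick), len(clap))
    kick += [0.0] * (length - len(kick))      # pad tails to a common length
    clap += [0.0] * (length - len(clap))
    return [k + c for k, c in zip(kick, clap)]

# Toy signals: the kick's transient is at index 3, the clap's at index 1.
mixed = align_and_sum([0.0, 0.0, 0.0, 1.0, 0.5], [0.0, 0.8, 0.2])
```

In the toy example both transients land on index 3 of the mixed signal, which is the property the manual alignment was aiming for.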
Processing on the clap: High Pass filter at 300Hz (2 oct) to remove boxy low-end rumble; High Shelf at 4000Hz (2 oct) to accentuate snap; transient controller with attack at 30% and sustain at -19% to keep the tail tight. Processing on the kick: Low Pass filter at 150Hz (2 oct) to isolate pure sub-frequency energy. Both signals routed to a BUS TAG with a 4dB boost at 200Hz for body, ReaVerbate (wet at -8dB to preserve transient clarity), and ReaComp at -19dB threshold, 5:1 ratio, 30ms release to glue the two elements into a single punchy hit.
Procedural environmental sound
Static loops for environmental assets were replaced with procedural MetaSound graphs to prevent auditory fatigue across extended matches. The wind texture used two LFO nodes (Sine waves at 0.03 and 0.05 Hz) to modulate pitch shift and volume, creating a naturally shifting texture that felt reactive rather than looping. Crowd emitters used a custom Blueprint (BP_Crowd_Emitter) with randomised start time (0.0–28.0s), randomised delay (0.0–4.0s), and pitch multiplier variation (0.92–1.08) on each instance to prevent all crowd actors from synchronising on level load.
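The values above can be sketched outside the engine to show why the result does not read as a loop. This is an illustration rather than the MetaSound graph or Blueprint itself: which LFO rate drives pitch and which drives volume is an assumption, as are the function names.

```python
import math
import random

def wind_modulation(t, pitch_rate_hz=0.03, volume_rate_hz=0.05):
    """Return (pitch_lfo, volume_lfo), each in [-1, 1], at time t in seconds.

    The assignment of 0.03 Hz to pitch and 0.05 Hz to volume is assumed.
    """
    pitch = math.sin(2 * math.pi * pitch_rate_hz * t)
    volume = math.sin(2 * math.pi * volume_rate_hz * t)
    return pitch, volume

def crowd_emitter_settings(rng):
    """Per-instance randomisation using the ranges quoted for BP_Crowd_Emitter."""
    return {
        "start_time_s": rng.uniform(0.0, 28.0),
        "delay_s": rng.uniform(0.0, 4.0),
        "pitch_multiplier": rng.uniform(0.92, 1.08),
    }

rng = random.Random()
emitters = [crowd_emitter_settings(rng) for _ in range(4)]
```

The two sine waves have periods of about 33.3 s and 20 s, so the combined modulation only repeats every 100 seconds (three cycles of one against five of the other), which is far longer than either LFO alone.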
MetaSound Integration
The transition from static Spawn Sound 2D loops to MetaSound graphs in week 11 was the most significant technical shift in the project. The static approach introduced auditory fatigue and inconsistency between Unreal's internal mixing and the FL Studio reference; MetaSounds provided the DSP environment needed to implement precise real-time mixing and mastering.
Wave Player mapping
All 15 stems were imported into the Unreal Content Browser, organised by theme and intensity level using the strict naming convention established at export. Inside the MetaSound graph, each stem was mapped to an individual Wave Player node, categorised by instrument group and intensity. This manual mapping established the organisational foundation before any crossfade logic was added; every stem was verified as correctly assigned and looping cleanly before the interpolation layer was built on top.
Solving rhythmic drift
A critical problem with multi-stem playback is timing drift between layers as they loop independently. The solution was a Trigger Any (2) node that consolidated the initial On Play trigger and the On Finished loop trigger from the master Rhythm (drums) stem. Every other instrument group's start time was tethered to this single rhythmic pulse, ensuring all 15 stems remained locked to the same grid regardless of how many times the sequence looped or how many intensity transitions occurred.
Without sample-accurate sync, even a few milliseconds of drift per loop cycle accumulates into audible phase misalignment within minutes of gameplay. The Trigger Any approach eliminated this entirely by using the drum stem as the single authoritative clock source for all other layers.
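The accumulation can be shown with a toy model. The sketch below is not the MetaSound graph; it assumes a 44.1 kHz render, a master stem of exactly 16 bars, and a hypothetical second stem that is 44 samples (about 1 ms) too long.

```python
SAMPLE_RATE = 44_100  # assumed; the document does not state a rate

def start_times(stem_len, loops, master_len=None):
    """Sample index at which each repeat of a stem begins.

    With master_len set, every repeat is started by the master stem finishing;
    otherwise the stem re-triggers itself when its own file ends.
    """
    period = master_len if master_len is not None else stem_len
    return [i * period for i in range(loops)]

master_len = 1_209_600        # 16 bars at 140 BPM in 4/4 at 44.1 kHz
stem_len = master_len + 44    # hypothetical stem, roughly 1 ms too long
loops = 61

master = start_times(master_len, loops)
free_running = start_times(stem_len, loops)
clocked = start_times(stem_len, loops, master_len=master_len)

for n in (1, 10, 60):
    drift_ms = 1000 * (free_running[n] - master[n]) / SAMPLE_RATE
    print(f"loop {n}: free-running drift {drift_ms:.1f} ms, clocked drift {clocked[n] - master[n]} samples")
```

In this model the free-running stem slips by a further 44 samples on every repeat, while the clocked stem starts each cycle on the master's boundary. The trade-off in the clocked case is that an over-long stem is restarted before its file ends, which is one more reason the exported stems need identical lengths.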
Internal DSP chain
Within the graph, stems were routed into five individual Mixer nodes (one per instrument group) before reaching the final output. This provided modular control over every layer independently; the Data arpeggios in Int3 could be boosted for rhythmic drive while the atmospheric Harmony pads from Int1 were pulled back, preserving the frequency slotting established in FL Studio within the game engine itself. The final output passed through a master Mixer, Compressor, and Limiter chain to prevent digital clipping at Intensity 3's maximum density while maintaining loudness within professional delivery standards.
Blueprint Logic
Level Music Matrix
To avoid hard-coding music into every arena level, a Map variable called Level Music Matrix was built as a centralised database mapping String level names (such as CityLevel or OutdoorLevel) to their corresponding MetaSound assets. New arenas could be integrated globally by updating a single matrix entry rather than modifying individual level Blueprints; when City in the Sky was added late in production, the music implementation took minutes rather than hours.
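In code terms the matrix is a plain lookup table. The sketch below mirrors the idea in Python rather than Blueprint; the asset names and the SkyLevel key are hypothetical, and only CityLevel and OutdoorLevel come from the project.

```python
# Hypothetical asset names; the real values are MetaSound Source assets in Unreal.
LEVEL_MUSIC_MATRIX = {
    "CityLevel": "MS_CityTheme",
    "OutdoorLevel": "MS_OutdoorTheme",
}

def music_for_level(level_name, matrix=LEVEL_MUSIC_MATRIX, default=None):
    """Return the music asset registered for a level, or a default if none is mapped."""
    return matrix.get(level_name, default)

# Adding an arena is a single new entry; no per-level logic changes.
LEVEL_MUSIC_MATRIX["SkyLevel"] = "MS_SkyTheme"  # hypothetical key for City in the Sky
```

Returning a default for unmapped levels is a design choice in this sketch; it keeps a missing entry from raising an error, at the cost of making the omission quieter.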
Initialisation logic
A Custom Event triggered at match start queries the matrix using Get Current Level Name, then implements a 5.5-second delay to account for level loading and the pre-match countdown before triggering the MetaSound via Spawn Sound 2D. The return value of the Spawn Sound 2D node was promoted to an Active Music variable; this persistent reference is what allows the Blueprint to send real-time parameters to the MetaSound graph as match variables change, without requiring inefficient re-initialisation of the audio assets.
Score-driven intensity updates
A real-time update loop fires at the conclusion of every round. The logic casts to the Scoring GameState, retrieves both player scores, and uses a MAX node to identify the highest current score. A mathematical formula converts this integer into the required intensity level (1, 2, or 3), which is passed to the Set Integer Parameter node targeting the Intensity input on the Active Music reference. An Execute Trigger Parameter titled Update then fires, forcing the MetaSound to re-evaluate its internal crossfades and transition to the correct intensity layer.
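The exact conversion formula is not reproduced in this document, so the following is only one plausible shape for it. It assumes intensity rises by one level for every fixed number of points on the leading score; the points_per_step value of 3 is hypothetical.

```python
def intensity_for_scores(score_a, score_b, points_per_step=3):
    """Map the leading score to an intensity level of 1, 2, or 3.

    points_per_step is a hypothetical tuning value, not the project's figure.
    """
    top_score = max(score_a, score_b)         # mirrors the MAX node
    level = 1 + top_score // points_per_step  # integer division gives the step
    return max(1, min(3, level))              # clamp to the three available layers

for scores in ((0, 0), (2, 3), (1, 7)):
    print(scores, "->", intensity_for_scores(*scores))
```

The clamp matters more than the exact thresholds: whatever the formula, the value sent to the Intensity input has to stay within the range the MetaSound graph has stems for.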
Intensity interpolation
To prevent jarring volume snaps between intensity layers, InterpTo nodes within the MetaSound graph convert the incoming intensity integer into a floating-point value. This allows stem gains to crossfade smoothly over a 2-second interpolation window rather than snapping; melodic and percussive layers swell into place, which maintains immersion during transitions and prevents the music from distracting the player at critical scoring moments.
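The effect of the interpolation can be sketched as two small functions. This is an approximation of the behaviour, not the MetaSound node graph: it assumes a linear ramp and a linear crossfade in which each layer is full at its own integer value and silent one step away.

```python
def ramp(start, target, duration_s, elapsed_s):
    """Linear ramp from start to target, held at target once duration_s has elapsed."""
    if duration_s <= 0:
        return float(target)
    progress = min(max(elapsed_s / duration_s, 0.0), 1.0)
    return start + (target - start) * progress

def layer_gains(intensity):
    """Assumed linear crossfade between the three intensity layers."""
    return {level: max(0.0, 1.0 - abs(intensity - level)) for level in (1, 2, 3)}

# Halfway through a 2-second move from intensity 1 to intensity 2:
value = ramp(start=1, target=2, duration_s=2.0, elapsed_s=1.0)
gains = layer_gains(value)
```

At the halfway point the floating-point intensity is 1.5 and the first two layers each sit at half gain. An equal-power curve would be a reasonable alternative to the linear one assumed here if the midpoint sounded too quiet.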
Character-specific SFX logic
Jump vocalisations used a logic gate within the player Blueprint to check the Player State Index via a Get Player State node and Branch. Based on the character's identity index, the system fires a Play Sound 2D node for either Female_Jump or Male_Jump; auditory feedback is consistent with the visual character model, reinforcing the personalised competitive feel for both players.
Reactive Submix Hierarchy
The flat audio output approach used in early production was impossible to manage once the 15-stem architecture was running; the simultaneous stems caused immediate frequency masking. The solution was a three-branch submix hierarchy: Submix_Music, Submix_SFX, and Submix_Environment. Every audio asset in the project was manually assigned to one of these three branches.
Priority system
The hierarchy established a clear auditory priority: SFX at the top as the master control signal, Music as the medium-priority target for dynamic ducking, Environment as the low-priority target for both ducking and EQ treatment. This meant the system was not just balancing volumes; it was dynamically clearing frequency space for the most important information in real time.
FX_MusicDucking
FX_MusicDucking is a compressor on Submix_Music, keyed to Submix_SFX as the sidechain source. Configuration: threshold -30.0 dB, ratio 4.0, attack 1.0ms, release 200.0ms. The 1ms attack ensures an immediate response when a high-priority SFX fires; the 200ms release allows the arrangement to swell back naturally once the trigger clears. The result is a firm but musical dip rather than a heavy-handed cut.
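The depth of the dip follows from the threshold and ratio. The sketch below uses the textbook hard-knee compressor curve, which is an assumption about how the engine's dynamics processor behaves; attack and release smoothing are ignored, and the -10 dB key level is a hypothetical SFX loudness.

```python
def gain_reduction_db(key_level_db, threshold_db, ratio):
    """Static hard-knee curve: dB of attenuation for a given sidechain level."""
    over = key_level_db - threshold_db
    if over <= 0:
        return 0.0                       # key below threshold: no ducking
    return over * (1.0 - 1.0 / ratio)    # the part of the overshoot that is removed

# Music ducking settings from the text; the key level is a hypothetical example.
reduction = gain_reduction_db(key_level_db=-10.0, threshold_db=-30.0, ratio=4.0)
print(f"Music is pulled down by {reduction:.1f} dB while the SFX plays")
```

With an SFX peaking 20 dB above the threshold, a 4:1 ratio lets only 5 dB of that overshoot through, so the music drops by 15 dB in this model. Quieter SFX produce a proportionally shallower dip.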
FX_EnvDucking and EQ_Environment
The environment submix used a more aggressive ducking configuration (ratio 6.0, threshold -25.0 dB, release 80ms) since atmospheric sounds like crowd noise and fireworks are lower priority than music and should clear faster. Both FX_EnvDucking and EQ_Environment were stacked on the environment submix chain, with the ducking compressor first in the index to prioritise immediate suppression of ambient noise during scoring events, followed by the EQ to roll off low-mid frequencies.
EQ_Environment applied a single band centred at 120.0 Hz, with a bandwidth of 2.0 octaves and a gain of -8.0 dB. This pulled down the low-mid frequencies that caused muddiness in the final mix when environmental attenuation interacted with the music's bass content, seating the atmosphere under the soundtrack without removing the sense of physical presence.
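The region this band affects can be estimated with two short calculations. The sketch assumes the bandwidth is spread symmetrically, in octaves, around the band frequency; the exact filter shape inside the engine is not modelled.

```python
def band_edges(centre_hz, bandwidth_octaves):
    """Lower and upper edge of a band spread symmetrically around its centre."""
    half = bandwidth_octaves / 2.0
    return centre_hz * 2.0 ** -half, centre_hz * 2.0 ** half

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)

low_hz, high_hz = band_edges(120.0, 2.0)
amplitude = db_to_linear(-8.0)
print(f"Band spans {low_hz:.0f} Hz to {high_hz:.0f} Hz; centre amplitude x{amplitude:.2f}")
```

Under that assumption the cut reaches from 60 Hz to 240 Hz, and -8 dB corresponds to roughly 40% of the original amplitude at the centre of the band.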
Sound Class infrastructure
A Master Sound Class Mix with dedicated classes for Music, SFX, and Environment provided global gain staging across the entire project. Beyond enabling independent volume sliders in the game settings menu, this was a technical safeguard: manually calibrated master output multipliers for each category prevented digital clipping during Intensity 3 climaxes where the full 15-stem arrangement and scoring SFX peaked simultaneously.
Outcomes
The most significant outcome of the project is the relationship between the three layers of the system. The 15-stem architecture in FL Studio, the MetaSound interpolation logic in Unreal, and the submix hierarchy are not independent components; each one was designed with the constraints of the other two in mind. Frequency slotting decisions made during the FL Studio mixing phase determined what was possible in the submix EQ stage. The strict naming convention established at export determined how efficiently the MetaSound graph could be built. The Active Music variable referencing approach in Blueprint determined how cleanly intensity updates could fire without re-initialising assets.
The system demonstrates that dynamic audio for competitive games is fundamentally an architecture problem before it is a creative one. The music only works because the pipeline it runs through was engineered to the same standard as the music itself.