Fighting games typically use motion inputs and button combinations to add options to a character’s kit that couldn’t otherwise be added without either more buttons or many overlapping inputs. I believe these input systems also have applications outside of their genre. For example, I appreciate how Symphony of the Night uses them to expand your kit without adding a menu to switch out skills or a UI element that indicates the currently selected skill. This type of input system gives near-instant access to many different states/moves/abilities whenever you want, provided you can remember the input for each one. If you’re making a game where you can’t get away with just using all the buttons and maybe a simple direction input system (a major way Devil May Cry adds options), then I recommend at least looking into this type of input system and experimenting to see if it improves the experience. Here I will try to outline the ideas that go into making one of these systems.

The Problem

The implementation for this kind of input system (which I’m going to refer to as a “motion input” system from now on) is a bit hard to wrap your head around if the only thing you’ve been exposed to in the past is the more immediate input-fetching system that a majority of games use. Something like this would be the standard way of checking for an input:

if Input.was_pressed("Jump") {
    player.jump()
}

In a motion input system this way of immediately fetching the inputs when you need them doesn’t work, because you need access to not just the current inputs, but the inputs over some slice of time.

The question this creates is: how do you test for these more complex motion inputs done over time while retaining a nice-looking API for checking if an input was received?

A Technically-Functional and Terrible Solution

First, if you want to keep track of inputs over time, you have two options: store all the inputs over some amount of time, or consume the present inputs as they arrive and keep track of the next thing you want to see from the player, emitting some sort of indicator once the whole input has been done correctly. I haven’t quite figured out how to cleanly do the “consuming the present inputs” method yet, so maybe I’ll write another post about that if I ever figure it out. Here I will be talking about matching motion inputs against a stored buffer of past inputs.
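To make the “stored buffer” idea concrete, here is a minimal sketch of one in Python. Everything in it is an assumption for illustration: the `InputBuffer` name, the choice to keep the newest frame at index 0, and the use of plain strings for inputs are mine, not part of any particular engine.

```python
from collections import deque


class InputBuffer:
    """Fixed-size history of per-frame input snapshots, newest first."""

    def __init__(self, capacity=60):
        # A deque with maxlen drops the oldest frame automatically when full.
        self.frames = deque(maxlen=capacity)

    def push(self, pressed):
        # Store a snapshot of this frame's inputs at the front (index 0).
        self.frames.appendleft(frozenset(pressed))

    def get(self, frames_ago):
        # get(0) is the newest frame, get(1) the one before it, and so on.
        return self.frames[frames_ago]

    def __len__(self):
        return len(self.frames)


# The game loop would call push() once per poll; here we fake four frames.
buffer = InputBuffer(capacity=3)
for pressed in [{"DOWN"}, {"DOWN", "RIGHT"}, {"RIGHT"}, {"RIGHT", "A"}]:
    buffer.push(pressed)
```

With a capacity of 3, the first pushed frame has already fallen out of the history by the end of the loop, which is the behaviour you want from a buffer that only cares about a recent slice of time.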

When you store inputs in a buffer you can very easily match motion inputs by writing out the inputs you want for a motion, and then checking them for equality against a slice of your input buffer. That looks something like this:

fn input_hadouken(inputs) -> bool {
    hadouken = [DOWN, DOWN_RIGHT, RIGHT, BUTTON_A]

    // true only if the most recent inputs are exactly the motion, frame for frame
    return inputs.get_last(hadouken.len()) == hadouken
}

This technically solves the problem, but it’s bad and it will make your game feel bad to the people playing it. (In fact, I wrote this out of frustration with the lack of explanations of the real solution to motion input parsing!) The equality check only passes when every part of the motion lands on its own consecutive frame, and you should never expect consistent frame-perfect motion inputs from your players. They are not superhuman. So let’s try fixing this, and see if it’s possible to accept motion inputs with much greater variation in timing.
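Here is a small runnable Python sketch of why the exact-equality approach falls over. It assumes a simplified buffer with one symbol per frame, ordered oldest to newest; the names and the numpad-style strings are made up for the example.

```python
DOWN, DOWN_RIGHT, RIGHT, BUTTON_A = "2", "3", "6", "A"
HADOUKEN = [DOWN, DOWN_RIGHT, RIGHT, BUTTON_A]


def input_hadouken_exact(inputs):
    # inputs holds one symbol per frame, ordered oldest to newest.
    # Compare the most recent frames against the motion, frame for frame.
    return inputs[-len(HADOUKEN):] == HADOUKEN


# Every part of the motion lands on its own consecutive frame.
frame_perfect = [DOWN, DOWN_RIGHT, RIGHT, BUTTON_A]

# The same motion, but down-right was held for one extra frame (about 16ms).
held_one_extra_frame = [DOWN, DOWN_RIGHT, DOWN_RIGHT, RIGHT, BUTTON_A]

print(input_hadouken_exact(frame_perfect))         # True
print(input_hadouken_exact(held_one_extra_frame))  # False
```

A single extra frame on one direction is enough to reject the input, and no human player can avoid that reliably.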

A Mostly-Correct Solution

If we want to allow for variations in timing and for other inputs to occur while a motion input is being done at the same time (think doing 236P236P as a way to cancel from fireball into super, using the first 236 as the first half of the super input) then we need to define how much time is allowed between each input for it to still register as a motion input. To do this you need to know the timing of each stored input. You can either explicitly store the time (not recommended) or poll for inputs at a consistent rate and implicitly reflect that rate in your code (this is my recommended method).

For this next example let’s assume that we are polling for inputs every single frame of the game, which is running at 60fps (polling at 60fps is also a bad idea in an actual implementation as it will be a slow enough poll rate to make inputs feel bad, but it’s the easiest way to demonstrate this system for now). If we want to check for the motion input 236A, with an allowed lenience of 8 frames between each input at most, we could do it like this:

// very similar API to the Input.was_pressed("button") method
// all you need to do is pass it your buffer of inputs!
// assumes inputs.get(0) is the newest frame and that iterating the
// buffer walks backwards in time from there
fn input_hadouken(inputs) -> bool {
    lenience = 8
    // listed newest-first (6, 3, 2) because we walk backwards through the buffer
    motion = [INPUT_RIGHT, INPUT_DOWNRIGHT, INPUT_DOWN]

    // check if the newest frame of input has the A button pressed
    // we dont want any leniency on the last input of the motion because
    // that would cause the input to be received more than once
    if !inputs.get(0).just_pressed(BUTTON_A) {
        return false
    }

    needed_input_idx = 0
    frames_since_last_input_detected = 0
    for input_frame in inputs {
        // get currently needed input
        needed_input = motion.get(needed_input_idx)
        // return true once every part of the motion has been found and
        // there is nothing left to search for
        if needed_input.is_none() {
            return true
        }

        // check to ensure the inputs we want dont have more than 8 frames between them
        if frames_since_last_input_detected > lenience {
            return false
        }

        frames_since_last_input_detected += 1

        if input_frame.just_pressed(needed_input) {
            needed_input_idx += 1 // go to next input
            frames_since_last_input_detected = 0 // reset leniency counter
        }
    }

    // ran out of buffered frames: succeed only if the whole motion was found
    return needed_input_idx == motion.len()
}

This will work well enough to ensure that the input is lenient so that players can input it without needing extreme precision.
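If you want something you can actually run and poke at, here is a rough Python sketch of the same idea. It rests on a few assumptions of mine: the buffer is a list with the newest frame at index 0, each frame is the set of inputs newly pressed on that frame, and inputs are numpad-style strings. The `matches_motion` name and its parameters are hypothetical.

```python
def matches_motion(frames, motion, button, lenience=8):
    """Return True if `motion` followed by `button` was just completed.

    frames[0] is the newest frame; each frame is the set of inputs that
    were newly pressed on that frame (an assumed representation).
    """
    # No leniency on the button: it must land on the newest frame,
    # otherwise the same motion would register on several frames in a row.
    if not frames or button not in frames[0]:
        return False

    # We walk backwards in time, so look for the end of the motion first.
    remaining = list(reversed(motion))
    gap = 0
    for frame in frames:
        if not remaining:
            return True  # every part of the motion has been found
        if gap > lenience:
            return False  # too many frames since the last matched input
        gap += 1
        if remaining[0] in frame:
            remaining.pop(0)  # move on to the next (earlier) input
            gap = 0           # reset the leniency counter
    # Ran out of history: succeed only if the whole motion was found.
    return not remaining


# 236A with a few idle frames between the directions (newest frame first).
sloppy = [{"6", "A"}, set(), {"3"}, set(), set(), {"2"}]
# Same motion, but with a nine-frame pause after the 3.
too_slow = [{"6", "A"}] + [set()] * 9 + [{"3"}, {"2"}]

print(matches_motion(sloppy, ["2", "3", "6"], "A"))    # True
print(matches_motion(too_slow, ["2", "3", "6"], "A"))  # False
```

The caller writes the motion in the order the player performs it, and the function reverses it internally, which keeps the call site readable.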

If we think about this implementation a bit more, it becomes clear that what we want as a solution is something similar to parser combinators. I think parser combinators apply very well to this problem because you can build up a bunch of tiny “primitive” input-checking functions, stack them on top of each other, and easily get more complex input-matching functions. It’s a lot like parsing a simple file format or a piece of text input, except that you adapt the concepts to operate on frames of input instead of characters. You could probably even use regex for this if you wanted; it’s essentially the same problem.
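As a rough illustration of what that could look like, here is a small combinator-style sketch in Python. None of this is an established API: `pressed`, `within`, and `seq` are names I made up, the buffer is assumed to be a list of sets with the newest frame at index 0, and the matching is greedy with no backtracking.

```python
def pressed(name):
    """Primitive: match if `name` was newly pressed on the frame at `pos`."""
    def match(frames, pos):
        if pos < len(frames) and name in frames[pos]:
            return pos  # report where it matched; the next matcher starts here
        return None
    return match


def within(lenience, matcher):
    """Let `matcher` succeed up to `lenience` frames further back in time."""
    def match(frames, pos):
        for skipped in range(lenience + 1):
            result = matcher(frames, pos + skipped)
            if result is not None:
                return result
        return None
    return match


def seq(*matchers):
    """Run matchers in order, each starting where the previous one matched."""
    def match(frames, pos):
        for matcher in matchers:
            pos = matcher(frames, pos)
            if pos is None:
                return None
        return pos
    return match


# 236A, written newest-first because the buffer is read backwards in time.
hadouken = seq(
    pressed("A"),             # no leniency: must be on the newest frame
    within(8, pressed("6")),
    within(8, pressed("3")),
    within(8, pressed("2")),
)

frames = [{"A"}, {"6"}, set(), {"3"}, set(), set(), {"2"}]  # newest first
print(hadouken(frames, 0) is not None)  # True
```

Each piece is only a few lines, and a different motion is just a different `seq(...)` built from the same primitives. Because the matching is greedy, a real version would need some thought about overlapping or repeated inputs.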