Text sniffing by the sound of the pressed buttons

Keyboard eavesdropping by the sound of the pressed buttons

If you start to listen to the sounds of a keyboard when text is typed on it, then even with the “naked ear” you can understand that when you press different keys, the sounds are slightly different. It is especially simple to determine the pressing of the spacebar and the ENTER button.

This raises the question: is it possible to find out which buttons are pressed just by listening to the keyboard? Yes, it is!

A working proof of concept already exists for this: a set of programs called kbd-audio. It is a collection of command-line and GUI tools for capturing and analyzing audio data. The most interesting of them analyze keyboard input from audio captured by a microphone.


The main tool is called keytap. It guesses which keyboard buttons were pressed by analyzing sound captured from the computer's microphone.

The demonstration is shown in the following video:

First you need to train the program: press each button on the keyboard at least 3 times so that the program learns which button makes which sound. Do not type very fast; it is best to type the characters with one finger.


Another interesting tool from the same set is called keytap2. It recovers text from audio. Its distinguishing feature is that it does not need training data: instead, it uses statistical information about the frequency of letters and n-grams in English.

That is, it uses exactly the same approach as cryptanalysis does when breaking a simple substitution cipher with frequency analysis.
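To make the frequency-analysis idea concrete, here is a minimal Python sketch of the classic first step against a simple substitution cipher: rank the symbols by how often they occur and pair them with English letters in typical frequency order. It is an illustration only, not code from keytap2; the ENGLISH_ORDER string and both function names are assumptions made for this example.

```python
from collections import Counter
import string

# English letters ordered roughly from most to least frequent.
# Assumption for illustration: published frequency tables differ slightly.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def frequency_guess(ciphertext):
    """Map each cipher letter to an English letter by frequency rank."""
    letters = [ch for ch in ciphertext.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    # Most frequent first; ties are broken alphabetically to stay deterministic.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    return {sym: ENGLISH_ORDER[i] for i, sym in enumerate(ranked)}


def apply_mapping(ciphertext, mapping):
    """Replace every mapped letter; leave spaces and punctuation untouched."""
    return "".join(mapping.get(ch, ch) for ch in ciphertext.lower())


if __name__ == "__main__":
    secret = "xxx yy z"
    guess = frequency_guess(secret)
    print(guess)
    print(apply_mapping(secret, guess))
```

On short texts a first guess like this is usually far from correct, so it is normally refined with n-gram statistics rather than used on its own.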

A video demonstrating the recovery of text from a recording of typing (look at the lower right rectangle, where the text gradually becomes more meaningful):

The video demonstrates the use of the Keytap2 tool, which is still under development. It shows the process involved in recovering an unknown text purely from a microphone recording of a person pressing the keys.

Brief description of the involved steps:

  1. Detect the positions of the key strokes in the waveform (red lines)
  2. Calculate the key stroke similarity matrix
  3. Apply a combined clustering + substitution cipher attack algorithm
  4. Look for decoded words or patterns and "bind" them to assist the algorithm
  5. Manually identify the "Space" key strokes as they are easily distinguished from the other keys
  6. Repeat steps 3 - 5
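The first two steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not keytap2's implementation: keystrokes are detected as windows that are much louder than the recording's average, and similarity is a plain normalized correlation at zero lag. The function names and the window, factor and length parameters are hypothetical.

```python
import math


def detect_keystrokes(samples, window=4, factor=3.0):
    """Return start indices of loud windows (candidate keystrokes).

    A window is 'loud' when its mean absolute amplitude exceeds
    `factor` times the mean absolute amplitude of the whole recording.
    Consecutive loud windows are reported once, as a single keystroke.
    """
    avg = sum(abs(s) for s in samples) / len(samples)
    positions = []
    in_stroke = False
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        loud = sum(abs(s) for s in chunk) / window > factor * avg
        if loud and not in_stroke:
            positions.append(start)
        in_stroke = loud
    return positions


def similarity(a, b):
    """Normalized correlation of two equal-length snippets, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_matrix(samples, positions, length=4):
    """Pairwise similarity between the snippets starting at `positions`."""
    snippets = [samples[p:p + length] for p in positions]
    return [[similarity(a, b) for b in snippets] for a in snippets]


if __name__ == "__main__":
    silence = [0] * 8
    key_a = [5, -5, 5, -5]   # toy "sound" of one key
    key_b = [5, 5, -5, -5]   # toy "sound" of another key
    audio = silence + key_a + silence + key_a + silence + key_b + silence
    pos = detect_keystrokes(audio)
    print(pos)
    for row in similarity_matrix(audio, pos):
        print([round(v, 2) for v in row])
```

Keystrokes of the same key should score close to 1 against each other, which is what the later clustering step relies on.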

Some info about the current audio dataset:

  • Length: ~70 seconds
  • Typed keys: ~230
  • Keyboard: Filco mechanical
  • Recorded on iMac, built-in mic

CTF: can you guess the text being typed?

Other kbd-audio tools

A brief description of the tools available. If the status of a tool is not “stable,” then expect problems and unexpected results.

Name             Type  Status
record           text  stable
record-full      text  stable
play             text  stable
play-full        text  stable
view-gui         gui   stable
view-full-gui    gui   stable
keytap           text  stable
keytap-gui       gui   stable
keytap2          text  development
keytap2-gui      gui   development
- extra -
guess_qp         text  experiment
guess_qp2        text  experiment
key_detector     text  experiment
scale            text  experiment
subreak          text  experiment
key_average_gui  gui   experiment

A brief description of the kbd-audio tools


record-full

Record audio to a raw binary file on disk.

./record-full output.kbd [-cN]


play-full

Play back a recording captured via the record-full tool.

./play-full input.kbd [-pN]


record

Record audio only while typing. Useful for collecting training data for keytap.

./record output.kbd [-cN]


play

Play back a recording created via the record tool.

./play input.kbd [-pN]


keytap

Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the record tool.


./keytap input0.kbd [input1.kbd] [input2.kbd] ... [-cN] [-pF] [-tF]


    -cN - select capture device N
    -pF - prediction threshold: CC > F
    -tF - background threshold: ampl > F*avg_background
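One way to read these two thresholds is as a pair of gates: a sound is considered at all only if it is loud enough compared to the background, and a guess is reported only if its cross-correlation (CC) with the training data is high enough. The Python sketch below illustrates that reading; it is not keytap's source code, and the function name, its parameters and the numbers in the usage example are hypothetical.

```python
def decide(amplitude, avg_background, best_key, best_cc, p, t):
    """Return the guessed key, or None if either threshold rejects it.

    t - background threshold: the sound must satisfy ampl > t * avg_background
    p - prediction threshold: the best match must satisfy CC > p
    """
    if not amplitude > t * avg_background:
        return None  # too quiet: treated as background noise
    if not best_cc > p:
        return None  # loud enough, but no training sound matches well
    return best_key


if __name__ == "__main__":
    # Arbitrary illustrative numbers, not recommended settings.
    print(decide(50.0, 2.0, "a", 0.8, p=0.5, t=10.0))  # loud and well matched
    print(decide(10.0, 2.0, "a", 0.8, p=0.5, t=10.0))  # rejected: too quiet
    print(decide(50.0, 2.0, "a", 0.4, p=0.5, t=10.0))  # rejected: weak match
```

Under this reading, lowering either value lets more sounds through, while raising them makes the program stricter and quieter.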


keytap-gui

Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the record tool. GUI version.

./keytap-gui input0.kbd [input1.kbd] [input2.kbd] ... [-cN]

keytap2-gui (work in progress)

Detect pressed keys via microphone audio capture. Uses statistical information (n-gram frequencies) about the language. No training data is required. The 'recording.kbd' input file has to be generated via the record-full tool and contains the audio data that will be analyzed. The 'n-gram.txt' file has to contain n-gram probabilities for the corresponding language. A sample file is provided: sample_quadgrams.txt.

./keytap2-gui recording.kbd n-gram.txt
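To show why n-gram statistics help, the Python sketch below scores candidate texts by the summed log-probability of their quadgrams, so that English-looking text scores higher than gibberish. It assumes a hypothetical file layout of one "NGRAM COUNT" pair per line; the real format of sample_quadgrams.txt may differ, and this is not keytap2's own scoring code.

```python
import math


def load_ngrams(lines):
    """Parse 'NGRAM COUNT' lines (assumed format) into log10 probabilities."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        counts[parts[0].upper()] = int(parts[1])
    if not counts:
        raise ValueError("no n-grams found")
    total = sum(counts.values())
    n = len(next(iter(counts)))  # n-gram length, e.g. 4 for quadgrams
    log_probs = {g: math.log10(c / total) for g, c in counts.items()}
    floor = math.log10(0.01 / total)  # penalty for n-grams never seen
    return n, log_probs, floor


def score(text, n, log_probs, floor):
    """Sum of log-probabilities over all n-grams of the text's letters."""
    letters = "".join(ch for ch in text.upper() if ch.isalpha())
    return sum(log_probs.get(letters[i:i + n], floor)
               for i in range(len(letters) - n + 1))


if __name__ == "__main__":
    # Tiny made-up table; a real table has many thousands of entries.
    table = ["TION 90", "THER 10"]
    n, log_probs, floor = load_ngrams(table)
    print(score("nation", n, log_probs, floor))
    print(score("nqtizo", n, log_probs, floor))
```

A decoding attempt that produces more common n-grams gets a higher (less negative) score, which gives a search procedure something to optimize.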


view-full-gui

Visualize waveforms recorded with the record-full tool. Can also play back the audio data.

./view-full-gui input.kbd


view-gui

Visualize training data recorded with the record tool. Can also play back the audio data.

./view-gui input.kbd

How to install kbd-audio

Installing kbd-audio on Kali Linux

Dependency Installation:

sudo apt install libsdl2-dev libfftw3-dev cmake

Cloning the source code and compiling the program:

git clone https://github.com/ggerganov/kbd-audio
cd kbd-audio
git submodule update --init
mkdir build && cd build
cmake ..
make

The programs are not installed into system folders; the compiled files remain in the current working directory (build).

Installing kbd-audio on Arch Linux/BlackArch

On Arch Linux, the required dependencies are installed with the following command:

sudo pacman -S sdl2 fftw cmake

Cloning the source code:

git clone https://github.com/ggerganov/kbd-audio
cd kbd-audio
git submodule update --init
mkdir build && cd build

Open the ../CMakeLists.txt file:

gedit ../CMakeLists.txt

At the very top of the file, add the following lines:

    message(WARNING "SDL2_LIBRARIES wasn't set, manually setting to SDL2::SDL2")
    set(SDL2_LIBRARIES "SDL2::SDL2")

Save and close the file.

Then continue with the following commands:

cmake ..
make

How to choose a microphone to listen to

In programs that capture audio data, you can specify the microphone that they should use. This is useful if you have multiple microphones. If the option to select a listening device is not specified, the first device will be selected.

If you have only one microphone, then you do not have to follow the tips in this section.

To view a list of audio input devices (microphones), install the alsa-utils package.

In Kali Linux, Debian and derivatives:

sudo apt install alsa-utils

In Arch Linux, BlackArch and their derivatives:

sudo pacman -S alsa-utils

To view a list of devices, enter:

arecord -l

Example output when there is only one microphone:

**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC295 Analog [ALC295 Analog]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

To specify a device, use the -c option, for example, to select a device with the number 2: -c2.

The -c option can be omitted, in which case the first device is used.

How the interception of keystrokes by sound works

The idea is that you first use the record program to collect data about how each key on the keyboard sounds. Everything is saved to a file in a special format that contains both the sounds and the buttons that were pressed.

After that, run the keytap program, specifying one or several files produced by record. The keytap program will process the data (roughly speaking, it finds the “average” sound of each individual button across all presses) and then immediately start listening to the microphone.
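The “average sound per button” idea can be sketched in Python as follows. This is a toy illustration, not keytap's real algorithm: it assumes that every recorded press is already cut to the same length and aligned, and it compares sounds with a plain normalized correlation. All function names here are hypothetical.

```python
import math


def average_waveforms(training):
    """training: dict mapping key -> list of equal-length sample lists.

    Returns one template per key: the element-wise mean of its presses.
    """
    templates = {}
    for key, presses in training.items():
        length = len(presses[0])
        templates[key] = [sum(p[i] for p in presses) / len(presses)
                          for i in range(length)]
    return templates


def correlation(a, b):
    """Normalized correlation of two equal-length snippets, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict(snippet, templates):
    """Return (key, correlation) for the template closest to the snippet."""
    best = max(templates, key=lambda k: correlation(snippet, templates[k]))
    return best, correlation(snippet, templates[best])


if __name__ == "__main__":
    # Three toy presses per key, mimicking the "at least 3 times" training.
    training = {
        "a": [[1, 0, -1, 0], [3, 0, -3, 0], [2, 0, -2, 0]],
        "b": [[0, 1, 0, -1], [0, 1, 0, -1], [0, 1, 0, -1]],
    }
    templates = average_waveforms(training)
    print(predict([4, 1, -4, 0], templates))
```

The correlation value returned alongside the key is the kind of quantity a prediction threshold can be applied to.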

So, let's move on to running record.

kbd-audio training

Start by recording a file with the record program. Choose a file name (I use output.kbd) and run:

./record output.kbd

The program will select a microphone by itself. You just need to press the buttons on the keyboard: in any order and any number of times, but each button at least three times.

When done, just press CTRL+c.

If you want to use a different microphone, specify the -c option, for example:

./record output.kbd -c2

How to run keytap

The keytap program has only one required argument: the file recorded using the record program. You can also specify the -c option with a microphone number.

./keytap output.kbd

The program will perform the necessary calculations. When it finishes, the following message will appear:

[+] Ready to predict. Keep pressing keys and the program will guess which key was pressed
    based on the captured audio from the microphone.
[+] Predicting

This means the program is ready to guess, using only the sound from the microphone.

Press buttons on the keyboard: for each press, the program prints a line that begins with the word Prediction: followed by the character it thinks you pressed.

If nothing happens, that is, if you see only the characters you enter, then you need to adjust the value of the Prediction Threshold (option -p) and the value of the Background Threshold (option -t).

Since we don’t know which numbers to specify, it’s better to run the program with a graphical interface:

./keytap-gui output.kbd

After launching, begin pressing the buttons on the keyboard. If nothing happens, move the sliders to find suitable threshold levels. The upper slider is Threshold CC and the second slider is Threshold background. If you set the values too low, background noise (for example, from the cooling system) will be processed by the program and it will output incorrectly guessed letters.

After you determine the appropriate values, you can start keytap by specifying these thresholds in the command line:

./keytap output.kbd -p0.6 -t0.3

If everything works for you, then at this stage you can UNPLUG the KEYBOARD (pull out the cord) and keep pressing the buttons. Even with a regular wired keyboard, keytap will still show the characters for the buttons you press. That is, the keyboard can be completely powered off, yet the program will know which buttons you pressed from the sounds alone!

The author of the program writes that the tool started working properly with a mechanical keyboard (separate from the laptop). With a laptop keyboard, recognition can be difficult.


The programs look interesting. I had long wondered myself whether text could be decoded from the sounds of pressed keyboard buttons. Despite various limitations, the concept turned out to be impressive.
