Ping from Scratch in C

April 11, 2026

Over the past month I built a working ping utility in C - twice. The first version uses raw sockets and lets the OS handle the IP header. The second version builds every layer manually: ICMP, IP, and Ethernet, byte by byte, down to ARP resolution and AF_PACKET sockets. Neither version uses any networking library. Both were built the same way as my previous projects - wrong attempts, specific questions, and trying again. Claude played the same role as before: not a code generator, but something closer to a patient senior engineer I could ask "why does this work?" rather than "what do I write?"


Why This Project

The HTTP library taught me what a network client does from the application layer down to syscalls. But there was still a layer I hadn't touched: everything below IP. When you use SOCK_RAW with AF_INET, the OS quietly builds an Ethernet frame around your packet, resolves the destination MAC address, and sends it out the right interface. You never see any of that. I wanted to see it.

The reference was Dr. Jonas Birch's ping series. I read summaries of his approach and looked at his header files, but I tried to build the implementation myself first. The gaps between what I understood and what the code required were where the actual learning happened.


Part 1: Standard Ping

Located in simple_version/icmp.c. This version took a day. It uses SOCK_RAW + IPPROTO_ICMP, which means the OS handles the IP header - you provide ICMP and everything below that is hidden. The interesting parts were:

The checksum. RFC 1071 defines a ones-complement sum over 16-bit chunks. You sum them into a 32-bit accumulator (because adding 16-bit values can overflow 16 bits), fold any overflow back into the lower 16 bits, then bitflip the result. The elegant property is that computing the checksum over a buffer that already contains a valid checksum gives you zero. That's how verification works - no need to compare against anything.
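A minimal sketch of that sum, fold, and flip procedure, not the code from icmp.c. The name inet_checksum is made up for illustration, and the function reads the buffer as big-endian 16-bit words so the result does not depend on the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over len bytes.
 * Returns the checksum as a host-order value; write it into the packet
 * high byte first. inet_checksum is an illustrative name. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;   /* 32-bit accumulator absorbs 16-bit overflow */
    size_t i;

    /* Add the buffer as a sequence of big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

    /* An odd trailing byte is treated as if padded with a zero byte. */
    if (i < len)
        sum += (uint32_t)buf[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;   /* ones-complement of the folded sum */
}
```

For an echo request header with type 8, code 0, identifier 1, sequence 1, and no payload, this returns 0xF7FD. Once those two bytes are written into the checksum field, running the function over the same buffer returns 0, which is the verification property described above.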

RTT measurement. CLOCK_MONOTONIC instead of gettimeofday(). Wall clock time can jump backwards if the system clock is adjusted. Monotonic time never does.

WSL2 doesn't work for raw sockets. The Windows NAT layer intercepts ICMP replies before they reach the raw socket. tcpdump showed requests leaving but no replies arriving. System ping worked; raw sockets didn't. The fix was moving to a VirtualBox VM running Ubuntu.

The first successful output - 200-something milliseconds RTT, real reply from 8.8.8.8 - was satisfying in a way that's hard to explain. You wrote the byte that went out on the wire. You parsed the byte that came back.


Part 2: Everything from Scratch

This is where the month actually went.

Preparation

Before writing any code I solved LeetCode #271 Encode and Decode Strings. This wasn't arbitrary. The eval_* functions in this codebase are serializers - they take a structured in-memory representation and produce a flat byte stream with known field sizes. That's encoding. The receiver reads fixed offsets rather than delimiters and rebuilds the structure. That's decoding. Solving #271 first meant I was implementing a known pattern, not inventing something.

I also worked through #206, #21, and #146 before touching the Ethernet layer - linked list and cache problems that built the pointer chain intuition needed for the three-layer struct hierarchy.

The Two-Struct Pattern

Every protocol layer in this codebase follows the same structure:

logical struct  →  eval_*()  →  raw packed struct  →  wire bytes
wire bytes  →  recv*()  →  logical struct

The logical struct is optimized for your code: pointer types, native integer sizes, internal enum tags instead of wire values. The raw struct is optimized for the wire: packed with __attribute__((packed)), fields in RFC-specified order, multi-byte values in network byte order. The eval_* function translates between them at send time. The recv* function translates in reverse at receive time. All the messy translation is centralized. Everything else works with clean logical structs.

LeetCode #271's connection to this pattern was immediate once I'd built eval_icmp(). Both encode structured data into a flat byte stream. The key insight in both cases: the receiver doesn't need delimiters if it knows the field sizes in advance. Fixed offsets, not delimiters. That's what network protocols are.


Phase 6 - ICMP Layer

The first thing to build was the ICMP layer using Jonas's architecture, starting from a blank file. This established the template that every subsequent layer would follow.

The icmp logical struct holds a kind enum (echo, echoreply), identifier, seq_num, size, and a uint8_t *data pointer. The enum is important - rather than hardcoding type = 8 scattered everywhere, the enum gives the wire value a name. The translation from echo to 8 on the wire happens in exactly one place: inside eval_icmp().

The rawicmp raw struct matches RFC 792 exactly: Type, code, checksum, identifier, seq_num, and a flexible array member data[]. The flexible array member is what lets you treat the struct as the header portion of a larger flat buffer - sizeof(rawicmp) gives you just the header size, and the payload sits immediately after in memory.

eval_icmp() follows the sequence established here, which none of the three layers ever deviated from:

  1. Populate a rawicmp on the stack
  2. Allocate a heap buffer of sizeof(rawicmp) + payload_size
  3. Copy header in, advance write cursor, copy payload after it
  4. Compute checksum over the assembled buffer, patch it in place
  5. Return ret (the start of the buffer) - caller frees it

One detail that matters: the checksum field must be zeroed before computation. You compute the checksum over the assembled buffer including the checksum field - if that field contains a stale value, the result is wrong. So: zero first, assemble, compute last, patch in place.
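The steps above can be sketched roughly as follows. This is not the project's eval_icmp(): the struct and function names (icmp_msg, raw_icmp, encode_icmp, fold16) are stand-ins chosen for illustration, and the field layout is taken from the RFC 792 description above.

```c
#include <arpa/inet.h>   /* htons */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum icmp_kind { KIND_ECHO, KIND_ECHOREPLY };

/* Logical struct: convenient for the program (hypothetical layout). */
struct icmp_msg {
    enum icmp_kind kind;
    uint16_t identifier, seq_num, size;
    const uint8_t *data;
};

/* Raw struct: RFC 792 field order, packed, network byte order on the wire. */
struct raw_icmp {
    uint8_t  type, code;
    uint16_t checksum, identifier, seq_num;
    uint8_t  data[];
} __attribute__((packed));

/* RFC 1071 checksum, returned as a host-order value. */
static uint16_t fold16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns a heap buffer holding header + payload; the caller frees it. */
uint8_t *encode_icmp(const struct icmp_msg *m, size_t *out_len)
{
    struct raw_icmp hdr;                         /* 1. header on the stack */
    hdr.type = (m->kind == KIND_ECHO) ? 8 : 0;   /* the only enum-to-wire mapping */
    hdr.code = 0;
    hdr.checksum = 0;                            /* zeroed before the sum */
    hdr.identifier = htons(m->identifier);
    hdr.seq_num = htons(m->seq_num);

    size_t total = sizeof hdr + m->size;         /* 2. heap buffer */
    uint8_t *ret = malloc(total);
    if (ret == NULL)
        return NULL;

    uint8_t *p = ret;                            /* 3. header, then payload */
    memcpy(p, &hdr, sizeof hdr);
    p += sizeof hdr;
    if (m->size > 0)
        memcpy(p, m->data, m->size);

    uint16_t c = fold16(ret, total);             /* 4. checksum, patched in place */
    ret[2] = (uint8_t)(c >> 8);
    ret[3] = (uint8_t)(c & 0xFF);

    *out_len = total;
    return ret;                                  /* 5. start of the buffer */
}
```

The checksum is patched byte by byte at offsets 2 and 3 rather than through the struct field, which sidesteps any byte-order question for that one value.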

The mkicmp() constructor, show_icmp() debug printer, and free_icmp() cleanup function completed the layer. Every subsequent layer would have the exact same set of companions. This repetition was intentional - the pattern becomes mechanical by Phase 8.

Decision that carried forward: identifier and seq_num were kept in the ICMP header per RFC 792, not nested in an s_ping payload struct as Jonas does. This flattened the struct chain: fields are reached directly, as in e->payload->payload->size, with no extra level of nesting to get to the sequence number. Simpler, and still spec-compliant.


Phase 7 - IP Layer

The IP layer introduced the most technically interesting struct definition in the project.

The IP header stores version in the upper 4 bits and ihl in the lower 4 bits of the first byte. The naive reading of RFC 791 would lead you to declare version:4 first, then ihl:4. On a big-endian machine that would be correct. On little-endian x86, GCC fills bit fields from the least significant bit upward within each byte - so the first-declared field gets the low bits. To place version in the high nibble as the RFC requires, ihl must be declared first. The same logic applies to ecn:2 and dscp:6 sharing the second byte.

This isn't particularly mysterious once you know about it. The bug manifests in tcpdump output - you see version 0 instead of version 4, and it's immediately obvious something is wrong with the byte layout. From there the reasoning to the fix is direct. It became intuitive fairly quickly once the underlying principle (LSB-first fill on little-endian) clicked.
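A small sketch of the declaration order, assuming GCC or Clang (for __BYTE_ORDER__, the packed attribute, and uint8_t bit fields). The struct name ip_first_bytes and the helper are illustrative and cover only the first two bytes of the header.

```c
#include <stdint.h>
#include <string.h>

/* First two bytes of an IPv4 header. On little-endian targets the
 * first-declared bit field gets the low bits, so ihl and ecn come first. */
struct ip_first_bytes {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    uint8_t ihl:4;       /* low nibble of byte 0 */
    uint8_t version:4;   /* high nibble of byte 0, as RFC 791 requires */
    uint8_t ecn:2;       /* low 2 bits of byte 1 */
    uint8_t dscp:6;      /* high 6 bits of byte 1 */
#else
    uint8_t version:4;
    uint8_t ihl:4;
    uint8_t dscp:6;
    uint8_t ecn:2;
#endif
} __attribute__((packed));

/* Returns byte 0 exactly as it would appear on the wire. */
uint8_t ip_version_ihl_byte(uint8_t version, uint8_t ihl)
{
    struct ip_first_bytes h;
    uint8_t first;

    memset(&h, 0, sizeof h);
    h.version = version;
    h.ihl = ihl;
    memcpy(&first, &h, 1);
    return first;
}
```

With version 4 and ihl 5 the byte comes out as 0x45. Swapping the two declarations in the little-endian branch would produce 0x54, which is the kind of mismatch that shows up in tcpdump.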

The other significant addition in Phase 7 was recvip() - the receive path mirror of evalip(). Where evalip() builds outward from logical structs to flat bytes, recvip() peels inward. It casts the receive buffer to rawip *, reads IHL to compute the header size, advances past it to find the ICMP data, verifies both checksums, and reconstructs the nested logical struct chain.
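The "peel inward" step can be sketched as below. This is a simplified illustration, not recvip() itself: it reads header fields by byte offset instead of casting to a struct, leaves out checksum verification and the rebuilding of logical structs, and ip_payload is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Given n received bytes starting at an IPv4 header, return a pointer to
 * the ICMP data and store its length, or return NULL if the packet is
 * malformed or not ICMP. */
const uint8_t *ip_payload(const uint8_t *buf, size_t n, size_t *payload_len)
{
    if (n < 20)                               /* minimum IPv4 header */
        return NULL;
    if ((buf[0] >> 4) != 4)                   /* version lives in the high nibble */
        return NULL;

    size_t hdr_len = (size_t)(buf[0] & 0x0F) * 4;   /* IHL counts 32-bit words */
    if (hdr_len < 20 || hdr_len > n)
        return NULL;

    size_t total = ((size_t)buf[2] << 8) | buf[3];  /* total length, big-endian */
    if (total < hdr_len || total > n)
        return NULL;

    if (buf[9] != 1)                          /* protocol 1 = ICMP */
        return NULL;

    *payload_len = total - hdr_len;
    return buf + hdr_len;                     /* first byte after the IP header */
}
```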

Decision that carried forward: recvip() was initially written to call recvfrom() internally - it owned both receiving and parsing. In Phase 8 this had to change. With AF_PACKET, what arrives from the socket is a full Ethernet frame, not an IP packet. If recv_frame() and recvip() both called recvfrom(), they'd read two different frames. The fix was making recvip() accept a uint8_t *buf and uint16_t n from the caller - recv_frame() handles the socket read, strips the Ethernet header, and passes just the IP bytes in. The refactor was clean but required updating the signature and removing the socket-related logic from inside recvip().

IP_HDRINCL removal in Phase 8: In Phase 7, setup() used AF_INET + SOCK_RAW with IP_HDRINCL set - telling the OS "I'm providing the IP header myself, don't add one." This made sense because the OS would otherwise prepend its own IP header on top of ours, producing a double-header packet. In Phase 8, switching to AF_PACKET made this option meaningless - at the Ethernet layer, the OS never touches the IP header at all, so there's nothing to suppress.


Phase 8 - Ethernet Layer

Phase 8 was the hardest part of the project by a significant margin. Not because any individual piece was complicated, but because the number of new concepts arriving simultaneously was high, and the feedback loop was opaque. A wrong bit in the IP layer gets flagged by tcpdump with a clear error. A wrong EtherType or malformed ARP packet just produces silence - the frame disappears and you don't know why.

The socket change. Moving from AF_INET to AF_PACKET + SOCK_RAW was the most significant mechanical change. With AF_INET, the OS builds the Ethernet frame, resolves the destination MAC, and sends it out the right interface. You never think about any of that. With AF_PACKET, the OS stops touching your data. What you write to the socket goes on the wire verbatim, Ethernet header and all. What you read from the socket is the complete raw frame exactly as the NIC received it.

Header ordering conflicts. <net/if.h> must be included before <linux/if_packet.h>. Both define struct ifreq, but through different header hierarchies - <linux/if_packet.h> internally pulls in <linux/if.h>, and if <net/if.h> hasn't been seen first, the compiler gets two incompatible definitions of the same struct. The fix is trivial once you understand the cause, but diagnosing it requires knowing that glibc headers and kernel headers can define the same types in incompatible ways.

sockaddr_ll and interface indices. AF_INET sockets route by IP address. AF_PACKET sockets don't route at all - they just need to know which physical network interface to send the frame out on. That's what sockaddr_ll.sll_ifindex provides: an integer representing the interface. if2idx() resolves an interface name like "enp0s3" to its integer index using ioctl(SIOCGIFINDEX).

Getting our own MAC and IP. send_arp() needs your machine's MAC and IP to fill in the sender fields of the ARP request. These come from get_mac() and get_ip(), both following the same ioctl() pattern as if2idx() - fill a struct ifreq with the interface name, call with a different request code (SIOCGIFHWADDR for MAC, SIOCGIFADDR for IP), read the result from a different field. get_mac() has a wrinkle: the result comes back as a 6-byte array in ifr.ifr_hwaddr.sa_data, and you want to store it in a uint64_t:48 bit field. Bit fields can't have their address taken - &result.addr is illegal. The fix is copying into a uint64_t tmp first, then assigning result.addr = tmp.
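The bit-field workaround in isolation, as a sketch. The struct name mac48 and the helper mac_from_bytes are illustrative. In get_mac() the six input bytes would come from ifr.ifr_hwaddr.sa_data after the SIOCGIFHWADDR call. The shift loop is an assumption of this sketch: it puts byte 0 in the lowest 8 bits, which matches what a 6-byte copy into a zeroed uint64_t does on little-endian hardware.

```c
#include <stdint.h>

/* 48-bit MAC stored in a bit field (uint64_t bit fields are a GCC/Clang
 * extension). Writing &m.addr would not compile. */
struct mac48 {
    uint64_t addr:48;
} __attribute__((packed));

struct mac48 mac_from_bytes(const unsigned char bytes[6])
{
    uint64_t tmp = 0;            /* ordinary variable: addressable, full width */
    struct mac48 result;

    for (int i = 0; i < 6; i++)
        tmp |= (uint64_t)bytes[i] << (8 * i);

    result.addr = tmp;           /* plain assignment into the bit field */
    return result;
}
```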

ARP. This was the piece with no equivalent in Phases 6 or 7. Before you can send an Ethernet frame to 8.8.8.8, you need the gateway's MAC address. 8.8.8.8 is not on your local network - the frame goes to your router first, which forwards it outward. But to address an Ethernet frame to your router, you need its MAC, and you don't know it.

ARP resolves this. You broadcast a frame to FF:FF:FF:FF:FF:FF - every device on the local network receives it - asking "who has IP 10.0.2.2?" The router recognizes its own IP and replies with its MAC in the sender hardware address field (sha). That MAC then becomes the dst field of every ping frame you send.

send_arp() builds the frame manually into a stack-allocated buffer. It can't use sendframe() because sendframe() calls evalether() which calls evalip() which assumes the payload is an IP packet. ARP has no IP layer - the ARP struct sits directly inside the Ethernet frame.
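A sketch of what such a frame contains, written into a caller-supplied buffer. It is not the project's send_arp(): build_arp_request and ARP_FRAME_LEN are made-up names, the socket write is left out, and the layout follows the standard ARP format for Ethernet and IPv4.

```c
#include <stdint.h>
#include <string.h>

#define ARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body */

/* Fill out[] with a broadcast ARP request: "who has target_ip? tell src_ip". */
void build_arp_request(uint8_t out[ARP_FRAME_LEN],
                       const uint8_t src_mac[6], const uint8_t src_ip[4],
                       const uint8_t target_ip[4])
{
    static const uint8_t fixed[8] = {
        0x00, 0x01,    /* hardware type: Ethernet */
        0x08, 0x00,    /* protocol type: IPv4 */
        6, 4,          /* hardware / protocol address lengths */
        0x00, 0x01     /* op: 1 = request */
    };
    uint8_t *p = out;

    memset(p, 0xFF, 6);        p += 6;   /* Ethernet dst: broadcast */
    memcpy(p, src_mac, 6);     p += 6;   /* Ethernet src */
    *p++ = 0x08; *p++ = 0x06;            /* EtherType: ARP */

    memcpy(p, fixed, 8);       p += 8;
    memcpy(p, src_mac, 6);     p += 6;   /* sha: sender hardware address */
    memcpy(p, src_ip, 4);      p += 4;   /* spa: sender protocol address */
    memset(p, 0, 6);           p += 6;   /* tha: unknown, so zero */
    memcpy(p, target_ip, 4);             /* tpa: the IP being resolved */
}
```

The ARP body starts at byte 14, directly after the Ethernet header, with no IP header in between.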

recv_arp() listens for the reply. The op field distinguishes requests from replies (1 = request, 2 = reply). The gateway's MAC is in the sha (sender hardware address) field - the gateway is the sender of the reply. On error it returns a zeroed mac struct as a sentinel, since a real MAC is never all zeros.
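The parsing half can be sketched as a function over an already-received frame. This is an illustration, not recv_arp(): arp_reply_sender_mac is a made-up name, it takes a buffer instead of reading the socket, and it returns a status code in addition to zeroing the output on failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* If frame holds an ARP reply, copy the sender hardware address (sha)
 * into mac_out and return 0. Otherwise zero mac_out and return -1. */
int arp_reply_sender_mac(const uint8_t *frame, size_t n, uint8_t mac_out[6])
{
    int ok = n >= 42                                   /* Ethernet + ARP body */
          && frame[12] == 0x08 && frame[13] == 0x06    /* EtherType: ARP */
          && frame[20] == 0x00 && frame[21] == 0x02;   /* op: 2 = reply */

    if (!ok) {
        memset(mac_out, 0, 6);    /* all-zero MAC as the error sentinel */
        return -1;
    }
    memcpy(mac_out, frame + 22, 6);   /* sha sits right after the 8 fixed bytes */
    return 0;
}
```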

recv_frame() and the redesign of recvip(). With AF_PACKET, the buffer from recvfrom() starts with the Ethernet header, not the IP header. recv_frame() makes the one recvfrom() call, casts to rawether * to read the Ethernet fields, checks the EtherType, subtracts sizeof(rawether) from the byte count, and passes buf + sizeof(rawether) and the reduced n to recvip(). The pointer arithmetic buf + sizeof(rawether) is identical in structure to what recvip() already does internally to skip past the IP header and find ICMP: buf + sizeof(rawip).
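The header-stripping step on its own, as a sketch. ether_payload_ipv4 is a hypothetical name, and the constant 14 plays the role of sizeof(rawether) for an untagged Ethernet header. The real recv_frame() also performs the recvfrom() call and hands the result to recvip().

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14   /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Given n bytes of a received frame, return a pointer to the IP packet
 * inside it and store the reduced length, or NULL if it is not IPv4. */
const uint8_t *ether_payload_ipv4(const uint8_t *buf, size_t n, size_t *ip_len)
{
    if (n < ETHER_HDR_LEN)
        return NULL;
    if (buf[12] != 0x08 || buf[13] != 0x00)   /* EtherType 0x0800 = IPv4 */
        return NULL;

    *ip_len = n - ETHER_HDR_LEN;              /* byte count minus the header */
    return buf + ETHER_HDR_LEN;               /* same move recvip() makes one layer up */
}
```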

The moment it worked. tcpdump output after getting ARP and ping both working:

ARP, Request who-has 10.0.2.2 tell 10.0.2.15
ARP, Reply 10.0.2.2 is-at 52:54:00:12:35:02
IP 10.0.2.15 > 8.8.8.8: ICMP echo request, id 1, seq 1, length 19
IP 8.8.8.8 > 10.0.2.15: ICMP echo reply, id 1, seq 1, length 19

Your code asked a question on the wire. The network answered. Your code sent a packet using that answer. Google replied.


What the OS Was Hiding

The comparison between Part 1 and Part 2 is the most useful thing to sit with. In the AF_INET versions, setup() came down to:

int one = 1;
int s = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);        /* IPPROTO_ICMP is 1 */
setsockopt(s, SOL_IP, IP_HDRINCL, &one, sizeof(one));   /* added in Phase 7, when the code began supplying its own IP header */

The whole function was about twenty lines. The OS handled interface selection, MAC address resolution, Ethernet frame construction, ARP, and routing decisions - silently, on every packet. Part 2 replaces that silence with about 400 lines of explicit code that does the same things, visibly.

The OS's network stack does exactly what Part 2 does, just faster and with all the error cases handled. Once you've written it yourself, the abstraction feels different - you know what's underneath it.


What the Pace Actually Looked Like

Honest accounting: the project took about five weeks. I intended two. The gap was almost entirely procrastination, not complexity. Some days I sat down and moved fast. Most days I didn't sit down at all.

The parts that actually took time had nothing to do with the networking concepts. The bit field ordering in the IP header was solved in an hour once I hit it. The header ordering conflict was diagnosed in one session. Phase 8 was genuinely the hardest part, but not because any single piece was opaque - it was the volume of new things arriving at once: AF_PACKET, sockaddr_ll, ioctl() for interface queries, ARP from scratch, recv_frame() design, the recvip() refactor. Each piece individually was manageable. All of them together, with a feedback loop that often just produces silence when something is wrong, made progress feel slow.

What took time was not sitting down consistently enough to build momentum. The lesson I'm still working on: the pace of understanding is actually fast once you're working. The slowdown is almost always upstream of the code.


What Actually Clicked

The two-struct pattern is more general than networking. You'll see it wherever code has to deal with serialization - file format parsers, IPC mechanisms, storage engines, protocol implementations. Convenient in-memory representation on one side, wire format on the other, translation centralized in one place. The pattern has a name once you've built it.

Each layer's design decision echoed into the layers above it. Choosing how to store the ICMP payload in Phase 6 determined what evalip() had to navigate in Phase 7. Making recvip() accept a buffer pointer instead of a socket in Phase 7 made recv_frame() composable in Phase 8. The architecture isn't static - it evolves, and earlier choices have consequences.

The OS is doing a lot. Part 1 used AF_INET and didn't think about MAC addresses or ARP or interface indices. Part 2 made all of that visible. Every ping your laptop sends, at some point, goes through exactly the steps you wrote in Part 2. The OS just does it faster and handles all the error cases.

tcpdump is the most useful debugging tool for this kind of work. Every significant bug in this project was diagnosed by staring at tcpdump output and asking why the bytes didn't match what the RFC said they should be. It makes the invisible visible.

Reading production source helps. The net/ethernet/eth.c kernel source confirmed the ETH_P_IP and ETH_P_ARP values. The iputils ping source showed how a production implementation handles things you'd never think of. You don't have to understand every line - just enough to see how your approach relates to how real systems do it.

Both projects were built the same way: wrong first, then corrected, then understood. The code that ended up in the final version doesn't show the wrong versions that preceded it. That's always the part that's hard to show - but it's the part that actually happened.