Every web framework (Express, Flask, Django, Rails) does the same thing under the hood. It creates a socket, binds it to a port, listens for connections, accepts them, reads requests, and writes responses. The framework just hides it behind a single call.
C doesn't hide anything. When you write a web server in C, you see every system call, every file descriptor, every queue. That's what makes it the best way to understand what's actually happening.
File Descriptors
In Unix, everything is a file. Sockets, pipes, printers: they all get a file descriptor (fd), which is just a non-negative integer the OS uses to track open resources.
Our server needs two:
```c
int server_fd; // The listening socket, one per server
int client_fd; // A connection, one per client
```

The server fd is created once and lives for the lifetime of the server. Client fds are created each time a connection is accepted, and destroyed when the connection closes. There's one server_fd but potentially thousands of client_fds.
Creating a Socket
The first system call is socket(). It asks the kernel for a new communication endpoint:
```c
server_fd = socket(AF_INET, SOCK_STREAM, 0);
```

Three parameters:

- `AF_INET`: IPv4 address family. Use `AF_INET6` for IPv6.
- `SOCK_STREAM`: A reliable, ordered byte stream. This means TCP. For UDP, you'd use `SOCK_DGRAM`.
- `0`: Let the kernel pick the protocol. For `SOCK_STREAM`, it always picks TCP.
At this point, we have a socket but it's not attached to any address or port. It's like having a phone but no phone number.
Configuring the Address
Before binding, we fill out a sockaddr_in structure that tells the kernel where to listen:
```c
struct sockaddr_in address;
address.sin_family = AF_INET;           // IPv4
address.sin_addr.s_addr = INADDR_ANY;   // All interfaces (0.0.0.0)
address.sin_port = htons(8080);         // Port in network byte order
```

INADDR_ANY means "listen on every network interface": Wi-Fi, Ethernet, loopback, all of them. This is convenient for development but risky in production. Many database leaks happen because someone accidentally exposed MongoDB on a public interface this way.
The htons() call is subtle but critical. Network protocols use big-endian byte order, but your CPU might use little-endian. htons (host-to-network-short) converts the port number to the right format:
Consider the 16-bit port number 8080, which is 0x1F90 in hex. A little-endian host stores it in memory as [0x90, 0x1F], low byte first. The network expects big-endian: [0x1F, 0x90], high byte first.
Without htons(), the host sends bytes in little-endian order. The receiver interprets them as big-endian and reads port 36895 instead of 8080.
Binding and Listening
With the address configured, two calls activate the server:
```c
// Step 1: Register the address
bind(server_fd, (struct sockaddr *)&address, sizeof(address));

// Step 2: Actually start listening
listen(server_fd, 10);
```

This is a detail most people miss: bind() does not start listening. It only associates the socket with the address and port. The kernel hasn't created any internal data structures for accepting connections yet.
listen() is what flips the switch. It creates the accept queue, a kernel-managed buffer where completed TCP connections wait to be picked up by your application. The second parameter (10) is the backlog: the maximum number of connections that can sit in this queue.
The Accept Queue
This is where things get interesting. When a client connects to your server, a three-step handshake happens entirely in the kernel:
- Client sends SYN → "I want to connect"
- Server responds SYN-ACK → "Acknowledged, go ahead"
- Client sends ACK → "Connected"
Once complete, the connection is placed in the accept queue. It sits there until your application calls accept().
If the queue fills up because your application is too slow calling accept(), new connections get rejected. On Linux, by default, the kernel simply ignores the incoming SYN rather than responding with a SYN-ACK; the client retransmits and eventually sees a connection timeout (or a reset, if tcp_abort_on_overflow is enabled).
This is one of the most common production bottlenecks. A slow backend that can't drain the accept queue fast enough will silently drop connections.
The Server Loop
Every socket server has an infinite loop at its core. Accept a connection, read the request, send a response, close:
```c
while (1) {
    // 1. Accept (blocks until a connection arrives)
    client_fd = accept(server_fd, ...);

    // 2. Read (blocks until data is available)
    read(client_fd, buffer, 1024);

    // 3. Write the response
    write(client_fd, http_response, strlen(http_response));

    // 4. Close the connection
    close(client_fd);
}
```

Each of these calls can block. accept() halts execution until there's a connection in the queue. read() halts until the client sends data. This is why a single-threaded blocking server can only handle one request at a time, and why async I/O frameworks exist.
One detail worth noticing at startup: socket() returns the lowest free descriptor, so in a fresh process it's typically fd 3, right after stdin (0), stdout (1), and stderr (2). At that point no address or port is assigned yet.
Data Flow: The Hidden Copies
When data moves between a client and your server, it doesn't teleport. It gets copied multiple times:
- Client sends data → arrives at your NIC (network interface card)
- NIC copies it to the kernel receive queue (per-connection buffer)
- Your read() call copies it from the kernel into your application buffer
- You process it and call write(), which copies your response to the kernel send queue
- The kernel copies it to the NIC for transmission
That's 3-4 memory copies for a single request-response cycle. Every one of those copies costs CPU time and memory bandwidth. This is why kernel developers are obsessed with "zero-copy" techniques. Reducing or eliminating these copies is one of the biggest performance wins in networking.
The Complete Server
Here's everything tied together in a working HTTP server, about 40 lines of C:
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define APP_MAX_BUFFER 1024
#define PORT 8080

int main() {
    int server_fd, client_fd;
    struct sockaddr_in address;
    socklen_t address_len = sizeof(address);
    char buffer[APP_MAX_BUFFER] = {0};

    // Create socket
    server_fd = socket(AF_INET, SOCK_STREAM, 0);

    // Configure address
    memset(&address, 0, sizeof(address));
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(PORT);

    // Bind and listen
    bind(server_fd, (struct sockaddr *)&address, address_len);
    listen(server_fd, 10);

    // Server loop
    while (1) {
        client_fd = accept(server_fd,
                           (struct sockaddr *)&address,
                           &address_len);
        read(client_fd, buffer, APP_MAX_BUFFER);

        const char *response =
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            "Content-Length: 13\r\n\r\n"
            "Hello, World!";
        write(client_fd, response, strlen(response));
        close(client_fd);
    }
    return 0;
}
```

Compile and run it with `gcc -g server.c -o server && ./server`, then hit `curl localhost:8080` and you'll see "Hello, World!".
Try it with a debugger. The system calls in the listing (socket, bind, listen, accept, read, write) are where the real work happens. Set breakpoints on them with GDB (`gdb ./server`, then `break` on the lines containing accept() and read()) and step through. You'll see the file descriptors change, the accept call block, and the data flow in real time. The `-g` flag in the compile command includes the debug symbols for this.
For a fully annotated version of this server with detailed comments explaining every system call, check out the complete source on GitHub.
Wrapping Up
Every web server, whether it's Nginx serving millions of requests or a tiny Flask app, does exactly what we just built. The difference is in how they handle the accept loop:
- Single-threaded blocking (what we built): one connection at a time. Simple but can't scale.
- Multi-threaded: spawn a thread per connection. Better, but threads are expensive.
- Event-driven (epoll/kqueue): monitor thousands of connections with a single thread. This is what Node.js and Nginx do.
- io_uring: the newest Linux approach, reducing system call overhead even further.
Understanding the raw socket layer makes every framework decision click. When Express says it's "non-blocking," you now know what's not blocking. When Nginx talks about its "worker connections," you know it's about accept queue management. When someone says "zero-copy," you know exactly which copies they're trying to eliminate.
Resources
- Hussein Nasser's YouTube channel. This post was inspired by his backend engineering videos. His explanations of networking internals, TCP, and server architecture are some of the best on YouTube.
- Beej's Guide to Network Programming. The best free resource for learning socket programming in C. Covers everything from basic sockets to select(), poll(), and advanced techniques.
- How TCP Backlog Works in Linux. A deep dive into the SYN queue vs accept queue distinction and how the backlog parameter actually behaves.
- Eli Bendersky's Concurrent Servers series. A four-part series that starts where this post ends, building progressively from threads to select() to epoll.