Most people regularly use email services but know little about how emails are delivered from one destination to another. In fact, there’s a lot going on under the hood. In this article, we’ll cover SMTP, which is the main protocol behind sending emails.
The inner workings of sending an email can seem completely opaque to end-users. It might seem that all it takes to send an email is to compose it, specify the recipient, and hit ‘Send’. Behind the scenes, however, there are many protocols working in concert to ensure that an email is successfully delivered. Chief among these is the Simple Mail Transfer Protocol (SMTP), perhaps somewhat of a misnomer since orchestrating how an email flows from one system to another is anything but simple.
In short, SMTP is a protocol from the TCP/IP suite of protocols that defines rules for transmitting an email from the sender’s computer over one or multiple networks to the recipient. More precisely, SMTP is mainly concerned with outgoing emails; incoming emails use SMTP in addition to other protocols. For those interested, a detailed description of the most recent version of SMTP and all of the commands and responses involved in a mail transaction can be found in RFC 5321.
A short history of email and SMTP
The first email is attributed to ARPANET, a project from the 1960s that aimed to connect research sites across the US and allow them to share resources. Before then, companies had used proprietary networking protocols that allowed communication only between computers of the same type connected to the same mainframe system acting as their intermediary. Cross-platform and remote connectivity was not yet possible. Today, we take for granted being able to connect to a remote server regardless of the type of device we’re using, or whether we’re connecting to the internet via an Ethernet cable or a 5G network. This is all thanks to the TCP/IP protocols that the developers of ARPANET created for their network.
The TCP/IP suite of protocols brought a quantum leap in network connectivity and enabled computers to connect regardless of hardware or network architecture. TCP/IP’s biggest contribution was in endowing each network end-point with the capacity to completely manage its communication with other devices, thus disposing of the middleman. Additionally, TCP/IP allows different protocols to be called upon for different applications. This is what allows SMTP to implement only the specifics of email transmission and ignore low-level implementation details (e.g. adding headers to data packets) since such tasks are performed by other protocols.
After the first email was sent across the ARPANET in 1971, it took slightly over a decade for the internet community to finally agree on SMTP as the standard for email transmission. The original (and verbose) specification was published as RFC 821 in 1982; however, the following sections will provide a higher-level step-by-step explanation of what SMTP is and how it works.
How does SMTP work?
The key users of the SMTP protocol are SMTP clients and SMTP servers. Let’s have a look at how clients and servers interact.
Email client
As an end-user, you interact only with the email client. For example, if you use Gmail, your email client is the Gmail interface that you access through a web browser or via a mobile app. Email clients take care of tasks such as forwarding your outgoing emails to your mail server (which then forwards them to the recipient), or listening to the mail server to make sure that all emails addressed to you are downloaded to your mailbox. The backend service that powers Gmail implements the SMTP protocol, so you can use a client like Mail.app on your Mac to send an email through a Gmail server over SMTP.
Email server
When you send an email, your email client (e.g. Gmail client) sends it to your mail server (e.g. Google’s server). The server then performs a DNS lookup to find the MX record for the recipient's domain (the part after the ‘@’ in user@domain.extension). An MX record is a piece of information that gives the server the location of the recipient’s mail server. For example, for Mailosaur it would be the following:
mailosaur.net. 86400 IN MX 0 mailosaur.net.
This record includes quite a few items, the most important being the final one, which specifies that all emails sent to anyone using the @mailosaur.net domain are sent to a server located at mailosaur.net, or rather, to the IP address of that server.
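To make the fields of that record easier to see, here is a minimal Python sketch that splits the line into named parts. The `MXRecord` type and `parse_mx_record` function are hypothetical names chosen for illustration; the sketch only parses text and does not perform a real DNS lookup.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    name: str          # domain the record applies to
    ttl: int           # seconds a resolver may cache the record
    record_class: str  # "IN" stands for internet
    record_type: str   # "MX" stands for mail exchange
    preference: int    # lower values are tried first
    exchange: str      # hostname of the mail server


def parse_mx_record(line: str) -> MXRecord:
    """Split a zone-file style MX line into its six fields."""
    name, ttl, record_class, record_type, preference, exchange = line.split()
    if record_type != "MX":
        raise ValueError(f"not an MX record: {line!r}")
    return MXRecord(name, int(ttl), record_class, record_type,
                    int(preference), exchange)


record = parse_mx_record("mailosaur.net. 86400 IN MX 0 mailosaur.net.")
print(record.exchange)      # mailosaur.net.
print(record.ttl // 3600)   # 24, i.e. the record may be cached for 24 hours
```

When a domain publishes several MX records, the sending server would normally try the one with the lowest preference value first.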
SMTP session
Once the destination server is found, the sender’s mail server connects with the recipient’s mail server in what is known as an SMTP session. This session facilitates communication between the sender and the recipient by acting as their intermediary.
The session begins with a ‘handshake’ whose purpose is for the server and the client to identify themselves and agree on the reason for the connection. The handshake is initiated by the sender’s server sending a greeting HELO or EHLO command. An SMTP session is established only if the recipient’s server answers affirmatively.
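In practice, "answering affirmatively" means replying with a three-digit code in the 2xx range. The sketch below shows one way a client could read such a reply. The function names are hypothetical, and the sketch assumes single-line replies only (real servers may also send multi-line replies, which it does not handle).

```python
from typing import Tuple


def parse_reply(line: str) -> Tuple[int, str]:
    """Split a single-line SMTP reply into its numeric code and its text."""
    code, _, text = line.partition(" ")
    return int(code), text


def is_positive(code: int) -> bool:
    """Codes in the 2xx range mean the requested action completed."""
    return 200 <= code < 300


code, text = parse_reply("250 Nice to meet you, client")
print(code, is_positive(code))  # 250 True
```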
Mail transaction
Upon a successful handshake, the SMTP client may start sending the email. This transaction is achieved through an exchange of multiple commands such as MAIL FROM for specifying the sender, RCPT TO for specifying the recipient, and DATA for sending the actual contents of the email. Once the whole email has been sent, the SMTP client sends the QUIT command and terminates the connection. The whole exchange could look as follows:
Command or response | What happens |
---|---|
[220] SMTP service established | SMTP client attempts to establish a connection with the recipient’s SMTP server (typically via TCP at port 25) and the server answers affirmatively with code 220 |
EHLO smtp.gmail.com | The client begins an SMTP handshake by sending the EHLO command along with its hostname and domain name |
[250] 'Nice to meet you, client' | The server answers with code 250 and accepts the connection |
MAIL FROM: sender@gmail.com | The client specifies the sender of the email |
[250] 'New message started' | The SMTP server acknowledges the sender and prepares to accept the email |
RCPT TO: recipient@mailosaur.net | The client specifies the recipient of the email |
[250] 'Recipient accepted' | The server acknowledges the recipient exists and can receive emails |
DATA | The client requests to start sending the email |
[354] 'End message with period' | The server answers with code 354 which means that it’s ready to accept the email |
Date: 01.04.2021. 12:30 From: sender@gmail.com Subject: What is SMTP? To: recipient@mailosaur.net SMTP rocks! . | The email is transmitted line by line until the final line that contains a dot indicating the end of an email |
[250] 'Mail accepted' | The server confirms that it received the email |
QUIT | The client requests to terminate the connection |
[221] 'Goodbye' | The server answers 'Goodbye' and terminates the connection |
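The same sequence can be driven from code. Below is a rough sketch using Python's standard `smtplib` and `email` modules. The `build_message` and `send_step_by_step` helpers are hypothetical names, and the host and port passed to `send_step_by_step` are placeholders you would need to supply. The function is defined but not called here, since many real servers also require encryption and authentication before they accept mail.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble the headers and body that are sent after the DATA command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_step_by_step(host: str, port: int, msg: EmailMessage) -> None:
    """Issue the commands from the exchange above one at a time."""
    with smtplib.SMTP(host, port) as smtp:      # connect; server greets with 220
        print(smtp.ehlo())                      # EHLO, expect 250
        print(smtp.mail(str(msg["From"])))      # MAIL FROM, expect 250
        print(smtp.rcpt(str(msg["To"])))        # RCPT TO, expect 250
        print(smtp.data(msg.as_bytes()))        # DATA, message, final dot
    # Leaving the with-block sends QUIT and closes the connection.


message = build_message(
    "sender@gmail.com", "recipient@mailosaur.net", "What is SMTP?", "SMTP rocks!"
)
print(message["Subject"])
```

For everyday use, `smtplib` also offers `send_message`, which performs the MAIL FROM, RCPT TO, and DATA steps in a single call. The step-by-step form is used here only because it mirrors the exchange above.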
Mail delivery
Finally, the destination server receives the email in the form of data packets and reassembles them into the original message before placing it in the recipient’s mailbox.
Where is SMTP used?
Virtually all email services use SMTP under the hood. In the example above, we showed an exchange between an SMTP client and an SMTP server. However, the SMTP server that receives an email from a client is not always the email’s final destination but can sometimes act as the email’s relay. A server may accept incoming email only to pass it on to another server. In this case, the server that an email hops through assumes the role of an SMTP client as it forwards emails to other servers, in an email transmission process that looks like the one covered above.
But of course there are exceptions. For example, though Gmail accepts SMTP commands like any other mail server, when sending an internal email (e.g. between Gmail addresses) the service can place the message directly into the recipient’s inbox without going through SMTP and other standard protocols.
How do POP3 and IMAP relate to SMTP?
You might have seen SMTP mentioned alongside POP3 and IMAP protocols. All three have to do with email, but are used for different purposes. Whereas SMTP is used to connect to an outgoing email server, POP3 and IMAP are used for getting the contents of a mailbox on an email server.
The latter two also differ in their functionality:
- POP3 (Post Office Protocol) downloads emails from the server, saves them on a device, and then removes them from the server. This means that if you connect to the same email account on multiple devices, only the device that downloaded an email will have access to it.
- IMAP (Internet Message Access Protocol) accesses emails on the email server where they’re kept indefinitely. This is preferred to POP3 for users who access the same email account from multiple devices. By keeping emails on the server rather than saving them locally, IMAP makes sure that the emails are accessible from any device. Additionally, IMAP keeps your devices in sync, so once you read an email on one device, it’s shown as ‘read’ across all other devices.
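The difference between the two can be illustrated with a toy model. The sketch below is not an implementation of either protocol: `ToyMailServer` and its methods are hypothetical names, and it only mimics the download-and-delete and keep-on-server behaviours described above.

```python
class ToyMailServer:
    """In-memory stand-in for a mailbox on a mail server (not a real protocol)."""

    def __init__(self, bodies):
        self.messages = {
            number: {"body": body, "read": False}
            for number, body in enumerate(bodies, start=1)
        }

    def pop3_style_download(self):
        """Hand over every message, then delete it from the server."""
        downloaded = [m["body"] for m in self.messages.values()]
        self.messages.clear()
        return downloaded

    def imap_style_read(self, number):
        """Return one message and record the 'read' flag on the server."""
        message = self.messages[number]
        message["read"] = True
        return message["body"]


# POP3 style: the first device empties the mailbox, the second finds nothing.
pop_server = ToyMailServer(["Hello", "Invoice"])
laptop = pop_server.pop3_style_download()
phone = pop_server.pop3_style_download()
print(laptop, phone)  # ['Hello', 'Invoice'] []

# IMAP style: the message stays on the server and its 'read' flag is shared.
imap_server = ToyMailServer(["Hello", "Invoice"])
imap_server.imap_style_read(1)
print(imap_server.messages[1]["read"])  # True
```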
Use Mailosaur for SMTP diagnostics
Setting up a server to conduct email tests does not have to be complicated. With Mailosaur, you can quickly automate email testing on a virtual SMTP server, which lets you avoid managing a real server, or worse, sending untested emails to your customers. Check out our documentation for more information on how to get started.