What Every Developer Should Know When Working with Email (Part 2)
This is the second post in a series for developers working on email-enabled applications. View Part 1.
In the first part of this series, we covered the components and protocol that comprise email. This post will explore the anatomy of an email message, the SMTP lifecycle, encodings, and example code to send a message.
3. Encodings
Before jumping into the formatting of email messages and SMTP lifecycle, it’s important to understand encodings. Individual email headers and entire bodies can be encoded according to the following schemes:
- 7bit is the default encoding when none is specified. It means that all characters can be represented with 7 bits (ASCII characters 0-127) and thus don’t need to be encoded. This is fine for the simplest of emails, but will cause errors when trying to transmit Unicode characters.
- 8bit is similar to 7bit, in that the contents are not actually encoded. Instead, it indicates that raw eight-bit bytes (values 0-255, which extends past the 0-127 ASCII range) will be transmitted on lines of at most 998 characters.
- Binary is nearly identical to 8bit, but may be useful to indicate that the contents are not human-readable. The main difference is that binary encoding does not require specific line lengths.
- Base64 is a safe encoding that takes an entire string and transforms it from its eight-bit binary representation into a printable 64-character ASCII alphabet (uppercase letters, lowercase letters, numerals, and the “+” and “/” characters, with “=” used as padding), where each character carries six bits. Because every three bytes become four characters, Base64-encoded content is roughly 33% larger than the same content sent as 7bit or 8bit (leading to larger messages), but it allows all contents to be safely sent and consumed by modern email software.
- Quoted-Printable is an alternative encoding to Base64 which only encodes bytes outside the printable ASCII range (plus the equals sign itself). Each encoded byte appears as an equals sign followed by the hexadecimal representation of the byte (e.g., “=D0”). This allows mostly-ASCII text to remain human-readable, with clear exceptions wherever equals signs are encountered.
Many email libraries will analyze what you’re sending and handle encoding for you (picking the most space-efficient encoding). When you need to manually choose an encoding, we recommend sticking with Base64 or Quoted-Printable.
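To make the trade-off concrete, here is a minimal Python sketch (not tied to any particular email library) that runs the same text through both encodings using only the standard library. The helper names `encode_base64` and `encode_quoted_printable` are made up for this illustration, and the sketch skips the line wrapping that real email software applies to encoded bodies.

```python
import base64
import quopri


def encode_base64(text: str, charset: str = "utf-8") -> str:
    """Return the Base64 form of text: every 3 bytes become 4 characters."""
    return base64.b64encode(text.encode(charset)).decode("ascii")


def encode_quoted_printable(text: str, charset: str = "utf-8") -> str:
    """Return the Quoted-Printable form: non-ASCII bytes and '=' become =XX."""
    return quopri.encodestring(text.encode(charset)).decode("ascii")


if __name__ == "__main__":
    sample = "Café au lait"
    # Base64 obscures the whole string; Quoted-Printable only touches the "é".
    print(encode_base64(sample))
    print(encode_quoted_printable(sample))
```

For mostly-ASCII text, Quoted-Printable tends to be both smaller and easier to read in raw form. For binary attachments or text in non-Latin scripts, Base64 is usually the more compact choice.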
4. Anatomy of an Email Message
An email message is transmitted as plain text (generally lower ASCII). It is made up of key/value headers of metadata, followed by a blank line and then a body.
Let’s take a look at an example:
Delivered-To: [email protected]
Received: by 192.168.1.89 with SMTP id n86csz1253946lfi;
Mon, 29 March 2016 08:36:49 -0700 (PDT)
X-Received: by 192.168.1.73 with SMTP id f19mq15281755oig.95.1432413009714;
Mon, 29 March 2016 08:36:49 -0700 (PDT)
Return-Path: <[email protected]>
Received: from mail-oi1-x123.google.com (mail-oi1-x123.google.com.)
by mx.google.com with ESMTPS id w9ti21871367ptd.48.2016.03.29.08.36.49
for <[email protected]>
(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
Mon, 29 March 2016 08:36:49 -0700 (PDT)
Received-SPF: pass (google.com: domain of [email protected] designates 1234:5678:90ab:cdef::243 as permitted sender) client-ip=1234:5678:90ab:cdef::243;
Authentication-Results: mx.google.com;
dkim=pass [email protected];
spf=pass (google.com: domain of [email protected] designates 1234:5678:90ab:cdef::243 as permitted sender) [email protected];
dmarc=pass (p=NONE dis=NONE) header.from=gmail.com
Received: by mail-oi0-x123.google.com with SMTP id f19mq15281755oig.1
for <[email protected]>; Mon, 29 March 2016 08:36:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=20120113;
h=mime-version:date:message-id:subject:from:to;
bh=o/az3roGW/PdfekTxkP0M/7SvrdUEGLMTb1VdZGQ5QQ=;
b=FxHc9LkCmj+xowyGHadCnyceJ+dYVehldv00GWS0KO5IG488bPB7va8EHpNAlyFcEW
tU9mW5cLHQpY+aponSeFyEj26mzx4vPzVHBS01GDjgukqkzVbyjFiC2WCqb/D967/8wX
MwcjXl15HwpTvXWIMoJoSlDvO/S3j9+BfodrG0hWLId5NDATnsNX1u8sCr/0e8udZb/r
bDcFGlZlcck31ZOhb8FA6QdmqGJIPQOdNn26qpi9S2FEC+oRYkrwW9SVZ5FduHSKj4ui
z9K5WNj8clZc03O4aAGSYGl2O1o+FZO1t6vyQWnIrJPBUye0A6NW6xE/PO+MubwjANY7
EioQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20130820;
h=x-gm-message-state:mime-version:date:message-id:subject:from:to;
bh=o/az3roGW/PdfekTxkP0M/7SvrdUEGLMTb1VdZGQ5QQ=;
b=DhRy652w0Q41i2E04s1DUgB9NloPzOogOuOEcc87+pV83DxSmTssTuollnDuWpxojm
IePz/A146pKanLcB8Wh2Pp+sNxH0lMI01IxUMT0Jo7rEzwzpd/Os04McGOV51DHSgAGk
GJMk/BDkifBwghI3dtnlChs8ke+hdUSYDfMUzADzRpWpReFeWmt94+AhiGN23PovgNSX
8m393g+aJD1/3pfbw4MY3VZt5It5ELPTID8Q48aEYiiVjra8tp7gxwijr+mJLKnDPnVT
h6Cr1nMCoCkTz2R6ksODzzxvb5a2kbOd48UXFqN45RFp57UX2HuW9BlG/FfwbcRXw1wZ
ttoQ==
X-Gm-Message-State: AOPr4FVbBIxKMzExHXFicQRL6XH4kFGJvdbPJzHctLmmAwioLN7a+GqRLNAvM3rR9VziQCNKmq3M3cRfHUpCbw==
MIME-Version: 1.0
X-Received: by 192.168.1.87 with SMTP id n84mr18214195oih.175.1463413009027;
Tue, 29 March 2016 08:36:49 -0700 (PDT)
Received: by 192.168.1.77 with HTTP; Tue, 29 March 2016 08:36:48 -0700 (PDT)
Date: Tue, 16 May 2016 10:36:48 -0500
Message-ID: <CACzH8h3wVA=_3XwXuMPOD-p0DLWSiPQ_9ZRiXQkDkU41eWbDCw@mail.gmail.com>
Subject: Sample message to show headers
From: Sample Sender <[email protected]>
To: Sample Recipient <[email protected]>
Cc: Sample Cc <[email protected]>
Importance: High
Content-Type: multipart/alternative; boundary=14eb2c0941189c23080532f76476

--14eb2c0941189c23080532f76476
Content-Type: text/plain; charset=UTF-8

Message contents follow...

--14eb2c0941189c23080532f76476
Content-Type: text/html; charset=UTF-8

<b>HTML message contents follow...</b>

--14eb2c0941189c23080532f76476--
A. Headers
We can break down the headers above into a few categories:
- Mandatory user-specified headers (like From, To, and Subject).
- Optional user-specified headers (like Cc and Importance).
- Mandatory headers generated by the email client (like Content-Type and Date).
- Optional headers generated by the sender’s email server (like DKIM-Signature, plus Message-ID if the client did not already set one).
- Mandatory headers generated by the recipient’s email server (like Received and Return-Path).
- Optional headers generated by the recipient’s email server (like Received-SPF and X-Gm-Message-State).
When working with email clients, we’re responsible for making sure the first three header categories above are populated. Everything after that is set by mail transfer agents.
You’ll notice that some headers wrap around to one or more lines. RFC 5322 caps each line at 998 characters and recommends keeping lines to 78 characters or fewer, which is why content typically wraps after 76 or so characters. The continuations are called “folds” and are designated by a space or tab character at the start of the line.
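Folding is easy to see in code. The Python sketch below uses the standard library's `email` package to unfold a wrapped header while parsing, and to fold a long one while serializing. The raw message is a trimmed-down version of the sample above, and the helper names `unfolded_header` and `folded_lines` are invented for this example.

```python
from email import message_from_string
from email.message import EmailMessage
from email.policy import default

# A trimmed-down message whose Received header is folded onto a second line.
RAW_MESSAGE = (
    "Subject: Sample message to show headers\r\n"
    "Received: by 192.168.1.77 with HTTP;\r\n"
    "\tTue, 29 March 2016 08:36:48 -0700 (PDT)\r\n"
    "\r\n"
    "Message contents follow...\r\n"
)


def unfolded_header(raw_message: str, name: str) -> str:
    """Parse the message and return one header as a single logical line."""
    msg = message_from_string(raw_message, policy=default)
    # Collapse the folding whitespace so the value reads as one line.
    return " ".join(str(msg[name]).split())


def folded_lines(subject: str) -> list[str]:
    """Serialize a message with one long Subject and return its header lines."""
    msg = EmailMessage()
    msg["Subject"] = subject
    # The default policy folds headers so no line exceeds 78 characters.
    return [line for line in msg.as_string().splitlines() if line]


if __name__ == "__main__":
    print(unfolded_header(RAW_MESSAGE, "Received"))
    for line in folded_lines(" ".join(["word"] * 30)):
        print(repr(line))
```

Most libraries handle folding and unfolding for you, so you rarely need to do this by hand. It mostly matters when you are reading raw messages in logs or writing your own parser.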
The example above is in English with standard ASCII characters, but we can also encode Unicode characters in headers. To do so, we start a string with “=?UTF-8?B?” for Base64 encoding or “=?UTF-8?Q?” for Quoted-Printable encoding, and close the encoding with “?=”.
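As a rough illustration of that syntax, the Python sketch below builds a Base64 encoded word by hand and then decodes it with the standard library. `encode_word_base64` and `decode_words` are hypothetical helper names, and the hand-rolled encoder ignores the rule that each encoded word should be at most 75 characters long, so treat it as a demonstration of the format and not a production encoder.

```python
import base64
from email.header import decode_header, make_header


def encode_word_base64(text: str) -> str:
    """Wrap text as a single =?UTF-8?B?...?= encoded word (no length splitting)."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"=?UTF-8?B?{payload}?="


def decode_words(header_value: str) -> str:
    """Decode a header value that may contain B or Q encoded words."""
    return str(make_header(decode_header(header_value)))


if __name__ == "__main__":
    encoded = encode_word_base64("Привет")
    print(encoded)                                # the Base64 ("B") form
    print(decode_words(encoded))                  # back to the original text
    print(decode_words("=?UTF-8?Q?Caf=C3=A9?="))  # a Quoted-Printable ("Q") word
```

In practice you would let your email library produce these (Python's `email.header.Header` does, for instance), since it also takes care of splitting long values across several encoded words.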
B. Body
So what’s that weird stuff surrounded by “--14eb2c0941189c23080532f76476” strings? Email is often presented as a single view, but its body often contains multiple parts. We segment the different components of a message using Multipurpose Internet Mail Extensions (MIME) parts.
Because the message contains a header specifying “MIME-Version: 1.0” and because the message’s content type starts with “multipart”, followed by a boundary, we know that we’re looking for one or more MIME-encoded parts. The boundary string is generated by the sender (usually at random, so that it won’t collide with the content) and then delimits the different components: each part begins with a line of two hyphens followed by the boundary, and the closing boundary also ends with two hyphens. Each MIME part starts with a header designating that MIME part’s content type (e.g., “text/html” for an HTML-formatted message or “text/plain” for plain text). Additional headers can be specified, for example to include an attached image’s filename.
Email clients parse the MIME parts and choose which to display. Some clients may prefer to show rich “text/html” MIME parts over plaintext “text/plain” messages. That allows for rich formatting, font control, colors, and images. MIME parts can be nested multiple layers deep.
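Here is one way a client might make that choice, sketched in Python with the standard library's `email` package. The raw message reuses the body of the sample above. The helper names `list_part_types` and `preferred_body` are assumptions for this illustration, and real clients apply more elaborate rules (attachments, inline images, user preferences).

```python
from email import message_from_string
from email.policy import default

# The multipart/alternative body from the sample message, headers trimmed.
RAW_MESSAGE = "\r\n".join([
    "MIME-Version: 1.0",
    "Content-Type: multipart/alternative; boundary=14eb2c0941189c23080532f76476",
    "",
    "--14eb2c0941189c23080532f76476",
    "Content-Type: text/plain; charset=UTF-8",
    "",
    "Message contents follow...",
    "--14eb2c0941189c23080532f76476",
    "Content-Type: text/html; charset=UTF-8",
    "",
    "<b>HTML message contents follow...</b>",
    "--14eb2c0941189c23080532f76476--",
    "",
])


def list_part_types(raw_message: str) -> list[str]:
    """Walk the MIME tree (container first) and list each content type."""
    msg = message_from_string(raw_message, policy=default)
    return [part.get_content_type() for part in msg.walk()]


def preferred_body(raw_message: str, preference=("html", "plain")) -> str:
    """Return the text of the first available part in order of preference."""
    msg = message_from_string(raw_message, policy=default)
    part = msg.get_body(preferencelist=preference)
    return part.get_content().strip() if part is not None else ""


if __name__ == "__main__":
    print(list_part_types(RAW_MESSAGE))
    print(preferred_body(RAW_MESSAGE))              # prefers the HTML part
    print(preferred_body(RAW_MESSAGE, ("plain",)))  # plain-text-only client
```

Because `walk()` visits nested containers too, the same approach works when parts are nested several layers deep.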
The example above contains safe ASCII characters only. If we wanted to include Unicode characters, we would add a “Content-Transfer-Encoding” header to the MIME part specifying “base64” or “quoted-printable”.
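The Python sketch below shows what that looks like when a library does the work: it builds a multipart/alternative message whose plain-text part is Quoted-Printable and whose HTML part is Base64. It relies on the standard library's `email.message.EmailMessage`. The function name `build_alternative` and the sample text are made up for illustration.

```python
from email.message import EmailMessage


def build_alternative(plain_text: str, html_text: str) -> EmailMessage:
    """Build a multipart/alternative message with an explicit encoding per part."""
    msg = EmailMessage()
    msg["Subject"] = "Sample message to show headers"
    # Plain-text part: Quoted-Printable keeps the ASCII portions readable.
    msg.set_content(plain_text, charset="utf-8", cte="quoted-printable")
    # HTML part: Base64, chosen here only to show the second option.
    msg.add_alternative(html_text, subtype="html", charset="utf-8", cte="base64")
    return msg


if __name__ == "__main__":
    message = build_alternative("Grüße aus Köln", "<b>Grüße aus Köln</b>")
    # Each part carries its own Content-Type and Content-Transfer-Encoding.
    print(message.as_string())
```

If you leave out the `cte` argument, the library picks an encoding for you, which is usually what you want outside of a demonstration.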
5. Sending Messages Using SMTP
Now that we understand an email message’s headers and body, let’s look at the process to transmit the message:
- Prepare the message, including headers and an encoded body.
- Open a connection to the remote SMTP server (usually on port 25, 465, or 587) and identify yourself by sending the “EHLO” command.
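- Secure the connection using TLS. On port 465, the TLS handshake happens immediately on connect (sometimes referred to as SMTPS). On ports 25 and 587, the connection starts unencrypted and is upgraded by issuing the “STARTTLS” command.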
- Authenticate to the server using one of the Simple Authentication and Security Layer (SASL) mechanisms. The simplest approach is using “PLAIN”, which sends the username and password Base64-encoded but not encrypted, so it should only be used over a TLS-secured connection.
- Tell the server who the mail is from by sending a “MAIL FROM:<[email protected]>” command.
- Tell the server who the direct recipients are by sending one or more “RCPT TO:<[email protected]>” commands. Note that each recipient should be individually addressed and that “To”, “Cc”, and “Bcc” recipients all use the “RCPT TO” command.
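- Send a line consisting of the word “DATA” to signal that the message itself follows. The server should reply with a 354 code inviting you to continue.
- Send the message’s headers, one per line (with long headers folded), including the “Content-Type” and “Content-Transfer-Encoding” headers if specified.
- Send a blank line to mark the end of the headers, followed by the message body. Any body line that begins with a period needs an extra period prepended so it isn’t mistaken for the end of the message.
- Send a standalone line consisting of a period to mark the end of the message.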
- A response code will be returned, starting with 2** for success, 4** for transient errors, or 5** for permanent errors.
- Optionally, restart at step 5 to send additional messages on this connection.
- Gracefully disconnect from the server by sending “QUIT”.
Be sure to check for and respond to errors at each step, as there could be ephemeral issues (e.g., server overload) or configuration issues (e.g., wrong password).
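For comparison with the steps above, here is a hedged Python sketch of the same conversation using the standard library's `smtplib`, which issues the MAIL FROM, RCPT TO, and DATA commands for you. The helper names `classify_reply` and `send_message`, the `smtp_factory` parameter (there only so a fake client can be swapped in for testing), and the host `smtp.example.com` are all assumptions for this illustration. It also assumes a server that offers STARTTLS on port 587.

```python
import smtplib
import ssl
from email.message import EmailMessage


def classify_reply(code: int) -> str:
    """Map an SMTP reply code to the categories described above."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "intermediate"  # e.g. 354 after DATA
    if 400 <= code < 500:
        return "transient error"
    if 500 <= code < 600:
        return "permanent error"
    return "unknown"


def send_message(msg, host, port, username, password, smtp_factory=smtplib.SMTP):
    """Send msg over a STARTTLS-secured connection; return refused recipients."""
    try:
        # Leaving the "with" block sends QUIT and closes the connection.
        with smtp_factory(host, port, timeout=30) as smtp:
            smtp.ehlo()
            smtp.starttls(context=ssl.create_default_context())
            smtp.ehlo()  # re-identify over the now-encrypted channel
            smtp.login(username, password)
            # Issues MAIL FROM, one RCPT TO per recipient, then DATA.
            return smtp.send_message(msg)
    except smtplib.SMTPResponseException as exc:
        kind = classify_reply(exc.smtp_code)
        raise RuntimeError(f"SMTP {kind}: {exc.smtp_code} {exc.smtp_error!r}") from exc


if __name__ == "__main__":
    message = EmailMessage()
    message["From"] = "sender@example.com"      # placeholder address
    message["To"] = "recipient@example.com"     # placeholder address
    message["Subject"] = "Example Subject"
    message.set_content("This is my test message.")
    # Hypothetical server and credentials; replace with real values to run.
    # send_message(message, "smtp.example.com", 587, "user", "password")
```

A transient error (4**) is usually worth retrying after a delay, while a permanent error (5**) generally means the message or configuration needs to change before trying again.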
6. Sample Code: Sending Email
Let’s apply the steps above to send a message using C#.
The example below uses the OpaqueMail .NET Email Library that I maintain. Examples will be similar for other libraries. You can download OpaqueMail for free using the NuGet Package Manager.
// Prepare core headers for the message.
MailMessage message = new MailMessage();
message.From = new MailAddress("[email protected]", "Example Sender");
message.To.Add(new MailAddress("[email protected]", "Example Recipient"));
message.Subject = "Example Subject";
// Set the message body and include HTML.
message.ContentType = "text/html";
message.Body = "<b>Hello world!</b> This is my test message.";
// Send the message.
SmtpClient smtpClient = new SmtpClient("smtp.office365.com", 587);
smtpClient.EnableSsl = true;
smtpClient.Credentials = new NetworkCredential("[email protected]", "Pass@word123");
smtpClient.Send(message);
That’s it! Pretty easy, huh?
Check out Part 3, where we delve into advanced SMTP topics and start exploring POP3 and IMAP.