What Every Developer Should Know When Working with Email (Part 1)
This is the first post in a series for developers working on email-enabled applications.
Businesses run on email. More than 100 billion emails are sent each day. Nearly every application interfaces with email for messaging, authentication, and managing workflows.
Although email is a mature technology, programming for it is surprisingly complex. Whether you’re creating a website, mobile app, or cloud solution, here’s what every developer should know when working with email.
0. Don’t Reinvent the Wheel
Programming email is deceptively complex. It may be tempting to “roll your own” email library with just the features you need. After all, it looks like some simple text directives to send and receive messages, right?
While it’s possible to hack together small scripts to transmit and check email, they tend to break quickly and spectacularly as soon as you deal with character encoding, mobile views, attachments, spam filtering, advanced headers, and asynchronous connections.
Instead, I recommend finding an email library that meets your needs. There are tons of great open source options for any language. For .NET, I created a project called OpaqueMail back in 2013. It was originally focused on email encryption, but it quickly grew to support the majority of email use cases and is used in dozens of projects now.
For 99% of scenarios, it will be faster, cheaper, more reliable, and less frustrating to adopt an existing library than try to invent your own.
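As a rough illustration of how much an existing library absorbs, here is a minimal sketch using the `email` package from Python's standard library (Python is chosen only for illustration). The addresses, subject, body, and file name are made-up placeholders, and `build_message` is a hypothetical helper rather than part of any library. The subject header and the body both contain non-ASCII characters, and there is a binary attachment; the library picks the encodings and builds the multipart structure.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str,
                  attachment: bytes, filename: str) -> EmailMessage:
    """Assemble a message with a text body and one binary attachment.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject      # non-ASCII headers are encoded when serialized
    msg.set_content(body)         # text/plain part; charset handled by the library
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


# Placeholder values, not real addresses or data.
msg = build_message("alice@example.com", "bob@example.com",
                    "Résumé für Bob", "Voilà, the file you asked for.",
                    b"\x00\x01\x02", "data.bin")

wire = msg.as_bytes()             # serialized form, as it would be transmitted
parsed = message_from_bytes(wire, policy=policy.default)

# Round trip: the library decodes what it encoded.
print(parsed["Subject"])
for part in parsed.iter_attachments():
    print(part.get_filename(), len(part.get_content()), "bytes")
```

Doing the same by hand would mean implementing header encoding, content transfer encodings, and multipart boundaries yourself, which is exactly the kind of work that tends to break in the ways described above.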
1. Email Architecture and History
Any computer or device can send or receive email. To do almost anything interesting, you will need to connect to a Mail Transfer Agent (MTA). This is a server that either listens for messages to send to others or allows you to receive messages that are waiting for you. Any software that interacts with an MTA / server is considered a client.
While email has been around for more than 40 years, it was like the wild west for most of its history. Until the mid-2000s, there were many open relay servers on the internet which accepted anonymous emails from anyone. This led to a deluge of unwanted email called spam. At certain points, it was estimated that over 95% of email sent was spam.
Early email clients weren’t sophisticated, leaving many vulnerable to viruses, malware, and mass phishing messages. There were hundreds of independent client and server implementations, resulting in lax security, poor interoperability, and bad user experience.
Thankfully, multiple initiatives have been introduced to authenticate senders, detect unwanted messages, blacklist known spammers, and create a web of trust. Email is relatively mature these days. Clients are stable and most organizations have standardized on a handful of reliable servers (the 800-pound gorillas being Microsoft’s Office 365 / Exchange and Google’s Gmail).
Most people organize their email into one or more mailboxes. Intuitively, you may think of these like folders, but mailbox is the technical term. There are also provider-specific extensions like Gmail’s “labels”, but email clients always query and interact with organizational constructs as mailboxes.
2. Email Protocols and Standards
The standards for email are defined via dozens of Request for Comments (RFC) publications, the earliest dating back to 1973.
- Simple Mail Transfer Protocol (SMTP) defines how emails are transmitted between servers. When your email client sends a message, it connects to an SMTP server, usually authenticates, sends metadata about the message (the sender and recipients), then sends the message itself. The SMTP server you connected to likely isn’t the final destination for your message. It looks up each recipient domain’s mail exchanger (MX) records in DNS to determine where that recipient’s copy should be sent, then forwards the message through one or more intermediary mail transfer agents until it reaches the destination SMTP server. Ultimately, the destination determines what to do with the message, which in most scenarios means putting it in a mailbox called your Inbox. SMTP lets us send the message, but we need another protocol to receive and interact with it.
- Post Office Protocol (POP) was formerly the most popular mechanism for interacting with mailboxes. Its third version was the high-water mark, leading many to refer to the protocol as POP3. When receiving email, your client connects to the POP server, authenticates similarly to SMTP, requests and receives messages, then disconnects. POP is a fairly straightforward protocol without many bells or whistles. It allows your email client to download full or partial messages by message number, but it doesn’t allow for structured organization of your messages. Instead, POP either expects you to leave all of your messages clumped together on the server as your “INBOX”, or it expects you to download everything and clear it each time you check for messages. The latter approach is impractical when the average user has multiple email clients across their desktop, mobile devices, and webmail. POP is fine for writing simple email-checking programs, but IMAP is universally better and more capable.
- Internet Message Access Protocol (IMAP) is how most email clients interact with servers. IMAP does everything that POP does, while allowing for hierarchical mailboxes, message persistence, server-side searching, and real-time notifications of new messages. It follows the same authentication conventions as SMTP and POP. IMAP supports long-lived connections that stay open in an “IDLE” state, which is how the server delivers those notifications. IMAP is the backbone of any email software. You’ll use it to download full messages or individual components such as attachments or message headers. It will allow you to organize and search countless messages at once.
- Exchange ActiveSync is a proprietary protocol used by Microsoft’s enterprise email and device management platforms. While there are documented APIs, working directly with ActiveSync is cumbersome and not recommended for most situations.
- Proprietary APIs exist and are becoming more popular. Google, for example, provides the Gmail API, a web service secured with OAuth that wraps much of the complexity of email. It’s a good option for many web developers, but it isn’t portable across email providers.
- Multipurpose Internet Mail Extensions (MIME) are standards that define how extended characters are transmitted, how attachments are embedded, and how complex messages such as calendar invites are formed. A MIME message is a nested envelope system: a container part can hold multiple child parts, each with its own headers and encoding, and any child can itself be a container. The majority of messages sent today use MIME to carry attachments, embedded images, or alternative views for HTML and text-only mail readers. MIME can be incredibly complex to work with, and is best left to mature libraries.
- Simple Authentication and Security Layer (SASL) is a framework for authentication via challenges and responses. Mechanisms range from “PLAIN” for simple cleartext password exchange, through salted challenge-response mechanisms like “SCRAM”, to “OAUTHBEARER” for federated authentication.
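Two of the items above, MIME and SASL, can be illustrated without connecting to a server. The following is a minimal sketch in Python using only the standard library. `build_alternative` and `sasl_plain_response` are hypothetical helper names, and the user name and password are placeholders. The first helper builds a message with both a text-only and an HTML view; the second builds the client response for the SASL PLAIN mechanism as laid out in RFC 4616 (authorization identity, user name, and password separated by NUL bytes, then base64-encoded for transport).

```python
import base64
from email.message import EmailMessage


def build_alternative(subject: str, text: str, html: str) -> EmailMessage:
    """Build a message offering a plain-text view and an HTML view.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(text)                      # becomes the text/plain part
    msg.add_alternative(html, subtype="html")  # converts msg to multipart/alternative
    return msg


def sasl_plain_response(username: str, password: str, authzid: str = "") -> str:
    """Return the base64 client response for SASL PLAIN.

    Unencoded layout: authzid NUL username NUL password.
    Base64 is a transport encoding, not encryption.
    """
    raw = "\0".join((authzid, username, password)).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


msg = build_alternative("Hello", "Plain view", "<p><b>HTML</b> view</p>")

# Walk the MIME tree: the container first, then each child part.
for part in msg.walk():
    print(part.get_content_type())

# Placeholder credentials.
print(sasl_plain_response("alice", "s3cret"))
```

In practice you would rarely build the SASL response yourself: Python's `smtplib.SMTP.login` and `imaplib.IMAP4.login` handle the handshake for you. Because PLAIN only base64-encodes the password, it should be used only over a TLS-protected connection.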