Lynne Teaches Tech: How does HTTPS keep you safe online?

This question was originally submitted via the survey.

HTTPS is an extension to HTTP. It encrypts your traffic to and from a given website, ensuring that nobody can read your activity in transit. It’s used almost everywhere, from banking to search engines to this very blog.

This was the question submitted via the survey:

What does TLS protect me from, and what does TLS not protect me from?
Anonymous survey respondent

TLS, or Transport Layer Security, is a cryptographic protocol that replaces its predecessor SSL (Secure Sockets Layer). It is used not only for HTTPS, but also for email, VoIP calls, instant messaging, and more. I’ll talk specifically about HTTPS in this article, but most of what I’ll be talking about applies to these other applications too.

How does HTTPS work?

Without getting into the specifics, HTTP data is sent over TCP/IP connections through the internet. Each HTTP request and response is split into small TCP packets, and sent to an address specified by IP. The receiving machine reads the HTTP data inside the packets.

TLS adds another layer. The HTTP data is encrypted with TLS, and then sent using TCP/IP. The receiving machine reads and decrypts the data inside the packets. The data inside the packets is encrypted using a method only the server and client can understand, so nobody else can read the data while it’s being sent.
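To make the layering concrete, here is a minimal sketch using Python's standard `socket` and `ssl` modules. The function names `build_tls_context` and `fetch_over_https` are made up for this illustration; the sketch only assumes that the server listens on the usual HTTPS port, 443.

```python
import socket
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Return a client-side TLS context with certificate checks enabled."""
    # create_default_context() turns on certificate and hostname verification.
    return ssl.create_default_context()


def fetch_over_https(host: str, path: str = "/") -> bytes:
    """Send one HTTP request through a TLS-wrapped TCP socket (hypothetical helper)."""
    context = build_tls_context()
    # Layer 1: a plain TCP connection to port 443.
    with socket.create_connection((host, 443), timeout=10) as tcp_sock:
        # Layer 2: TLS on top of TCP. The handshake happens inside wrap_socket().
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Layer 3: ordinary HTTP text, which the TLS layer encrypts for us.
            request = (
                f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                "Connection: close\r\n\r\n"
            )
            tls_sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_over_https("example.com")` needs a working internet connection. The HTTP request itself is written exactly as it would be without TLS; only the socket it travels through has changed.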

The specifics

When you connect to a TLS secured resource, a TLS handshake must first be performed. It goes like this¹:

The client (your web browser) sends a message with a random number (we’ll call it Rc, the random client number), a list of supported encryption types, and the highest TLS version supported.
The server (the website you’re connecting to) sends a reply with another random number (Rs), the encryption method the server wishes to use, and the certificate file that verifies that the server is who it says it is.
After the client verifies the certificate, it extracts the server’s public key from the certificate. The client then uses the public key to encrypt a third random number (Rp, also known as the premaster secret), and sends it to the server. A public key is a piece of data you can use to encrypt a message or file so that only the holder of the matching private key (here, the server) can decrypt it. It’s like the server sent you an open padlock: you use it to lock away your secret message and send it back to the server, who holds the only key.
The server uses its private key to decrypt Rp. Both computers now have Rc, Rs, and Rp. Rc and Rs were sent unencrypted, which means an attacker could have obtained them. However, since Rp was encrypted so that only the server can decrypt it, the attacker would be unable to read it. All three numbers are needed to work out the session key, so Rc and Rs alone are of no use to the attacker.
Both the client and the server use the three random numbers to generate a session key which is used to encrypt the rest of the connection.
The client and server both send a final message encrypted with the session key. If all has gone well, the client will be able to decrypt the server’s message, and vice versa.
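The way the three random numbers become one shared key can be sketched in a few lines of Python. This is a simplified illustration, loosely modelled on the HMAC-SHA256 expansion that TLS 1.2 uses, and it has not been checked against official test vectors. The function names are made up for this example, and a real handshake goes on to derive several separate keys from this value.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into `length` bytes using chained HMAC-SHA256."""
    output = b""
    a = seed  # A(0) is the seed itself
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_session_key(rc: bytes, rs: bytes, rp: bytes) -> bytes:
    """Mix Rc, Rs, and Rp into one 48-byte shared secret (illustrative only)."""
    # Rp is the secret input; Rc and Rs are public but make every session unique.
    return p_sha256(rp, b"master secret" + rc + rs, 48)


rc = secrets.token_bytes(32)  # sent in the clear by the client
rs = secrets.token_bytes(32)  # sent in the clear by the server
rp = secrets.token_bytes(48)  # premaster secret, only ever sent encrypted

client_key = derive_session_key(rc, rs, rp)
server_key = derive_session_key(rc, rs, rp)

# An eavesdropper knows Rc and Rs but has to guess Rp.
attacker_key = derive_session_key(rc, rs, secrets.token_bytes(48))

print(client_key == server_key)    # both sides agree on the key
print(attacker_key == client_key)  # the attacker's guess does not match
```

Both sides run the same calculation on the same three inputs, so they arrive at the same key without the key itself ever crossing the network.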

It took a while, but both the client and the server now have a shared key that they can use to encrypt data. Anything sent by the client can only be read by the client and the server, and the same applies to anything sent by the server.
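The padlock idea from the handshake can be shown with "textbook" RSA and deliberately tiny numbers. This is a toy sketch only: the primes 61 and 53 and the function names `lock` and `unlock` are chosen for illustration, and real TLS uses keys thousands of bits long together with padding schemes that are left out here.

```python
# Toy RSA key pair. Far too small to be secure; for illustration only.
p, q = 61, 53
n = p * q                # public modulus, part of the "padlock"
phi = (p - 1) * (q - 1)
e = 17                   # public exponent, also part of the "padlock"
d = pow(e, -1, phi)      # private exponent, the server's "key" (needs Python 3.8+)


def lock(message: int) -> int:
    """Encrypt with the public key (e, n). Anyone can do this."""
    return pow(message, e, n)


def unlock(ciphertext: int) -> int:
    """Decrypt with the private key (d, n). Only the server can do this."""
    return pow(ciphertext, d, n)


secret_number = 65
locked = lock(secret_number)
print(locked)          # 2790
print(unlock(locked))  # 65
```

Knowing `e` and `n` lets you lock a message, but unlocking it requires `d`, which never leaves the server.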

What does TLS protect you from?

TLS protects the data as it travels over the internet. Anything you look at will be encrypted using a method that only you (and the server) can decrypt. This means that someone intercepting your connection can’t read anything that you or the server is sending. For example, if an attacker were somehow able to “sit between” you and an HTTP connection to a server, they would be able to intercept the TCP packets and read the HTTP data stored within, and see exactly what you see. They would also be able to send their own responses back pretending to be you. With an HTTPS connection, however, they would still be able to intercept the packets, but they wouldn’t be able to read what’s inside them, or pretend to be you.

In network security, people often refer to the CIA² model of information security: Confidentiality, Integrity, and Availability. This is a good “rule of thumb” method for determining how safe something is to use. HTTPS communication is Confidential because only you and the server can read the data. There is data Integrity because data can’t be modified in transit without the client and server being able to tell (modified data would fail the check that is made using the session key). However, it does not provide Availability: an attacker is able to cut off communication between you and the server. Of course, no communication over the internet can guarantee availability, because nothing can be done about the connection being lost due to someone pulling out the router’s power cable.
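The integrity part can be illustrated with a keyed checksum (an HMAC) from Python's standard library. This is a sketch of the general idea rather than TLS's exact record format, and the names `tag_message` and `is_untampered` are invented for this example.

```python
import hashlib
import hmac
import secrets

# Stand-in for the session key agreed during the handshake.
session_key = secrets.token_bytes(32)


def tag_message(key: bytes, message: bytes) -> bytes:
    """Compute a keyed checksum that only holders of the key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_untampered(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the checksum and compare it in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


original = b"Transfer $10 to Lynne"
tag = tag_message(session_key, original)

# An attacker changes the message in transit but cannot recompute the tag.
tampered = b"Transfer $9999 to Mallory"

print(is_untampered(session_key, original, tag))  # True
print(is_untampered(session_key, tampered, tag))  # False
```

Without the session key, the attacker can't produce a matching tag for the altered message, so the receiver notices the change and can drop the connection.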

What doesn’t TLS protect you from?

TLS only protects communications between two computers. If someone is able to hack into the server, they can read all the messages that are being sent to and from it. This is the same for the client: if your computer is hacked into, an attacker would be able to view your HTTPS traffic.

Let’s say you want to go to example.com, but you accidentally type example.net. Even if the TLS encryption works, you’re still on the wrong website. example.net might be a malicious website owned by an attacker trying to trick you into providing your example.com login details. Your traffic can still only be read by you and the server, but if the server is malicious, there’s nothing you can do.

There are many different encryption algorithms that a TLS connection can be encrypted with. Every now and then, one of the encryption methods will be found to be insecure. If you’re using a (very) outdated web browser, you might not support any encryption methods that haven’t been cracked, making TLS much less secure. Wikipedia has a table listing SSL and TLS encryption algorithms, along with which ones are secure and which aren’t.
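If you're curious which encryption methods your own software supports, Python's `ssl` module can list them. This is a small sketch with made-up helper names; the exact list it prints depends on the OpenSSL version your Python was built with.

```python
import ssl


def list_cipher_names(context: ssl.SSLContext) -> list:
    """Return the names of the cipher suites this context is willing to use."""
    return sorted({cipher["name"] for cipher in context.get_ciphers()})


def build_strict_context() -> ssl.SSLContext:
    """Create a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


strict_context = build_strict_context()
for name in list_cipher_names(strict_context):
    print(name)
```

Setting a minimum version is one way for a client to refuse old, weakened protocol versions instead of quietly falling back to them.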

TLS doesn’t protect you or the server from making bad security decisions. Weak passwords, password reuse, and other poor security practices on your part will not be protected by TLS encryption. Additionally, if the server does not properly hash the passwords it stores, gets hacked into, or uses bad security questions (for example, the security question “what is your birthday” is terribly insecure if you’ve ever shared your birthdate with anyone), TLS won’t help them there. All TLS does is encrypt data in transit. What happens with that data, or what that data is, is outside of TLS’ scope.

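Connecting to a website over HTTPS doesn’t hide which website you’re visiting, and it doesn’t keep the URL secret everywhere else. The hostname (such as google.com.au) is usually visible to anyone watching the connection, through the DNS lookup and the unencrypted part of the TLS handshake. The rest of the URL is encrypted in transit, but it is still saved in places like your browser history and the server’s logs. For example, if you’re on google.com.au, and you search for “bunnies”, you’ll get a URL that looks something like https://google.com.au/search?q=bunnies. If an attacker is able to view what URLs you’re opening (say, by reading your browser history), they’ll be able to know what you searched for. Some particularly bad websites send passwords this way (for example, https://example.com/login?username=lynne&password=UhOhSpaghettio), which makes retrieving your login details trivially easy for anyone who gets hold of the URL.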
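To see how little effort it takes to pull secrets out of a URL once someone has it, here is a short sketch using Python's standard `urllib.parse` module. The helper name `extract_query_fields` is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def extract_query_fields(url: str) -> dict:
    """Return the name/value pairs stored after the '?' in a URL."""
    query = urlsplit(url).query
    # parse_qs returns a list per name; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


search_url = "https://google.com.au/search?q=bunnies"
login_url = "https://example.com/login?username=lynne&password=UhOhSpaghettio"

print(extract_query_fields(search_url))  # {'q': 'bunnies'}
print(extract_query_fields(login_url))   # {'username': 'lynne', 'password': 'UhOhSpaghettio'}
```

No cracking is involved: anything placed in the URL is readable as plain text by whoever can see the URL.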

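Leaked NSA slides, published in 2013 alongside the disclosures about the PRISM surveillance program, describe a “Google Front End” (GFE) server as the point where SSL encryption is “added and removed”. Traffic behind that server, inside Google’s own network, was not encrypted at the time, so anyone able to tap those internal links could read data that had been protected by SSL on the public internet.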

A slide showing how traffic between Google and the public internet is intercepted by a "GFE" server, which "adds and removes" SSL encryption. — ˙ ͜ʟ˙

It’s entirely possible (and almost certain) that the NSA is still doing this. The PRISM system is used by multiple countries and online services.

Dates when PRISM collection began for various services. In order: Microsoft, Yahoo, Google, Facebook, PalTalk, YouTube, Skype, AOL, Apple — Services with active PRISM data collection as of 2013.

Summary

TLS protects data in transit. It doesn’t protect you or the server from making bad decisions with the data, but it means that nobody can read the data while it’s in transit, unless the algorithm you’re using is compromised.

¹ This is a description of the RSA method. There is also the Diffie-Hellman method, which works a little differently. It has since become the more common of the two, and TLS 1.3 no longer supports the RSA method described here. If you’d like to read about it, you can do so here.
² This has nothing to do with the USA’s Central Intelligence Agency.