What I Learned at Work this Week: Accept-Language HTTP Header
A big part of my company’s software involves collecting data submitted by users who sign up for our service on a client site. For example, a user can submit their email, then our client can use our service to…send them an email. But sometimes they might also want that email saved by another piece of software they’re using, like Salesforce. And so my company provides integrations with other popular email service providers where, among other things, we’ll send them a network request when we save an email.
A few weeks ago, I was asked to make a small change to one of these integration payloads for a client who was seeing 401 errors coming back from their requests. I made the one-line change, deployed my code, and watched the logs to see…the same number of errors as before. Not my problem, I thought. These errors have been consistent since before my change, so my update hasn’t had any adverse effect. Good luck figuring this out!
And so, as I probably deserved, last week I got another ticket for the same client asking me to make another change to their payload. The reporter provided a curl command that he had successfully tested with Postman, so it definitely worked. This was no longer a one-line change, but a total rewrite of the module that creates the body and headers of the request. I worked off-and-on for days tweaking and testing my code, trying to figure out why my payload wasn’t generating the same results as the sample. Finally, after I asked for help, a teammate pointed out that I might have an incorrect value for the Accept-Language header.
This could have been a very short blog post. MDN has a very helpful page for this header that only takes a few minutes to read. But after going through it, I still find myself confused about a few points, so let’s try to dive deeper. But first: what is Accept-Language?
When we’re sending an HTTP request, it is expected that we will add headers to denote certain qualities about the message or modify the request. Common examples include Content-Type (which describes the format of the request body), User-Agent (the software making the request), and of course Accept-Language. Per MDN, the Accept-Language request HTTP header indicates the natural language and locale that the client prefers. Think of a client as our browser. If we are in the US, a Google search will send a request to Google servers and return results in English. If we travel to Mexico and run the same search on a public computer (our own browser would keep its configured language despite the change in location), the servers should return Spanish results. This works because that computer’s browser is likely set up for Spanish, so the requests it makes carry a different Accept-Language header. The value comes from the browser’s language settings rather than from our physical location.
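To make this concrete, here is a minimal Python sketch of a client attaching the header to a request. The URL, the User-Agent string, and the build_request helper are all made up for illustration, and the request is only constructed, never sent.

```python
from urllib.request import Request


def build_request(url: str, language: str) -> Request:
    """Create (but do not send) a GET request that states a language preference."""
    request = Request(url, method="GET")
    request.add_header("Accept-Language", language)
    # Hypothetical client name, used only to show a second common header.
    request.add_header("User-Agent", "example-client/1.0")
    return request


request = build_request("https://example.com/search?q=weather", "es-MX")
# urllib stores header names capitalized as "Accept-language";
# HTTP header names are case-insensitive, so this is still the same header.
print(request.get_header("Accept-language"))  # es-MX
```

A browser does the same thing for us automatically, filling in the value from its language settings.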
That single sentence was probably enough to teach me what I needed to know about Accept-Language, but of course I had to keep reading. The rest of the paragraph isn’t as easy for me to understand.
The explanation continues:
The server uses content negotiation to select one of the proposals and informs the client of the choice with the Content-Language response header.
Awesome. So what exactly is content negotiation? Here’s a diagram from the MDN documentation:
Step 1 is our request for a URL, which carries headers such as Accept-Language. When our request reaches the server, those headers will determine which representation of the requested resource we are returned. In the case of Accept-Language, we can assume there are different versions for different languages. If we do request a language that is not available, we may receive a 406 error (Not Acceptable), though in practice many servers ignore the header and return a default language instead. Nothing too surprising here.
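The server's side of this exchange might look roughly like the sketch below. It assumes the preferences have already been parsed into an ordered list and that tags are compared exactly as written; the negotiate function and the pages dictionary are hypothetical, not taken from MDN or any framework.

```python
def negotiate(preferred: list[str], available: dict[str, str]) -> tuple[int, dict[str, str], str]:
    """Return (status, response headers, body) for the first preferred language we can serve."""
    for tag in preferred:
        # Try the exact tag first ("de-CH"), then the bare language ("de").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return 200, {"Content-Language": candidate}, available[candidate]
    # Nothing matched: a strict server answers 406 Not Acceptable.
    return 406, {}, "Not Acceptable"


pages = {"en": "Hello", "de": "Hallo"}  # hypothetical translations of one resource
print(negotiate(["de-CH", "en"], pages))  # (200, {'Content-Language': 'de'}, 'Hallo')
print(negotiate(["fr"], pages))           # (406, {}, 'Not Acceptable')
```

The Content-Language entry in the response is how the server tells the client which version it picked. A more forgiving server would swap the 406 branch for a default language.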
The documentation called out some specific guidelines around the use of Accept-Language. The value can be set a few different ways and may be overridden by default values in our browser. Despite some nifty functionality for determining the most likely preference for language, web designers are encouraged to provide manual options for language on their page. Once that preference has been set, the site should stick to it. If this isn’t explicitly coded out, each page refresh could revert back to the browser’s default language.
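One way to "stick to it" is to check for a saved choice before looking at the header at all. This is only a sketch: choose_language is a hypothetical helper, the saved choice could live in a cookie or a user profile, and the header languages are assumed to be parsed and ordered already.

```python
from typing import Optional


def choose_language(saved_choice: Optional[str], header_languages: list[str],
                    supported: set[str], default: str = "en") -> str:
    """Prefer the user's saved choice, then Accept-Language, then a site default."""
    # A manual selection made on the page always wins.
    if saved_choice in supported:
        return saved_choice
    # Otherwise fall back to what the browser asked for, in order.
    for tag in header_languages:
        if tag in supported:
            return tag
    return default


supported = {"en", "de", "fr"}
print(choose_language("fr", ["de"], supported))  # fr
print(choose_language(None, ["de"], supported))  # de
print(choose_language(None, ["ja"], supported))  # en
```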
The Content Negotiation documentation warned against fingerprinting, which happened to also be included in the main page for Accept-Language:
Browsers set required values for this header according to their active user interface language. Users rarely change it, and such changes are not recommended because they may lead to fingerprinting.
We know that our browser will likely determine the value for Accept-Language while we’re surfing the web. Though it might be tempting to alter it, an unusual value can apparently contribute to a browser fingerprint, which works like a cookie in that websites can use it to identify us. Technically speaking, it’s different in that cookies are stored in our browser while we surf, but a fingerprint is assembled from details our browser already reveals, such as its headers and settings. The more unusual those details are, the easier we are to single out. Practically speaking, security-minded individuals should be cautious about both. Hence MDN’s advice to avoid changing a browser’s default language.
Now that we’re finally past the first paragraph in the documentation, we can take a look at the syntax for our header. Here are the examples from MDN:
Accept-Language: de
Accept-Language: de-CH
Accept-Language: en-US,en;q=0.5
The first example uses de to denote a request for German content (find all abbreviation options here). The second example is more specific, as it uses the CH subtag to denote that we want the Swiss regional variant of German (we can find country codes here). Finally, in the third example we’re presented with a priorities list.
If we’re concerned that our primary option might not be available, we can list multiple choices for the Accept-Language header. The first here is en-US, which means English with a regional variant from the United States. Separated from that by a comma is en;q=0.5. After the en, which still represents English (but no specific region), we get a semicolon with q=. This denotes the priority of the option using Quality Values syntax. Values range from 0 to 1, and the higher the value, the higher the option’s priority.
But how can we determine priority if only one of our two options is given a quality value? Any option without an explicit quality value defaults to 1, so en-US will be preferred, with plain en as the fallback if en-US isn’t available.
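To see how a server might read these priorities, here is a small parser sketch. parse_accept_language is a hypothetical helper rather than a standard library function, and treating a malformed weight as 0 is my own assumption.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (language tag, quality) pairs, highest quality first."""
    choices = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # an entry without q= defaults to 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # assumption: a malformed weight ranks last
        choices.append((tag.strip(), quality))
    # sorted() is stable, so entries with equal quality keep their header order.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)


print(parse_accept_language("en-US,en;q=0.5"))
# [('en-US', 1.0), ('en', 0.5)]
print(parse_accept_language("fr;q=0.8, de, *;q=0.1"))
# [('de', 1.0), ('fr', 0.8), ('*', 0.1)]
```

In the second call, de jumps ahead of fr even though it comes later in the header, because its implicit quality of 1 outranks 0.8.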
Finding my 200
Once I had a better understanding of Accept-Language, I continued to tweak my payload to try and find a combination that would return a success message. It turns out that the server API didn’t like my use of a wildcard for a language value:
Accept-Language: *
This syntax is technically correct to indicate that any language is fine. But the API was written for a header that used en, which I learned after one final update. It’s always a special feeling when you check your logs and stop seeing red bars populate. We made it!