{"id":12086,"date":"2019-03-12T22:49:36","date_gmt":"2019-03-12T21:49:36","guid":{"rendered":"https:\/\/daniel.haxx.se\/blog\/?p=12086"},"modified":"2019-03-13T09:54:35","modified_gmt":"2019-03-13T08:54:35","slug":"looking-for-the-refresh-header","status":"publish","type":"post","link":"https:\/\/daniel.haxx.se\/blog\/2019\/03\/12\/looking-for-the-refresh-header\/","title":{"rendered":"Looking for the Refresh header"},"content":{"rendered":"\n<p>The other day someone filed a bug on curl that <a href=\"https:\/\/github.com\/curl\/curl\/issues\/3657\">we don&#8217;t support redirects with the Refresh header<\/a>. This took me down a rabbit hole of Refresh header research and I&#8217;ve returned to share with you what I learned down there.<\/p>\n\n\n\n<p>tl;dr Refresh is not a standard HTTP header.<\/p>\n\n\n\n<p>As you know, an HTTP redirect is specified to <a href=\"https:\/\/tools.ietf.org\/html\/rfc7231#section-7.1.2\">use a 3xx response code and a Location: header<\/a> to point out the new URL (I use the term URL here but you know what I mean). This has been the case since RFC 1945 (HTTP\/1.0). According to <a href=\"https:\/\/lists.w3.org\/Archives\/Public\/ietf-http-wg-old\/1996MayAug\/0594.html\">an old mail from Roy T Fielding<\/a> (dated June 1996), Refresh &#8220;didn&#8217;t make it&#8221; into that spec. That was the first &#8220;real&#8221; HTTP specification. (And <a href=\"https:\/\/www.w3.org\/Protocols\/HTTP\/AsImplemented.html\">the HTTP we used before 1.0<\/a> didn&#8217;t even have headers!)<\/p>\n\n\n\n<p>The little detail that it never made it into the 1.0 spec or any later one, doesn&#8217;t  seem to have affected the browsers. 
Still today, <a href=\"http:\/\/www.otsukare.info\/2015\/03\/26\/refresh-http-header\">browsers keep supporting the Refresh header<\/a> as a sort of Location: replacement even though it seems to never have been present in an HTTP spec.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">In good company<\/h2>\n\n\n\n<p>curl is not the only HTTP library that doesn&#8217;t support this non-standard header. The popular Python library requests apparently doesn&#8217;t either, according to <a href=\"https:\/\/github.com\/request\/request\/issues\/2720\">this bug from 2017<\/a>, and <a href=\"https:\/\/github.com\/request\/request\/issues\/92\">another bug was filed<\/a> about it already back in 2011 but it was just closed as &#8220;old&#8221; in 2014.<\/p>\n\n\n\n<p>I&#8217;ve found no support for this header in wget or wget2 either.<\/p>\n\n\n\n<p>I didn&#8217;t do any further extensive search for other toolkits&#8217; support, but it seems that the browsers are fairly alone in supporting this header.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How common is the Refresh header?<\/h2>\n\n\n\n<p>I decided to make an attempt to figure that out, and for this venture I used <a href=\"https:\/\/opendata.rapid7.com\/sonar.http\/\">the Rapid7 data trove<\/a>. The method used to collect that data may not be the best &#8211; it scans the IPv4 address range and sends an HTTP request to TCP port 80 on each address, setting the IP address in the Host: header. The result of that scan is 52+ million HTTP responses from different and current HTTP origins. 
(Exactly 52254873 responses in my 59GB data dump, dated end of February 2019).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Results from my scans<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Location is used in 18.49% of the responses<\/li><li>Refresh is used in 0.01738% of the responses (exactly 9080 responses featured it)<\/li><li>Location is thus used 1064 times more often than Refresh<\/li><li>In 35% of the cases when Refresh is used, Location is <em>also<\/em> used<\/li><li>curl thus handles 99.9939% of the redirects in this test<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Additional notes<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>When Refresh is the <em>only<\/em> redirect header, the response code is usually 200 (with 404 being the second most common)<\/li><li>When both headers are used, the response code is almost always 30x<\/li><li>When both are used, it is common to redirect to the same target and it is also common for the Refresh header value to only contain a number (for the number of seconds until &#8220;refresh&#8221;).<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Refresh from HTML content<\/h4>\n\n\n\n<p>Redirects can also be done by meta tags in HTML and sending the refresh that way, but I have not investigated how common that is, as it isn&#8217;t strictly speaking HTTP and so is outside of my research (and interest) here.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">In use, not documented, not in the spec<\/h2>\n\n\n\n<p>Just another undocumented corner of the web. <\/p>\n\n\n\n<p>When I <a href=\"https:\/\/lists.w3.org\/Archives\/Public\/ietf-http-wg\/2019JanMar\/0197.html\">posted about these findings on the HTTPbis mailing list<\/a>, it was pointed out that WHATWG <a href=\"https:\/\/html.spec.whatwg.org\/multipage\/iana.html#refresh\">mentions this header in their iana page<\/a>. 
I say <em>mention<\/em> because calling that <em>documenting<\/em> would be a stretch&#8230;<\/p>\n\n\n\n<p>It is not at all clear exactly what the header is supposed to do and it is not documented anywhere. It&#8217;s not exactly a redirect, but almost?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Will\/should curl support it?<\/h2>\n\n\n\n<p>A decision hasn&#8217;t been made about it yet. With such a low frequency of use, and since we&#8217;ve managed fine without support for it for so long, maybe we can just maintain the situation and instead argue that we should completely deprecate the use of this header on the web?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Updates<\/h2>\n\n\n\n<p>After this post first went live, I got some further feedback and data that are relevant and interesting.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Yoav Weiss created <a href=\"https:\/\/chromium-review.googlesource.com\/c\/chromium\/src\/+\/1520146\">a patch for Chrome<\/a> to count how often they see this header used in real life.<\/li><li>Eric Lawrence pointed out that <a href=\"https:\/\/twitter.com\/ericlaw\/status\/1105594709790072832\">IE had several incompatibilities<\/a> in its Refresh parser back in the day.<\/li><li>Boris pointed out (in the comments below) the WHATWG-documented steps for handling the header.<\/li><li>The use of &lt;meta&gt; tag refresh in contents is fairly high. <a href=\"https:\/\/www.chromestatus.com\/metrics\/feature\/timeline\/popularity\/1548\">The Chrome counter<\/a> says almost 4% of page loads!<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The other day someone filed a bug on curl that we don&#8217;t support redirects with the Refresh header. This took me down a rabbit hole of Refresh header research and I&#8217;ve returned to share with you what I learned down there. tl;dr Refresh is not a standard HTTP header. 
As you know, an HTTP redirect &hellip; <a href=\"https:\/\/daniel.haxx.se\/blog\/2019\/03\/12\/looking-for-the-refresh-header\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Looking for the Refresh header<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":5,"featured_media":12106,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,13,45],"tags":[33,230],"class_list":["post-12086","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-curl","category-net","category-web","tag-curl-and-libcurl","tag-http"],"_links":{"self":[{"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/posts\/12086","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/comments?post=12086"}],"version-history":[{"count":22,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/posts\/12086\/revisions"}],"predecessor-version":[{"id":12112,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/posts\/12086\/revisions\/12112"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/media\/12106"}],"wp:attachment":[{"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/media?parent=12086"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/categories?post=12086"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daniel.haxx.se\/blog\/wp-json\/wp\/v2\/tags?post=12086"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}