<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>daniel.haxx.se &#187; tcpip</title>
	<atom:link href="http://daniel.haxx.se/blog/tag/tcpip/feed/" rel="self" type="application/rss+xml" />
	<link>http://daniel.haxx.se/blog</link>
	<description>Technology is life</description>
	<lastBuildDate>Fri, 27 Jan 2012 22:10:31 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.3</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>getaddrinfo with round robin DNS and happy eyeballs</title>
		<link>http://daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs/</link>
		<comments>http://daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs/#comments</comments>
		<pubDate>Tue, 03 Jan 2012 22:15:21 +0000</pubDate>
		<dc:creator>daniel</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Network]]></category>
		<category><![CDATA[c-ares]]></category>
		<category><![CDATA[cURL and libcurl]]></category>
		<category><![CDATA[dns]]></category>
		<category><![CDATA[ipv6]]></category>
		<category><![CDATA[POSIX]]></category>
		<category><![CDATA[sockets]]></category>
		<category><![CDATA[tcpip]]></category>

		<guid isPermaLink="false">http://daniel.haxx.se/blog/?p=3439</guid>
		<description><![CDATA[This is not news. These are only facts that still seem to be unknown to many people, so I just want to document them to help educate the world. I&#8217;ll dance around the subject a bit first by providing the full background info&#8230;
round robin basics
Round robin DNS has been the way since a [...]]]></description>
			<content:encoded><![CDATA[<p>This is not news. These are only facts that still seem to be unknown to many people, so I just want to document them to help educate the world. I&#8217;ll dance around the subject a bit first by providing the full background info&#8230;</p>
<h2>round robin basics</h2>
<p><a href="http://en.wikipedia.org/wiki/Round-robin_DNS">Round robin DNS</a> has long been the way to get some rough and cheap load-balancing, spreading visitors over multiple hosts when they try to use a single host/service with static content. When an <a href="http://rscott.org/dns/a.html">A entry</a> in a DNS zone is set up to resolve to multiple IP addresses, clients get different results in a semi-random manner and thus hit different servers at different times:</p>
<blockquote>
<pre>server     A   192.168.0.1
server     A   10.0.0.1
server     A   127.0.0.1</pre>
</blockquote>
<p>For example, if you&#8217;re a small open source project it makes a perfect way to feature a distributed service that appears with a single name but is hosted by multiple distributed independent servers across the Internet.  It is also used by high profile web servers, like for example www.google.com and www.yahoo.com.</p>
<h2>host name resolving</h2>
<p>If you&#8217;re an old-school hacker, if you learned to do socket and TCP/IP programming from the original Stevens&#8217; books and if you were brought up on BSD unix, you learned that you resolve host names with <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/gethostbyname.html">gethostbyname</a>() and friends. It is specified by POSIX and the Single UNIX Specification and has been around since basically forever.  When calling <em>gethostbyname()</em> on a given round robin host name, the function returns an array of addresses. That list of addresses will be in a seemingly random order. If an application just iterates over the list and connects to them in the order received, the round robin concept works perfectly well.</p>
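<p>As a rough illustration of that old-school pattern, here is a minimal Python sketch (Python rather than C only to keep it short). socket.gethostbyname_ex() is Python&#8217;s counterpart to gethostbyname(); the helper names resolve_ipv4 and connect_first are made up for this example and are not taken from any real application.</p>

```python
import socket

def resolve_ipv4(host):
    """Return the IPv4 addresses for host, in the order the resolver gave them."""
    _name, _aliases, addresses = socket.gethostbyname_ex(host)
    return addresses

def connect_first(addresses, port, timeout=5.0):
    """Try each address in the order given; return the first connected socket."""
    last_error = None
    for address in addresses:
        try:
            return socket.create_connection((address, port), timeout=timeout)
        except OSError as error:
            last_error = error  # remember the failure and try the next address
    raise last_error or OSError("no addresses to try")
```

<p>An application would call something like connect_first(resolve_ipv4("server.example.com"), 80), where the host name is a placeholder. Since the address order differs between lookups, different clients end up on different servers without any extra logic.</p>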
<h2>but gethostbyname wasn&#8217;t good enough</h2>
<p>gethostbyname() is really IPv4-focused. The mere whisper of IPv6 makes it break down and cry. It had to be replaced by something better.  Enter <a href="http://pubs.opengroup.org/onlinepubs/009604499/functions/getaddrinfo.html">getaddrinfo</a>(), also POSIX (and defined in <a href="http://www.ietf.org/rfc/rfc3493.txt">RFC 3493</a> and again updated in <a href="http://www.ietf.org/rfc/rfc5014.txt">RFC 5014</a>). This is the modern function that supports IPv6 and more. It is the shiny thing the world needed!</p>
<h2>not a drop-in replacement</h2>
<p>So the (good parts of the) world replaced all calls to gethostbyname() with calls to getaddrinfo() and everything now supported IPv6 and things were all dandy and fine?  Not exactly.  Because there were subtleties involved. Like the order in which these functions return addresses. In 2003 the IETF guys had shipped <a href="http://www.ietf.org/rfc/rfc3484.txt">RFC 3484</a> detailing <em>Default Address Selection for Internet Protocol version 6</em>, and using that as a guideline most (all?) implementations were now changed to return the list of addresses in that order. It would then become a list of hosts in &#8220;preferred&#8221; order. Suddenly applications would iterate over both IPv4 and IPv6 addresses and do it in an order that would be clever from an IPv6 upgrade-path perspective.</p>
<h2>no round robin with getaddrinfo</h2>
<p>So, back to the good old way to do round robin DNS: multiple addresses (be it IPv4 or IPv6 or both). With the new ideas of how to return addresses, this way of load balancing no longer works. Now getaddrinfo() returns basically the same order on every invocation. I noticed this back in 2005 and posted a question on the glibc hackers mailing list: <a href="http://www.cygwin.com/ml/libc-alpha/2005-11/msg00028.html">http://www.cygwin.com/ml/libc-alpha/2005-11/msg00028.html</a> As you can see, my question was delightfully ignored and nobody ever responded. The order seems to be dictated mostly by the above mentioned RFCs and the local <a href="http://linux.die.net/man/5/gai.conf">/etc/gai.conf</a> file, but neither is helpful if getting decent round robin is your aim.  Others <a href="http://lists.debian.org/debian-ctte/2007/09/msg00035.html">have noticed this flaw</a> <a href="http://www.mail-archive.com/wget@sunsite.dk/msg09237.html">as well</a> and some have fought passionately, arguing that this is a bad thing, while of course there&#8217;s an opposite side with people claiming it is the right behavior and that doing round robin DNS like this was a bad idea to start with anyway.  The impact on a large number of common utilities is simply that <strong>when they go IPv6-enabled, they also at the same time go round-robin-DNS disabled</strong>.</p>
<h2>no decent fix</h2>
<p>Since getaddrinfo() has now worked like this for almost a decade, we can forget about &#8220;fixing&#8221; it. Since gai.conf needs local edits to provide a different function response, it is not an answer.  But perhaps worse: since getaddrinfo() is now made to return the addresses in a sort of order of preference, it is hard to &#8220;glue on&#8221; a layer on top that simply shuffles the returned results. Such a shuffle would need to take IP versions and more into account. And it would become application-specific and thus would have to be applied to one program at a time.  The popular browsers seem less affected by this getaddrinfo drawback. My guess is that because they&#8217;ve already worked on making name resolves asynchronous so that name resolving doesn&#8217;t lock up their processes, they have taken different approaches and thus have their own code for this.  In <a href="http://curl.haxx.se/">curl&#8217;s</a> case, it can be built with <a href="http://c-ares.haxx.se/">c-ares</a> as a resolver backend even when supporting IPv6, and c-ares does not offer the sort feature of getaddrinfo, so in these cases curl will work with round robin DNS much more like it did when it used gethostbyname.</p>
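<p>To make the &#8220;glue on a shuffle&#8221; idea a bit more concrete, here is a deliberately crude Python sketch. It only looks at the address family: consecutive results of the same family are shuffled among themselves, while the IPv6-versus-IPv4 ordering that getaddrinfo() picked is left alone. The function name shuffle_within_family is invented for this example, and everything else RFC 3484 sorts on (scope, labels, prefix matching) is ignored, which is exactly the &#8220;and more&#8221; part that makes a real solution harder.</p>

```python
import random
import socket
from itertools import groupby

def shuffle_within_family(addrinfos, rng=random):
    """Shuffle getaddrinfo() entries, but only among neighbours of the same family.

    Consecutive entries that share an address family form one group. The
    groups stay in the order getaddrinfo() chose; only the order inside each
    group is randomized.
    """
    shuffled = []
    for _family, group in groupby(addrinfos, key=lambda info: info[0]):
        entries = list(group)
        rng.shuffle(entries)
        shuffled.extend(entries)
    return shuffled
```

<p>It would be applied to the result of a call such as socket.getaddrinfo(host, 80, type=socket.SOCK_STREAM) before looping over the addresses, and every application wanting this behavior would have to carry its own copy of the logic.</p>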
<h2>alternatives</h2>
<p>The downside with all alternatives I&#8217;m aware of is that they aren&#8217;t just taking advantage of plain DNS. In order to duck the problems I&#8217;ve mentioned, you can instead tweak your DNS server to respond differently to different users. That way you can either just randomly respond with different addresses in a round robin fashion, or you can try to make it more clever with things such as <a href="http://www.powerdns.com/content/home-powerdns.html">PowerDNS&#8217;s</a> geobackend feature. Of course we all know that A) <a href="http://en.wikipedia.org/wiki/Geotargeting">geoip</a> is crude and often wrong and B) your real-world geography does not match your network topology.</p>
<h2>happy eyeballs</h2>
<p>During this period, another connection related issue has surfaced: IPv6 connections are often handled as a second option on dual-stacked machines, and IPv6 is mostly present in dual stacks these days. This sadly punishes early adopters of IPv6 (yes, unfortunately IPv6 users must still be considered early adopters) since those services will then be slower than the older IPv4-only ones.</p>
<p>There seems to be a general consensus on how to overcome this problem: the <a href="http://tools.ietf.org/html/draft-ietf-v6ops-happy-eyeballs-07">Happy Eyeballs approach</a>. In short (and simplified) it recommends that we try both (or all) options at once, and the fastest to respond wins and gets to be used. This requires that we resolve A and AAAA names at once, and if we get responses to both, we connect() to both the IPv4 and IPv6 addresses and see which one is the fastest to connect.</p>
<p>This of course is not just a matter of replacing a function or two anymore. To implement this approach you need to do something completely new. For example, just calling getaddrinfo() and then looping over the addresses and trying connect() on each won&#8217;t work at all. You would basically either start two threads and do the IPv4-only route in one and the IPv6 route in the other, <em>or</em> you would have to issue non-blocking resolver calls to do A and AAAA resolves in parallel in the same thread and when the first response arrives you fire off a non-blocking connect() &#8230;</p>
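<p>The two-thread variant could look roughly like the following Python sketch. It is a simplification under stated assumptions, not what the Happy Eyeballs draft specifies in detail: each route resolves for one address family only and tries just the first address it gets, and the name connect_race is made up for this example.</p>

```python
import queue
import socket
import threading

def connect_race(host, port, timeout=5.0):
    """Run an IPv4 route and an IPv6 route in parallel; the first connect wins."""
    families = (socket.AF_INET6, socket.AF_INET)
    results = queue.Queue()
    lock = threading.Lock()
    state = {"decided": False}

    def route(family):
        sock = None
        try:
            # Resolve for this family only (AAAA or A), then connect to the
            # first address. A fuller version would try every address.
            infos = socket.getaddrinfo(host, port, family=family,
                                       type=socket.SOCK_STREAM)
            fam, kind, proto, _canonname, sockaddr = infos[0]
            sock = socket.socket(fam, kind, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        except OSError as error:
            if sock is not None:
                sock.close()
            results.put((None, error))
            return
        with lock:
            first = not state["decided"]
            state["decided"] = True
        if first:
            results.put((sock, None))
        else:
            sock.close()  # the other route already won

    for family in families:
        threading.Thread(target=route, args=(family,), daemon=True).start()

    last_error = None
    for _ in families:
        sock, error = results.get()
        if sock is not None:
            sock.settimeout(None)  # hand back an ordinary blocking socket
            return sock
        last_error = error
    raise last_error
```

<p>The caller gets a socket as soon as either route connects and never waits for the slower one. Even in this small form it replaces the whole resolve-then-loop structure of a traditional client.</p>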
<p>My point being that introducing Happy Eyeballs in your good old socket app will require some rather major remodeling no matter what. Doing this will most likely also affect how your application deals with round robin DNS so now you have a chance to reconsider your choices and code!</p>
]]></content:encoded>
			<wfw:commentRss>http://daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>I&#8217;ll talk at FSCONS 2010</title>
		<link>http://daniel.haxx.se/blog/2010/07/29/ill-talk-at-fscons-2010/</link>
		<comments>http://daniel.haxx.se/blog/2010/07/29/ill-talk-at-fscons-2010/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 21:07:26 +0000</pubDate>
		<dc:creator>daniel</dc:creator>
				<category><![CDATA[Network]]></category>
		<category><![CDATA[cURL and libcurl]]></category>
		<category><![CDATA[FSCONS]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[talk]]></category>
		<category><![CDATA[tcpip]]></category>

		<guid isPermaLink="false">http://daniel.haxx.se/blog/?p=1879</guid>
		<description><![CDATA[Recently I was informed that I got two talks accepted to the FSCONS 2010 conference, to be held in the beginning of November 2010.
My talks will be about the Future and current state of internet transport protocols (TCP, HTTP, SPDY, WebSockets, SCTP and more) and on High performance multi-protocol applications with libcurl, which will educate [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I was informed that I got two talks accepted to the <a href="http://www.fscons.org/">FSCONS 2010</a> conference, to be held in the beginning of November 2010.</p>
<p>My talks will be about the <em>Future and current state of internet transport protocols</em> (TCP, HTTP, SPDY, WebSockets, SCTP and more) and on <em>High performance multi-protocol applications with <a href="http://curl.haxx.se/libcurl/">libcurl</a></em>, which will educate the audience on how to use libcurl when doing high performance clients with potentially a very large number of simultaneous transfers. A somewhat clueful reader will of course spot that these two talks have a lot in common, and yeah they do reveal a lot of what I do and what I like and what I poke at these days. I hope I&#8217;ll be able to shed light on some things not everyone is already perfectly aware of.</p>
<p>The <strong>talks will be held in English</strong>, and if the past FSCONS conferences tell anything, my talks will be video filmed and become available online afterward for the world to see if you have a funeral or something to attend to that prevents you from actually attending in person.</p>
<p>If you have thoughts, questions or anything on these topics that you would like to get answered in my talk, feel free to bring them up and I&#8217;ll see what I can do.</p>
<p>(If those fine guys and gals at FSCONS ever settled for a logo, or had one I could link to, I would&#8217;ve shown one of them right here.)</p>
]]></content:encoded>
			<wfw:commentRss>http://daniel.haxx.se/blog/2010/07/29/ill-talk-at-fscons-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multipath TCP</title>
		<link>http://daniel.haxx.se/blog/2009/07/31/multipath-tcp/</link>
		<comments>http://daniel.haxx.se/blog/2009/07/31/multipath-tcp/#comments</comments>
		<pubDate>Fri, 31 Jul 2009 21:31:54 +0000</pubDate>
		<dc:creator>daniel</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Network]]></category>
		<category><![CDATA[IETF]]></category>
		<category><![CDATA[MPTCP]]></category>
		<category><![CDATA[tcpip]]></category>

		<guid isPermaLink="false">http://daniel.haxx.se/blog/?p=986</guid>
		<description><![CDATA[During the IETF 75 meeting in Stockholm, there was this multipath tcp BOF (&#8220;start-up meeting&#8221; sort of) on Thursday morning that I visited.
Multipath TCP (shortened to MPTCP at times) is basically an idea to make everything look like TCP for both end points, but allow for additional TCP paths to get added and allow packets [...]]]></description>
			<content:encoded><![CDATA[<p>During the <a href="http://www.ietf75.se/">IETF 75</a> meeting in Stockholm, there was this multipath tcp BOF (&#8220;start-up meeting&#8221; sort of) on Thursday morning that I visited.</p>
<p>Multipath TCP (shortened to MPTCP at times) is basically an idea to make everything look like TCP for both end points, but allow for additional TCP paths to get added and allow packets to get routed over any of the added flows to overcome congestion and to select the routes where it flows &#8220;best&#8221;. The socket API would remain unmodified in both ends. The individual TCP paths would all look and work like regular TCP streams for the rest of the network. It is basically a way to introduce these new fancy features without breaking compatibility. Of course a big point of that is to keep functionality over NATs or other middle-boxes. (See <a href="http://trac.tools.ietf.org/area/tsv/trac/wiki/MpTcpBofDescription">full description</a>.)</p>
<p>The guys holding the BOF had already presented a fairly detailed draft of how it could be designed both <a href="http://tools.ietf.org/html/draft-van-beijnum-1e-mp-tcp-00">one-ended</a> and <a href="http://tools.ietf.org/html/draft-ford-mptcp-multiaddressed-01">with multiple addresses</a>, but could also boast of an already written implementation that was even demoed live in front of the audience.</p>
<p>The term &#8216;path&#8217; is basically used for a pair of address+port sets. I would personally rather call it &#8220;flow&#8221; or &#8220;stream&#8221; or something, as we cannot really control that the paths are separate from each other as those are entirely in the hands of those who route the IP packets to the destination.</p>
<p>They stressed that their goals here included:</p>
<ul>
<li>perform no worse than TCP would on the best of the single TCP paths</li>
<li>be no harder on the network than a single TCP flow would be, not even for single bottlenecks (network and bottleneck fairness)</li>
<li>allow <a href="http://www.cs.ucl.ac.uk/staff/d.wischik/Research/respool.html">resource pooling</a> over multiple TCP paths</li>
</ul>
<p>A perfect use-case for this is hosts with multiple interfaces. Like a mobile phone with 3G and wifi, as it could have a single TCP connection using paths over both interfaces, and it could even change paths along the way when you move and hand over to new wifi access points or when you plug in your Ethernet cable or whatever. Kind of like a solution to the mobile ip concept with roaming that was never made to actually work in the past.</p>
<p>The <a href="https://www.ietf.org/mailman/listinfo/multipathtcp">multipath tcp mailing list</a> is already quite active, and it didn&#8217;t take long until possible flaws in the backwards compatibility were discovered, and they are now being discussed. Like if you use TCP to verify that a particular <strong>link</strong> is alive, MPTCP may in fact break that as the proposal is currently written.</p>
<p>What struck me as an interesting side-effect of this concept is that, if implemented, it will separate packets from the same original stream further from each other and possibly make snooping on plain-text TCP traffic harder. Like in the case where you monitor traffic going through a router or similar.</p>
]]></content:encoded>
			<wfw:commentRss>http://daniel.haxx.se/blog/2009/07/31/multipath-tcp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HTTP over unix domain sockets</title>
		<link>http://daniel.haxx.se/blog/2008/04/14/http-over-unix-domain-sockets/</link>
		<comments>http://daniel.haxx.se/blog/2008/04/14/http-over-unix-domain-sockets/#comments</comments>
		<pubDate>Mon, 14 Apr 2008 09:40:51 +0000</pubDate>
		<dc:creator>daniel</dc:creator>
				<category><![CDATA[cURL and libcurl]]></category>
		<category><![CDATA[sockets]]></category>
		<category><![CDATA[tcpip]]></category>

		<guid isPermaLink="false">http://daniel.haxx.se/blog/2008/04/14/http-over-unix-domain-sockets/</guid>
		<description><![CDATA[On the libcurl list the discussion popped up how to best add support to do HTTP over unix domain sockets instead of TCP/IP and the discussion took off trying to figure out how to specify this the best way from an application&#8217;s point of view.
What&#8217;s your take on this?
]]></description>
			<content:encoded><![CDATA[<p>On the libcurl list a discussion popped up about how best to add support to <a href="http://curl.haxx.se/mail/lib-2008-04/0279.html">do HTTP over unix domain sockets</a> instead of TCP/IP, and it took off trying to figure out how to specify this the best way from an application&#8217;s point of view.</p>
<p>What&#8217;s your take on this?</p>
]]></content:encoded>
			<wfw:commentRss>http://daniel.haxx.se/blog/2008/04/14/http-over-unix-domain-sockets/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

