Decomplexification

(Clearly a much better word than simplification.)

I believe we generally accept that we should write simple, easy-to-read code in order to make it harder to introduce bugs and security problems. The more complicated the code we write, the easier it gets to slip up, misunderstand or forget something along the way.

And yet, over time, functions tend to grow and become more and more complicated as we address edge cases and add new funky features we did not anticipate when we first created the code decades ago.

Complexity

Cyclomatic complexity is a metric used to indicate the complexity of a program. You can click the link there and read all the fine details, but it boils down to: a higher number means a more complex function, one that contains more statements and code paths.
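
As a rough illustration of how the counting works, here is a small Python function used only as a language-agnostic sketch of the common "decision points plus one" rule. This is not pmccabe's exact algorithm, just the general idea:

def classify(status):
    if status == 200:        # decision point, +1
        return "ok"
    elif status == 404:      # decision point, +1
        return "missing"
    elif status >= 500:      # decision point, +1
        return "server error"
    return "other"
# three decision points, so the cyclomatic complexity is 1 + 3 = 4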

There is this fine old command line tool called pmccabe that is able to scan C code and output a summary of all functions and their corresponding complexity scores.

Invoking this tool on your own C code is a perfect way to get a toplist of functions possibly in need of refactoring. While of course the idea of what a complex function is, and exactly how to count the score, is not entirely objective, I believe this method works well enough.
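
As a minimal sketch of how such a toplist can be produced: this is my own little wrapper, not the actual curl dashboard scripting, and it assumes pmccabe's usual one-line-per-function output with the complexity score in the first column and the file/function location last.

#!/usr/bin/env python3
# Sketch: run pmccabe over a source tree and print the most complex functions.
import glob
import subprocess

files = glob.glob("lib/**/*.c", recursive=True) + \
        glob.glob("src/**/*.c", recursive=True)
out = subprocess.run(["pmccabe"] + files,
                     capture_output=True, text=True).stdout

functions = []
for line in out.splitlines():
    cols = line.split(None, 5)   # five numeric columns, then "file(line): function"
    if len(cols) == 6:
        functions.append((int(cols[0]), cols[5]))

# print the ten most complex functions
for score, where in sorted(functions, reverse=True)[:10]:
    print(f"{score:5}  {where}")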

curl

Last year I added a graph to the curl dashboard that shows the complexity score of the worst function in curl as well as the 99th percentile. Later, I also added a plot for the 90th percentile.
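
The data points for such a graph can be derived from the per-function scores, for example like this (reusing the functions list from the sketch above; exactly how the real dashboard script defines its percentiles is an assumption on my part, here the simple nearest-rank method):

import math

def percentile(values, pct):
    # nearest-rank percentile over the per-function scores
    ranked = sorted(values)
    idx = max(0, math.ceil(pct / 100.0 * len(ranked)) - 1)
    return ranked[idx]

scores = [score for score, _ in functions]   # `functions` from the pmccabe sketch above
print("worst:", max(scores))
print("P99:  ", percentile(scores, 99))
print("P90:  ", percentile(scores, 90))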

This graph shows how the worst complexity in curl has shifted over time – and as always when there is a graph or you measure something, we suddenly got the urge to do something about it. Because it looked bad.

The worst

I grabbed my scalpel and refactored a few of the most complex functions we had, and I basically halved the complexity score of the worst-in-curl functions. The steep drop at the right side of the graph felt nice.

I left it there for a while, quite pleased with having at least improved the state of things.

A few months later I returned to the topic. I figured we could do more, as the worst was still quite bad. We should set a goal to extinguish (well, improve) all functions in the curl code with a score higher than N.

A goal

In my email to the team I proposed 100 as the acceptable complexity limit, which is not super aggressive. When I sent the email, there were seven functions scoring over 100 and the worst offender scored 196. After all, only a couple of months earlier the worst function had scored over 350.

Maybe we could start with 100 as a max and lower it going forward if that works?

To get additional visualization of the curl code complexity (and ideally of how we improve the situation), I also created two more graphs for the dashboard.

Graph complexity distribution

The first one takes the complexity score of the function that each line of source code belongs to and then shows what percentage of the source code has which complexity scores. The ideal of course being that almost the entire thing should have low scores.

This graph always shows 100% of the source code, independent of its size at any given time, because I think that is what is relevant: the complexity distribution at any particular point in time, regardless of how big the code is. The code has grown almost linearly throughout the period this graph shows, so of course 50% of the code in 2010 was much less code than 50% is today.
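
One way to approximate the data behind this graph is to weight every function's score by its length, which pmccabe also reports. This is my own rough version, not the dashboard script: it only counts lines that are inside functions, and the bucket boundaries are my guess rather than the ones actually used.

from collections import Counter

buckets = [("1-10", 1, 10), ("11-50", 11, 50), ("51-100", 51, 100),
           ("101-200", 101, 200), ("over 200", 201, float("inf"))]
lines_per_bucket = Counter()

for line in out.splitlines():                 # `out` from the pmccabe run above
    cols = line.split()
    if len(cols) < 6:
        continue
    score, length = int(cols[0]), int(cols[4])   # complexity score, function length
    for label, lo, hi in buckets:
        if lo <= score <= hi:
            lines_per_bucket[label] += length
            break

total = sum(lines_per_bucket.values())
for label, _, _ in buckets:
    share = 100.0 * lines_per_bucket[label] / total
    print(f"complexity {label:>8}: {share:5.1f}% of the code")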

This graph shows how we have had periods with quite a lot of code of complexity over 200, and that today we have finally erased all complexity above 100. It's a little hard to see in the graph, but the yellow field goes all the way up as of May 28, 2025.

Graph average complexity

The second graph takes the same per-line complexity scores and calculates the average complexity score over all lines of code at that point in time. Ideally, that curve should shrink over time.
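
Under the same approximation as above, the line-weighted average is straightforward to compute:

weighted = 0
total_lines = 0
for line in out.splitlines():                 # `out` from the pmccabe run above
    cols = line.split()
    if len(cols) < 6:
        continue
    score, length = int(cols[0]), int(cols[4])
    weighted += score * length
    total_lines += length

print(f"average complexity per line: {weighted / total_lines:.2f}")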

It now shows a rather steep drop in mid 2025 after our latest efforts. The average complexity has more than halved since 2022.

Analyzers like it too

Static code analyzers also produce better results and fewer false positives when they get to work with smaller and simpler functions. This, too, helps produce better code.

Refactors could shake things up

Of course, refactoring a complex function into several smaller and simpler functions can be anywhere from straightforward to quite complicated. A refactor in the name of simplification that is itself hard is a bit of an oxymoron, and one that might shake things up and could potentially add bugs rather than fix them.

This of course needs to be done with care, and there needs to be a solid test suite around the functions to validate that the functionality is still there and behaves the same as before the refactor.

Function length

The most complex functions also tend to be the longest; there is a strong correlation. For that reason, I also produce a graph of the worst and the 99th percentile function lengths in the curl source code.

(Something is wrong in this graph, as the P99 cannot be higher than the worst, yet the plot seems to indicate that it was in late 2024.)

A CI job to keep us honest

To make absolutely sure not a single function accidentally increases its complexity above the permitted level in a pull request, we created a script that makes a CI job turn red if any function goes over 100 in the complexity check. It is now in place. Maybe we can lower the limit going forward?
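
A minimal sketch of such a gate, in the same spirit but not the actual script used in the curl repository:

#!/usr/bin/env python3
# Fail (exit non-zero, turning the CI job red) if any function exceeds the limit.
import glob
import subprocess
import sys

LIMIT = 100
files = glob.glob("lib/**/*.c", recursive=True) + \
        glob.glob("src/**/*.c", recursive=True)
out = subprocess.run(["pmccabe"] + files,
                     capture_output=True, text=True).stdout

offenders = []
for line in out.splitlines():
    cols = line.split(None, 5)
    if len(cols) == 6 and int(cols[0]) > LIMIT:
        offenders.append((int(cols[0]), cols[5]))

if offenders:
    for score, where in sorted(offenders, reverse=True):
        print(f"complexity {score} exceeds {LIMIT}: {where}")
    sys.exit(1)

print("all functions are within the complexity limit")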

Towards the goal

The goal is not so much a goal as a process. An attempt to make us write simpler code, which in turn should help us write better and more secure code. Let’s see where we are in ten years!

As of this writing, here is the toplist of the most complex functions in curl, the ones with scores over 70:

100  lib/vssh/libssh.c:myssh_statemach_act
 99  lib/setopt.c:setopt_long
 92  lib/urlapi.c:curl_url_get
 91  lib/ftplistparser.c:parse_unix
 88  lib/http.c:http_header
 83  src/tool_operate.c:single_transfer
 80  src/config2setopts.c:config2setopts
 79  lib/setopt.c:setopt_cptr
 79  lib/vtls/openssl.c:cert_stuff
 75  src/tool_cb_wrt.c:tool_write_cb
 73  lib/socks.c:do_SOCKS5
 72  lib/vssh/wolfssh.c:wssh_statemach_act
 71  lib/vtls/wolfssl.c:Curl_wssl_ctx_init
 71  lib/rtsp.c:rtsp_do
 71  lib/socks_sspi.c:Curl_SOCKS5_gssapi_negotiate

This is just a snapshot of the moment. I hope things will continue to improve going forward, if perhaps a little slower now that we have fixed all the most terrible cases.

Everything is public

All the scripts for this, the graphs shown, and the data behind them are of course all publicly available.

4 thoughts on “Decomplexification”

  1. Ratcheted improvements are a nice benefit of simply having something on a graph – might as well improve the numbers, just because they’re there. But as Goodhart’s law cautions: “When a measure becomes a target, it ceases to be a good measure.”

    I can also see that being the case with optimizing ruthlessly for cyclomatic complexity on a per-function level. Splitting a 100 CC function into 20 smaller functions of 5 CC each would improve these metrics, but wouldn’t necessarily make things easier to understand. The total system complexity is not reduced, and following the logic through several layers of a call stack can itself pose a problem.

    I wonder, do you have some sort of metric to counterbalance that sort of approach? So you’re still driven to reduce cyclomatic complexity – but not at *all* costs.

    1. @Kovács György: we don’t have tools for combating that, we just have plain old manual review and common sense. I mean, that would not be the only way you can write totally crazy code. We still need to make sure that the code is sensible and readable, and we do this primarily in the plain old human way.

  2. Thanks for sharing, Daniel, that’s super interesting. I guess in haproxy we have several very bad scores (config parsers, config checker, etc). Usually we know them. Some have been split, but splitting can be equally hard, or render the code less readable when not done along the right paths (typically when you have to create single-use functions whose name and/or purpose is vague because they’re only there to split a long one), and may even introduce new bugs. But I agree that having scores can give hints about what to look at, and the longer you look, the more ideas come to mind about how to address the “problem”. I think we’ll eventually give this a try.

    1. @Willy: for several of our high-complexity functions I first thought that I could not possibly do better because they needed something that made them that way, but in all cases so far they just needed time and more “background processing” for a while in the back of some heads, and then later we would figure out ways to split them into more chewable chunks. Of course that might not always apply to everything.
