The danger of variable length arrays in C99+

December 07, 2015

I recently encountered several significant bugs in C code that shared a common source: variable length arrays (VLAs). Each bug ultimately arose from an assumption, made years earlier, about how big a VLA could get. Once that assumption was violated, the result was undefined behavior. The tricky part about this kind of bug is that there is no way within C to check whether a VLA was allocated successfully; a VLA that cannot be allocated is simply undefined behavior, which usually shows up as crashing code.

What are variable length arrays?

Variable length arrays fill a gap left by traditional C arrays, as described in this Dr. Dobb’s article. The gap was the inability to declare arrays with non-constant bounds that have scope-based allocation. This is best explained with a (silly) example:

void foo(int len){
  if(len==50){
    float vals[50];
  }/*After leaving this scope: vals deallocates.*/
  else{
    float vals[len];
  }/*Same here.*/
}

Before C99, only the first block (the one immediately after if(len==50)) was valid according to the C standard. Since C99 it has also been legal to declare an array with a non-constant length. This, however, can be very problematic.

What’s the problem?

Although the C standard does not even mention the word stack, the notion of a stack is now basically inseparable from how scoped variables are implemented in C. For reasons outside of my experience or understanding, the stack used for this kind of allocation is almost always limited. Very limited.

In Bash the stack size limit can be found using ulimit -s; on my machine this command reports a stack size of 8192 kilobytes. Here is a simple C program that crashes when using a variable length array:

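#include <stdio.h>  /*printf*/
#include <stdlib.h> /*atoi*/

int main(int argc,char** argv){
  int len=0;
  if(argc>1)len=atoi(argv[1]);
  else{printf("Supply array length\n"); return 1;}
  if(len<1){printf("Length must be positive\n"); return 1;}/*A VLA of size <= 0 is itself undefined.*/

  char bad[len];/*No way to detect failure here if len is too large for the stack.*/
  bad[len-1]='x';/*Write before reading so the value printed is defined.*/

  printf("%c\n",bad[len-1]);/*Keep compiler from optimizing bad away.*/
  return 0;
}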

Running this on my machine with a length of 8192*1024=8388608 bytes gives a segmentation fault.
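The limit that ulimit -s reports can also be read from inside a program. The following is only a minimal sketch, assuming a POSIX system where getrlimit and RLIMIT_STACK are available; the helper name stack_limit_bytes is made up for illustration and is not part of any library.

```c
#include <sys/resource.h> /* getrlimit, RLIMIT_STACK (POSIX, not standard C) */

/* Hypothetical helper: returns the soft stack limit in bytes,
   0 if the limit is unlimited, or -1 if the query fails. */
long long stack_limit_bytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1; /* query failed */

    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;  /* no soft limit configured */

    return (long long)rl.rlim_cur;
}
```

Knowing this number does not make a VLA safe to use. The stack is shared with every other frame and local variable, so a VLA somewhat smaller than the limit can still overflow it, and there is still no way to detect that failure from within C.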

Conclusion

Although one can be extra careful with VLAs and use them to great effect in some cases, I find that their use causes trouble down the road. The biggest problem is that one cannot even check for failure, as one can with the slightly more verbose malloc’d memory. Assumptions about the size of an array can be broken two years after writing perfectly legal C using VLAs, leading to issues that may be very difficult to find. Worse, your assumptions may be perfectly correct, and the VLA-using code may still break on a different system whose stack is smaller than the one on your development machine. When I need variable length I just use malloc, or std::vector if I am in C++.
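For comparison, here is a rough sketch of the malloc approach with the failure check that a VLA cannot offer. The function name use_heap_buffer and its return convention are assumptions chosen for this illustration, not an established API.

```c
#include <stdlib.h> /* malloc, free */
#include <string.h> /* memset */

/* Hypothetical helper: allocates len bytes on the heap, fills them,
   and stores the last byte in *last_out.
   Returns 0 on success, -1 on a bad length or a failed allocation. */
int use_heap_buffer(long len, char *last_out)
{
    char *buf;

    if (len <= 0 || last_out == NULL)
        return -1;            /* reject sizes a VLA would silently accept */

    buf = malloc((size_t)len);
    if (buf == NULL)
        return -1;            /* allocation failure is detectable here */

    memset(buf, 'x', (size_t)len);
    *last_out = buf[len - 1];

    free(buf);                /* explicit cleanup replaces scope-based deallocation */
    return 0;
}
```

The cost is an explicit free and a few more lines. In exchange, a length that would exceed the stack limit either succeeds on the heap or returns an error the caller can handle.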