One of the many areas that the Kernel Self Protection Project
looks at is making sure kernel developers are using APIs correctly and safely.
The string APIs, in particular the string copying APIs, seem to be one area
that gets developers confused. Strings in C aren’t real[1] in that there isn’t
a proper string type. For the purposes of this discussion, a C string is an
array of characters with a terminating NUL ('\0') byte.
One of the obvious operations a programmer would want to do is copy a string.
There’s an API, strcpy, to do so:
char *strcpy(char *dest, const char *src);
From the man page:
The strcpy() function copies the string pointed to by src, including the terminating null byte ('\0'), to the buffer pointed to by dest. The strings may not overlap, and the destination string dest must be large enough to receive the copy. Beware of buffer overruns! (See BUGS.)
That last sentence is important and the source of numerous bugs. Because C
strings don’t have an inherent length associated with them, it’s up to the
programmer to know or check the length everywhere. Otherwise, you may end up
copying bytes outside the dest buffer. This is pretty annoying and error
prone, so there’s another API, strncpy:
char *strncpy(char *dest, const char *src, size_t n);
This one takes a length parameter so it’s getting better. From the man page:
The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated. If the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written.
The warning in that passage is, again, important. If your string is n bytes
or longer, your buffer will not be NUL terminated. You may not have written
beyond the buffer, but the next time you access the string at dest, C will
happily keep looking in the next memory area until it sees a NUL.
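To make the missing terminator concrete, here is a minimal userspace sketch. The helper names is_terminated and copy_and_terminate are hypothetical, made up for this illustration; the second one shows the common workaround of reserving the last byte for the terminator, which truncates silently when src does not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether the first n bytes of buf contain
 * a NUL byte, i.e. whether buf holds a terminated string.
 */
static bool is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/*
 * Hypothetical helper: strncpy() into a buffer of n bytes, then add the
 * terminator by hand. Truncation is silent; the caller is not told.
 */
static void copy_and_terminate(char *dest, const char *src, size_t n)
{
    assert(n > 0);
    strncpy(dest, src, n - 1);  /* leave room for the terminator */
    dest[n - 1] = '\0';
}
```

With a 4-byte buffer, a plain strncpy(buf, "abcdef", sizeof(buf)) leaves is_terminated(buf, sizeof(buf)) false, while copy_and_terminate(buf, "abcdef", sizeof(buf)) leaves the terminated string "abc" behind.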
It’s also pretty easy to run into some anti-patterns with strncpy. If you
don’t specify the bound n correctly, it’s still possible to overrun the
buffer. If your bound for n is a function of your src string rather than the
size of dest, you haven’t solved anything. gcc has started to warn on some of
these issues, which is helpful (if annoying to clean up).
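As a sketch of the difference, the struct name_buf type and the set_name function below are hypothetical and exist only for this example. The anti-pattern is shown in a comment; the function derives its bound from the destination instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size destination, for illustration only. */
struct name_buf {
    char name[8];
};

/*
 * Anti-pattern, shown only as a comment: the bound is computed from src,
 * so it says nothing about how much room dest has.
 *
 *     strncpy(nb->name, src, strlen(src));
 */

/* Bound derived from the destination, with the terminator added by hand. */
static void set_name(struct name_buf *nb, const char *src)
{
    assert(nb != NULL && src != NULL);
    strncpy(nb->name, src, sizeof(nb->name) - 1);
    nb->name[sizeof(nb->name) - 1] = '\0';
}
```

Using sizeof on the array member keeps the bound tied to the buffer even if its size changes later. Note that this only works on a real array; sizeof applied to a pointer gives the size of the pointer.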
size_t strlcpy(char *dst, const char *src, size_t size);
I couldn’t quite find the full history, but this one seems to be derived from
BSD. From the kernel’s description of strlcpy:
Compatible with ``*BSD``: the result is always a valid NUL-terminated string that fits in the buffer (unless, of course, the buffer size is zero). It does not pad out the result like strncpy() does.
strlcpy fixes the missing-terminator problem: a truncated copy is still NUL
terminated. It will not pad the buffer, though, and the padding may or may not
be behavior that’s wanted. strlcpy in the kernel also has the implementation
detail of calling strlen(src), which means that you will always be reading the
entire source string even if you only ask for a subset of it to be copied.
This shouldn’t matter for most uses, but there may be cases which could result
in reading memory unexpectedly if src is not NUL terminated.
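The behavior described above can be sketched in userspace as follows. This is an approximation written for illustration, not the kernel source, and the name sketch_strlcpy is hypothetical. The strlen(src) call comes first, so the whole source is walked no matter how small size is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative sketch of strlcpy-style semantics (not the kernel code):
 * always terminate when size > 0, never pad, and return strlen(src) so
 * the caller can detect truncation with (ret >= size).
 */
static size_t sketch_strlcpy(char *dest, const char *src, size_t size)
{
    size_t ret;

    assert(src != NULL);
    ret = strlen(src);  /* reads all of src up to its NUL */

    if (size > 0) {
        size_t len = (ret >= size) ? size - 1 : ret;

        memcpy(dest, src, len);
        dest[len] = '\0';
    }
    return ret;
}
```

Copying "abcdef" into a 4-byte buffer returns 6 and leaves "abc" in the buffer; a return value greater than or equal to the buffer size signals that the copy was truncated.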
There’s also strscpy, which was introduced in 2015 and is designed to be a
combination of both strncpy and strlcpy. This was not without controversy,
but today the API is frequently preferred over either of them.
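For comparison, here is a simplified userspace sketch of strscpy-style semantics as I understand them: the result is always terminated when the count is nonzero, at most count bytes of src are read, and truncation is reported with a negative error code rather than a length. The name sketch_strscpy is hypothetical, and the sketch leaves out details of the real kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/*
 * Illustrative sketch of strscpy-style semantics (not the kernel code):
 * returns the number of characters copied, excluding the NUL, or -E2BIG
 * if count is zero or src had to be truncated.
 */
static ssize_t sketch_strscpy(char *dest, const char *src, size_t count)
{
    size_t i;

    assert(dest != NULL && src != NULL);
    if (count == 0)
        return -E2BIG;

    /* Reads at most count bytes of src, unlike a strlen()-based copy. */
    for (i = 0; i < count; i++) {
        dest[i] = src[i];
        if (src[i] == '\0')
            return (ssize_t)i;
    }

    /* No NUL within count bytes: truncate and say so. */
    dest[count - 1] = '\0';
    return -E2BIG;
}
```

Because truncation comes back as an error, a caller can test the return value for being negative instead of comparing a length against the buffer size.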
More important than a general rule of “You should always use strscpy” is to
make sure you understand what all the APIs do. There may be cases where it is
appropriate to just use strcpy, or where you want the behavior of strlcpy. If
you’re doing something unusual, please document your code for the benefit of
others.
[1] C strings are about as real as Linux containers.