inet   Frontpage - Services - Products - Support - Company - Contact - Blog - GDPR
Note that all opinions, views, etc. expressed in these blog pages reflect those of the authors, not necessarily those of Inferno Nettverk A/S.  

Valgrind and uninitialised reads debugging - take two

Posted by Michael Shuldman, Inferno Nettverk A/S, Norway on Mon Aug 11 14:51:40 MEST 2014

After the time I spent on documenting my previous approach to debugging uninitialised reads errors from Valgrind, I did not expect having to spend many hours once again understanding why Valgrind was printing the warning it did. It took Valgrind three years to prove me wrong.

==27040== Conditional jump or move depends on uninitialised value(s)
==27040==    at 0x4A0B152: bcmp (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==27040==    by 0x4AAB3E: compareconfigs (shmemconfig.c:793)
==27040==    by 0x4AE9FB: doconfigtest (shmemconfig.c:1114)
==27040==    by 0x4B4F85: dotest (sockd.c:1639)
==27040==    by 0x4AF141: main (sockd.c:189)

Line 793 of shmemconfig.c is this:

   EQCHECK_PTR(a,
               b,
               internal.addrv,
               sizeof(*a->internal.addrv) * a->internal.addrc);

Basically what this does is a memcmp(3) of a->internal.addrv and b->internal.addrv. Since earlier in the code we had done the equivalent of copying a->internal.addrv over b->internal.addrv with memcpy(3), and we were now comparing the same amount of bytes we had previously copied from a to b, I did not immediately (or even within the first few hours) understand why Valgrind was complaining.

True, "addrv" is a struct with various members, and quite likely contains some padbytes, but a memcpy(3) would of course copy those padbytes too, since it was copying the whole of addrv, based on the size of the whole of addrv.

So "addrv" in object "a" and object "b" should be 100% identical, padbytes and all. Indeed, the only reason why EQCHECK_PTR() was called was to assert that "addrv" was the same in both objects.

But Valgrind complained.

After much debugging and experimenting with bzero(3) here and there, individual memcmp(3)'s, short and long memcmp(3)'s, I eventually understood the problem, and the reason for why Valgrind complained about "Conditional jump or move depends on uninitialised value(s)".

Yes, "addrv" in "a" and "b" is identical, and so are the padbytes that are not initialized in either object ("addrv" in "b" being an exact copy of "addrv" in "a"). And we never try to use the padbytes. And Valgrind is clever enough to, as they say in the faq, ignore padbytes as long as you do not use them: "As for eager reporting of copies of uninitialised memory values, this has been suggested multiple times. Unfortunately, almost all programs legitimately copy uninitialised memory values around (because compilers pad structs to preserve alignment) and eager checking leads to hundreds of false positives. Therefore Memcheck does not support eager checking at this time."

The problem

Our problem was however that when doing the memcmp(3) to assert that "addrv" in both "a" and "b" are identical, memcmp(3) of course will also read the padbytes part of "addrv" in "a", comparing it against the same padbytes part in object "b". And when we read the padbytes, uninitialised as they are in our code, Valgrind rightfully complains about us reading uninitialised values. Even if we never use them for anything but comparing against the same uninitialised values, copied to another object.

Conclusion

There are three simple ways to avoid this warning.

  1. bzero(3) the object (the "addrv" struct in this case) before we start initialising the various members of the object.

    Any padbytes will then be initialised to zero, and Valgrind will not complain about uninitialised values.

  2. Use Valgrinds great "--gen-suppressions" option to generate something you can cut and paste into your Valgrind suppression-file, so Valgrind will not warn about this particular problem at this particular place again.

    The padbytes will then still be uninitialised, but Valgrind will not complain.

    The benefit of this is that if you have other tools that perform static analysis, they will still be able to warn you about using uninitialised members of the object. If you on the other hand simply bzero(3) all members before you initialise them, the static analysis tool will be unable to warn you about using any members you neglected to intialise. Because those members will be initialised to zero by the big bzero(3) you started with.

    Unfortunately, the exact callstack may vary from platform to platform, and subsequent code-changes may also change the callstack enough to no longer match the suppression pattern.

  3. bzero(3), but guard it with a

    if (RUNNING_ON_VALGRIND)
    statement. That lets static analysis tools still pick up genuine usage of uninitialised attributes (since at least one of the phases should not be
    RUNNING_ON_VALGRIND
    ), but at the same time silences the false Valgrind positive for good.


Comments:


Add comment:

To add a public comment to this article, please use the form below.

For information on how we process and store information submitted via this form, please see our privacy policy for blog comments.

Not spam: (optional, check if message is not spam)
Name:
Email: (optional, will not be published, add if you want us to contact you by email)
Subject:
Comment:

Copyright © 1998-2024 Inferno Nettverk A/S