21. Check that the end-of-file character is reached correctly (EOF)

Let's continue the topic of working with files. And again we'll have a look at EOF. But this time we'll speak about a bug of a completely different type. It usually reveals itself in localized versions of software.

The fragment is taken from Computational Network Toolkit. The error is detected by the following PVS-Studio diagnostic: V739 EOF should not be compared with a value of the 'char' type. The 'c' should be of the 'int' type.

string fgetstring(FILE* f)
{
  string res;
  for (;;)
  {
    char c = (char) fgetc(f);
    if (c == EOF)
      RuntimeError("error reading .... 0: %s", strerror(errno));
    if (c == 0)
      break;
    res.push_back(c);
  }
  return res;
}

Explanation

Let's look at the way EOF is declared:

#define EOF (-1)

As you can see, the EOF is nothing more than '-1 ' of int type. Fgetc() function returns a value of int type. Namely, it can return a number from 0 to 255 or -1 (EOF). The values read are placed into a variable of char type. Because of this, a symbol with the 0xFF (255) value turns into -1, and then is handled in the same way as the end of file (EOF).

Users that use Extended ASCII Codes, may encounter an error when one of the symbols of their alphabet is handled incorrectly by the program.

For example in the Windows 1251 code page, the last letter of Russian alphabet has the 0xFF code, and so, is interpreted by the program as the end-of-file character.

Correct code

for (;;)
{
  int c = fgetc(f);
  if (c == EOF)
    RuntimeError("error reading .... 0: %s", strerror(errno));
  if (c == 0)
    break;
  res.push_back(static_cast<char>(c));
}

Recommendation

There is probably no particular recommendation here, but as we are speaking about EOF, I wanted to show an interesting variant of an error, that some people aren't aware of.

Just remember, if the functions return the values of int type, don't hasten to change it into char. Stop and check that everything is fine. By the way, we have already had a similar case discussing the function memcmp() in Chapter N2 - "Larger than 0 does not mean 1" (See the fragment about a vulnerability in MySQL)

results matching ""

    No results matching ""