Sunday, July 15, 2018

Text Adventure Programming in C, Part 2

TEXT ADVENTURE PROGRAMMING IN C

TEXT INPUT:

So, the basics of C style strings were covered last time. Now it is time to look at how to input strings. Again, in C++ this is no big issue thanks to std::string and the standard library, but in C it is fraught with pitfalls to trip up the unwary. There are several ways to input text in C using the stdio.h functions. There are just as many ways to cause buffer overflows, if the programmer is not careful.

The most basic function for text input is the multipurpose scanf function. However, it is not designed specifically and only for text. It accepts data from stdin, which in most cases is the keyboard. This data is read according to the "format" specified when the function is called, and stored in a variable, by address, also specified when calling the function. The data can be floating point, integer, character, or even a string. scanf receives some bad publicity, I have noticed, but it is not that bad, once it is understood, and remains a good option for a string input if handled properly. I will come back to it.

By far the favorite method for text input, I notice, is the fgets function, also in the stdio.h header. This initially may seem a bad option, because fgets is a function to read data from a file that was opened with fopen. However, fgets can be forced to accept input from the keyboard by using stdin (keyboard) as the "file stream" argument. Here is the prototype of fgets...

char *fgets(char *str, int n, FILE *stream)

What people like about it is that "int n" part. It limits the number of characters that fgets reads into the string: at most n - 1 characters are read, and a null terminator is always added. If n is equal to the size of the character array, fgets cannot overflow it. Also, fgets will read spaces and include them in the string, so sentences can be input. scanf will (normally) only read a string up to one word long.

So, inputting something like "The large lake" with...

scanf("%s", c_str);

...will only accept the word "The" into the string, and will leave the rest, from the first space onward, in the keyboard buffer (note that &c_str is not necessary when passing a string to scanf, as an array name already acts as an address).

fgets, with the same input, on the other hand...

fgets(c_str, 15, stdin);

...will accept the whole string: All 14 characters, including spaces, plus the null terminator. Ideal.
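One detail worth knowing before relying on fgets: when the whole line fits in the buffer, fgets also stores the newline produced by the Return key as the last character of the string. Below is a minimal sketch of reading a line and trimming that newline. The name read_trimmed_line is a hypothetical helper of my own, not something from stdio.h, and it takes the stream as a parameter (pass stdin for the keyboard).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of stdio.h): reads one line with fgets
   and removes the trailing newline, if fgets stored one.
   Returns buffer on success, NULL if nothing could be read. */
char *read_trimmed_line(char *buffer, int size, FILE *stream)
{
    if (fgets(buffer, size, stream) == NULL)
        return NULL;                      /* end of input or read error */

    /* strcspn gives the index of the first '\n', or of the terminator
       when there is none, so this assignment is safe in both cases. */
    buffer[strcspn(buffer, "\n")] = '\0';
    return buffer;
}
```

A call would look like read_trimmed_line(c_str, sizeof c_str, stdin). Any characters that did not fit are still left in the keyboard buffer.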

But there is, indeed, a potential problem. It is not always realized that when the user types in a line of text for input, a queue of characters is actually being made in the keyboard buffer. If you type in 20 characters and hit Return, yes, 14 of them get sucked into the string, but the last six stay in the keyboard buffer, waiting to get used the next time a command to accept strings or characters is invoked. This causes some pretty annoying problems, as part of the last input becomes the beginning of the next input, whether you wanted it or not. Try inputting "Barcelona is an historical city" into this program...

#include <stdio.h>

int main(int argc, char *argv[])
{
    char c_str[15];

    puts("Input a string (14 characters maximum, including spaces)");

    fgets(c_str, 15, stdin);    /* reads at most 14 characters */
    printf("%s\n", c_str);

    fgets(c_str, 15, stdin);    /* picks up whatever is still in the buffer */
    printf("%s\n", c_str);

    return 0;
}


What happened? You never got to do the second input. It assigned the 14 characters to the string (and invisibly added the terminator as fgets should)...

"Barcelona is a"

...then without waiting for another input from the user for the second fgets, grabbed as much as it could of the rest of the original input still in the keyboard buffer...

"n historical c"

...and even then, still left the characters "ity" in the keyboard buffer. If this is undesired behavior for your program, then clearly, that keyboard buffer needs to be flushed before the next input.

There is often a recommendation to use fflush(stdin)...

#include <stdio.h>

int main(int argc, char *argv[])
{
    char c_str[15];

    puts("Input strings (14 characters maximum, including spaces)");

    fgets(c_str, 15, stdin);
    printf("%s\n", c_str);

    fflush(stdin);    /* not portable: undefined for input streams */

    fgets(c_str, 15, stdin);
    printf("%s\n", c_str);

    fflush(stdin);

    return 0;
}

This may or may not work, depending on the platform and C library (it works with some Windows C runtimes). However, fflush is not designed for the purpose of flushing input streams: the C standard defines it only for output streams and leaves its effect on an input stream undefined. A better, universal solution is required for flushing the keyboard buffer of excess characters not taken by fgets. This is it...

do{}while(getchar() != '\n');

The loop, with getchar(), will keep reading and discarding characters from the keyboard buffer until it finds the newline at the end, and will discard that too. The shorter form that is sometimes recommended...

while(getchar() != '\n');

...behaves identically. In both forms, getchar() has already removed the '\n' from the buffer by the time the comparison fails, so neither leaves it behind. I will stick with the do{}while format here. One caveat applies to both: if the input ends (EOF) before a newline arrives, the loop never terminates, so robust code should test for EOF as well. Here is the program done properly...

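#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char c_str[15];

    puts("Input strings (14 characters maximum, including spaces)");
    fgets(c_str, 15, stdin);
    printf("%s\n", c_str);

    /* no '\n' in the string means part of the line is still in the buffer */
    if(strchr(c_str, '\n') == NULL)
        do{}while(getchar() != '\n');

    fgets(c_str, 15, stdin);
    printf("%s\n", c_str);

    if(strchr(c_str, '\n') == NULL)
        do{}while(getchar() != '\n');

    return 0;
}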
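Two edge cases are worth guarding when discarding leftover input. If the typed line was short enough to fit, fgets has already taken the newline out of the buffer, so an unconditional discard loop would sit waiting for a further line. And if the input ends without a newline (end of file), a loop that only tests for '\n' never stops. Below is a hedged sketch of a helper that covers both. The name discard_if_truncated is hypothetical, and the stream is a parameter so the same code works for stdin or a file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: call it right after fgets(line, size, stream).
   If line holds no '\n', the input was longer than the buffer, so the
   leftover characters (and the newline) are read and thrown away.
   Returns the number of characters discarded, not counting the newline. */
int discard_if_truncated(const char *line, FILE *stream)
{
    int discarded = 0;
    int c;                                /* int, so EOF can be told apart */

    if (strchr(line, '\n') != NULL)
        return 0;                         /* whole line was read: nothing left */

    while ((c = getc(stream)) != '\n' && c != EOF)
        discarded++;

    return discarded;
}
```

With the "Barcelona is an historical city" input from earlier, the first 14 characters land in the string and the helper throws away the remaining 17.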

That solves the keyboard buffer problem, and for all intents and purposes could be the end of this post. It will certainly do for a text adventure input function. However, there are another couple of methods of text input in C that I want to look at, for the sake of broadening our options.

One solution that I like is to read the string one character at a time, inside a loop, using getch(). Now, getch() is a version of getchar() that returns each key as soon as it is pressed, without waiting for Return. It is declared in the conio.h header, which unfortunately is only available with Windows (and DOS) compilers, but there is a similar getch() in curses.h on Linux, which would allow the same approach on that platform. Here is a screen shot of the method...

[Screenshot of the method; image not available]


Three things are important with this method:

1). If you try to type in text beyond the size of the string buffer, the loop will stop one short of the end of the buffer, assign a null terminator, and break out of the loop. That is, you were never in any risk of writing beyond the limits of the string and causing a buffer overflow.

2). If you hit Return before the end of the string buffer is reached, the program will assign the null terminator and break out of the loop.

3). Finally, if the user hits Backspace, the iterator (buffer_counter) for the string will move back a place (it is moved back two places because the loop, when it runs again, will move it forward one place). For the visual representation on the console, the putchar(' ') function will over-write the last character with a blank, then move the cursor back one place with putchar('\b'). This allows the user to edit the input effectively.
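The three points above can be sketched roughly as follows. This is my own reconstruction under stated assumptions, not the code from the screenshot. The names read_keys, next_key, scripted_keys and scripted_key are hypothetical, the key source is passed in as a function so the sketch does not depend on conio.h, and Backspace is handled by stepping the counter back once and printing "\b \b" rather than by the exact putchar sequence described above.

```c
#include <stdio.h>
#include <stddef.h>

/* Stand-in key source so the sketch runs without conio.h: it hands out
   the characters of scripted_keys one by one, then EOF. */
const char *scripted_keys = "";

int scripted_key(void)
{
    return *scripted_keys ? (unsigned char)*scripted_keys++ : EOF;
}

/* Hypothetical helper: next_key is any function that returns one key
   press at a time. Returns the number of characters stored. */
size_t read_keys(char *buffer, size_t size, int (*next_key)(void))
{
    size_t buffer_counter = 0;

    if (size == 0)
        return 0;

    /* Stop one short of the end so the terminator always fits. */
    while (buffer_counter < size - 1) {
        int key = next_key();

        if (key == '\r' || key == '\n' || key == EOF)
            break;                        /* Return pressed or input ended */

        if (key == '\b') {                /* Backspace */
            if (buffer_counter > 0) {
                buffer_counter--;
                fputs("\b \b", stdout);   /* step back, blank it, step back */
            }
            continue;
        }

        buffer[buffer_counter++] = (char)key;
        putchar(key);                     /* echo the key on the console */
    }

    buffer[buffer_counter] = '\0';
    return buffer_counter;
}
```

To try it, point scripted_keys at a string such as "GO WESX\bT\r" and call read_keys(c_str, sizeof c_str, scripted_key). On Windows, passing getch from conio.h as next_key should give the interactive behavior, although the special two-byte codes that getch produces for arrow and function keys are not handled here.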

Nonetheless, for these blog posts on Text Adventures, I will be reverting to good old scanf. It is not as bad as the publicity it gets, so...

Let us hear it for scanf!

So scanf is rubbish? Avoid it at all costs? Well, if you use it incorrectly, maybe so. But it does work, and it works very well for this application, if you get the formatting right. Text adventures normally require a two word input; a verb and a noun. Sometimes, even three words might be acceptable, if you have programmed an option for an adjective in between the verb and the noun.

GO WEST

GET BLUE CARD

...might be examples of commands you would give in a text adventure. An improperly formatted scanf, like this...

scanf("%s", c_str);

...would only pick up GO and GET from each of those inputs, because it stops reading the input at the occurrence of the first white space. And, as we have already seen, it leaves the rest of the user input in the keyboard buffer, ready to ruin your next command input attempt. So, the first thing is to force scanf to read spaces. It is done this way...

scanf("%[^\n]", c_str);

To a beginner, that already looks scary, but it is not. Let us look at another reference to the scanf function...

http://www.cplusplus.com/reference/cstdio/scanf/

The specifier [^characters], from that reference, is a negated scanset. What does this mean? It alters scanf's default behavior so that it does not stop reading the input at the first white space, but at any of the characters the programmer specifies in those brackets. The scanset is a complete conversion specifier in its own right, so it is written without a trailing s after the closing bracket. So, if you wanted scanf to read your input into a string up to the first occurrence of the letter 'p' or 'P' (remember, case sensitive), you would do this as the format specifier...

"%[^pP]"

If you want it to keep collecting characters, including white spaces, until you hit Return, then you would use the "new line" escape character...

"%[^\n]"
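To see the three formats side by side, here is a small sketch that applies them to the same text. It uses sscanf, which follows the same format rules as scanf but reads from a string, so the demo needs no typing. The function name scanset_demo and the 32 character buffers are my own assumptions, and the 31 in each format is only there to keep the demo from overflowing those buffers.

```c
#include <stdio.h>

/* Hypothetical demo: applies three formats to the same input text.
   Each output array must hold at least 32 chars. */
void scanset_demo(const char *input, char *word, char *before_p, char *line)
{
    word[0] = before_p[0] = line[0] = '\0';

    sscanf(input, "%31s", word);          /* stops at the first white space */
    sscanf(input, "%31[^pP]", before_p);  /* stops at the first 'p' or 'P'  */
    sscanf(input, "%31[^\n]", line);      /* stops at the end of the line   */
}
```

Given "GO UP THE STAIRS\n" as input, word receives "GO", before_p receives "GO U", and line receives "GO UP THE STAIRS".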

This is all good, but it does not (yet) protect us from a buffer overflow. Fortunately, there is a way. Specify the number of characters to collect...

"%14s"

This will only pass 14 characters to a string once you hit Return, no matter how much you type into the console. Combine both methods, like this...

scanf("%14[^\n]", c_str);

...and you can collect white spaces and avoid a buffer overflow. The rest of what was typed will stay in the keyboard buffer, and can be cleared out with the same...

do{}while(getchar() != '\n');

...that we have already looked at above. Here is a screen shot of a working program along these lines...

[Screenshot of the working program; image not available]
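A minimal sketch of how the pieces fit together, offered as one possible arrangement rather than the program from the screenshot. The name read_command is hypothetical. It uses fscanf with a stream parameter (scanf(...) is the same as fscanf(stdin, ...)), and the discard loop also tests for EOF instead of using the plain do{}while form.

```c
#include <stdio.h>

/* Hypothetical helper: reads one command of up to 14 characters (spaces
   included) from stream into command, which must hold at least 15 chars.
   Returns 1 if text was read, 0 for an empty line, EOF at end of input. */
int read_command(FILE *stream, char command[15])
{
    int c;
    int result = fscanf(stream, "%14[^\n]", command);

    if (result != 1)
        command[0] = '\0';                /* nothing stored: make it empty */

    /* Discard whatever is left on the line, including the newline. */
    while ((c = getc(stream)) != '\n' && c != EOF)
        ;

    return result;
}
```

Called as read_command(stdin, c_str), an over-long entry such as "GET BLUE CARD FROM TABLE" is cut to its first 14 characters, and the next call starts cleanly on the next line typed.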


And that is what I will be using for text input for the rest of this text adventure section of the blog.

All the best!

Notes:

Just in case it was not clear in the example above, it is best to use the scanf() width parameter one less than the size of your character buffer (string length). This ensures that there will be space for the null terminator at the end. Like this...

char my_string[50];
scanf("%49[^\n]", my_string);


Also, make use of scanf()'s return integer. This can help to catch situations where no characters were entered (for example, the user presses return accidentally, without having entered a command). If the string was entered, it will return 1. If nothing was entered, it will return 0, or EOF if the input ended before anything could be read.


int scanf_check = scanf("%49[^\n]", my_string);

if(scanf_check != 1)    /* 0: empty line, EOF: no more input */
{
    printf("No input\n");
}
else
{
    printf("%s\n", my_string);
}

Useful, for an error-tight text adventure.


