Here is my original code:
#include <stdio.h>

int main(void)
{
    puts("Please enter a number.");
    int number = 0;
    scanf("%d", &number);
    puts("Please enter a string.");
    char sentence[number];      /* variable-length array sized by the user */
    scanf("%s", sentence);      /* no & needed: the array decays to char * */
    printf("%s", sentence);
}
It seems to work when I try it, but everyone says that scanf() isn't really secure, so I chose to replace it with fgets():
#include <stdio.h>

int main(void)
{
    puts("Please enter a number.");
    int number = 0;
    scanf("%d", &number);
    puts("Please enter a string.");
    char sentence[number];
    fgets(sentence, number, stdin);
    printf("%s", sentence);
}
But this gave me the following output:
Please enter a number.
2
Please enter a string.
The program ended without letting me enter the requested string. From my code above, fgets() should've at least allowed me to enter a single letter, right? With number set to 2, the maximum number of characters copied into sentence would be 1. Huh.
After a bit of googling, I found a satisfactory answer as to why this happens: http://cboard.cprogramming.com/c-programming/96983-why-scanf-dont-working.html
After I enter 2 in the above example, stdin should look like the following:
[2][\n][][][][][]
scanf() reads the 2 and puts it in number, without doing anything to the trailing '\n', so stdin looks like:
[\n][][][][][]
And then fgets() comes along, sees the '\n' in stdin (and assumes the user has typed something), and according to this:
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str.
So sentence should at least look like this:
[\n][]
So what do we do? Why, we remove the extra '\n' before fgets() trips over it!
#include <stdio.h>

int main(void)
{
    puts("Please enter a number.");
    int number = 0;
    scanf("%d", &number);
    int ch;                     /* fgetc() returns int so EOF can be detected */
    ch = fgetc(stdin);          /* discard the '\n' left behind by scanf */
    puts("Please enter a string.");
    char sentence[number];
    fgets(sentence, number, stdin);
    printf("%s\n", sentence);
}
So the leftover '\n' is consumed (stdin is effectively cleared, as long as nothing else was typed after the number) and we can continue with our little experiment:
Please enter a number.
2
Please enter a string.
hello
h
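Remember, fgets() reads at most number - 1 characters and then appends a '\0', so with number set to 2 the array can only hold 'h' followed by '\0'. "But where's the newline?" It was never read: there was no room left for it, and the line break in the output comes from the "\n" I added to printf(). The '\0' that fgets() appends automatically is what marks where the string ends. Let's check that it is really there: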
#include <stdio.h>

int main(void)
{
    puts("Please enter a number.");
    int number = 0;
    scanf("%d", &number);
    int ch;                     /* fgetc() returns int so EOF can be detected */
    ch = fgetc(stdin);          /* discard the '\n' left behind by scanf */
    puts("Please enter a string.");
    char sentence[number];
    fgets(sentence, number, stdin);
    printf("%s", sentence);
    if (sentence[number-1] == '\0')
    {
        puts("The string is null-terminated.");
    }
}
I get the following:
Please enter a number.
2
Please enter a string.
helllo
hThe string is null-terminated.
So, it seems fgets does that job for us. Let's test to see if scanf does the same:
//fgets(sentence, number, stdin);
scanf("%s", sentence);
printf("%s", sentence);
if (sentence[number-1] == '\0')
{
    puts("The string is null-terminated.");
}
The above code gives me:
Please enter a number.
2
Please enter a string.
helllo
helllo
So scanf("%s") does no bounds checking and allowed me to write 7 bytes (6 characters plus the terminating '\0') into a 2-byte array. What rule are we violating here?
Ari's answer at http://stackoverflow.com/questions/11455302/where-c-really-stores-a-string-if-the-char-array-that-stores-it-is-smaller-tha gives us a nice diagram to work with:
Bytes 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Before |-----name1 array-------| |--- other data-|
After Q w e r t y u i o p \0 |-er data-|
As the diagram above shows, scanf("%s") lets the input run past the end of the array and overwrite whatever sits next to it in memory. (BTW, this is called a buffer overflow.)
Let's try this example:
#include <stdio.h>

int main(void)
{
    puts("Please enter a number.");
    int number = 0;
    scanf("%d", &number);
    int ch;                     /* fgetc() returns int so EOF can be detected */
    ch = fgetc(stdin);          /* discard the '\n' left behind by scanf */
    puts("Please enter a string.");
    char sentence[number];
    int y = 1;
    //fgets(sentence, number, stdin);
    scanf("%s", sentence);      /* unbounded: may write past the end of sentence */
    printf("%s", sentence);
    // if (sentence[number-1] == '\0')
    // {
    //     puts("The string is null-terminated.");
    // }
    printf("%d", y); //Supposed to print 1
}
I get the following:
Please enter a number.
2
Please enter a string.
HELOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
Segmentation fault (core dumped)
So clearly scanf wrote past the end of sentence and replaced other data in memory, which caused this segmentation fault. Remember, I mentioned above that scanf() isn't secure. This is clearly why.
Alright, so scanf has the potential to overwrite memory. What can we do about it? We can specify the maximum number of characters that the user can input. pb2q from Stack Overflow explains it as follows:
If you must use scanf then I believe that the best that you can do is use the width specifier with something like: "%31s", as you've already mentioned, then use strlen to check the length of the input, and discard the string and report an error if the input is longer than your limit.
Or possibly skip the strlen by additionally using an %n in your format string, e.g. "%31s%n".
A format string using something like %[^\n] in place of %s simply instructs the function to continue reading until a newline, consuming other whitespace characters along the way. This is useful if you want to allow the input to include whitespace characters.
Review the docs for scanf (here's a copy of the man page).