What confuses me:
Let's begin with a snippet from page 104 of The C Programming Language (Second Edition):
From this snippet we can see that C code like the following is not allowed (the book says the result of modifying the string contents is undefined):
char *pmessage = "now is the time";
printf("%s\n",pmessage);
pmessage[0] = 'p';
printf("%s\n",pmessage);
This is because "now is the time" is a string constant.
But the C compiler on my Windows machine is totally fine with it:
That confused me.
Then I tried the same code on Ubuntu:
The code compiled successfully, but something went wrong when I ran the compiled program. It said: Segmentation fault (core dumped)
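The result on Windows doesn't correspond with what the book says. At first this seemed to indicate that Microsoft's C compiler doesn't conform to the ANSI C standard here, although strictly speaking the standard only makes modifying a string literal undefined behavior, so a program that appears to work is permitted too. My guess is that Microsoft's C compiler doesn't make pmessage point to a constant string. With that, my confusion was relieved.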
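One way to sidestep the difference between the two platforms is to never write through a pointer to a string literal, and to work on a copy instead. The sketch below is only an illustration: the helper name modifiable_copy is made up for this post and is not from the book.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name invented for this example): returns a heap
   copy of s that is safe to modify. The caller must free() the result.
   Returns NULL if the allocation fails. */
char *modifiable_copy(const char *s)
{
    size_t n = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

Writing copy[0] = 'p' on the pointer returned by modifiable_copy("now is the time") changes only the heap copy, so the literal itself is never touched.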
What is the difference between an array and a pointer?
To figure out what "Segmentation fault" means, I found an answer on Stack Overflow, which also illustrates the difference between arrays and pointers very well:
There is nothing inherently wrong with using pointers as arrays, unless those pointers point to constant data (and string literals are constant data). Although semantically incorrect, in the old days of no memory protection,
pmessage[0] = 'n';
would have actually worked with unpredictable results (e.g. affecting all occurrences of the same literal within the program). On modern operating systems this could not happen because of the memory protection in place. String literals and other constants are put in so-called read-only sections of the executable and when the executable is loaded in memory in order to create a process, the memory pages that contain the read-only sections are made to be read-only, i.e. any attempt to change their content leads to a segmentation fault.
char amessage[] = "now is the time";
is really syntactic sugar for the following:
char amessage[] = { 'n','o','w',' ','i','s',' ','t', 'h','e',' ','t','i','m','e','\0' };
i.e. it creates an array of 16 characters and initialises its content with the string "now is the time" (together with the null terminator).
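The equivalence of the two initialiser forms can be checked directly. This is a small sketch of my own, not part of the quoted answer; the function name initializers_match is an assumption made up for the illustration.

```c
#include <string.h>

/* Returns 1 if the string-literal form and the character-list form
   produce identical 16-byte arrays, 0 otherwise. */
int initializers_match(void)
{
    char from_literal[] = "now is the time";
    char from_chars[] = { 'n','o','w',' ','i','s',' ','t',
                          'h','e',' ','t','i','m','e','\0' };

    return sizeof from_literal == 16                    /* 15 chars + '\0' */
        && sizeof from_chars == sizeof from_literal
        && memcmp(from_literal, from_chars, sizeof from_literal) == 0;
}
```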
On the other hand
char *pmessage = "now is the time";
puts the same string data somewhere in the read-only data and assigns its address to the pointer pmessage. It works similarly to this:
// This one is in the global scope so the array is not on the stack
// (the 'const' was added by Vincent Ge; it is not in the original version of this answer)
const char _some_unique_name[] = "now is the time";
// the cast is needed only because of the added 'const'
char *pmessage = (char *)_some_unique_name;
_some_unique_name is chosen so as to not clash with any other identifier in your program. Usually symbols that are not permitted by the C language, but are ok for the assembler and the linker, are used (e.g. dots like in string.1634).
You can change the value of a pointer - this will make it point to something else, e.g. to another string. But you cannot change the address behind the name of an array, i.e. amessage will always refer to the same array storage that was allocated for it in the first place.
You can refer to individual elements of each string using amessage[i] or pmessage[i], but you can only assign to the elements of amessage as they are located in the read-write memory.
Reference: http://stackoverflow.com/questions/11691324/forbiddens-in-string-literals-in-c
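The rules from the answer can be condensed into one small sketch. It is my own illustration under the assumption that pmessage is declared as const char *, so the compiler rejects writes through it; the function name array_vs_pointer_demo is made up.

```c
#include <string.h>

/* Returns 1 if the array and the pointer behave as described above. */
int array_vs_pointer_demo(void)
{
    char amessage[] = "now is the time";       /* writable 16-byte copy  */
    const char *pmessage = "now is the time";  /* points at the literal  */

    amessage[0] = 'N';            /* fine: the array storage belongs to us */
    pmessage = "another string";  /* fine: the pointer now points elsewhere */

    /* amessage = "x";      would not compile: an array is not assignable  */
    /* pmessage[0] = 'A';   would not compile: the pointee is const        */

    return strcmp(amessage, "Now is the time") == 0
        && strcmp(pmessage, "another string") == 0;
}
```

Declaring the pointer as const char * turns the run-time surprise into a compile-time error, which is the same on Windows and Ubuntu.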
Experiments afterwards:
Although semantically incorrect, in the old days of no memory protection,
pmessage[0] = 'n';
would have actually worked with unpredictable results (e.g. affecting all occurrences of the same literal within the program).
String literals and other constants are put in so-called read-only sections of the executable.
These two sentences indicate that all occurrences of the same string literal refer to the same memory address, and that this address is far away from the addresses of local variables. This is my assumption 1.
Based on my guess that Microsoft's C compiler doesn't make pmessage point to a constant string, I guess that pmessage and amessage will have nearby memory addresses on Windows. This is my assumption 2.
To confirm my assumptions, I wrote the following code:
char amessage1[] = "array:now is the time";
char amessage2[] = "array:now is the time";
char *pmessage1 = "pointer:now is the time";
char *pmessage2 = "pointer:now is the time";
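/* %p with a (void *) cast is the portable way to print an address */
printf("%s Address:%p\n",amessage1,(void *)amessage1);
printf("%s Address:%p\n",amessage2,(void *)amessage2);
printf("%s Address:%p\n",pmessage1,(void *)pmessage1);
printf("%s Address:%p\n",pmessage2,(void *)pmessage2);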
Result in Windows:
Result in Ubuntu:
Assumption 1 is confirmed, but assumption 2 is not.
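The same experiment can also be written as checks instead of reading addresses by eye. This is a sketch of my own with made-up function names. Note that whether identical literals share one address is left to the compiler, so the second function may return either 0 or 1 depending on the platform and options.

```c
#include <stdio.h>

/* Two arrays initialised from the same literal are always separate objects. */
int arrays_are_distinct(void)
{
    char a1[] = "array:now is the time";
    char a2[] = "array:now is the time";

    return &a1[0] != &a2[0];
}

/* Returns 1 if the compiler merged the two identical literals, 0 if not.
   Either answer is allowed. */
int literals_share_address(void)
{
    const char *p1 = "pointer:now is the time";
    const char *p2 = "pointer:now is the time";

    return p1 == p2;
}

/* Prints one array address and one literal address for comparison. */
void print_addresses(void)
{
    char amessage[] = "array:now is the time";
    const char *pmessage = "pointer:now is the time";

    printf("%s Address:%p\n", amessage, (void *)amessage);
    printf("%s Address:%p\n", pmessage, (void *)pmessage);
}
```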