/* As we discussed in class, your 4th assignment, due next week (Oct 6), is
   to use a linked list to gather some statistics about the words in the
   "Voyage of the Beagle" text in the file vbgle10.txt. This template3.c
   code, vbgle10.txt, and another program called printwords.c are all linked
   from the class web page.

   As shown below, your program will report the number of unique words, the
   total number of words, and the most frequently used word. The template
   code already prints this output, so long as you properly set the values
   of ...

       int unique_words, total_words;
       struct list_elem *most_frequent_elem;

   Rather than going through the more difficult process of making a sorted
   linked list, you can build your list by simply adding a new list element
   at the front of the list for each brand new word, and only increasing the
   count when a word is repeated.

   To accomplish this you will need to use the malloc() function to allocate
   memory for both the list elements and the words that connect to those
   list elements. Each list element attaches to just one distinct word and
   includes the count of the number of times that word appears and a link to
   the next list element. The end of a list is marked by a NULL next
   pointer.

   Although we've discussed realloc(), free() and calloc() in class, you
   will not need to use them in this assignment. You will want to use
   strcmp(), strcpy() and strlen() again this week, or perhaps strdup().

   Unlike the book description, you will keep things simpler if you don't
   worry about typedef or the abstract "Item" data type. Just use the
   struct list_elem directly. Also unlike the book method, you will notice
   that "head" in this program is not a list element but only a pointer to a
   list element.

   My program run looks like this...

   [awetzel@localhost tmp]$ gcc -O program3.c
   [awetzel@localhost tmp]$ time a.out < vbgle10.txt
   there are 12816 unique words out of 210333 total words
   the most frequent word is <the> which used 16930 times

   real    0m29.949s
   user    0m29.860s
   sys     0m0.090s

   You can verify the correctness of this result by using the printwords
   program, which you can get from the web page, together with the standard
   UNIX/Linux commands "wc", "sort", "uniq" and "tail" as shown below.

   [awetzel@localhost tmp]$ gcc -O printwords.c
   [awetzel@localhost tmp]$ ./a.out < vb*.txt | wc
    210333  210333 1159121
   [awetzel@localhost tmp]$ ./a.out < vb*.txt | sort | uniq | wc
     12816   12816  103916
   [awetzel@localhost tmp]$ ./a.out < vb*txt | sort | uniq -c | sort -n | tail -5
      4295 in
      5355 a
      5779 and
      9465 of
     16930 the
*/

#include <stdio.h>   /* printf(), getchar(), EOF */
#include <stdlib.h>  /* malloc() */
#include <string.h>  /* strcmp(), strcpy(), strlen() */

struct list_elem {
    char *word;              /* the one distinct word for this element */
    int count;               /* number of times the word appears */
    struct list_elem *next;  /* next element, or NULL at the end of the list */
};

struct list_elem *head; /* the head pointer is initially NULL */

int get_word(char *);
void add_to_list(char *);
void list_walk(void);

/* Your code will set the following values for final printing */
int unique_words, total_words;
struct list_elem *most_frequent_elem;

int main(void)
{
    char word_buff[100];

    while(get_word(word_buff))
        add_to_list(word_buff);

    list_walk();

    printf("there are %d unique words out of %d total words\n",
        unique_words, total_words);
    printf("the most frequent word is <%s> which used %d times\n",
        most_frequent_elem->word, most_frequent_elem->count);
    return 0;
}

void add_to_list(char *word_buff)
{
    /* ***** YOUR CODE TO SEARCH THE LIST AND COUNT REPEAT WORDS *****/

    /* ***** YOUR CODE TO MAKE NEW ELEMENTS FOR BRAND NEW WORDS *****/
}

void list_walk(void)
{
    /* ****** YOUR CODE TO WALK THE LIST AND GATHER THE STATS ***** */
}

#include <ctype.h>

/* Leave this routine EXACTLY as it stands */
int get_word(char *s)
{
    int c;

    do {
        c = getchar();
        if(c == EOF)
            return(0);
    } while(!isalpha(c) && !isdigit(c));

    do {
        if(isupper(c))
            c = tolower(c);
        *s++ = c;
        c = getchar();
    } while(isalpha(c) || isdigit(c));

    *s = 0;
    return(1);
}
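
/* A minimal sketch of the "search the list, otherwise add at the front"
   technique described in the header comment, for illustration only. It is
   not the reference solution. struct demo_node and demo_count_word() are
   hypothetical names that do not appear in the template, and the return
   codes are an assumption made so the behavior is easy to check. The
   free() call appears only on the allocation failure path. */

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for illustration; it mirrors struct list_elem. */
struct demo_node {
    char *word;
    int count;
    struct demo_node *next;
};

/* Record one occurrence of word in the list whose head pointer is *headp.
   Returns 0 if an existing element's count was increased,
           1 if a new element was added at the front,
          -1 if memory could not be allocated (list left unchanged). */
int demo_count_word(struct demo_node **headp, const char *word)
{
    struct demo_node *p;

    /* Linear search: a NULL next pointer marks the end of the list. */
    for (p = *headp; p != NULL; p = p->next) {
        if (strcmp(p->word, word) == 0) {
            p->count++;
            return 0;
        }
    }

    /* Brand new word: allocate the element and a private copy of the word,
       because the caller's buffer is reused for the next word. */
    p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->word = malloc(strlen(word) + 1);   /* +1 for the terminating '\0' */
    if (p->word == NULL) {
        free(p);
        return -1;
    }
    strcpy(p->word, word);
    p->count = 1;

    /* Link the new element in at the front of the list. */
    p->next = *headp;
    *headp = p;
    return 1;
}
```

/* Starting from a NULL head pointer, recording "the", "beagle", "the" in
   that order leaves a two element list with "beagle" (count 1) at the front
   followed by "the" (count 2). Each call scans the list linearly, so the
   cost per word grows with the number of distinct words seen so far. */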