Google's Python Course wordcount.py


I am taking Google's Python course, which uses Python 2.7. I am running 3.5.2.

The script works. It is one of the exercises.

    #!/usr/bin/python -tt
    # Copyright 2010 Google Inc.
    # Licensed under the Apache License, Version 2.0
    # http://www.apache.org/licenses/LICENSE-2.0

    # Google's Python Class
    # http://code.google.com/edu/languages/google-python-class/

    """Wordcount exercise
    Google's Python class

    The main() below is already defined and complete. It calls print_words()
    and print_top() functions which you write.

    1. For the --count flag, implement a print_words(filename) function that counts
    how often each word appears in the text and prints:
    word1 count1
    word2 count2
    ...

    Print the above list in order sorted by word (python will sort punctuation to
    come before letters -- that's fine). Store all the words as lowercase,
    so 'The' and 'the' count as the same word.

    2. For the --topcount flag, implement a print_top(filename) which is similar
    to print_words() but which prints just the top 20 most common words sorted
    so the most common word is first, then the next most common, and so on.

    Use str.split() (no arguments) to split on all whitespace.

    Workflow: don't build the whole program at once. Get it to an intermediate
    milestone and print your data structure and sys.exit(0).
    When that's working, try for the next milestone.

    Optional: define a helper function to avoid code duplication inside
    print_words() and print_top().

    """

    import sys

    # +++your code here+++
    # Define print_words(filename) and print_top(filename) functions.
    # You could write a helper utility function that reads a file
    # and builds and returns a word/count dict for it.
    # Then print_words() and print_top() can just call the utility function.

    ###

    def word_count_dict(filename):
      """Returns a word/count dict for this filename."""
      # Utility used by count() and topcount().
      word_count={}  # map each word to its count
      input_file=open(filename, 'r')
      for line in input_file:
        words=line.split()
        for word in words:
          word=word.lower()
          # Special case if we're seeing this word for the first time.
          if not word in word_count:
            word_count[word]=1
          else:
            word_count[word]=word_count[word] + 1
      input_file.close()  # Not strictly required, but good form.
      return word_count

    def print_words(filename):
      """Prints one per line '<word> <count>' sorted by word for the given file."""
      word_count=word_count_dict(filename)
      words=sorted(word_count.keys())
      for word in words:
        print(word, word_count[word])

    def get_count(word_count_tuple):
      """Returns the count from a dict word/count tuple -- used for custom sort."""
      return word_count_tuple[1]

    def print_top(filename):
      """Prints the top count listing for the given file."""
      word_count=word_count_dict(filename)

      # Each item is a (word, count) tuple.
      # Sort them so the big counts are first using key=get_count() to extract count.
      items=sorted(word_count.items(), key=get_count, reverse=True)

      # Print the first 20
      for item in items[:20]:
        print(item[0], item[1])

    # This basic command line argument parsing code is provided and
    # calls the print_words() and print_top() functions which you must define.
    def main():
      if len(sys.argv) != 3:
        print('usage: ./wordcount.py {--count | --topcount} file')
        sys.exit(1)

      option = sys.argv[1]
      filename = sys.argv[2]
      if option == '--count':
        print_words(filename)
      elif option == '--topcount':
        print_top(filename)
      else:
        print('unknown option: ' + option)
        sys.exit(1)

    if __name__ == '__main__':
      main()

Here are my questions that the course is not answering:

  1. Where it says the following, I am unsure what the 1 and the + 1 mean. Does it mean that if the word is not in the list, it gets added to the list (word_count[word]=1)? Also, I don't understand what each part means where it says word_count[word]=word_count[word] + 1.

      if not word in word_count:
        word_count[word]=1
      else:
        word_count[word]=word_count[word] + 1

  2. Where it says word_count.keys(), I am not sure what that does, other than that it calls the keys in the dictionary that we defined and loaded keys and values into. I just want to understand why word_count.keys() is there.

      words=sorted(word_count.keys()) 
  3. word_count is redefined in a couple of locations, and I would like to know why that is done instead of creating a new variable name such as word_count1.

      word_count={}
      word_count=word_count_dict(filename)
      ...and in the places outlined in the 1st question.

  4. Does if len(sys.argv) != 3: mean that the number of arguments is not 3, or that the number of characters is not 3 (e.g. sys.argv[1], sys.argv[2], sys.argv[3])?

Thank you for your help!

  1. If the word is not in the dictionary, we create a new entry in the dictionary for it and set its value to the number 1, since so far we've found 1 occurrence of the word. Otherwise, we retrieve the old value from the dictionary, use + 1 to add 1 to that value, and put the result back in the dictionary entry by assigning to word_count[word]. The else branch could also be written as:

    word_count[word] += 1 
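  2. word_count.keys() returns the keys of the word_count dictionary (a list in Python 2, a view object in Python 3). It is being used so that the contents of the dictionary can be printed in alphabetical order, using sorted(). If we just printed the dictionary the way it is, the words would come out in whatever order the dictionary stores them (arbitrary in Python 3.5, insertion order from Python 3.7 on), not in alphabetical order.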

  3. The variable is not being redefined. Variables are local to each function, so each word_count is a different variable. They just happen to use the same name in each function, because it's a good name for what the variable contains.

  4. List indexes start at 0, so if len(sys.argv) != 3 checks that you have argv[0], argv[1], and argv[2]. argv[0] contains the script name, so this is checking that you gave 2 arguments to the script. The first argument must be either --count or --topcount, and the second argument must be the filename to count words in.
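
The four points above can be tried out in a small standalone sketch. It is not part of the course code: the function names count_words, format_sorted and has_two_arguments, the sample text, and the file name small.txt are all made up for illustration, and the functions take a string or a list instead of reading a real file or the real sys.argv.

```python
def count_words(text):
    """Return a word -> count dict for the given text (hypothetical helper)."""
    word_count = {}  # local to count_words
    for word in text.lower().split():
        if word not in word_count:
            word_count[word] = 1   # first time we see this word
        else:
            word_count[word] += 1  # same as word_count[word] = word_count[word] + 1
    return word_count


def format_sorted(text):
    """Return '<word> <count>' strings sorted alphabetically by word."""
    # A different local variable that happens to share the name word_count.
    word_count = count_words(text)
    # sorted() takes the keys (list or view) and returns a new sorted list.
    return ['{} {}'.format(word, word_count[word])
            for word in sorted(word_count.keys())]


def has_two_arguments(argv):
    """True when argv holds the script name plus exactly two arguments."""
    return len(argv) == 3


if __name__ == '__main__':
    print(count_words('the cat The dog'))
    for line in format_sorted('the cat The dog'):
        print(line)
    # argv[0] is the script name, so 3 items means 2 real arguments.
    print(has_two_arguments(['wordcount.py', '--count', 'small.txt']))
```

Passing sorted() the dictionary itself, as in sorted(word_count), gives the same result, because iterating over a dictionary yields its keys.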

