Bingel [31]
2 years ago
8

When an author produces an index for his or her book, the first step is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index.

The main object in this problem is a "word" with an associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers, where the markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all disallowed characters before reading the string, we should have exactly what we want. Ignoring words of fewer than three letters removes from consideration words such as "a", "is", "to", "do", and "by" that do not belong in an index.

In this project, you are asked to write a program that reads any text file and then lists all the "words" in alphabetical order, together with the number of times each appears in the text. A "word" is as defined above and has at least three letters.

Computers and Technology
1 answer:
Igoryamba
2 years ago
7 0

Answer:

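import string

# Maps each word to the number of times it occurs
dic = {}
book=open("book.txt","r")

# Iterate over each line in the book
for line in book.readlines():
    tex = line
    tex = tex.lower()
    # Delete every ASCII punctuation character from the line
    tex=tex.translate(str.maketrans('', '', string.punctuation))
    new = tex.split()
    for word in new:
        # Only words of at least three characters are counted
        if len(word) > 2:
            if word not in dic.keys():
                dic[word] = 1
            else:
                dic[word] = dic[word] + 1

# Print the words in alphabetical order with their counts
for word in sorted(dic):
    print(word, dic[word], '\n')

book.close()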

Explanation:

The code above is written in Python 3.

import string

First, import the modules you need. The string module is imported for its string.punctuation constant, which lists the ASCII punctuation characters that are stripped from each line.

dic = {}
book=open("book.txt","r")

# Iterate over each line in the book
for line in book.readlines():
    tex = line
    tex = tex.lower()
    tex=tex.translate(str.maketrans('', '', string.punctuation))
    new = tex.split()

An empty dictionary is created to store each word together with its number of occurrences: the word is the key and the count is the value, in a word : occurrences format.

Next, the file to read from is opened and the code iterates over each line. The line is converted to lowercase, punctuation characters are removed from it, and it is split into a list of words that can be iterated over.
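To see what this preprocessing does to a single line, here is a small sketch. The sample sentence and the helper names strip_punctuation_words and split_on_non_alnum are made up for illustration. Note that deleting punctuation joins hyphenated words together, while the question treats every non-alphanumeric character as a marker between words; the second helper assumes a regular expression over ASCII letters and digits is an acceptable way to follow that definition.

```python
import re
import string

def strip_punctuation_words(line):
    """Tokenize as in the answer: lowercase, delete punctuation, split on whitespace."""
    cleaned = line.lower().translate(str.maketrans('', '', string.punctuation))
    return cleaned.split()

def split_on_non_alnum(line):
    """Alternative (assumption): every non-alphanumeric ASCII character ends a word."""
    return re.findall(r"[a-z0-9]+", line.lower())

sample = "A well-known index, isn't it?"
print(strip_punctuation_words(sample))  # ['a', 'wellknown', 'index', 'isnt', 'it']
print(split_on_non_alnum(sample))       # ['a', 'well', 'known', 'index', 'isn', 't', 'it']
```

Either way, the length check that follows still drops the short fragments such as "a", "t" and "it".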

    for word in new:
        if len(word) > 2:
            if word not in dic.keys():
                dic[word] = 1
            else:
                dic[word] = dic[word] + 1

For every word in the new list, if the word is longer than two characters and is not already in the dictionary, it is added to the dictionary with a value of 1.

If the word is already in the dictionary, its value is increased by 1.
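The same counting step can be written more compactly with collections.Counter from the standard library. This is only a sketch of an alternative, not the answer's own code; the function name count_words, its min_length parameter and the sample list are hypothetical.

```python
from collections import Counter

def count_words(words, min_length=3):
    """Count occurrences of each word that has at least min_length characters."""
    return Counter(word for word in words if len(word) >= min_length)

words = ["the", "index", "of", "the", "book", "is", "an", "index"]
counts = count_words(words)

# Print the words alphabetically with their counts
for word in sorted(counts):
    print(word, counts[word])
```

Counter is a dictionary subclass, so sorted(counts) and counts[word] work exactly as they do with a plain dictionary.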

for word in sorted(dic):
    print(word, dic[word], '\n')

book.close()

The dictionary keys (the words) are sorted alphabetically and each word is printed with its count. Because '\n' is passed as an extra argument to print, a blank line follows each entry. Finally, the file is closed.

Check the attachment to see the code in action.

You might be interested in
Prove that any amount of postage greater than or equal to 64 cents can be obtained using only 5-cent and 17-cent stamps?
elixir [45]
Let P(n) be the statement "postage of n cents can be formed using 5-cent and 17-cent stamps", for n greater than 63.

Basis step: P(64) is true, since 64 cents of postage can be formed with six 5-cent stamps and two 17-cent stamps (30 + 34 = 64).

Inductive step: Assume that P(n) is true for some n ≥ 64, that is, postage of n cents can be formed using 5-cent and 17-cent stamps. We will show how to form postage of n + 1 cents. If the n-cent postage includes at least two 17-cent stamps, replace two of them with seven 5-cent stamps (35 - 34 = 1) to obtain n + 1 cents postage. Otherwise at most one 17-cent stamp was used, so the 5-cent stamps account for at least n - 17 ≥ 47 cents, which means there are at least ten of them. Remove ten of these 5-cent stamps and replace them with three 17-cent stamps (51 - 50 = 1) to obtain n + 1 cents postage. Hence P(n + 1) is true.
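A brute-force check is not a proof, but it is a quick way to sanity-check the claim. The sketch below searches for a combination of stamps for a given amount; the function name stamps_for and the upper limit of 1000 are arbitrary choices made for illustration.

```python
def stamps_for(amount):
    """Return (fives, seventeens) with 5*fives + 17*seventeens == amount, or None."""
    for seventeens in range(amount // 17 + 1):
        remainder = amount - 17 * seventeens
        if remainder % 5 == 0:
            return remainder // 5, seventeens
    return None

print(stamps_for(64))  # (6, 2)
print(stamps_for(63))  # None

# Every amount from 64 up to the arbitrary limit has a combination
assert all(stamps_for(n) is not None for n in range(64, 1001))
```

The result for 63 shows why the statement starts at 64 cents.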
6 0
2 years ago
Which of the following option is correct about HCatalog?
adell [148]

Answer:

Option (3) is the correct answer to this question.

Explanation:

  • HCatalog makes Hive metadata available to users of other Hadoop applications, such as Pig and MapReduce, as well as Hive. It offers interfaces for MapReduce and Pig so that users can read data from and write data to the Hive warehouse.
  • This means users don't have to care about where or in what format their data is stored. In this way, HCatalog makes sure our data is secure.
  • The other options do not describe HCatalog, so they are incorrect.

8 0
2 years ago
Write a program for determining if a year is a leap year. In the Gregorian calendar system you can check if it is a leaper if it
SOVA2 [1]

Answer:

def leap_year_check(year):
    return int(year) % 4 == 0 and (int(year) % 100 != 0 or int(year) % 400 == 0)

Explanation:

The function is named leap_year_check and takes one argument, the year that we wish to test for being a leap year.

int ensures the argument is read as an integer and not a float.

The % operator returns the remainder of a division. A remainder of 0 means the number is divisible by the divisor; any other remainder means it is not.

If the conditions are met, that is, the first condition is true and either the second or the third condition is true, the function leap_year_check returns the boolean True; otherwise it returns False.
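As a quick check of the rule, the sketch below runs the function on a few sample years and compares it with calendar.isleap from the standard library. The sample years and the range used in the comparison are arbitrary choices for illustration.

```python
import calendar

def leap_year_check(year):
    """Return True if year is a leap year in the Gregorian calendar."""
    year = int(year)
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

for year in (1900, 2000, 2023, 2024):
    print(year, leap_year_check(year))

# Cross-check against the standard library over an arbitrary range of years
assert all(leap_year_check(y) == calendar.isleap(y) for y in range(1583, 3000))
```

1900 is divisible by 100 but not by 400, so it is not a leap year, while 2000 is.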

8 0
1 year ago
Any software or program that comes in many forms and is designed to disrupt the normal operation of a computer by allowing an un
Vanyuwa [196]
A Trojan horse. Trojans arrive disguised as legitimate software.
8 0
2 years ago
Write a program that creates a dictionary containing the U.S. states as keys and their capitals as values. (Use the Internet to
Dima020 [189]

Answer:

  1. import random  
  2. states = {
  3.    "Alabama": "Montgomery",
  4.    "California": "Sacramento",
  5.    "Florida": "Tallahassee",
  6.    "Hawaii": "Honolulu",
  7.    "Indiana": "Indianapolis",
  8.    "Michigan": "Lansing",
  9.    "New York": "Albany",
  10.    "Texas" : "Austin",
  11.    "Utah" : "Salt Lake City",
  12.    "Wisconsin": "Madison"
  13. }
  14. correct = 0
  15. wrong = 0
  16. round = 1
  17. while(round <= 5):
  18.    current_state = random.choice(list(states))
  19.    answer = input("What is the capital of " + current_state + ": ")
  20.    
  21.    if(answer == states[current_state]):
  22.        correct += 1
  23.    else:
  24.        wrong += 1
  25.    
  26.    round += 1
  27. print("Correct answer: " + str(correct))
  28. print("Wrong answer: " + str(wrong))

Explanation:

The solution code is written in Python 3.

Lines 2 - 13

Create a dictionary of US states, with each state's capital as its value. Please note that only ten sample states are included here.

Lines 14 - 16

Create variables to track the number of correct and incorrect responses, plus a round counter.

Lines 17 - 26

Set the while condition so that the user plays the quiz for five questions, use random.choice to pick a state at random from the dictionary, and prompt the user to enter the capital of the selected state.

If the answer matches the capital of the selected state, the correct counter is incremented by one. Otherwise the wrong counter is incremented by one. The round counter is incremented by one before proceeding to the next round.

Lines 27 - 28

Print the number of correct responses and wrong responses.
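The quiz loop is easier to try out when the prompt is passed in as a parameter, so it can be run without typing answers. The sketch below is one possible variation, not the answer's own code: the function name run_quiz, its parameters ask and rng, and the two-state sample dictionary are hypothetical, and the comparison ignores case and surrounding spaces, which the original does not.

```python
import random

def run_quiz(states, rounds=5, ask=input, rng=random):
    """Ask `rounds` capital questions and return (correct, wrong)."""
    correct = 0
    wrong = 0
    for _ in range(rounds):
        state = rng.choice(list(states))
        answer = ask("What is the capital of " + state + ": ")
        # Assumption: accept answers regardless of case or surrounding spaces
        if answer.strip().lower() == states[state].lower():
            correct += 1
        else:
            wrong += 1
    return correct, wrong

sample_states = {"Texas": "Austin", "Utah": "Salt Lake City"}

# Simulated player who always answers "austin"
print(run_quiz(sample_states, rounds=4, ask=lambda prompt: "austin", rng=random.Random(0)))
```

Calling run_quiz(sample_states) with the defaults prompts at the keyboard as in the original program.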

7 0
2 years ago