Instantly share code, notes, and snippets.

@mGalarnyk

mGalarnyk / assignment3.md

  • Download ZIP
  • Star ( 12 ) 12 You must be signed in to star a gist
  • Fork ( 45 ) 45 You must be signed in to fork a gist
  • Embed Embed this gist in your website.
  • Share Copy sharable link for this gist.
  • Clone via HTTPS Clone using the web URL.
  • Learn more about clone URLs
  • Save mGalarnyk/21695638e94965640c35667e8683642c to your computer and use it in GitHub Desktop.

R Programming Project 3

github repo for rest of specialization: Data Science Coursera

The zip file containing the data can be downloaded here: Assignment 3 Data

Part 1 Plot the 30-day mortality rates for heart attack ( outcome.R )

programming assignment programming assignment 3 hash tables github

Part 2 Finding the best hospital in a state ( best.R )

Part 3 ranking hospitals by outcome in a state ( rankhospital.r ), part 4 ranking hospitals in all states ( rankall.r ).

@vickkiee

vickkiee commented Jul 7, 2020

Thank you for your solution

Sorry, something went wrong.

ghost commented Jul 15, 2020 • edited by ghost

The course isn't for the student's learning ability but for student's cheating ability. This course didn't teach me or explain me to find the solution to solve the assignment; in contrast, it encouraged me to google and copy other's code. I am feeling that I am in the wrong course. I am about to quit. I have no idea why JHU and instructor created irresponsible course like this.

@sushilsushil

sushilsushil commented Jul 18, 2020

Yo everyone feels the same way. I didn't like the constant saliva sound of that guy. Very poor course design too.

@BouNaj

BouNaj commented Jul 25, 2020

I share the same feeling. I was frustrated the content does in no way prepare you to do anything of the sort the assignments require.

@Chiagozie-Umeano

Chiagozie-Umeano commented Aug 4, 2020

I thought I was the only one having this difficulty in writing code and functions. Was feeling bad for choosing this course, we are not properly taught how to write these codes. I have being feeling to quit the course..

@abduljan71

abduljan71 commented Aug 7, 2020

Hey, I am with it. I am doing the same. The course itself does not help you that much for you to understand and do your own coding. I feel very bad.

@shaunkok64

shaunkok64 commented Aug 16, 2020 • edited

Thank you for your solution. I am glad that there are so many people feeling the same. As I don't have any knowledge regarding coding in general and I wanted to take this course to learn something new such as Data Science. This course could be great if designed properly but the course is not that learner friendly, in some cases even people with experience with coding may not able to cope it and this has led me thoughts to quit. This course are expecting learners to do their own coding without giving guidance to learners. WHY! There are so little guidance given to readers until I had to Google Search for answers. I need to take sometime to study it after I have passed this.

@Gareeeb7

Gareeeb7 commented Aug 27, 2020

Utha re ba ba uthle Coursera ko

Thanks for your Solution

@Fahad-98

Fahad-98 commented Aug 30, 2020

How do I submit my assignment.

@SaneClouxz

SaneClouxz commented Sep 14, 2020

How do I submit my assignment. It's a multi choice quiz which you'll solve using the functions provided in this solution.

@nfabregas

nfabregas commented Sep 25, 2020

Same here. The problem is simple: the course don't give you the tools and knowledge that someone with no experience with code needs to do the Assignments. It looks like assignments are from another course.

100% agree with D-Se:

Coursera lecture: "The drivetrain of a car looks like this" Swirl: "A car can move hurr durrr" Assignment: "build a car and test it, or else fail the course"

@anikvh

anikvh commented Sep 25, 2020

I agree with the comments above, there is a huge gap between the lectures and the assingments. Thank you so much for sharing this!

@Songsuperjing

Songsuperjing commented Sep 27, 2020

love the above comments. so true.

@Arnab-eco

Arnab-eco commented Oct 8, 2020 • edited

I totally agree with the comments. Unfortunately, The instructor thinks that everyone is a master level programmer. His presentations lack the necessary insight. His assignments are way over the mark. I understand that if someone can complete these assignments, they can learn a lot. But frankly, it's like asking someone to write a novel after teaching them how to read. The swirl exercises do teach the basics, but is absolutely hopeless as it doesn't even allow us to make the smallest changes to a code.

@benthecoder

benthecoder commented Oct 18, 2020 • edited

I'm glad many are learning this specialization even today, I was hoping if anyone can direct me to any discord servers or like reddit groups to talk about the assignments, for discussions and to learn from each other. Thanks! And yea the assignments are crazy hard, but to be fair the coursera courses can't fit everything into their videos, so i believe it's encouraged to find ways to create a solution for yourself with the abundance of resources online.

is this part wrong? return(dt[order(state, get(outcome), hospital name ) , head(.SD, num) , by = state , .SDcols = c("hospital name")])

I can't seem to get the right answers with this part. Anyone has a revision of this part of the code?

@siyangni

siyangni commented Nov 26, 2020 • edited

This course is just designed to make me feel bad. I was in the honor's college while I was a senior, now I am getting my master in sociology. Throughout my academic career so far I've never Googled anyone else's assignment. And this course makes me do this for every assignment!!!!!! By the way, I have a decent knowledge of programming where I gain from learning Python. I thought this class would be easy for me (after quickly going through the lecture videos), yet from the first assignment I began to scratch my head for an answer.

Guess what? the following courses for this specialization are just no better. At first sight I thought the instructors may have problems with their pedagogy. After going through several courses of JHU's Data Science Specialization, I highly doubt it's not just pedagogy, it is their attitude. There is no way the three instructors who should be incredibly smart people cannot find the embarrassingly obvious large gap between the course material and the assignments/quizzes in every one of their courses. And they rush through every course in this specialization. I paid for those courses though! I really want to file a complaint on Coursera.

@amingraphy

amingraphy commented Nov 28, 2020

I agree the course it useless. Instead I started learning by reading a time-consuming textbook, called "discovering statistics using R". I even think the material provided by Roger Peng are not all in the same basic level. He doesn't teach basic tools of R, but then he jumps to using multi-core computation on your computer to speed up the calculation! It is just ridiculous

@DanEscasa

DanEscasa commented Dec 30, 2020 • edited

Thanks for the hard work. I chose not to shorten the unwieldingly long column names, just used switch :

outcome <- switch(outcome, "heart attack" = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack", "heart failure" = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure", "pneumonia" = "Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia", NULL) if (is.null(outcome)) { stop(" : invalid outcome") }

<rant> Isn't there a way to preserve the indentation in a code block? </rant>

As to Roger Peng's teaching, I was looking for a functional programming-oriented approach. He still treats it like a procedural language. I should write up something on that in codementor.io

@charlenelch13

charlenelch13 commented Jan 3, 2021

Thank you so much for the sharing!! It is sooo helpful! I had zero knowledge about programming before taking this course. I feel frustrated about learning the logics from the course materials. They are so vague and not supported by daily examples. I don't know how they can assume students to know how to finish the assignments... Thank you for guiding me!!

@EmilieWaite18

EmilieWaite18 commented Mar 23, 2021

I have never been so frustrated by anything. I have had to look up every single answer to these assignments.

@anaidcandido

anaidcandido commented Jun 1, 2021

Many thanks for sharing!! I also thought I was the only one struggling with the course but I'm "glad" to see is the course itself. Now I am able to analyze and compare what I was doing. Thanks a lot!

@yeho-bt

yeho-bt commented Jun 13, 2021

Hmmmm, I am not alone feeling like this.

@jdpm93

jdpm93 commented Oct 5, 2021

I have the same problem, the gap between the lessons, swirl, and the assigments is horrible and frustrating. some excercises require the use of things never covered in the videos or swirl, I see that many have sent feedback but they are not taking action on this, also the video lessons need to be updated it's 2021 and the videos were recorded in 2015. this is insane. Thank you for sharing.

@codobene

codobene commented Dec 16, 2021

truth is, the world is not meant for everyone to be kept alive - it just needs the upper 10% of people that manage this course well. Same is true for jobs in general, stock market, etc etc get it quickly, or work overtime to compensate for your idiocy, or become social darvinism's fish food.

i blindly copy code that I find here, fail to reproduce anything, and take drugs. good night. good luck!

@Juanvelz

Juanvelz commented Mar 24, 2022

These people should learn how to teach before offering a course. On the other hand, they know very well how to discourage students.

@emakello

emakello commented Mar 27, 2022

The essence of taking an online course is to learn skills that can help you in your career or academics. While the courses are stimulating, it makes no sense to bring assignments that discourage rather than encourage learners. Some people like me had never coded before and I spend hours trying to do this thing without success. I wish there was someone who could teach coding from scratch without assuming any prior knowledge.

@cfsobral

cfsobral commented Jan 22, 2023

Thank you for help. I believe that the course already expects us to look for solutions like this one from mGalarnyk, otherwise it would not make sense to pass on an assignment of this complexity for beginners to do, because many like me would get frustrated and give up the course, as I read in some comments made here. Thank you mGalarnyk

@Doc-OmSa

Doc-OmSa commented Apr 9, 2023

Thanks mGalarnyk. I have a feeling that this course is definitely not for beginners, unlikely for intermediates as well. The assignments are too advanced. I have been struggling to understand functions. I am a beginner having taken the Google Data Analutics course previously. This is too advanced. Would likely leave. Just wanted to ask if there are better courses more suitable for someone who is a beginner and going to intermediate level? An other platform?

Princeton University COS 217:  Introduction to Programming Systems

Assignment 3:  a hash table adt.

A hash table is a collection of buckets . Each bucket is a collection of bindings .  Each binding consists of a key and a value .  A key uniquely identifies its binding; a value is data that is somehow pertinent to its key.  The mapping of bindings to buckets is determined by the hash table's hash function .  A hash function accepts a binding's key and bucket count, and returns the number of the bucket in which that binding should reside.

A collision occurs when more than one binding is mapped to the same bucket.  A good hash function distributes bindings fairly uniformly across the hash table's buckets, and thus minimizes collisions.  A bad hash function is one that generates many collisions.  With a good hash function, the retrieve/insert/delete performance of a hash table is very fast.

In principle, the user/client/customer of the Table ADT need not be concerned about its implementation; the user/client/customer is concerned only with the Table's interface.  However in practice the user/client/customer of the Table ADT must choose a Set implementation (either the one that uses arrays or the one that uses a tree) and a bucket count.  In Part 2 of the assignment your job is to analyze the efficiency of the Table ADT under various conditions, and thus provide guidance to the user/client/customer about which implementation and bucket count to choose.

Part 1:  The Table ADT

The table interface.

The Table_new function should return a new Table.  The parameter iBucketCount is the number of buckets that the Table will contain.  The parameter pfCompare is a pointer to a compare function that the Table will use to compare keys.  The function *pfCompare should return a negative integer if there is a "less than" relationship between the two given keys, a positive integer if there is a "greater than" relationship between the two given keys, and zero if the two given keys are equal.  The parameter pfHash is a pointer to a hash function that the Table will use to assign bindings to buckets.  It should be a checked runtime error for iBucketCount to be less than 1, for pfCompare to be NULL, or for pfHash to be NULL..

The TableIter Interface

The implementations.

Of course you should implement your Table ADT as a hash table.  Each bucket of a Table should be a Set, as specified in Assignment 2.  In that manner a Table should be composed of Sets.  You should use the given Set and SetIter ADTs (and not your own) .  

Part 2:  The Performance Analysis

Suppose you have sold your Table et al ADTs to customers.  Further suppose that your customers are requesting some guidance concerning efficient use of your ADTs.

Generic Analysis

Customer #1 has purchased your ADTs with the intention of using them in a variety of projects.  The customer asks you for general guidelines about which ADT to use when.  In particular, the customer asks for guidelines organized along four dimensions:

  • Table implementation :  Table/Set/array, Table/Set/tree
  • Function :  put, remove, get
  • Bucket count :  1, 100
  • Data size :  10 items, 100 items, 1000 items, 10000 items
  • We recommend that you use the Set/array ADT when...
  • We recommend that you use the Set/tree ADT when...
  • We recommend that you use the Table/Set/array ADT when...
  • We recommend that you use the Table/Set/tree ADT when...

Specific Analysis

Customer #2 has a more specific question.  The customer will be using your Table/Set/tree ADT in an application that is very similar to timetable, where the bucket count is 100 and the number of items is 10000 .  In the hope of improving efficiency, the customer is considering rewriting the hash and compare functions in assembly language.  The customer anticipates that the translation to assembly language will double the speed of those functions.  More precisely, the time spent executing those functions and their descendants will be cut in half.  The customer asks if the translation to assembly language is worth the effort. That is, how much will doubling the speed of the hash and compare functions impact the efficiency of the overall program?

Use the statistics produced by the gprof tool to formulate a one sentence response to Customer #2.  Be as specific as possible about the performance gain.  Place the response in your readme file.

  • set.h contains an interface for the Set ADT.
  • setarray.o and settree.o contain array and binary search tree (respectively) implementations of the Set ADT.  
  • testtable.c contains a minimal test program that illustrates typical usage of the Table and TableIter ADTs.  You can use it as the basis for developing your own programs to test your Table and TableIter ADTs.
  • timetable.c contains a program that you can use to further test your Table ADT, and to collect performance statistics.
  • stats.h and stats.c define an ADT that is used by the timetable program.
  • test10.txt , test100.txt , test1000.txt, and test10000.txt contain data that the timetable program can read.
  • table.h should contain the interface for the Table and TableIter ADTs.
  • table.c should contain the implementation of the Table and TableIter ADTs.
  • readme should contain your name, and any information that will help us to grade your work in the most favorable light.  In particular you should describe all known bugs in your readme file.  It should also contain the performance analyses described above.

Codeshive

  • $ 0.00 0 items

programming assignment programming assignment 3 hash tables github

Programming Assignment 3: Hash Tables and Hash Functions solved

Original price was: $35.00. $ 35.00 Current price is: $35.00.   $ 21.00

Description

Introduction In this programming assignment, you will practice implementing hash functions and hash tables and using them to solve algorithmic problems. In some cases you will just implement an algorithm from the lectures, while in others you will need to invent an algorithm to solve the given problem using hashing. Learning Outcomes Upon completing this programming assignment you will be able to: 1. Apply hashing to solve the given algorithmic problems. 2. Implement a simple phone book manager. 3. Implement a hash table using the chaining scheme. 4. Find all occurrences of a pattern in text using Rabin–Karp’s algorithm. 5. Applying hashing to solve other string processing problems. Passing Criteria: 2 out of 6 Passing this programming assignment requires passing at least 2 out of 6 programming challenges from this assignment. In turn, passing a programming challenge requires implementing a solution that passes all the tests for this problem in the grader and does so under the time and memory limits specified in the problem statement. Contents 1 Phone book 2 2 Hashing with chains 5 3 Find pattern in text 9 4 Substring equality 11 5 Longest common substring 14 6 Pattern matching with mismatches 16 7 Appendix 17 7.1 Compiler Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 7.2 Frequently Asked Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 1 1 Phone book Problem Introduction In this problem you will implement a simple phone book manager. Problem Description Task. In this task your goal is to implement a simple phone book manager. It should be able to process the following types of user’s queries: ∙ add number name. It means that the user adds a person with name name and phone number number to the phone book. If there exists a user with such number already, then your manager has to overwrite the corresponding name. ∙ del number. It means that the manager should erase a person with number number from the phone book. If there is no such person, then it should just ignore the query. ∙ find number. It means that the user looks for a person with phone number number. The manager should reply with the appropriate name, or with string “not found” (without quotes) if there is no such person in the book. Input Format. There is a single integer 𝑁 in the first line — the number of queries. It’s followed by 𝑁 lines, each of them contains one query in the format described above. Constraints. 1 ≤ 𝑁 ≤ 105 . All phone numbers consist of decimal digits, they don’t have leading zeros, and each of them has no more than 7 digits. All names are non-empty strings of latin letters, and each of them has length at most 15. It’s guaranteed that there is no person with name “not found”. Output Format. Print the result of each find query — the name corresponding to the phone number or “not found” (without quotes) if there is no person in the phone book with such phone number. Output one result per line in the same order as the find queries are given in the input. Time Limits. C: 3 sec, C++: 3 sec, Java: 6 sec, Python: 6 sec. C#: 4.5 sec, Haskell: 6 sec, JavaScript: 9 sec, Ruby: 9 sec, Scala: 9 sec. Memory Limit. 512MB. 2 Sample 1. Input: 12 add 911 police add 76213 Mom add 17239 Bob find 76213 find 910 find 911 del 910 del 911 find 911 find 76213 add 76213 daddy find 76213 Output: Mom not found police not found Mom daddy 76213 is Mom’s number, 910 is not a number in the phone book, 911 is the number of police, but then it was deleted from the phone book, so the second search for 911 returned “not found”. Also, note that when the daddy was added with the same phone number 76213 as Mom’s phone number, the contact’s name was rewritten, and now search for 76213 returns “daddy” instead of “Mom”. Sample 2. Input: 8 find 3839442 add 123456 me add 0 granny find 0 find 123456 del 0 del 0 find 0 Output: not found granny me not found Recall that deleting a number that doesn’t exist in the phone book doesn’t change anything. Starter Files The starter solutions for C++, Java and Python3 in this problem read the input, implement a naive algorithm to look up names by phone numbers and write the output. You need to use a fast data structure to implement 3 a better algorithm. If you use other languages, you need to implement the solution from scratch. What to Do Use the direct addressing scheme. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 4 2 Hashing with chains Problem Introduction In this problem you will implement a hash table using the chaining scheme. Chaining is one of the most popular ways of implementing hash tables in practice. The hash table you will implement can be used to implement a phone book on your phone or to store the password table of your computer or web service (but don’t forget to store hashes of passwords instead of the passwords themselves, or you will get hacked!). Problem Description Task. In this task your goal is to implement a hash table with lists chaining. You are already given the number of buckets 𝑚 and the hash function. It is a polynomial hash function 𝑕(𝑆) = ⎛ ⎝ |𝑆 ∑︁ |−1 𝑖=0 𝑆[𝑖]𝑥 𝑖 mod 𝑝 ⎞ ⎠ mod 𝑚 , where 𝑆[𝑖] is the ASCII code of the 𝑖-th symbol of 𝑆, 𝑝 = 1 000 000 007 and 𝑥 = 263. Your program should support the following kinds of queries: ∙ add string — insert string into the table. If there is already such string in the hash table, then just ignore the query. ∙ del string — remove string from the table. If there is no such string in the hash table, then just ignore the query. ∙ find string — output “yes” or “no” (without quotes) depending on whether the table contains string or not. ∙ check 𝑖 — output the content of the 𝑖-th list in the table. Use spaces to separate the elements of the list. If 𝑖-th list is empty, output a blank line. When inserting a new string into a hash chain, you must insert it in the beginning of the chain. Input Format. There is a single integer 𝑚 in the first line — the number of buckets you should have. The next line contains the number of queries 𝑁. It’s followed by 𝑁 lines, each of them contains one query in the format described above. Constraints. 1 ≤ 𝑁 ≤ 105 ; 𝑁 5 ≤ 𝑚 ≤ 𝑁. All the strings consist of latin letters. Each of them is non-empty and has length at most 15. Output Format. Print the result of each of the find and check queries, one result per line, in the same order as these queries are given in the input. Time Limits. C: 1 sec, C++: 1 sec, Java: 5 sec, Python: 7 sec. C#: 1.5 sec, Haskell: 2 sec, JavaScript: 7 sec, Ruby: 7 sec, Scala: 7 sec. Memory Limit. 512MB. 5 Sample 1. Input: 5 12 add world add HellO check 4 find World find world del world check 4 del HellO add luck add GooD check 2 del good Output: HellO world no yes HellO GooD luck The ASCII code of ’w’ is 119, for ’o’ it is 111, for ’r’ it is 114, for ’l’ it is 108, and for ’d’ it is 100. Thus, 𝑕(“world”) = (119 + 111 × 263 + 114 × 2632 + 108 × 2633 + 100 × 2634 mod 1 000 000 007) mod 5 = 4. It turns out that the hash value of 𝐻𝑒𝑙𝑙𝑂 is also 4. Recall that we always insert in the beginning of the chain, so after adding “world” and then “HellO” in the same chain index 4, first goes “HellO” and then goes “world”. Of course, “World” is not found, and “world” is found, because the strings are case-sensitive, and the codes of ’W’ and ’w’ are different. After deleting “world”, only “HellO” is found in the chain 4. Similarly to “world” and “HellO”, after adding “luck” and “GooD” to the same chain 2, first goes “GooD” and then “luck”. Sample 2. Input: 4 8 add test add test find test del test find test find Test add Test find Test Output: yes no no yes Adding “test” twice is the same as adding “test” once, so first find returns “yes”. After del, “test” is 6 no longer in the hash table. First time find doesn’t find “Test” because it was not added before, and strings are case-sensitive in this problem. Second time “Test” can be found, because it has just been added. Sample 3. Input: 3 12 check 0 find help add help add del add add find add find del del del find del check 0 check 1 check 2 Output: no yes yes no add help Explanation: Note that you need to output a blank line when you handle an empty chain. Note that the strings stored in the hash table can coincide with the commands used to work with the hash table. Starter Files There are starter solutions only for C++, Java and Python3, and if you use other languages, you need to implement solution from scratch. Starter solutions read the input, do a full scan of the whole table to simulate each find operation and write the output. This naive simulation algorithm is too slow, so you need to implement the real hash table. What to Do Follow the explanations about the chaining scheme from the lectures. Remember to always insert new strings in the beginning of the chain. Remember to output a blank line when check operation is called on an empty chain. Some hints based on the problems encountered by learners: ∙ Beware of integer overflow. Use long long type in C++ and long type in Java where appropriate. Take everything (mod 𝑝) as soon as possible while computing something (mod 𝑝), so that the numbers are always between 0 and 𝑝 − 1. 7 ∙ Beware of taking negative numbers (mod 𝑝). In many programming languages, (−2)%5 ̸= 3%5. Thus you can compute the same hash values for two strings, but when you compare them, they appear to be different. To avoid this issue, you can use such construct in the code: 𝑥 ← ((𝑎%𝑝) + 𝑝)%𝑝 instead of just 𝑥 ← 𝑎%𝑝. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 8 3 Find pattern in text Problem Introduction In this problem, your goal is to implement the Rabin–Karp’s algorithm. Problem Description Task. In this problem your goal is to implement the Rabin–Karp’s algorithm for searching the given pattern in the given text. Input Format. There are two strings in the input: the pattern 𝑃 and the text 𝑇. Constraints. 1 ≤ |𝑃| ≤ |𝑇| ≤ 5 · 105 . The total length of all occurrences of 𝑃 in 𝑇 doesn’t exceed 108 . The pattern and the text contain only latin letters. Output Format. Print all the positions of the occurrences of 𝑃 in 𝑇 in the ascending order. Use 0-based indexing of positions in the the text 𝑇. Time Limits. C: 1 sec, C++: 1 sec, Java: 5 sec, Python: 5 sec. C#: 1.5 sec, Haskell: 2 sec, JavaScript: 3 sec, Ruby: 3 sec, Scala: 3 sec. Memory Limit. 512MB. Sample 1. Input: aba abacaba Output: 0 4 Explanation: The pattern 𝑎𝑏𝑎 can be found in positions 0 (abacaba) and 4 (abacaba) of the text 𝑎𝑏𝑎𝑐𝑎𝑏𝑎. Sample 2. Input: Test testTesttesT Output: 4 Explanation: Pattern and text are case-sensitive in this problem. Pattern 𝑇 𝑒𝑠𝑡 can only be found in position 4 in the text 𝑡𝑒𝑠𝑡𝑇 𝑒𝑠𝑡𝑡𝑒𝑠𝑇. Sample 3. Input: aaaaa baaaaaaa Output: 1 2 3 Note that the occurrences of the pattern in the text can be overlapping, and that’s ok, you still need to output all of them. 9 Starter Files The starter solutions in C++, Java and Python3 read the input, apply the naive 𝑂(|𝑇||𝑃|) algorithm to this problem and write the output. You need to implement the Rabin–Karp’s algorithm instead of the naive algorithm and thus significantly speed up the solution. If you use other languages, you need to implement a solution from scratch. What to Do Implement the fast version of the Rabin–Karp’s algorithm from the lectures. Some hints based on the problems encountered by learners: ∙ Beware of integer overflow. Use long long type in C++ and long type in Java where appropriate. Take everything (mod 𝑝) as soon as possible while computing something (mod 𝑝), so that the numbers are always between 0 and 𝑝 − 1. ∙ Beware of taking negative numbers (mod 𝑝). In many programming languages, (−2)%5 ̸= 3%5. Thus you can compute the same hash values for two strings, but when you compare them, they appear to be different. To avoid this issue, you can use such construct in the code: 𝑥 ← ((𝑎%𝑝) + 𝑝)%𝑝 instead of just 𝑥 ← 𝑎%𝑝. ∙ Use operator == in Python instead of implementing your own function AreEqual for strings, because built-in operator == will work much faster. ∙ In C++, method substr of string creates a new string, uses additional memory and time for that, so use it carefully and avoid creating lots of new strings. When you need to compare pattern with a substring of text, do it without calling substr. ∙ In Java, however, method substring does NOT create a new String. Avoid using new String where it is not needed, just use substring. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 10 4 Substring equality Problem Introduction In this problem, you will use hashing to design an algorithm that is able to preprocess a given string 𝑠 to answer any query of the form “are these two substrings of 𝑠 equal?” efficiently. This, in turn, is a basic building block in many string processing algorithms. Problem Description Input Format. The first line contains a string 𝑠 consisting of small Latin letters. The second line contains the number of queries 𝑞. Each of the next 𝑞 lines specifies a query by three integers 𝑎, 𝑏, and 𝑙. Constraints. 1 ≤ |𝑠| ≤ 500 000. 1 ≤ 𝑞 ≤ 100 000. 0 ≤ 𝑎, 𝑏 ≤ |𝑠| − 𝑙 (hence the indices 𝑎 and 𝑏 are 0-based). Output Format. For each query, output “Yes” if 𝑠𝑎𝑠𝑎+1. . .𝑠𝑎+𝑙−1 = 𝑠𝑏𝑠𝑏+1. . .𝑠𝑏+𝑙−1 are equal, and “No” otherwise. Time Limits. C: 1 sec, C++: 1 sec, Java: 2 sec, Python: 10 sec. C#: 1.5 sec, Haskell: 2 sec, JavaScript: 5 sec, Ruby: 5 sec, Scala: 5 sec. Memory Limit. 512MB. Sample 1. Input: trololo 4 0 0 7 2 4 3 3 5 1 1 3 2 Output: Yes Yes Yes No 0 0 7 → trololo = trololo 2 4 3 → trololo = trololo 3 5 1 → trololo = trololo 1 3 2 → trololo ̸= trololo What to Do For a string 𝑡 = 𝑡0𝑡1 · · ·𝑡𝑚−1 of length 𝑚 and an integer 𝑥, define a polynomial hash function 𝐻(𝑡) = 𝑚∑︁−1 𝑗=0 𝑡𝑗𝑥 𝑚−𝑗−1 = 𝑡0𝑥 𝑚−1 + 𝑡1𝑥 𝑚−2 + · · · + 𝑡𝑚−2𝑥 + 𝑡𝑚−1 . Let 𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1 be a substring of the given string 𝑠 = 𝑠0𝑠1 · · · 𝑠𝑛−1. A nice property of the polynomial hash function 𝐻 is that 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) can be expressed through 𝐻(𝑠0𝑠1 · · · 𝑠𝑎+𝑙−1) and 11 𝐻(𝑠0𝑠1 · · · 𝑠𝑎−1), i.e., through hash values of two prefixes of 𝑠: 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) = 𝑠𝑎𝑥 𝑙−1 + 𝑠𝑎+1𝑥 𝑙−2 + · · · + 𝑠𝑎+𝑙−1 = = 𝑠0𝑥 𝑎+𝑙−1 + 𝑠1𝑥 𝑎+𝑙−2 + · · · + 𝑠𝑎+𝑙−1− − 𝑥 𝑙 (𝑠0𝑥 𝑎−1 + 𝑠1𝑥 𝑎−2 + · · · + 𝑠𝑎−1) = = 𝐻(𝑠0𝑠1 · · · 𝑠𝑎+𝑙−1) − 𝑥 𝑙𝐻(𝑠0𝑠1 · · · 𝑠𝑎−1) This leads us to the following natural idea: we precompute and store the hash values of all prefixes of 𝑠: let 𝑕[0] = 0 and, for 1 ≤ 𝑖 ≤ 𝑛, let 𝑕[𝑖] = 𝐻(𝑠0𝑠1 · · · 𝑠𝑖−1). Then, the identity above becomes 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) = 𝑕[𝑎 + 𝑙] − 𝑥 𝑙𝑕[𝑎] . In other words, we are able to get the hash value of any substring of 𝑠 in just constant time! Clearly, if 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) ̸= 𝐻(𝑠𝑏𝑠𝑏+1 · · · 𝑠𝑏+𝑙−1), then the corresponding two substrings (𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1 and 𝑠𝑏𝑠𝑏+1 · · · 𝑠𝑏+𝑙−1) are different. However, if the hash values are the same, it is still possible that the substrings are different — this is called a collision. Below, we discuss how to reduce the probability of a collision. Recall that in practice one never computes the exact value of a polynomial hash function: everything is computed modulo 𝑚 for some fixed integer 𝑚. This is done to ensure that all the computations are efficient and that the hash values are small enough. Recall also that when computing 𝐻(𝑠) mod 𝑚 it is important to take every intermediate step (rather then the final result) modulo 𝑚. It can be shown that if 𝑠1 and 𝑠2 are two different strings of length 𝑛 and 𝑚 is a prime integer, then the probability that 𝐻(𝑠1) mod 𝑚 = 𝐻(𝑠2) mod 𝑚 (over the choices of 0 ≤ 𝑥 ≤ 𝑚 − 1) is at most 𝑛 𝑚 (roughly, this is because 𝐻(𝑠1) − 𝐻(𝑠2) is a non-zero polynomial of degree at most 𝑛 − 1 and hence can have at most 𝑛 roots modulo 𝑚). To further reduce the probability of a collision, one may take two different modulus. Overall, this gives the following approach. 1. Fix 𝑚1 = 109 + 7 and 𝑚2 = 109 + 9. 2. Select a random 𝑥 from 1 to 109 . 3. Compute arrays 𝑕1[0..𝑛] and 𝑕2[0..𝑛]: 𝑕1[0] = 𝑕2[0] = 0 and, for 1 ≤ 𝑖 ≤ 𝑛, 𝑕1[𝑖] = 𝐻(𝑠0 · · · 𝑠𝑖−1) mod 𝑚1 and 𝑕2[𝑖] = 𝐻(𝑠0 · · · 𝑠𝑖−1) mod 𝑚2. We illustrate this for 𝑕1 below. allocate 𝑕1[0..𝑛] 𝑕1[0] ← 0 for 𝑖 from 1 to 𝑛: 𝑕1[𝑖] ← (𝑥 · 𝑕1[𝑖 − 1] + 𝑠𝑖) mod 𝑚1 4. For every query (𝑎, 𝑏, 𝑙): (a) Use precomputed hash values, to compute the hash values of the substrings 𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1 and 𝑠𝑏𝑠𝑏+1 · · · 𝑠𝑏+𝑙−1 modulo 𝑚1 and 𝑚2. (b) Output “Yes”, if 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) mod 𝑚1 = 𝐻(𝑠𝑏𝑠𝑏+1 · · · 𝑠𝑏+𝑙−1) mod 𝑚1 and 𝐻(𝑠𝑎𝑠𝑎+1 · · · 𝑠𝑎+𝑙−1) mod 𝑚2 = 𝐻(𝑠𝑏𝑠𝑏+1 · · · 𝑠𝑏+𝑙−1) mod 𝑚2 . (c) Otherwise, output “No”. Note that, in contrast to Karp–Rabin algorithm, we do not compare the substrings naively when their hashes coincide. The probability of this event is at most 𝑛 𝑚1 · 𝑛 𝑚2 ≤ 10−9 . (In fact, for random strings the probability is even much smaller: 10−18. In this problem, the strings are not random, but the probability of collision is still very small.) 12 Need Help? Ask a question or see the questions asked by other learners at this forum thread. 13 5 Longest common substring Problem Introduction In the longest common substring problem one is given two strings 𝑠 and 𝑡 and the goal is to find a string 𝑤 of maximal length that is a substring of both 𝑠 and 𝑡. This is a natural measure of similarity between two strings. The problem has applications in text comparison and compression as well as in bioinformatics. The problem can be seen as a special case of the edit distance problem (where only insertions and deletions are allowed). Hence, it can be solved in time 𝑂(|𝑠| · |𝑡|) using dynamic programming. Later in this specialization, we will learn highly non-trivial data structures for solving this problem in linear time 𝑂(|𝑠| + |𝑡|). In this problem, your goal is to use hashing to solve it in almost linear time. Problem Description Input Format. Every line of the input contains two strings 𝑠 and 𝑡 consisting of lower case Latin letters. Constraints. The total length of all 𝑠’s as well as the total length of all 𝑡’s does not exceed 100 000. Output Format. For each pair of strings 𝑠 and 𝑡𝑖 , find its longest common substring and specify it by outputting three integers: its starting position in 𝑠, its starting position in 𝑡 (both 0-based), and its length. More formally, output integers 0 ≤ 𝑖 < |𝑠|, 0 ≤ 𝑗 < |𝑡|, and 𝑙 ≥ 0 such that 𝑠𝑖𝑠𝑖+1 · · · 𝑠𝑖+𝑙−1 = 𝑡𝑗 𝑡𝑗+1 · · ·𝑡𝑗+𝑙−1 and 𝑙 is maximal. (As usual, if there are many such triples with maximal 𝑙, output any of them.) Time Limits. C: 2 sec, C++: 2 sec, Java: 5 sec, Python: 15 sec. C#: 3 sec, Haskell: 4 sec, JavaScript: 10 sec, Ruby: 10 sec, Scala: 10 sec. Memory Limit. 512MB. Sample 1. Input: cool toolbox aaa bb aabaa babbaab Output: 1 1 3 0 1 0 0 4 3 Explanation: The longest common substring of the first pair of strings is ool, it starts at the first position in toolbox and at the first position in cool. The strings from the second line do not share any non-empty common substrings (in this case, 𝑙 = 0 and one may output any indices 𝑖 and 𝑗). Finally, the last two strings share a substring aab that has length 3 and starts at position 0 in the first string and at position 4 in the second one. Note that for this pair of string one may output 2 3 3 as well. What to Do For every pair of strings 𝑠 and 𝑡, use binary search to find the length of the longest common substring. To check whether two strings have a common substring of length 𝑘, ∙ precompute hash values of all substrings of length 𝑘 of 𝑠 and 𝑡; ∙ make sure to use a few hash functions (but not just one) to reduce the probability of a collision; 14 ∙ store hash values of all substrings of length 𝑘 of the string 𝑠 in a hash table; then, go through all substrings of length 𝑘 of the string 𝑡 and check whether the hash value of this substring is present in the hash table. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 15 6 Pattern matching with mismatches Problem Introduction A natural generalization of the pattern matching problem is the following: find all text locations where distance from pattern is sufficiently small. This problems has applications in text searching (where mismatches correspond to typos) and bioinformatics (where mismatches correspond to mutations). Problem Description Task. For an integer parameter 𝑘 and two strings 𝑡 = 𝑡0𝑡1 · · ·𝑡𝑚−1 and 𝑝 = 𝑝0𝑝1 · · · 𝑝𝑛−1, we say that 𝑝 occurs in 𝑡 at position 𝑖 with at most 𝑘 mismatches if the strings 𝑝 and 𝑡[𝑖 : 𝑖 + 𝑝) = 𝑡𝑖𝑡𝑖+1 · · ·𝑡𝑖+𝑛−1 differ in at most 𝑘 positions. Input Format. Every line of the input contains an integer 𝑘 and two strings 𝑡 and 𝑝 consisting of lower case Latin letters. Constraints. 0 ≤ 𝑘 ≤ 5, 1 ≤ |𝑡| ≤ 200 000, 1 ≤ |𝑝| ≤ min{|𝑡|, 100 000}. The total length of all 𝑡’s does not exceed 200 000, the total length of all 𝑝’s does not exceed 100 000. Output Format. For each triple (𝑘, 𝑡, 𝑝), find all positions 0 ≤ 𝑖1 < 𝑖2 < · · · < 𝑖𝑙 < |𝑡| where 𝑝 occurs in 𝑡 with at most 𝑘 mismatches. Output 𝑙 and 𝑖1, 𝑖2, . . . , 𝑖𝑙 . Time Limits. C: 2 sec, C++: 2 sec, Java: 5 sec, Python: 40 sec. C#: 3 sec, Haskell: 4 sec, JavaScript: 10 sec, Ruby: 10 sec, Scala: 10 sec. Memory Limit. 512MB. Sample 1. Input: 0 ababab baaa 1 ababab baaa 1 xabcabc ccc 2 xabcabc ccc 3 aaa xxx Output: 0 1 1 0 4 1 2 3 4 1 0 Explanation: For the first triple, there are no exact matches. For the second triple, baaa has distance one from the pattern. For the third triple, there are no occurrences with at most one mismatch. For the fourth triple, any (length three) substring of 𝑝 containing at least one c has distance at most two from 𝑡. For the fifth triple, 𝑡 and 𝑝 differ in three positions. What to Do Start by computing hash values of prefixes of 𝑡 and 𝑝 and their partial sums. This allows you to compare any two substrings of 𝑡 and 𝑝 in expected constant time. For each candidate position 𝑖, perform 𝑘 steps of the form “find the next mismatch”. Each such mismatch can be found using binary search. Overall, this gives an algorithm running in time 𝑂(𝑛𝑘 log 𝑛). 16 Need Help? Ask a question or see the questions asked by other learners at this forum thread. 7 Appendix 7.1 Compiler Flags C (gcc 7.4.0). File extensions: .c. Flags: gcc – pipe – O2 – std = c11 < filename > – lm C++ (g++ 7.4.0). File extensions: .cc, .cpp. Flags: g ++ – pipe – O2 – std = c ++14 < filename > – lm If your C/C++ compiler does not recognize -std=c++14 flag, try replacing it with -std=c++0x flag or compiling without this flag at all (all starter solutions can be compiled without it). On Linux and MacOS, you most probably have the required compiler. On Windows, you may use your favorite compiler or install, e.g., cygwin. C# (mono 4.6.2). File extensions: .cs. Flags: mcs Go (golang 1.13.4). File extensions: .go. Flags go Haskell (ghc 8.0.2). File extensions: .hs. Flags: ghc – O2 Java (OpenJDK 1.8.0_232). File extensions: .java. Flags: javac – encoding UTF -8 java – Xmx1024m JavaScript (NodeJS 12.14.0). File extensions: .js. No flags: nodejs Kotlin (Kotlin 1.3.50). File extensions: .kt. Flags: kotlinc java – Xmx1024m Python (CPython 3.6.9). File extensions: .py. No flags: python3 Ruby (Ruby 2.5.1p57). File extensions: .rb. ruby Rust (Rust 1.37.0). File extensions: .rs. rustc Scala (Scala 2.12.10). File extensions: .scala. scalac 17 7.2 Frequently Asked Questions Why My Submission Is Not Graded? You need to create a submission and upload the source file (rather than the executable file) of your solution. Make sure that after uploading the file with your solution you press the blue “Submit” button at the bottom. After that, the grading starts, and the submission being graded is enclosed in an orange rectangle. After the testing is finished, the rectangle disappears, and the results of the testing of all problems are shown. What Are the Possible Grading Outcomes? There are only two outcomes: “pass” or “no pass.” To pass, your program must return a correct answer on all the test cases we prepared for you, and do so under the time and memory constraints specified in the problem statement. If your solution passes, you get the corresponding feedback “Good job!” and get a point for the problem. Your solution fails if it either crashes, returns an incorrect answer, works for too long, or uses too much memory for some test case. The feedback will contain the index of the first test case on which your solution failed and the total number of test cases in the system. The tests for the problem are numbered from 1 to the total number of test cases for the problem, and the program is always tested on all the tests in the order from the first test to the test with the largest number. Here are the possible outcomes: ∙ Good job! Hurrah! Your solution passed, and you get a point! ∙ Wrong answer. Your solution outputs incorrect answer for some test case. Check that you consider all the cases correctly, avoid integer overflow, output the required white spaces, output the floating point numbers with the required precision, don’t output anything in addition to what you are asked to output in the output specification of the problem statement. ∙ Time limit exceeded. Your solution worked longer than the allowed time limit for some test case. Check again the running time of your implementation. Test your program locally on the test of maximum size specified in the problem statement and check how long it works. Check that your program doesn’t wait for some input from the user which makes it to wait forever. ∙ Memory limit exceeded. Your solution used more than the allowed memory limit for some test case. Estimate the amount of memory that your program is going to use in the worst case and check that it does not exceed the memory limit. Check that your data structures fit into the memory limit. Check that you don’t create large arrays or lists or vectors consisting of empty arrays or empty strings, since those in some cases still eat up memory. Test your program locally on the tests of maximum size specified in the problem statement and look at its memory consumption in the system. ∙ Cannot check answer. Perhaps the output format is wrong. This happens when you output something different than expected. For example, when you are required to output either “Yes” or “No”, but instead output 1 or 0. Or your program has empty output. Or your program outputs not only the correct answer, but also some additional information (please follow the exact output format specified in the problem statement). Maybe your program doesn’t output anything, because it crashes. ∙ Unknown signal 6 (or 7, or 8, or 11, or some other). This happens when your program crashes. It can be because of a division by zero, accessing memory outside of the array bounds, using uninitialized variables, overly deep recursion that triggers a stack overflow, sorting with a contradictory comparator, removing elements from an empty data structure, trying to allocate too much memory, and many other reasons. Look at your code and think about all those possibilities. Make sure that you use the same compiler and the same compiler flags as we do. ∙ Internal error: exception… Most probably, you submitted a compiled program instead of a source code. 18 ∙ Grading failed. Something wrong happened with the system. Report this through Coursera or edX Help Center. May I Post My Solution at the Forum? Please do not post any solutions at the forum or anywhere on the web, even if a solution does not pass the tests (as in this case you are still revealing parts of a correct solution). Our students follow the Honor Code: “I will not make solutions to homework, quizzes, exams, projects, and other assignments available to anyone else (except to the extent an assignment explicitly permits sharing solutions).” Do I Learn by Trying to Fix My Solution? My implementation always fails in the grader, though I already tested and stress tested it a lot. Would not it be better if you gave me a solution to this problem or at least the test cases that you use? I will then be able to fix my code and will learn how to avoid making mistakes. Otherwise, I do not feel that I learn anything from solving this problem. I am just stuck. First of all, learning from your mistakes is one of the best ways to learn. The process of trying to invent new test cases that might fail your program is difficult but is often enlightening. Thinking about properties of your program makes you understand what happens inside your program and in the general algorithm you’re studying much more. Also, it is important to be able to find a bug in your implementation without knowing a test case and without having a reference solution, just like in real life. Assume that you designed an application and an annoyed user reports that it crashed. Most probably, the user will not tell you the exact sequence of operations that led to a crash. Moreover, there will be no reference application. Hence, it is important to learn how to find a bug in your implementation yourself, without a magic oracle giving you either a test case that your program fails or a reference solution. We encourage you to use programming assignments in this class as a way of practicing this important skill. If you have already tested your program on all corner cases you can imagine, constructed a set of manual test cases, applied stress testing, etc, but your program still fails, try to ask for help on the forum. We encourage you to do this by first explaining what kind of corner cases you have already considered (it may happen that by writing such a post you will realize that you missed some corner cases!), and only afterwards asking other learners to give you more ideas for tests cases. 19

Related products

programming assignment programming assignment 3 hash tables github

Programming Assignment 2: Priority Queues and Disjoint Sets solved

programming assignment programming assignment 3 hash tables github

Data Structures Homework 2 solved

Data structures programming assignment 3: hash tables and hash functions solved.

POPULAR SERVICES

C programming assignment help Computer networking assignment help Computer science homework help Database management homework help Java programming help Matlab assignment help Php assignment help Python programming assignment help SQL assignment help Html homework help

The Codes Hive believes in helping students to write clean codes that are simple to read and easy to execute.Based in New York, United States, we provide assignment help, homework help, online tutoring and project help in programming to the students and professionals across the globe.

Disclaimer : The reference papers/tutorials provided are to be considered as model papers only and are not to submitted as it is. These papers are intended to be used for research and reference purposes only.

programming assignment programming assignment 3 hash tables github

jarviscodinghub

  • $ 0.00 0 items

programming assignment programming assignment 3 hash tables github

Programming Assignment 3: Hash Tables and Hash Functions solution

$ 29.99

Description

Introduction In this programming assignment, you will practice implementing hash functions and hash tables and using them to solve algorithmic problems. In some cases you will just implement an algorithm from the lectures, while in others you will need to invent an algorithm to solve the given problem using hashing. Learning Outcomes Upon completing this programming assignment you will be able to: 1. Apply hashing to solve the given algorithmic problems. 2. Implement a simple phone book manager. 3. Implement a hash table using the chaining scheme. 4. Find all occurrences of a pattern in text using Rabin–Karp’s algorithm. Passing Criteria: 2 out of 3 Passing thisprogramming assignmentrequires passingat least2out of3code problemsfrom thisassignment. In turn, passing a code problem requires implementing a solution that passes all the tests for this problem in the grader and does so under the time and memory limits specified in the problem statement. Contents 1 Problem: Phone book 3 2 Problem: Hashing with chains 5 3 Problem: Find pattern in text 9 4 General Instructions and Recommendations on Solving Algorithmic Problems 11 4.1 Reading the Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.2 Designing an Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.3 Implementing Your Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.4 Compiling Your Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.5 Testing Your Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 4.6 Submitting Your Program to the Grading System . . . . . . . . . . . . . . . . . . . . . . . . . 13 4.7 Debugging and Stress Testing Your Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1 5 Frequently Asked Questions 14 5.1 I submit the program, but nothing happens. Why? . . . . . . . . . . . . . . . . . . . . . . . . 14 5.2 I submit the solution only for one problem, but all the problems in the assignment are graded. Why? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 5.3 What are the possible grading outcomes, and how to read them? . . . . . . . . . . . . . . . . 14 5.4 How to understand why my program fails and to fix it? . . . . . . . . . . . . . . . . . . . . . 15 5.5 Why do you hide the test on which my program fails? . . . . . . . . . . . . . . . . . . . . . . 15 5.6 My solution does not pass the tests? May I post it in the forum and ask for a help? . . . . . 16 5.7 My implementation always fails in the grader, though I already tested and stress tested it a lot. Would not it be better if you give me a solution to this problem or at least the test cases that you use? I will then be able to fix my code and will learn how to avoid making mistakes. Otherwise, I do not feel that I learn anything from solving this problem. I am just stuck. . . 16 2 1 Problem: Phone book Problem Introduction In this problem you will implement a simple phone book manager. Problem Description Task. In this task your goal is to implement a simple phone book manager. It should be able to process the following types of user’s queries: ∙ add number name. It means that the user adds a person with name name and phone number number to the phone book. If there exists a user with such number already, then your manager has to overwrite the corresponding name. ∙ del number. It means that the manager should erase a person with number number from the phone book. If there is no such person, then it should just ignore the query. ∙ find number. It means that the user looks for a person with phone number number. The manager should reply with the appropriate name, or with string “not found” (without quotes) if there is no such person in the book. Input Format. There is a single integer N in the first line — the number of queries. It’s followed by N lines, each of them contains one query in the format described above. Constraints. 1 ≤ N ≤ 105. All phone numbers consist of decimal digits, they don’t have leading zeros, and each of them has no more than 7 digits. All names are non-empty strings of latin letters, and each of them has length at most 15. It’s guaranteed that there is no person with name “not found”. Output Format. Print the result of each find query — the name corresponding to the phone number or “not found” (without quotes) if there is no person in the phone book with such phone number. Output one result per line in the same order as the find queries are given in the input. Time Limits. C: 3 sec, C++: 3 sec, Java: 6 sec, Python: 6 sec. C#: 4.5 sec, Haskell: 6 sec, JavaScript: 9 sec, Ruby: 9 sec, Scala: 9 sec. Memory Limit. 512Mb. Sample 1. Input: 12 add 911 police add 76213 Mom add 17239 Bob find 76213 find 910 find 911 del 910 del 911 find 911 find 76213 add 76213 daddy find 76213 Output: 3 Mom not found police not found Mom daddy Explanation: 76213 is Mom’s number, 910 is not a number in the phone book, 911 is the number of police, but then it was deleted from the phone book, so the second search for 911 returned “not found”. Also, note that when the daddy was added with the same phone number 76213 as Mom’s phone number, the contact’s name was rewritten, and now search for 76213 returns “daddy” instead of “Mom”. Sample 2. Input: 8 find 3839442 add 123456 me add 0 granny find 0 find 123456 del 0 del 0 find 0 Output: not found granny me not found Explanation: Recall that deleting a number that doesn’t exist in the phone book doesn’t change anything. Starter Files ThestartersolutionsforC++, JavaandPython3inthisproblemreadtheinput, implementanaivealgorithm tolookupnamesbyphonenumbersandwritetheoutput. Youneedtouseafastdatastructuretoimplement a better algorithm. If you use other languages, you need to implement the solution from scratch. What to Do Use the direct addressing scheme. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 4 2 Problem: Hashing with chains Problem Introduction In this problem you will implement a hash table using the chaining scheme. Chaining is one of the most popular ways of implementing hash tables in practice. The hash table you will implement can be used to implement a phone book on your phone or to store the password table of your computer or web service (but don’t forget to store hashes of passwords instead of the passwords themselves, or you will get hacked!). Problem Description Task. In this task your goal is to implement a hash table with lists chaining. You are already given the number of buckets m and the hash function. It is a polynomial hash function h(S) =⎛ ⎝ |S|−1 ∑︁ i=0 S[i]xi mod p⎞ ⎠mod m, where S[i] is the ASCII code of the i-th symbol of S, p = 1 000 000 007 and x = 263. Your program should support the following kinds of queries: ∙ add string — insert string into the table. If there is already such string in the hash table, then just ignore the query. ∙ del string — remove string from the table. If there is no such string in the hash table, then just ignore the query. ∙ find string — output “yes” or “no” (without quotes) depending on whether the table contains string or not. ∙ check i — output the content of the i-th list in the table. Use spaces to separate the elements of the list. If i-th list is empty, output a blank line. When inserting a new string into a hash chain, you must insert it in the beginning of the chain. Input Format. There is a single integer m in the first line — the number of buckets you should have. The next line contains the number of queries N. It’s followed by N lines, each of them contains one query in the format described above. Constraints. 1 ≤ N ≤ 105; N 5 ≤ m ≤ N. All the strings consist of latin letters. Each of them is non-emptyand has length at most 15. Output Format. Print the result of each of the find and check queries, one result per line, in the same order as these queries are given in the input. Time Limits. C: 1 sec, C++: 1 sec, Java: 5 sec, Python: 7 sec. C#: 1.5 sec, Haskell: 2 sec, JavaScript: 7 sec, Ruby: 7 sec, Scala: 7 sec. Memory Limit. 512Mb. 5 Sample 1. Input: 5 12 add world add HellO check 4 find World find world del world check 4 del HellO add luck add GooD check 2 del good Output: HellO world no yes HellO GooD luck Explanation: The ASCII code of ’w’ is 119, for ’o’ it is 111, for ’r’ it is 114, for ’l’ it is 108, and for ’d’ it is 100. Thus, h(“world”) = (119+111×263+114×2632 +108×2633 +100×2634 mod 1 000 000 007) mod 5 = 4. It turns out that the hash value of HellO is also 4. Recall that we always insert in the beginning of the chain, so after adding “world” and then “HellO” in the same chain index 4, first goes “HellO” and then goes “world”. Of course, “World” is not found, and “world” is found, because the strings are case-sensitive, and the codes of ’W’ and ’w’ are different. After deleting “world”, only “HellO” is found in the chain 4. Similarly to “world” and “HellO”, after adding “luck” and “GooD” to the same chain 2, first goes “GooD” and then “luck”. Sample 2. Input: 4 8 add test add test find test del test find test find Test add Test find Test Output: yes no no yes Explanation: Adding “test” twice is the same as adding “test” once, so first find returns “yes”. After del, “test” is 6 no longer in the hash table. First time find doesn’t find “Test” because it was not added before, and strings are case-sensitive in this problem. Second time “Test” can be found, because it has just been added. Sample 3. Input: 3 12 check 0 find help add help add del add add find add find del del del find del check 0 check 1 check 2 Output: no yes yes no add help Explanation: Note that you need to output a blank line when you handle an empty chain. Note that the strings stored in the hash table can coincide with the commands used to work with the hash table. Starter Files There are starter solutions only for C++, Java and Python3, and if you use other languages, you need to implement solution from scratch. Starter solutions read the input, do a full scan of the whole table to simulate each find operation and write the output. This naive simulation algorithm is too slow, so you need to implement the real hash table. What to Do Follow the explanations about the chaining scheme from the lectures. Remember to always insert new strings in the beginning of the chain. Remember to output a blank line when check operation is called on an empty chain. Some hints based on the problems encountered by learners: ∙ Beware of integer overflow. Use long long type in C++ and long type in Java where appropriate. Takeeverything (mod p) assoonaspossiblewhilecomputingsomething (mod p),sothatthenumbers are always between 0 and p−1. 7 ∙ Beware of taking negative numbers (mod p). In many programming languages, (−2)%5 ̸= 3%5. Thus you can compute the same hash values for two strings, but when you compare them, they appear to be different. To avoid this issue, you can use such construct in the code: x ← ((a%p) + p)%p instead of just x ← a%p. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 8 3 Problem: Find pattern in text Problem Introduction In this problem, your goal is to implement the Rabin–Karp’s algorithm. Problem Description Task. In this problem your goal is to implement the Rabin–Karp’s algorithm for searching the given pattern in the given text. Input Format. There are two strings in the input: the pattern P and the text T. Constraints. 1 ≤|P|≤|T|≤ 5·105. The total length of all occurrences of P in T doesn’t exceed 108. The pattern and the text contain only latin letters. Output Format. Print all the positions of the occurrences of P in T in the ascending order. Use 0-based indexing of positions in the the text T. Time Limits. C: 1 sec, C++: 1 sec, Java: 5 sec, Python: 5 sec. C#: 1.5 sec, Haskell: 2 sec, JavaScript: 3 sec, Ruby: 3 sec, Scala: 3 sec. Memory Limit. 512Mb. Sample 1. Input: aba abacaba Output: 0 4 Explanation: The pattern aba can be found in positions 0 (abacaba) and 4 (abacaba) of the text abacaba. Sample 2. Input: Test testTesttesT Output: 4 Explanation: Pattern and text are case-sensitive in this problem. Pattern Test can only be found in position 4 in the text testTesttesT. Sample 3. Input: aaaaa baaaaaaa Output: 1 2 3 Explanation: Note that the occurrences of the pattern in the text can be overlapping, and that’s ok, you still need to output all of them. 9 Starter Files The starter solutions in C++, Java and Python3 read the input, apply the naive O(|T||P|) algorithm to this problem and write the output. You need to implement the Rabin–Karp’s algorithm instead of the naive algorithm and thus significantly speed up the solution. If you use other languages, you need to implement a solution from scratch. What to Do Implement the fast version of the Rabin–Karp’s algorithm from the lectures. Some hints based on the problems encountered by learners: ∙ Beware of integer overflow. Use long long type in C++ and long type in Java where appropriate. Takeeverything (mod p) assoonaspossiblewhilecomputingsomething (mod p),sothatthenumbers are always between 0 and p−1. ∙ Beware of taking negative numbers (mod p). In many programming languages, (−2)%5 ̸= 3%5. Thus you can compute the same hash values for two strings, but when you compare them, they appear to be different. To avoid this issue, you can use such construct in the code: x ← ((a%p) + p)%p instead of just x ← a%p. ∙ Use operator == in Python instead of implementing your own function AreEqual for strings, because built-in operator == will work much faster. ∙ In C++, method substr of string creates a new string, uses additional memory and time for that, so use it carefully and avoid creating lots of new strings. When you need to compare pattern with a substring of text, do it without calling substr. ∙ In Java, however, method substring does NOT create a new String. Avoid using new String where it is not needed, just use substring. Need Help? Ask a question or see the questions asked by other learners at this forum thread. 10 4 General Instructions and Recommendations on Solving Algorithmic Problems Your main goal in an algorithmic problem is to implement a program that solves a given computational problem in just few seconds even on massive datasets. Your program should read a dataset from the standard input and write an answer to the standard output. Below we provide general instructions and recommendations on solving such problems. Before reading them, go through readings and screencasts in the first module that show a step by step process of solving two algorithmic problems: link. 4.1 Reading the Problem Statement You start by reading the problem statement that contains the description of a particular computational task as well as time and memory limits your solution should fit in, and one or two sample tests. In some problems your goal is just to implement carefully an algorithm covered in the lectures, while in some other problems you first need to come up with an algorithm yourself. 4.2 Designing an Algorithm If your goal is to design an algorithm yourself, one of the things it is important to realize is the expected running time of your algorithm. Usually, you can guess it from the problem statement (specifically, from the subsection called constraints) as follows. Modern computers perform roughly 108–109 operations per second. So, if the maximum size of a dataset in the problem description is n = 105, then most probably an algorithm with quadratic running time is not going to fit into time limit (since for n = 105, n2 = 1010) while a solution with running time O(nlogn) will fit. However, an O(n2) solution will fit if n is up to 103 = 1000, and if n is at most 100, even O(n3) solutions will fit. In some cases, the problem is so hard that we do not know a polynomial solution. But for n up to 18, a solution with O(2nn2) running time will probably fit into the time limit. To design an algorithm with the expected running time, you will of course need to use the ideas covered in the lectures. Also, make sure to carefully go through sample tests in the problem description. 4.3 Implementing Your Algorithm When you have an algorithm in mind, you start implementing it. Currently, you can use the following programming languages to implement a solution to a problem: C, C++, C#, Haskell, Java, JavaScript, Python2, Python3, Ruby, Scala. For all problems, we will be providing starter solutions for C++, Java, and Python3. If you are going to use one of these programming languages, use these starter files. For other programming languages, you need to implement a solution from scratch. 4.4 Compiling Your Program For solving programming assignments, you can use any of the following programming languages: C, C++, C#, Haskell, Java, JavaScript, Python2, Python3, Ruby, and Scala. However, we will only be providing starter solution files for C++, Java, and Python3. The programming language of your submission is detected automatically, based on the extension of your submission. We have reference solutions in C++, Java and Python3 which solve the problem correctly under the given restrictions, and in most cases spend at most 1/3 of the time limit and at most 1/2 of the memory limit. You can also use other languages, and we’ve estimated the time limit multipliers for them, however, we have no guarantee that a correct solution for a particular problem running under the given time and memory constraints exists in any of those other languages. Your solution will be compiled as follows. We recommend that when testing your solution locally, you use the same compiler flags for compiling. This will increase the chances that your program behaves in the 11 same way on your machine and on the testing machine (note that a buggy program may behave differently when compiled by different compilers, or even by the same compiler with different flags). ∙ C (gcc 5.2.1). File extensions: .c. Flags: gcc -pipe -O2 -std=c11

Related products

programming assignment programming assignment 3 hash tables github

Project #1 Java multithreading solution

Project 2 expression interpreter solution, assignment 8 payroll modification solution.

programming assignment programming assignment 3 hash tables github

IMAGES

  1. How To Implement a Sample Hash Table in C/C++

    programming assignment programming assignment 3 hash tables github

  2. Hash Tables from Ground Up

    programming assignment programming assignment 3 hash tables github

  3. Data Structures: Hash Tables

    programming assignment programming assignment 3 hash tables github

  4. Hash Tables

    programming assignment programming assignment 3 hash tables github

  5. Hashing and Hash Table in Data Structure

    programming assignment programming assignment 3 hash tables github

  6. Hash tables and Dictionary ADT

    programming assignment programming assignment 3 hash tables github

VIDEO

  1. NPTEL Programming In Java Week 7 Programming Assignment 3 Answers l March 2024

  2. Hash table : Get || JavaScript Data Structures & Algorithms

  3. NPTEL Programming In Java Week 10 Programming Assignment 3 Answers l March 2024

  4. 6. Week

  5. Hash Table

  6. VIDEO USER MANUAL ASSIGNMENT PROGRAMMING

COMMENTS

  1. Data Structures and Algorithms 02: Data Structures Programming ...

    Data Structures Programming Assignment 03: Hash Tables and Hash Functions Problems Practice implementing hash functions and hash tables and using them to solve algorithmic problems:

  2. Programming Assignment 3

    The data for this assignment come from the Hospital Compare web site run by the U.S. Department of Health and Human Services. The purpose of the web site is to provide data and information about the quality of care at over 4,000 Medicare-certi ed hospitals in the U.S.

  3. GitHub

    Contribute to akueisara/datastructures development by creating an account on GitHub. ... Programming Assignment 2: Priority Queues and Disjoint Sets. Problem: Convert array ... Programming Assignment 3: Hash Tables and Hash Functions. Problem: Phone book Problem: Hashing with chains Problem: Find pattern in text. Week 4 & 5. Programming ...

  4. Data Structures and Algorithms, UC San Diego · GitHub

    Data Structures and Algorithms Specialization. Master Algorithmic Programming Techniques. Learn algorithms through programming and advance your software engineering or data science career. About This Specialization. This specialization is a mix of theory and practice: you will learn algorithmic techniques for solving various computational ...

  5. PDF {"payload":{"allShortcutsEnabled":false,"fileTree":{"PA3 ...

    {"payload":{"allShortcutsEnabled":false,"fileTree":{"PA3_HashTablesAndHashFunctions":{"items":[{"name":"07_hash_tables_1_intro.pdf","path":"PA3 ...

  6. PDF {"payload":{"allShortcutsEnabled":false,"fileTree":{"data ...

    {"payload":{"allShortcutsEnabled":false,"fileTree":{"data_structures/hash_tables":{"items":[{"name":"Programming Assignment 3.pdf","path":"data_structures/hash_tables ...

  7. R Programming Programming Assignment 3 (Week 4) John Hopkins ...

    And this course makes me do this for every assignment!!!!! By the way, I have a decent knowledge of programming where I gain from learning Python. I thought this class would be easy for me (after quickly going through the lecture videos), yet from the first assignment I began to scratch my head for an answer.

  8. Data-Structures/Week 4 Programming Assignment Solution ...

    Saved searches Use saved searches to filter your results more quickly

  9. GitHub

    Coursera: Data Structures and Algorithms Specialization. Progress: 4/6 courses completed. This repository contains almost all the solutions for Data Structures and Algorithms Specialization. The language of choice is Python3, but I tend to switch to Ruby/Rust in the future. All program assignments can be found inside the course weeks directory.

  10. PDF Assignment 3: Hash Tables

    chaining (i.e. linked list) based hash-table. Scope The focus of this assignment is a little different from the previous two. We have done most of the coding for you. Your job is to investigate how to design an effective and efficient hash-table. In particular, you will investigate the effect of different hash codes and load factors on the time ...

  11. PDF Programming Assignments #3 Group project: Implementing your hash tables

    Programming Assignments #3 Group project: Implementing your hash tables Due date: 11/4/2010 Friday 11:59PM Problem (30 points) Implement dynamic hash tables using several different collision resolution techniques. In this project, you need to implement three collision resolution techniques, that is, (1) open addressing using linear

  12. Week3 hash tables

    Programming Assignment 3: Hash Tables and Hash Functions. Revision: March 29, 2020. Introduction. In this programming assignment, you will practice implementing hash functions and hash tables and using them to solve algorithmic problems. In some cases you will just implement an algorithm from the lectures, while in others you will need to ...

  13. Hash Table

    Step 3: Implement CourseGradebook's SetScore() and GetScore() functions. Implement the SetScore() function to enter a single entry into the gradebook. SetScore() has parameters for the assignment name, student ID, and score. How the entry is stored depends on the member variables chosen in step 2.

  14. Assignment 3: A Hash Table ADT

    The assignment consists of two parts. In Part 1 your task is to create a Table ADT. A Table should be implemented as a hash table in which each bucket is a Set (as defined in Assignment 2). Thus each Table is composed of Sets; effectively, the Table ADT is a user/client of the Set ADT. You should also create an associated TableIter ADT to allow ...

  15. Rob Hess

    Assignments. Programming assignments will be managed via GitHub Classroom. Following the links below will prompt you to sign in to GitHub and to create an assignment repository for yourself. The assignment repository will at a minimum contain a README.md file containing the assignment description. There may also be additional skeleton files in ...

  16. PDF Assignment 3: Hash Tables Implementation

    Complete the program by implementing an improved dictionary data structure that uses a hash table instead of an array or list in order to speed up lookup. Implement the following hash functions: add3 adds the ASCII values of the first three characters of the word. kr the first hash function given in the lecture slides (multiplicative hashing ...

  17. Week3 hash tables

    Programming Assignment 3: Hash Tables and Hash Functions. Revision: July 8, 2019. Introduction. In this programming assignment, you will practice implementing hash functions and hash tables and using them to solve algorithmic problems. In some cases you will just implement an algorithm from the lectures, while in others you will need to invent ...

  18. PDF Programming Assignment 3: Hash Tables and Hash Functions

    Hash Tables and Hash Functions Revision: August25,2016 ... For solving programming assignments, you can use any of the following programming languages: C, C++, ... Also, check the boundary values. Ensure that your program processes correctly sequences of size = 1,2,105.

  19. Programming Assignment 3: Hash Tables and Hash Functions

    Hash tables - Free download as PDF File (.pdf), Text File (.txt) or read online for free.

  20. Hash Tables

    4-2 Assignment: Hash Tables Code Reflection This assignment required the use of hash table logic to load, display, insert, search, and remove bids from a CSV file. The HashTable program contained empty methods that required adding logic to complete the program. I followed the in-line text and only had difficulty in the insert and search methods.

  21. Programming Assignment 3: Hash Tables and Hash Functions solved

    Upon completing this programming assignment you will be able to: 1. Apply hashing to solve the given algorithmic problems. 2. Implement a simple phone book manager. 3. Implement a hash table using the chaining scheme. 4. Find all occurrences of a pattern in text using Rabin-Karp's algorithm.

  22. Programming Assignment 3: Hash Tables and Hash Functions solution

    Learning Outcomes Upon completing this programming assignment you will be able to: 1. Apply hashing to solve the given algorithmic problems. 2. Implement a simple phone book manager. 3. Implement a hash table using the chaining scheme. 4. Find all occurrences of a pattern in text using Rabin-Karp's algorithm.