Programming Homework Help

Programming Homework Help. Algorithms for Massive Data frequency dataset reservoir summary misra-Gries summery countmin Sketch

Best to complete in 1 day, and python. 1. Execute bruteforce computation (10 pts): Compute the frequency vector of

all words, descendingly sort the words by their frequencies, and save the result

into le (since you might use the bruteforce result for the next tasks). You need

to report:

(a) The average frequency of the words in stream (5 pts).

(b) Plot the sorted frequency of words to observe the skewed distribu-

tion (5 pts).

2. Reservoir Summary (10 pts): Implement Reservoir Sampling to see the skewed

distribution of our frequency vector. Fix the summary size S = 10; 000, you need

to:

(a) Estimate the frequency vector from our Reservoir Summary, and

plot this estimate vector to see the approximation skewness (5 pts).

(b) Run your Reservoir Sampling 5 times and report the average num-

ber of times the summary has been updated over these 5 runs (5

pts).

3. Misra-Gries Summary (15 pts): Implement Misra-Gries summary to nd the

most frequent words whose frequency is larger than 1,000. You need to:

(a) Explain the size of summary you choose such that you can nd

these frequent words (5 pts).

(b) Run your Misra-Gries summary and report the number of decre-

ment steps with your chosen parameter (10 pts).

4. CountMin Sketch (15 pts): Implement CountMin sketch to estimate the fre-

quency of words. You need to:

(a) Explain the size of summary you choose such that the estimate

error is at most 100 (5 pts).

An answer.pdf le reports the requested values and explanation of each task.

A source code le contains detailed comments.

(b) Run your CountMin Sketch with your chosen parameters, and re-

port the estimate of the frequency of the words, whose frequency

is larger than 1,000 found in the bruteforce algorithm (10 pts).

Programming Homework Help

 
"Our Prices Start at $11.99. As Our First Client, Use Coupon Code GET15 to claim 15% Discount This Month!!"