# 算法代写｜CSE 417 Algorithms & Computational Complexity Assignment #8

This is the final version; re-read it carefully if you’ve looked at draft versions.

These problems relate to RNA secondary structure prediction (aka “RNA folding”), from lecture and text section 6.5. In short, you will implement Nussinov’s algorithm, its associated “traceback” routine, and time them.

Structure: RNA (secondary) structure is often diagrammed as shown at right, which nicely
depicts paired and unpaired regions. However, these pictures are somewhat awkward to generate. “Dot-bracket” notation is logically equivalent and simpler to generate. This is a string of parens
and dots, of the same length as the RNA string. A dot means that the corresponding position in the RNA is unpaired; a left paren means it is paired with a position to its right, marked by the matching right paren. (Parens must be properly balanced/nested as a consequence of the “no pseudoknots” rule.) The sequence and structure shown below are the same as in the figure at right.

GCCCACCUUCGAAAAGACUGGAUGACCAUGGGCCAUGAUU [1]
((((….((…..)).(((….))).))))……. [2]
Summary: 9, 40, 0.06 [3]
[4]

Input: Your main program should read a sequence of lines from “standard input” each containing one such string from the 4 letter alphabet fA,G,C,Ug (all uppercase), as in line [1], and call “Nussinov” (below) on each. Different lines may be of different lengths (different “n”).

Core Algorithm: You should have a subroutine named “Nussinov” whose single argument is a string of letters as specified above. It should calculate the Nussinov OPT table, call your traceback routine, and print the outputs specified below. The conventions used to index the OPT table differ between the book and the slides; follow the conventions in my slides.

Traceback: Also provide a traceback routine generating one of the optimal structures corresponding to your OPT table. I say “one of” since there may be different structures with equal numbers of pairs, often slight variants of each other. Give any one of them. E.g., there are 14 different optimal structures for the sequence on line [1]. (The structure shown on line [2] is not one of them; it just serves to illustrate the format).

Output: For each “Nussinov” call, print to standard out:

• One line echoing the input, like [1] above.

• A second line containing (one of) its optimal structure(s) (i.e., output of your traceback, formatted as in [2]above) and vertically aligned with the RNA sequence.

• Additionally, for a length n input, if n ≤ 15, print the n × n OPT matrix calculated by Nussinov’s algorithm.

Print one line per row (smallest index first for both rows and columns) with n white-space-separated integer values per line, preferably keeping columns vertically aligned; do not have blank lines above/below/interspersed with this.

• Also print a “Summary” line, as shown in [3] above, giving (comma-separated) (i) the total number of pairs in that structure, (ii) the length of the input, and (iii) the time taken for this calculation, in milliseconds (or fractions thereof).

• Finally, follow all of this by one blank line, as in [4]. (One blank lines per input in total; no other blank lines should appear.)

(The bracketed numbers, “[1]”…, should not be part of your output; they’re just to allow me to reference the specific lines in this writeup. The examples linked at the end of this writeup illustrate the reqjuired format.)

Turnin: You are writing one program, but your turnin (and our grading) will be split into three parts:

1. Traceback: Give a traceback algorithm to construct a dot-bracket structure string. I strongly recommend that you look for a recursive algorithm for this, but it is not required. As usual, describe it in English, explain how it works/why it is correct, and analyze its run-time, and compare to the run-time of Nussinov OPT table construction.

E-mail: itcsdx@outlook.com  微信:itcsdx