Datalog代写 | CS6340 Constraint Based Analysis

本次prolog代写是使用高级版的prolog-datalog实现静态分析

 

Corresponding Lecture: Lesson 7 (Constraint Based Analysis)
Objective
In this lab, you will implement two static analyses using a declarative logic programming language, Datalog. Declarative languages like Datalog are different from imperative languages like C in that they specify what the program should accomplish but not how the program should accomplish it.
For those without experience in declarative languages, Datalog relations are similar to SQL (Structured Query Language) queries. The analyses you will implement are modified-reference (mod-ref) analysis and local variable analysis. Both analyses are flow-insensitive,context-insensitive, may (as opposed to must), and interprocedural (requiring reasoning across method boundaries).
Note on Past Issues
This lab has seen very large numbers of students submitting work that was not their own in past semesters. It has caused a high number of students to be submitted to the Office of Student
Integrity for Academic Integrity violations. In particular, it’s possible to find solutions to similar analyses on the internet. Looking at these solutions in any form is likely to influence your thinking and cause your solutions to be similar to them, especially given the condensed syntax of
Datalog, which is an Academic Integrity violation in this class. If you are unclear of our guidelines for what is collaboration and what is cheating, we suggest reviewing that section of the syllabus. If you have any questions about what is allowed and what is not allowed, please reach out to Instructors via Piazza for clarification.
Students who submit solutions found to be similar to online resources or other students should expect a 0 grade on the lab, a disciplinary record of an Academic Integrity issue through the Office of Student Integrity, and will not be eligible to receive a final grade of A in the course.
Students who have had past Academic Integrity issues may find that OSI assigns them higher penalties.
Important Documentation (You will need to read this to complete the lab!)
● How to write an analysis in Datalog:
https://github.com/petablox-project/petablox/wiki/Datalog-Analysis
● Predefined relations:
https://github.com/petablox-project/petablox/wiki/Predefined-Analyses
Help with Understanding Recursion in Datalog
● Learn Prolog Now website with good tutorials on recursion (Datalog is a subset of Prolog) [chapters 1 and 3 of the online book in particular]. The example problems are great material to prepare for exam questions on Datalog:
http://www.learnprolognow.org/lpnpage.php?pageid=online
Online Datalog Test Environment
● Allows you to practice writing Datalog in your web browser. The “Rules” section allows you to populate relations both from data and by writing rules. The “Query” section allows you to write a relation whose output will be shown at the bottom of the screen.
Unfortunately most of the examples on the site are in German, so we recommend starting with the two examples where the code is in English (“Umlaufbahnen von kosmologischen Objekten” (“Orbits of Cosmological Objects”), “Pfade in einem
Graphen” (“Paths in a Graph”), and “Computer Parts”). For a more complex example,there are two samples covering paths between stations on parts of the Munich U-Bahn subway’s stations, where the German isn’t too difficult to figure out:
https://datalog.db.in.tum.de/
Additional Petablox Resources
● Petablox repository:
https://github.com/petablox-project/petablox
● Petablox user guide:
https://github.com/petablox-project/petablox/wiki
Setup
1. Download Datalog.zip from Canvas. When uncompressed, it will produce the directory Datalog/.
2. Build the example Java program to be analyzed (TestCase) by running ant in the directory Datalog/examples/test_case/. This will produce the subdirectory classes/.
3. Write your mod-ref analysis in the file Datalog/src/modreference.dlog and your local variables analysis in the file Datalog/src/local_vars.dlog. See below for more details on these analyses.
4. To run your analyses, run the following commands in Datalog/:
ant -Dpetablox.work.dir=examples/test_case/ modreference
ant -Dpetablox.work.dir=examples/test_case/ local_vars
6. You can find the output of the analyses under the directory
Datalog/examples/test_case/petablox_output/.
7. Sample outputs for the test cases are provided in Datalog/sample_output/.
8. We have provided a script compare.sh that you can use to verify your analyses against the provided sample output. From the Datalog/ directory, grant the script execute permissions with the command chmod +x compare.sh and then use the command ./compare.sh modreference or ./compare.sh local_vars to execute the script. The script output will indicate how closely your most recent analysis run matches the sample solutions. Note that you may receive an error for test cases you have not yet executed.
9. If your analysis does not run correctly, you may view any errors that were encountered in petablox_output/log.txt.
10. Both analyses will use a pre-existing, flow- and context-insensitive pointer analysis with on-the-fly call graph construction (henceforth called pointer/call-graph analysis). This analysis is provided in Datalog/cipa_0cfa.dlog. You can study it to learn how to write your own analyses in Datalog, however you should not spend time trying to understand each relation defined in this analysis. It declares all the relations (both input and output) that you will need to write your analyses. Of particular interest are:
○ reachable methods in the relation – reachableM
○ the call graph in the relation IM
○ points-to information in the relations VH, FH, and HFH
○ the relation specifying the containing method of each call site – MI
NOTE: Any changes you make to this file will not be reflected when you run your analyses. This file is provided as a reference only. Petablox uses an analogous file in petablox.jar for the purpose of executing analyses.
Part 1: Modified-Reference (mod-ref) Analysis
The goal of mod-ref analysis is to determine which memory locations may be modified and which memory locations may be referenced during the execution of a method. The execution of a method includes the execution of any methods called by that method, directly or transitively.
Your analysis must consider each reachable method (that is, for each method in relation reachableM). The output of your analysis must be in the form of the following four relations:
1. The relation refStatField⊆ (M × F) containing each tuple (m, f) such that pointer-typed static field f may be read during the execution of method m.
2. The relation modStatField⊆ (M × F) containing each tuple (m, f) such that pointer-typed static field f may be written to during the execution of method m.
3. The relation modInstField⊆ (M × H × F) containing each tuple (m, h, f) such that the pointer-typed instance field f of an object allocated at site h may be written to during the execution of method m. Note that f may be the element 0 in the domain ?, in which case it denotes some pointer-typed element of an array allocated at site h.
4. The relation refInstField⊆ (M × H × F) containing each tuple (m, h, f) such that pointer-typed instance field f of an object allocated at site h may be read during the execution of method m. As in the preceding item, f may denote a pointer-typed array element.
Use the following relations to write your analysis. They are also used as input relations in the pointer/call-graph analysis provided to you:
1. The relation MgetStatFldInst⊆ (M × V × F) contains each tuple (m, v, f) such that the body of method m contains the instruction v = f, where f is a static field of pointer type.
2. The relation MputStatFldInst⊆ (M × F × V) contains each tuple (m, f, v) such that the body of method m contains the instruction f = v, where f is a static field of pointer type.
3. The relation MgetInstFldInst⊆ (M × V × V × F) contains each tuple (m, v, b, f) such that the body of method m contains the instruction v = b.f, where f is either an instance field of pointer type or it is the distinguished hypothetical field (denoted by element 0 in domain F) that designates any array element.
4. The relation MputInstFldInst⊆ (M × V × F × V) contains each tuple (m, b, f, v) such that the body of method m contains the instruction b.f = v, where f is either an instance field of pointer type or it is the distinguished hypothetical field (denoted by element 0 in domain F) that designates any array element.
You can also use the output relations of the pointer/call-graph analysis provided to you. You will need to use both call-graph information (to determine method reachability) and points-to information (to determine the sites that may allocate objects whose instance fields or array elements may be read/written by a particular method). You can find the descriptions of these relations at the link provided in the Resources section above.
Datalog supports recursive relations. A relation which is used in its own definition is recursive.
Recursive relations must also have a base case. For example, a recursive relation P defined in terms of other relations Q and R may look something like this (the second relation is the base case):
P(i) :- P(i), Q(i).
P(i) :- R(i).
The execution of a method includes the execution of any methods that are called by it directly or transitively. You can use relations MI and IM recursively to determine such methods.
For this lab, we will ignore reporting static fields, instance fields, and array elements of non-pointer type (called primitive types in Java, e.g., boolean, int, double, etc.) that may be read or written during method execution. The above relations such as MgetStatFldInst, etc. already ignore such fields.

Part 2: Local Variable Analysis
The goal of local variables analysis is to determine for each reachable method m and for each object allocation site h in the body of m, whether any object allocated at h in any invocation of m may outlive that invocation. If so, the site is called method-escaping; if not, it is called method-local and all objects created at that site at runtime can be allocated on the method invocation stack (instead of being allocated in the heap, which is more expensive).
An object may outlive the invocation of its allocating method if any of the following conditions hold:
● An object allocated at site h may become reachable from some global variable (i.e., a static field in Java).
● An object allocated at site h may become reachable from some argument of method m.
● An object allocated at site h may become reachable from some return variable of method m.
Your analysis must consider each reachable method (that is, for each method in relation reachableM). The output of your analysis must be a relation localMH⊆ (M × H) containing each tuple (m, h) such that allocation site h contained in the body of reachable method m is method-local and a relation escapingMH⊆ (M × H) containing each tuple (m, h) such that allocation site h contained in the body of reachable method m is method-escaping.
Use the following relations to write your analysis.
● The relation MmethArg⊆ (M × Z× V) contains each tuple (m, z, v) such that variable v is the zth actual argument of method m. (Z denotes the set of integers.)
● The relation MmethRet⊆ (M × Z× V) contains each tuple (m, z, v) such that variable v is the return value of method m (you can ignore z).
In addition, you may use the output relations of the pointer/call-graph analysis provided to you and described above. You will need to use both call-graph information (to determine method reachability) and points-to information (to determine the sites that may allocate objects that may escape). You can find the descriptions of these relations at the link provided in the Resources section above.

Note that this analysis requires reasoning about reachability in the heap (as exemplified by the use of the word “reachable” in the above three items); use the relation HFH recursively for this purpose. The relations VH and FH will provide the base cases for the recursion.
Items to Submit
We expect your submission to conform to the standards specified below. To ensure that your submission is correct, you should run the provided file validator. You should not expect submissions that have an improper folder structure, incorrect file names, or in other ways do not conform to the specification to score full credit. The file validator will check folder structure and names match what is expected for this lab, but won’t (and isn’t intended to) catch everything.
The command to run the tool is: python3 zip_validator.py lab4 lab4.zip
Submit the following files in a single compressed file (.zip format) named lab4.zip. For full credit, there must not be any subfolders or extra files contained within your zip file.
1. (50 points) modreference.dlog
2. (50 points) local_vars.dlog
Grading Criteria
Generally, credit for this lab is awarded as follows:
● For each program, both analyses will be equally weighted
● All outputs of the analysis will be compared against the correct output
● Each output is equally weighted
● For each output, your score will be % of the [# expected tuples]
[# expected tuples]−[# missing tuples]−[# extra tuples]
total possible points for that output. For example, an output with 10 expected tuples where you found all 10 expected tuples but also two additional tuples would score % 0% of the possible credit 10 10−0−2 = 8
Your datalog analyses will be graded against several programs; the three provided benchmark programs will constitute at least half of your grade but other programs will be run against your analysis as well. In general, the programs used in grading but not provided as part of the lab are of the same order of complexity as the provided programs. Note that programs will be weighted so more complicated programs are worth more credit.
We encourage you to write your own programs and analyze them with your two analyses.
Please feel free to share your programs to be analyzed and your output with your classmates on Piazza. The course staff will not be evaluating these programs for correctness, but you should compare your output with your classmates to gain additional insight into if your analyses are correct. While we encourage you to write and share these programs, please do not share your Datalog code.