FIT3176 Assignment 2022-Semester2


A. General Information and Submission

o This is a group assignment. One group consists of 2 students.

o Submission method: Submission is online through Moodle.

o Penalty for late submission: 10% deduction for each day (including weekends).

o Assignment Cover Sheet: You will need to sign the assignment cover sheet.

o Contribution Form: The contribution form must be completed by all members. Please sign the form (e-signature is acceptable) as an agreement between members.

o Please carefully read the requirements for EACH section, especially the Task Outputs.

B. Problem Description – MonUGov

MonUGov is a Monash University initiative that uses open-source government data to provide services to the Monash community. One of these services helps individuals find housing options that match their preferences for budget, property type, area, location, etc. MonUGov does this by accessing various datasets published on the vic.gov.au site and analysing them against each individual’s preferences.

MonUGov has hired your team of Advanced Database Experts to use the following sample data files, downloaded from vic.gov.au, to support the data analysis behind these services:

  • suburbs.csv
  • landmarks.csv
  • properties.json
  • properties.csv

Note: These files contain raw data that do not follow any particular schema.

For the analysis MonUGov has asked your team to perform the following tasks:

C. Tasks

The assignment is divided into FOUR main tasks:

Because MonUGov has heard about both MongoDB and Cassandra, they wish to use a combination of the two technologies to analyse the data.

Task Requirements:

The following tasks require the use of MongoDB Compass:

C.1.1. Create a database called monUGovDB.

Provide in your report a screenshot of the created database in the list of all databases.

C.1.2. In the newly created database, add the data from the following files using appropriate data types:

  1. suburbs.csv into the suburbs collection
  2. landmarks.csv into the landmarks collection
  3. properties.json into the properties collection

Provide in your report a screenshot of one document in each created collection after adding the data.

Note: Please check each field and the queries in section C.1.4 to assign the relevant data types, either during or after import. More than one collection can be used to save any modified documents; however, the result of C.1.2 should be the 3 collections mentioned above with the correct names: suburbs, landmarks and properties.

The following tasks require the use of MongoDB Shell. Where applicable, and unless stated otherwise, you can use MongoDB CRUD methods, single-purpose aggregations, or the aggregation pipeline to answer the tasks.

Note: Marks for this section depend on query efficiency, e.g. the processing speed, the number of documents scanned, the storage used, the number of queries used, etc. Therefore, using more queries than required to answer a section, or using temporary variables, collections, or cursors (e.g. a for-each loop), may incur mark penalties.

C.1.3. Read the queries in C.1.4 and create one single-field index and one compound (more than one field) index to help speed up at least one query.

Provide in your report the code used to create the indexes and screenshots of the created index details.
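As a rough illustration of the two index shapes C.1.3 asks for, here is a minimal sketch. The indexed field names (councilArea, suburb, propertyType) are assumptions about the imported data, not fields confirmed by this specification; pick the fields your own C.1.4 queries actually filter or sort on. The `typeof db` guard is only there so the snippet also loads outside the MongoDB Shell.

```javascript
// Hypothetical key patterns: the field names are assumptions, not part of the spec.
const singleFieldIndex = { councilArea: 1 };           // one field, ascending
const compoundIndex = { suburb: 1, propertyType: 1 };  // more than one field

// `db` only exists inside the MongoDB Shell, so guard the calls.
if (typeof db !== "undefined") {
  db.suburbs.createIndex(singleFieldIndex);
  db.properties.createIndex(compoundIndex);
  // List the indexes on each collection to confirm they were created.
  printjson(db.suburbs.getIndexes());
  printjson(db.properties.getIndexes());
}
```

Whether an index actually helps can be checked by appending `.explain("executionStats")` to the query and comparing the number of documents examined before and after creating it.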

C.1.4. The following questions are MongoShell queries:

(i) List the landmarks whose theme contains “School”, for example, but not limited to, “Secondary Schools”, “Primary Schools”, “School – Primary and Secondary Education”, etc.

Provide the MongoShell query code and a screenshot of the MongoDB Shell output containing the landmarkName and theme name in the following format:

“landmarkName”: ________,

“theme”: ________
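One possible shape for this kind of query is a case-insensitive substring match on theme combined with a projection that keeps only the two requested fields. The sketch below assumes the landmarks documents store these values under the names landmarkName and theme, as the output format suggests. The `typeof db` guard is only there so the snippet also loads outside the MongoDB Shell.

```javascript
// Match any theme containing "school", regardless of case.
const schoolFilter = { theme: { $regex: "school", $options: "i" } };
// Keep only the two fields requested in the output format.
const schoolProjection = { _id: 0, landmarkName: 1, theme: 1 };

// `db` only exists inside the MongoDB Shell, so guard the call.
if (typeof db !== "undefined") {
  printjson(db.landmarks.find(schoolFilter, schoolProjection).toArray());
}
```

Note that an unanchored, case-insensitive regular expression generally cannot make efficient use of an index on theme, which may matter given the efficiency criteria for this section.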

(ii) Count the number of properties in each suburb and list all suburbs from highest to lowest property count.

Provide the MongoShell query code and a screenshot of the MongoDB Shell output containing the suburb name and the number of properties as propertyCount in the following format:

“suburb”: ________,

“propertyCount”: ________
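A count-per-group question like this maps naturally onto a `$group` stage followed by a `$sort`. The sketch below assumes each document in the properties collection carries its suburb name in a field called suburb; that field name is an assumption and may differ in the actual data. The `typeof db` guard is only there so the snippet also loads outside the MongoDB Shell.

```javascript
// Assumed field: each properties document has a `suburb` field.
const propertiesPerSuburb = [
  // One output document per suburb, counting its properties.
  { $group: { _id: "$suburb", propertyCount: { $sum: 1 } } },
  // Highest count first.
  { $sort: { propertyCount: -1 } },
  // Rename _id to suburb to match the requested output format.
  { $project: { _id: 0, suburb: "$_id", propertyCount: 1 } },
];

// `db` only exists inside the MongoDB Shell, so guard the call.
if (typeof db !== "undefined") {
  printjson(db.properties.aggregate(propertiesPerSuburb).toArray());
}
```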

(iii) Using the MongoDB Aggregation Pipeline, display the Council Area with the second-highest average property count. Display the result in the following format, with avgPropertyCount rounded to 1 decimal place:

“councilArea”: ________,

“avgPropertyCount”: ________

Provide the MongoShell query code and a screenshot of the MongoDB Shell output containing the councilArea and avgPropertyCount fields.
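A "second-highest" result can be sketched as group, sort descending, skip one, take one. The version below assumes the suburbs collection stores a numeric propertyCount and a councilArea on each document; both the collection choice and the field names are assumptions about the data. It also ignores ties, so two councils sharing the top average would be treated as first and second. `$round` requires MongoDB 4.2 or later. The `typeof db` guard is only there so the snippet also loads outside the MongoDB Shell.

```javascript
// Assumed fields on suburbs documents: `councilArea` and a numeric `propertyCount`.
const secondHighestCouncil = [
  // Average property count per council area.
  { $group: { _id: "$councilArea", avg: { $avg: "$propertyCount" } } },
  // Highest average first, then skip the top one and keep the next.
  { $sort: { avg: -1 } },
  { $skip: 1 },
  { $limit: 1 },
  // Shape the output and round to 1 decimal place.
  {
    $project: {
      _id: 0,
      councilArea: "$_id",
      avgPropertyCount: { $round: ["$avg", 1] },
    },
  },
];

// `db` only exists inside the MongoDB Shell, so guard the call.
if (typeof db !== "undefined") {
  printjson(db.suburbs.aggregate(secondHighestCouncil).toArray());
}
```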