COMP6214 Open Data Innovation Coursework 1
 Assignment: Open Data
 Part 1: Clean the dataset
 You will be required to clean the dataset and perform simple manipulations such as formatting,
 fixing errors etc. to prepare it for creating your visualisation. You will be assessed on your ability
 to identify and handle a number of different types of errors in the dataset. These errors should be
 accounted for through pre-processing (using tools such as Open Refine or using your own scripts
 or code). You must provide a written description of your data cleaning and manipulation methods
 (for details of what to include, see Part 4, below, under “Reporting on the Open Data Cleaning and Modelling”).
 There are several errors and error types in the dataset, and you should look for at least 6 errors.
 It is not necessary to find and fix all of the errors in the CSV file to be awarded full marks,
 provided you have spotted, reported and outlined solutions for at least 6 and have provided
 elegant solutions.
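 As an illustration of script-based pre-processing, the following minimal Python sketch handles
 three common error types: stray whitespace, inconsistent date and number formats, and duplicate
 rows. The column names (name, date, amount), the list of date formats and the sample values are
 hypothetical and are not taken from the coursework dataset.

```python
import csv
import io
from datetime import datetime

# Hypothetical date formats that might appear in a messy column.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def normalise_date(value):
    """Return the date as ISO 8601 (YYYY-MM-DD), or '' if it cannot be parsed."""
    value = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return ""


def normalise_amount(value):
    """Strip thousands separators and a leading currency symbol; return float or None."""
    cleaned = value.strip().replace(",", "").lstrip("\u00a3$")
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(rows):
    """Trim whitespace, normalise fields, and drop rows that become exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = (
            " ".join(row["name"].split()),  # collapse repeated and outer spaces
            normalise_date(row["date"]),
            normalise_amount(row["amount"]),
        )
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(dict(zip(("name", "date", "amount"), record)))
    return cleaned


# Hypothetical sample input, standing in for the real CSV file.
SAMPLE = """name,date,amount
 Project  A ,01/02/2015,"1,200.50"
Project A,2015-02-01,1200.50
Project B,2015-13-45,n/a
"""

rows = clean_rows(csv.DictReader(io.StringIO(SAMPLE)))
```

 In this sketch unparseable values are replaced with an empty string or None rather than guessed,
 so they remain visible and can be reported as errors in the written description.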
 Part 2: Model your dataset and represent it in RDF
 Any RDF serialisation type is adequate [RDF/XML, JSON-LD, TURTLE, etc.]. Populate your
 RDF model with the dataset (examples are given in class and will be uploaded to the
 course website).
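 To make "populating the model" concrete, here is a minimal sketch that writes cleaned records as
 Turtle using only the Python standard library. The namespace http://example.org/, the class
 ex:Record and the properties ex:name, ex:date and ex:amount are hypothetical placeholders; a real
 model would reuse established ontologies, and a dedicated RDF library would normally handle
 escaping and serialisation.

```python
# Hypothetical namespace and vocabulary, used only for illustration.
BASE = "http://example.org/"
PREFIXES = (
    "@prefix ex: <" + BASE + "> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_turtle(index, row):
    """Serialise one cleaned record as a Turtle resource description."""
    lines = [
        f"ex:record{index} a ex:Record ;",
        f'    ex:name "{escape_literal(row["name"])}" ;',
        f'    ex:date "{row["date"]}"^^xsd:date ;',
        f'    ex:amount "{row["amount"]}"^^xsd:decimal .',
    ]
    return "\n".join(lines)


def to_turtle(rows):
    """Return a Turtle document with one resource per record."""
    body = "\n\n".join(row_to_turtle(i, r) for i, r in enumerate(rows, start=1))
    return PREFIXES + "\n" + body + "\n"


# Hypothetical cleaned records.
rows = [
    {"name": 'Project "A"', "date": "2015-02-01", "amount": 1200.5},
    {"name": "Project B", "date": "2015-03-15", "amount": 300.0},
]
turtle = to_turtle(rows)
```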
 Part 3: Create and Host your visualisation
 Creating your visualisation
 You must build a visualisation of the dataset. You must use one of the Open Data visualisation
 libraries/tools we will cover in class for the task. Your visualisation should have suitable interactivity
 that allows for manipulation, filtering, and detailed analysis of the data.
 You should aim to develop a multidimensional (greater than 2 dimensions) visualisation that
 enables rich exploration of the data. Note that “multidimensional” refers to the dimensions of the
 data, not the visualisation, i.e. you are expected to use the values from at least 3 columns from the
 provided dataset to create your visualisation (from one or more worksheets). The visualisation
 should be appropriate to the dataset and appropriate for the target audience or use case of your
 choosing.
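 One possible data-preparation step, sketched under assumptions: project the cleaned rows onto the
 chosen columns and export them as JSON for a browser-based visualisation library, refusing fewer
 than three dimensions. The function name to_vis_records and the columns year, region, sector and
 spend are hypothetical.

```python
import json


def to_vis_records(rows, dimensions):
    """Return a JSON array holding only the chosen columns of each row."""
    if len(dimensions) < 3:
        raise ValueError("use the values from at least 3 columns")
    missing = [d for d in dimensions if any(d not in row for row in rows)]
    if missing:
        raise KeyError("columns not present in every row: " + ", ".join(missing))
    return json.dumps([{d: row[d] for d in dimensions} for row in rows], indent=2)


# Hypothetical cleaned rows.
sample = [
    {"year": 2014, "region": "North", "sector": "Health", "spend": 1200.5},
    {"year": 2015, "region": "South", "sector": "Transport", "spend": 830.0},
]
payload = to_vis_records(sample, ["year", "region", "spend"])
```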
 Hosting your visualisation
 You must create a simple website or web page to host your visualisation. Most of the marks relate
 to the data cleaning and the quality of the visualisation itself, so there is no need to produce a
 complex website. You can use publicly available templates when creating your website/webpage
 provided you reference the source.
 Part 4: Communication of Your Work
 Reporting on the Open Data Cleaning and Modelling
 • A description of your cleaning and manipulation of the dataset used for your visualisation:
 o The tool(s) used for data cleaning
 o A list of the error or error types you found in the dataset
 o For each error type: solutions or transformations you have applied to clean the
 dataset
 • Description of your modelling:
 o description of how you modelled your data,
 o ontologies you chose and why you chose them
 Reporting on the Open Data Visualisation
 • Information describing your visualisation:
 o An overview of the audience and use case for each visualisation and why your
 visualisation is appropriate both to this audience and the data.
 o A description of the interaction and functionality the visualisation provides and why
 this interaction is appropriate both to the audience and the data. List what value
 and/or benefits your visualisation offers to users.
 o Any details about your visualisation that you’ve included to enhance it for your
 target audience; this may include how you have highlighted interesting trends to
 your identified audience or enriched the source data with another data set.
 NOTE1: In any of your writing for this assignment, all sources must be cited. This includes (1) any
 code or templates you have used that you have not created yourself, and (2) any
 sources/websites/journal articles that you have used to justify certain aspects of your visualisation.
 NOTE2: You should report your cleaning and modelling of the data on the website on which
 you are hosting your visualisation.
 Submission
 Submit one zip file (.zip) to the C-BASS handin system (http://handin.ecs.soton.ac.uk), by the
 submission deadline stated above for Assignment 1.
 At a minimum your zip file should contain:
 1. Your cleaned csv files
 2. Description of how you modelled the dataset, ontologies used, and file(s) containing the
 Open Data (RDF) format of the cleaned csv files.
 3. The source code for your Linked Data visualisation and website, including any
 accompanying CSS or JavaScript files.
 4. A README text file, containing (a) instructions for how to run your code or open your
 website and (b) the URL of your website (only if you have chosen to additionally host your
 visualisation/website online).
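 Before uploading, the archive contents can be checked against this list. The sketch below is one
 way to do so with Python's zipfile module; the RDF file extensions and the file names in the
 demonstration archive are assumptions, and the check cannot tell whether the written description
 of the modelling is present.

```python
import io
import zipfile

# File extensions assumed here for RDF serialisations; adjust to the format chosen.
RDF_EXTENSIONS = (".ttl", ".rdf", ".jsonld", ".nt")


def missing_items(names):
    """Return the required submission items not found among the archive names."""
    lower = [n.lower() for n in names]
    checks = {
        "cleaned CSV file": any(n.endswith(".csv") for n in lower),
        "RDF file": any(n.endswith(RDF_EXTENSIONS) for n in lower),
        "website source (HTML)": any(n.endswith(".html") for n in lower),
        "README text file": any(
            n.rsplit("/", 1)[-1].startswith("readme") for n in lower
        ),
    }
    return [item for item, present in checks.items() if not present]


# Build a small in-memory archive with hypothetical file names to demonstrate.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data/cleaned.csv", "name,date,amount\n")
    archive.writestr("index.html", "<!doctype html>")
    archive.writestr("README.txt", "Open index.html in a browser.")

with zipfile.ZipFile(buffer) as archive:
    problems = missing_items(archive.namelist())
```

 Here the demonstration archive deliberately omits an RDF file, so problems lists that one item.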
 Your zip file will be submitted electronically via handin.ecs.soton.ac.uk. We recommend you
 ensure your website is in the correct file format (e.g. index.html, folders for CSS, js etc) such that
 it can be locally hosted and runs without errors. You may also choose to host your solution
 somewhere online. No extra marks are available for hosting online, but this can be a failsafe if
 your zip file doesn’t extract properly.
 The standard ECS late penalties apply, as detailed in the regulations (para. 4.1 of
 http://www.calendar.soton.ac.uk/sectionXII/ecs-ug.html).
 They are 10% per working day that a piece of work is overdue, up to a maximum of 5 days, after
 which the mark becomes zero.
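 As a worked illustration of this rule, the sketch below assumes the 10% is deducted as a fraction
 of the awarded mark for each working day late. That interpretation is an assumption, as is the
 function name late_mark; the regulations cited above are authoritative.

```python
def late_mark(raw_mark, working_days_late):
    """Apply 10% per working day late; more than 5 days late gives zero."""
    if working_days_late <= 0:
        return raw_mark
    if working_days_late > 5:
        return 0
    # Assumed interpretation: the penalty is a percentage of the awarded mark.
    return raw_mark * (100 - 10 * working_days_late) / 100


on_time = late_mark(60, 0)
two_days = late_mark(60, 2)
five_days = late_mark(60, 5)
six_days = late_mark(60, 6)
```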
 Relevant Learning Outcomes
 1) Identify innovation opportunities for open data.
 2) Be able to apply appropriate validation, cleaning and transformation to use, reuse and
 combine a multitude of complex datasets.
 3) Be able to model data sets in open data format (RDF) and populate these models with
 data from the datasets
 4) Critically evaluate a large range of infographics and interaction techniques suitable for
 different tasks.
 Marking Scheme
 Criterion: Cleaning and manipulation of dataset
 Description: The student has identified a number of errors or different types of errors in the
 dataset. The student has applied suitable techniques to fix errors and manipulate the dataset
 ready to be visualised.
 Outcomes: 2
 Mark: 10
 Criterion: Modelling the dataset & Populating the Model
 Description: The student has modelled the concepts (and their attributes) of the data sets and
 has populated these models with real data (from data sets).
 Outcomes: 1, 3
 Mark: 8
 Criterion: Visualisation I: Implementation & function
 Description: The implementation is functional and runs without errors. The visualisation is
 hosted on a webpage which opens without errors. Good use is made of an appropriate library for
 presenting dynamically loading visualisations.
 Outcomes: 1, 2, 4
 Mark: 10
 Criterion: Visualisation II: Interactivity and innovation
 Description: The visualisation presents multi-dimensional data that is interactive; i.e. it
 allows features such as filtering, selection, zooming, and multi-view capability to explore the
 dataset. The choice of visualisation is appropriate to the data and audience. The visualisation
 is innovative and useful; it provides value to the intended audience beyond that of the raw data
 or simple non-interactive graphs.