In this assignment, you will develop and critically evaluate one or more computational models appropriate to a concrete humanities or social sciences modelling task of your own choosing. This task must relate directly to at least one of the subjects covered in sessions 4 through 18 of this module. In week 6, you will be asked to provide a brief description of your chosen task as a formative exercise, and you will receive feedback on the appropriateness of the proposed task. Examples of suitable tasks (which you may use if you wish) will be provided during lectures as well as in the description below.
You will submit a written report of up to 3000 words describing your chosen task, data sources, and implementation, together with documented code used to carry out the task.
Your report must:
- Introduce the task
- Document sources of data, and briefly state any relevant modules or toolkits that were used (e.g. CoreNLP, Stanza, Gensim, QGIS).
- Describe the chosen model(s) and implementation
- Critically evaluate the model(s). Depending on the task, this will typically involve both formal metrics of success (e.g. accuracy, precision, or recall, as appropriate) and critical analysis of the adequacy of the modelling (e.g. what assumptions the models rely on and whether they all hold; whether there are factors or biases that might invalidate the conclusions drawn).
- Discuss what conclusions can be drawn from the model(s)
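As an illustration of the formal metrics mentioned in the evaluation point above, the following is a minimal sketch of how accuracy, precision, and recall might be computed for a binary classification task. The function name `binary_metrics` and the toy label lists are hypothetical and for illustration only; in practice you may prefer an established library, and the appropriate metrics will depend on your task.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[int], predicted: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision and recall for binary labels.

    `gold` holds the reference labels and `predicted` the model output.
    `positive` is the label treated as the positive class (assumed to be 1).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute metrics on empty input")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Report 0.0 when the denominator is zero (a convention, not a rule).
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data, invented purely to show the call.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 1, 1, 0]
metrics = binary_metrics(gold, predicted)
print(metrics)
```

Numbers like these are only the starting point: the report should also discuss what the errors mean for the conclusions you draw.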
Your task must:
- Involve some degree of original analysis – i.e. your task should not simply be repeating an analysis from previous work using an identical dataset.
- Investigate a concrete, real-world problem or issue that has a clear connection to a humanities or social science domain, and offer some conclusions about it (which can be speculative, tentative, and/or carefully qualified).
  - Bad example I: “I apply sentiment analysis to a dataset of tweets and show that 39% of them are positive, 46% negative, …” [No connection made to any concrete real-world problem]
  - Bad example II: “I develop/implement technique X to perform generic task Y (e.g. sentiment analysis, text reuse identification, …), and show that it outperforms state-of-the-art technique Z” [No connection made to any concrete real-world problem]
  - Good example I: “I use data from X, Y, and Z to assess whether there are regional biases in the places of birth of UK prime ministers over the last 200 years” [You cannot, however, use exactly this topic as it has previously been set as an assignment for this module.]
  - Good example II: “I combine data from Wikidata and the DPRR [see lecture notes] to investigate geographical movement of individuals over time in the Roman empire, years N to M” [You are free to base your work on this idea if you wish to]
  - Good example III: “I create an appropriate corpus of text and apply off-the-shelf NLP models in order to identify patterns of change in English language use in works of fiction from years N to M” [You are free to base your work on this idea if you wish to]
For inspiration and examples of appropriate modelling tasks, please refer to the academic papers cited and discussed during lectures, as well as those assigned as recommended readings and linked to on Ultra. You are strongly encouraged to discuss your proposed task with the lecturer, either by e-mail or during office hours.
In this assignment, please note that:
- You are allowed to reuse existing code – this includes open source modules and toolkits, Stack Overflow responses, code described in online tutorials, etc., provided that whenever you do this, you clearly indicate what you have reused and exactly where it came from (including a URL for an online source). You should document smaller instances of reuse (e.g. short code examples from Stack Overflow) as comments in your code, and longer ones both in your code and as citations to the source in your report. Marks will be awarded for the original parts of your work (i.e. your extensions to and adaptations or developments of any reused material).
- Copying any material (this includes code and text) in your submission from any source without clear acknowledgement is likely to constitute plagiarism, which can have serious consequences as described in the Teaching and Learning Handbook:
- Marks will not be awarded under the “Technical depth/clarity of implementation” sections for parts of your code that reimplement code available in widely used libraries or modules.
- You may use any programming language suitable for the task; use of Python 3.x is recommended where possible.
- In writing your report, you should assume that the reader is familiar with everything covered during the lectures for this module. Concepts and techniques important to your implementation which were not introduced during the module should be explained.
- The word limit is exclusive of figures and references. You may include a maximum of 20 figures (recommended: around 5-10 figures). Please use a 12pt font, single spacing, and A4 pages. Your report should begin with an introduction, and does not require a separate abstract.
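As an illustration of the code reuse note above, a small instance of reuse might be documented in a comment along the following lines. This is only a sketch of one possible convention: the function `chunks` is a hypothetical example, and the source line is a placeholder that you would replace with the exact URL of the material you reused.

```python
from typing import Iterator, List, Sequence


# REUSED CODE (small instance, documented in a comment only)
# Source: <exact URL of the Stack Overflow answer or tutorial you reused>
# Accessed: <date you retrieved it>
# My changes: added type hints and input validation; renamed variables.
def chunks(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Example call on toy data.
batches = list(chunks([1, 2, 3, 4, 5], 2))
print(batches)
```

Stating what you changed makes it easier for a marker to see which parts of the work are your own. Longer instances of reuse should also be cited in the report.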
Submission (2 files)
- A written report as a PDF file.
- An archive (.zip / .tar.gz) containing the code for your implementation, including a README.TXT file with instructions for how to run it. If any data that your code relies on is over 10 MB in size, do not include it in the archive; instead, explain in the README.TXT file how to obtain it. Your code is not marked separately, but serves as supporting evidence for criteria 2 and 3 of the marking scheme (below): provided that your code is suitably documented, it will be taken into account in assessing the technical depth and clarity of your implementation.