
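The main task in this Java assignment is to use Java network programming to build a server and clients that communicate with each other, with distributed transfer of data streams and support for multithreading.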

Assignment Description

Objective

To gain an understanding of what is required to build a client/server system, by building a simple system that aggregates and distributes ATOM feeds.

Introduction

Information management and tracking becomes more difficult as the number of things to track increases. For most users, the number of web pages that they wish to keep track of is quite large and, if they had to remember to check everything manually, it would be easy to forget a webpage or two when tired or busy. Enter syndication, a mechanism by which a website can publish summaries as a feed that you can sign up to, so that you can be notified when something new has happened and then, if it interests you, go and look at it. Initial efforts in the world of syndication included the development of the RSS family of protocols but these are, effectively, not standardised. The ATOM syndication protocol is a standards-based approach that tries to provide a solid basis for syndication. The ATOM format is specified in RFC 4287, although you won’t be implementing all of it!

XML-based formats are easy to transport via the Hypertext Transfer Protocol (HTTP), the workhorse protocol of the Web, and it is increasingly common to work with a standard format for interchange between clients and servers, rather than develop a special protocol for one small group of clients and servers. Where, twenty years ago, we might have used byte-boundary defined patterns in transmitted data to communicate, it is now far more common to use XML-based standards and existing HTTP mechanisms to shunt things around. This is socket-based communication between client and server and does not need the Java RMI mechanism to support it, as you would expect: you don’t have to use an RMI client to access a web page! In this prac, you will take data, convert it into ATOM format and then send it to a server. The server will check it and then distribute a limited form of that data to every client who connects and asks for it. When you want to change the data in the server, you overwrite the existing file, which makes the update operation idempotent (you can do it as many times as you like and get the same result). The real test of your system will be that you can accept PUT and GET requests from other students on your server and your clients can talk to their servers. As always, don’t share code.

Syndication Servers

Syndication servers are web servers that serve XML documents which conform to the RSS or ATOM standards. On receipt of an HTTP GET, the server will respond with an XML response like this (from “Creating an ATOM feed in PHP”):

<?xml version='1.0' encoding='iso-8859-1' ?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
        <title>Fishing Reports</title>
        <subtitle>The latest reports from fishinhole.com</subtitle>
        <link href="http://www.fishinhole.com/reports" rel="self"/>
        <updated>2015-07-03T16:19:54-05:00</updated>
        <author>
                <name>NameOfYourBoss</name>
                <email>[email protected]</email>
        </author>
        <id>tag:fishinhole.com,2008:http://www.fishinhole.com/reports</id>
        <entry>
                <title>Speckled Trout In Old River</title>
                <link type='text/html' href='http://www.fishinhole.com/reports/report.php?id=4'/>
                <id>tag:fishinhole.com,2008:http://www.fishinhole.com/reports/report.php?id=4</id>
                <updated>2009-05-03T04:59:00-05:00</updated>
                <author>
                        <name>ReelHooked</name>
                </author>
                <summary>Limited out by noon</summary>
        </entry>
        ...
</feed>

The server, once configured, will serve out this ATOM XML file to any client that requests it over HTTP. Usually, this would be part of a web-client but, in this case, you will be writing the aggregation server, the content servers and the read clients. The content server will PUT content on the server, while the read client will GET content from the server.
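As a rough illustration of the two request types, the sketch below builds the text of a GET and a PUT request as they might be written to a socket. The class name `HttpRequests`, the path `/atom.xml` used in the usage note, and the choice of UTF-8 for the body are assumptions made for this example, not part of the assignment.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that formats the request text sent over a socket. */
final class HttpRequests {
    private HttpRequests() {}

    /** Builds a GET request for the given feed path (the path is an assumption). */
    static String get(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    /** Builds a PUT request that carries an ATOM document as its body. */
    static String put(String host, String path, String atomXml) {
        // Content-Length counts bytes, not characters; UTF-8 is assumed here.
        int length = atomXml.getBytes(StandardCharsets.UTF_8).length;
        return "PUT " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Type: application/atom+xml\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"
             + atomXml;
    }
}
```

For example, `HttpRequests.put("localhost", "/atom.xml", feedXml)` yields a string that a content server could write to its socket's output stream, provided it encodes the string with the same charset used to compute `Content-Length`.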

Elements

The main elements of this assignment are:

  • An ATOM server (or aggregation server) that responds to requests for feeds and also accepts feed updates from content servers. The aggregation server will store feed information persistently, only removing it when the content server that provided it is no longer in contact, or when the feed item is not one of the most recent 25.
  • A client that makes an HTTP GET request to the server and then displays the feed data, stripped of its XML information.
  • A content server that makes an HTTP PUT request to the server and uploads a new version of the feed, replacing the old one. This feed information is assembled into ATOM XML after being read from a file on the content server’s local filesystem.

All code elements will be written in the Java programming language. Your clients are expected to have a thorough failure handling mechanism where they behave predictably in the face of failure, maintain consistency, are not prone to race conditions and recover reliably and predictably.

Summary of this prac

In this assignment, you will build the aggregation system described below, including a failure management system to deal with as many of the possible failure modes as you can think of for this problem. This obviously includes client, server and network failure, but you must also deal with the following additional constraints (come back to these constraints after you read the description below):

  1. Multiple clients may attempt to GET simultaneously, and each must receive the aggregated feed that is correct for the Lamport clock adjusted time if its GET is interleaved with any PUTs. Hence, if a PUT, a GET and another PUT arrive in that sequence, the first PUT must be applied and the content server advised, then the GET returns the updated feed to the client, and then the next PUT is applied. In each case, the participants are guaranteed that this order is maintained if they are using Lamport clocks.
  2. Multiple content servers may attempt to simultaneously PUT. This must be serialised and the order maintained by Lamport clock timestamp.
  3. Your aggregation server will expire and remove any content from a content server that it has not communicated with in the last 15 seconds. You may choose the mechanism for this but you must consider efficiency and scale.
  4. All elements in your assignment must be capable of implementing Lamport clocks, for synchronization and coordination purposes.
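One possible way to serialise interleaved PUTs and GETs is to place them in a priority queue keyed by Lamport timestamp. The sketch below is a minimal illustration under stated assumptions: the names `RequestQueue` and `Pending` are hypothetical, and ties between equal timestamps are broken by sender id, which is a common convention rather than something the assignment prescribes.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical queue that hands out pending requests in Lamport order. */
final class RequestQueue {

    /** One pending request; senderId breaks ties between equal timestamps. */
    static final class Pending {
        private final long lamportTime;
        private final String senderId;
        private final String method;

        Pending(long lamportTime, String senderId, String method) {
            this.lamportTime = lamportTime;
            this.senderId = senderId;
            this.method = method;
        }

        long lamportTime() { return lamportTime; }
        String senderId() { return senderId; }
        String method() { return method; }
    }

    private static final int INITIAL_CAPACITY = 16;

    private final PriorityBlockingQueue<Pending> queue = new PriorityBlockingQueue<>(
            INITIAL_CAPACITY,
            Comparator.comparingLong(Pending::lamportTime)
                      .thenComparing(Pending::senderId));

    /** Called by each connection-handling thread when a request arrives. */
    void submit(Pending request) {
        queue.put(request);
    }

    /** Returns the queued request with the lowest timestamp, or null if empty. */
    Pending next() {
        return queue.poll();
    }
}
```

A single worker thread that repeatedly calls `next()` and applies the request would then process queued requests one at a time. Note that a queue can only order requests that have already arrived; how long to wait for possibly earlier requests is a design decision this sketch leaves open.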

Your Aggregation Server

To keep things simple, we will assume that there is one file in your filesystem which contains a list of entries and where they come from. It does not need to be in ATOM format, but it must be convertible to a standard ATOM file when a client sends a GET request. However, this file must survive the server crashing and restarting, including recovering if the file was being updated when the server crashed! After a restart or crash, your server should restore the file to its state before the failure. You should, therefore, be thinking about the PUT as a request to handle the information passed in, possibly to an intermediate storage format, rather than just as overwriting a file. This reflects the subtle nature of PUT: it is not just a file write request! You should check the feed file provided in a PUT request to ensure that it is valid. The file details that you can expect are given in the Content Server specification.
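One common technique for surviving a crash in the middle of an update is to write the new contents to a temporary file and then rename it over the real one, so a reader sees either the old file or the new file but never a half-written one. The sketch below assumes a hypothetical `FeedStore` class and a `.tmp` suffix. It relies on `StandardCopyOption.ATOMIC_MOVE`, whose behaviour when the target already exists is implementation specific according to the Java documentation, so treat this as a starting point rather than a guarantee.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical persistent store for the aggregated feed data. */
final class FeedStore {
    private final Path target;

    FeedStore(Path target) {
        this.target = target;
    }

    /** Writes to a sibling temp file, then renames it over the target. */
    void save(String contents) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(temp, contents.getBytes(StandardCharsets.UTF_8));
        // If the server crashes before this line, the old target is untouched.
        Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns the last fully saved contents, or "" if nothing was saved yet. */
    String load() throws IOException {
        if (!Files.exists(target)) {
            return "";
        }
        return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
    }
}
```

On restart, the server would call `load()` and discard any leftover `.tmp` file. This sketch does not force the data to disk before the rename, which a more careful implementation might do.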

All the entities in your system must be capable of maintaining a Lamport clock.

The first time your ATOM feed is created, you should return status 201 (HTTP_CREATED). If later uploads are OK, you should return status 200. (That is, when a content server first connects to the aggregation server, return 201 as the success code; until that content server loses its connection, all later successful responses should use 200.) Any request other than GET or PUT should return status 400 (note: this is not standard, but it simplifies your task). Sending no content to the server should cause a 204 status code to be returned. Finally, if the ATOM XML does not make sense, you may return status code 500 (Internal Server Error).
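The status rules above can be collected into a single decision function. This is a minimal sketch: the class name `StatusCodes` and the parameters `firstUpload` and `validAtom` are hypothetical, and the caller is assumed to have already worked out whether this is the content server's first upload and whether the XML parsed correctly.

```java
/** Hypothetical mapping from a request to the status code described above. */
final class StatusCodes {
    static final int OK = 200;
    static final int CREATED = 201;
    static final int NO_CONTENT = 204;
    static final int BAD_REQUEST = 400;
    static final int INTERNAL_ERROR = 500;

    private StatusCodes() {}

    static int forRequest(String method, String body,
                          boolean firstUpload, boolean validAtom) {
        if ("GET".equals(method)) {
            return OK;
        }
        if (!"PUT".equals(method)) {
            return BAD_REQUEST;          // anything other than GET or PUT
        }
        if (body == null || body.isEmpty()) {
            return NO_CONTENT;           // PUT with nothing in it
        }
        if (!validAtom) {
            return INTERNAL_ERROR;       // the ATOM XML does not make sense
        }
        return firstUpload ? CREATED : OK;
    }
}
```

Keeping the codes as named constants also avoids magic numbers in the request-handling code.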

Your server will, by default, start on port 4567 but will accept a single command line argument that gives the starting port number. Your server’s main method will reside in a file called AggregationServer.java.
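Reading the optional port argument might look like the sketch below. The class name `PortArgs` and the range check are assumptions for illustration; only the default of 4567 comes from the text above.

```java
/** Hypothetical helper for reading the optional port argument. */
final class PortArgs {
    static final int DEFAULT_PORT = 4567;
    static final int MIN_PORT = 1;
    static final int MAX_PORT = 65535;

    private PortArgs() {}

    /** Returns args[0] as a port number, or DEFAULT_PORT when no argument is given. */
    static int parsePort(String[] args) {
        if (args.length == 0) {
            return DEFAULT_PORT;
        }
        int port = Integer.parseInt(args[0]); // NumberFormatException if not a number
        if (port < MIN_PORT || port > MAX_PORT) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }
}
```

The `main` method in AggregationServer.java could then pass `PortArgs.parsePort(args)` to the `java.net.ServerSocket` constructor.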

Your server is designed to stay current and will remove any items in the feed that have come from content servers which it has not communicated with for 15 seconds. How you do this is up to you but please be efficient!
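One possible mechanism, sketched below, records the time of the last contact from each content server and periodically sweeps for stale ones. The class name `ExpiryTracker` and the idea of passing the current time in as a parameter (which makes the logic easy to test) are assumptions for this example, not requirements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker for when each content server was last heard from. */
final class ExpiryTracker {
    static final long EXPIRY_MILLIS = 15_000;

    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    /** Records contact from a content server at the given time. */
    void touch(String contentServerId, long nowMillis) {
        lastSeen.put(contentServerId, nowMillis);
    }

    /** Removes and returns the ids not heard from within EXPIRY_MILLIS. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            boolean stale = nowMillis - entry.getValue() > EXPIRY_MILLIS;
            // remove(key, value) only succeeds if no newer touch() slipped in.
            if (stale && lastSeen.remove(entry.getKey(), entry.getValue())) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

A `ScheduledExecutorService` could call `expire(System.currentTimeMillis())` at a fixed interval and drop the feed entries belonging to each returned id. A sweep costs time proportional to the number of content servers, which may or may not be acceptable at the scale you have in mind.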

Your GET client

Your GET client will start up, read the command line to find the server name and port number (in URL format) and will send a GET request for the ATOM feed. This feed will then be stripped of XML and displayed, one line at a time, with the attribute and its value. Your GET client’s main method will reside in a file called GETClient.java. Possible formats for the server name and port number include “http://servername.domain.domain:portnumber”, “http://servername:portnumber” (with implicit domain information) and “servername:portnumber” (with implicit domain and protocol information).
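The three address formats can be handled by adding the missing protocol and letting `java.net.URI` do the parsing. The sketch below is one way to do it; the class name `ServerAddress` is hypothetical, and it assumes that a port number is always supplied, as in the formats listed above.

```java
import java.net.URI;

/** Hypothetical holder for the host and port parsed from the command line. */
final class ServerAddress {
    final String host;
    final int port;

    private ServerAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Accepts "http://host:port" as well as the bare "host:port" form. */
    static ServerAddress parse(String input) {
        String withScheme = input.contains("://") ? input : "http://" + input;
        URI uri = URI.create(withScheme); // IllegalArgumentException if malformed
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("Expected host:port but got: " + input);
        }
        return new ServerAddress(uri.getHost(), uri.getPort());
    }
}
```

For example, `ServerAddress.parse("servername:4567")` gives a host of `servername` and a port of `4567`, which can be passed straight to `new Socket(host, port)`.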

You should display the output so that it is easy to read but you do not need to provide active hyperlinks. You should also make this client failure-tolerant and, obviously, you will have to make your client capable of maintaining a Lamport clock.
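Stripping the XML can be done with the DOM parser that ships with Java. The sketch below walks the document and produces one `name: value` line per leaf element, plus a marker line at the start of each entry. The class name `FeedPrinter`, the exact output layout and the decision to show a link's `href` attribute are assumptions; the assignment only asks that the output be easy to read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical formatter that turns ATOM XML into plain display lines. */
final class FeedPrinter {
    private FeedPrinter() {}

    static List<String> toLines(String atomXml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(atomXml)))
                .getDocumentElement();
        List<String> lines = new ArrayList<>();
        collect(root, lines);
        return lines;
    }

    private static void collect(Element element, List<String> lines) {
        String tag = element.getTagName();
        if (tag.equals("entry")) {
            lines.add("entry");                    // marks the start of an entry
        }
        String href = element.getAttribute("href"); // "" when the attribute is absent
        if (!href.isEmpty()) {
            lines.add(tag + ": " + href);
        }
        boolean hasChildElements = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                hasChildElements = true;
                collect((Element) child, lines);
            }
        }
        if (!hasChildElements) {
            String text = element.getTextContent().trim();
            if (!text.isEmpty()) {
                lines.add(tag + ": " + text);
            }
        }
    }
}
```

The GET client would print each returned line in turn. A parse failure surfaces as an exception, which the client's failure handling can catch and report.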

Your Content Server

Your content server will start up, reading two parameters from the command line, where the first is the server name and port number (as for GET) and the second is the location of a file in the file system local to the content server (this file is expected to be located in your project folder). The file will contain a number of fields from the ATOM format that are to be assembled into an ATOM XML feed and then uploaded to the server. You may assume that all fields are text and that there will be no embedded HTML or XHTML. The ATOM elements that you need to support are:

  • title
  • subtitle
  • link
  • updated
  • author
  • name
  • id
  • entry
  • summary

Input file format

To make parsing easier, you may assume that input files will follow this format:

title:My example feed
subtitle:for demonstration purposes
link:www.cs.adelaide.edu.au
updated:2015-08-07T18:30:02Z
author:Santa Claus
id:urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6
entry
title:Nick sets assignment
link:www.cs.adelaide.edu.au/users/third/ds/
id:urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a
updated:2015-08-07T18:30:02Z
summary:here is some plain text. Because I’m not completely evil, you can assume that this will always be less than 1000 characters. And, as I’ve said before, it will always be plain text.
entry
title:second feed entry
link:www.cs.adelaide.edu.au/users/third/ds/14ds2s1
id:urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6b
updated:2015-08-07T18:29:02Z
summary:here’s another summary entry which a reader would normally use to work out if they wanted to read some more. It’s quite handy.

Note that the author field only contains a name and that you will have to convert this into a name element inside an author element. An entry is terminated by either another entry keyword, or by the end of file, which also terminates the feed. You may reject any feed or entry with no title, link or id as being in error. You may ignore any markup in a text field and just print it as is.

<?xml version='1.0' encoding='iso-8859-1' ?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
(And then your file of data)
...
</feed>
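The conversion from the input file format to ATOM XML could be sketched as follows. The class name `FeedBuilder` is hypothetical, and rendering `link` as an `href` attribute is an assumption based on the example feed earlier in this description. The sketch does not check that title, link and id are present, nor that each key is one of the supported elements; a full solution would need both.

```java
import java.util.List;

/** Hypothetical converter from "key:value" lines to an ATOM XML string. */
final class FeedBuilder {
    private FeedBuilder() {}

    static String toAtom(List<String> lines) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version='1.0' encoding='iso-8859-1' ?>\n");
        xml.append("<feed xml:lang=\"en-US\" xmlns=\"http://www.w3.org/2005/Atom\">\n");
        boolean inEntry = false;
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.equals("entry")) {
                if (inEntry) {
                    xml.append("</entry>\n");   // a new entry ends the previous one
                }
                xml.append("<entry>\n");
                inEntry = true;
                continue;
            }
            int colon = line.indexOf(':');      // split on the FIRST colon only
            if (colon < 0) {
                throw new IllegalArgumentException("Expected key:value but got: " + line);
            }
            String key = line.substring(0, colon).trim();
            String value = escape(line.substring(colon + 1).trim());
            if (key.equals("author")) {
                xml.append("<author><name>").append(value).append("</name></author>\n");
            } else if (key.equals("link")) {
                xml.append("<link href=\"").append(value).append("\"/>\n");
            } else {
                xml.append('<').append(key).append('>').append(value)
                   .append("</").append(key).append(">\n");
            }
        }
        if (inEntry) {
            xml.append("</entry>\n");           // end of file ends the last entry
        }
        xml.append("</feed>\n");
        return xml.toString();
    }

    /** Escapes the characters that are special in XML text and attributes. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Splitting on the first colon only matters because values such as `updated:2015-08-07T18:30:02Z` and `id:urn:uuid:...` contain colons themselves.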

Your content server will need to confirm that it has received the correct acknowledgment from the server and then check to make sure that the information is in the feed as it was expecting. It must also support Lamport clocks.

Some basic suggestions

The following would be a good approach to solving this problem:

  • Think about how you will test this and how you are going to build each piece. What are the individual steps?
  • Write a simple version of your servers and client to make sure that you can communicate between them.
  • Use known working ATOM feeds for testing parts of your system and read all of the relevant spec sections carefully!
  • There are many standard Java XML parsers out there; learn how to use them rather than writing your own. Both options are acceptable, but we have found that it saves time to use an existing one (if nothing else, there are plenty of tutorials available!).
  • We strongly recommend that you implement this assignment using Sockets rather than HttpServer.
  • Try modularising your code; for example, the ATOM feed parsing function is needed in several places, so it is better to put it in one class and reuse it elsewhere.
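When working directly with sockets, the server has to read the request line and headers itself. The sketch below shows one way to do that from a `BufferedReader`, which in a real server would wrap the socket's input stream. The class name `RequestHead` is hypothetical, and header names are lower-cased so that lookups are case-insensitive.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical holder for the request line and headers of one HTTP request. */
final class RequestHead {
    private static final int REQUEST_LINE_PARTS = 3;

    final String method;
    final String path;
    final Map<String, String> headers;

    private RequestHead(String method, String path, Map<String, String> headers) {
        this.method = method;
        this.path = path;
        this.headers = headers;
    }

    /** Reads up to and including the blank line that ends the headers. */
    static RequestHead read(BufferedReader in) throws IOException {
        String requestLine = in.readLine();
        if (requestLine == null) {
            throw new IOException("Connection closed before request line");
        }
        String[] parts = requestLine.split(" ");
        if (parts.length != REQUEST_LINE_PARTS) {
            throw new IOException("Malformed request line: " + requestLine);
        }
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, line.substring(colon + 1).trim());
            }
        }
        return new RequestHead(parts[0], parts[1], headers);
    }
}
```

After this call, the reader is positioned at the start of the body. Because `Content-Length` counts bytes while a `BufferedReader` counts characters, reading the body through the same reader is only straightforward when the two agree, for example with a single-byte encoding.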

Notes on Lamport Clocks

Please note that you will have to implement Lamport clocks and the update mechanisms in your entire system. This implies that each entity will keep a local Lamport clock and that this clock will get updated as the entity communicates with other entities or processes events. It is up to you to determine which events (such as send, receive or processing) the entity will consider in the Lamport clock update (for example, a System.out.println might not be interesting). This granularity will influence the performance of your implementation. The local Lamport clocks will need to be sent through to other entities with every message/request (like in the request header) – you are responsible for ensuring that this tagging occurs and for the local update of Lamport clocks once messages/requests are received. Towards this, follow the algorithm discussed in class and/or in the Lamport clocks paper accessible from the forum. As part of this requirement, we are aware that your method for embedding Lamport clock information in your communications may mean that you lose interoperability with other clients and servers. This is an acceptable outcome for this assignment but, usually, we would take a standards-based approach to ensure that we maintain interoperability.
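The usual Lamport clock rules (increment before a local event or send; on receive, take the maximum of the local and received values and add one) fit in a very small class. The sketch below is one possible shape, with a hypothetical class name; the algorithm discussed in class takes precedence if it differs.

```java
/** Hypothetical thread-safe Lamport clock for one entity in the system. */
final class LamportClock {
    private long time;

    /** Advances the clock for a local event or just before sending a message. */
    synchronized long tick() {
        time++;
        return time;
    }

    /** Merges a timestamp received from another entity, then advances. */
    synchronized long onReceive(long remoteTime) {
        time = Math.max(time, remoteTime) + 1;
        return time;
    }

    /** Returns the current value without advancing the clock. */
    synchronized long current() {
        return time;
    }
}
```

A sender would call `tick()` and attach the result to the outgoing request, for instance in a custom header such as `Lamport-Clock` (a name made up for this example), and the receiver would pass that value to `onReceive`.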

Appendix A

Code Quality Checklist

Do

  • write comments above the header of each of your methods, describing what the method does, what its inputs are and what outputs are expected
  • describe in the comments any special cases
  • create modular code, following cohesion and coupling principles

Don’t

  • use magic numbers
  • use comments as structural elements
  • mis-spell your comments
  • use incomprehensible variable names
  • have methods longer than 80 lines
  • allow TODO blocks

Appendix B

Assignment 2 Checklist

Basic functionality refers to:

  • XML parsing works
  • client, Atom server and content server processes start up and communicate
  • PUT operation works for one content server
  • GET operation works for many read clients
  • Expiry of stale feeds on the Atom server works (15 s)
  • Retry on errors (server not available, etc.) works

Full functionality refers to:

  • Lamport clocks are implemented
  • All error codes are implemented: empty XML, malformed XML
  • Content servers are replicated and fault tolerant