Understanding Web Communications
Client - front end user interface; accepts user input and sends data to the server for processing.
Server - back end. Responds to requests from clients for pages and returns an HTML page (which includes instructions on how to generate the user interface).
Client and server communicate via HTTP, a text-based protocol that uses TCP port 80 by default. If the server has an SSL certificate, HTTPS can be used to authenticate the server and encrypt communications. HTTPS uses TCP port 443.
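As a rough illustration of the port defaults above, the sketch below works out which host and port a client would connect to for a given URL. The function name connection_target and the DEFAULT_PORTS table are made up for this example; only urlsplit comes from the Python standard library.

```python
from urllib.parse import urlsplit

# Default TCP ports for the two schemes described above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url: str) -> tuple[str, int, bool]:
    """Return (host, port, encrypted) for an http:// or https:// URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # An explicit port in the URL overrides the scheme's default.
    port = parts.port or DEFAULT_PORTS[scheme]
    return parts.hostname, port, scheme == "https"
```

For example, connection_target("https://www.northwindtraders.com/") would give port 443 with the encrypted flag set, while a plain http:// URL falls back to port 80.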
- User enters address into web browser
- Browser connects by HTTP and issues GET request to web server
- Server processes request
- Server uses HTTP to send back a response. If the request was successful, a status code of 200 is returned together with the HTML document. If the requested page cannot be found, the status code is 404. If the page address has changed, the new URL is returned together with a status code of 302. Several other responses are possible.
- Browser processes response by displaying HTML page (if status code of 200), showing an error, redirecting, etc.
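The last step of the exchange above can be sketched as a small decision function. This is only a simplified model of what a browser does, not how any real browser is implemented; browser_action is a hypothetical name, and the sketch assumes the new URL of a redirect arrives in the Location response header.

```python
def browser_action(status: int, headers: dict[str, str]) -> str:
    """Describe, in words, how a browser might react to a response."""
    if status == 200:
        return "render page"
    if status == 302:
        # Assumption: the redirect target is carried in the Location header.
        target = headers.get("Location", "(no Location header)")
        return f"redirect to {target}"
    if status == 404:
        return "show 'page not found' error"
    # Many other status codes exist; this sketch does not model them.
    return f"handle status {status}"
```

Calling browser_action(302, {"Location": "/new.aspx"}) describes a redirect to /new.aspx, while a 200 leads to rendering the returned HTML.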
Web Server's Role
At its simplest the server sends static HTML or image files through HTTP connection to browser. Modern servers do far more, including:
- Verify request structured correctly - ignore malicious requests
- Authenticate itself via SSL certificates and encrypt all content before sending to client
- Authenticate user if content requires the user to identify themselves by verifying the provided credentials
- Authorise that the user is allowed to access requested content
- Determine how to handle the request - for static content requests the server determines whether cached content can be used; dynamic content requests are passed on to ASP.NET
- Handle errors by providing error information to client
- Cache output to improve response times for subsequent requests. Provide caching information to client so browsers know how long to keep content cached for
- Compress output to reduce bandwidth requirements
- Log access requests for security and performance monitoring
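Two of the tasks in the list above, caching information and compression, can be sketched together. This is a minimal illustration and not how IIS works internally: prepare_output is a hypothetical helper, and the check of the Accept-Encoding value is deliberately naive (it ignores quality values such as gzip;q=0).

```python
import gzip


def prepare_output(body: bytes, accept_encoding: str, max_age: int) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) with caching information and optional gzip compression."""
    # Tell the browser how many seconds it may keep the content cached.
    headers = {"Cache-Control": f"max-age={max_age}"}
    # Compress only if the client advertised gzip support (naive substring check).
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Content-Length must describe the bytes actually sent.
    headers["Content-Length"] = str(len(body))
    return headers, body
```

Repetitive HTML usually compresses well, so the body returned for a gzip-capable client is typically much smaller than the original.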
Web Browser's Role
- Send requests to web server - resolve (via DNS) the address entered, use HTTP to connect to server and then request page
- Authenticate the server if the request is made via HTTPS and the server provides an SSL certificate. The certificate is then used to set up the encrypted session that protects subsequent communications
- Process response. If HTML received then fetch referenced objects such as images, audio, etc. If error, redirection, etc. response then browser responds appropriately.
- Render HTML and referenced objects
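The "fetch referenced objects" step can be illustrated by scanning the HTML for URLs the browser would need to request next. The sketch below uses Python's standard html.parser module; the class name, the function name and the small tag-to-attribute table are choices made for this example and are far from a complete list.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect URLs of objects referenced by a page (small, non-exhaustive set)."""

    # Tag name -> attribute that points at an external resource.
    URL_ATTRS = {"img": "src", "script": "src", "audio": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.references.append(value)


def referenced_objects(html: str) -> list[str]:
    """Return the referenced URLs in document order."""
    collector = ReferenceCollector()
    collector.feed(html)
    return collector.references
```

Each URL returned would then trigger its own HTTP request before the page can be fully rendered.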
HTTP is a text-based protocol. When a web page is requested, the browser sends a request to the server:
GET /default.aspx HTTP/1.1
Host: www.northwindtraders.com
First word in request is command (in this case GET). This is followed by URI of resource to be retrieved. After the URI is the version of HTTP used to process command.
Second line of request identifies the name of the website. Most web servers host multiple websites on a single IP address and need to know the website's name to return the correct page.
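To make the structure of the request concrete, the sketch below splits a raw request into its command, URI, version and Host header, then uses the host name to pick a site. The SITES table and its folder path are invented for illustration, and the parser skips the validation a real server would perform.

```python
# Hypothetical mapping of host names to site folders on one server.
SITES = {"www.northwindtraders.com": "/sites/northwind"}


def parse_request(raw: str) -> dict[str, str]:
    """Split a raw HTTP request into method, URI, version and host."""
    lines = raw.replace("\r\n", "\n").split("\n")
    # Request line: command, URI, HTTP version, separated by single spaces.
    method, uri, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "uri": uri, "version": version,
            "host": headers.get("host", "")}


def site_root(request: dict[str, str]):
    """Return the folder for the requested website, or None if unknown."""
    return SITES.get(request["host"].lower())
```

Without the Host header, a server hosting several sites on one IP address would have no way to tell which site's /default.aspx is wanted.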
Common HTTP methods. If Distributed Authoring and Versioning (DAV) is enabled then many more commands will be available:
|GET||Gets an object from the server identified by the URI.|
|POST||Sends data to server for processing, e.g. user enters data in form|
|HEAD||Retrieves metadata for an object without downloading it. Used to verify that a resource has not changed since the browser cached it.|
|OPTIONS||Request list of supported commands.|
|PUT||Allows client to create a resource at the indicated URI. If the user has permission, the server takes the body of the request, creates a file at the specified URI and places the received data into it.|
|DELETE||Deletes resource on server (if user has permission).|
|TRACE||Used for testing / diagnostics - allows client to see what is being received at end of request chain|
|CONNECT||For use by proxies that can dynamically switch to being a tunnel|
|DEBUG||Starts ASP.NET debugging - informs Visual Studio of process to which debugger will attach.|
What is DAV
A set of extensions to HTTP/1.1 that simplify distributed web development. Open standard available on many platforms. Provides file locking and versioning.
Built directly on HTTP/1.1, so other protocols such as FTP and Server Message Block (SMB) are not required. Provides the ability to obtain resource properties such as file names, timestamps, etc. Allows for server-side file copying and moving.
Communication from client to server wrapped in ASP.NET request object. This provides code access to cookies, etc. associated with site, query string, path to requested resource, etc.
Communication from server to client is wrapped in response object. Can set cookies, define caching, page expiration, etc. from this object.
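The cookie handling that the request and response objects expose can be approximated outside ASP.NET. The sketch below is a Python analogue, not the ASP.NET API: it reads cookies from an incoming Cookie header and produces the Set-Cookie header a response would carry. Both function names are hypothetical; SimpleCookie is part of Python's standard http.cookies module.

```python
from http.cookies import SimpleCookie


def read_cookies(cookie_header: str) -> dict[str, str]:
    """Parse the value of a request's Cookie header into a plain dict."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


def set_cookie_header(name: str, value: str, max_age: int) -> str:
    """Build the Set-Cookie header line a response would send."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["max-age"] = max_age  # lifetime in seconds
    jar[name]["path"] = "/"         # valid for the whole site
    return jar.output()
```

In both directions cookies are just more text in the HTTP headers; the request and response objects save the developer from parsing and formatting that text by hand.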
When the web server responds to a request, it uses the contents of the response object to write the actual text-based HTTP response, e.g.
HTTP/1.1 200 OK
Server: Microsoft-IIS/6.0
Content-Type: text/html
Content-Length: 38

<html><body>Hello, world</body></html>
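The response text above can be reproduced with a few lines of code, which shows that the Content-Length value is simply the size of the body. This is a minimal sketch; build_response is a hypothetical helper and it assumes the body is encoded as UTF-8.

```python
def build_response(status: int, reason: str, headers: dict[str, str], body: str) -> bytes:
    """Assemble a text-based HTTP/1.1 response as raw bytes."""
    payload = body.encode("utf-8")  # assumption: UTF-8 encoded body
    lines = [f"HTTP/1.1 {status} {reason}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Content-Length counts octets of the encoded body, not characters.
    lines.append(f"Content-Length: {len(payload)}")
    # Header lines end with CRLF; a blank line separates headers from body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + payload
```

Passing the Server and Content-Type headers and the "Hello, world" page from the example yields a Content-Length of 38, matching the response shown.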
First line indicates communication protocol and HTTP status code. Status codes are 3 digit numbers grouped as follows:
|Status Group Code||Description|
|1xx||Informational - request received and server continuing to process|
|2xx||Success - request received, understood and accepted|
|3xx||Redirection required to different resource|
|4xx||Client error - request has syntax error or server does not know how to fulfil request|
|5xx||Server error - server failed to fulfil valid request|
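Because the group is determined by the first digit, a status code can be classified with integer division. The sketch below is illustrative only; status_group and the STATUS_GROUPS table are names chosen for this example, and the table includes the 2xx (success) group that codes such as 200 belong to.

```python
# First digit of the status code -> group name.
STATUS_GROUPS = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_group(code: int) -> str:
    """Return the group a 3-digit HTTP status code belongs to."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_GROUPS[code // 100]
```

For instance, 404 falls in the client error group and 302 in the redirection group.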
Common status codes:
|Status Code||Description|
|407||Proxy authentication required|
|413||Request entity too large|
|500||Internal server error|
Third line of response indicates the Multipurpose Internet Mail Extensions (MIME) type of the resource being sent to the client, e.g. text/html represents an HTML text file. Common MIME types include:
|text||Textual information - subtypes include plain, html, xml|
|image||Image data - subtypes include jpeg, gif|
|audio||Audio data - subtype: basic|
|application||Any binary data. Subtype of octet-stream typically used.|
The fourth line is the size of the content in octets (8-bit bytes).
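The two content headers can be derived from a file name and its bytes. The sketch below uses Python's standard mimetypes module to guess the type from the extension and falls back to application/octet-stream for anything it does not recognise; content_headers is a hypothetical helper name.

```python
import mimetypes


def content_headers(filename: str, content: bytes) -> dict[str, str]:
    """Return Content-Type and Content-Length headers for a resource."""
    # guess_type returns (type, encoding); the type is None if unknown.
    mime_type, _ = mimetypes.guess_type(filename)
    return {
        "Content-Type": mime_type or "application/octet-stream",
        # Length is measured in octets, i.e. the number of bytes sent.
        "Content-Length": str(len(content)),
    }
```

Note that the length is counted in octets rather than characters: a five-character string containing one accented letter occupies six octets once encoded as UTF-8.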
Submitting form data
HTML <form> tag creates web page that collects data from user and sends to server, e.g.
<form method="POST" action="getCustomer.aspx">
Enter customer ID:
<input type="text" name="Id" />
<input type="submit" value="Get Customer" />
</form>
The method attribute indicates the HTTP command to use when sending the request to the server. The action attribute is the relative URL of the page to which the request will be sent. Two methods can be used with form data - GET and POST. When the GET command is used, the form data is appended to the action URL as a query string, e.g.
GET /getCustomer.aspx?Id=123&colour=blue HTTP/1.1
Host: www.northwindtraders.com
When the GET command is used to send data, the complete URL and query string can be seen and modified in the browser address bar. This allows users to easily bookmark or link to the results of the form - important for search pages, but not good for authentication pages, where the user's credentials would be visible in the URL. GET is also a poor fit where large amounts of data need to be transferred, because browsers and servers cap URL length: Internet Explorer, for example, limits URLs to 2,083 characters, and web servers such as IIS apply their own configurable limits.
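How form fields become a query string, and how the server reads them back, can be sketched with Python's standard urllib.parse module. The two function names are invented for this example, and read_query keeps only the first value when a field name is repeated.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def form_get_url(action: str, fields: dict[str, str]) -> str:
    """Append form fields to the action URL as a query string."""
    # urlencode escapes characters such as spaces and '&' in the values.
    return f"{action}?{urlencode(fields)}"


def read_query(url: str) -> dict[str, str]:
    """Decode the query string of a URL back into field values."""
    query = urlsplit(url).query
    # parse_qs returns a list per field; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}
```

With the fields Id=123 and colour=blue this produces /getCustomer.aspx?Id=123&colour=blue, the same URI as in the GET request shown.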
The POST command is better for large amounts of data or for sensitive data. When POSTing, the data is placed in the message body:
POST /getCustomer.aspx HTTP/1.1
Host: www.northwindtraders.com

Id=123&colour=blue
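The sketch below assembles a fuller version of such a POST request. It adds the Content-Type and Content-Length headers that browsers normally send with form posts but that the abbreviated example omits; build_post_request is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode


def build_post_request(action: str, host: str, fields: dict[str, str]) -> str:
    """Assemble the text of an HTTP POST request carrying form data."""
    # The form fields use the same name=value&name=value encoding as a
    # query string, but travel in the message body instead of the URL.
    body = urlencode(fields)
    lines = [
        f"POST {action} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        "",  # blank line separates headers from body
        body,
    ]
    return "\r\n".join(lines)
```

Because the data sits in the body, it does not appear in the address bar and is not subject to URL length limits, although it is still sent in clear text unless HTTPS is used.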
Sending data back to the server as part of a request is often referred to as a postback. Although the term is derived from the POST command, a postback can also be performed using the GET command. An ASP.NET web page has an IsPostBack property that indicates whether data is being sent back to the server or the page is simply being requested.