Basic knowledge about HTTP
Pretty much every program on the web runs on a family of communication standards known as the Internet Protocol (IP) suite. The member of that family that concerns us most is the Transmission Control Protocol (TCP/IP), which makes communication between computers about as simple as reading and writing text files.
We use an IP address to locate a computer on the internet and a port to determine which program on that computer we want to reach. So if someone has built a web server listening on port 80 on computer A, whose IP address is 10.22.122.45, we can access it from any machine that can reach it using 10.22.122.45:80. Unfortunately, numeric addresses like this are hard to remember, so the Domain Name System (DNS) automatically maps a human-readable name for that server (such as 'www.example.org') to its IP address.
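As a rough illustration of how a host and port pair becomes a socket address, the sketch below asks the operating system's resolver for the matching IPv4 addresses. It is written for Python 3, the helper name resolve is made up for this example, and it uses a numeric address so that it runs without network access.

```python
import socket


def resolve(host, port):
    """Return the sorted (ip, port) pairs the OS resolver reports for host:port."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) tuple.
    return sorted({info[4] for info in infos})


print(resolve("127.0.0.1", 80))
```

Passing a name such as 'www.example.org' instead of a numeric address makes the same call go through DNS, which requires a working network connection.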
The Hypertext Transfer Protocol (HTTP) allows us to exchange data on the internet. Its basic working mode is a request/response mechanism built on top of a socket connection.
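To make the request/response exchange concrete, here is a minimal Python 3 sketch of what the two messages look like as text. The helper names build_request and parse_response are invented for this example, and the parser only handles the simple, well-formed case.

```python
def build_request(method, path, host):
    """Build a minimal HTTP/1.1 request as bytes."""
    lines = [
        "%s %s HTTP/1.1" % (method, path),  # request line
        "Host: %s" % host,                  # required header in HTTP/1.1
        "Connection: close",
        "",                                 # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), reason, headers, body


print(build_request("GET", "/", "www.example.org").decode("ascii"))
print(parse_response(b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhi"))
```

Both messages share the same shape: a first line, header lines, a blank line, and an optional body.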
Usage of socket
Sockets allow processes on different computers to communicate with each other, and most services on the web are built on them. We will focus on the server-side socket, which mainly performs the following tasks:
- Opening a socket
- Binding to a host and port
- Listening for incoming connections
- Accepting a connection and exchanging data
- Closing the connection
We will create a simple web server that always returns the same content to every client connection:
import socket

HOST, PORT = '', 7002

server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_socket.bind((HOST, PORT))
server_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
    client_connection, client_address = server_socket.accept()
    request = client_connection.recv(1024)
    print request
    # A blank line (CRLF CRLF) separates the status line from the body.
    http_response = 'HTTP/1.1 200 OK\r\n\r\nSome content from server!\n'
    client_connection.sendall(http_response)
    client_connection.close()
First, we create a server socket with socket.socket(socket.AF_INET, socket.SOCK_STREAM). AF_INET means that we are using the IPv4 address family, and SOCK_STREAM means that the socket serves TCP connections.
Then we set the SO_REUSEADDR socket option so that the address can be reused, bind the server socket to the given host and port, and start listening with a backlog of 1. The listen method accepts one parameter, the backlog: the maximum number of pending connections the OS will queue before it refuses new ones.
The return value of the accept method is a pair (conn, address), where conn is a new socket object usable to send and receive data on the connection, and address is the address bound to the socket on the other end of the connection. client_connection can then be used to exchange data between server and client: we call recv to receive data from the client and sendall to send data to it.
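The same open, bind, listen, accept, exchange, close sequence can be exercised end to end in one process. The following is a Python 3 sketch, not the server above: the names serve_once and round_trip are made up for this example, the server runs in a background thread, and port 0 asks the OS for any free port.

```python
import socket
import threading

RESPONSE = b"HTTP/1.1 200 OK\r\n\r\nSome content from server!\n"


def serve_once(server_socket):
    """Accept a single connection, read the request, send a fixed reply."""
    conn, _address = server_socket.accept()
    with conn:  # closes the connection when the block ends
        conn.recv(1024)
        conn.sendall(RESPONSE)


def round_trip():
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(("127.0.0.1", 0))  # port 0: let the OS choose
    server_socket.listen(1)
    port = server_socket.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_socket,))
    worker.start()

    chunks = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        while True:
            chunk = client.recv(1024)
            if not chunk:  # empty bytes: the server closed the connection
                break
            chunks.append(chunk)

    worker.join()
    server_socket.close()
    return b"".join(chunks)


print(round_trip().decode("ascii"))
```

The client knows the response is complete only because the server closes the connection, which is why the read loop runs until recv returns an empty result.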
Basic knowledge about WSGI
The Web Server Gateway Interface (WSGI) is a standard interface between web server software and web applications written in Python. Having a standard interface makes it easy to use an application that supports WSGI with a number of different web servers.
The main WSGI workflow:
- A callable object, application, must be provided by the web framework (such as Flask or Django). How it is implemented is not restricted.
- Every time the web server receives an HTTP request from a client, it invokes the callable object application, passing it a dict that contains all CGI environment variables and another callable object, start_response.
- The web framework builds the HTTP response headers (including the response status), passes them to start_response, and generates the response body.
- Finally, the web server constructs a response containing all of this information and sends it to the client.
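The flow above can be tried without any server by playing the server's role by hand. This is a Python 3 sketch under stated assumptions: call_app and the sample app are names invented for this example, and wsgiref.util.setup_testing_defaults is used only to fill in a plausible environment dict.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny WSGI application: set status and headers, return the body."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from " + environ["PATH_INFO"].encode("utf-8")]


def call_app(application, path="/"):
    """Invoke a WSGI application the way a server would."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # only fills in keys that are missing

    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would remember these and send them with the body.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body


print(call_app(app, "/hello"))
```

Everything a WSGI server does on top of this is transport: reading the request from a socket to build environ, and writing the captured status, headers, and body back.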
So, let's begin with a simple WSGI demo:
from wsgiref.simple_server import make_server


class SimpleApp(object):
    def __call__(self, env, start_response):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        # set header and status
        start_response(status, response_headers)
        return [u"This is a wsgi application".encode('utf8')]


def app(env, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    # set header and status
    start_response(status, response_headers)
    return [u"This is a wsgi application".encode('utf8')]


httpd = make_server('', 7000, app)
# httpd = make_server('', 7000, SimpleApp())
print 'Serving http on port:7000'
httpd.serve_forever()
We used make_server to create a WSGI server. Its third parameter must be a callable object, which can be either a function or an instance of a class with a __call__ method.
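To check the demo without opening a browser, the server can be asked to handle exactly one request while a client fetches the page from the same process. The sketch below is a Python 3 variant with assumptions of its own: QuietHandler and fetch_once are names made up here, port 0 lets the OS pick a free port, and handle_request is used in place of serve_forever so the example terminates.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    """Request handler that skips the per-request log line."""

    def log_message(self, format, *args):
        pass


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"This is a wsgi application"]


def fetch_once():
    httpd = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    port = httpd.server_port  # the port the OS actually assigned

    # Serve a single request in the background, then stop.
    worker = threading.Thread(target=httpd.handle_request)
    worker.start()

    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/")
    response = conn.getresponse()
    status, body = response.status, response.read()
    conn.close()

    worker.join()
    httpd.server_close()
    return status, body


print(fetch_once())
```

Swapping app for an instance of a class that defines __call__ should behave the same way, since make_server only requires a callable.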
Next, we will look at the source code for more detail. First, let's take a look at the make_server function in simple_server.py:
def make_server(
    host, port, app, server_class=WSGIServer, handler_class=WSGIRequestHandler
):
    """Create a new WSGI server listening on `host` and `port` for `app`"""
    server = server_class((host, port), handler_class)
    server.set_app(app)
    return server
It returns a server instance of the class WSGIServer:
class WSGIServer(HTTPServer):
    """BaseHTTPServer that implements the Python WSGI protocol"""
    application = None

    def server_bind(self):
        """Override server_bind to store the server name."""
        HTTPServer.server_bind(self)
        self.setup_environ()

    def setup_environ(self):
        # Set up base environment
        env = self.base_environ = {}
        env['SERVER_NAME'] = self.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PORT'] = str(self.server_port)
        env['REMOTE_HOST']=''
        env['CONTENT_LENGTH']=''
        env['SCRIPT_NAME'] = ''

    def get_app(self):
        return self.application

    def set_app(self,application):
        self.application = application
So after calling make_server, the callable object application has been bound to the WSGIServer instance.
The next question is what the server does with an incoming request, so we will look at the method serve_forever. Note that it is defined in SocketServer.BaseServer.
def serve_forever(self, poll_interval=0.5):
    """Handle one request at a time until shutdown.

    Polls for shutdown every poll_interval seconds. Ignores
    self.timeout. If you need to do periodic tasks, do them in
    another thread.
    """
    self.__is_shut_down.clear()
    try:
        while not self.__shutdown_request:
            # XXX: Consider using another file descriptor or
            # connecting to the socket to wake this up instead of
            # polling. Polling reduces our responsiveness to a
            # shutdown request and wastes cpu at all other times.
            r, w, e = _eintr_retry(select.select, [self], [], [],
                                   poll_interval)
            if self in r:
                self._handle_request_noblock()
    finally:
        self.__shutdown_request = False
        self.__is_shut_down.set()
Class hierarchy for WSGIServer:
SocketServer.BaseServer
|-SocketServer.TCPServer
|--BaseHTTPServer.HTTPServer
|---simple_server.WSGIServer
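The hierarchy can also be read off the class itself. The sketch below runs on Python 3, where the modules were renamed (SocketServer became socketserver and BaseHTTPServer became http.server), so the module names it prints differ from the Python 2 names above. The helper name hierarchy is made up for this example.

```python
from wsgiref.simple_server import WSGIServer


def hierarchy(cls):
    """Return qualified class names from cls up to, but excluding, object."""
    return [
        "%s.%s" % (klass.__module__, klass.__name__)
        for klass in cls.__mro__
        if klass is not object
    ]


for name in hierarchy(WSGIServer):
    print(name)
```

This is why serve_forever is found in the base server class even though it is called on a WSGIServer instance.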
Following _handle_request_noblock from serve_forever, we find that it calls process_request, which in turn calls finish_request:
def process_request(self, request, client_address):
    """Call finish_request.

    Overridden by ForkingMixIn and ThreadingMixIn.
    """
    self.finish_request(request, client_address)
    self.shutdown_request(request)
The finish_request method constructs an instance of the request handler class, whose constructor is inherited from BaseRequestHandler:
def __init__(self, request, client_address, server):
    self.request = request
    self.client_address = client_address
    self.server = server
    self.setup()
    try:
        self.handle()
    finally:
        self.finish()
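The setup, handle, finish order in this constructor can be observed with a small experiment. This is a Python 3 sketch using the renamed socketserver module. TracingHandler and run_once are names invented for this example, and the server handles a single connection on an OS-chosen port.

```python
import socket
import socketserver
import threading


def run_once():
    """Serve one connection and report which handler hooks ran, in order."""
    calls = []

    class TracingHandler(socketserver.BaseRequestHandler):
        def setup(self):
            calls.append("setup")

        def handle(self):
            calls.append("handle")
            self.request.sendall(b"done")  # self.request is the client socket

        def finish(self):
            calls.append("finish")

    with socketserver.TCPServer(("127.0.0.1", 0), TracingHandler) as server:
        port = server.server_address[1]
        worker = threading.Thread(target=server.handle_request)
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            reply = client.recv(1024)
        worker.join()
    return reply, calls


print(run_once())
```

Simply constructing the handler does all the work, which is why finish_request needs nothing more than the instantiation.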
Now we will go through the method handle, which is overridden by WSGIRequestHandler:
class WSGIRequestHandler(BaseHTTPRequestHandler):
    # ...
    def handle(self):
        """Handle a single HTTP request"""
        self.raw_requestline = self.rfile.readline(65537)
        if len(self.raw_requestline) > 65536:
            self.requestline = ''
            self.request_version = ''
            self.command = ''
            self.send_error(414)
            return
        if not self.parse_request(): # An error code has been sent, just exit
            return
        handler = ServerHandler(
            self.rfile, self.wfile, self.get_stderr(), self.get_environ()
        )
        handler.request_handler = self # backpointer for logging
        handler.run(self.server.get_app())
Now we will go through the method run, defined in the class BaseHandler:
def run(self, application):
    """Invoke the application"""
    # Note to self: don't move the close()! Asynchronous servers shouldn't
    # call close() from finish_response(), so if you close() anywhere but
    # the double-error branch here, you'll break asynchronous servers by
    # prematurely closing. Async servers must return from 'run()' without
    # closing if there might still be output to iterate over.
    try:
        self.setup_environ()
        self.result = application(self.environ, self.start_response)
        self.finish_response()
    except:
        try:
            self.handle_error()
        except:
            # If we get an error handling an error, just give up already!
            self.close()
            raise # ...and let the actual server figure it out.
This method invokes the application we passed in. The application calls start_response to set the response status and response headers, and finish_response then writes the complete response out to the client.
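To see what run actually writes, the handler can be pointed at in-memory streams in place of a socket. The sketch below assumes Python 3 and uses wsgiref.handlers.SimpleHandler, a concrete subclass of BaseHandler. The function name render is made up for this example, and setup_testing_defaults supplies a stand-in environment.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"This is a wsgi application"]


def render(application):
    """Run a WSGI application through SimpleHandler and return the raw bytes."""
    environ = {}
    setup_testing_defaults(environ)
    stdout = io.BytesIO()  # stands in for the client socket's write side
    handler = SimpleHandler(io.BytesIO(), stdout, io.StringIO(), environ)
    handler.run(application)
    return stdout.getvalue()


print(render(app).decode("iso-8859-1"))
```

The output contains a status line, the headers (including some the handler adds on its own), a blank line, and then the body returned by the application.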
Create WSGI Server
We covered the basic workflow and code structure of Python's WSGI support in the previous section, so let's build our own WSGI server without using the built-in WSGI modules.
The key points are as follows:
- Bind to a given callable application
- Parse request line
- Invoke application
- Build response header and body
- Send response to client
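As an isolated look at the "parse request line" step, here is a Python 3 sketch that also reports malformed input instead of ignoring it. The function name parse_request_line is made up for this example, and splitting off the query string is an extra that the steps above do not require.

```python
def parse_request_line(raw_request):
    """Return (method, path, query, version) from raw request bytes."""
    if not raw_request:
        raise ValueError("empty request")
    # The request line is everything up to the first CRLF.
    first_line = raw_request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = first_line.split()
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % first_line)
    method, target, version = parts
    path, _, query = target.partition("?")
    return method, path, query, version


print(parse_request_line(b"GET /hello?name=x HTTP/1.1\r\nHost: a\r\n\r\n"))
```

Raising on bad input lets the caller close the connection or send an error response, rather than failing later when the parsed fields are missing.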
Here is the complete server code:
import socket
import StringIO
import sys


class WSGIServer(object):
    address_family = socket.AF_INET
    socket_type = socket.SOCK_STREAM
    request_queue_size = 1

    def __init__(self, server_address):
        self.server_socket = server_socket = socket.socket(
            self.address_family,
            self.socket_type
        )
        # reuse the same address
        server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server_socket.bind(server_address)
        server_socket.listen(self.request_queue_size)
        # Get server host name and port
        host, port = self.server_socket.getsockname()[:2]
        self.server_name = socket.getfqdn(host)
        self.server_port = port
        # Return headers set by Web framework/Web application
        self.headers_set = []

    def set_app(self, application):
        self.application = application

    def serve_forever(self):
        server_socket = self.server_socket
        while True:
            self.client_connection, client_address = server_socket.accept()
            # handle one request per connection, then wait for the next one
            self.handle_one_request()

    def handle_one_request(self):
        self.request_data = request_data = self.client_connection.recv(1024)
        self.parse_request(request_data)
        # Construct environment dictionary using request data
        env = self.get_env()
        # It's time to call our application callable and get
        # back a result that will become HTTP response body
        result = self.application(env, self.start_response)
        # Construct a response and send it back to the client
        self.finish_response(result)

    def parse_request(self, text):
        try:
            request_line = text.splitlines()[0]
            request_line = request_line.rstrip('\r\n')
            # parse the request line
            (self.request_method,   # GET
             self.path,             # /path
             self.request_version   # HTTP/1.1
             ) = request_line.split()
        except StandardError as e:
            pass

    def get_env(self):
        env = {}
        # WSGI variables
        env['wsgi.version'] = (1, 0)
        env['wsgi.url_scheme'] = 'http'
        env['wsgi.input'] = StringIO.StringIO(self.request_data)
        env['wsgi.errors'] = sys.stderr
        env['wsgi.multithread'] = False
        env['wsgi.multiprocess'] = False
        env['wsgi.run_once'] = False
        # basic CGI variables
        env['REQUEST_METHOD'] = self.request_method    # GET
        env['PATH_INFO'] = self.path                   # /hello
        env['SERVER_NAME'] = self.server_name          # localhost
        env['SERVER_PORT'] = str(self.server_port)     # 7002
        return env

    def start_response(self, status, response_headers, exc_info=None):
        # Add necessary server headers
        server_headers = [
            ('Description', 'Build with python2'),
            ('Server', 'WSGIServer'),
        ]
        self.headers_set = [status, response_headers + server_headers]

    def finish_response(self, result):
        try:
            status, response_headers = self.headers_set
            response = 'HTTP/1.1 {status}\r\n'.format(status=status)
            for header in response_headers:
                response += '{0}: {1}\r\n'.format(*header)
            response += '\r\n'
            for data in result:
                response += data
            self.client_connection.sendall(response)
        finally:
            self.client_connection.close()


SERVER_ADDRESS = (HOST, PORT) = '', 7002


def make_server(server_address, application):
    server = WSGIServer(server_address)
    server.set_app(application)
    return server


if __name__ == '__main__':
    if len(sys.argv) < 2:
        sys.exit('Provide a WSGI application object as module:callable')
    app_path = sys.argv[1]  # 'flaskapp:app'
    module, application = app_path.split(':')
    module = __import__(module)
    application = getattr(module, application)
    httpd = make_server(SERVER_ADDRESS, application)
    print('WSGIServer: Serving HTTP on port {port} ...\n'.format(port=PORT))
    httpd.serve_forever()
This file takes one command-line argument of the form module:callable. We will use it to run a Flask application on this server.
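The module:callable lookup can be sketched on its own. The version below is a Python 3 sketch that uses importlib.import_module rather than __import__, because __import__('a.b') returns the top-level package a, not the submodule a.b. The function name load_application is made up for this example, and wsgiref.simple_server:demo_app is used only because it ships with the standard library.

```python
import importlib


def load_application(app_path):
    """Resolve a 'module:callable' string to the callable it names."""
    module_name, separator, attribute = app_path.partition(":")
    if not separator or not module_name or not attribute:
        raise ValueError("expected module:callable, got %r" % app_path)
    module = importlib.import_module(module_name)
    application = getattr(module, attribute)
    if not callable(application):
        raise TypeError("%s is not callable" % app_path)
    return application


print(load_application("wsgiref.simple_server:demo_app"))
```

Validating the argument up front gives a clearer error than the unpacking failure a bare split(':') would raise on input without a colon.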
Let's create flaskapp.py (do not name the file flask.py, or it will shadow the Flask package that it imports):
from flask import Flask
from flask import Response

flask_app = Flask('flaskapp')


@flask_app.route('/')
def index():
    return Response(
        'Welcome to the world of Flask!\n',
        mimetype='text/plain'
    )


app = flask_app.wsgi_app
Now run the command python server.py flaskapp:app, then open http://localhost:7002/ in a browser to see the running Flask application.