Intro to Relational Databases (closed)

The exact list of types differs from one database to another. For a full list of types, check the manual for your database, such as this one for PostgreSQL.

Text and string types

text — a string of any length, like Python str or unicode types.

char(n) — a string of exactly n characters.

varchar(n) — a string of up to n characters.

Numeric types

integer — an integer value, like Python int.

real — a floating-point value, like Python float. Accurate up to six decimal places.

double precision — a higher-precision floating-point value. Accurate up to 15 decimal places.

decimal — an exact decimal value.

Date and time types

date — a calendar date; including year, month, and day.

time — a time of day.

timestamp — a date and time together.

lesson 2

All the tables in the zoo database

animals

This table lists individual animals in the zoo. Each animal has only one row. There may be multiple animals with the same name, or even multiple animals with the same name and species.

name — the animal's name (example: 'George')
species — the animal's species (example: 'gorilla')
birthdate — the animal's date of birth (example: '1998-05-18')

diet

This table matches up species with the foods they eat. Every species in the zoo eats at least one sort of food, and many eat more than one. If a species eats more than one food, there will be more than one row for that species.

species — the name of a species (example: 'hyena')
food — the name of a food that species eats (example: 'meat')

taxonomy

This table gives the (partial) biological taxonomic names for each species in the zoo. It can be used to find which species are more closely related to each other evolutionarily.

name — the common name of the species (e.g. 'jackal')
species — the taxonomic species name (e.g. 'aureus')
genus — the taxonomic genus name (e.g. 'Canis')
family — the taxonomic family name (e.g. 'Canidae')
t_order — the taxonomic order name (e.g. 'Carnivora')

If you've never heard of this classification, don't worry about it; the details won't be necessary for this course. But if you're curious, Wikipedia articles Taxonomy and Biological classification may help.

ordernames

This table gives the common names for each of the taxonomic orders in the taxonomy table.

t_order — the taxonomic order name (e.g. 'Cetacea')
name — the common name (e.g. 'whales and dolphins')

The SQL for it

And here are the SQL commands that were used to create those tables. We won't cover the create table command until lesson 4, but it may be interesting to look at:

create table animals (  
       name text,
       species text,
       birthdate date);

create table diet (
       species text,
       food text);  

create table taxonomy (
       name text,
       species text,
       genus text,
       family text,
       t_order text); 

create table ordernames (
       t_order text,
       name text);

Remember: In SQL, we always put string and date values inside single quotes.

The Experiment Page(一些SQL语法实例)

这些全是python语法，所以要注意缩进

# Uncomment one of these QUERY variables at a time and use "Test Run" to run it.
# You'll see the results below.  Then try your own queries as well!

#没看懂这个max的选取方法
QUERY = "select max(name) from animals;"

#选取10行
QUERY = "select * from animals limit 10;"

#以日期排序
QUERY = "select * from animals where species = 'orangutan' order by birthdate;"

#日期倒序排序
QUERY = "select name from animals where species = 'orangutan' order by birthdate desc;"

#没看懂offset什么意思嘛
QUERY = "select name, birthdate from animals order by name limit 10 offset 20;"

#min的选取规则是什么，group又是什么
QUERY = "select species, min(birthdate) from animals group by species;"

#count(*)什么意思
QUERY = '''
select name, count(*) as num from animals
group by name
order by num desc
limit 5;
'''

Select Clauses

Here are the new select clauses introduced in the previous video:

... limit count
Return just the first count rows of the result table.

... limit count offset skip
Return count rows starting after the first skip rows.

... order by columns
... order by columns desc
Sort the rows using the columns (one or more, separated by commas) as the sort key. Numerical columns will be sorted in numerical order; string columns in alphabetical order. With desc, the order is reversed (desc-ending order).

... group by columns
Change the behavior of aggregations such as max, count, and sum. With group by, the aggregation will return one row for each distinct value in columns.

Insert - Quiz

The basic syntax for the insert statement:

insert into table ( column1, column2, ... ) values ( val1, val2, ... );

If the values are in the same order as the table's columns (starting with the first column), you don't have to specify the columns in the insert statement:

insert into table values ( val1, val2, ... );

For instance, if a table has three columns (a, b, c) and you want to insert into a and b, you can leave off the column names from the insert statement. But if you want to insert into b and c, or a and c, you have to specify the columns.

A single insert statement can only insert into a single table. (Contrast this with the select statement, which can pull data from several tables using a join.)

join

To join two tables, first choose the join condition, or the rule you want the database to use to match rows from one table up with rows of the other table. Then write a join in terms of the columns in each table.

For instance, if you want to join tables T and S by matching rows where T.color is the same as S.paint, you'd write a select statement using T join S on T.color = S.paint.

lesson 3 python DB-API

3.3 Trying Out DB-API

# To see how the various functions in the DB-API work, take a look at this code,
# then the results that it prints when you press "Test Run".
#
# Then modify this code so that the student records are fetched in sorted order
# by student's name.
#

import sqlite3

# Fetch some student records from the database.
db = sqlite3.connect("students")
c = db.cursor()
query = "select name, id from students order by name;"
c.execute(query)
rows = c.fetchall()

# First, what data structure did we get?
print "Row data:"
print rows

# And let's loop over it too:
print
print "Student names:"
for row in rows:
  print "  ", row[0]

db.close()

output

Row data:
[(u'Diane Paiwonski', 773217), (u'Harry Evans-Verres', 172342), (u'Hoban Washburne', 186753), (u'Jade Harley', 441304), (u'Jonathan Frisby', 917151), (u'Melpomene Murray', 102030), (u'Robert Oliver Howard', 124816), (u'Taylor Hebert', 654321), (u'Trevor Bruttenholm', 162636)]

Student names:
   Diane Paiwonski
   Harry Evans-Verres
   Hoban Washburne
   Jade Harley
   Jonathan Frisby
   Melpomene Murray
   Robert Oliver Howard
   Taylor Hebert
   Trevor Bruttenholm

(prepa install the database

Udacity教程：https://www.udacity.com/wiki/ud197/install-vagrant

In Lessons 3 and 5 of this course, you'll use a virtual machine (VM) to run a database server and a web app that uses it. The VM is a Linux server system that runs on top of your own computer. You can share files easily between your computer and the VM.

1, git
2,virtual box, ubuntu 14.04 必须从官方商店下，不能去virtualbox的官网下
3，vagrant，去官网直接下最新版，debian
4，在bios里打开VT-x features， http://amiduos.com/support/knowledge-base/article/enabling-virtualization-in-lenovo-systems
5，Use Git/GitHub to fetch the VM configuration， https://github.com/udacity/fullstack-nanodegree-vm 这个库里有各种配置文件和上课用的东西
6，在terminal 配置，change directory to fullstack/vagrant (cd fullstack/vagrant), then type vagrant up to launch your virtual machine.

Once it is up and running, type vagrant ssh to log into it. This will log your terminal in to the virtual machine, and you'll get a Linux shell prompt. When you want to log out, type exit at the shell prompt. To turn the virtual machine off (without deleting anything), type vagrant halt. If you do this, you'll need to run vagrant up again before you can log into it.
Files and commands we’ll be using (Relational Databases)

Files installed for this class are located in the /vagrant directory inside the virtual machine. Everything here is automatically shared with thevagrant directory inside the fullstack directory on your computer. Any code files you save into that directory from your favorite text editor will be automatically available in the VM.

If you’d like to see what was installed in the VM, look in /vagrant/pg_config.sh.

In this class you will mostly be running your work in Python from the command line. In addition you’ll use the psql program to interact with the PostgreSQL database.

To connect psql to the forum database for Lesson 3, type psql forum at the command line. To exit psql, type \q or Control-D (^D).

问题：vagrant up -> vagrant ssh -> cd vagrant, 找不到forum文件

解决：cd .. 退出到和home的同级菜单，发现另一个vagrant文件夹，进入这个文件夹，有forum文件

3.5 Running the Forum

vagrant@vagrant-ubuntu-trusty-32:/vagrant/forum$ ls
forumdb.py forum.py forum.sql

vagrant@vagrant-ubuntu-trusty-32:/vagrant/forum$ python forum.py
Serving HTTP on port 8000...
这个命令在本地8000建了一个forum的页面，可以在这个页面里写一些post

crtl+c kill 掉进程后，再python forum.py重新启动一下。reload网页，发现之前的post都消失了，数据库里并没有保存之前的post。

本课要解决这个问题。

3.6 Hello PostgreSQL

启动PostgreSQL

vagrant@vagrant-ubuntu-trusty-32:/vagrant/forum$ psql forum
psql (9.3.10)
Type "help" for help.

forum=>

可以在=>之后直接输入各种命令。

一个小问题

forum=> select * from posts  # 这个命令没有写完，所以没有返回结果，必须加`;`
forum-> ;
content | time | id 
---------+------+----
(0 rows)

3.7Give that App a Backend

Instructor Notes

The forum database has already been created for you. Your code will need to connect to it using psycopg2.connect("dbname=forum") and then perform select and insert operations on the posts table.

The existing GetAllPosts function returns all the entries from a list. So its database version should return all the entries from the posts table.

And likewise, the existing AddPost function inserts an entry into a list.

You do not need to provide the time column when you insert a post. The table is set up to already provide a timestamp.

The existing GetAllPosts function sorts the posts using a Python sort function. When you implement this function using the database, can you avoid sorting in Python by doing it in SQL?

Hint: When performing insert operations consider using string substitution. Example:

name = "Jeffrey"
nickname = "Jeff"
print "Name is %s and nickname is %s" % (name, nickname)
The output would be: Name is Jeffrey and nickname is Jeff

forumdb.py源代码

#
# Database access functions for the web forum.
# 

import time

## Database connection
DB = []

## Get posts from database.
def GetAllPosts():
    '''Get all the posts from the database, sorted with the newest first.
    Returns:
      A list of dictionaries, where each dictionary has a 'content' key
      pointing to the post content, and 'time' key pointing to the time
      it was posted.
    '''
    posts = [{'content': str(row[1]), 'time': str(row[0])} for row in DB]
    posts.sort(key=lambda row: row['time'], reverse=True)
    return posts

## Add a post to the database.
def AddPost(content):
    '''Add a new post to the database.
    Args:
      content: The text content of the new post.
    '''
    t = time.strftime('%c', time.localtime())
    DB.append((t, content))

发现其实GetAllPosts其实并没有调用database，而且排序也不是用SQL，而是python。AddPost也要改，用SQL的方式来添加content。

修改后的

#
# Database access functions for the web forum.
# 

import time
import psycopg2

## Database connection
DB = []

## Get posts from database.
def GetAllPosts():
    
    DB = psycopg2.connect("dbname = forum")
    c = DB.cursor()
    c.execute(
        "select time, content from posts order by time desc"
        )
    posts = [{'content': str(row[1]), 'time': str(row[0])} for row in c.fetchall()]
    #posts.sort(key=lambda row: row['time'], reverse=True)
    DB.close()
    return posts

## Add a post to the database.
def AddPost(content):

    DB = psycopg2.connect("dbname = forum")
    c = DB.cursor()
    c.execute("insert into posts(content) values('%s')" % (content)
            )
    #t = time.strftime('%c', time.localtime())
    #DB.append((t, content))
    DB.commit()
    DB.close()

漏洞，单引号和SQl injection（ Bobby Tables Destroyer of Posts）

但是还是有问题，两个。

比如当我写一个post, it's not fun! ,会返回error。

因为c.execute("insert into posts(content) values('%s')" % (content))中values('%s'),会变为value(' it's not fun! '),前两个单引号'已经构成了content,后面的t无法被识别。

在forum里的post里写'); delete from posts; --，会把之前所有的post都删除。

把这个post放入value内，value(' '); delete from posts; --,整个语句就变为

c.execute("insert into posts(content) value(' '); delete from posts; --" % (content))

这样会执行delete语句

这是security hole, called an SQL injection attact.

wiki:SQL攻击（SQL injection），简称注入攻击，是发生于应用程序之数据库层的安全漏洞。简而言之，是在输入的字符串之中注入SQL指令，在设计不良的程序当中忽略了检查，那么这些注入进去的指令就会被数据库服务器误认为是正常的SQL指令而运行，因此遭到破坏或是入侵。

解决方法：（范例）

plant = "pumpkin"
c = conn.cursor()
c.execute("insert into garden values(%s)", (plant,))

其中(plant,)是python tuple，

A tuple is a sequence of immutable Python objects
The empty tuple is written as two parentheses containing nothing −
tup1 = ();
To write a tuple containing a single value you have to include a comma, even though there is only one value −
tup1 = (50,);

在AddPost中实际只要改一行即可，原代码

c.execute("insert into posts(content) values('%s')" % (content))

#改为
c.execute("insert into posts(content) values(%s)" , (content,)

要注意，因为是tuple，连接符不能用%，得用，

注意事项：

Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
The correct way to pass variables in a SQL command is using the second argument of the execute() method:

>>> SQL = "INSERT INTO authors (name) VALUES (%s);" # Note: no quotes
>>> data = ("O'Reilly", )
>>> cur.execute(SQL, data) # Note: no % operator

3.8 漏洞 Spammy Tables

Paste this text into your forum (including the script tags!) and submit it:

<script>
setTimeout(function() {
    var tt = document.getElementById('content');
    tt.value = "<h2 style='color: #FF6699; font-family: Comic Sans MS'>Spam, spam, spam, spam,<br>Wonderful spam, glorious spam!</h2>";
    tt.form.submit();
}, 2500);
</script>

结果，每个2、3秒就会自动提交一个写有SPAM！ＳＰＡＭ！的ｐｏｓｔ，一直在运行。

3.9 Updating Away the Spam - Exercise

Instructor Notes

The syntax of the update statement:

update table set column = value where restriction;

The restriction works the same as in select and supports the same set of operators on column values.

The like operator supports a simple form of text pattern-matching. Whatever is on the left side of the operator (usually the name of a text column) will be matched against the pattern on the right. The pattern is an SQL text string (so it's in 'single quotes') and can use the % sign to match any sub-string, including the empty string.

If you are familiar with regular expressions, think of the % in like patterns as being like the regex .* (dot star).

If you are more familiar with filename patterns in the Unix shell or Windows command prompt, % here is a lot like * (star) in those systems.

For instance, for a table row where the column fish has the value 'salmon', all of these restrictions would be true:

fish like 'salmon'
fish like 'salmon%'
fish like 'sal%'
fish like '%n'
fish like 's%n'
fish like '%al%'
fish like '%'
fish like '%%%'

And all of these would be false:

fish like 'carp'
fish like 'salmonella'
fish like '%b%'
fish like 'b%'
fish like ''

My Solution

在原有的execute语句前又添加了一句execute

def GetAllPosts():

    DB = psycopg2.connect("dbname = forum")
    c = DB.cursor()
    c.execute(
        "update posts set content = 'cheese!' where content like '%spam%';"
        ) # 添加line
    c.execute(
        "select time, content from posts order by time desc;"
        )

    posts = [{'content': str(row[1]), 'time': str(row[0])} for row in c.fetchall()]
    #posts.sort(key=lambda row: row['time'], reverse=True)
    DB.close()
    return posts

3.10 Deleting the cheese

The syntax for the delete command:

delete from table where restriction ;

The restriction works the same way as in select, with the same set of operators allowed.
添加了一句

    c.execute(
        "update posts set content = 'cheese!' where content like '%spam%';"
        )
    c.execute(
        "delete from posts where content like '%cheese%'; "
        )
    c.execute(
        "select time, content from posts order by time desc;"
        )

如果用where content = 'cheese'的话，则不能删除cheese!或cheeses.

Lesson 4: Deeper Into SQL

In this lesson, you'll learn how to design and create new tables and databases. You'll learn about normalized design, which makes it easier to write effective code using a database. You'll also learn how to use the SQL join operators to rapidly connect data from different tables.

Rules for normalized tables

In a normalized database, the relationships among the tables match the relationships that are really there among the data. Examples here refer to tables in Lessons 2 and 4.

Every row has the same number of columns.
In practice, the database system won't let us literally have different numbers of columns in different rows. But if we have columns that are sometimes empty (null) and sometimes not, or if we stuff multiple values into a single field, we're bending this rule.

The example to keep in mind here is the diet table from the zoo database. Instead of trying to stuff multiple foods for a species into a single row about that species, we separate them out. This makes it much easier to do aggregations and comparisons.
There is a unique key and everything in a row says something about the key.
The key may be one column or more than one. It may even be the whole row, as in the diet table. But we don't have duplicate rows in a table.

More importantly, if we are storing non-unique facts — such as people's names — we distinguish them using a unique identifier such as a serial number. This makes sure that we don't combine two people's grades or parking tickets just because they have the same name.
Facts that don't relate to the key belong in different tables.
The example here was the items table, which had items, their locations, and the location's street addresses in it. The address isn't a fact about the item; it's a fact about the location. Moving it to a separate table saves space and reduces ambiguity, and we can always reconstitute the original table using a join.
Tables shouldn't imply relationships that don't exist.
The example here was the job_skills table, where a single row listed one of a person's technology skills (like 'Linux') and one of their language skills (like 'French'). This made it look like their Linux knowledge was specific to French, or vice versa ... when that isn't the case in the real world. Normalizing this involved splitting the tech skills and job skills into separate tables.

Lesson 5 final project

fetchall()

#!/usr/bin/python

import psycopg2

conn = psycopg2.connect(database="testdb", user="postgres", password="pass123", host="127.0.0.1", port="5432")
print "Opened database successfully"

cur = conn.cursor()

cur.execute("SELECT id, name, address, salary  from COMPANY")
rows = cur.fetchall()
for row in rows:
   print "ID = ", row[0]
   print "NAME = ", row[1]
   print "ADDRESS = ", row[2]
   print "SALARY = ", row[3], "\n"

print "Operation done successfully";
conn.close()

When above program is executed, it will produce the following result:

Opened database successfully
ID =  1
NAME =  Paul
ADDRESS =  California
SALARY =  20000.0

ID =  2
NAME =  Allen
ADDRESS =  Texas
SALARY =  15000.0

ID =  3
NAME =  Teddy
ADDRESS =  Norway
SALARY =  20000.0

ID =  4
NAME =  Mark
ADDRESS =  Rich-Mond
SALARY =  65000.0

Operation done successfully

http://www.tutorialspoint.com/postgresql/postgresql_python.htm

两个final project的实例
基础功能：
https://github.com/jkrooskos/tournament_results/blob/master/tournament.py

完整功能：
https://benjaminbrandt.com/relational-databases-final-project/

最后编辑于：2017.11.27 05:18:36

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 211,884评论 6赞 492
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 90,347评论 3赞 385
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 157,435评论 0赞 348
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 56,509评论 1赞 284
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 65,611评论 6赞 386
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 49,837评论 1赞 290
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 38,987评论 3赞 408
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 37,730评论 0赞 267
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 44,194评论 1赞 303
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 36,525评论 2赞 327
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 38,664评论 1赞 340
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 34,334评论 4赞 330
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 39,944评论 3赞 313
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 30,764评论 0赞 21
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 31,997评论 1赞 266
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 46,389评论 2赞 360
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 43,554评论 2赞 349

Intro to Relational Databases (closed)

lesson 2

All the tables in the zoo database

The SQL for it

The Experiment Page(一些SQL语法实例)

Select Clauses

Insert - Quiz

join

lesson 3 python DB-API

3.3 Trying Out DB-API

(prepa install the database

3.5 Running the Forum

本课要解决这个问题。

3.6 Hello PostgreSQL

3.7Give that App a Backend

Instructor Notes

forumdb.py源代码

漏洞，单引号和SQl injection（ Bobby Tables Destroyer of Posts）

3.8 漏洞 Spammy Tables

3.9 Updating Away the Spam - Exercise

Instructor Notes

My Solution

3.10 Deleting the cheese

Lesson 4: Deeper Into SQL

Lesson 5 final project

fetchall()

推荐阅读更多精彩内容