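Outline
- Concurrency
- Network
- Database
- Distributed System
- Performance
- Estimation
- Object-Oriented Design
Case studies
- Social network news feed
- Log statistics
- Web crawler
- E-commerce product page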
Concurrency
Thread vs. Process
Consumer and Producer
BlockingQueue
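A minimal sketch of the producer/consumer pattern on top of a blocking queue, using Python's standard `queue.Queue`. The names `producer`, `consumer`, `run_pipeline`, and `SENTINEL` are hypothetical, and the "work" done by the consumer (doubling each item) is only a placeholder.

```python
import queue
import threading

SENTINEL = object()  # hypothetical marker that tells the consumer to stop


def producer(q, items):
    """Put each item on the queue; put() blocks while the queue is full."""
    for item in items:
        q.put(item)
    q.put(SENTINEL)


def consumer(q, results):
    """Take items until the sentinel arrives; get() blocks while the queue is empty."""
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(item * 2)  # placeholder for real work


def run_pipeline(items, capacity=2):
    """Run one producer thread and one consumer thread over a bounded queue."""
    q = queue.Queue(maxsize=capacity)  # bounded, so a fast producer is throttled
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a single producer and a single consumer the queue preserves order, so `run_pipeline([1, 2, 3])` yields the doubled items in the same order. The bounded capacity is the design choice that matters: it gives back-pressure instead of unbounded memory growth.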
Tracking (logging):
Synchronous: write every request straight to disk
Asynchronous: write to a buffer first, then flush to disk at regular intervals
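The two logging strategies can be sketched as follows. `SyncLogger` and `BufferedLogger` are hypothetical names. To keep the example deterministic, the buffered version flushes when the buffer reaches `max_buffered` records or when `flush()` is called explicitly; a periodic flush, as described above, could be had by calling `flush()` from a timer thread, which is left out here.

```python
import os
import threading


class SyncLogger:
    """Synchronous: every record is written and fsynced before log() returns."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def log(self, record):
        self._file.write(record + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())  # force the OS to write to the device

    def close(self):
        self._file.close()


class BufferedLogger:
    """Asynchronous style: records collect in memory and are written in batches."""

    def __init__(self, path, max_buffered=100):
        self._file = open(path, "a", encoding="utf-8")
        self._buffer = []
        self._lock = threading.Lock()
        self._max_buffered = max_buffered  # assumed threshold, tune as needed

    def log(self, record):
        with self._lock:
            self._buffer.append(record)
            if len(self._buffer) >= self._max_buffered:
                self._flush_locked()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold the lock.
        if self._buffer:
            self._file.write("\n".join(self._buffer) + "\n")
            self._file.flush()
            self._buffer.clear()

    def close(self):
        self.flush()
        self._file.close()
```

The trade-off is durability against throughput: the synchronous logger pays a disk write per request, while the buffered one can lose whatever is still in memory if the process crashes.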
Network
OSI model (data is encapsulated from the top layer down)
- Application Layer (*HTTP/1.0 and HTTP/1.1)
- Presentation Layer
- Session Layer
- Transport Layer (*TCP, UDP)
- Network Layer
- Data Link Layer
- Physical Layer
Visit URL
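What happens after you type a URL into your browser and press the Return key?
(*Addressing and connection setup are the key steps)
- The browser queries DNS
- DNS returns the web server's IP address
- The browser opens a connection to the web server (TCP three-way handshake, port 80)
- The browser and server establish an HTTP session (the browser receives data)
- The browser parses the data and renders the page
- Closing the browser ends the HTTP session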
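The steps above can be walked through with raw sockets. This is a simplified sketch for plain HTTP on port 80 only (no TLS, redirects, or rendering); `build_request` and `fetch` are hypothetical helper names, and `example.com` in the usage note is just a placeholder host.

```python
import socket
from urllib.parse import urlsplit


def build_request(url):
    """Return (host, port, raw request bytes) for a plain-HTTP GET."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return host, port, request


def fetch(url, timeout=5.0):
    """DNS lookup, TCP connect, HTTP request, read the response, close."""
    host, port, request = build_request(url)
    # Query DNS; the resolver returns one or more addresses for the host.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        # connect() performs the TCP three-way handshake.
        sock.connect(address)
        # Send the HTTP request, then read until the server closes.
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Leaving the with-block closes the socket and ends the session.
    return b"".join(chunks)
```

Calling `fetch("http://example.com/")` would return the raw response bytes (status line, headers, body); parsing and rendering are what the browser does next.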
Database
Relational DB vs. Key-Value Store
Sharding vs. Clustering
TinyURL:
Store the mapping from shortlink code to full URL.
document:
- code:varchar(8)
- url:varchar(1000)
- created_at:timestamp
- We also need to store the reverse mapping from URL back to code.
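One way to picture this record is as a single table with a secondary index for the reverse lookup. The sketch below uses SQLite purely for illustration; the table name `shortlink`, the index name, and the helper functions are hypothetical, and making `url` unique (so the same URL always maps to one code) is an assumption rather than something the notes state.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the mapping table and the reverse-lookup index."""
    conn = sqlite3.connect(path)
    # SQLite accepts VARCHAR(n) but does not enforce the length limit.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS shortlink (
            code       VARCHAR(8)    PRIMARY KEY,
            url        VARCHAR(1000) NOT NULL,
            created_at TIMESTAMP     NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # The index on url serves the reverse mapping (URL -> code).
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS shortlink_url ON shortlink (url)"
    )
    return conn


def url_for(conn, code):
    """Forward lookup: shortlink code -> full URL, or None."""
    row = conn.execute(
        "SELECT url FROM shortlink WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None


def code_for(conn, url):
    """Reverse lookup: full URL -> shortlink code, or None."""
    row = conn.execute(
        "SELECT code FROM shortlink WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None


def save(conn, code, url):
    """Store a new mapping, or return the existing code if the URL is known."""
    existing = code_for(conn, url)
    if existing is not None:
        return existing
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO shortlink (code, url) VALUES (?, ?)", (code, url)
        )
    return code
```

In a key-value store the same idea would usually be two keys per link, one for each direction.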
Distributed System
How to scale the TinyURL service?
- Stateless frontend servers behind a load balancer
- Sharded/replicated database (sharded on shortlink code)
- Memcached to scale read traffic (add a cache at each layer to improve performance)
- Spread write load (distribute writes evenly)
- Locally buffered event tracking + async flush to a high-throughput message queue
- Use a distributed unique ID generator (64-bit)
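As an illustration of the last bullet, here is a sketch of one common way to build 64-bit IDs without coordination between servers: pack a millisecond timestamp, a worker ID, and a per-millisecond sequence into one integer. The 41/10/12 bit split, the `EPOCH_MS` value, and all names are assumptions for the example, not something the notes specify. `to_base62` shows how such an ID could be turned into a shortlink code.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch, in milliseconds
WORKER_BITS = 10              # assumed: up to 1024 workers
SEQUENCE_BITS = 12            # assumed: up to 4096 IDs per worker per ms
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


class IdGenerator:
    """Generates increasing IDs that are unique per worker_id."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock  # injectable so the generator can be tested
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                now = self._last_ms  # clock moved backwards: reuse last ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )


def to_base62(n):
    """Encode a non-negative integer using digits, upper case, lower case."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))
```

Note that a full 64-bit value needs up to 11 base-62 characters, so fitting the `varchar(8)` code column would require a smaller ID space or a wider column.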
Performance
Cache is KEY!
- Numbers Everyone Should Know
- L1 cache reference 0.5ns
- Branch mispredict 5ns
- L2 cache reference 7ns
- Mutex lock/unlock 25ns
- Main memory reference 100ns
- Compress 1K bytes with Zippy 3,000ns
- Send 2K bytes over 1 Gbps network 20,000ns
- Read 1 MB sequentially from memory 250,000ns
- Round trip within same datacenter 500,000ns
- Disk seek 10,000,000ns
- Read 1 MB sequentially from disk 20,000,000ns
- Send packet CA->Netherlands->CA 150,000,000ns
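The ratios between these numbers are what make caching matter. A small sketch that puts a few of the figures above into a table and compares them (the dictionary name and the `ratio` helper are hypothetical):

```python
# Latencies in nanoseconds, copied from the list above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within same datacenter": 500_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
}


def ratio(slow, fast):
    """How many times slower the first operation is than the second."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    print(ratio("Read 1 MB sequentially from disk",
                "Read 1 MB sequentially from memory"))
    print(ratio("Disk seek", "Round trip within same datacenter"))
```

By these figures, reading 1 MB from disk is 80 times slower than reading it from memory, and a single disk seek costs as much as 20 round trips inside the datacenter, which is why serving reads from an in-memory cache on another machine can still beat local disk.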
Estimation
How many piano tuners are there in the entire world?
TinyURL: how much total storage is needed?
- URL length: 10-1000 chars
- Total accumulated URLs: 100M
- New URL registrations are on the order of 100,000/day (about 1/sec)
- Redirect requests are on the order of 100M/day (about 1,000/sec)
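A back-of-the-envelope calculation from the figures above. The assumptions are labeled in the code: one byte per URL character, an 8-byte code, an 8-byte timestamp, and no index or replication overhead. The 100-character "typical" URL used in the usage note is also an assumption, picked only to get a middle estimate.

```python
SECONDS_PER_DAY = 24 * 60 * 60

TOTAL_URLS = 100_000_000         # from the notes
NEW_URLS_PER_DAY = 100_000       # from the notes
REDIRECTS_PER_DAY = 100_000_000  # from the notes

CODE_BYTES = 8       # matches code: varchar(8)
TIMESTAMP_BYTES = 8  # assumption: 64-bit timestamp


def storage_bytes(url_bytes):
    """Raw storage for all records, assuming 1 byte per URL character."""
    return TOTAL_URLS * (CODE_BYTES + url_bytes + TIMESTAMP_BYTES)


def per_second(per_day):
    """Convert a daily count to an average per-second rate."""
    return per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for length in (10, 100, 1000):
        print(length, storage_bytes(length) / 1e9, "GB")
    print(per_second(NEW_URLS_PER_DAY), "writes/sec")
    print(per_second(REDIRECTS_PER_DAY), "reads/sec")
```

Under these assumptions the raw data is between 2.6 GB (all URLs 10 chars) and about 102 GB (all URLs 1000 chars), with roughly 11.6 GB at 100 chars per URL. The average rates come out near 1.16 writes/sec and 1,157 reads/sec, so the workload is read-heavy by about 1000 to 1.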
Design Pattern
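23 patterns
- MVC (separates front end and back end, connected through a middle layer)
- Singleton (guarantees that only one instance exists)
- Factory (factory framework that produces a family of subclasses)
- Iterator (provides sequential access to the elements of an aggregate object without exposing its internals)
- Decorator (attaches additional responsibilities to an object)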
- Facade
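A compact sketch of four of the patterns listed above. All class and function names (`Config`, `Text`, `Bold`, `Italic`, `make_style`, `Playlist`) are hypothetical and chosen only for illustration; the Singleton shown is not thread-safe.

```python
class Config:
    """Singleton: __new__ always hands back the same instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


class Text:
    """Plain component that the decorators below wrap."""

    def __init__(self, content):
        self.content = content

    def render(self):
        return self.content


class Bold:
    """Decorator: wraps any object with render() and adds markup around it."""

    def __init__(self, inner):
        self._inner = inner

    def render(self):
        return "<b>" + self._inner.render() + "</b>"


class Italic:
    """A second decorator with the same interface."""

    def __init__(self, inner):
        self._inner = inner

    def render(self):
        return "<i>" + self._inner.render() + "</i>"


def make_style(name, inner):
    """Factory: the caller names a style and gets the matching object back."""
    styles = {"bold": Bold, "italic": Italic}
    return styles[name](inner)


class Playlist:
    """Iterator: callers loop over songs without touching the internal list."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        return iter(self._songs)
```

Decorators compose because each one exposes the same `render()` interface as the object it wraps, so `Bold(Italic(Text("hi"))).render()` nests the markup.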