Bigtable
BigTable is a compressed, high-performance, highly scalable data storage system built on the Google File System (GFS). It stores large-scale structured data and is suited to cloud computing.
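Development of BigTable began in 2004,[1] and it is now used by many Google applications. MapReduce jobs, for example, often read data from and write data to BigTable,[2] and other users include Google Reader[3], Google Maps[4], Google Book Search, "My Search History", Google Earth, Blogger.com, Google Code hosting, Orkut[4], YouTube[5], and Gmail[6]. Google built its own specialized large-scale database chiefly for reasons of performance.[7]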
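BigTable is not a traditional relational database and does not support SQL features such as JOIN. It is closer to the table-oriented stores found among today's NoSQL systems, and its strengths are scalability and performance. Each value in a table is indexed by a row key, a column key, and a timestamp. When rows hold web pages, the row key is the URL with its hostname components reversed, so www.google.com is stored as com.google.www.

BigTable uses a large number of tables, and each table is divided into tablets. Each tablet holds roughly 100-200 MB, and each machine serves about 100 tablets. Table data is stored in SSTables, which are immutable: once written, the stored data is never modified. Tables are also compressed, either at the table level or at the system level. To locate data, a client holds a pointer to the META0 tablet, and the META0 tablet stores the records of all META1 tablets.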
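The data model described above can be illustrated with a small Python sketch. It is a toy in-memory model, not BigTable's implementation or client API: the names `reverse_hostname` and `TinyTable`, and the column name `contents`, are hypothetical and chosen only for illustration. It shows how a hostname is reversed to form a row key and how a value is addressed by row key, column key, and timestamp.

```python
from typing import Dict, List, Optional, Tuple


def reverse_hostname(hostname: str) -> str:
    """Reverse the dot-separated labels: www.google.com -> com.google.www."""
    return ".".join(reversed(hostname.split(".")))


class TinyTable:
    """Toy model of a (row key, column key, timestamp) -> value map.

    Hypothetical class for illustration only; it keeps everything in memory
    and ignores tablets, SSTables, and compression.
    """

    def __init__(self) -> None:
        # (row key, column key) -> list of (timestamp, value), newest first.
        self._cells: Dict[Tuple[str, str], List[Tuple[int, bytes]]] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        """Store a new version of a cell; older versions are kept."""
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row: str, column: str,
            timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the newest value, or the newest one at or before `timestamp`."""
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def scan_prefix(self, prefix: str) -> List[str]:
        """Return row keys starting with `prefix`, in sorted order."""
        return sorted({row for row, _ in self._cells if row.startswith(prefix)})


if __name__ == "__main__":
    table = TinyTable()
    table.put(reverse_hostname("www.google.com"), "contents", 1, b"old page")
    table.put(reverse_hostname("www.google.com"), "contents", 2, b"new page")
    table.put(reverse_hostname("maps.google.com"), "contents", 1, b"maps page")

    print(table.get("com.google.www", "contents"))      # newest version
    print(table.get("com.google.www", "contents", 1))   # version at timestamp 1
    print(table.scan_prefix("com.google."))             # rows of one domain
```

One consequence of reversing the hostname is visible in `scan_prefix`: rows from the same domain share a key prefix, so they sort next to each other and can be read with a single prefix scan.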
Notes
- [1] "First an overview. BigTable has been in development since early 2004 and has been in active use for about eight months (about February 2005)." Google's BigTable (archived copy at the Internet Archive)
- [2] "Bigtable can be used with MapReduce, a framework for running large-scale parallel computations developed at Google. We have written a set of wrappers that allow a Bigtable to be used both as an input source and as an output target for MapReduce jobs". p. 3 of "Bigtable: A Distributed Storage System for Structured Data", 2006
- [3] "Reader is using Google's BigTable in order to create a haven for what is likely to be a massive trove of items." Official Google Reader blog (archived copy at the Internet Archive)
- [4] "There are currently around 100 cells for services such as Print, Search History, Maps, and Orkut." Google's BigTable (archived copy at the Internet Archive)
- [5] "Their new solution for thumbnails is to use Google's BigTable, which provides high performance for a large number of rows, fault tolerance, caching, etc. This is a nice (and rare?) example of actual synergy in an acquisition." YouTube Scalability Talk (archived copy at the Internet Archive)
- [6] "How Entities and Indexes are Stored - Google App Engine - Google Code". Retrieved 2011-04-05. (Archived from the original on 2011-10-06.)
- [7] "We have described Bigtable, a distributed system for storing structured data at Google....Our users like the performance and high availability provided by the Bigtable implementation, and that they can scale the capacity of their clusters by simply adding more machines to the system as their resource demands change over time...Finally, we have found that there are significant advantages to building our own storage solution at Google. We have gotten a substantial amount of flexibility from designing our own data model for Bigtable." From the Conclusion of "Bigtable: A Distributed Storage System for Structured Data", 2006
External links
- Bigtable: A Distributed Storage System for Structured Data (archived copy at the Internet Archive) (official paper; PDF, archived copy at the Internet Archive)
- BigTable: A Distributed Structured Storage System (video, archived copy at the Internet Archive)
- more video
- Google's BigTable (notes on the official presentation)
- "How Google Works" [dead link]
- Is the Relational Database Doomed?