Presenter: Andy, @andycobley, acobely at

Distributed and decentralized. Column-oriented key-value store (keys point to columns). Fault tolerant (a machine goes down, the cluster keeps on going).

Written in Java. Apache project, open sourced by Facebook.

CAP triangle (theory), also known as Brewer's theorem

Consistency, availability, partition tolerance: you can have TWO, but not all three at once.

Cassandra changes dramatically between versions, so be aware: the documentation is often wrong. Read the unit tests to see what is going on.

Multiple nodes, can be rack aware. Keys can be replicated across nodes. DHT (distributed hash table): think BitTorrent (basically the same technology).

You choose how many copies to keep (the replication factor). Placement can be rack-unaware, or you can also make it data center aware.
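
A minimal sketch of the DHT idea, assuming the simplest rack-unaware placement: hash each node onto a ring, hash the key, then take the owner and the next nodes clockwise until there are enough copies. The `Ring` class, the node names and the use of MD5 are illustrative assumptions, not Cassandra's actual code.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 picked only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: each node owns the keys that hash at or before its token."""

    def __init__(self, nodes, copies=3):
        self.copies = copies
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key):
        """Return the owner plus the next nodes clockwise, `copies` in total."""
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        count = min(self.copies, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


# Hypothetical four-node cluster keeping three copies of every key.
ring = Ring(["node-a", "node-b", "node-c", "node-d"], copies=3)
owners = ring.replicas("blog:42")
```

A rack-aware or data-center-aware version would skip candidates on the ring that share a rack or data center with a replica already chosen.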

How many copies do we want to respond? How accurate does the data need to be? Column families are used. Separate values for reads and writes.
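
A small sketch of the arithmetic behind those two questions, assuming the usual quorum rule: if the number of copies that must answer a read plus the number that must acknowledge a write is greater than the total number of copies, every read touches at least one copy holding the latest write. The function names are made up for illustration.

```python
def quorum(copies: int) -> int:
    """Smallest majority of `copies` replicas."""
    return copies // 2 + 1


def read_sees_latest_write(copies: int, read_acks: int, write_acks: int) -> bool:
    """True when any read set must overlap any write set (R + W > N)."""
    return read_acks + write_acks > copies


# Three copies: quorum reads plus quorum writes overlap,
# one-copy reads plus one-copy writes do not.
strong = read_sees_latest_write(3, quorum(3), quorum(3))
weak = read_sees_latest_write(3, 1, 1)
```

Lower values answer faster but may return stale data; higher values trade speed for accuracy.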

Writes are much faster than reads. A write goes to the commit log and the memtable, and is then flushed to disk. All writes are sequential.

READS are much slower: a read may have to check the memtable and several flushed files on disk.
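
A toy sketch of that write path, kept entirely in memory so it runs anywhere. `ToyStore`, `flush_threshold` and the list standing in for on-disk files are assumptions for illustration; a real node persists the commit log and the flushed tables to disk.

```python
class ToyStore:
    """In-memory imitation of commit log -> memtable -> flushed tables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []      # append-only, stands in for the sequential log file
        self.memtable = {}        # recent writes, held in memory
        self.flushed = []         # immutable sorted tables, newest last
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # Writes only append and update memory: no seek, no read-before-write.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Dump the memtable as one sorted, immutable table and start a new one.
        self.flushed.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key):
        # Reads check the memtable first, then every flushed table, newest first.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.flushed):
            for k, v in table:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.write("a", "1")
store.write("b", "2")   # second write reaches the threshold and triggers a flush
store.write("a", "3")   # newer value for "a" lives only in the memtable
```

The read method shows why reads cost more: one write is a single append, while one read may walk several tables.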

Rows can have differing numbers of columns. Supercolumns: a family of columns, e.g. a blog entry and its sub-comments.

Supercolumns can be ordered.
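
One way to picture the supercolumn model is as nested maps: row key, then supercolumn name, then column name, then value. The sketch below uses plain dicts and invented blog data; it only illustrates the shape and is not a Cassandra client call.

```python
# row key -> supercolumn name -> column name -> value (all data is made up)
column_family = {
    "post-1": {
        "entry": {"title": "Hello", "body": "First post"},
        "comment-002": {"author": "bob", "text": "Nice"},
        "comment-001": {"author": "alice", "text": "Welcome", "url": "example.org"},
    },
}


def supercolumns_in_order(row_key):
    """Return the row's supercolumns sorted by name, as an ordered store would."""
    row = column_family[row_key]
    return [(name, row[name]) for name in sorted(row)]


names = [name for name, _ in supercolumns_in_order("post-1")]
```

Each supercolumn carries its own set of columns, so the two comments do not need the same fields.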

Look at the real code for bloggyappy.

