VOLTDB

PostgreSQL founder

address relational failures

13 rules for relational databases (Codd's rules, numbered 0 to 12), so a long history

user

logical  ER no relationship to user model??

physical

what is stored is an extraction of the data

relational does not have to be slow is the presenter's theme right now.  implementation choice makes it fast or slow

VOLTDB

data in memory  good speed but lose state (what is the data?)

all engineering is compromise

in memory  lockless  relational acid transaction sql

how fast is memory over disk?  plenty faster (hard to tie down)  a figure of 10,000 but hard to measure, a little more expensive

Durability  RAM  a UPS makes RAM more durable  make multiple copies.  How long?  Throw money at it.  memory copies

Speed

lockless    i.e. take out logging  (but logging was put in to make it faster!!)

logic of query given upfront then the database forms as an optimal entity to max speed / queries

ravenDB deepdive

Rob Ashton

uses LINQ   creates indexes   uses Lucene

can run in memory  fast   (couch doesn’t do this?) (couch http but this does not have to)

ids all generated client side via hi-lo
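
A minimal sketch of the hi-lo idea, not RavenDB's actual client code: the client asks the server for a "hi" number only occasionally and hands out a block of ids locally from it. The class name, the capacity of 32 and the fake server are all assumptions for illustration.

```python
class HiLoGenerator:
    """Hands out ids locally, asking the server only for a new 'hi' block."""

    def __init__(self, next_hi, capacity=32):
        self._next_hi = next_hi      # callable standing in for a server round trip
        self._capacity = capacity    # ids available per hi value (assumed figure)
        self._hi = None
        self._lo = capacity          # forces a fetch on first use

    def next_id(self):
        if self._lo >= self._capacity:
            # Block used up: one round trip buys another `capacity` ids.
            self._hi = self._next_hi()
            self._lo = 0
        value = self._hi * self._capacity + self._lo + 1
        self._lo += 1
        return value


def make_fake_server():
    """Hypothetical stand-in for the database: returns 0, 1, 2, ... per call."""
    state = {"hi": -1, "calls": 0}

    def next_hi():
        state["hi"] += 1
        state["calls"] += 1
        return state["hi"]

    return next_hi, state
```

With a capacity of 4, six ids cost only two calls to the fake server, which is the point of the scheme.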

does automatic creation of indexes    (but can create those yourself)

does links but doesn't call them JOINs   uses projection to do that automatically?

couchdb

Simon Wells   swells  at  computing.dundee.ac.uk

arguments on the web how to model in code  how can nosql be used? Chose couchdb that gives flexibility

couchdb   document database

queried and indexed  MapReduce style  using JavaScript   RESTful  JSON API  works with anything that supports HTTP  e.g. curl  core API

Futon  built-in web based admin console,  third party client libraries

incremental replication   bi-directional   conflict detection and resolution

couchone.com  personal data store for mobile data??

written in Erlang/OTP   language   uptime just about 100%

any discrete representation of meaning   RESTful   etc

JSON objects  capture anything  text to video ie whole webpage

Key Value store

curl the ip  get back a JSON object  parse it to get access to data

runs on port 5984 by default
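
A rough sketch of the curl step in Python, assuming a server on the default port. The helper names are made up here, and the sample body only imitates the kind of welcome document a server root returns (the version string is a placeholder, not real output).

```python
import json
from urllib.request import urlopen

COUCH_PORT = 5984  # CouchDB's default port


def couch_url(host, path="", port=COUCH_PORT):
    """Build a URL such as http://127.0.0.1:5984/mydb/doc-id."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def parse_body(body):
    """Turn the JSON text the server sends back into Python objects."""
    return json.loads(body)


def get_json(host, path=""):
    """Roughly `curl http://host:5984/path`; needs a running server."""
    with urlopen(couch_url(host, path)) as response:
        return parse_body(response.read().decode("utf-8"))


# Placeholder body standing in for a real response; no server is contacted.
sample = '{"couchdb": "Welcome", "version": "x.y.z"}'
info = parse_body(sample)
```

Once parsed, the data is just a dictionary, e.g. `info["couchdb"]`.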

tutorial on how to use command line and futon tool kit UI.  The basics well told.

replication is a click  local db and/or remote db, or two remote dbs   (like the sound of that)

MapReduce  used in couchdb

views live in special ‘design documents’  and results come back as JSON rows of  id  key  value
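
To make the id / key / value shape concrete, here is a small Python imitation of a map and reduce pass. In CouchDB the map and reduce functions are JavaScript stored in a design document; everything below (documents, function names) is invented for illustration.

```python
from collections import defaultdict

# Made-up documents standing in for what a database might hold.
docs = [
    {"_id": "a", "type": "post", "author": "ann", "words": 120},
    {"_id": "b", "type": "post", "author": "bob", "words": 80},
    {"_id": "c", "type": "post", "author": "ann", "words": 50},
]


def map_doc(doc):
    """Emit (key, value) pairs for one document, like a view's map function."""
    if doc.get("type") == "post":
        yield doc["author"], doc["words"]


def run_map(all_docs, map_fn):
    """Build view rows of id, key, value, sorted by key."""
    rows = [
        {"id": d["_id"], "key": k, "value": v}
        for d in all_docs
        for k, v in map_fn(d)
    ]
    return sorted(rows, key=lambda r: (r["key"], r["id"]))


def run_reduce(rows, reduce_fn):
    """Group row values by key and reduce each group to one value."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["key"]].append(row["value"])
    return {key: reduce_fn(values) for key, values in grouped.items()}


rows = run_map(docs, map_doc)
totals = run_reduce(rows, sum)
```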

helper functions within couchdb allow JSON to be turned into e.g. HTML

can work off line  couchAPP  then upload to db

www.couchone.com

don’t recommend books

hadoop jonathon and hbase

Eelektrifi

scaling not a question any more

difference between structured and semi structured

listen to  Abhishek Mehta   cloudera.com  head of big data at bank of america

mining data is where all the value is now,   storage is commodity

Daniel Abadi  listen to his talks

semi structured data  where the game is at,  why is this important.

google uses mapreduce  manages processing in effect

HBASE?

oracle becoming expensive and mysql has problems at volumes as low as 500GB

zookeeper   coordinates servers  (cassandra does not have?)

basics  row and column intersect at a cell   column families defined upfront   rest done automatically

commodity hardware  but memory more in demand

think of column family as being tagged groups of data
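
One way to picture this, as a toy Python structure and not the HBase API: families are fixed when the table is created, while columns inside a family can be added freely per row. All names here are hypothetical.

```python
class TinyTable:
    """Toy model: row key -> column family -> column -> value."""

    def __init__(self, families):
        self.families = set(families)   # defined upfront
        self.rows = {}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {f: {} for f in self.families})
        row[family][column] = value     # columns appear on demand

    def get(self, row_key, family, column):
        return self.rows.get(row_key, {}).get(family, {}).get(column)


table = TinyTable(["info", "stats"])
table.put("user1", "info", "name", "Ann")
table.put("user1", "stats", "logins", 3)
table.put("user2", "info", "email", "b@example.com")  # different columns, same family
```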

no join world

BASE    like cap for Nosql

strata conference feb 2011

cassandra nosql

andy  presenter   @andycobley   acobely  at  computing.dundee.ac.uk

distributed and decentralized   column orientated  key value store (point to columns)  Fault tolerant (machine goes down keeps on going)

JAVA  apache project   facebook open sourced

CAP triangle   (theorem)  or Brewer's theorem

consistency  availability  partition tolerance  have TWO but not all three?

Changes are dramatic in cassandra, so be aware: documentation can be wrong      read the unit tests to see what is going on

EVENTUAL CONSISTENCY
multiple nodes  can be rack aware  keys can be replicated across nodes,  DHT distributed hash table  think BitTorrent  (basically same technology)
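
A small sketch of the hash ring idea behind a DHT, under simplifying assumptions (MD5 positions, no virtual nodes, no rack awareness). It is not Cassandra's partitioner; node names and the class are made up.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a node name or key to a position on the ring."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [p for p, _ in self._ring]

    def replicas(self, key, copies=2):
        """Walk clockwise from the key's position and take the next nodes."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(copies, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.replicas("user:42", copies=2)
```

Any client that knows the node list computes the same owners for a key, so no central lookup is needed.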

REPLICATION
how many copies   rack unaware   can also make it data center aware

CONSISTENCY LEVELS
How many copies do we want to respond?  How accurate is the data?   column families   use separate values for read and write
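
The usual rule of thumb behind separate read and write levels can be written down in a couple of lines: if the replicas answering a read plus the replicas acknowledging a write exceed the number of copies, the two sets must overlap. The function names are mine, a sketch rather than any Cassandra API.

```python
def quorum(replication_factor):
    """Smallest majority of the copies."""
    return replication_factor // 2 + 1


def overlaps(replication_factor, write_acks, read_acks):
    """True when every read set shares at least one replica with every write set."""
    return read_acks + write_acks > replication_factor
```

For three copies, quorum writes and quorum reads (2 + 2 > 3) overlap, while writing to one and reading from one (1 + 1) does not, which is where eventual consistency shows up.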

http://wiki.apache.org/cassandra/API

WRITE READ SPEED
writes are much faster than reads  a write goes to the commit log and memtable, then is flushed to disk   all writes are sequential

READS much slower
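
A toy version of that write path, to show why writes are cheap and reads may have to look in several places. The in-memory lists stand in for files on disk, and the class, threshold and method names are assumptions.

```python
class TinyStore:
    def __init__(self, flush_threshold=3):
        self.commit_log = []     # append-only record of every write
        self.memtable = {}       # latest values held in memory
        self.sstables = []       # immutable sorted batches "on disk", newest last
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # sequential append, no seek
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted batch and start a fresh one."""
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key):
        """Check memory first, then each flushed batch from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```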

columns differing lengths and Supercolumns  family of columns  e.g blog entry and sub comments

supercolumns can be ordered

look at the real code for  a bloggyappy

NOSQL start

@garyshort

The Nosql philosophy

why it began,  really NoREL (no relational)

store only once, storage now cheap

in relationship

ACID
atomic – if a transaction happens, it all happens – a guarantee
consistency – move from one valid state to the next
isolation – change a row, no one else can get to it.
durability – once committed it stays, can be played over and over again
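
A quick illustration of the atomic part using Python's built-in sqlite3 module. The accounts table and the transfer rule are invented for the example; the point is only that a failed transfer leaves both rows untouched.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ann", 100), ("bob", 20)])
conn.commit()
```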

NoSQL  Johan Oskarsson   open source, non relational, distributed  no ACID guarantee

NO:east conference Atlanta 2009

no fixed table schema  no rel   avoid join operations (costs)  scale horizontally

DOCUMENT STORAGE

RavenDB  Apache Jackrabbit  CouchDB  MongoDB  SimpleDB  XML databases – MarkLogic Server

GRAPH STORAGE

Allegro

node  edges

e.g. FlockDB   twitter use  node = users   edges  = relationships

KEY / VALUE STORES

on disk   cache  ram   concept of strong and weak  can be ordered  e.g. alphabetical order

OBJECT DATABASE

How to index  nosql databases???

key value pairs work

how index document database   depends on db

ravendb  example  (rob coder)  LINQ expression (retrieve data by predicate)

Thorny issue is indexing  but can be done

real world scenarios

constant consistency  goal
every point in time e.g. financial or medical records, or bonded goods e.g. warehouse of whisky   best to use the relational model

horizontal scalability goal
number of geographic regions, vast quantities of data, game server sharding    could use NoSQL for this    can use cloud SQL e.g. Amazon RDS

MOBILE stuff

rarely read  nosql  key value use

BIG DATA

weather   satellites   maps    use nosql  hadoop

BINARY BABY
e.g. youtube, flickr,   S3   don’t use NOSQL

Transient data   short term data  here today gone tomorrow  e.g. shopping cart  memcache

DATA REPLICATION e.g music example, desktop, mobile etc  couchDB

HIGH AVAILABILITY  e.g. gambling, pay per view,   high number important   e.g. cassandra

TWITTER

challenges   many graphs to store   follow, followme, reach status online to text   update – remove  set arithmetic  e.g. @mentions

tried  relational,  key index   blue whale
under the hood complicated, so need a simple solution    horizontal partition    operations arrive out of order, get processed more than once

result   FlockDB

stores graph

not optimised for graph traversal operations   factorial time  non-polynomial in n  mathematics
but limit to  follower  not whole graph at all time

Optimised for large adjacency lists   edges of the graph

Optimised for fast read write

Optimised for page  arithmetic

data partitioned by node i.e. per person  all queries answered by a single partition

idempotent  applied multiple times without changing results  e.g someone follows you twice without getting an error

Commutative

Idempotency – mathematics   an operation O on a set S is idempotent when  x O x = x  for every x in S

set union  A ∪ B    set intersection  A ∩ B

Commutative    ordering of operations does not matter, like doing sums    can apply changes immediately, or load a dump while live updates go through    happy to mix live and dump
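
A sketch of how follow and unfollow operations can be made both idempotent and commutative, assuming each operation carries a timestamp and the latest one wins. This is one common way to get those properties, not a description of FlockDB's internals; names and data are made up.

```python
from itertools import permutations


def apply_op(edges, op):
    """edges maps (src, dst) -> (timestamp, kind); the newest operation wins."""
    ts, kind, src, dst = op
    current = edges.get((src, dst))
    if current is None or (ts, kind) > current:
        edges[(src, dst)] = (ts, kind)
    return edges


def following(edges, src):
    """Everyone src currently follows."""
    return {d for (s, d), (_, kind) in edges.items() if s == src and kind == "follow"}


ops = [
    (1, "follow", "alice", "bob"),
    (2, "unfollow", "alice", "bob"),
    (3, "follow", "alice", "carol"),
    (3, "follow", "alice", "carol"),   # duplicate delivery of the same operation
]

# Apply the operations in every possible order and collect the outcomes.
outcomes = set()
for order in permutations(ops):
    edges = {}
    for op in order:
        apply_op(edges, op)
    outcomes.add(frozenset(following(edges, "alice")))
```

Every ordering, duplicates included, ends in the same state, so out-of-order arrival and retries stop being a problem.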

Performance  13 billion edges  20k writes per second  100k reads per second

Lessons learned  aggressive timeouts     same path for error and normal ops  ignore  just try and try until fail   Instrument

Punchline  –    mysql   sits below  flock

So is this the future?

Yes and no  Gary's point of view

techmeetup Aberdeen jQuery talk


scope in javascript  how functions work in JS

Example of how a JS closure gets access to variables within a function

two basics   scope and closure  get that then good basis to understand jQuery
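
The talk's example was in JavaScript; the sketch below shows the same scope and closure idea in Python, as an analogy only. The inner function keeps access to a variable from the enclosing function after that function has returned.

```python
def make_counter():
    count = 0  # lives in make_counter's scope

    def increment():
        nonlocal count   # the inner function closes over `count`
        count += 1
        return count

    return increment


first = make_counter()
second = make_counter()   # gets its own, separate `count`
```

Each counter carries its own private state, which is the pattern jQuery callbacks lean on.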

jQuery  cross platform / browser compatibility

$  used but can write  jQuery  instead

first exception   run function to start jQuery when html page has loaded,  then add functionality from there.

find method really useful for identifying elements within the document DOM

events:   e.g. click

bind   v live  difference

handles  POST  url  data input

Overall valuable intro talk.

unconference magic continues #blc8

I participated at barcampLondon8 at the weekend.  An unconference is a format of event that uses open space techniques to allow a community to self-form and learn collectively, and thus enables individuals to self-learn.  An unconference does not make much sense to most people new to the concept, but here is its magic:  with no specific planning, I attended a series of sessions on topics that I currently have a big need to learn about and solve problems around.  I had no idea who was turning up or that those topics would be put forward before I turned up.  But they were, and guess what, this sort of experience happens every time I attend such events.  Talking to others, this seems to be a shared outcome.

I led a session on the LifestyleLinking open source project and I participated in sessions on bee keeping, a beginner's guide to jQuery, the future of barcampLondon, DIY sustainable living, what is the semantic web?, TiddlyWiki, an open source charting app, the future of HTML, HTTP (what is a 200 call, 401 etc.), Agile development, how to start and contribute to an open source project, crowdsourcing and school IT education.  Not forgetting informal chats on the Monty Hall problem, that was at 5am Sunday morning.  Probably a term of University education in a weekend.  Thank you to all barcampers for making the event the magic it was.