1 May 2002
Operating Systems

Distributed Computing
Chapter 17
------------
Atomic Transactions
  higher-level synchronization than the "semaphore"-like mechanisms above
  can continue to work, but until commit, it is as if nothing changed
  multiple processes involved
  all or nothing: if one crashes, no commit
                  rollback

  Assume message loss handled by retransmission
  assume stable storage
  
  Properties: ACID
    Atomic - appears indivisible to outside world
    Consistent - does not violate system invariants, can be broken inside
    Isolated - serializable, concurrent transactions do not interfere with
               one another
    Durable - after commit, changes are permanent

  Nested transactions: parent starts children
                      what if children commit, but parent does not?
  implementation
    private workspace - copy files/objects which are changed
  recovery log
    how to undo operations - only change after log successfully written
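
A minimal sketch of the recovery-log idea, assuming a dictionary stands in for the data and a local file with os.fsync stands in for stable storage. The names UndoLog and update are hypothetical, chosen only for illustration. The old value is forced to the log before the data is changed, so a rollback can restore it.

```python
import json
import os
import tempfile


class UndoLog:
    """Append-only log of old values (hypothetical helper for illustration)."""

    def __init__(self, path):
        self.path = path

    def record(self, key, existed, old_value):
        # Force the record to disk *before* the caller changes the data.
        with open(self.path, "a") as f:
            f.write(json.dumps({"key": key, "existed": existed, "old": old_value}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def rollback(self, store):
        # Undo in reverse order so the earliest saved value is restored last.
        if not os.path.exists(self.path):
            return
        with open(self.path) as f:
            records = [json.loads(line) for line in f]
        for rec in reversed(records):
            if rec["existed"]:
                store[rec["key"]] = rec["old"]
            else:
                store.pop(rec["key"], None)
        os.remove(self.path)

    def commit(self):
        # After commit the saved old values are no longer needed.
        if os.path.exists(self.path):
            os.remove(self.path)


def update(store, log, key, value):
    # Log first, change second.
    log.record(key, key in store, store.get(key))
    store[key] = value


store = {"x": 1}
log = UndoLog(os.path.join(tempfile.mkdtemp(), "undo.log"))
update(store, log, "x", 2)
update(store, log, "y", 3)
log.rollback(store)  # store is back to its state before the two updates
```
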
  two phase commit (Gray, 1978)

    coordinator                participant

    phase 1:                               
    write Prepare in log      
    send Prepare msg
                               write Ready (promise to commit)
                               send Ready
    ------------------------------------------
    phase 2:
    write Commit (or Abort) in log
    send Commit (or Abort)
                               write commit in log (might be abort)
                               commit
                               send finished
    ------------------------------------------
    resilient to crashes: on recovery, read the log and continue where left off
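
The message flow in the diagram can be sketched as an in-memory simulation. This is only an illustration under simplifying assumptions: Python lists stand in for the logs on stable storage, method calls stand in for messages, and crashes and timeouts are not modeled. The names Participant, can_commit, and two_phase_commit are hypothetical.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: local decision is a fixed flag
        self.log = []                 # stands in for the participant's stable log

    def prepare(self):
        # Phase 1: write the vote in the log, then send it.
        vote = "Ready" if self.can_commit else "Abort"
        self.log.append(vote)
        return vote

    def finish(self, decision):
        # Phase 2: write the coordinator's decision in the log, act on it,
        # then acknowledge.
        self.log.append(decision)
        return "Finished"


def two_phase_commit(coordinator_log, participants):
    # Phase 1: log Prepare, send Prepare, collect the votes.
    coordinator_log.append("Prepare")
    votes = [p.prepare() for p in participants]

    # Commit only if every participant promised to commit.
    decision = "Commit" if all(v == "Ready" for v in votes) else "Abort"

    # Phase 2: log the decision before sending it to anyone.
    coordinator_log.append(decision)
    for p in participants:
        p.finish(decision)
    return decision


clog = []
parts = [Participant("a"), Participant("b")]
result = two_phase_commit(clog, parts)
```

Logging the decision before sending it is what lets a recovering coordinator read its log and resend the same decision instead of guessing.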

==================
 
Election Algorithms
===================

In several of these algorithms we have assumed a coordinator.  But what happens
when that coordinator dies?  We need a mechanism to elect a new coordinator.

Assume: unique priority for each active process
Declare: the coordinator is *always* the process with the highest priority

 The Bully Algorithm
 ------------------
   If the coordinator times out, start an election to become the new leader
     send election msg to all procs w/ id > yours
     if no reply within T, you are the leader
       send all procs your id, to identify you as the new leader
     if you get a reply, wait T' for the new leader's announcement
       if none arrives, restart the election: the proc w/ id > yours failed

   when a process recovers, restart algorithm
     could bully out a lower numbered active coordinator
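
A rough simulation of the bully election, assuming process ids are integers and that a timeout is modeled simply as "the process is not in the alive set". The function name bully_election is hypothetical. In the real protocol every higher process that replies starts its own election; following just one of them is enough to reach the same result in this sketch.

```python
def bully_election(initiator, alive):
    """Return the id that ends up coordinator when `initiator` starts an election."""
    # Send an election msg to every process with a higher id; only live ones reply.
    responders = [p for p in alive if p > initiator]
    if not responders:
        # No reply within T: the initiator is the leader and announces its id.
        return initiator
    # A higher process replied, so the initiator backs off and that process
    # runs its own election in turn.
    return bully_election(min(responders), alive)


# Coordinator 9 has died; process 2 notices the timeout.
leader = bully_election(2, {1, 2, 3, 5})

# Process 7 recovers and restarts the algorithm, bullying out 5.
leader_after_recovery = bully_election(7, {1, 2, 3, 5, 7})
```
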


 The Ring Algorithm
 ------------------

  when a proc notices the coordinator is not functioning, it builds an
    election msg containing itself and sends it around the ring
    each proc adds self to list
  when gone full circle, change msg type to COORDINATOR
    and circulate again: new coord is the member of the list with highest id

  having two lists circulating is fine, it just wastes bandwidth
    both lists are the same



  unidirectional, logical ring.
  active list: list with all priorities of all active processes when
  the algorithm ends.


    1) P_i detects coordinator failure: create a new empty a-list, send
       "elect(i)", and add "i" to the list.
    2) if P_i receives "elect(j)"
       (a) if this is the first "elect" message, create new list with
           "i,j" and send "elect(i)", "elect(j)"
       (b) if i != j, add j to list, forward "elect(j)"
       (c) if i == j, a-list has all active nodes. the coordinator is
           the one with the highest priority number.
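
A small sketch of the ring election, following the simpler single-list description (one election msg that each proc appends itself to) rather than the per-process a-list variant in steps 1 and 2. It assumes the ring order is known to everyone, dead processes are skipped, and `starter` is alive. The name ring_election is hypothetical.

```python
def ring_election(ring, alive, starter):
    """Circulate one election msg around the ring.

    ring:    list of process ids in ring order
    alive:   set of ids that are currently up
    starter: live process that noticed the coordinator failed
    Returns (new_coordinator, active_list).
    """
    n = len(ring)
    active = [starter]                      # election msg starts with self
    pos = (ring.index(starter) + 1) % n
    while ring[pos] != starter:             # stop when gone full circle
        if ring[pos] in alive:              # dead procs are skipped over
            active.append(ring[pos])        # each live proc adds self to list
        pos = (pos + 1) % n
    # The COORDINATOR msg would now circulate with this result.
    return max(active), active


# Old coordinator 9 is down; process 1 starts the election.
coordinator, active_list = ring_election([3, 7, 1, 9, 4], {3, 7, 1, 4}, starter=1)
```
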


===============
Distributed deadlock detection
  something to think about
==============
reminder of course subjects:
         Device Management
         Process Management
         Scheduling
         Synchronization
         Memory Management
         File Systems
         Security
         Networks and Distributed Systems