ETOH USENIX Paper
Initial Outline 2000-05-27

Last Revised $Date: 2000-06-04 23:58:01-07 $

Vincent Cordrey & Jordan Schwartz

  1. Abstract
  2. Definition of Terms
  3. Background
    1. Backup Scheduling
      1. Part of Every Systems Manager's Responsibilities
    2. Previous Schemes
      1. Standard schemes
        1. All Full
        2. Weekly Fulls
          1. 011112 ????? [I don't want to include this one]
          2. 055555
          3. 0123456 or 0iiiiii
          4. 0335577
        3. Monthly Fulls
      2. One-Dimensional Towers of Hanoi
        1. Weekly Towers of Hanoi
          1. 0325476
        2. Monthly Towers of Hanoi
          1. 0325476132547613254761325476
      3. Baseline Schemes
        1. Adds (-1) to any other scheme
  4. Enhanced Pattern
    1. Not Earth Shattering; Simply an Enhancement of Existing Techniques
    2. We should consider moving the numbers up one level to permit baselining or longer cycles. In other words, use the numbers from the baseline schedule for the daily incrementals across the board instead of having different daily numbers in the standard ETOH and Baselined ETOH patterns.

    3. Basic Scheme is Monthly oriented
    4. Basic Monthly Pattern
    5. Baselined ETOH Schedules
    6. Striping for Load Balancing
    7. Dimensions Discussion & Table
      1. Each dimension adds at least two copies
      2. Two patterns, one across, one vertical
      3. Striping adds a third dimension
  5. Retention Policies
    1. Baseline
    2. Full Backup
    3. Weekly Deltas
    4. Daily Incrementals
    5. Typical Implementation
    6. Simplified Implementation
  6. Striping as Load Balancing
    1. Weekend Full offset schedules
      1. 25% reduction of Full Backup Load
  7. Resource Utilization
    1. Media Consumption
    2. Index Size
    3. Savings Calculations Formulas
  8. Impacts
    1. Backup Times (positive)
    2. Restore Times (neutral)
    3. Number of volumes required for restore
    4. Calculation of probability of failure
  9. Implementation
    1. When to Implement ETOH
      1. Conditions
        1. Increase of Backup Load
        2. Running Past End of Backup Window
        3. High Churn Rate on Filesystem
      2. Data Types (How the data changes)
        1. Read-Only Filesystems
        2. Replicated Filesystems
        3. Regular Files
        4. Database Files
          1. Restoring the database from two weeks ago is worthless. (Data expires quickly)
      3. Business Types (When the data changes)
        1. Non-Retail (9-5 Mon-Fri)
        2. Retail Stores (Open Weekend Days)
        3. Network Based Retailers (Active 24x7?)
    2. Algorithms to Calculate Cycle
      1. Pseudocode
      2. Full Perl code for computing the level, with weekly striping offset, at http://etoh.wopr.net
    3. Planning the Schedules
      1. Estimate Total Load
      2. Determine Cycle Type
      3. Create Backup Groups
      4. Evaluate Need for Striping
      5. Scheduling the Cycles
        1. Ordinary Start of Month Fulls
        2. Striping Offsets
          1. Weekend Full offset schedules
          2. Monthly Delta (Baselined Strategies)
        3. Schedule Start Point Strategy
          1. Start the First Day of Cycle Unit
          2. Start First Weekend Day of Cycle
    4. Platform Specific
      1. Full Perl Executable (on etoh.wopr.net)
      2. Scripting Backup Schedule Table Creation
      3. Front-end Scripts
        1. Hostdump.sh
        2. Amanda
        3. Legato
  10. Case Studies
    1. RAND
      1. Weekly Cycle
      2. Every 4th Full Dump Kept with 1 Year Retention
      3. "Numbering Schemes"
        1. Fourth full dumps were called "Sets"
        2. Saved one backup, Set 0, per year forever
      4. Sets stored off site
      5. No cloning
    2. Hughes Space & Communications
      1. Genesis of ETOH
      2. Striping Applied
      3. Started with hand coded Legato schedule tables
      4. Cloning
      5. Year retention of monthly fulls and weekly deltas
      6. Quarterly retention of daily incrementals
  11. Summary
    1. Benefits
      1. Cost Benefits
      2. Robustness
        1. At least two copies of every file
        2. Reduced reliance on incrementals from several media back to the full. (This property is inherited from the standard Towers of Hanoi scheme)
          1. Reduced numbers of media for restore
          2. Lower likelihood of failure
      3. Reduced Backup Time
      4. Preserved Restore Time
      5. Reduced Index Size
      6. Reduced Media Utilization
    2. Impacts
      1. Risks
      2. Complexity
  12. Future Enhancements
    1. Call For More Levels
      1. Limited levels cause a one-month sweet spot
      2. Baseline as -1
      3. Levels 10, 11, 12, 13, 14, ....
      4. Better Scheduling if more levels were available
        1. Longer Cycles
          1. Two month cycle
          2. Quarterly cycle
        2. Better control over "two copies" rule of thumb
        3. Enhancement to a "three copies" rule of thumb
      5. More granularity for retention classes
      6. Reduced Index size with long retentions
        1. Using a backup system as an archive
    2. Variable Length Cycle Implementation
    3. Something Jordan Said
    4. XML definitions [DTD] for Backup Illustrations
  13. Bibliography
    1. Curtis Preston Alligator Book
    2. Nemeth Systems Administration
    3. Man pages on ufsdump
    4. Amanda Reference Manual
    5. Legato Admin Guide
    6. NetBackup Admin Guide
    7. Towers of Hanoi at Toronto.edu
    8. How to see in More Dimensions, Ragnar-Olaf Buchweitz
    9. XML: A Primer, Second Edition, Simon St.Laurent
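
Working note: the weekly level patterns listed under Previous Schemes (055555, 0123456, 0335577, 0325476) can be compared by the number of volumes needed for a restore. The sketch below is not the Perl code referenced in the outline; it is a minimal Python illustration that assumes conventional dump-level semantics (a level-N dump captures everything changed since the most recent dump of a lower level). The names restore_chain and parse_pattern are hypothetical helpers introduced only for this illustration.

```python
def parse_pattern(pattern):
    """Turn a string such as '0325476' into a list of integer dump levels."""
    return [int(ch) for ch in pattern]


def restore_chain(levels, day):
    """Return the indices of the dumps needed to restore to the state at `day`.

    Assumes standard dump-level semantics: walking backwards from `day`,
    each dump with a strictly lower level than the last one selected is
    required, until a level 0 (full) dump is reached.
    """
    chain = [day]
    current = levels[day]
    for i in range(day - 1, -1, -1):
        if levels[i] < current:
            chain.append(i)
            current = levels[i]
            if current == 0:
                break
    chain.reverse()  # oldest dump first, the order in which they are restored
    return chain


if __name__ == "__main__":
    # Report the worst-case number of volumes per pattern from the outline.
    for pattern in ["055555", "0123456", "0335577", "0325476"]:
        levels = parse_pattern(pattern)
        worst = max(len(restore_chain(levels, d)) for d in range(len(levels)))
        print(pattern, "worst-case dumps for a restore:", worst)
```

The same helper could be applied to the monthly Towers of Hanoi string, or to an ETOH schedule once its levels are written out, to feed the "Number of volumes required for restore" discussion under Impacts.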