Skip to content

Latest commit

 

History

History
94 lines (76 loc) · 2.31 KB

File metadata and controls

94 lines (76 loc) · 2.31 KB

Cassandra Operations

Installation

  • w/ tf OR ansible -> good
  • JVM8 !!!
  • Clocks must be synced via NTP
  • Disable zone_reclaim_mode on NUMA Systems
  • tune / increase open file handlers
  • TCP Buffer Settings
  • disable swap

AWS ressources

  • better many smaller nodes, than less bigger nodes
    • C5 is good
    • 8 cores 8GB Ram, guaranteed 10GB network
    • Disk -> LAAAAAARGE :)

Nodetool Operations

  • nodetool flush
    • flushes from memtables to SSTables
  • nodetool drain
    • gracefull shutdown
    • flushes all memtables to SSTables & rejecting connections afterwards
  • nodetool cleanup
    • after compaction
    • scans all the data
    • -j 0 -> speed up
  • nodetool repair
    • better use cassandra reaper

Monitoring Repair

  • nodetool netstats
  • beside that no way to monitor repairs

SSTables Utilities

  • sstableutil <KEYSAPCE> <TABLE>
  • sstabledump <SSTABLE_FILE>
  • sstableverify <SSTABLE_FILE
  • sstablescrup <KEYSPACE> <TABLE>
  • sstablerepairdset -really-set <TABLES>
  • sstableexpiredblockers

Common Tasks

TIP

  • use seperate EBS volumes for OS, Logs, Datastore

add a node

  • cassandra version must match
  • cluster must consistent
  • use the same seed nodes
  • add nodes to all AZ at a time
  • can be automated

Failed node

  • Recover the AWS way: reuse the EBS (Backup) + ENI (same IP like before)

Replace running node

  • nodetool drain

Remove a node

  • nodetool decommission (on the node itself)
  • nodetool removenode <HOSTID> (on another node)
  • if all else fails: nodetool assissinate <IP> (aggressive, needs maybe a repair)

Backup

  • full backup (snapshots)
    • nodetool snapshot
    • hardlinks in FS
    • on a per node level -> so do it on every node !!! (e.g. pssh)
  • incremental backup
    • SSTables + commitlogs since full-backup (snapshot)
  • normally: user defined activity :(
  • Tools
    • Medusa
    • Tablesnap

Restoring

  • ensure schema is in place
  • truncate to ensure problematic changes are removed
  • nodetool refresh
  • changes since backup? -> sstableloader (inserts SSTable into the node)

Backup w/ AWS

  • Backup EBS disks -> Amazon Data Lifecycle Manager to automate backups
  • nodetool flush before backup
  • S3 Bucket lifecycle policy

Upgrade

  • TEST before upgrade -> give releases some time before upgrade
  • maybe config changes -> change ansible