- Provisioning with Terraform or Ansible works well
- JVM8 !!!
- Clocks must be synced via NTP
- Disable `zone_reclaim_mode` on NUMA systems
- Tune / increase open file handles
- TCP Buffer Settings
- disable swap
- Prefer many smaller nodes over fewer, bigger nodes
- C5 instances are a good fit
- 8 cores, 8 GB RAM, guaranteed 10 Gbit network
- Disk -> LAAAAAARGE :)
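
The OS-level items in the list above (`zone_reclaim_mode`, swap, open file handles) can be checked with a small script. This is a minimal sketch, not an official tool: the function names are made up for illustration, and the `MIN_OPEN_FILES` threshold is an assumption, since the notes only say "increase".

```python
from typing import Dict, List

# Assumed minimum for the open-file limit; the notes only say "increase",
# so treat this number as a placeholder for your own environment.
MIN_OPEN_FILES = 100_000


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse `sysctl -a`-style lines ("key = value") into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def active_swap_devices(proc_swaps: str) -> List[str]:
    """Return the devices listed in /proc/swaps content (header line skipped)."""
    lines = [line for line in proc_swaps.splitlines() if line.strip()]
    return [line.split()[0] for line in lines[1:]]


def check_node(sysctl_text: str, proc_swaps: str, open_files_limit: int) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    settings = parse_sysctl(sysctl_text)
    # A missing key is treated as fine (e.g. the setting is not reported).
    if settings.get("vm.zone_reclaim_mode", "0") != "0":
        problems.append("vm.zone_reclaim_mode should be 0")
    swaps = active_swap_devices(proc_swaps)
    if swaps:
        problems.append("swap is enabled on: " + ", ".join(swaps))
    if open_files_limit < MIN_OPEN_FILES:
        problems.append(
            f"open file limit {open_files_limit} is below {MIN_OPEN_FILES}"
        )
    return problems
```

On a node, the inputs would typically come from `sysctl -a`, the content of `/proc/swaps`, and `ulimit -n` for the Cassandra user.
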
- `nodetool flush`: flushes memtables to SSTables
- `nodetool drain`: graceful shutdown
  - flushes all memtables to SSTables and rejects connections afterwards
- `nodetool cleanup`: after compaction
  - scans all the data
  - `-j 0` -> speeds it up (uses all available compaction threads)
- `nodetool repair`: better use Cassandra Reaper
- `nodetool netstats`: better use this to watch repairs; besides that there is no way to monitor them
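
When these maintenance commands are driven from automation, it helps to build the argument lists in one place. The helper below is a hypothetical sketch (the function names are not part of any Cassandra tooling); it only assembles `argv` lists and assumes `nodetool` is on the `PATH` and that `-h` selects the target host.

```python
from typing import List, Optional


def nodetool(command: str, *args: str, host: Optional[str] = None) -> List[str]:
    """Build an argv list for a nodetool call, optionally against a remote host."""
    argv = ["nodetool"]
    if host is not None:
        argv += ["-h", host]
    argv.append(command)
    argv.extend(args)
    return argv


def cleanup_fast(keyspace: str, host: Optional[str] = None) -> List[str]:
    """`nodetool cleanup -j 0 <keyspace>`: use all available compaction threads."""
    return nodetool("cleanup", "-j", "0", keyspace, host=host)


def graceful_stop(host: Optional[str] = None) -> List[str]:
    """`nodetool drain`: flush memtables and stop accepting connections."""
    return nodetool("drain", host=host)
```

The returned lists can be passed to `subprocess.run(argv, check=True)`; nothing is executed by the helper itself.
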
- `sstableutil <KEYSPACE> <TABLE>`
- `sstabledump <SSTABLE_FILE>`
- `sstableverify <SSTABLE_FILE>`
- `sstablescrub <KEYSPACE> <TABLE>`
- `sstablerepairedset --really-set <TABLES>`
- `sstableexpiredblockers`
TIP
- use separate EBS volumes for OS, logs, and data store
- Cassandra version must match
- cluster must be consistent
- use the same seed nodes
- add nodes to all AZs at the same time
- can be automated
- Recover the AWS way: reuse the EBS volume (backup) + ENI (same IP as before)
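
The preconditions for adding nodes (matching version, same seed nodes, new nodes in all AZs at once) lend themselves to an automated pre-flight check. The sketch below is an illustration only: `NodeConfig` and `join_problems` are hypothetical names, the facts would have to come from your own inventory, and cluster consistency (repairs) is not covered.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical per-node facts, e.g. collected from an Ansible inventory."""
    name: str
    cassandra_version: str
    seeds: FrozenSet[str]
    availability_zone: str


def join_problems(existing: List[NodeConfig], new_nodes: List[NodeConfig]) -> List[str]:
    """Return reasons why `new_nodes` should not join yet (empty list = OK)."""
    problems = []
    versions = {n.cassandra_version for n in existing}
    seed_sets = {n.seeds for n in existing}
    for node in new_nodes:
        if node.cassandra_version not in versions:
            problems.append(
                f"{node.name}: version {node.cassandra_version} does not match the cluster"
            )
        if node.seeds not in seed_sets:
            problems.append(f"{node.name}: seed list differs from the cluster")
    # Every AZ that already hosts nodes should get a new node in the same round.
    existing_azs = {n.availability_zone for n in existing}
    new_azs = {n.availability_zone for n in new_nodes}
    for az in sorted(existing_azs - new_azs):
        problems.append(f"no new node planned for AZ {az}")
    return problems
```
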
- `nodetool drain`
- `nodetool decommission` (on the node itself)
- `nodetool removenode <HOSTID>` (on another node)
- if all else fails: `nodetool assassinate <IP>` (aggressive, may need a repair afterwards)
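
Which removal command applies depends on whether the node is still reachable. The sketch below encodes one reading of the list above: `decommission` for a live node, `removenode` for a dead one, and `assassinate` only as a fallback. The `Step` type and `removal_plan` function are hypothetical, and nothing is executed.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    run_on: str        # where the command has to be executed
    argv: List[str]    # the nodetool command line


def removal_plan(node_is_up: bool, host_id: str, ip: str) -> List[Step]:
    """Return the ordered commands to remove a node from the ring."""
    if node_is_up:
        # A live node streams its data away itself.
        return [Step("the node itself", ["nodetool", "decommission"])]
    return [
        Step("another node", ["nodetool", "removenode", host_id]),
        # Last resort only; plan a repair afterwards.
        Step("another node, only if removenode fails", ["nodetool", "assassinate", ip]),
    ]
```
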
- full backup (snapshots)
- `nodetool snapshot`: hardlinks in the FS
- on a per-node level -> so do it on every node !!! (e.g. with `pssh`)
- incremental backup
- SSTables + commitlogs since full-backup (snapshot)
- normally: user defined activity :(
- Tools
- Medusa
- Tablesnap
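
Because snapshots are per node, the full backup has to be triggered on every node with the same tag. A minimal sketch of building that `pssh` call follows; the function names, the hosts file, and the tag format are assumptions for illustration. On some distributions the binary is called `parallel-ssh`, hence the parameter.

```python
import shlex
from datetime import datetime
from typing import List


def snapshot_tag(now: datetime, prefix: str = "full") -> str:
    """Build a sortable snapshot tag such as full-20240102-030405."""
    return f"{prefix}-{now:%Y%m%d-%H%M%S}"


def pssh_snapshot_argv(hosts_file: str, tag: str, pssh_binary: str = "pssh") -> List[str]:
    """Build the pssh call that runs `nodetool snapshot -t <tag>` on all hosts."""
    remote_command = shlex.join(["nodetool", "snapshot", "-t", tag])
    # -h: file with one host per line, -i: print each host's output inline
    return [pssh_binary, "-h", hosts_file, "-i", remote_command]
```

Using one tag for the whole run makes it easier to find and later clear the matching snapshot directories on each node.
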
- ensure schema is in place
- `truncate` to ensure problematic changes are removed
- `nodetool refresh`
- changes since backup? -> `sstableloader` (inserts SSTables into the node)
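
The restore steps above can be written down as an ordered command list per table. This is a sketch under assumptions: `restore_plan` is a hypothetical helper, the schema is expected to exist already, and copying the snapshot files into the table's data directory is a manual step that is not modelled.

```python
from typing import List, Sequence


def restore_plan(
    keyspace: str,
    table: str,
    contact_host: str,
    newer_sstable_dirs: Sequence[str] = (),
) -> List[List[str]]:
    """Return the commands for restoring one table, in execution order."""
    plan = [
        # Remove problematic data first.
        ["cqlsh", contact_host, "-e", f"TRUNCATE {keyspace}.{table};"],
        # Manual step in between (not modelled): copy the snapshot SSTables
        # into the table's data directory, then pick them up:
        ["nodetool", "refresh", keyspace, table],
    ]
    # SSTables written after the snapshot are streamed in with sstableloader.
    for directory in newer_sstable_dirs:
        plan.append(["sstableloader", "-d", contact_host, directory])
    return plan
```
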
- Back up EBS disks -> Amazon Data Lifecycle Manager to automate backups
- `nodetool flush` before backup
- S3 bucket lifecycle policy
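
For the S3 bucket lifecycle policy, a rule that expires old backups might look like the sketch below. The prefix, rule ID, and retention period are placeholders, not values from these notes; the dictionary follows the structure S3 expects for a lifecycle configuration.

```python
import json
from typing import Any, Dict


def backup_lifecycle(prefix: str, expire_after_days: int, rule_id: str) -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    if expire_after_days <= 0:
        raise ValueError("expire_after_days must be positive")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


def to_json(config: Dict[str, Any]) -> str:
    """Serialize the configuration, e.g. to save it as lifecycle.json."""
    return json.dumps(config, indent=2, sort_keys=True)
```

The result can be applied with boto3's `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)` or saved as JSON and passed to `aws s3api put-bucket-lifecycle-configuration`.
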
- TEST before upgrade -> give releases some time before upgrade
- config may change between versions -> update the Ansible setup