Assuming you already have a single-node Cassandra instance running with data in it, and you now plan to add one or more nodes to form a cluster, here is what needs to be done:
Instance 1: 192.168.13.156 - Has data in its data directory and is already running Cassandra
Shut down Cassandra -> kill <pid>
Make the following changes to cassandra.yaml, then start Cassandra:
cluster_name: 'Test Cluster'
seeds: "192.168.13.156"
listen_address: 192.168.13.156
rpc_address: 0.0.0.0
rpc_port: 9160
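Uncomment the following line (remove the leading #) and replace the placeholder 1.2.3.4 with an address clients can use to reach this node. This setting is required when rpc_address is 0.0.0.0: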
# broadcast_rpc_address: 1.2.3.4
Start Cassandra
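The edits above can also be scripted. The following is only a minimal sketch: set_yaml_key is a hypothetical helper (not something that ships with Cassandra), and it assumes each key appears on exactly one line of cassandra.yaml, either active or commented out with a leading #, and that the value contains no | or & characters.

```shell
#!/bin/sh
# set_yaml_key FILE KEY VALUE   (hypothetical helper, not part of Cassandra)
# Rewrites the single "KEY: ..." line in FILE. Leading indentation and a
# leading "- " (as on the seeds line) are kept; a leading "# " is removed,
# so a commented-out key becomes active. A backup is saved as FILE.bak.
set_yaml_key() {
    file=$1
    key=$2
    value=$3
    sed -i.bak -E "s|^([[:space:]]*(- )?)(# ?)?${key}:.*|\\1${key}: ${value}|" "$file"
}
```

For Instance 1 this could be called as, for example, set_yaml_key cassandra.yaml listen_address 192.168.13.156 or set_yaml_key cassandra.yaml seeds '"192.168.13.156"'. Check the result by hand before starting Cassandra.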
Instance 2: 192.168.104.29 - Fresh instance
Shut down Cassandra if it is running -> kill <pid>
Delete the data dir -> rm -rf data
The data will be synced from the seed on startup
Make the following changes to cassandra.yaml, then start Cassandra:
cluster_name: 'Test Cluster'
seeds: "192.168.13.156"
listen_address: 192.168.104.29
rpc_address: 0.0.0.0
rpc_port: 9160
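Uncomment the following line (remove the leading #) and replace the placeholder 1.2.3.4 with an address clients can use to reach this node. This setting is required when rpc_address is 0.0.0.0: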
# broadcast_rpc_address: 1.2.3.4
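Run nodetool status on either node to verify the cluster. Both nodes should be listed with status UN (Up/Normal) once the new node has finished joining.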
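If you want to script that check, here is a rough sketch. count_up_normal is a hypothetical helper name; it only assumes that each node line in the nodetool status output starts with the two-letter status/state code (UN for Up/Normal, UJ for Up/Joining, and so on).

```shell
#!/bin/sh
# count_up_normal   (hypothetical helper)
# Reads `nodetool status` output on stdin and prints the number of
# node lines whose status/state code is UN (Up / Normal).
count_up_normal() {
    grep -c '^UN[[:space:]]'
}

# Usage on a running node:
#   nodetool status | count_up_normal
```

With the two instances above, the count should reach 2 once Instance 2 has finished joining.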
Points to remember:
- Deleting the data dir on the new node ensures that fresh data is synced from the seed without interference from any local data on that node.
- Don't add the nodes' IPs as seeds of each other (node 1 as a seed for node 2 and vice versa). It might result in loss of data. Make sure the node that has all the data acts as the seed for the others, at least until the cluster is balanced.
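The seed rule can be checked before starting a new node. This is a sketch only: seeds_of and only_seed_is are hypothetical helpers, and they assume the seeds setting sits on a single line of the form seeds: "ip1,ip2" (optionally indented and prefixed with "- ").

```shell
#!/bin/sh
# seeds_of FILE   (hypothetical helper)
# Prints the addresses from the "seeds:" line of FILE, separated by spaces.
seeds_of() {
    sed -n -E 's/^[[:space:]]*(- )?seeds:[[:space:]]*"?([^"]*)"?.*/\2/p' "$1" | tr ',' ' '
}

# only_seed_is FILE ADDRESS   (hypothetical helper)
# Succeeds only if FILE lists at least one seed and every seed equals ADDRESS.
only_seed_is() {
    seeds=$(seeds_of "$1")
    [ -n "$seeds" ] || return 1
    for s in $seeds; do
        [ "$s" = "$2" ] || return 1
    done
    return 0
}
```

On Instance 2, for example, only_seed_is cassandra.yaml 192.168.13.156 should succeed; if it fails, the seeds line lists something other than the node that holds the data.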