Daemon & Client Overview
Architecture
Client <--> Daemon --> Control Plane
The Gremlin daemon, gremlind, is a binary installed on the operating system or available inside the Gremlin container. It heartbeats with the Gremlin Control Plane to let Gremlin know that the host is active and able to receive attack orders. It only communicates outbound with the Gremlin Control Plane. All traffic is encrypted.
The Gremlin client, gremlin, is the Gremlin command line interface, which is responsible for creating the local impact within the host.
Together, the daemon and the command line interface form a single unit that the platform treats as a targetable Client.
Client Lifecycle
Gremlin clients (infrastructure and application) that have been authenticated to the Gremlin Control Plane appear in the infrastructure clients and application clients lists. You can only run attacks on "active" clients. A client enters an "idle" state after 5 minutes with no activity, and you cannot run or schedule attacks on idle clients. If Gremlin does not hear from an idle client for 24 hours, the client is removed from the list. However, if the client starts communicating with Gremlin again within that 24-hour idle window, it is reactivated and returned to the "active" state.
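The lifecycle above can be read as a small state classification based on the time since a client last communicated. The following is a minimal sketch, not Gremlin's implementation: the function names are hypothetical, and it assumes the 24-hour removal window is measured from the client's last contact, which the text above does not state precisely.

```python
from datetime import datetime, timedelta

# Thresholds taken from the lifecycle description above.
IDLE_AFTER = timedelta(minutes=5)
# Assumption: the 24-hour window is counted from the last contact.
REMOVE_AFTER = timedelta(hours=24)


def client_state(last_activity: datetime, now: datetime) -> str:
    """Classify a client as 'active', 'idle', or 'removed' (hypothetical helper)."""
    elapsed = now - last_activity
    if elapsed < IDLE_AFTER:
        return "active"
    if elapsed < REMOVE_AFTER:
        return "idle"
    return "removed"


def can_attack(last_activity: datetime, now: datetime) -> bool:
    """Attacks can only be run or scheduled on active clients."""
    return client_state(last_activity, now) == "active"
```

A client that communicates again while idle simply gets a fresh last-activity timestamp, so the same function returns "active" for it on the next check.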
Logs
Logs can be found under the /var/log/gremlin directory.
Daemon log entries can be found in the daemon.log file. Entries in this file may show when the daemon was unable to communicate with the Control Plane, a condition that eventually triggers the Dead Man Switch.
Each attack on the host is logged under /var/log/gremlin/executions using its unique attack execution ID.
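To see which attack executions have left logs on a host, it may be enough to list the entries in the executions directory. The sketch below assumes only the directory path given above; the function name is hypothetical, and the naming of individual entries (beyond being tied to the execution ID) is not specified here, so the code returns entry names unchanged.

```python
from pathlib import Path
from typing import List

# Path taken from the text above.
EXECUTIONS_DIR = "/var/log/gremlin/executions"


def execution_log_entries(base: str = EXECUTIONS_DIR) -> List[str]:
    """Return the sorted names of entries in the executions log directory.

    Returns an empty list if the directory does not exist, for example on
    a host where no attack has been run yet.
    """
    root = Path(base)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    for name in execution_log_entries():
        print(name)
```

Reading this directory may require elevated privileges, depending on how the host is configured.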
Bandwidth Usage
Idle State
The daemon uses very little bandwidth in its idle state. In testing over a 15-minute period, the daemon averaged only 1.15 KB/sec over any given 10-second interval.
Inbound bandwidth average, zeros dropped: 1.98918829974 KB/sec
Outbound bandwidth average, zeros dropped: 0.55621495 KB/sec
Aggregate bandwidth average, zeros dropped: 2.51990083564 KB/sec
Aggregate bandwidth average over testing period: 1.15347573462 KB/sec
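The figures above report two kinds of average: one with zeros dropped and one over the whole testing period. The following sketch shows how the two can differ, assuming "zeros dropped" means that sampling intervals with no traffic are excluded before averaging. The function name and the sample values are made up for illustration and are not taken from the test data.

```python
from typing import Sequence, Tuple


def bandwidth_averages(samples_kb_per_sec: Sequence[float]) -> Tuple[float, float]:
    """Return (average with zeros dropped, average over all samples)."""
    # Keep only intervals in which some traffic was observed.
    nonzero = [s for s in samples_kb_per_sec if s > 0]
    avg_nonzero = sum(nonzero) / len(nonzero) if nonzero else 0.0
    avg_all = (
        sum(samples_kb_per_sec) / len(samples_kb_per_sec)
        if samples_kb_per_sec
        else 0.0
    )
    return avg_nonzero, avg_all


# Hypothetical samples: two quiet intervals and two busy ones.
print(bandwidth_averages([0.0, 2.0, 0.0, 4.0]))  # (3.0, 1.5)
```

Because an idle daemon is silent in many intervals, the zeros-dropped average is higher than the average over the full period, which is consistent with the idle-state figures above.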
Attack State
In testing, overall bandwidth consumption increases during attacks but remains low. While an attack is executing, the daemon stays in constant communication with the Control Plane so that it can check whether the attack should be aborted. The attack-state behavior looks the same regardless of which attack is run. Two data sets are provided: one from a CPU Resource attack and one from a Latency Network attack.
First data set:
Inbound bandwidth average, zeros dropped: 2.71233697984 KB/sec
Outbound bandwidth average, zeros dropped: 0.87034858716 KB/sec
Aggregate bandwidth average, zeros dropped: 3.56685480642 KB/sec
Aggregate bandwidth average over testing period: 3.0556056175 KB/sec

Second data set:
Inbound bandwidth average, zeros dropped: 2.64944416154 KB/sec
Outbound bandwidth average, zeros dropped: 0.917963469407 KB/sec
Aggregate bandwidth average, zeros dropped: 3.55221005449 KB/sec
Aggregate bandwidth average over testing period: 3.11712392366 KB/sec
There is no statistically significant difference between the two attacks. Although aggregate average traffic rises to roughly 2.7 times the idle level, the resulting bandwidth utilization of about 3 KB/sec is still very low on a per-client basis.
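The size of the increase can be checked directly from the aggregate averages over the testing period reported above. This short sketch uses only those published figures; the variable and function names are hypothetical, and the two attack data sets are labeled by the order in which they appear.

```python
# Aggregate bandwidth averages over the testing period, in KB/sec,
# copied from the figures in this section.
IDLE_AVG = 1.15347573462
ATTACK_AVGS = {
    "first data set": 3.0556056175,
    "second data set": 3.11712392366,
}


def increase_ratio(attack_avg: float, idle_avg: float) -> float:
    """Return how many times larger the attack-state average is than idle."""
    if idle_avg <= 0:
        raise ValueError("idle average must be positive")
    return attack_avg / idle_avg


for label, avg in ATTACK_AVGS.items():
    print(f"{label}: {increase_ratio(avg, IDLE_AVG):.2f}x idle")
```

Both ratios come out at about 2.65 and 2.70, which is the basis for describing the rise as a little under 3X.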