Storm topology LCM testing
Nikita Marchenko edited this page Jan 23, 2018 · 11 revisions
Main goal: Stay operational between reboot events.
The only way to update a Storm topology is to kill it and submit a new one. From the official documentation: "Storm won't kill the topology immediately. Instead, it deactivates all the spouts so that they don't emit any more tuples, and then Storm waits Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS (topology.message.timeout.secs) seconds before destroying all the workers. This gives the topology enough time to complete any tuples it was processing when it got killed."
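The wait described above can also be set explicitly per kill: the `storm kill` command accepts `-w <seconds>` to override the topology's message timeout. A minimal sketch, assuming a hypothetical helper name `kill_and_wait` and Storm's default message timeout of 30 seconds:

```shell
# kill_and_wait <topology> [wait_secs]
# Hypothetical helper: asks Storm to kill the topology with an explicit
# wait time, then sleeps for the same period, which approximates the time
# Storm takes before it destroys the workers.
kill_and_wait() {
    topology="$1"
    wait_secs="${2:-30}"   # 30 is Storm's default topology.message.timeout.secs
    storm kill "$topology" -w "$wait_secs" && sleep "$wait_secs"
}
```

For example, `kill_and_wait cache 30` covers the kill and the wait in one call.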
List of topologies: cache, flow, islstats, opentsdb, stats, wfm
- State data is lost in bolts (subclasses of BaseStatefulBolt) when a topology is restarted.
- Offsets are lost for topology spouts, which causes already processed tuples to be executed again.
From: org.openkilda.wfm.topology.AbstractTopology
protected org.apache.storm.kafka.KafkaSpout createKafkaSpout(String topic, String spoutId) {
    // ZooKeeper root where the spout stores its consumer offsets.
    String zkRoot = String.format("/%s/%s", getTopologyName(), topic);
    ZkHosts hosts = new ZkHosts(config.getZookeeperHosts());
    SpoutConfig cfg = new SpoutConfig(hosts, topic, zkRoot, spoutId);
    // EarliestTime() requests the oldest offset available in the topic.
    cfg.startOffsetTime = OffsetRequest.EarliestTime();
    cfg.scheme = new SchemeAsMultiScheme(new StringScheme());
    // 4 MiB buffer and fetch sizes.
    cfg.bufferSizeBytes = 1024 * 1024 * 4;
    cfg.fetchSizeBytes = 1024 * 1024 * 4;
    return new org.apache.storm.kafka.KafkaSpout(cfg);
}
- Issue: Bolts execute tuples before their internal state has been restored.
- State restoration is implemented for the Cache bolt only; other bolts have no such functionality.
- The size of the dump from TE can potentially exceed what Kafka can handle [needs checking]
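The `zkRoot` built in `createKafkaSpout` above is the ZooKeeper root under which the spout keeps its consumer offsets, so it determines whether a resubmitted topology looks in the same place as its predecessor. A small sketch of the same path construction, assuming `getTopologyName()` returns the name passed with `--name`; the function name and the topic in the usage line are hypothetical:

```shell
# zk_root <topology_name> <topic>
# Mirrors String.format("/%s/%s", getTopologyName(), topic).
zk_root() {
    printf '/%s/%s\n' "$1" "$2"
}
```

`zk_root cache kilda.example` prints `/cache/kilda.example`. Resubmitting under the same name and topic yields the same root, while a different name yields a different one.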
Pre-conditions for all topologies:
- Healthy controller
- Add a topology and check it
$ cat simple-topology.json
{
    "controllers": [
        {
            "host": "kilda",
            "name": "floodlight",
            "port": 6653
        }
    ],
    "links": [
        {
            "node1": "00000001",
            "node2": "00000002"
        }
    ],
    "switches": [
        {
            "dpid": "deadbeef00000001",
            "name": "00000001"
        },
        {
            "dpid": "deadbeef00000002",
            "name": "00000002"
        }
    ]
}
$ http POST :38080/topology < simple-topology.json
- Add a flow and check it
$ NB_IP=$(docker inspect --format '{{ .NetworkSettings.Networks.openkilda_default.IPAddress }}' openkilda_northbound_1)
$ cat simple-flow.json
{
    "flowid": "flow-01",
    "source": {
        "switch-id": "de:ad:be:ef:00:00:00:01",
        "port-id": 1,
        "vlan-id": 100
    },
    "destination": {
        "switch-id": "de:ad:be:ef:00:00:00:02",
        "port-id": 1,
        "vlan-id": 100
    },
    "maximum-bandwidth": 10000,
    "description": "flow-01"
}
$ http --auth kilda:kilda PUT $NB_IP:8080/api/v1/flows < simple-flow.json
$ http --auth kilda:kilda $NB_IP:8080/api/v1/flows/status/flow-01
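The topology file identifies switches by a bare 16-digit `dpid`, while the flow request uses a colon-separated `switch-id`. Assuming both notations name the same switch, a hypothetical helper `dpid_to_switch_id` converts the first form into the second, which helps when writing a flow file for switches taken from the topology file:

```shell
# dpid_to_switch_id <dpid>
# Inserts a colon after every two hex digits and drops the trailing one.
dpid_to_switch_id() {
    printf '%s\n' "$1" | sed 's/../&:/g; s/:$//'
}
```

`dpid_to_switch_id deadbeef00000001` prints `de:ad:be:ef:00:00:00:01`.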
- Dump cache state
$ ./probe dump-state "cache/cache"
+----------+-----------+---------+
| Topology | Component | Task ID |
+----------+-----------+---------+
| cache | cache | 3 |
+----------+-----------+---------+
+----------
| Flows
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| Property | Forward | Reverse |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| flowid | flow-01 | flow-01 |
| bandwidth | 10000 | 10000 |
| cookie | 4611686018427387905 | 2305843009213693953 |
| description | flow-01 | flow-01 |
| last_updated | 2018-01-11T07:05:28.276Z | 2018-01-11T07:05:28.276Z |
| src_switch | de:ad:be:ef:00:00:00:01 | de:ad:be:ef:00:00:00:02 |
| dst_switch | de:ad:be:ef:00:00:00:02 | de:ad:be:ef:00:00:00:01 |
| src_port | 1 | 1 |
| dst_port | 1 | 1 |
| src_vlan | 100 | 100 |
| dst_vlan | 100 | 100 |
| meter_id | 1 | 1 |
| transit_vlan | 2 | 3 |
| flowpath:latency_ns | 1 | 1 |
| state | UP | UP |
| | switch_id port_no segment_latency seq_id | switch_id port_no segment_latency seq_id |
| path | ---------------------------------------------------------------- | ---------------------------------------------------------------- |
| | de:ad:be:ef:00:00:00:01 1 1 0 | de:ad:be:ef:00:00:00:02 1 1 0 |
| | de:ad:be:ef:00:00:00:02 1 None 1 | de:ad:be:ef:00:00:00:01 1 None 1 |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
- Kill cache topology
storm kill cache
- Wait for the 30-second timeout
sleep 30
- Health check - FAILED
$ http --auth kilda:kilda $NB_IP:8080/api/v1/health-check
HTTP/1.1 504
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Thu, 11 Jan 2018 16:36:40 GMT
Expires: 0
Pragma: no-cache
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
{
    "components": {
        "cache-storm-topology": "non-operational",
        "event-wfm-storm-topology": "operational",
        "flow-storm-topology": "operational",
        "statistics-storm-topology": "operational"
    },
    "description": "Northbound API service provides API for Kilda controller",
    "name": "Northbound",
    "version": "1.0-SNAPSHOT"
}
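The failing component can be pulled out of the health-check body without a JSON parser. A minimal sketch with a hypothetical function name; it splits the body on commas and braces first, so it handles the pretty-printed form shown above as well as a compact body, which is what may arrive when HTTPie's output is piped:

```shell
# failed_components
# Reads a health-check JSON body on stdin and prints the name of every
# component reported as "non-operational", one per line.
failed_components() {
    tr ',{}' '\n\n\n' \
        | grep '"non-operational"' \
        | sed 's/^[[:space:]]*"\([^"]*\)".*/\1/'
}
```

Usage, with `NB_IP` set as in the pre-conditions: `http --auth kilda:kilda $NB_IP:8080/api/v1/health-check | failed_components`. For the response above this would print `cache-storm-topology`.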
- Create new flow from NB - FAILED
$ http --auth kilda:kilda PUT $NB_IP:8080/api/v1/flows < simple-flow-2.json
HTTP/1.1 200
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Thu, 11 Jan 2018 16:38:43 GMT
Expires: 0
Pragma: no-cache
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
{
    "description": "flow-02",
    "destination": {
        "port-id": 1,
        "switch-id": "de:ad:be:ef:00:00:00:02",
        "vlan-id": 101
    },
    "flowid": "flow-02",
    "last-updated": "2018-01-11T16:38:43.783Z",
    "maximum-bandwidth": 10000,
    "source": {
        "port-id": 1,
        "switch-id": "de:ad:be:ef:00:00:00:01",
        "vlan-id": 101
    }
}
$ http --auth kilda:kilda $NB_IP:8080/api/v1/flows/status/flow-02
HTTP/1.1 200
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Thu, 11 Jan 2018 16:39:59 GMT
Expires: 0
Pragma: no-cache
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
{
    "flowid": "flow-02",
    "status": "ALLOCATED"
}
- Get all flows - OK
- Load topology
$ cd services/wfm
$ storm jar target/WorkflowManager-1.0-SNAPSHOT-jar-with-dependencies.jar \
org.openkilda.wfm.topology.cache.CacheTopology \
--name=cache src/main/resources/topology.properties
- Check flow: expected state ALLOCATED, current state UP
$ http --auth kilda:kilda $NB_IP:8080/api/v1/flows/status/flow-02
HTTP/1.1 200
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Fri, 12 Jan 2018 17:14:26 GMT
Expires: 0
Pragma: no-cache
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
{
    "flowid": "flow-02",
    "status": "UP"
}
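After the topology is reloaded, the flow status may take some time to change, so a single request can be replaced by polling. A sketch under stated assumptions: `retry` and `flow_is_up` are hypothetical helper names, `NB_IP` is set as in the pre-conditions, and the attempt count and delay are arbitrary:

```shell
# retry <attempts> <delay_secs> <command...>
# Runs the command until it succeeds or the attempts run out.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        # Sleep only if another attempt follows.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# flow_is_up <flow_id>
# Succeeds when the status endpoint used above reports UP. The pattern
# accepts both compact and pretty-printed JSON.
flow_is_up() {
    http --auth kilda:kilda "$NB_IP:8080/api/v1/flows/status/$1" \
        | grep -q '"status": *"UP"'
}
```

For example, `retry 10 3 flow_is_up flow-02` checks the status up to ten times, three seconds apart, and returns non-zero if the flow never reaches UP.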
- Dump cache state and verify: both flows are UP.
$ ./probe dump-state "cache/cache"
+----------+-----------+---------+
| Topology | Component | Task ID |
+----------+-----------+---------+
| cache | cache | 3 |
+----------+-----------+---------+
+----------
| Flows
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| Property | Forward | Reverse |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| flowid | flow-01 | flow-01 |
| bandwidth | 10000 | 10000 |
| cookie | 4611686018427387906 | 2305843009213693954 |
| description | flow-01 | flow-01 |
| last_updated | 2018-01-12T17:10:10.889Z | 2018-01-12T17:10:10.889Z |
| src_switch | de:ad:be:ef:00:00:00:01 | de:ad:be:ef:00:00:00:02 |
| dst_switch | de:ad:be:ef:00:00:00:02 | de:ad:be:ef:00:00:00:01 |
| src_port | 1 | 1 |
| dst_port | 1 | 1 |
| src_vlan | 100 | 100 |
| dst_vlan | 100 | 100 |
| meter_id | 2 | 2 |
| transit_vlan | 4 | 5 |
| flowpath:latency_ns | 1 | 1 |
| state | UP | UP |
| | switch_id port_no segment_latency seq_id | switch_id port_no segment_latency seq_id |
| path | ---------------------------------------------------------------- | ---------------------------------------------------------------- |
| | de:ad:be:ef:00:00:00:01 1 1 0 | de:ad:be:ef:00:00:00:02 1 1 0 |
| | de:ad:be:ef:00:00:00:02 1 None 1 | de:ad:be:ef:00:00:00:01 1 None 1 |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| Property | Forward | Reverse |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
| flowid | flow-02 | flow-02 |
| bandwidth | 10000 | 10000 |
| cookie | 4611686018427387907 | 2305843009213693955 |
| description | flow-02 | flow-02 |
| last_updated | 2018-01-12T17:10:10.844Z | 2018-01-12T17:10:10.844Z |
| src_switch | de:ad:be:ef:00:00:00:01 | de:ad:be:ef:00:00:00:02 |
| dst_switch | de:ad:be:ef:00:00:00:02 | de:ad:be:ef:00:00:00:01 |
| src_port | 1 | 1 |
| dst_port | 1 | 1 |
| src_vlan | 101 | 101 |
| dst_vlan | 101 | 101 |
| meter_id | 3 | 3 |
| transit_vlan | 6 | 7 |
| flowpath:latency_ns | 1 | 1 |
| state | UP | UP |
| | switch_id port_no segment_latency seq_id | switch_id port_no segment_latency seq_id |
| path | ---------------------------------------------------------------- | ---------------------------------------------------------------- |
| | de:ad:be:ef:00:00:00:01 1 1 0 | de:ad:be:ef:00:00:00:02 1 1 0 |
| | de:ad:be:ef:00:00:00:02 1 None 1 | de:ad:be:ef:00:00:00:01 1 None 1 |
+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+
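Comparing this dump with the one taken before the kill shows that flow-01 keeps its endpoints, bandwidth, and path, while `cookie`, `last_updated`, `meter_id`, and `transit_vlan` differ. A sketch of a filter that drops those rows so two dumps can be compared with `diff`; the function name and the file names in the usage note are hypothetical:

```shell
# stable_rows
# Reads a probe dump on stdin and removes the table rows whose values
# changed across the restart in the dumps above.
stable_rows() {
    grep -v -E '^\| *(cookie|last_updated|meter_id|transit_vlan) *\|'
}
```

Usage: run `./probe dump-state "cache/cache" | stable_rows > before.txt` before the kill, the same command with `after.txt` once the topology is reloaded, then `diff before.txt after.txt`. With the dumps above, the remaining difference would be the additional flow-02 table.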
- Verify flow create/delete - OK