Troubleshooting
===============

TGNMS runs as a series of containers deployed inside a Docker Swarm. To
diagnose and debug issues inside the Swarm, SSH access to the hosts running
the NMS is necessary. Make sure you are logged into one of the Swarm hosts
before running any of the steps below.

::

    # example: ssh root@192.168.1.100
    $ ssh

Common Troubleshooting Steps
----------------------------

Find which services are running and which are broken. The ``docker`` binary is
included in the installation of the NMS.

::

    $ docker service ls

::

    # Sample output, yours may look different
    ID            NAME                           MODE        REPLICAS  IMAGE                                                PORTS
    y8plg8c1p9sr  chihaya_chihaya                replicated  1/1       quay.io/jzelinskie/chihaya:v2.0.0-rc.2
    7dyluyqmde9e  database_db                    replicated  1/1       mysql:5
    w517noxowsld  e2e-lab_f8_d_api_service       replicated  1/1       ghcr.io/terragraph/e2e-controller:latest
    kyqgyw3wl83u  e2e-lab_f8_d_e2e_controller    replicated  1/1       ghcr.io/terragraph/e2e-controller:latest
    kof22ttbk9u7  e2e-lab_f8_d_nms_aggregator    replicated  1/1       ghcr.io/terragraph/e2e-controller:latest
    snbdjn3loeh0  e2e-lab_f8_d_stats_agent       replicated  1/1       ghcr.io/terragraph/e2e-controller:latest
    e8qnow4dx596  efk_elasticsearch              global      3/3       docker.elastic.co/elasticsearch/elasticsearch:7.4.0
    h31spsfpml2u  efk_es_exporter                replicated  1/1       justwatch/elasticsearch_exporter:1.0.2
    mal90x8raeu4  efk_fluentd                    replicated  1/1       ghcr.io/terragraph/fluentd:stable
    zwe4iai65an7  efk_kibana                     replicated  1/1       docker.elastic.co/kibana/kibana:7.4.0
    mh62145b6985  kafka_kafka                    global      3/3       ghcr.io/terragraph/kafka:stable
    xvpe9v0j9i68  kafka_zoo1                     replicated  1/1       zookeeper:latest
    qy4vaolmq064  kafka_zoo2                     replicated  1/1       zookeeper:latest
    9ln8ld38gx85  kafka_zoo3                     replicated  1/1       zookeeper:latest
    p27dsw42pd3z  keycloak_keycloak              replicated  1/1       jboss/keycloak:7.0.0
    yn6nfzh6n9pr  monitoring_cadvisor            global      3/3       google/cadvisor:latest
    srq0xdlooff1  msa_analytics                  replicated  1/1       ghcr.io/terragraph/analytics:rc
    uw2c97t89gsh  msa_default_routes_service     replicated  1/1       ghcr.io/terragraph/default_routes_service:rc
    law793veyulo  msa_network_test               replicated  1/1       ghcr.io/terragraph/network_test:rc
    zp07zhcf30zx  msa_scan_service               replicated  1/1       ghcr.io/terragraph/scan_service:rc
    qxaih3zv3ila  msa_topology_service           replicated  1/1       ghcr.io/terragraph/topology_service:rc
    njqhnfwqae1q  msa_weather_service            replicated  1/1       ghcr.io/terragraph/weather_service:rc
    wkey83a0arce  nms_docs                       replicated  1/1       ghcr.io/terragraph/nms_docs:rc
    w97mx4cb37oq  nms_grafana                    replicated  1/1       grafana/grafana:latest
    fzepjjsm7wa0  nms_jupyter                    replicated  1/1       jupyter/scipy-notebook:latest
    ulwa7g607ojl  nms_nms                        replicated  1/1       ghcr.io/terragraph/nmsv2:rc
    daaczr27inz6  stats_alertmanager             replicated  1/1       prom/alertmanager:latest
    3vxhh207gcp8  stats_alertmanager_configurer  replicated  1/1       facebookincubator/alertmanager-configurer:1.0.1
    p9mqpdft4qpz  stats_prometheus               replicated  1/1       prom/prometheus:latest
    yqxe9vmb71yy  stats_prometheus_cache         replicated  1/1       facebookincubator/prometheus-edge-hub:1.1.0
    taz5i5dyuhft  stats_prometheus_configurer    replicated  1/1       facebookincubator/prometheus-configurer:1.0.1
    v3ovbudhpuho  stats_query_service            replicated  1/1       ghcr.io/terragraph/cpp_backends:rc
    ljvv6z9m9y80  tg-alarms_alarms               replicated  1/1       ghcr.io/terragraph/tg-alarms:rc

If a service does not show ``n/n`` under ``REPLICAS``, it is likely having
problems. To investigate further, check the service logs.

::

    $ docker service logs nms_nms
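
With this many services, scanning the ``REPLICAS`` column by eye is error-prone. The sketch below is one possible way to list only the services whose running replica count differs from the desired count. The helper name ``find_degraded`` is hypothetical and not part of the NMS; it only assumes ``awk`` is available and that the current host is a Swarm manager, since ``docker service ls`` has to be run from a manager node.

```shell
# Hypothetical helper (not shipped with the NMS): print services whose
# running replica count differs from the desired count.
# Reads "NAME REPLICAS" pairs on stdin, for example "nms_nms 0/1".
find_degraded() {
    awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1, $2 }'
}

# On a Swarm manager, feed it the live service list:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | find_degraded
```

Empty output means every service reports ``n/n``. Each name that is printed can then be passed to ``docker service logs``, for example with ``--tail 100`` to limit the output to the most recent lines.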