Deployment of the Test Network
This guided session describes how to deploy and configure an AnyLog/EdgeLake Network consisting of 4 nodes (2 operators, 1 query, 1 master).
When an EdgeLake node is deployed, the software packages need to be organized on the node with the proper configurations.
Each EdgeLake node uses the same software stack. However, the nodes in the network are assigned different roles, and these roles are determined by the configurations.
The main roles are summarized in the table below:
Node Type | Functionality |
Master | A node that manages the shared metadata (if a blockchain platform is used, this node is redundant) |
Operator | A node that hosts the data. In this session, users deploy 2 Operator nodes |
Query | A node that coordinates the query process |
Additional information on the types of nodes is in the Getting Started document.
The roles are determined by configuration commands which are processed by each node at startup and enable services offered by the node. The same node may be assigned to multiple roles - there are no restrictions on the services that can be offered by a node.
Since configuration is “command based”, it is simple to change configurations, even dynamically (using the CLI), by enabling or disabling a service with the proper commands.
In this training, users start from the default configuration file and make some modifications to support their proprietary settings.
This session includes 5 sections:
- Prerequisites & Support Software
- Network Configurations
- Deploy EdgeLake
- Configuring Node
- Test & Query EdgeLake
Prerequisites & Support Software
Prerequisites
Prior to this session, users are required to prepare:
- 4 machines (virtual or physical) to host the EdgeLake nodes, as follows:
- 1 machine (physical or virtual) for applications that interact with the network (e.g. Grafana), as follows:
  - Linux, Windows or MacOSX environment
  - A minimum of 256MB of RAM
  - A minimum of 10GB of disk space
- Each node accessible by IP and Port (remove firewall restrictions)
The prerequisites for a customer deployment are available here.
Note: We recommend deploying an overlay network, such as Nebula.
- It provides a mechanism to maintain static IPs.
- It provides the mechanisms to address firewall limitations.
- It isolates the network, addressing security considerations.
Note: If an overlay network is not used in the training, remove firewall restrictions to allow nodes to communicate with peers and with 3rd-party applications.
Support Software
The following table summarizes the commonly used packages deployed with EdgeLake.
Software | Functionality |
Remote-CLI | A web based interface to the network |
PostgreSQL | SQL-based local database |
MongoDB | Local database for unstructured data |
Northbound Visualization Tool | BI tool for visualization of the data |
Southbound Data Generators | A connector to PLCs and sensors |
In this session, users will use the following packages:
- SQLite that serves as the local database (and is available by default without a dedicated install).
- Remote CLI - a web-based application that interacts with nodes in the network via REST. The Remote CLI is deployed with the Query Node.
- Grafana - a visualization application deployed on a dedicated node, as an example for an application interacting with the network data.
Static IPs
EdgeLake requires static IPs for the nodes in the network. With cloud instances, users may require some cloud specific configuration in order to enable static IPs.
Network Configurations
Users can configure the nodes to use any valid IP and Port.
For simplicity, the default setup associates the same port values with nodes of the same type. The following table summarizes the default port values used by EdgeLake.
Node Type | TCP | REST | Message Broker |
Master | 32048 | 32049 | |
Operator | 32148 | 32149 | 32150 |
Query | 32348 | 32349 | |
Note:
- The port designated as TCP is used by the EdgeLake protocol when messages are sent between nodes of the network.
- The port designated as REST is used to message a node using the REST protocol. 3rd-party apps use REST to communicate with nodes in the network.
- The port designated as Message Broker is optional and is used to accept messages from 3rd-party apps like Kafka and MQTT.
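The default port layout above can be captured in a small lookup, which is handy when scripting checks against several nodes. The following is a minimal sketch: the dictionary layout, the `endpoints` helper and the `10.0.0.12` address are illustrative assumptions, while the port numbers come from the table.

```python
from typing import Dict

# Default ports copied from the table above; the dictionary layout itself
# is an illustrative choice, not something defined by EdgeLake.
DEFAULT_PORTS: Dict[str, Dict[str, int]] = {
    "master": {"tcp": 32048, "rest": 32049},
    "operator": {"tcp": 32148, "rest": 32149, "broker": 32150},
    "query": {"tcp": 32348, "rest": 32349},
}


def endpoints(node_type: str, ip: str) -> Dict[str, str]:
    """Return {service: "ip:port"} for a node type, using the default ports."""
    try:
        ports = DEFAULT_PORTS[node_type.lower()]
    except KeyError:
        raise ValueError(f"unknown node type: {node_type!r}") from None
    return {service: f"{ip}:{port}" for service, port in ports.items()}


if __name__ == "__main__":
    # 10.0.0.12 is a placeholder address, not one taken from this guide.
    for service, address in endpoints("operator", "10.0.0.12").items():
        print(f"{service:<6} {address}")
```

Only the Operator entry carries a message broker port, mirroring the empty cells in the table.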
The Network ID
- With a Master Node deployment, the network ID is the Master’s IP and Port.
- A node can leverage any valid IP and port. In this deployment, the nodes are using their default IP (the IP that identifies the node on the network), and the ports are set by default as described above.
In this setup, the network ID is the IP of the Master and port 32048.
If the default IP is not known, issue the command get connections on the node CLI once the Master node is initiated. The command returns the IPs and ports in use; the Network ID is the IP and port assigned to TCP-External.
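As a small illustration of the rule above (network ID = Master IP plus the Master's TCP port), the sketch below builds the value after basic validation. The `network_id` helper is hypothetical, and the sample address is the Master address that appears in the sample `get processes` output later in this document.

```python
import ipaddress

DEFAULT_MASTER_TCP_PORT = 32048  # default Master TCP port from the port table


def network_id(master_ip: str, tcp_port: int = DEFAULT_MASTER_TCP_PORT) -> str:
    """Build the "IP:port" network ID for a Master-based deployment."""
    # IPv4Address raises a ValueError subclass when the text is not a valid IPv4 address.
    ipaddress.IPv4Address(master_ip)
    if not 1 <= tcp_port <= 65535:
        raise ValueError(f"port out of range: {tcp_port}")
    return f"{master_ip}:{tcp_port}"


if __name__ == "__main__":
    print(network_id("45.79.74.39"))
```

The same string is what the Query and Operator nodes would carry in LEDGER_CONN.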
Deploy EdgeLake
Detailed directions for deploying each node can be found in the Fast Deployment document.
- Clone docker-compose from EdgeLake repository
git clone https://github.com/EdgeLake/docker-compose
cd docker-compose
- Edit LEDGER_CONN in the Query and Operator configurations using the IP address of the Master node
- Update the .env configurations for the node(s) being deployed
- docker_makefile/edgelake_master.env
- docker_makefile/edgelake_operator.env
- docker_makefile/edgelake_query.env
Sample Configuration File:
#--- General ---
# Information regarding which AnyLog node configurations to enable. By default, even if everything is disabled, AnyLog starts TCP and REST connection protocols
NODE_TYPE=master
# Name of the AnyLog instance
NODE_NAME=anylog-master
# Owner of the AnyLog instance
COMPANY_NAME=New Company

#--- Networking ---
# Port address used by AnyLog's TCP protocol to communicate with other nodes in the network
ANYLOG_SERVER_PORT=32048
# Port address used by AnyLog's REST protocol
ANYLOG_REST_PORT=32049
# A bool value that determines if to bind to a specific IP and Port (a false value binds to all IPs)
TCP_BIND=false
# A bool value that determines if to bind to a specific IP and Port (a false value binds to all IPs)
REST_BIND=false

#--- Blockchain ---
# TCP connection information for Master Node
LEDGER_CONN=127.0.0.1:32048

#--- Advanced Settings ---
# Whether to automatically run a local (or personalized) script at the end of the process
DEPLOY_LOCAL_SCRIPT=false
- Start Node using makefile
make up EDGELAKE_TYPE=[NODE_TYPE]
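The LEDGER_CONN edit described in the steps above can also be scripted when several nodes are prepared at once. The following is a minimal sketch, assuming the .env files use plain KEY=value lines as in the sample configuration; `set_env_value` is a hypothetical helper, and the Master address is only an example value.

```python
def set_env_value(env_text: str, key: str, value: str) -> str:
    """Return env_text with the KEY=... line set to value.

    Comment lines and all other keys are left untouched. If the key is
    missing, a new KEY=value line is appended.
    """
    prefix = f"{key}="
    lines = env_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            lines[index] = prefix + value
            replaced = True
    if not replaced:
        lines.append(prefix + value)
    return "\n".join(lines) + "\n"


sample = """#--- Blockchain ---
# TCP connection information for Master Node
LEDGER_CONN=127.0.0.1:32048
"""

# 45.79.74.39 stands in for the real Master IP of your deployment.
updated = set_env_value(sample, "LEDGER_CONN", "45.79.74.39:32048")
print(updated)
```

To apply this to a real file, read and write it with `pathlib.Path("docker_makefile/edgelake_operator.env")` (and likewise for the Query file), keeping a backup copy first.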
Makefile Commands for Docker
- Help
Usage: make [target] EDGELAKE_TYPE=[anylog-type]
Targets:
  build       Pull the docker image
  up          Start the containers
  attach      Attach to EdgeLake instance
  exec        Attach to shell interface for container
  down        Stop and remove the containers
  logs        View logs of the containers
  clean       Clean up volumes and network
  help        Show this help message
  supported EdgeLake types: master, operator and query
Sample calls: make up EDGELAKE_TYPE=master | make attach EDGELAKE_TYPE=master | make clean EDGELAKE_TYPE=master
- Bring up (Query) Node
make up EDGELAKE_TYPE=query
- Attach to (Query) Node
# to detach: ctrl-d
make attach EDGELAKE_TYPE=query
- Bring down (Query) Node
make down EDGELAKE_TYPE=query
- Clean (Query) Node - this removes the volume(s) and image from disk
make clean EDGELAKE_TYPE=query
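When the same make targets are run for several node types, a thin wrapper can catch typos before anything is started. The sketch below only builds the command line; the target and node-type names are taken from the help output above, while `make_command` itself is a hypothetical helper.

```python
import shlex
from typing import List

# Targets and node types as listed in the Makefile help output above.
TARGETS = {"build", "up", "attach", "exec", "down", "logs", "clean", "help"}
NODE_TYPES = {"master", "operator", "query"}


def make_command(target: str, node_type: str) -> List[str]:
    """Return the argument list for `make <target> EDGELAKE_TYPE=<node_type>`."""
    if target not in TARGETS:
        raise ValueError(f"unsupported make target: {target!r}")
    if node_type not in NODE_TYPES:
        raise ValueError(f"unsupported EdgeLake type: {node_type!r}")
    return ["make", target, f"EDGELAKE_TYPE={node_type}"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for target in ("up", "down", "clean"):
        print(shlex.join(make_command(target, "query")))
```

To actually execute a command, the list can be passed to `subprocess.run(..., check=True, cwd=<path to the cloned docker-compose directory>)`.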
Configuring Node
For demo purposes, everything should deploy automatically using configuration policies.
- Based on the environment parameters in use (.env file), set the EdgeLake parameters to be used
- Connect to network services (TCP and REST)
# TCP Server
<run tcp server where external_ip = [ip] and external_port = [port] and internal_ip = [local_ip] and internal_port = [local_port] and bind = [true/false] and threads = [threads count]>
# REST server
<run rest server where external_ip = [external ip] and external_port = [external port] and internal_ip = [internal ip] and internal_port = [internal port] and timeout = [timeout] and ssl = [true/false] and bind = [true/false]>
- Using blockchain seed, get the latest copy of the blockchain
blockchain seed from !ledger_conn
- Declare Node policy - on an Operator Node, this also creates the corresponding Cluster Policy
{"operator": {
    "name": "anylog-operator",
    "company": "AnyLog Co.",
    "ip": "136.23.47.189",
    "local_ip": "136.23.47.189",
    "port": 32148,
    "rest_port": 32149,
    "cluster": "f3e300855609ba4fc83b550179f584a4",
    "loc": "37.425423, -122.078360",
    "country": "US",
    "state": "CA",
    "city": "Mountain View"
}}
- Connect to logical database(s)
# Master Node
connect dbms blockchain where type=sqlite
# Operator Node
connect dbms [DEFAULT_DBMS] where type=psql and host=127.0.0.1 and port=5432 and user=!db_user and port=!db_port
connect dbms almgm where type=psql and host=127.0.0.1 and port=5432 and user=!db_user and port=!db_port
# Query Node
connect dbms system_query where type=sqlite and memory=true
- Run blockchain sync
run blockchain sync where source=master and time="30 seconds" and dest=file and connection=!ledger_conn
- Monitor nodes - send Remote-CLI information like CPU utilization, disk space and memory usage
- When ENABLE_MQTT is set to true on an Operator Node, data will automatically flow in from a 3rd-party application via MQTT
<run msg client where broker=139.144.46.246 and port=1883 and user=anyloguser and password=mqtt4AnyLog! and log=false and topic=(
    name=anylog-demo and
    dbms="bring [dbms]" and
    table="bring [table]" and
    column.timestamp.timestamp="bring [timestamp]" and
    column.value=(type=float and value="bring [value]")
)>
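To make the TCP and REST templates in the steps above concrete, the sketch below fills them with values. It is only an illustration: the function names are hypothetical, the default thread count and timeout are borrowed from the sample `get processes` output later in this document rather than from any documented default, and the address is the one used in the sample Operator policy.

```python
def _flag(value: bool) -> str:
    """Render a Python bool the way the templates spell it (true/false)."""
    return "true" if value else "false"


def tcp_server_command(external_ip: str, external_port: int,
                       internal_ip: str, internal_port: int,
                       bind: bool = False, threads: int = 6) -> str:
    """Fill the `run tcp server` template with concrete values."""
    return (
        f"run tcp server where external_ip = {external_ip} and external_port = {external_port} "
        f"and internal_ip = {internal_ip} and internal_port = {internal_port} "
        f"and bind = {_flag(bind)} and threads = {threads}"
    )


def rest_server_command(external_ip: str, external_port: int,
                        internal_ip: str, internal_port: int,
                        timeout: int = 20, ssl: bool = False,
                        bind: bool = False) -> str:
    """Fill the `run rest server` template with concrete values."""
    return (
        f"run rest server where external_ip = {external_ip} and external_port = {external_port} "
        f"and internal_ip = {internal_ip} and internal_port = {internal_port} "
        f"and timeout = {timeout} and ssl = {_flag(ssl)} and bind = {_flag(bind)}"
    )


if __name__ == "__main__":
    ip = "136.23.47.189"  # address from the sample Operator policy
    print(tcp_server_command(ip, 32148, ip, 32148))
    print(rest_server_command(ip, 32149, ip, 32149))
```

In the default deployment these commands are issued automatically from the .env values, so a helper like this is only useful for manual or scripted reconfiguration.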
Test & Query EdgeLake
Validate Node is Running
- Validate that the node is reachable by the network members. On each deployed node, issue the command
test network
The command returns the list of registered nodes in the network and validates that the members are reachable using their published IPs and Ports. For each node, the value in the status column needs to be the plus sign (+) that designates connectivity. If the plus sign is missing, the node is down or not reachable.
- View the background processes enabled using the command
get processes
# Sample processes for Operator Node:
Process         Status       Details
---------------|------------|----------------------------------------------------------------------------|
TCP            |Running     |Listening on: 172.105.86.168:32148, Threads Pool: 6                         |
REST           |Running     |Listening on: 172.105.86.168:32149, Threads Pool: 5, Timeout: 20, SSL: False|
Operator       |Running     |Cluster Member: True, Using Master: 45.79.74.39:32048, Threads Pool: 3      |
Blockchain Sync|Running     |Sync every 30 seconds with master using: 45.79.74.39:32048                  |
Scheduler      |Running     |Schedulers IDs in use: [0 (system)] [1 (user)]                              |
Blobs Archiver |Running     |                                                                            |
MQTT           |Running     |                                                                            |
Message Broker |Running     |Listening on: 172.105.86.168:32150, Threads Pool: 5                         |
SMTP           |Not declared|                                                                            |
Streamer       |Running     |Default streaming thresholds are 60 seconds and 10,240 bytes                |
Query Pool     |Running     |Threads Pool: 3                                                             |
Kafka Consumer |Not declared|                                                                            |
gRPC           |Not declared|                                                                            |
Publisher      |Not declared|                                                                            |
Distributor    |Running     |                                                                            |
Consumer       |Running     |No peer Operators supporting the cluster                                    |
- Communicate with peer nodes. The basic command is get status (similar to ping), exemplified below from the CLI of the Master:
EL edgelake-master > run client (198.74.50.131:32148) get status
[From Node 198.74.50.131:32148] 'edgelake-operator_1@198.74.50.131:32148 running'
- On an Operator Node, users can view data coming in using an array of commands:
- Statistics on the streaming processes
get streaming
# Output
Flush Thresholds
Threshold        Value  Streamer
----------------|------|--------|
Threshold Time  |    60|Running |
Threshold Volume|10,240|        |
Write Immediate |False |        |
Buffered Rows   |    49|        |
Flushed Rows    |   101|        |

Statistics
                     Put    Put     Streaming Streaming Cached Counter    Threshold   Buffer   Threshold  Time Left Last Process
DBMS-Table           files  Rows    Calls     Rows      Rows   Immediate  Volume(KB)  Fill(%)  Time(sec)  (Sec)     HH:MM:SS
--------------------|------|-----|-|---------|---------|------|----------|-----------|--------|----------|---------|------------|
edgex.rand_data     |     0|    0| |  243,780|  243,780|    49|         0|         10|   28.55|        60|       32|00:00:04    |
- Information on messages received by clients subscribed to message brokers.
get msg client
# Output
Subscription ID: 0001
User: anyloguser
Broker: 139.144.46.246:1883
Connection: Connected

Messages   Success    Errors     Last message time   Last error time     Last Error
---------- ---------- ---------- ------------------- ------------------- ----------------------------------
24389      24389      0          2024-05-14 20:17:01

Subscribed Topics:
Topic       QOS DBMS  Table       Column name Column Type Mapping Function Optional Policies
-----------|---|-----|-----------|-----------|-----------|----------------|--------|--------|
anylog-demo|  0|edgex|['[table]']|timestamp  |timestamp  |now()           |False   |        |
           |   |     |           |value      |float      |['[value]']     |False   |        |
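The `get processes` table shown earlier in this section is easy to check by eye on one node, but a script helps when comparing several. The following is a rough sketch that assumes the output keeps the pipe-delimited layout of the sample above; `parse_processes` is a hypothetical helper, and the embedded text is a shortened copy of that sample.

```python
from typing import Dict

# Shortened copy of the sample `get processes` output from this section.
SAMPLE = """\
Process         Status       Details
---------------|------------|---------------------------------------------------|
TCP            |Running     |Listening on: 172.105.86.168:32148, Threads Pool: 6|
Operator       |Running     |Cluster Member: True, Using Master: 45.79.74.39:32048, Threads Pool: 3|
SMTP           |Not declared|                                                   |
Kafka Consumer |Not declared|                                                   |
"""


def parse_processes(output: str) -> Dict[str, str]:
    """Map each process name to its status, skipping header and separator lines."""
    statuses: Dict[str, str] = {}
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # The header line has no pipes; the separator's first cell is all dashes.
        if len(cells) < 3 or set(cells[0]) <= {"-"}:
            continue
        statuses[cells[0]] = cells[1]
    return statuses


if __name__ == "__main__":
    statuses = parse_processes(SAMPLE)
    not_running = sorted(name for name, status in statuses.items() if status != "Running")
    print("Not running:", ", ".join(not_running))
```

Whether a "Not declared" service matters depends on the node's role; for example, SMTP and Kafka Consumer are not declared in the sample Operator output above.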
Sample Queries
- View list of tables
get tables where dbms=[DB_NAME]
- View columns in table
get columns where dbms=[DB_NAME] and table=rand_data
- Get row count
# Using AnyLog tool
get rows count where dbms=edgex and table=rand_data
# Using SQL
run client () sql edgex "select count(*) from rand_data;"
- Get raw data
run client () sql edgex format=table "select timestamp, value from rand_data limit 100"
- Increment Function
run client () sql edgex format=table "select increments(day, 1, timestamp), min(timestamp), max(timestamp), min(value), avg(value), max(value), count(*) from rand_data"
- Period Function
run client () sql edgex format=table "select timestamp, value from rand_data where period(minute, 1, now(), timestamp)"
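The sample queries above all follow the same shape: `run client ()`, then `sql`, the logical database, an optional `format=` setting and the quoted SQL text. The sketch below assembles that string; `sql_command` is a hypothetical helper, and passing a destination inside the parentheses mirrors the `run client (198.74.50.131:32148) get status` example earlier rather than anything stated about queries.

```python
from typing import Optional


def sql_command(dbms: str, query: str, fmt: Optional[str] = None,
                destination: str = "") -> str:
    """Assemble a `run client (...) sql ...` command string.

    An empty destination yields `run client ()`, as in the samples above.
    """
    if '"' in query:
        # The SQL text is wrapped in double quotes, so embedded ones would break it.
        raise ValueError("query must not contain double quotes")
    parts = [f"run client ({destination})", "sql", dbms]
    if fmt:
        parts.append(f"format={fmt}")
    parts.append(f'"{query}"')
    return " ".join(parts)


if __name__ == "__main__":
    print(sql_command("edgex", "select count(*) from rand_data;"))
    print(sql_command("edgex", "select timestamp, value from rand_data limit 100", fmt="table"))
```

The resulting strings are meant to be pasted into the node CLI or the Remote CLI; the helper does not send anything by itself.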