Differences From
Artifact [4fedc4f1ce]:
81 81
82 82 * Community-Lab testbed architecture
83 83 ** Overall architecture
84 84 This architecture applies to all testbeds using the CONFINE software.
85 85 # Move over overlay diagram less overlay connections plus overlay network.
86 86 - A testbed consists of a set of nodes managed by the same server.
87 87 - Server managed by testbed admins.
88 - - Network and node managed by node admins (usually node owners).
89 - - Node admins must adhere to a set of conditions.
90 - - Solves management vs. ownership problem.
91 -- All components in testbed reachable via management network (tinc mesh VPN).
88 + - Network and node managed by node admins (usually owners and CN members).
89 + - Node admins must adhere to testbed conditions.
90 + - This decouples testbed management from infrastructure ownership and mgmt.
91 +- Testbed management traffic uses a tinc mesh VPN:
92 92 - Avoids problems with firewalls and private networks in nodes.
93 - - Avoids address scarcity and incompatibility (well structured IPv6 schema).
94 - - Public CN addresses still used for experiments when available.
95 -- Gateways connect disjoint parts of the management network.
96 - - Allows a testbed spanning different CNs and islands through external means
97 - (e.g. FEDERICA, the Internet).
98 - - A gateway reachable from the Internet can expose the management network
99 - (if using public addresses).
93 + - Uses IPv6 to avoid address scarcity and incompatibility between CNs.
94 + - Short-lived mgmt connections make components mostly autonomous and
95 + tolerant to link instability.
96 +- A testbed can span multiple CNs thanks to gateways.
97 + - Bridging the mgmt net over external means (e.g. FEDERICA, the Internet).
98 + - Gateways can route the management network to the Internet.
100 99 - A researcher runs the experiments of a slice in slivers each running in a
101 100 different node…
102 101
103 102 ** Nodes, slices and slivers
104 103 - …a model inspired by PlanetLab.
105 104 - The slice (a management concept) groups a set of related slivers.
106 105 - A sliver holds the resources (CPU, memory, disk, bandwidth, interfaces…)
107 106 allocated for a slice in a given node.
108 107 # Diagram: Slices and slivers, two or three nodes with a few slivers on them,
109 108 # each with a color identifying it with a slice.
110 109
111 110 ** Node architecture
112 -Mostly autonomous, no long-running connections to server, asynchronous
113 -operation: robust under link instability.
114 111 # Node simplified diagram, hover to interesting parts.
115 112 - The community device
116 - - Completely normal CN network device, possibly already existing.
117 - - Routes traffic between the CN and devices in the node's local network
118 - (wired, runs no routing protocol).
113 + - Completely normal CN device, so existing ones can be used.
114 + - Routes traffic between the CN and devices in the node's wired local
115 + network (which runs no routing protocol).
119 116 - The research device
120 - - More powerful than CD, it runs OpenWrt firmware customized by CONFINE.
121 - - Experiments run here. The separation between CD and RD allows:
122 - - Minimum CONFINE-specific tampering with CN hardware.
123 - - Minimum CN-specific configuration for RDs.
124 - - Greater compatibility and stability for the CN.
117 + - Usually more powerful than CD, since experiments run here.
118 + - Separating CD/RD makes integration with any CN simple and safe:
119 + - Little CONFINE-specific tampering with CN infrastructure.
120 + - Little CN-specific configuration for RDs.
121 + - Misbehaving experiments can't crash CN infrastructure.
122 + - Runs OpenWrt firmware customized by CONFINE.
125 123 - Slivers are implemented as Linux containers.
126 - - LXC: lightweight virtualization (in Linux mainstream).
127 - - Provides a familiar env for researchers.
128 - - Easier resource limitation, resource isolation and node stability.
124 + - Lightweight virtualization supported in mainstream Linux.
125 + - Provides a familiar and flexible env for researchers.
126 + - Direct interfaces allow experiments to bypass the CD when interacting with
127 + the CN.
129 128 - Control software
130 - - Manages containers and resource isolation through LXC tools.
131 - - Ensures network isolation and stability through traffic control (QoS)
132 - and filtering (from L2 upwards).
133 - - Protects users' privacy through traffic filtering and anonymization.
134 - - Optional, controlled direct interfaces for experiments to interact
135 - directly with the CN (avoiding the CD).
129 + - Uses LXC tools on containers to enforce resource limitation, resource
130 + isolation and node stability.
131 + - Uses traffic control, filtering and anonymization to ensure network
132 + stability, isolation and privacy.
136 133 - The recovery device can force a hardware reboot of the RD from several
137 134 triggers and help with upgrade and recovery.
138 135
139 136 ** Node and sliver connectivity
140 137 # Node simplified diagram, hover to interesting parts.
141 138 Slivers can be configured with different types of network interfaces depending
142 139 on what connectivity researchers need for experiments:
143 140 - Home computer behind a NAT router: a private interface with traffic
144 - forwarded using NAT to the CN. Outgoing traffic is filtered to ensure
145 - network stability.
141 + forwarded using NAT to the CN and filtered to ensure network stability.
146 142 - Publicly open service: a public interface (with a public CN address) with
147 - traffic routed directly to the CN. Outgoing traffic is filtered to ensure
148 - network stability.
143 + traffic routed directly to the CN and filtered to ensure network stability.
149 144 - Traffic capture: a passive interface using a direct interface for capture.
150 - Incoming traffic is filtered and anonymized by control software.
145 + Incoming traffic is filtered and anonymized to ensure network privacy.
151 146 - Routing: an isolated interface using a VLAN on top of a direct interface.
152 147 It can only reach other slivers of the same slice with isolated interfaces
153 148 on the same link. All traffic is allowed.
154 149 - Low-level testing: the sliver is given raw access to the interface. For
155 150 privacy, isolation and stability reasons this should only be allowed in
156 151 exceptional occasions.
157 152
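The sliver interface types listed under "Node and sliver connectivity" can be summarized as a small lookup table. The following is only an illustrative sketch, not part of the CONFINE software: the class `SliverInterface`, the function `interface_for`, the field names and the label "raw" for the low-level testing case are all hypothetical, and the policy strings merely paraphrase the outline above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliverInterface:
    """One sliver interface type (hypothetical model, not a CONFINE API)."""
    kind: str                       # label used in the outline ("raw" is assumed)
    use_case: str                   # what the researcher wants to emulate
    traffic_policy: str             # paraphrase of the outline's description
    exceptional_only: bool = False  # True if it should be granted only rarely


# The five interface types, in the order the outline lists them.
INTERFACE_TYPES = (
    SliverInterface("private", "home computer behind a NAT router",
                    "forwarded to the CN using NAT, filtered for stability"),
    SliverInterface("public", "publicly open service",
                    "routed directly to the CN, filtered for stability"),
    SliverInterface("passive", "traffic capture",
                    "incoming traffic filtered and anonymized for privacy"),
    SliverInterface("isolated", "routing",
                    "VLAN on a direct interface, all traffic allowed"),
    SliverInterface("raw", "low-level testing",
                    "raw access to the interface", exceptional_only=True),
)


def interface_for(use_case: str) -> SliverInterface:
    """Return the interface type that matches a use case from the outline."""
    for iface in INTERFACE_TYPES:
        if iface.use_case == use_case:
            return iface
    raise KeyError(f"no interface type for use case: {use_case!r}")


if __name__ == "__main__":
    for iface in INTERFACE_TYPES:
        flag = " (exceptional occasions only)" if iface.exceptional_only else ""
        print(f"{iface.kind:9} {iface.use_case}: {iface.traffic_policy}{flag}")
```

Keeping the policy next to each type makes the trade-off visible: the more direct the access to the CN, the more restrictive the conditions under which it is granted.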