DigI:TI1.1

From its-wiki.no

Revision as of 11:13, 21 September 2018 by Josef.Noll (Talk | contribs)


T1.1 Low-cost infrastructure

Task Title Low-cost infrastructure for InfoInternet
WP DigI:WP-I1
Lead partner Basic Internet Foundation
Leader
Contributors BasicInternet

Objective

This task will establish the architecture for low cost access, including:

  • cost calculation for TZ and Congo


Deliverables in T1.1 Low-cost infrastructure



Equipment supplier

see DigI:TI1.2 for pilot installations


Content filtering

by Iñaki Garitano, Wed Sep 12

Basic understanding of the InfoInternet standard:

  • Text & pictures: allowed
  • Streaming, games, high-bandwidth content: filtered out

The way to filter is known from the security industry, among others from Palo Alto Networks. However, their solution focuses on security, not on low-cost provision of information.

Required:

  • Roadmap to reach the InfoInternet standard
  • Today: whitelist, blacklist, content metadata
  • Tomorrow: automatic analysis (either real-time or offline)
  • Final InfoInternet standard: Public Database supporting local filtering

Methods

  • Decentralized = each Mikrotik has to do something.
  • Centralized = all traffic (at least the unauthenticated one) goes through the Basic Internet core.

Methods, ordered by centralized/decentralized filtering plus difficulty/time to implement:

1.- Decentralized filtering

  • 1.1.- Whitelist
  • 1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)
  • 1.3.- RouterOS L7 filter

2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks

  • 2.1.- Web crawler to analyze requested web pages and populate the blacklists

3.- Centralized filtering

  • 3.1.- Commercial proxy/firewall to filter by Content-Type
  • 3.2.- Open-Source proxy/firewall to filter by Content-Type

4.- Needs more research, because it could possibly be done in a decentralized way

  • 4.1.- Traffic pattern based connection filtering

PROS & CONS

1.- Decentralized filtering

  • Cons:
  • Mikrotik devices have to be populated by new configuration updates.
  • Some performance overhead may occur. It would be interesting to measure it.
  • % The overhead is very small: we can easily handle 60 GByte/day on an RB960 (RDB952) - only when traffic is tagged and measured.
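Populating the Mikrotik devices with configuration updates could be scripted. The following is a minimal sketch, assuming a RouterOS version whose firewall address lists accept host names (otherwise IP addresses would have to be used). The list name and the host names are hypothetical, and how the generated lines reach the devices is left open.

```python
def routeros_address_list_commands(list_name, hosts):
    """Build RouterOS CLI lines that add each host to a firewall address list.

    Duplicates and empty entries are dropped; the output is sorted so that
    two runs over the same input give the same configuration.
    """
    cleaned = {h.strip().lower() for h in hosts if h.strip()}
    return [
        f"/ip firewall address-list add list={list_name} address={host}"
        for host in sorted(cleaned)
    ]


if __name__ == "__main__":
    # Hypothetical whitelist entries, for illustration only.
    for line in routeros_address_list_commands(
        "whitelist", ["wikipedia.org", "example.org", "example.org"]
    ):
        print(line)
```

The generated lines only fill the address list; the firewall rules that accept or drop traffic based on that list would still have to exist on each device.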

2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks

  • Cons:
  • Core infrastructure needs to be prepared. % The core infrastructure is in place: the whitelist is centrally located (owncloud) and distributed to the LNCC.
  • Mikrotik devices have to be populated by new configuration updates.
  • Some performance overhead may occur. It would be interesting to measure it.

3.- Centralized filtering

  • Cons:
  • All traffic needs to go through a centralized device.
  • % Applicable to some traffic only, not to all traffic, as the backbone traffic is the main cost (plus other topics such as virus filtering, international traffic, ...).

1.1.- Whitelist

Whitelisting is the closed-world approach, which is easier to manage.

Pros:

  • The easiest one to implement.
  • Removes most of the traffic.

Cons:

  • The most restrictive one.
  • Requires analyzing the content of each web page.
  • Dynamically generated web pages such as Facebook have to be blocked, because it is not possible to analyze their content beforehand.
  • % Not completely right, as Facebook uses video servers, which can be blocked.
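The closed-world behaviour can be illustrated with a short sketch: a host is allowed only if it, or one of its parent domains, is on the whitelist; everything else is denied. The function name and the sample entries are assumptions made for illustration, not part of the existing setup.

```python
def is_whitelisted(host, whitelist):
    """Return True if the host or any of its parent domains is whitelisted.

    Matching is done on whole DNS labels, so "notwikipedia.org" does not
    match a whitelist entry "wikipedia.org".
    """
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in whitelist:
            return True
    return False  # closed world: unknown hosts are denied


if __name__ == "__main__":
    allowed = {"wikipedia.org"}  # hypothetical whitelist
    for name in ("en.wikipedia.org", "video.example.com"):
        print(name, is_whitelisted(name, allowed))
```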

1.2.- Blacklist of already known Content Delivery Network (CDN) addresses

Blacklisting is the open-world approach. A potential starting point is to take the top 500 web pages (national, international, ...) and analyse them in depth: which web pages do they call ("all levels below")? This should give us an overview of 90% (?) of the traffic. Strategy: measure upcoming new web sites and their traffic, and if the traffic of a site exceeds xxx MB, analyse it.

Pros:

  • The second easiest one to implement.
  • Reduces the traffic from well-known CDNs.

Cons:

  • Video/audio content delivered through unknown CDNs or other addresses is not filtered.
  • Requires analyzing and updating the addresses of CDNs % all the time
  • % Hard to catch new CDNs.
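The measuring strategy described above (flag new sites whose traffic exceeds a threshold) could look roughly like the sketch below. The record format (host, bytes) and all names are assumptions; the threshold is a parameter because the actual value is not fixed in this page.

```python
from collections import defaultdict


def hosts_to_analyse(records, known_hosts, threshold_bytes):
    """Return hosts not yet classified whose total traffic exceeds the threshold.

    records      -- iterable of (host, bytes_transferred) measurements
    known_hosts  -- hosts already on the whitelist or blacklist
    """
    totals = defaultdict(int)
    for host, nbytes in records:
        totals[host] += nbytes
    return sorted(
        host
        for host, total in totals.items()
        if host not in known_hosts and total > threshold_bytes
    )


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    sample = [("a.example", 600), ("a.example", 500), ("b.example", 100)]
    print(hosts_to_analyse(sample, known_hosts=set(), threshold_bytes=1000))
```

The hosts returned are candidates for in-depth analysis, not automatic blacklist entries.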

1.3.- RouterOS L7 filter

Pros:

Cons:

  • Only unencrypted HTTP can be matched. NOT HTTPS.
  • Not 100% reliable.
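Why only unencrypted HTTP can be matched can be shown with a small sketch. It uses Python's `re` module as a stand-in for a payload regular expression; this is not RouterOS syntax, and the pattern and host name are hypothetical. In plain HTTP the Host header is readable in the payload, whereas in HTTPS the HTTP headers travel inside the encrypted stream.

```python
import re

# Hypothetical pattern: a plaintext HTTP Host header pointing to a video site.
HOST_PATTERN = re.compile(rb"host: *[a-z0-9.-]*video\.example\.com", re.IGNORECASE)


def payload_matches(payload: bytes) -> bool:
    """Return True if the raw payload contains the Host header pattern."""
    return HOST_PATTERN.search(payload) is not None


if __name__ == "__main__":
    http_request = b"GET /clip.mp4 HTTP/1.1\r\nHost: cdn.video.example.com\r\n\r\n"
    # Stand-in for a TLS application-data record: a 5-byte record header
    # followed by bytes that are opaque to the filter.
    tls_record = bytes([0x17, 0x03, 0x03, 0x00, 0x20]) + bytes(range(32))
    print(payload_matches(http_request), payload_matches(tls_record))
```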

2.1.- Web crawler to analyze requested web pages and populate the blacklists

Pros:

  • Could be combined with 1.1, 1.2 and 1.3.

Cons:

  • Dynamically generated web pages, such as login-based pages, cannot be partially filtered.
  • Not 100% reliable.
  • Requires many resources to analyze web pages.
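The analysis step of such a crawler could start from something like the sketch below, which uses only the Python standard library. It collects the external hosts a page refers to and notes whether the page embeds video or audio elements. Fetching the pages and following links ("all levels below") is left out, and the class and function names are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ResourceCollector(HTMLParser):
    """Collect hosts referenced via src/href and detect <video>/<audio> tags."""

    MEDIA_TAGS = {"video", "audio"}

    def __init__(self):
        super().__init__()
        self.hosts = set()
        self.has_media = False

    def handle_starttag(self, tag, attrs):
        if tag in self.MEDIA_TAGS:
            self.has_media = True
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host:
                    self.hosts.add(host)


def analyse_page(html):
    """Return (sorted list of referenced hosts, whether media tags were found)."""
    collector = ResourceCollector()
    collector.feed(html)
    return sorted(collector.hosts), collector.has_media


if __name__ == "__main__":
    page = '<img src="https://img.example.org/a.png"><video src="https://cdn.example.net/v.mp4"></video>'
    print(analyse_page(page))
```

Content that is loaded by scripts at run time is not visible to this kind of static analysis, which is one reason the approach is not 100% reliable.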

3.1.- Commercial proxy/firewall to filter by Content-Type

Pros:

  • Easy to implement.
  • Able to filter even HTTPS connections.

Cons:

  • All traffic needs to be centralized.
  • Price.
  • Needs to act as a man in the middle for HTTPS connections.
  • Even a paid solution will most probably not block 100% of the undesired traffic.

3.2.- Open-Source proxy/firewall to filter by Content-Type

Examples:

Pros:

  • Cheap.

Cons:

  • All traffic needs to be centralized.
  • Needs to act as a man in the middle for HTTPS connections.
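Whichever proxy is chosen, the decision by Content-Type could follow the InfoInternet rule "text and pictures allowed". The sketch below is an assumption about how such a rule might be written, not the behaviour of any particular product: it allows only the `text` and `image` main types and denies everything else, including responses without a Content-Type header.

```python
ALLOWED_MAIN_TYPES = {"text", "image"}  # assumption derived from "text & pictures"


def is_allowed_content_type(header_value):
    """Decide from a Content-Type header value whether a response may pass.

    Parameters such as "; charset=utf-8" are ignored; a missing or empty
    header is treated as not allowed.
    """
    if not header_value:
        return False
    media_type = header_value.split(";", 1)[0].strip().lower()
    main_type = media_type.split("/", 1)[0]
    return main_type in ALLOWED_MAIN_TYPES


if __name__ == "__main__":
    for value in ("text/html; charset=utf-8", "image/png", "video/mp4"):
        print(value, is_allowed_content_type(value))
```

A rule this strict also blocks types such as `application/json`, so a real deployment would probably need a longer allow list.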

4.1.- Traffic pattern based connection filtering

Research topic

Pros:

  • Works for both HTTP and HTTPS.

Cons:

  • Traffic patterns need to be generated for different content types, bandwidths, etc.
  • Final implementation on the Mikrotiks needs to be analyzed.
  • If not possible, all traffic would need to be centralized.
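As a starting point for this research topic, a very simple pattern is "a connection that sustains a high data rate for a long time". The sketch below checks exactly that and nothing more. It only looks at byte counts and timestamps, which is why it would apply to HTTP and HTTPS alike. The sample format and both thresholds are hypothetical and would have to come out of the traffic analysis.

```python
def looks_like_streaming(samples, min_duration_s, min_rate_bytes_per_s):
    """Heuristic: long-lived connection with a high average data rate.

    samples -- list of (timestamp_seconds, bytes_transferred) for one connection
    """
    if len(samples) < 2:
        return False
    timestamps = [t for t, _ in samples]
    duration = max(timestamps) - min(timestamps)
    if duration <= 0 or duration < min_duration_s:
        return False  # short bursts, e.g. a page with pictures, are not flagged
    total_bytes = sum(b for _, b in samples)
    return total_bytes / duration >= min_rate_bytes_per_s


if __name__ == "__main__":
    # Hypothetical connection: 200 kB every second for one minute.
    video_like = [(t, 200_000) for t in range(61)]
    print(looks_like_streaming(video_like, min_duration_s=30, min_rate_bytes_per_s=100_000))
```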

=================================

IMPLEMENTATION PLAN

1.1.- Whitelist

  • Done.

1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)

  • Done.

1.3.- RouterOS L7 filter

  • I would need a student to try different filters and check how they perform.
  • Filter updating scripts would need to be generated.
  • Mikrotik performance impact would have to be measured.

2.1.- Web crawler to analyze requested web pages and populate the blacklists

  • Different crawlers such as Apache Nutch have to be analyzed.
  • Scripts to get DNS requests for later analysis have to be developed.

3.1.- Commercial proxy/firewall to filter by Content-Type

  • Topology needs to be changed to centralize all traffic, or at least the unauthenticated one.
  • Device needs to be configured.

3.2.- Open-Source proxy/firewall to filter by Content-Type

  • Different proxy/firewall solutions have to be analyzed to select those performing well.
  • Topology needs to be changed to centralize all traffic, or at least the unauthenticated one.
  • Device needs to be configured.

4.1.- Traffic pattern based connection filtering

  • This will require a bachelor or master thesis to analyze traffic patterns and create a lightweight content-based filter.
  • Analyze if it is possible to implement the content filter on the Mikrotiks.

Josef, I would like to discuss all these ideas further with you. In the meantime, at Mondragon we will continue with the development of the multi-language voucher platform and with virtually duplicating the infrastructure.

Best regards,

-- Iñaki Garitano Data Analysis and Cybersecurity Electronics and Computing Department Mondragon University - Faculty of Engineering Goiru, 2; 20500 Arrasate - Mondragón (Gipuzkoa), Spain Tel. : +(34) 647503682 / +(34) 943794700 + Ext. 8119 www.mondragon.edu www.garitano.info / www.garitano.eu


New ideas

QR code scanning for wifi access code

QR code for voucher access, alternative: SMS

Cost calculation

Calculations of costs, using TZ as example (owncloud confidential) https://owncloud.unik.no/index.php/apps/files/ajax/download.php?dir=%2F1-Projects%2FBasicInternet%2FTechnology%2FCost-Infrastructure&files=Infra_cost_Template_Tz.xlsx