Difference between revisions of "DigI:TI1.1"

From its-wiki.no


Revision as of 09:33, 21 September 2018

T1.1 Low-cost infrastructure

Task Title Low-cost infrastructure for InfoInternet
WP DigI:WP-I1
Lead partner Basic Internet Foundation
Leader
Contributors BasicInternet

Objective

This task will establish the architecture for low cost access, including:

  • cost calculation for TZ and Congo


Deliverables in T1.1 Low-cost infrastructure



Equipment supplier

see DigI:TI1.2 for pilot installations


Content filtering

by Iñaki Garitano, Wed Sep 12

Methods

Decentralized = each Mikrotik performs part of the filtering itself.

Centralized = all traffic (or at least the unauthenticated traffic) goes through the Basic Internet core.

Methods, ordered by centralized/decentralized filtering plus difficulty/time to implement:

1.- Decentralized filtering

  • 1.1.- Whitelist
  • 1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)
  • 1.3.- RouterOS L7 filter

2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks

  • 2.1.- Web crawler to analyze requested web pages and populate the blacklists

3.- Centralized filtering

  • 3.1.- Commercial proxy/firewall to filter by Content-Type
  • 3.2.- Open-Source proxy/firewall to filter by Content-Type

4.- Needs more research, because it might be possible to do it decentralized

  • 4.1.- Traffic pattern based connection filtering

Pros & Cons

1.- Decentralized filtering

Cons:

  • Mikrotik devices have to be populated with new configuration updates.
  • Some performance overhead may occur. It would be interesting to measure it.

2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks

Cons:

  • Core infrastructure needs to be prepared.
  • Mikrotik devices have to be populated with new configuration updates.
  • Some performance overhead may occur. It would be interesting to measure it.

3.- Centralized filtering

Cons:

  • All traffic needs to go through a centralized device.
  • Not suitable, as the backbone traffic is the main cost (and other topics such as virus filtering, international traffic...)

1.1.- Whitelist

Pros:

  • The easiest one to implement.
  • Reduces most of the traffic.

Cons:

  • The most restrictive one.
  • Requires analyzing the content of each web page.
  • Dynamically generated web pages such as Facebook have to be blocked because it is not possible to analyze their content beforehand.
  • Comment: not completely right, as Facebook uses video servers, which can be blocked.
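A whitelist check of the kind described in 1.1 can be sketched as follows. This is only an illustration in Python, not the configuration deployed on the Mikrotiks; the function name and the whitelist entries are hypothetical. It assumes a host is allowed when it equals a whitelisted domain or is a subdomain of one.

```python
# Hypothetical whitelist entries, used only for illustration.
WHITELIST = {"wikipedia.org", "example.org"}


def is_whitelisted(hostname, whitelist=WHITELIST):
    """Return True if hostname is a whitelisted domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    labels = host.split(".")
    # Test the host itself and every parent domain:
    # en.wikipedia.org -> wikipedia.org -> org
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))
```

Matching on whole labels avoids accepting a look-alike such as notwikipedia.org just because it ends with the same characters.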

1.2.- Blacklist of already known Content Delivery Network (CDN) addresses

Pros:

  • The second easiest one to implement.
  • Reduces the traffic of well-known CDNs.

Cons:

  • Video/audio content delivered through unknown CDNs or other addresses is not filtered.
  • Requires analyzing and updating the addresses of the CDNs.
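The address check behind a CDN blacklist can be sketched in Python as below. The ranges are documentation prefixes used as stand-ins, not real CDN addresses, and the names are hypothetical; a real list would have to be collected and kept up to date, as noted in the cons above.

```python
import ipaddress

# Stand-in ranges (reserved documentation prefixes), NOT real CDN addresses.
CDN_BLACKLIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blacklisted(address, blacklist=CDN_BLACKLIST):
    """Return True if the IP address falls inside any blacklisted range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in blacklist)
```

Any address outside the listed ranges passes, which is exactly the weakness listed above for content served from unknown CDNs.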

1.3.- RouterOS L7 filter

Pros:

  • Many interesting filters have already been created:
      https://wiki.mikrotik.com/wiki/Manual:IP/Firewall/L7
      http://l7-filter.sourceforge.net/protocols
  • Easy to implement.
  • Could be implemented together with solutions 1.1 and 1.2.

Cons:

  • Only unencrypted HTTP can be matched, not HTTPS.
  • Not 100% reliable.
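An L7 filter matches a regular expression against the beginning of a connection's payload. The Python sketch below imitates that idea to show why encrypted traffic escapes it. The pattern and the function name are hypothetical, and the 2 KB inspection window is an assumption that should be checked against the MikroTik manual linked above.

```python
import re

# Hypothetical pattern: an HTTP response header announcing video or audio.
MEDIA_PATTERN = re.compile(rb"content-type:\s*(video|audio)/", re.IGNORECASE)


def matches_media(payload, inspect_bytes=2048):
    """Search only the first inspect_bytes of the payload for the pattern."""
    return MEDIA_PATTERN.search(payload[:inspect_bytes]) is not None
```

A plain HTTP response with "Content-Type: video/mp4" matches, while a TLS record carrying the same response does not, because the header is no longer visible in the payload.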

2.1.- Web crawler to analyze requested web pages and populate the blacklists

Pros:

  • Could be combined with 1.1, 1.2 and 1.3.

Cons:

  • Dynamically generated web pages, such as login-based pages, cannot be partially filtered.
  • Not 100% reliable.
  • Requires many resources to analyze web pages.
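The page analysis step of such a crawler could, for example, look for media elements and collect the hosts that serve them as blacklist candidates. The sketch below uses only the Python standard library; the class and function names are hypothetical, and it assumes media is referenced through the src attribute of video, audio or source tags, which misses content loaded by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class MediaHostFinder(HTMLParser):
    """Collect the host names referenced by video/audio/source tags."""

    MEDIA_TAGS = {"video", "audio", "source"}

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.MEDIA_TAGS:
            return
        for name, value in attrs:
            if name == "src" and value:
                host = urlparse(value).hostname
                if host:  # relative URLs have no host and are skipped
                    self.hosts.add(host)


def media_hosts(html):
    """Return the set of hosts serving media in the given HTML text."""
    finder = MediaHostFinder()
    finder.feed(html)
    return finder.hosts
```

The resulting hosts would still need to be resolved and reviewed before being pushed to the blacklists.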

3.1.- Commercial proxy/firewall to filter by Content-Type

Pros:

  • Easy to implement.
  • Able to filter even HTTPS connections.

Cons:

  • All traffic needs to be centralized.
  • Price.
  • Needs to perform a man-in-the-middle on HTTPS connections.
  • Even though it is a paid product, it most probably will not block 100% of the undesired traffic.

3.2.- Open-Source proxy/firewall to filter by Content-Type

Examples:

  • L7-filter - http://l7-filter.clearos.com/
  • nDPI - https://www.ntop.org/products/deep-packet-inspection/ndpi/
  • OpenDPI - https://github.com/thomasbhatia/OpenDPI

Pros:

  • Cheap.

Cons:

  • All traffic needs to be centralized.
  • Needs to perform a man-in-the-middle on HTTPS connections.
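The decision a proxy in 3.1 or 3.2 makes on each response can be sketched as below. This is a minimal Python illustration with a hypothetical function name; it assumes that blocking the video/ and audio/ media types is the goal and that the proxy can read the Content-Type header, which for HTTPS requires the man-in-the-middle step listed above.

```python
# Assumed policy: block responses whose media type is video or audio.
BLOCKED_PREFIXES = ("video/", "audio/")


def should_block(content_type_header):
    """Decide from a Content-Type header value whether to block the response."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type.startswith(BLOCKED_PREFIXES)
```

A server that labels a stream with a generic type such as application/octet-stream would pass this check, which is one reason such filters do not block 100% of the undesired traffic.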

4.1.- Traffic pattern based connection filtering (research)

Pros:

  • Works for both HTTP and HTTPS.

Cons:

  • Traffic patterns need to be generated for different content types, bandwidths, etc.
  • Final implementation on the Mikrotiks needs to be analyzed.
  • If it is not possible on the Mikrotiks, all traffic would need to be centralized.
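As a starting point for that research, a traffic pattern can be as simple as flow statistics that need no payload inspection. The Python sketch below flags flows that sustain a high download rate for a long time. The Flow fields, the function name and both thresholds are assumptions chosen for illustration, not measured values.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    bytes_down: int      # bytes received by the client on this connection
    duration_s: float    # connection lifetime in seconds


def looks_like_streaming(flow, min_duration_s=30.0, min_rate_bps=500_000):
    """Flag long-lived flows whose mean download rate exceeds a threshold."""
    if flow.duration_s < min_duration_s:
        return False  # short bursts (page loads, images) are ignored
    rate_bps = flow.bytes_down * 8 / flow.duration_s
    return rate_bps >= min_rate_bps
```

Because only byte counts and timing are used, the same check applies to HTTP and HTTPS; large file downloads would also be flagged, so real patterns would need more features than this.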

=================================

IMPLEMENTATION PLAN:

1.1.- Whitelist

  • Done.

1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)

  • Done.

1.3.- RouterOS L7 filter

  • I would need a student to try different filters and check how they perform.
  • Filter updating scripts would need to be written.
  • Mikrotik performance impact would have to be measured.
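Configuration updates for the Mikrotiks could be generated by a small script. The Python sketch below turns a list of networks into RouterOS address-list commands. The list name cdn-blacklist and the function name are hypothetical, the sample range is a documentation prefix, and the generated commands should be checked against the RouterOS version in use before being deployed.

```python
import ipaddress


def address_list_commands(list_name, networks):
    """Build one RouterOS 'address-list add' command per network."""
    commands = []
    for network in networks:
        # Validate and normalise the range; invalid input raises ValueError.
        net = ipaddress.ip_network(network)
        commands.append(
            f"/ip firewall address-list add list={list_name} address={net}"
        )
    return commands


if __name__ == "__main__":
    for line in address_list_commands("cdn-blacklist", ["192.0.2.0/24"]):
        print(line)
```

A firewall rule referring to the address list still has to exist on each device; this sketch only covers populating the list.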

2.1.- Web crawler to analyze requested web pages and populate the blacklists

  • Different crawlers, such as Apache Nutch, have to be analyzed.
  • Scripts to collect DNS requests for later analysis have to be developed.

3.1.- Commercial proxy/firewall to filter by Content-Type

  • Topology needs to be changed to centralize all traffic, or at least the unauthenticated traffic.
  • Device needs to be configured.

3.2.- Open-Source proxy/firewall to filter by Content-Type

  • Different proxy/firewall solutions have to be analyzed to select those that perform well.
  • Topology needs to be changed to centralize all traffic, or at least the unauthenticated traffic.
  • Device needs to be configured.

4.1.- Traffic pattern based connection filtering

  • This will require a bachelor or master thesis to analyze traffic patterns and create a lightweight content-based filter.
  • Analyze whether it is possible to implement the content filter on the Mikrotiks.

Josef, I would like to discuss all these ideas further with you. In the meantime, at Mondragon we will continue with the development of the multi-language voucher platform and with virtually duplicating the infrastructure.

Best regards,

-- Iñaki Garitano Data Analysis and Cybersecurity Electronics and Computing Department Mondragon University - Faculty of Engineering Goiru, 2; 20500 Arrasate - Mondragón (Gipuzkoa), Spain Tel. : +(34) 647503682 / +(34) 943794700 + Ext. 8119 www.mondragon.edu www.garitano.info / www.garitano.eu


New ideas

QR code scanning for wifi access code

QR code for voucher access, alternative: SMS

Cost calculation

Calculations of costs, using TZ as example (owncloud confidential) https://owncloud.unik.no/index.php/apps/files/ajax/download.php?dir=%2F1-Projects%2FBasicInternet%2FTechnology%2FCost-Infrastructure&files=Infra_cost_Template_Tz.xlsx