Difference between revisions of "DigI:TI1.1"
From its-wiki.no
Josef.Noll (Talk | contribs) (→New ideas) |
Josef.Noll (Talk | contribs) (→Equipment supplier) |
||
Line 11: | Line 11: | ||
= Equipment supplier = | = Equipment supplier = | ||
see [[DigI:TI1.2]] for pilot installations | see [[DigI:TI1.2]] for pilot installations | ||
+ | |||
+ | |||
+ | = Content filtering = | ||
+ | by Iñaki Garitano, Wed Sep 12 | ||
+ | == Methods == | ||
+ | Decentralized = each Mikrotik has to do something. | ||
+ | Centralized = all traffic (at least the unauthenticated one) goes | ||
+ | through the Basic Internet core. | ||
+ | |||
+ | Methods, ordered by centralized/decentralized filtering plus | ||
+ | difficulty/time to implement: | ||
+ | |||
+ | 1.- Decentralized filtering | ||
+ | * 1.1.- Whitelist | ||
+ | * 1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.) | ||
+ | * 1.3.- RouterOS L7 filter | ||
+ | |||
+ | 2.- Semi centralized/decentralized - some actions have to be done in | ||
+ | the core while others in the Mikrotiks | ||
+ | * 2.1.- Web crawler to analyze requested web pages and populate the blacklists | ||
+ | |||
+ | 3.- Centralized filtering | ||
+ | * 3.1.- Commercial proxy/firewall to filter by Content-Type | ||
+ | * 3.2.- Open-Source proxy/firewall to filter by Content-Type | ||
+ | |||
+ | 4.- Needs more research because maybe could be done decentralized | ||
+ | * 4.1.- Traffic pattern based connection filtering | ||
+ | |||
+ | == PROS & CONS == | ||
+ | 1.- Decentralized filtering | ||
+ | - Cons: | ||
+ | - Mikrotik devices have to be populated by new configuration updates. | ||
+ | - Some performance overhead may occur. Would be interesting to | ||
+ | somehow measure it. | ||
+ | |||
+ | 2.- Semi centralized/decentralized - some actions have to be done in | ||
+ | the core while others in the Mikrotiks | ||
+ | - Cons: | ||
+ | - Core infrastructure needs to be prepared. | ||
+ | - Mikrotik devices have to be populated by new configuration updates. | ||
+ | - Some performance overhead may occur. Would be interesting to | ||
+ | somehow measure it. | ||
+ | |||
+ | 3.- Centralized filtering | ||
+ | * Cons: | ||
+ | * All traffic needs to go through a centralized device. | ||
+ | * '''not suitable, as the backbone traffic is the main cost''' (and other topics such as virus filtering, international traffic...) | ||
+ | |||
+ | === 1.1.- Whitelist === | ||
+ | - Pros: | ||
+ | - The easiest one to implement. | ||
+ | - Allows to reduce most of the traffic. | ||
+ | |||
+ | Cons: | ||
+ | * The most restrictive one. | ||
+ | * Requires to analyze the content of each web page. | ||
+ | * Dynamically generated web pages such as Facebook have to be blocked because is not possible to analyze their content beforehand. | ||
+ | * '''not completely right, as facebook uses video servers, which can be blocked''' | ||
+ | |||
+ | === Blacklist of already known Content Delivery Network (CDN) addresses === | ||
+ | Pros: | ||
+ | - The second easiest one to implement. | ||
+ | - Allows to reduce well known CDNs' traffic. | ||
+ | |||
+ | Cons: | ||
+ | * Video/Audio content delivered through not known CDNs or other | ||
+ | addresses is not filtered. | ||
+ | * Requires to analyze and update the addresses of CDNs. | ||
+ | |||
+ | === 1.3.- RouterOS L7 filter === | ||
+ | - Pros: | ||
+ | - Many interesting filters are already created. | ||
+ | - https://wiki.mikrotik.com/wiki/Manual:IP/Firewall/L7 | ||
+ | - http://l7-filter.sourceforge.net/protocols | ||
+ | - Easy to implement. | ||
+ | - Could be implemented together with 1.1 and 1.2 solutions. | ||
+ | - Cons: | ||
+ | - Only unencrypted HTTP can be matched. NOT HTTPS. | ||
+ | - Not 100% reliable. | ||
+ | |||
+ | 2.1.- Web crawler to analyze requested web pages and populate the blacklists | ||
+ | - Pros: | ||
+ | - Could be combined with 1.1, 1.2 and 1.3. | ||
+ | - Cons: | ||
+ | - Dynamically generated web pages cannot be partially filtered. | ||
+ | Such as login based pages. | ||
+ | - Not 100% reliable. | ||
+ | - Requires many resources to analyze web pages. | ||
+ | |||
+ | 3.1.- Commercial proxy/firewall to filter by Content-Type | ||
+ | - Pros: | ||
+ | - Easy to implement. | ||
+ | - Able to filter even HTTPS connections. | ||
+ | - Cons: | ||
+ | - All traffic needs to be centralized. | ||
+ | - Price. | ||
+ | - Need to perform a man in the middle for HTTPS connections. | ||
+ | - Even if it is paid most probably will not block 100% of not | ||
+ | desired traffic. | ||
+ | 3.2.- Open-Source proxy/firewall to filter by Content-Type | ||
+ | - Examples: | ||
+ | - L7-filter - http://l7-filter.clearos.com/ | ||
+ | - nDPI - https://www.ntop.org/products/deep-packet-inspection/ndpi/ | ||
+ | - OpenDPI - https://github.com/thomasbhatia/OpenDPI | ||
+ | - Pros: | ||
+ | - Cheap. | ||
+ | - Cons: | ||
+ | - All traffic needs to be centralized. | ||
+ | - Need to perform a man in the middle for HTTPS connections. | ||
+ | |||
+ | 4.1.- Traffic pattern based connection filtering (research) | ||
+ | - Pros: | ||
+ | - Works either for HTTP and HTTPS | ||
+ | - Cons: | ||
+ | - Traffic patterns need to be generated for different | ||
+ | content-type, bandwidth, etc. | ||
+ | - Final implementation on Mikrotiks needs to be analyzed. | ||
+ | - If not possible, all traffic would need to be centralized | ||
+ | |||
+ | ============================================= | ||
+ | |||
+ | IMPLEMENTATION PLAN: | ||
+ | 1.1.- Whitelist | ||
+ | - Done. | ||
+ | 1.2.- Blacklist of already known Content Delivery Network (CDN) | ||
+ | addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, | ||
+ | Livestream, DaCast, etc.) | ||
+ | - Done. | ||
+ | 1.3.- RouterOS L7 filter | ||
+ | - I would need a student to try different filters and check how it performs. | ||
+ | - Filter updating scripts would need to be generated. | ||
+ | - Mikrotik performance impact would have to be measured. | ||
+ | |||
+ | 2.1.- Web crawler to analyze requested web pages and populate the blacklists. | ||
+ | - Different crawlers such as Apache Nutch have to be analyzed. | ||
+ | - Scripts to get DNS requests for later analysis have to be developed. | ||
+ | |||
+ | 3.1.- Commercial proxy/firewall to filter by Content-Type | ||
+ | - Topology needs to be changed to centralize all traffic or at least | ||
+ | the unauthenticated one. | ||
+ | - Device needs to be configured. | ||
+ | |||
+ | 3.2.- Open-Source proxy/firewall to filter by Content-Type | ||
+ | - Different proxy/firewall solutions have to be analyzed to select | ||
+ | those performing well. | ||
+ | - Topology needs to be changed to centralize all traffic or at least | ||
+ | the unauthenticated one. | ||
+ | - Device needs to be configured. | ||
+ | |||
+ | 4.1.- Traffic pattern based connection filtering | ||
+ | - This will require a bachelor or master thesis to analyze traffic | ||
+ | patterns and create a lightweight content based filter. | ||
+ | - Analyze if it is possible to implement the content filter on the Mikrotiks. | ||
+ | |||
+ | Josef, I would like to further discuss with you all these ideas. In | ||
+ | the mean time at Mondragon we will continue with the multi-language | ||
+ | voucher platform development and virtually duplicating the | ||
+ | infrastructure. | ||
+ | |||
+ | Best regards, | ||
+ | |||
+ | -- | ||
+ | Iñaki Garitano | ||
+ | Data Analysis and Cybersecurity | ||
+ | Electronics and Computing Department | ||
+ | Mondragon University - Faculty of Engineering | ||
+ | Goiru, 2; 20500 Arrasate - Mondragón (Gipuzkoa), Spain | ||
+ | Tel. : +(34) 647503682 / +(34) 943794700 + Ext. 8119 | ||
+ | www.mondragon.edu | ||
+ | www.garitano.info / www.garitano.eu | ||
+ | |||
= New ideas = | = New ideas = |
Revision as of 09:33, 21 September 2018
Digital Inclusion (DigI)
T1.1 Low-cost infrastructure
Task Title | Low-cost infrastructure for InfoInternet |
---|---|
WP | DigI:WP-I1 |
Lead partner | Basic Internet Foundation |
Leader | |
Contributors | BasicInternet |
Objective
This task will establish the architecture for low cost access, including:
- cost calculation for TZ and Congo
Category:Task |
Deliverables in T1.1 Low-cost infrastructure
Equipment supplier
see DigI:TI1.2 for pilot installations
Content filtering
by Iñaki Garitano, Wed Sep 12
Methods
Decentralized = each Mikrotik has to do something.
Centralized = all traffic (at least the unauthenticated one) goes through the Basic Internet core.
Methods, ordered by centralized/decentralized filtering plus difficulty/time to implement:
1.- Decentralized filtering
- 1.1.- Whitelist
- 1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)
- 1.3.- RouterOS L7 filter
2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks
- 2.1.- Web crawler to analyze requested web pages and populate the blacklists
3.- Centralized filtering
- 3.1.- Commercial proxy/firewall to filter by Content-Type
- 3.2.- Open-Source proxy/firewall to filter by Content-Type
4.- Needs more research, because it might be possible to do it decentralized
- 4.1.- Traffic pattern based connection filtering
PROS & CONS
1.- Decentralized filtering
- Cons:
- Mikrotik devices have to be populated with new configuration updates.
- Some performance overhead may occur; it would be worth measuring.
2.- Semi centralized/decentralized - some actions have to be done in the core while others in the Mikrotiks
- Cons:
- Core infrastructure needs to be prepared.
- Mikrotik devices have to be populated with new configuration updates.
- Some performance overhead may occur; it would be worth measuring.
3.- Centralized filtering
- Cons:
- All traffic needs to go through a centralized device.
- not suitable, as the backbone traffic is the main cost (and other topics such as virus filtering, international traffic...)
1.1.- Whitelist
- Pros:
- The easiest one to implement.
- Reduces most of the traffic.
Cons:
- The most restrictive one.
- Requires analyzing the content of each web page.
- Dynamically generated web pages such as Facebook have to be blocked because it is not possible to analyze their content beforehand.
- not completely right, as Facebook uses video servers, which can be blocked
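The whitelist check itself can be illustrated with a minimal Python sketch. It only shows the matching logic and is not the RouterOS configuration actually deployed; the function name and the whitelist entries are hypothetical.

```python
# Hypothetical whitelist entries, for illustration only.
WHITELIST = {"wikipedia.org", "basicinternet.org"}


def is_whitelisted(host: str, whitelist: set = WHITELIST) -> bool:
    """Return True if host is a whitelisted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    labels = host.split(".")
    # Test the host itself and each parent domain:
    # en.wikipedia.org -> en.wikipedia.org, wikipedia.org, org
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


print(is_whitelisted("en.wikipedia.org"))   # subdomain of a listed domain
print(is_whitelisted("video.example.com"))  # not listed
```

Matching on whole domain labels, rather than on a plain string suffix, avoids accidentally allowing a host such as notwikipedia.org.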
1.2.- Blacklist of already known Content Delivery Network (CDN) addresses
Pros:
- The second easiest one to implement.
- Reduces traffic from well-known CDNs.
Cons:
- Video/audio content delivered through unknown CDNs or other addresses is not filtered.
- Requires analyzing and regularly updating the CDN addresses.
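As a rough sketch of how an address-based blacklist lookup works, the following Python snippet checks a destination address against a list of prefixes. The prefixes are placeholder documentation ranges, not real CDN addresses, and the function name is hypothetical; on the Mikrotiks the equivalent would be an address list, which is not shown here.

```python
import ipaddress

# Placeholder prefixes from the documentation ranges (RFC 5737, RFC 3849).
# A real list would hold the CDN prefixes collected for the blacklist.
CDN_BLACKLIST = [
    ipaddress.ip_network(prefix)
    for prefix in ("192.0.2.0/24", "198.51.100.0/25", "2001:db8::/32")
]


def is_blacklisted(address: str, networks=CDN_BLACKLIST) -> bool:
    """Return True if the address falls inside any blacklisted prefix."""
    ip = ipaddress.ip_address(address)
    # Compare only against prefixes of the same IP version.
    return any(ip.version == net.version and ip in net for net in networks)


print(is_blacklisted("192.0.2.10"))   # inside 192.0.2.0/24
print(is_blacklisted("203.0.113.5"))  # not covered by any prefix
```

Because the lookup is purely address-based, it stays cheap, but it inherits the drawback listed above: any address missing from the list passes through.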
1.3.- RouterOS L7 filter
- Pros:
- Many interesting filters have already been created:
- https://wiki.mikrotik.com/wiki/Manual:IP/Firewall/L7
- http://l7-filter.sourceforge.net/protocols
- Easy to implement.
- Could be implemented together with solutions 1.1 and 1.2.
- Cons:
- Only unencrypted HTTP can be matched, not HTTPS.
- Not 100% reliable.
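Why a layer-7 pattern works on HTTP but not on HTTPS can be shown with a small Python sketch. It uses Python's `re` module, not the RouterOS regular expression syntax, and the pattern and sample payloads are made up for illustration.

```python
import re

# Illustrative pattern: a Content-Type header announcing video or audio.
MEDIA_PATTERN = re.compile(
    rb"^content-type:[ \t]*(?:video|audio)/", re.IGNORECASE | re.MULTILINE
)


def matches_media(payload: bytes) -> bool:
    """Return True if the raw payload contains a video/audio Content-Type header."""
    return MEDIA_PATTERN.search(payload) is not None


# Plain HTTP: the response headers are readable on the wire.
plain = b"HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\nContent-Length: 1024\r\n\r\n"

# HTTPS: only a TLS record header followed by ciphertext is visible
# (the bytes below are a stand-in for encrypted data).
encrypted = bytes([0x17, 0x03, 0x03, 0x00, 0x20]) + bytes(range(32))

print(matches_media(plain))      # the header is visible, so the pattern matches
print(matches_media(encrypted))  # nothing readable to match against
```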
2.1.- Web crawler to analyze requested web pages and populate the blacklists
- Pros:
- Could be combined with 1.1, 1.2 and 1.3.
- Cons:
- Dynamically generated web pages, such as login-based pages, cannot be partially filtered.
- Not 100% reliable.
- Requires many resources to analyze web pages.
3.1.- Commercial proxy/firewall to filter by Content-Type
- Pros:
- Easy to implement.
- Able to filter even HTTPS connections.
- Cons:
- All traffic needs to be centralized.
- Price.
- Needs to perform a man-in-the-middle on HTTPS connections.
- Even a paid product will most probably not block 100% of the undesired traffic.
3.2.- Open-Source proxy/firewall to filter by Content-Type
- Examples:
- L7-filter - http://l7-filter.clearos.com/
- nDPI - https://www.ntop.org/products/deep-packet-inspection/ndpi/
- OpenDPI - https://github.com/thomasbhatia/OpenDPI
- Pros:
- Cheap.
- Cons:
- All traffic needs to be centralized.
- Needs to perform a man-in-the-middle on HTTPS connections.
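Whichever proxy is chosen, the Content-Type decision comes down to something like the following Python sketch. The blocking policy (all `video/` and `audio/` types plus two common streaming manifest types) is an assumption made for illustration, and `should_block` is a hypothetical helper, not the API of any of the products listed.

```python
# Assumed policy: block all video and audio types, plus streaming manifests.
BLOCKED_PREFIXES = ("video/", "audio/")
BLOCKED_EXACT = {
    "application/vnd.apple.mpegurl",  # HLS playlist
    "application/dash+xml",           # MPEG-DASH manifest
}


def should_block(content_type_header: str) -> bool:
    """Decide from a Content-Type header value whether to drop the response."""
    # Strip parameters such as "; charset=utf-8" and normalize the case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type.startswith(BLOCKED_PREFIXES) or media_type in BLOCKED_EXACT


print(should_block("video/mp4"))
print(should_block("text/html; charset=utf-8"))
```

For HTTPS the header is only available after the proxy has terminated the TLS connection, which is the man-in-the-middle requirement listed in the cons.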
4.1.- Traffic pattern based connection filtering (research)
- Pros:
- Works for both HTTP and HTTPS.
- Cons:
- Traffic patterns need to be generated for different content types, bandwidths, etc.
- Final implementation on Mikrotiks needs to be analyzed.
- If that is not possible, all traffic would need to be centralized.
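Since this option still needs research, the following Python sketch only illustrates the general idea of classifying a connection by its traffic pattern without looking at the payload. The heuristic (sustained average throughput over a minimum duration) and both thresholds are assumptions chosen for the example, not measured values.

```python
def looks_like_streaming(samples, min_bytes_per_s=100_000, min_duration_s=10):
    """Guess whether a connection carries streaming media.

    samples: list of (timestamp_seconds, bytes_received) tuples for one
    connection, sorted by time. Thresholds are illustrative assumptions.
    """
    if len(samples) < 2:
        return False
    duration = samples[-1][0] - samples[0][0]
    if duration < min_duration_s:
        return False  # short bursts look like ordinary page loads
    total_bytes = sum(received for _, received in samples)
    return total_bytes / duration >= min_bytes_per_s


# A steady 250 kB/s download lasting 30 seconds.
video_like = [(t, 250_000) for t in range(31)]
# A short burst, as when loading a web page.
page_load = [(0, 300_000), (1, 50_000), (2, 2_000)]

print(looks_like_streaming(video_like))
print(looks_like_streaming(page_load))
```

The check needs only per-connection byte counters and timestamps, which is why it would apply to HTTPS as well as HTTP.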
=================================
IMPLEMENTATION PLAN:
1.1.- Whitelist
- Done.
1.2.- Blacklist of already known Content Delivery Network (CDN) addresses (Akamai, Cloudflare, CloudFront, Wowza, IBM Cloud Video, Livestream, DaCast, etc.)
- Done.
1.3.- RouterOS L7 filter
- I would need a student to try different filters and check how they perform.
- Filter-updating scripts would need to be written.
- The performance impact on the Mikrotiks would have to be measured.
2.1.- Web crawler to analyze requested web pages and populate the blacklists.
- Different crawlers such as Apache Nutch have to be analyzed.
- Scripts to get DNS requests for later analysis have to be developed.
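The page-analysis step of such a crawler could look roughly like the Python sketch below. It uses only the standard library and is not Apache Nutch; the class and function names are hypothetical, and only `src` attributes of `video`, `audio` and `source` tags are inspected, so media loaded by scripts would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class MediaFinder(HTMLParser):
    """Collect the hosts that serve video/audio referenced by a page."""

    MEDIA_TAGS = {"video", "audio", "source"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.media_hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.MEDIA_TAGS:
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative references against the page URL.
            host = urlparse(urljoin(self.base_url, src)).hostname
            if host:
                self.media_hosts.add(host)


def media_hosts(html, base_url):
    """Return the set of hosts to consider for the blacklist."""
    finder = MediaFinder(base_url)
    finder.feed(html)
    return finder.media_hosts


page = (
    '<html><body><video src="https://cdn.example.net/a.mp4"></video>'
    '<audio><source src="/b.mp3"></audio>'
    '<img src="https://img.example.org/c.png"></body></html>'
)
print(media_hosts(page, "http://www.example.com/page"))
```

The resulting host names would still have to be resolved and reviewed before being pushed to the Mikrotik blacklists.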
3.1.- Commercial proxy/firewall to filter by Content-Type
- Topology needs to be changed to centralize all traffic or at least the unauthenticated one.
- Device needs to be configured.
3.2.- Open-Source proxy/firewall to filter by Content-Type
- Different proxy/firewall solutions have to be analyzed to select those performing well.
- Topology needs to be changed to centralize all traffic or at least the unauthenticated one.
- Device needs to be configured.
4.1.- Traffic pattern based connection filtering
- This will require a bachelor or master thesis to analyze traffic patterns and create a lightweight content-based filter.
- Analyze whether it is possible to implement the content filter on the Mikrotiks.
Josef, I would like to discuss all these ideas with you further. In the meantime, at Mondragon we will continue with the development of the multi-language voucher platform and with virtually duplicating the infrastructure.
Best regards,
--
Iñaki Garitano
Data Analysis and Cybersecurity
Electronics and Computing Department
Mondragon University - Faculty of Engineering
Goiru, 2; 20500 Arrasate - Mondragón (Gipuzkoa), Spain
Tel. : +(34) 647503682 / +(34) 943794700 + Ext. 8119
www.mondragon.edu
www.garitano.info / www.garitano.eu
New ideas
QR code for voucher access, alternative: SMS
- generated from http://blog.qr4.nl/QR-Code-WiFi.aspx
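For reference, Wi-Fi QR codes of this kind usually encode a short text payload in the widely used `WIFI:` convention. The Python sketch below builds that string, assuming this convention is what the generator produces; the function name and the example SSID are hypothetical, and rendering the string as an image is left to a QR library.

```python
def wifi_qr_payload(ssid, password=None, auth="WPA"):
    """Build the text payload of a Wi-Fi QR code (common WIFI: convention)."""

    def esc(value):
        # Escape the characters that have a special meaning in the payload.
        # The backslash must be handled first.
        for ch in '\\;,:"':
            value = value.replace(ch, "\\" + ch)
        return value

    if password is None:
        # Open network: no password field.
        return f"WIFI:T:nopass;S:{esc(ssid)};;"
    return f"WIFI:T:{auth};S:{esc(ssid)};P:{esc(password)};;"


print(wifi_qr_payload("BasicInternet"))  # hypothetical SSID
```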
Cost calculation
Calculations of costs, using TZ as example (owncloud confidential) https://owncloud.unik.no/index.php/apps/files/ajax/download.php?dir=%2F1-Projects%2FBasicInternet%2FTechnology%2FCost-Infrastructure&files=Infra_cost_Template_Tz.xlsx