Notes on VPC

Overview

  • AWS VPC – Virtual Private Cloud is a virtual network dedicated to the AWS account. It is logically isolated from other virtual networks in the AWS cloud. VPC allows the users complete control over their virtual networking environment.
  • VPC is a regional service. It can span across multiple Availability Zones (AZ). VPC contains one or more subnets. The subnets must be in same region, but could be in different availability zones.
  • VPC needs a set of IP addresses in the form of a Classless Inter-Domain Routing (CIDR) block, e.g. 10.0.0.0/16, which makes 2^16 (65,536) IP addresses available.
  • The allowed CIDR block size is between a /16 netmask (65,536 IPs) and /28 netmask (16 IPs)
  • Each VPC is separate from any other VPC created with the same CIDR block even if it resides within the same AWS account. i.e. You can clone VPCs with same CIDR blocks within same account but they will be different VPCs.
  • A machine in a VPC should either have internet access or be connected to the corporate data center over a private, direct link (e.g. Direct Connect).
  • VPN runs over the public internet and adds latency, so it is not an ideal way to connect a corporate intranet to AWS.
  • A bastion host is a host in the VPC with public internet access that acts as a gateway to the other machines in its private subnet. By shutting down the bastion host when it is not needed, you reduce the exposure of your machines in the VPC. Your machines in the VPC can access AWS services without public internet access.
  • Public IP addresses are scarce. A VPC lets you set up your own EC2 machines with your own intranet addressing. You can use the same CIDR block in 2 different VPCs in the same AWS account. They are totally independent.
  • VPC is associated with:
    • Subnets (using CIDR blocks; the default VPC uses 172.31.0.0/16)
    • Security groups (Firewall rules selectively attached to EC2 instances in VPC)
    • Network ACLs (Virtual Firewall at subnet level)
    • Route tables (Maps subnets to router IP addresses; One per subnet is optional; Implicit route table attached to VPC at the top; Subnet uses VPC route-table if there is no subnet specific route-table.)
    • AWS Network Firewall (optional) - Layer 3 to Layer 7 protection
    • Internet gateways
    • VPC peering connections. (Intra region or Inter region)
    • DHCP options (DNS Server, NTP Server, DNS domain name, whether DNS resolution enabled)
  • Default VPC: Your Amazon Web Services account, if it was created after 2013-12-04, has a default VPC in each AWS Region. The default VPC includes a default public subnet in each Availability Zone and an internet gateway (no charges for internet gateways) that is attached to your VPC.
  • There are two launch models within EC2. They're known as EC2-Classic and EC2-VPC. If you have account(s) opened before the end of 2013, you have access to both EC2-Classic and EC2-VPC. Within Classic, you can launch instances “naked”, with direct connection to the internet, without a VPC.
  • By default, each VPC endpoint can support a bandwidth of up to 10 Gbps per Availability Zone, and automatically scales up to 100 Gbps.
  • A VPC CIDR block is normally chosen from the private (RFC 1918) ranges. AWS also accepts a publicly routable range, but this is not the recommended setup.
  • To connect your corporate network to VPC, make sure to use different CIDR blocks.
  • If you set the VPC tenancy to dedicated (vs. the default shared tenancy), every EC2 instance launched in the VPC runs on hardware assigned to a single customer. This is costly but may be required for security compliance.

IP Ranges

CIDR Block

Classless Inter-Domain Routing (CIDR) block, e.g. 10.0.0.0/16. CIDR allows fine-grained subnet specification where the prefix length can be anywhere from 1 to 32 bits, e.g. 8, 10, 16, 20, 24, 28, 29. With classful addressing (class A, B, C networks), the prefix must be 8, 16 or 24 bits, which results in sub-networks that are either too small or too big.

Note: In AWS, the smallest VPC block is /28 with 16 addresses and the largest is /16 with 64K addresses, even though the 10.*.*.* private range itself is technically a /8 network.
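
The block sizes above can be checked with Python's standard ipaddress module. This is a minimal sketch; the helper names cidr_size and is_valid_vpc_prefix are made up for illustration and are not part of any AWS SDK.

```python
import ipaddress


def cidr_size(cidr: str) -> int:
    """Return the total number of IPv4 addresses in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


def is_valid_vpc_prefix(cidr: str) -> bool:
    """True if the prefix length lies in the /16../28 range allowed for a VPC."""
    prefix = ipaddress.ip_network(cidr).prefixlen
    return 16 <= prefix <= 28


print(cidr_size("10.0.0.0/16"))            # 65536
print(cidr_size("10.0.0.0/28"))            # 16
print(is_valid_vpc_prefix("10.0.0.0/8"))   # False: larger than a /16
```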

Private Reserved Addresses

As per RFC 1918, AWS uses these private IP ranges:

10.0.0.0/8 (255.0.0.0)           Largest block. 16 million addresses. 2^24
172.16.0.0/12 (255.240.0.0)      1 million addresses. 2^20
192.168.0.0/16 (255.255.0.0)     64K addresses. 2^16

Link-Local Address       Purpose
-------------------------------------------------------------
169.254.169.254          EC2 Instance Metadata Service (IMDS)
169.254.169.123          Amazon Time Sync Service (NTP)
169.254.x.x (general)    Non-routable local communication
-------------------------------------------------------------
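
As a rough illustration of the ranges listed above, the following sketch classifies an address as RFC 1918 private, link-local, or neither. The function name classify and the returned labels are assumptions chosen for this example.

```python
import ipaddress

# The three RFC 1918 private blocks and the link-local block from the tables above.
RFC1918_BLOCKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def classify(ip: str) -> str:
    """Label an IPv4 address by the range it falls into."""
    addr = ipaddress.ip_address(ip)
    if any(addr in block for block in RFC1918_BLOCKS):
        return "rfc1918-private"
    if addr in LINK_LOCAL:
        return "link-local"
    return "other"


print(classify("172.31.5.4"))        # rfc1918-private (inside 172.16.0.0/12)
print(classify("169.254.169.254"))   # link-local (the IMDS address)
print(classify("8.8.8.8"))           # other
```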

Reserved subnet IP addresses

AWS reserves 5 IP addresses (the first 4 and the last 1) in each subnet; they are not available for EC2 instances. E.g. for a subnet with CIDR block 10.0.0.0/24, these five IPs are reserved:

10.0.0.0: Network address
10.0.0.1: Reserved by AWS for the VPC router
10.0.0.2: Reserved by AWS for mapping to Amazon-provided DNS
10.0.0.3: Reserved by AWS for future use
10.0.0.255: Network broadcast address. Broadcast is not supported in a VPC, so AWS reserves this address.
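
The "first 4 and last 1" rule can be written down generically. This is a small sketch using the standard ipaddress module; reserved_addresses and usable_count are hypothetical helper names.

```python
import ipaddress


def reserved_addresses(subnet_cidr: str) -> list[str]:
    """Return the five addresses AWS reserves in a subnet: first four and last one."""
    net = ipaddress.ip_network(subnet_cidr)
    first_four = [str(net.network_address + i) for i in range(4)]
    return first_four + [str(net.broadcast_address)]


def usable_count(subnet_cidr: str) -> int:
    """Addresses left for instances after the five reserved ones are removed."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - 5


print(reserved_addresses("10.0.0.0/24"))
# ['10.0.0.0', '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.255']
print(usable_count("10.0.0.0/24"))   # 251
print(usable_count("10.0.0.0/28"))   # 11
```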

Private vs public IP addresses

  • Every instance must have a primary private IP address, attached to the default (elastic) network interface eth0. The primary private IP address cannot be reassigned to another instance, even when the EC2 instance is stopped.
  • An instance can have additional (secondary) private IP addresses. These can easily be reassigned to another instance.
  • An EC2 instance can also get a dynamic public IP. It may change when the instance is stopped and started (a plain reboot keeps it).
  • If you assign an Elastic IP, it does not change. The associated private IP stays linked to the Elastic IP as long as it is bound to that instance.

ENI - Elastic Network Interface

.      
.              1:N
.       ENI ---------- Private IPs   (Primary Private IP required + Optional )
.
.              1:1
.       ENI ---------- Public IP (eth0 only) | (primary eth0) Elastic IP
.
.              1:N
.       ENI ----------  (Elastic IP + Private IP) (Every Elastic IP requires one Private IP)
.
.              1:N                                      1:1
.       ENI ----------  Security Groups          ENI -------- MAC Address
.
.       Note: ENI belongs to AZ.
.
  • You can have all of these for single ENI:
    • A primary private IP address (for eth0 interface)
    • Additional one or more secondary private IP addresses.
    • One Elastic IP or Public IP associated with it. If Public IP, it must be eth0.
    • One or more security groups
    • A MAC Address
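  • ENI is AZ specific. You can detach it and attach it to another instance within the same AZ. A floating ENI keeps its attributes (like SG, etc.) when you reattach it.
  • You cannot detach a primary network interface eth0 from an instance.
  • Source/destination check enabled (default) ==> The instance must be the source or destination of any traffic it sends or receives, so it cannot sniff or forward packets meant for others. Source/destination check disabled ==> The instance may handle traffic not addressed to it (needed for NAT, routing and inspection appliances).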

Subnet types

The subnet type is determined by how you configure routing for your subnets.

  • Public subnet – The subnet has a direct route to an internet gateway. i.e. 0.0.0.0/0 => IGW
  • Private subnet – The subnet does not have a direct route to an internet gateway. Resources in a private subnet need a route to a NAT device only if they require public internet access. i.e. 0.0.0.0/0 => NAT
  • VPN-only subnet – The subnet has a route to a Site-to-Site VPN connection through a virtual private gateway. The subnet does not have a route to an internet gateway. i.e. 0.0.0.0/0 => VGW
  • Isolated subnet – The subnet has no routes to destinations outside its VPC. Resources in the subnet can only access resources within the same VPC. By default every subnet has access to all other subnets. If you want to isolate the subnet, you need to edit Network ACL to deny undesired traffic.
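
Since the subnet type follows from the route table alone, it can be derived mechanically. The sketch below is a simplification under stated assumptions: routes are given as a plain dict of destination CIDR to target ID, and targets are recognised only by their ID prefix (igw-, vgw-, and so on). The function name subnet_type is made up for this example.

```python
def subnet_type(routes: dict[str, str]) -> str:
    """Classify a subnet from its route table (destination CIDR -> target ID)."""
    # The implicit 'local' route exists in every route table and says nothing
    # about the subnet type, so it is ignored here.
    targets = [target for target in routes.values() if target != "local"]
    if any(target.startswith("igw-") for target in targets):
        return "public"
    if not targets:
        return "isolated"
    if all(target.startswith("vgw-") for target in targets):
        return "vpn-only"
    return "private"


print(subnet_type({"10.0.0.0/16": "local", "0.0.0.0/0": "igw-0abc"}))  # public
print(subnet_type({"10.0.0.0/16": "local", "0.0.0.0/0": "nat-0abc"}))  # private
print(subnet_type({"10.0.0.0/16": "local", "0.0.0.0/0": "vgw-0abc"}))  # vpn-only
print(subnet_type({"10.0.0.0/16": "local"}))                           # isolated
```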

Availability Zone

  • Subnets belong to Availability Zone.
  • The real AZ identifiers (AZ IDs) look like use1-az1, use1-az2, use1-az3.
  • The AZ name us-east-1a in one account may map to a different physical AZ than us-east-1a in another account.
  • The AZ IDs are visible in the VPC console.

Public vs Private Subnet in AWS VPC

The main difference is the route for 0.0.0.0/0 in the associated route table.

  • A private subnet sets the default route (0.0.0.0/0) to a NAT device (NAT gateway or NAT instance).
  • Private subnet EC2 instances only need a private IP; internet traffic is routed through the NAT device in the public subnet.
  • You could also have no route to 0.0.0.0/0 to make it a truly private subnet with no internet access in or out.
  • A public subnet routes 0.0.0.0/0 through an Internet Gateway (igw). Instances in a public subnet require public IPs to talk to the internet.

Network and Security Appliances

  • Load balancers
  • Network address translation (NAT) servers
  • Proxy servers (Software)

These are EC2 instances that can be configured with multiple network interfaces.

EC2 + Multiple subnets

  • Single EC2 can belong to multiple subnets even across VPCs!
  • But those subnets should be in same AZ!
  • You attach multiple ENIs to EC2.
  • Use cases:
    • Management Network: One ENI for public IP in subnet A and another for private IP in subnet B. Control the network
    • Network and Security Appliance: e.g. Load balancers, NAT instances, proxy servers, firewall appliance
    • High Availability: If machine goes down, attach the ENI to another instance.
|
|         VPC1                  VPC2
|
|         Public                Private     [Subnets must be in Same AZ]
|         Subnet      EC2       Subnet
|                ENI1     ENI2
|
|

Route Tables

|
|  +---VPC----IGW-----------VGW----------+
|  |                                     |
|  |      VPC-Main-Route-Table           |
|  |                                     |
|  |   Route-Table1    Route-Table2      |
|  |     Subnet1         Subnet2         |
|  |                                     |
|  +-------------------------------------+

Example Subnet Route Table Entries:
----------------------------------------------------------------------------------
Destination      Target          Notes
----------------------------------------------------------------------------------
10.0.0.0/16      local           Default route for intra-VPC traffic.
0.0.0.0/0        igw-xxxxxxxx    Public internet access (for public subnets).
0.0.0.0/0        nat-xxxxxxxx    Internet access via NAT Gateway (for private subnets).
192.168.0.0/16   pcx-xxxxxxxx    VPC Peering connection route.
172.16.0.0/12    vgw-xxxxxxxx    VPN Gateway route for hybrid connections.
10.1.0.0/16      tgw-xxxxxxxx    AWS Transit Gateway route for connected VPCs.
----------------------------------------------------------------------------------

  • Each VPC has an implicit router.
  • Each VPC has a "Main Route Table". It can also have multiple custom route tables.
  • A subnet is associated with at most 1 route table. If none is explicitly associated, it uses the VPC "Main Route Table".
  • A single route table can be associated with multiple subnets.
  • You need not worry about creating entries to route traffic between subnets within the VPC. There is already an implicit local route entry, which can't be deleted.
  • You only need to set up routes for internet gateways, virtual private gateways, VPC peering, VPC endpoints, NAT devices, etc.
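
When several entries match a destination, the most specific route (longest prefix) is the one used. The following sketch illustrates that selection with a few of the example entries from the table above; pick_route is a hypothetical helper, not an AWS API.

```python
import ipaddress

# A subset of the example entries above, as (destination CIDR, target) pairs.
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-xxxxxxxx"),
    ("192.168.0.0/16", "pcx-xxxxxxxx"),
]


def pick_route(routes, destination_ip):
    """Return the target of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination_ip)
    best_net, best_target = None, None
    for cidr, target in routes:
        net = ipaddress.ip_network(cidr)
        # Keep the match with the longest prefix seen so far.
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_target = net, target
    return best_target


print(pick_route(ROUTES, "10.0.3.7"))      # local
print(pick_route(ROUTES, "192.168.1.1"))   # pcx-xxxxxxxx
print(pick_route(ROUTES, "8.8.8.8"))       # igw-xxxxxxxx
```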

IGW - Internet Gateway

|        VPC
+---------------------------------+
|       IGW                       |
|                                 |
|       Firewall                  |
|                                 |
|                                 |
|      NACL          NACL         |
|    PubSubnet     PrivSubnet     |
|      NAT-GW                     |
|                                 |
|      EC2            EC2         |
|      SG             SG          |
|                                 |
+---------------------------------+
  • Internet Gateway is attached to VPC
  • It performs NAT (Network Address Translation) for EC2 instances with public IP.
  • For an EC2 instance to have 2-way internet access (unlike with NAT), it needs a public IP address (or Elastic IP), and its subnet route table must have an entry for the IGW.

NAT

  • Using a NAT device, instances can initiate connections to the internet, but not the other way around.
  • NAT is used so that machines in a private subnet can connect to the internet.
  • There are 2 kinds of NAT devices:
    • NAT Gateway (Managed Service, Less admin effort)
    • NAT Instance. (AMI Linux Instance configured as NAT device)
  • NAT Gateway:
    • Associated with one Elastic IP
    • Associated with a subnet and a private IP address
  • NAT instance:
    • AMI Linux instance
    • Security Groups must be explicitly configured
    • Source destination check must be disabled
    • Not highly available (unlike NAT Gateway), Limited Bandwidth.
    • You can put NAT instance inside Auto Scaling Group (with min=max=1) so that it will be auto-started in case of failure for HA.

Security Group

.   Implicit Deny.  Stateful
.
.   Inbound:
.                           
.   TCP       80   0.0.0.0/0    
.   TCP      443   0.0.0.0/0 
.   TCP       22   sg-xxxx       Allow connections from EC2 for which sg-xxx is attached.
.
.   Outbound:
.
.   TCP      ALL   0.0.0.0/0   Allow all outbound traffic.
.

Represents firewall rules. Associated at (EC2) instance level.

  • Security groups are stateful, which means that if inbound traffic is allowed, the corresponding response traffic is automatically allowed out as well (and vice versa). This works through connection tracking; the rules themselves are not modified.

  • Security group rules are whitelist only, and contain an implicit DENY ANY rule.

  • Overall, security groups are like a Layer 4 distributed firewall.

  • Security groups can reference other security groups for source. The referenced security group is used like an alias for the private IP addresses of the instances that SG is attached to:

    sg-one   TCP 22 allow : Means accept ssh connection from all private IPs 
                            for which the sg-one is attached to.
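
The whitelist-with-implicit-deny behaviour, including a security group used as a source, can be sketched as follows. This is a toy model, not how AWS implements it: the Rule class, the inbound_allowed function and the sg_members mapping (security group ID to the private IPs of its instances) are all assumptions made for this example, and connection tracking is left out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str
    port: int
    source: str  # a CIDR block, or a security group ID such as "sg-one"


def inbound_allowed(rules, sg_members, protocol, port, src_ip):
    """Return True if any allow rule matches; otherwise fall through to deny."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        if rule.protocol != protocol or rule.port != port:
            continue
        if rule.source.startswith("sg-"):
            # A referenced group acts as an alias for its members' private IPs.
            if src_ip in sg_members.get(rule.source, set()):
                return True
        elif src in ipaddress.ip_network(rule.source):
            return True
    return False  # implicit DENY ANY


RULES = [
    Rule("tcp", 80, "0.0.0.0/0"),
    Rule("tcp", 443, "0.0.0.0/0"),
    Rule("tcp", 22, "sg-one"),
]
SG_MEMBERS = {"sg-one": {"10.0.1.5"}}

print(inbound_allowed(RULES, SG_MEMBERS, "tcp", 22, "10.0.1.5"))     # True
print(inbound_allowed(RULES, SG_MEMBERS, "tcp", 22, "10.0.9.9"))     # False
print(inbound_allowed(RULES, SG_MEMBERS, "tcp", 80, "203.0.113.7"))  # True
```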
    

Network ACL

.  Default NACL allows all.  Stateless.
.
.  Inbound Rules:
.  Rule   
.  100   TCP   80      0.0.0.0/0  ALLOW    Allow HTTP traffic from anywhere.
.  110   TCP  443      0.0.0.0/0  ALLOW    Allow HTTPS traffic from anywhere.
.  120   All  All      0.0.0.0/0  DENY     Deny all other inbound traffic.
.
.  Outbound Rules:
.
.  100   All  All      0.0.0.0/0  ALLOW  Allow all outbound traffic.
.
  • Network ACLs are applied per subnet, whereas security groups apply to EC2 instances.
  • The default network ACL allows all packets.
  • Network ACLs are stateless. (Security groups are stateful.)
  • Network ACLs work only on CIDR ranges and can’t reference specific EC2 instances.
  • If you want to block SSH for an entire subnet, you could add a DENY entry for TCP port 22.
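
Network ACL rules are evaluated in ascending rule-number order and the first match decides. Every network ACL also ends with a catch-all deny; the default ACL allows everything only because it ships with an explicit allow-all rule ahead of it. The sketch below models that for inbound rules; the tuple layout and the name nacl_decision are assumptions for illustration.

```python
import ipaddress

# (rule number, protocol, port, source CIDR, action); "all" and None are wildcards.
INBOUND = [
    (100, "tcp", 80, "0.0.0.0/0", "ALLOW"),
    (110, "tcp", 443, "0.0.0.0/0", "ALLOW"),
    (120, "all", None, "0.0.0.0/0", "DENY"),
]


def nacl_decision(rules, protocol, port, src_ip):
    """Return the action of the lowest-numbered matching rule."""
    src = ipaddress.ip_address(src_ip)
    for _number, proto, rule_port, cidr, action in sorted(rules, key=lambda r: r[0]):
        if proto not in ("all", protocol):
            continue
        if rule_port is not None and rule_port != port:
            continue
        if src in ipaddress.ip_network(cidr):
            return action
    return "DENY"  # the final catch-all rule


print(nacl_decision(INBOUND, "tcp", 80, "198.51.100.4"))  # ALLOW
print(nacl_decision(INBOUND, "tcp", 22, "198.51.100.4"))  # DENY

# Blocking SSH for a whole subnet: a DENY numbered below the allow-all rule wins.
BLOCK_SSH = [
    (90, "tcp", 22, "0.0.0.0/0", "DENY"),
    (100, "all", None, "0.0.0.0/0", "ALLOW"),
]
print(nacl_decision(BLOCK_SSH, "tcp", 22, "10.0.1.5"))    # DENY
```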

AWS Network Firewall

|
|     VPC                      Firewall VPC or Firewall Subnet (One Per AZ)
|                Redirect
|    Subnet1   ----------->    Gateway Load Balancer   (Firewall internally Uses GWLB)
|              <----------                    Firewall Appliances
|
|    Subnet2
|
|
|                Redirect
|    VPC1      ----------->      VPC with Firewall (Inspection VPC)
|              <-----------
|    VPC2                          (Firewall Interface Endpoint)
|
  • Network Firewall is a managed firewall solution. Attached to VPC.
  • It provides interface VPC endpoints one per AZ. (It does not provide you with IP address.)
  • The default routing entry of protected subnets should point to Firewall VPC Endpoint.
.
.                        +-----------------------------
.                        |       0.0.0.0/0            |
.                        V                            |
.        Ingress     Firewall      Local              |
.   IGW ---------> VPC Endpoint ------------>  Protected Subnets
.                    (Subnet)                     
.                        |        
.       <----------------+        
.          0.0.0.0/0          
.                                 
.                                 
  • Network Firewall internally uses Gateway Load Balancer. The endpoint looks similar/same as GWLB endpoint.
  • Protects entire VPC.
  • Layer 3 to Layer 7 protection.
  • You can inspect traffic between:
    • VPC to VPC
    • Outbound to internet
    • Inbound from internet
    • To/From Direct Connect and Site-to-Site VPN
  • Internally it uses AWS Gateway Load Balancer (AWS internal network appliance)
  • An entire VPC (aka Inspection VPC) or a single subnet (aka Inspection subnet) is dedicated to the firewall.
  • Rules can be centrally managed cross-account by AWS Firewall Manager and applied to many VPCs.
  • You can filter traffic and raise alerts.
  • Logs of rule matches can be sent to S3, CloudWatch Logs, and Kinesis Firehose.

Elastic network interface (ENI)

  • An elastic network interface (ENI) is like a virtual network interface card (NIC).
  • You can attach multiple ENIs to an instance, and move an ENI to another instance in the same AZ.
  • Multiple Elastic IP addresses can be applied to an ENI.
  • An ENI has a dynamically assigned private address in the assigned subnet, and can optionally have a dynamically assigned public IP address as well.
  • Multiple addresses can be assigned to an ENI.

Elastic IP Address (EIP)

  • An Elastic IP is a static public IP that is applied to an ENI.
  • There is also the regular public IP address that AWS assigns to an EC2 instance on creation. That public IP can change when the instance is stopped and started.
  • Every public address (Elastic IP or not) has a private address associated with it.
  • This private address is static unless the Elastic IP address is moved to another subnet.
  • Usually you can assign an Elastic IP to any instance in your region. However, if you use a Local Zone or a Wavelength (5G) Zone, specific restrictions apply.

Bastion Host

  • Aka Jump Server: any server on the outer surface that is hardened but provides access to some service.
  • Mainly used to SSH into private instances through it.
  • A VPN server, e-mail server, etc. could also be called a bastion (military fort) server.
  • Bastion host does not do NAT, so it is not an alternative to NAT gateway. Both are required.
  • Using SSM Session Manager you can log into any EC2 instance, public or private --this is a better alternative to Bastion Host.

NAT Instance

  • It is an alternative to NAT Gateway.
  • Configure EC2 to be a NAT instance. Save the AMI and reuse it anytime.
  • Cheaper than NAT Gateway but not HA.
  • Note that a NAT instance can be used by traffic arriving over VPN or Direct Connect connections, but a NAT Gateway cannot be used from the other side of a VPC peering connection.
  • See https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Instance.html

VPC Peering

|
|      Region1                                        Region2
|      Account1                                       Account2
|                    (Peering Connection pcxxxx)                          
|         VPC-A   <------------------------------->   VPC-B
|
|                    (Request & Accept Peering)
|
|         Note: Just update route tables in each VPC's subnets.
|               Update SG inbound/outbound rules.
|
|
|         Destination     Target
|         10.0.0.0/16     Local
|         172.31.0.0/16   pcx-1122334
|
|     Optional: EnableDNS at Requestor Side and Acceptor Side
|
|
  • First create the VPC peering connection between the 2 VPCs and get the peering ID, which looks like pcx-112233445566. Use it to update the route tables of the relevant VPCs to route traffic.
  • Can connect 2 VPCs in different regions (different countries). No gateways or VPNs are involved. Data does not travel through the internet; it flows through the AWS backbone.
  • Connecting the VPCs with a VPC Peering connection is the cheapest way to do data replication between different regions.
  • The two VPCs must not have overlapping CIDR blocks.
  • To enable VPC Peering, you just update your subnet route tables and EC2 instance security groups. No new devices are added. There is no charge for the peering connection itself; data transfer charges still apply.
  • Max 125 Peering connections per VPC.
  • VPC Peering does not give you access to the other VPC's VPN, Direct Connect, IGW, Gateway VPC endpoint, etc. Just remember: Peering is not transitive by design.
  • If you enable DNS on Peering Connection at Requestor side, the Requestor VPC will automatically get fallback DNS resolution from Acceptor VPC. And vice versa.
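
The non-overlap requirement is easy to verify before requesting a peering connection. A minimal sketch with the standard ipaddress module follows; can_peer is a hypothetical helper name and it checks only one CIDR block per VPC.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """True if the two VPC CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


print(can_peer("10.0.0.0/16", "172.31.0.0/16"))   # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))   # False: second block is inside the first
```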

VPC Shared Subnets aka VPC Sharing

.
.      Shared Subnets   ===  Get Implicit Routing 
.
.
.      VPC1                        VPC2          VPC3
.
.      Private-Subnet1   
.      Private-Subnet2 ------------------------------ (Only Subnet2 shared)
.                                    Share
.
  • The owner account can share its subnets with one or more other accounts (participants).

  • The owner and participant accounts must belong to the same AWS Organization.

  • In the VPC console, you can view all subnets (with owner information). You can communicate with EC2 instances on a shared subnet without having to update any route tables!

  • Share from VPC console or use cli:

    aws ram create-resource-share   ....
    aws ram associate-resource-share   ....
    

Use Cases:

  • Expose your EC2 instances
  • Expose your applications running in shared subnet

Note: RAM share also enables you to share specific resources like PHZ, Resolver Rules, Prefix List, Transit Gateway, etc.

AWS Transit Gateways - TGW

.                                                          +-------------+
.                                           Peering Mesh   |   TGW       |
.                                         +----------------| Inter/Intra |    
+---------------------------+             |                |   Region    |         +---- on-premise
|                           |      +--------------+        +-------------+         |
|    Shared VPC - Account-1 |      |              |                   +--> VGW-VPN-in-Another-Account
|    [ To Share NAT-GW]     |      |   TGW        |    [Attachments]  |   
|                           |======| [Account1]   |-------------------+--> VPC-in-Dev-Account 
|    [Central Network Ac]   |      | [RAM Shared] |    [GRE Tunnel]   |
|                           |      +--------------+    [BGP ]         +--> VPC-in-Prod-Account
+---------------------------+                                   
.
.     ==== indicates initial attachment to shared VPC on TGW creation.
.     There is one route-table associated with one attachment.
.     You must share TGW before attaching VPC if it belongs to another account.
.
  • Transit Gateway is a regional resource in an account, but it is not useful without at least one VPC attachment.

  • The following are the key concepts for transit gateways.

  • Attachments — You can attach the following:

    • One or more VPCs
    • A Connect SD-WAN/third-party network appliance
    • An AWS Direct Connect gateway
    • A peering connection with another transit gateway
    • A VPN connection to a transit gateway
  • Transit gateway Maximum Transmission Unit (MTU) — The maximum transmission unit (MTU) is 8500 bytes. Traffic over VPN connections can have an MTU of 1500 bytes.

  • Transit gateway route table — A transit gateway has a default route table and can optionally have additional route tables. It can have static and dynamic (propagated by the attachments) entries. The target of these routes could be any transit gateway attachment.

  • Associations — Each attachment is associated with exactly one route table. Each route table can be associated with zero to many attachments.

  • Route propagation — A VPC, VPN connection, or Direct Connect gateway can dynamically propagate routes to a transit gateway route table. With a Connect attachment, TGW routes are propagated by default. With a VPC, you must create static routes to send traffic to the transit gateway. With a VPN connection, routes are propagated from the transit gateway to your on-premises router using Border Gateway Protocol (BGP). With a Direct Connect gateway, allowed prefixes are originated to your on-premises router using BGP. With a peering attachment, you must create a static route in the transit gateway route table to point to the peering attachment.

  • Create Transit Gateway in One account, share it using RAM in other accounts. Create "Transit Gateway Attachment" from TGW to VPC.

  • TGW is highly available and scalable service to consolidate the AWS VPC routing configuration for a region with a hub-and-spoke architecture.

  • Acts as a Regional virtual router to interconnect VPCs and on-premises networks.

  • Traffic always stays on the global AWS backbone, which reduces exposure to DDoS attacks and other internet-borne threats.

  • TGWs across different regions can peer with each other to enable VPC communications across regions.

  • VPC Peering and Transit Gateway are used to connect multiple VPCs.

  • VPC Peering is not transitive. You need full mesh connectivity.

  • Transit Gateway provides hub-and-spoke architecture. Best if you have too many VPCs.

  • AWS Transit Gateway is a fully managed service that connects VPCs.

  • Connects On-Premises networks through a central hub without relying on numerous point-to-point connections or Transit VPC.

  • You can attach all your hybrid connectivity (VPN and Direct Connect connections) to a single Transit Gateway instance, consolidating and controlling your organization’s entire AWS routing configuration in one place! Very powerful.

  • TGW: Up to 5,000 attachments per transit gateway.

  • TGW: Up to 50 Gbps (burst)/attachment. For VPC peering, there is no limit since it is only limited by the instance bandwidth limit.

  • Slightly lower bandwidth than VPC peering since it involves a hardware (router) hop.

  • Slightly higher cost due to Data transfer, Data processing, and Hourly per attachment. However operational (administration) cost is lower with simplified architecture.

  • Security Group (cross-referencing) (per EC2 instance) is supported in Peering but not in TGW.

  • Extra cost of hourly charge per attachment in addition to data fees.

  • Use VPC Peering if the number of connected VPCs is small (<10), or heavy data transfer is involved, or there is a specific need for low latency, or more than 50 Gbps throughput is required.

  • You can not share NAT Gateway using RAM directly with other VPCs. To share NAT GW among VPCs, you should share Transit GW as described above:

    See https://aws.amazon.com/blogs/networking-and-content-delivery/using-nat-gateways-with-multiple-amazon-vpcs-at-scale/
    
  • You can also connect 2 Transit Gateways from 2 accounts with TGW Peering. It is less common but powerful. See https://docs.aws.amazon.com/vpc/latest/tgw/tgw-peering.html
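
The full-mesh versus hub-and-spoke trade-off in the notes above comes down to simple counting: n VPCs need n(n-1)/2 peering connections for full connectivity, but only n attachments on one transit gateway. The function names below are made up for this sketch.

```python
def full_mesh_peering_connections(vpc_count: int) -> int:
    """Peering connections needed so that every VPC reaches every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_attachments(vpc_count: int) -> int:
    """Transit gateway attachments needed: one per VPC."""
    return vpc_count


for n in (3, 10, 50):
    print(n, full_mesh_peering_connections(n), hub_and_spoke_attachments(n))
# 3 3 3
# 10 45 10
# 50 1225 50
```

Each peered VPC pair also needs route table entries on both sides, so the administrative gap grows faster than the connection count alone suggests.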

Transit Gateway Connect Attachment vs Connect Peers

A TGW Connect attachment runs GRE (point-to-point tunneling) on top of an existing transport attachment such as a VPC attachment. A TGW Connect peer is a GRE tunnel plus BGP (Border Gateway Protocol) session within that Connect attachment, and is used to attach SD-WAN or other network appliances:

|           VPC Attachment                                   VPC3
|     VPC1 --------+                                                
|                  |                           Connect       
|                 TGW                         Attachment    Appliance
|              Gateway-IP (e.g. 192.0.2.1) ---------------- Peer IP (e.g. 172.31.1.11)
|              BGP-IPs (169.254.6.2)       ...............  BGP IP  (e.g. 169.254.6.1)
|                  |                         BGP Peering
|     VPC2 --------+
|

AWS VPN

  • VPN connection could be any one of the following types:
    • AWS Hardware VPN (aka site-to-site VPN) involves creating Virtual GW and Customer GW. (Site-to-Site VPN is a hardware VPN but managed solution)
    • AWS Software VPN (EC2 instance running third party VPN appliance software)
    • AWS Client VPN - Attach Client VPN component in an AWS target network. Ask clients to connect using OpenVPN. (Managed solution. No software/hardware installation required) See below for more notes.
    • AWS Direct Connect + AWS Hardware VPN.

Site-to-Site VPN

  • On the AWS side of the VPN connection, a Virtual Private Gateway (VGW) provides two VPN endpoints for automatic failover. (Managed hardware solution)

  • On the customer side, a customer gateway (CGW) needs to be configured, which is the physical device or software application on the remote side of the VPN connection:

    VGW - Virtual Private Gateway (Max only one VGW per VPC allowed. Default is None)
    TGW - Transit Gateway
    .
    VPC             Site-to-Site-VPN         On-Premises-Network
    +-----------+-------------------------+------------------+
    |           |       VPN Connection    |                  |
    |       VGW |         1:10+           |                  |
    |        or |   <===================> | Customer-Gateway |
    |       TGW |         BGP             |                  |
    |           |    (2 VPN Tunnels)      |                  |
    +-----------+-------------------------+------------------+
    .
    .
    .                    --------------- On-Premise-1
    .               VGW  --------------- On-Premise-2
    .                    --------------- On-Premise-3
    .
    .                     (VPN CloudHub)
    .
    
  • You can also use a Transit Gateway to on-premises configuration with Site-to-Site VPN if you want to connect multiple VPCs.

  • It is recommended that the customer gateway VPN device supports BGP (Border Gateway Protocol), since BGP automatically advertises routes to and from the virtual gateway.

  • One VGW can support up to 10 customer gateways, creating a VPN CloudHub! (Limit can be increased)

  • Site-to-site VPN Concepts:

    • VPN connection: A secure IPSec connection between your on-premises equipment and your VPCs.

    • VPN tunnel: An encrypted link where data can pass from the customer network to or from AWS. Each VPN connection includes two VPN tunnels which you can simultaneously use for high availability.

    • Customer gateway: An AWS resource which provides information to AWS about your customer gateway device.

    • Customer gateway device: A physical device or software application on your side of the Site-to-Site VPN connection.

    • Target gateway: A generic term for the VPN endpoint on the Amazon side of the Site-to-Site VPN connection.

    • Virtual private gateway: A virtual private gateway is the VPN endpoint on the Amazon side of your Site-to-Site VPN connection that can be attached to a single VPC.

    • Transit gateway: A transit hub that can be used to interconnect multiple VPCs and on-premises networks, and as a VPN endpoint for the Amazon side of the Site-to-Site VPN connection.

      Note:

      Note that Dx (DirectConnect GW) can connect multiple VPCs too but traffic will have
      to go through DirectConnect-DataCenterRouter if you are not using TGW.
      
  • VPN CloudHub: To connect multiple remote networks (e.g. multiple branch offices), you may have multiple VPN hardware devices (one per branch). Use Transit Gateway for this.

  • Always make sure every VGW and customer gateway has a unique ASN, especially in a hub configuration, so that BGP works correctly.

Site-to-Site VPN Connection Throughput Increase

.
.              ECMP
.    TGW  ------------------ Customer Gateway
.            Faster VPN
  • The maximum throughput of a Site-to-Site VPN tunnel is 1.25 Gbps.
  • If you have higher internet bandwidth, you could use equal cost multipath (ECMP) routing.
  • ECMP routing is available for VPN connections that are attached to a transit gateway.
  • With ECMP routing, you can aggregate multiple VPN connections to achieve a higher effective throughput.
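
As a back-of-the-envelope illustration of the aggregation, the sketch below multiplies the number of equal-cost paths by the 1.25 Gbps cap. It is an upper bound only, assuming every path reaches the cap and flows hash evenly across paths; the function name and its default are assumptions.

```python
PER_PATH_GBPS = 1.25  # the Site-to-Site VPN throughput cap quoted above


def ecmp_upper_bound_gbps(equal_cost_paths: int, per_path_gbps: float = PER_PATH_GBPS) -> float:
    """Best-case aggregate throughput across the given number of ECMP paths."""
    return equal_cost_paths * per_path_gbps


print(ecmp_upper_bound_gbps(2))   # 2.5
print(ecmp_upper_bound_gbps(4))   # 5.0
```

A single flow still follows one path, so one TCP connection does not exceed the per-path cap.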

Client VPN

  • If you want clients to connect using OpenVPN software, without installing any special customer gateway software/hardware on premises, then you need Client VPN.
|
|                                   VPC
|                       Internet
|      clientVPN       --------->  clientVPN-Endpoint -------> One subnet per AZ in VPC.
|    {OpenVPN client}              {public DNS Name}
|                                  {created by AWS }
|
|
|
  • AWS Client VPN supports authentication mechanisms like Active Directory, certificate-based authentication, and federated authentication using SAML 2.0. It is a software solution.
  • You create a Client VPN endpoint that is associated with one subnet per AZ in the VPC. You can associate multiple subnets from the same VPC, but only one per AZ. The endpoint gets network interfaces with private IPs in the associated subnets, and client traffic enters the VPC through them. Once you reach an EC2 instance in that subnet, you can reach any other resource in the VPC from there.
  • The target subnet must have at least 20 available IPs at any time.
  • If the target VPC is connected to a Transit Gateway, which in turn connects to multiple corporate locations (by VPN) and multiple VPCs, then it is a perfect solution to access everything from a home VPN client!

AWS Directconnect

DX     = Direct Connect
VIF    = Virtual Interface
VGW    = Virtual Private Gateway
TGW    = Transit Gateway
DX-GW  = Direct Connect Gateway. Use it to connect to multiple regions.
DX-GW? = Optional Direct Connect Gateway

Note: VGW - Virtual Private Gateway serves either purpose:
              VPN terminator (with public IP)  Or
              Direct Connect terminator (to connect to VPC).
      VGW is attached to a VPC, not a subnet.
      DX-GW is also a Direct Connect terminator.
      TGW is not a Direct Connect terminator. You need a DX-GW in front of the TGW.
      VIF, DX-GW and VGW are all logical constructs, not physical devices.
      The AWS backbone is present at the Direct Connect location itself.
      VIF is logically present in the Direct Connect location.
      VGW is logically present in the VPC.
      DX-GW is a global resource that lives outside any region.
      You need a separate VIF per VPC or TGW.

+-----------------+                           
|  AWS Region     |               Max 50 VIFs +--------------+                 
|                 |                           |              |
|  S3-Service  <--|----------------  Pub-VIF  |              |
|                 |                           |              |    LAG
|  VPC1 -- VGW  <-|-DX-GW?---------  Priv-VIF | Dx-Router in | <--------->  Customer-Router
|                 |                           | Dx-Location  | <--------->    On-premise 
|  VPC2 -- TGW  <-|-DX-GW----------  Transit  |              |                  BGP        
|                 |                    VIF    +--------------+ 
+-----------------+                                                            
.
.                          +---------+       VIF1   +-------------+
.  Region1   VPC1 VGW -----|  DX-GW  |--------------| Dx-Location |   One VIF per connection.
.  Region2   VPC2 VGW -----|         |--------------|             |
.                          +---------+       VIF2   +-------------+
.
| Note: Pub-VIF allows you to connect to any Amazon public IP (e.g. s3.amazonaws.com).
|       The traffic does not go through public internet.
|
|       LAG - Link Aggregation Group uses multiple connections as a single logical connection.
|             This increases the available bandwidth.
|
  • AWS Direct Connect is available at locations around the world such as Mumbai.

  • Customer network router device must support Border Gateway Protocol (BGP) and BGP MD5 authentication.

  • 802.1Q VLAN encapsulation must be supported across the entire connection, including intermediate devices.

  • A private VIF is, by default, used to connect to a single VPC using a VGW.

  • For connecting to multiple VPCs in multiple regions, use a Direct Connect gateway (DX-GW).

  • If you want to connect to multiple VPCs, you need VGW+DX-GW or TGW+DX-GW.

  • A DX-GW in front of a TGW is always required.

  • However, a DX-GW in front of a VGW is required only if you want to connect to many VPCs.

  • Transit Gateway peering across regions is also supported now.

  • If you enable the SiteLink feature while creating a VIF, traffic can flow between DX locations without going through AWS Regions. If you have multiple on-premises data centers, data flows between them faster through the DX locations:

    +---------------+-------+
    |      AWS      |       |---- VIF  DX-Location-1 ----- On-Premise-DC1
    |    Region-1   | Dx-GW |     
    |    Region-2   |       |---- VIF  DX-Location-2 ----- On-Premise-DC2
    +---------------+-------+
    
    Note: For connection between DC1 and DC2, SiteLink enables quick flow bypassing Regional routers.
    
  • Many other options are also possible:

    See https://docs.aws.amazon.com/whitepapers/latest/building-scalable-secure-multi-vpc-network-infrastructure/direct-connect.html

Direct Connect + VPN configuration

Virtual Private Gateway is either:

- VPN terminator at AWS side OR
- Direct Connect Terminator

Configure the VGW as VPN terminator.

This acts like Site-To-Site VPN with Direct Connect:

| 
|  Direct Connect + Site-to-Site VPN
|                                        Dx Location
|                     IPSec           ---------------------+      On-Premise
|       VPC - VGW <------------Public-|DxRouter--Customer  |-----Customer GW
|                              VIF    |          Router    |     (Router+BGP+IPSec)
|                                     +--------------------+
|

Recall a simple Direct Connect:

|  Simple Direct Connect 
|                                        Dx Location
|                                     ---------------------+     On-Premise
|  VPC-VGW--[Dx]---------------Private|DxRouter--Customer  |-----Router (Router+BGP)
|                              VIF    |          Router    |     (IPSec Not Required)
|                                     +--------------------+
|

Recall a simple VPN terminator for Site-to-Site VPN:

|  Simple Site-to-Site (Hardware) VPN Terminator 
|                        
|                                On-Premise
|       VPC-VGW-------------------Customer (Router + IPSec) + Optional BGP
|                                   GW
|

Note:

The peer GRE address is the private IP address of the SD-WAN instance that you want to create 
the GRE tunnel to. The Transit Gateway GRE address is one of the available IP addresses from 
the Transit Gateway CIDR. (e.g. 192.0.2.0/24, etc)
The BGP inside IPs are part of a /29 CIDR block. (e.g. from the 169.254.0.0/16 range for IPv4). 
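
The inside-CIDR rule above (a /29 block taken from 169.254.0.0/16 for IPv4) can be checked with a few lines of standard-library Python. This is only a sketch of that one rule; any further reservations AWS applies inside the range are not modeled, and the function name is made up for illustration.

```python
import ipaddress

LINK_LOCAL_RANGE = ipaddress.ip_network("169.254.0.0/16")


def is_valid_inside_cidr(cidr):
    """True if cidr is a well-formed IPv4 /29 network inside 169.254.0.0/16."""
    try:
        network = ipaddress.ip_network(cidr, strict=True)  # rejects host bits
    except ValueError:
        return False
    return (
        network.version == 4
        and network.prefixlen == 29
        and network.subnet_of(LINK_LOCAL_RANGE)
    )


for candidate in ["169.254.6.0/29", "169.254.6.0/30", "10.0.0.0/29", "169.254.6.1/29"]:
    print(candidate, is_valid_inside_cidr(candidate))
```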

Virtual Private Gateway is similar to Transit Gateway and is associated with IP address in
range 169.254.0.0/16 CIDR block with highest 8 addresses. i.e. Last byte being 248-254

GRE - Generic Routing Encapsulation - Tunneling

Generic Routing Encapsulation is a tunneling protocol developed by Cisco Systems that can encapsulate a wide variety of network layer protocols inside virtual point-to-point links or point-to-multipoint links over an Internet Protocol network.

AWS Transit VPC

A transit VPC is a central VPC that has VPN connections to other VPCs.

|
|                 VPN                   VPN
|       VPC1    <------>  Transit-VPC <------->    VPC2  
|       VGW               CustomerGW               VGW   
|                         Appliances
|
  • Transit Gateway is preferred over Transit VPC. Both are used to interconnect VPCs. TGW is a managed, scalable service.
  • Transit VPC also follows the hub-and-spoke model, just like TGW.
  • An EC2 instance is configured as a router with VPN customer gateway software installed on it. It connects to multiple VPCs using IPSec VPN connections. Each such VPC has a VGW attached to it. Essentially, if you can reach the transit VPC by some means, you have a VPN connection to a set of other AWS VPCs.
  • Transit Gateway is a managed, scalable router service which is more sophisticated.
  • VPN CloudHub is another alternative with a more natural and simpler setup. VPN CloudHub focuses on connecting multiple on-premises locations using site-to-site VPN, whereas Transit VPC is concerned with connecting multiple VPCs together.
aws ec2 describe-network-interfaces

VPC Endpoint

A VPC Endpoint is a virtual device (like a router) which enables EC2 instances to access AWS services without going over the internet.

There are two types of VPC Endpoints:

  • VPC Gateway Endpoint (legacy). Works for S3 and DynamoDB only. Free.
  • VPC Interface Endpoint. Uses PrivateLink technology. Works for almost all AWS services and for your own or other customers' services hosted in other VPCs.

Note: VPC endpoint is associated with a resource policy. This usually restricts the principals and roles who can access this endpoint.

If this endpoint points to a service, say S3, then the S3 bucket policies can further restrict access to the principals. They can even have conditions on aws:SourceVpc and aws:SourceVpce to protect the access.

When a request arrives through a VPC endpoint, its request context carries the aws:SourceVpc and aws:SourceVpce values that such conditions match against.
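
As an illustration of the aws:SourceVpce condition mentioned above, the sketch below builds a bucket policy document that denies S3 actions unless the request came through one specific endpoint. The bucket name and endpoint ID are placeholders, and the helper function is hypothetical; only the policy JSON structure and the condition key are standard.

```python
import json


def bucket_policy_restricted_to_endpoint(bucket_name, vpce_id):
    """Build a bucket policy that denies access unless it comes via vpce_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # Matches every request whose source endpoint is NOT vpce_id.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


policy = bucket_policy_restricted_to_endpoint("my-bucket", "vpce-12345678")
print(json.dumps(policy, indent=2))
```

Note that a blanket deny like this also blocks access from outside the VPC, for example from the console, so it is usually narrowed before real use.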

If you have your own service that you want to make available to other AWS customers, you can use AWS PrivateLink technology: you create a VPC endpoint service and let your customers know about it. Your customers then create a VPC (interface) endpoint and connect to your service:

+------------------------------+                      +--------------------------------+
|   VPC (Service Provider)     |                      |      Another AWS customer VPC  |
|                              |                      |                                |
|   [Your Own Service]         |     PrivateLink      |                                |
|                 +--------+   | <------------------- |  {VPC-Interface-Endpoint}      |
| +------------+  |EndPoint|   |                      |       (Security Group)         |
| |NLB or GWLB |<-|Service |   |                      |                                |
| +------------+  +--------+   |                      |                                |
+------------------------------+                      +--------------------------------+

An endpoint service DNS name looks like this:

vpce-svc-01234567890abcdef.us-east-1.vpce.amazonaws.com   # Resolves to NLB internal IP Address.
my-service.my-domain.com   # To associate private dns name, must verify domain ownership
  • The endpoint service name is not just a holder for an IP address. It is a "key name" that AWS understands and maps to the unique service, i.e. (Service-Provider-VPC, IP-addr). The private DNS service name is also used as an alias key name to locate that service.

At the other end, the VPC endpoint is associated with an ENI with unicast routing enabled to this destination service IP.

  • PrivateLink only exposes a single TCP port from the service provider. This is better for security, but limits the ability to share the service provider's other ports and protocols.
  • A bit expensive, since it must use a network load balancer. But you can use that load balancer for other purposes as well. Scalability is built in thanks to the network load balancer.
  • Endpoints are associated with a security group. In particular, you need to allow inbound connections.

VPC Gateway Endpoint vs VPC Interface Endpoint

  • Interface Endpoint is a set of ENIs (think network card one per AZ) within your VPC.

  • It uses DNS record to direct your traffic to the private IP address of the nearest interface.

  • It requires application changes to point at the endpoint-specific service URL, for example:

    https://my-bucket-name.vpce-sd98fs0d9f8g.s3.us-west-2.vpce.amazonaws.com
    
  • Gateway Endpoint keeps the public DNS names and uses a prefix-list route in your route table to direct traffic meant for S3 or DynamoDB to the gateway endpoint (similar in spirit to a 0.0.0.0/0 -> igw route).

  • Both endpoints are associated with permission policies.

  • An interface endpoint requires a security group, but a gateway endpoint does not.

VPC Gateway Endpoint

  • Only S3 and DynamoDB are supported for Gateway Endpoint. For other services, use a VPC interface endpoint.
  • Belongs to VPC.
  • Think of this VPC Gateway Endpoint as S3 Gateway and DynamoDB Gateway (similar to IGW) at VPC level.
  • Older technology than the VPC interface endpoint, but still supported.
  • No application changes required. Traffic to the public S3 DNS names is automatically routed through the VPC gateway endpoint.

The gateway endpoints are created at VPC level and available from all subnets provided their route tables are pointing to them.

|
|                                                       Routing           Destination
|                                                       pl-68a54001       vpce-12345678
|
|                 RouteTable (PrefixList)
|      Subnet1  --------------------------->
|                                              VPC-Gateway-Endpoint (No IP Address. Similar to IGW) 
|      Subnet2  --------------------------->
|                 RouteTable
|
|   E.g.  prefix list id: pl-68a54001 (com.amazonaws.us-west-2.s3) and endpointID vpce-12345678
|
  • Uses an implicit Private IP (we need not configure). No public IP needed.

  • Endpoint must be associated with one or more route tables.

  • Endpoint Policy controls access to resources within the service. (e.g. For S3 gateway endpoint, you can control what s3 buckets can be accessed).

  • One Gateway Endpoint per Service. i.e. One S3 Gateway Endpoint and Another DynamoDB endpoint.

  • Can access service in same region only.

  • Note: No additional charges for using VPC gateway endpoint. Only bandwidth data transfer cost.

  • When you create a gateway endpoint (for example for S3), you select the VPC route tables for the subnets that you enable. The following route is automatically added to each route table that you select:

    Destination        Target
    prefix_list_id     Gateway-endpoint-id
    
    E.g.  prefix list id: pl-68a54001 (com.amazonaws.us-west-2.s3) and endpointID vpce-12345678
    
  • One subnet can be associated with multiple vpc endpoints (one per service).

  • One VPC gateway endpoint (for a service) can be associated with multiple subnets.

  • Gateway Endpoint is free, interface endpoint costs money.

  • With a Gateway Endpoint, client applications don't need to modify their S3 access code and no private DNS is required. The https://s3.us-west-2.amazonaws.com/ endpoint still resolves to the public S3 addresses, and the special route table entry involving the prefix list sends that traffic through the gateway endpoint inside the VPC.
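
To make the prefix-list route concrete, here is a toy model of how a subnet route table could pick a target by most-specific match. The route targets reuse the example IDs from these notes, but the CIDR inside the prefix list is a documentation-range placeholder, not a real S3 range, and the lookup is a simplification of what the VPC router actually does.

```python
import ipaddress

# Hypothetical contents of the S3 prefix list (placeholder CIDR).
PREFIX_LISTS = {"pl-68a54001": ["198.51.100.0/24"]}

# Route table of a private subnet: destination -> target.
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "nat-gateway"),
    ("pl-68a54001", "vpce-12345678"),  # added when the gateway endpoint is created
]


def expand(routes):
    """Yield (network, target), expanding prefix-list IDs into their CIDRs."""
    for destination, target in routes:
        for cidr in PREFIX_LISTS.get(destination, [destination]):
            yield ipaddress.ip_network(cidr), target


def pick_target(ip, routes=ROUTES):
    """Return the target of the most specific route containing ip, or None."""
    address = ipaddress.ip_address(ip)
    matches = [(net.prefixlen, target) for net, target in expand(routes) if address in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


print(pick_target("198.51.100.7"))   # address inside the prefix list
print(pick_target("10.0.1.5"))       # address inside the VPC
print(pick_target("93.184.216.34"))  # any other public address
```

Because the prefix-list entry is more specific than 0.0.0.0/0, S3-bound traffic goes to the endpoint while everything else keeps its usual path.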

VPC Interface Endpoint

  • One VPC Interface Endpoint for one AWS Service (like API Gateway, S3, NLB, etc). Possible for almost all AWS Services.

Usually VPC endpoint means VPC interface Endpoint at the client side to access some service (provided by AWS or by another VPC).

+------------------------------------------------+
|                                                |
|   VPC  {VPC-Interface-Endpoint}                |
|   (client) (policy)                            |
|                                                |
|   vpc-endpoint-service-URL For App             |    PrivateLink        AWS-Service 
|                                                | ------------------>                   (VPC Endpoint Service)
|   subnet1-AZ1                                  |    Technology        Partner Service
|   subnet2-AZ1 [ENI-For-Endpoint-One-per-AZ]    |                      (NLB IP, Partner VPC)
|                       (SG)                     |                      Service-dns-name
|   subnet3-AZ2                                  |                      (private hosted zone)
|   subnet4-AZ2 [ENI-For-Endpoint-One-per-AZ]    |
|               [Additional per AZ optional]     |
+------------------------------------------------+


PrivateLink + (VPC Peering | Transit Gateway Peering)  ===> Access Service Across Regions.

PrivateLink Only          ===> Same Region Access Only. Overlapping IP not a problem.
  • Service domain names look like:

    ec2.us-east-1.amazonaws.com
    s3.us-east-1.amazonaws.com
    
  • VPC Interface Endpoints, by default, have a DNS name like below. This needs application changes to point to this service (e.g. S3 service):

    vpce-xxx.s3.us-east-1.vpce.amazonaws.com  (S3 interface Regional Endpoint)
    vpce-xxx.s3.us-east-1d.vpce.amazonaws.com (AZ2) (S3 interface Zonal Endpoint)
    
  • To make the default service DNS names resolve to the private IPs of a VPC endpoint:

    • Ensure that DNS hostnames and DNS resolution are enabled for the VPC
    • Enable private DNS names on the endpoint
    • Applications then do not need to change, since the default service URL s3.us-east-1.amazonaws.com automatically resolves to the interface endpoint.

Note: The following compares the default and interface regional endpoints.

               Default S3 Bucket Endpoint   |  S3 Interface Bucket Endpoint
-----------------------------------------------------------------------------------------------
Bucket         s3.us-east-1.amazonaws.com   |  bucket.vpce-xxx.s3.us-east-1.vpce.amazonaws.com
AccessPoint    s3-accesspoint.us-east-1.am* |  accesspoint.vpce-xxx.*
Control        s3-control.us-east-1.am*     |  control.vpce-xxx.*
-----------------------------------------------------------------------------------------------

From outside VPC the interface bucket endpoint will resolve to public regional endpoint.
From inside VPC, it will resolve to the nearest interface endpoint IP address.

Note: The control endpoint is used for account-level S3 control operations (e.g. managing access points or batch jobs), not for object uploads.

aws s3 ls s3://my-bucket/ --region us-east-1 --endpoint-url https://bucket.vpce-xxx.s3.*
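
The bucket row of the table above can be turned into a small helper that builds the interface endpoint URL and the matching CLI arguments. This is a sketch: `vpce-xxx` stays a placeholder for the real endpoint DNS ID, and the function names are made up for illustration. Only the `--region` and `--endpoint-url` options shown in the command above are used.

```python
def s3_default_url(region):
    """Default regional S3 endpoint URL."""
    return f"https://s3.{region}.amazonaws.com"


def s3_interface_bucket_url(vpce_dns_id, region):
    """Bucket endpoint URL of an S3 interface endpoint (see the table above)."""
    return f"https://bucket.{vpce_dns_id}.s3.{region}.vpce.amazonaws.com"


def s3_ls_command(bucket, vpce_dns_id, region):
    """Argument list for listing a bucket through the interface endpoint."""
    return [
        "aws", "s3", "ls", f"s3://{bucket}/",
        "--region", region,
        "--endpoint-url", s3_interface_bucket_url(vpce_dns_id, region),
    ]


print(s3_default_url("us-east-1"))
print(" ".join(s3_ls_command("my-bucket", "vpce-xxx", "us-east-1")))
```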

If you enable private DNS names, you don't need to verify ownership, because the names are entirely
internal to the consumer VPC. (Remember that for a private endpoint service name, you need to verify ownership.)
  • When you create it, you can associate multiple subnets with it, but at most one subnet per AZ. It creates an ENI with a private IP address in every associated subnet. Instances in a subnet prefer the endpoint ENI closest to their AZ.
  • Usually, one Interface Endpoint per VPC per service is enough. You may want to associate it with multiple AZ for High Availability (and minimize inter-zone data transfer). (You may create multiple interface endpoints for same service for very high bandwidth requirements).
  • The connectivity to AWS services (e.g. API Gateway) is powered by AWS PrivateLink technology.
  • You can also use this to connect to AWS partner services.
  • Network ACL for subnet may restrict the traffic, must be configured properly.
  • It is billed.
  • Interface endpoint policy restricts what can be accessed through this and who can access it. (e.g. For S3 interface endpoint, you can control what s3 buckets can be accessed and by whom).

Centralized VPC Endpoint architecture

  • Too many VPC endpoints could become messy and too difficult to manage.
  • Useful for easier administration and cost saving.
  • Use "Transit Gateway" to share your VPC interface endpoints across VPCs and across accounts.
  • There is one shared VPC dedicated to serving all interface endpoints to all services. This has specific unique CIDR range. Use TGW to route this CIDR range to this VPC.

VPC VPN Connections (Revisited)

  • VPC VPN connections provide secure IPSec connections from on-premise computers/services to AWS.
  • A VPN connection can be software based (using the OpenVPN protocol by installing VPN server software on an EC2 instance) or hardware based (needs a hardware device on premises). Note that software VPN has nothing to do with VPC VPN.
  • VPN CloudHub: To connect multiple remote networks (e.g. multiple branch offices), we may have multiple VPN hardware devices (one per branch) connected using site-to-site VPN.
  • For a VPN connection between a VPC and customer site(s) using site-to-site hardware VPN you need:
    • VGW - Virtual Private Gateway on the AWS side, attached to the VPC.
    • CGW - Customer Gateway at the customer site, which is a hardware VPN device (or software).

VPC Sharing (aka subnet sharing)

By default, "Shared VPC" means a central VPC whose subnets are shared with many other VPCs. You can have NLB, NATGW, EC2, RDS etc. in such a subnet, and they can easily be shared with other VPCs.

.
.
.         VPC1           Central VPC             VPC2
.        subnet1+        central-Subnet          subnet2+
.                          NLB
.                          NAT-GW, VGW
.                          RDS, EC2
.                          
.                          
  • Go to AWS console, select VPC service, then select subnet.
  • Share the subnet with another AWS Account (under same AWS Organization)
  • After a subnet is shared, the participants can view, create, modify, and delete their application resources in the subnets shared with them.
  • VGW - Virtual Private Gateway is also called (hardware) VPN Gateway.

Cost of VPCs

  • Following things are free:
    • VPCs
    • security groups
    • route tables
    • Internet gateways
    • virtual private gateways (VGW - AWS side of VPN Concentrator)
    • network ACLs
    • ENIs
  • VPN connections have an hourly charge.
  • Elastic IP addresses have a cost if they are not associated with an instance or if an instance has more than one Elastic IP address, or if the instance is remapped more than 100 times in a month.
  • Inbound traffic from the Internet is free, and outbound traffic has a data transfer fee.
  • There is a data transfer fee between Availability Zones.

VPC Flow Logs

Format:

version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
2 12345789010 eni-xxxx a.b.c.d w.x.y.z 2020 22 6 20     4249 xxxxx xxxxx ACCEPT OK
2 12345789010 eni-xxxx a.b.c.d w.x.y.z 2020 22 6 20     4249 xxxxx xxxxx REJECT OK
  • VPC Flow Logs are not sent to CloudWatch by default. You can publish them (optionally filtered to accepted or rejected traffic) and then use CloudWatch Logs Insights to analyze them.
  • Flow logs can be sent to S3, CloudWatch Logs and Firehose.
  • Monitor REJECT actions and check whether they are due to a misconfigured NACL or SG, or to malicious connection attempts.
  • NACL rules are simple and stateless. SG rules are stateful, i.e. return traffic for an allowed connection is automatically allowed.
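
A small sketch of parsing records in the default (version 2) flow log format and counting REJECT actions per destination port, as suggested above. The sample records use made-up addresses and timestamps; field values are kept as strings because some records carry "-" placeholders.

```python
# Field order of the default flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log_line(line):
    """Split one space-separated record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_by_dstport(lines):
    """Count REJECT records per destination port."""
    counts = {}
    for line in lines:
        record = parse_flow_log_line(line)
        if record["action"] == "REJECT":
            counts[record["dstport"]] = counts.get(record["dstport"], 0) + 1
    return counts


# Hypothetical sample records.
SAMPLE = [
    "2 123456789010 eni-0abc 203.0.113.5 10.0.1.10 2020 22 6 20 4249 1600000000 1600000060 ACCEPT OK",
    "2 123456789010 eni-0abc 203.0.113.9 10.0.1.10 2021 22 6 20 4249 1600000000 1600000060 REJECT OK",
    "2 123456789010 eni-0abc 203.0.113.9 10.0.1.10 2022 3389 6 1 40 1600000000 1600000060 REJECT OK",
]
print(rejected_by_dstport(SAMPLE))
```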

VPC Internals

  • The AWS VPC implementation is more like a software-defined network.
  • Subnets are defined within a single AZ.
  • Everything in same VPC has Layer 2 reachability (DataLink layer)
  • One VPC is usually only one CIDR block. But it could have additional CIDR blocks too! Internally, VPC has unique ID, the CIDR block is not that significant -- Any subnet must just be unique: (VPC-Id, subnet-CIDR) is unique.
  • The Address Resolution Protocol (ARP) is not used in VPC. All traffic in a VPC is unicast.
  • Putting an EC2 instance's network interface in promiscuous mode to sniff the network does not reveal any packets other than the ones meant for that instance.
  • Subnet Route tables act like a source-based policy-based routing (PBR) rule. In other words, you can choose which direction packets should go based on the subnet the instance is in.
  • One NAT gateway per AZ is a good practice. In theory, you can use one NAT gateway for all subnets in VPC, but inter AZ traffic is charged. Unless you
  • A NAT gateway is $0.045 per hour. If more than 4.5 GB of data per hour goes cross-AZ to reach a NAT gateway in a different AZ (at $0.01 per GB of cross-AZ transfer, 4.5 GB costs $0.045), then giving the AZ its own NAT gateway is cheaper.
  • Overlapping CIDR block is a problem for both VPC peering and transit gateways. PrivateLink is an option to expose your service where overlapping IP address is not an issue.
  • Data transfer within an Availability Zone is free.
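
The NAT gateway break-even above can be worked through in a few lines. The hourly price is the one quoted in these notes; the $0.01 per GB cross-AZ rate is an assumption implied by the 4.5 GB figure, and the NAT gateway's own per-GB processing charge is ignored because it applies either way.

```python
NAT_HOURLY_USD = 0.045       # price quoted in the notes above
CROSS_AZ_USD_PER_GB = 0.01   # assumed rate, implied by the 4.5 GB break-even


def break_even_gb_per_hour(nat_hourly=NAT_HOURLY_USD, cross_az_rate=CROSS_AZ_USD_PER_GB):
    """Hourly cross-AZ volume at which a second NAT gateway costs the same."""
    return nat_hourly / cross_az_rate


def cheaper_option(gb_per_hour, nat_hourly=NAT_HOURLY_USD, cross_az_rate=CROSS_AZ_USD_PER_GB):
    """Compare the hourly cross-AZ transfer cost with the cost of one more NAT gateway."""
    cross_az_cost = gb_per_hour * cross_az_rate
    return "own NAT gateway" if cross_az_cost > nat_hourly else "shared NAT gateway"


print(round(break_even_gb_per_hour(), 2))
print(cheaper_option(1))
print(cheaper_option(10))
```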

VPC Routing and Switching

  • All traffic is unicast.
  • The Address Resolution Protocol (ARP) is handled locally by the hypervisor, which answers as a proxy.
  • The VPC forwarding design knows where all IP addresses are.
  • VPC is connected to outside world through Internet gateway, VPC-to-VPC peering, VPN gateways, Transit Gateways, or Direct Connect (leased lines) gateways.
  • VPC Peering is not transitive, so use Transit Gateways if you have many VPCs.
  • There’s a maximum transmission unit (MTU) of 1,500 when traffic leaves the VPC (between VPCs, regions, Direct Connect, or the Internet). Jumbo frames are supported within a VPC.
  • There are placement groups that allow for full 10-gigabit Ethernet bandwidth connectivity between instances that have enhanced networking enabled.
  • Use private IP addresses when communicating within a VPC for best performance.