Egress tunnel capture + agent identity #338
Nina Polshakova (npolshakova) wants to merge 3 commits
| egressCaptureHTTPPort = uint16(15001) | ||
| egressCaptureHTTPSPort = uint16(15002) |
Why do you need to capture HTTP and HTTPS separately? A single listener could handle both, since SO_ORIGINAL_DST is used anyway to look up the original port in deriveConnectAuthority.
Yeah, I think that's a good cleanup. Originally I was testing HTTP first and had one redirect per protocol.
9b63f65 to e678eeb
| const ( | ||
| egressCapturePort = uint16(15001) | ||
| egressOriginalHTTPPort = uint16(80) | ||
| egressOriginalHTTPSPort = uint16(443) |
Is there any specific reason we're only capturing these ports, and not just redirecting everything?
Right now ateom derives the CONNECT authority from the HTTP Host header for port 80 or from TLS SNI for port 443. For other ports, the current capture code just falls back to the raw original destination IP:port.
We could try to figure out the authority on other ports too, but it would be a little more complicated because we'd need some sort of classifier:
- try TLS SNI on any port
- try HTTP Host on any port
- fall back to recent per-actor DNS correlation
- if we still can't figure it out, use IP:port
| egressOriginalHTTPPort = uint16(80) | ||
| egressOriginalHTTPSPort = uint16(443) |
Same question about only supporting these 2 ports
| cleanupErr = errors.Join(cleanupErr, fmt.Errorf("while removing actor nftables rules: %w", err)) | ||
| slog.WarnContext(ctx, "Failed to remove actor nftables rules; continuing actor netns cleanup", slog.Any("err", err)) | ||
| } | ||
| if s.egressCapture != nil { |
Is there a time we would want no egress capture?
That's a good question. I think in general egress capture should always be enabled, but you can imagine scenarios where that doesn't make sense:
- No PEP egress is set up. In this case I think traffic should still be captured by the ateom proxy, but not forwarded to any PEP.
- Flexibility for workloads using traffic we do not capture yet (UDP/QUIC, non-80/443 TCP if we don't want to do the classifier, etc.).
Mike Morris (@mikemorris) This is kind of related to the discussion we had in Slack. As you brought up, if we're expecting all traffic to always egress through a PEP, then the PEP registration can be a first-class API. Are there cases, though, where we don't want the traffic to be captured or to egress through a PEP?
| return counter | ||
| } | ||
| const defaultEgressURL = "https://httpbin.org/get" |
Why was this added to the counter? It seems like we should potentially have a different demo.
It seemed more lightweight to just have a new endpoint under the counter demo for future testing, but we can make it a separate demo!
Should we name this package egress, or maybe tunnel?
| // Keep tunnel protocol support behind factories so additional transports | ||
| // such as HBONE can plug in without changing capture/listener logic. |
I know we discussed keeping the protocol generic, but do we need to start there? It makes the first merge more confusing to read, and I'm not sure it's actually necessary.
Yeah, I was hoping to show how it could be extended in the future, but let me clean it up.
Signed-off-by: npolshakova <nina.polshakova@solo.io>
0995370 to 2e910e3
POC for #126
Based on design discussed in https://docs.google.com/document/d/1KmpIFu2gnqy9gp95wASgIo_vkJ_dA1DZckV8upET6bs/edit?usp=sharing
Summary:
This is a proof-of-concept egress capture path for actors. It introduces a reusable internal/egresscapture package that derives the CONNECT authority from the HTTP Host header or TLS SNI, and opens a CONNECT-style tunnel to a configured PEP address.
The gVisor and microvm runtimes wire this into actor network setup by redirecting actor HTTP/HTTPS egress traffic to local capture ports. Agentgateway is used as the receiving proxy to prove that captured actor traffic reaches the tunnel endpoint.
Notes:
- x-ate-actor-id
- x-ate-actor-template
- x-ate-actor-template-namespace
- x-ate-original-destination
- x-ate-connect-authority