Telepresence Release Notes

Version 2.4.7 (November 24, 2021)

Feature: Injector service-name annotation

The agent injector now supports a new annotation, telepresence.getambassador.io/inject-service-name, that can be used to set the name of the service to be intercepted. This helps disambiguate which service to intercept when a workload is exposed by multiple services, as can happen with Argo Rollouts.

Feature: Skip the Ingress Dialogue

You can now skip the ingress dialogue by setting the ingress parameters in the corresponding flags.

Feature: Never proxy subnets

The kubeconfig extensions now support a never-proxy argument, analogous to also-proxy, that defines a set of subnets that will never be proxied via telepresence.
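As a rough illustration of how never-proxy relates to the proxied subnets, the following Python sketch decides whether an address would be routed to the cluster. The function name should_proxy and the rule that a never-proxy subnet takes precedence over every proxied subnet are assumptions made for this example, not a description of the Telepresence implementation.

```python
import ipaddress


def should_proxy(ip, proxied_subnets, never_proxy_subnets):
    """Return True if `ip` would be routed to the cluster.

    Assumption for this sketch: never-proxy always wins over any
    proxied subnet (cluster subnets or also-proxy entries).
    """
    addr = ipaddress.ip_address(ip)
    # An address inside any never-proxy subnet stays on the local network.
    if any(addr in ipaddress.ip_network(n) for n in never_proxy_subnets):
        return False
    # Otherwise it is proxied only if some proxied subnet contains it.
    return any(addr in ipaddress.ip_network(n) for n in proxied_subnets)
```

For example, with 10.0.0.0/8 proxied and 10.1.2.0/24 listed as never-proxy, 10.1.2.3 stays local while 10.2.0.1 is still sent to the cluster.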

Change: Daemon versions check

Telepresence now checks the versions of the client and the daemons and asks the user to quit and restart if they don't match.

Change: No explicit DNS flushes

Telepresence DNS now uses a very short TTL instead of explicitly flushing DNS by killing the mDNSResponder or running resolvectl flush-caches.

Bug Fix: Legacy flags now work with global flags

Legacy flags such as `--swap-deployment` can now be used together with global flags.

Bug Fix: Outbound connection closing

Outbound connections are now properly closed when the peer closes.

Bug Fix: Prevent DNS recursion

The DNS-resolver will trap recursive resolution attempts (which may happen when the cluster runs in a Docker container on the client).

Bug Fix: Prevent network recursion

The TUN-device will trap failed connection attempts that result in recursive calls back into the TUN-device (which may happen when the cluster runs in a Docker container on the client).

Bug Fix: Traffic Manager deadlock fix

The Traffic Manager no longer runs a risk of entering a deadlock when a new Traffic agent arrives.

Bug Fix: webhookRegistry config propagation

The configured webhookRegistry is now propagated to the webhook installer even if no webhookAgentImage has been set.

Bug Fix: Login refreshes expired tokens

When a user's token has expired, telepresence login will prompt the user to log in again to get a new token. Previously, the user had to telepresence quit and telepresence logout to get a new token.

Version 2.4.5 (October 15, 2021)

Feature: Get pod yaml with gather-logs command

Adding the flag --get-pod-yaml to your request will get the pod yaml manifest for all Kubernetes components you are getting logs for (traffic-manager and/or pods containing a traffic-agent container). This flag is set to false by default.

Feature: Anonymize pod name + namespace when using gather-logs command

Adding the flag --anonymize to your command will anonymize your pod names + namespaces in the output file. We replace the sensitive names with simple names (e.g. pod-1, namespace-2) to maintain relationships between the objects without exposing the real names of your objects. This flag is set to false by default.
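The idea of replacing real names with simple, stable aliases can be sketched in Python as follows. The Anonymizer class and its alias method are hypothetical names for this example; the actual gather-logs implementation may work differently. The point is that the same real name always maps to the same alias, so relationships between objects survive.

```python
class Anonymizer:
    """Map real object names to stable aliases such as pod-1, namespace-2."""

    def __init__(self):
        self._aliases = {}   # (kind, real name) -> alias
        self._counters = {}  # kind -> number of aliases handed out so far

    def alias(self, kind, real_name):
        key = (kind, real_name)
        if key not in self._aliases:
            # First time this name is seen: hand out the next number for its kind.
            number = self._counters.get(kind, 0) + 1
            self._counters[kind] = number
            self._aliases[key] = f"{kind}-{number}"
        return self._aliases[key]
```

Calling alias("pod", "checkout-7d9f") twice returns the same alias both times, while a different pod name receives the next number.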

Feature: Added context and defaults to ingress questions when creating a preview URL

Previously, we referred to OSI model layers when asking these questions, but this terminology is not commonly used. The questions now provide a clearer context for the user, along with a default answer as an example.

Feature: Support for intercepting headless services

Intercepting headless services is now officially supported. You can request a headless service on whatever port it exposes and get a response from the intercept. This uses the same approach as intercepting numeric ports with the mutating webhook injector; the main requirement is that the initContainer has NET_ADMIN capabilities.

Change: Use one tunnel per connection instead of multiplexing into one tunnel

We have changed Telepresence so that it uses one tunnel per connection instead of multiplexing all connections into one tunnel. This will provide substantial performance improvements. Clients will still be backwards compatible with older managers that only support multiplexing.

Bug Fix: Added checks for Telepresence kubernetes compatibility

Telepresence currently works with Kubernetes server versions 1.17.0 and higher. We have added logs in the connector and traffic-manager to let users know when they are using Telepresence with a cluster it doesn't support.

Bug Fix: Traffic Agent security context is now only added when necessary

When creating an intercept, Telepresence will now only set the traffic agent's GID when strictly necessary (i.e. when using headless services or numeric ports). This mitigates an issue on OpenShift clusters where the traffic agent can fail to be created due to OpenShift's security policies banning arbitrary GIDs.

Version 2.4.4 (September 27, 2021)

Feature: Numeric ports in agent injector

The agent injector now supports injecting Traffic Agents into pods that have unnamed ports.

Feature: New subcommand to gather logs and export into zip file

Telepresence has logs for various components (the traffic-manager, traffic-agents, the root and user daemons), which are integral for understanding and debugging Telepresence behavior. We have added the telepresence gather-logs command to make it simple to compile logs for all Telepresence components and export them in a zip file that can be shared with others and/or included in a GitHub issue. For more information on usage, run telepresence gather-logs --help.

Feature: Pod CIDR strategy is configurable in Helm chart

Telepresence now enables you to directly configure how to get pod CIDRs when deploying Telepresence with the Helm chart. The default behavior remains the same. We've also introduced the ability to explicitly set what the pod CIDRs should be.

Bug Fix: Compute pod CIDRs more efficiently

When computing subnets using the pod CIDRs, the traffic-manager now uses fewer CPU cycles.

Bug Fix: Prevent busy loop in traffic-manager

In some circumstances, the traffic-manager's CPU would max out and get pinned at its limit. This required a shutdown or pod restart to fix. We've added some fixes to prevent the traffic-manager from getting into this state.

Bug Fix: Added a fixed buffer size to TUN-device

The TUN-device now has a max buffer size of 64K. This prevents the buffer from growing limitlessly until it receives a PSH, which could be a blocking operation when receiving lots of TCP-packets.

Bug Fix: Fix hanging user daemon

When Telepresence encountered an issue connecting to the cluster or the root daemon, it could hang indefinitely. It now returns an error when it encounters that situation.

Bug Fix: Improved proprietary agent connectivity

To determine whether the environment cluster is air-gapped, the proprietary agent attempts to connect to the cloud during startup. To deal with a possible initial failure, the agent backs off and retries the connection with an increasing backoff duration.
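A retry loop with an increasing backoff can be sketched in Python as below. The function name connect_with_backoff, the doubling factor, the delay cap, and the attempt limit are all assumptions chosen for illustration; the release note does not state the values or the growth strategy the proprietary agent uses.

```python
import time


def connect_with_backoff(connect, max_attempts=5, initial_delay=1.0,
                         factor=2.0, max_delay=30.0, sleep=time.sleep):
    """Call `connect` until it succeeds, waiting longer after each failure.

    All numeric defaults are illustrative assumptions, not Telepresence values.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                # Out of attempts: let the caller decide what the failure means.
                raise
            sleep(delay)
            # Grow the delay for the next retry, but never beyond max_delay.
            delay = min(delay * factor, max_delay)
```

The sleep parameter is injectable so that the loop can be exercised without real waiting. A caller could, for instance, treat the final ConnectionError as a sign that the cluster is air-gapped.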

Bug Fix: Telepresence correctly reports intercept port conflict

When creating a second intercept targeting the same local port, Telepresence now gives the user an informative error message. Additionally, it tells them which intercept is currently using that port to make it easier to remedy.

Version 2.4.3 (September 15, 2021)

Feature: Environment variable TELEPRESENCE_INTERCEPT_ID available in interceptor's environment

When you perform an intercept, we now include a TELEPRESENCE_INTERCEPT_ID environment variable in the environment.

Bug Fix: Improved daemon stability

Fixed a timing bug that sometimes caused a "daemon did not start" failure.

Bug Fix: Complete logs for Windows

Crash stack traces and other errors were not being written to log files. This has been fixed so logs for Windows should be at parity with the ones on macOS and Linux.

Bug Fix: Log rotation fix for Linux kernel 4.11+

On Linux kernel 4.11 and above, the log file rotation now properly reads the birth-time of the log file. Older kernels continue to use the old behavior of using the change-time in place of the birth-time.

Bug Fix: Improved error messaging

When Telepresence encounters an error, it tells the user where they should look for logs related to the error. We have refined this so that it only tells users to look for errors in the daemon logs for issues that are logged there.

Bug Fix: Stop resolving localhost

When using the overriding DNS resolver, it will no longer apply search paths when resolving localhost, since that should be resolved on the user's machine instead of the cluster.

Bug Fix: Variable cluster domain

Previously, the cluster domain was hardcoded to cluster.local. While this is true for many kubernetes clusters, it is not for all of them. Now this value is retrieved from the traffic-manager.

Bug Fix: Improved cleanup of traffic-agents

Telepresence now uninstalls traffic-agents installed via mutating webhook when using telepresence uninstall --everything.

Bug Fix: More large file transfer fixes

Downloading large files during an intercept will no longer cause timeouts and hanging traffic-agents.

Bug Fix: Setting --mount to false when intercepting works as expected

When using --mount=false while performing an intercept, the file system was still mounted. This has been remedied so the intercept behavior respects the flag.

Bug Fix: Traffic-manager establishes outbound connections in parallel

Previously, the traffic-manager established outbound connections sequentially. As a result, slow (and failing) Dial calls would block all outbound traffic from the workstation (for up to 30 seconds). We now establish these connections in parallel so that this no longer occurs.
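The difference between sequential and parallel dialing can be illustrated with a small Python sketch. The traffic-manager itself is not written in Python, and the helper name dial_all and the dial callback are hypothetical; the sketch only shows that when each dial runs on its own worker, a slow or failing address does not delay the others.

```python
from concurrent.futures import ThreadPoolExecutor


def dial_all(addresses, dial, max_workers=8):
    """Dial every address concurrently.

    Returns a dict mapping each address to either the value returned by
    `dial` or the OSError it raised.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all dials up front so none waits for another to finish.
        futures = {addr: pool.submit(dial, addr) for addr in addresses}
        results = {}
        for addr, future in futures.items():
            try:
                results[addr] = future.result()
            except OSError as exc:
                # A failed dial is recorded; it does not abort the others.
                results[addr] = exc
        return results
```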

Bug Fix: Status command reports correct DNS settings

Telepresence status now correctly reports DNS settings for all operating systems, instead of Local IP:nil, Remote IP:nil when they don't exist.

Version 2.4.1 (August 30, 2021)

Feature: External cloud variables are now configurable

We now support configuring the host and port for the cloud in your config.yml. These are used when logging in to utilize features provided by an extension, and are also passed along as environment variables when installing the `traffic-manager`. Additionally, we now run our test suite with these variables set to localhost to continue to ensure Telepresence is fully functional without depending on an external service. The SYSTEMA_HOST and SYSTEMA_PORT environment variables are no longer used.

Feature: Helm chart can now regenerate certificate used for mutating webhook on-demand.

You can now set agentInjector.certificate.regenerate when deploying Telepresence with the Helm chart to automatically regenerate the certificate used by the agent injector webhook.

Change: Traffic Manager installed via helm

The traffic-manager is now installed via an embedded version of the Helm chart when telepresence connect is first performed on a cluster. This change is transparent to the user. A new configuration flag, timeouts.helm, sets the timeouts for all Helm operations performed by the Telepresence binary.

Change: traffic-manager gets cluster ID itself instead of via environment variable

The traffic-manager used to get the cluster ID as an environment variable when running telepresence connect or via adding the value in the Helm chart. This was clunky, so the traffic-manager now gets the value itself as long as it has permissions to "get" and "list" namespaces (this has been updated in the Helm chart).

Bug Fix: Telepresence now mounts all directories from /var/run/secrets

In the past, we only mounted secret directories in /var/run/secrets/kubernetes.io. We now mount *all* directories in /var/run/secrets, which, for example, includes directories like eks.amazonaws.com used for IRSA tokens.

Bug Fix: Max gRPC receive size correctly propagates to all grpc servers

This fixes a bug where the max gRPC receive size was only propagated to some of the grpc servers, causing failures when the message size was over the default.

Bug Fix: Updated our Homebrew packaging to run manually

We made some updates to our script that packages Telepresence for Homebrew so that it can be run manually. This will enable maintainers of Telepresence to run the script manually should we ever need to roll back a release and have latest point to an older version.

Bug Fix: Telepresence uses namespace from kubeconfig context on each call

In the past, Telepresence would use whatever namespace was specified in the kubeconfig's current-context for the entirety of the time a user was connected to Telepresence. This would lead to confusing behavior when a user changed the context in their kubeconfig and expected Telepresence to acknowledge that change. Telepresence now will do that and use the namespace designated by the context on each call.

Bug Fix: Idle outbound TCP connections timeout increased to 7200 seconds

Some users were noticing that their intercepts would start failing after 60 seconds. This was because the idle timeout for outbound TCP connections was set to 60 seconds. We have now raised it to 7200 seconds to match Linux's tcp_keepalive_time default.

Bug Fix: Telepresence will automatically remove a socket upon ungraceful termination

When a Telepresence process terminated ungracefully, it would inform users that "this usually means that the process has terminated ungracefully" and imply that they should remove the socket. Telepresence now automatically attempts to remove the socket upon ungraceful termination.

Bug Fix: Fixed user daemon deadlock

Remedied a situation where the user daemon could hang when a user was logged in.

Bug Fix: Fixed agentImage config setting

The config setting images.agentImage is no longer required to contain the repository, and it will use the value at images.repository.

Version 2.4.0 (August 04, 2021)

Feature: Windows Client Developer Preview

There is now a native Windows client for Telepresence that is being released as a Developer Preview. All the same features supported by the macOS and Linux clients are available on Windows.

Feature: CLI raises helpful messages from Ambassador Cloud

Telepresence can now receive messages from Ambassador Cloud and raise them to the user when they perform certain commands. This enables us to send you messages that may enhance your Telepresence experience when using certain commands. Frequency of messages can be configured in your config.yml.

Bug Fix: Improved stability of systemd-resolved-based DNS

When initializing the systemd-resolved-based DNS, the routing domain is set to improve stability in non-standard configurations. This also enables the overriding resolver to do a proper take over once the DNS service ends.

Bug Fix: Fixed an edge case when intercepting a container with multiple ports

When specifying a port of a container to intercept, if there was a container in the pod without ports, it was automatically selected. This has been fixed so we'll only choose the container with "no ports" if there's no container that explicitly matches the port used in your intercept.

Bug Fix: $(NAME) references in agent's environments are now interpolated correctly.

If you had an environment variable $(NAME) in your workload that referenced another, intercepts would not correctly interpolate $(NAME). This has been fixed and works automatically.
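The kind of interpolation involved can be sketched in Python as follows. This is a simplified model that follows the Kubernetes convention of a $(NAME) reference resolving only against variables defined earlier in the list, with unresolved references left untouched. The function names are hypothetical, and escaping with $$ is deliberately not handled.

```python
import re

# Matches $(NAME) where NAME is a plausible environment variable name.
_REF = re.compile(r"\$\(([-._a-zA-Z][-._a-zA-Z0-9]*)\)")


def expand(value, known):
    """Replace each $(NAME) in `value` with known[NAME], if NAME is known."""
    return _REF.sub(lambda m: known.get(m.group(1), m.group(0)), value)


def interpolate(env_list):
    """Resolve an ordered list of (name, value) pairs into a dict."""
    resolved = {}
    for name, value in env_list:
        # Only variables defined earlier in the list are visible here.
        resolved[name] = expand(value, resolved)
    return resolved
```

Given HOST=db followed by URL=http://$(HOST):5432, the resolved URL is http://db:5432, while a reference to an undefined variable stays as written.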

Bug Fix: Telepresence no longer prints INFO message when there is no config.yml

Fixed a regression that printed an INFO message to the terminal when there wasn't a config.yml present. The config is optional, so this message has been removed.

Bug Fix: Telepresence no longer panics when using --http-match

Fixed a bug where Telepresence would panic if the value passed to --http-match didn't contain an equal sign. The correct syntax is in the --help string and looks like --http-match=HTTP2_HEADER=REGEX.
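Validating such a value before using it can be sketched in Python as below. The function name parse_http_match and the error message are hypothetical and do not reflect how the CLI is implemented; the sketch only shows a HEADER=REGEX value being rejected with a clear error when the equal sign is missing.

```python
import re


def parse_http_match(value):
    """Split a HEADER=REGEX value into (header name, compiled pattern).

    Raises ValueError if the equal sign or the header name is missing.
    """
    # partition splits on the first "=", so the regex itself may contain "=".
    header, separator, pattern = value.partition("=")
    if not separator or not header:
        raise ValueError(f"expected HEADER=REGEX, got {value!r}")
    return header, re.compile(pattern)
```

For example, parse_http_match("x-dev=alice.*") yields the header name x-dev and a pattern that matches alice-1, while parse_http_match("x-dev") raises ValueError.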

Bug Fix: Improved subnet updates

The `traffic-manager` used to update subnets whenever the `Nodes` or `Pods` changed, even if the underlying subnet hadn't changed, which created a lot of unnecessary traffic between the client and the `traffic-manager`. This has been fixed so we only send updates when the subnets themselves actually change.

Version 2.3.7 (July 23, 2021)

Feature: Also-proxy in telepresence status

An also-proxy entry in the Kubernetes cluster config will show up in the output of the telepresence status command.

Feature: Non-interactive telepresence login

telepresence login now has an --apikey=KEY flag that allows for non-interactive logins. This is useful for headless environments where launching a web-browser is impossible, such as cloud shells, Docker containers, or CI.

Bug Fix: Mutating webhook injector correctly hides named ports for probes.

The mutating webhook injector has been fixed to correctly rename named ports for liveness and readiness probes.

Bug Fix: telepresence current-cluster-id crash fixed

Fixed a regression introduced in 2.3.5 that caused `telepresence current-cluster-id` to crash.

Bug Fix: Better UX around intercepts with no local process running

Requests would hang indefinitely when initiating an intercept before you had a local process running. This has been fixed and will result in an Empty reply from server until you start a local process.

Bug Fix: API keys no longer show as "no description"

New API keys generated internally for communication with Ambassador Cloud no longer show up as "no description" in the Ambassador Cloud web UI. Existing API keys generated by older versions of Telepresence will still show up this way.

Bug Fix: Fix corruption of user-info.json

Fixed a race condition where logging in and logging out rapidly could cause memory corruption or corruption of the user-info.json cache file used when authenticating with Ambassador Cloud.

Bug Fix: Improved DNS resolver for systemd-resolved

Telepresence's systemd-resolved-based DNS resolver is now more stable. If it fails to initialize and Telepresence falls back to the overriding resolver, that resolver will no longer cause general DNS lookup failures.

Bug Fix: Faster telepresence list command

The performance of telepresence list has been increased significantly by reducing the number of calls the command makes to the cluster.

Version 2.3.3 (July 07, 2021)

Feature: Traffic Manager Helm Chart

Telepresence now supports installing the Traffic Manager via Helm. This will make it easy for operators to install and configure the server-side components of Telepresence separately from the CLI (which in turn allows for better separation of permissions).

Feature: Traffic-manager in custom namespace

As the traffic-manager can now be installed in any namespace via Helm, Telepresence can now be configured to look for the Traffic Manager in a namespace other than ambassador. This can be configured on a per-cluster basis.

Feature: Intercept --to-pod

telepresence intercept now supports a --to-pod flag that can be used to port-forward sidecars' ports from an intercepted pod.

Change: Change in migration from edgectl

Telepresence no longer automatically shuts down the old api_version=1 edgectl daemon. If migrating from such an old version of edgectl you must now manually shut down the edgectl daemon before running Telepresence. This was already the case when migrating from the newer api_version=2 edgectl.

Bug Fix: Fixed error during shutdown

The root daemon no longer terminates when the user daemon disconnects from its gRPC streams, and instead waits to be terminated by the CLI. The previous behavior could cause problems with things not being cleaned up correctly.

Bug Fix: Intercepts will survive deletion of intercepted pod

An intercept will survive deletion of the intercepted pod provided that another pod is created (or already exists) that can take over.

Version 2.3.2 (June 18, 2021)

Feature: Service Port Annotation

The mutator webhook for injecting traffic-agents now recognizes a telepresence.getambassador.io/inject-service-port annotation to specify which port to intercept, bringing the functionality of the --port flag to users who use the mutator webhook in order to control Telepresence via GitOps.

Feature: Outbound Connections

Outbound connections are now routed through the intercepted Pods which means that the connections originate from that Pod from the cluster's perspective. This allows service meshes to correctly identify the traffic.

Change: Inbound Connections

Inbound connections from an intercepted agent are now tunneled to the manager over the existing gRPC connection, instead of establishing a new connection to the manager for each inbound connection. This avoids interference from certain service mesh configurations.

Change: Traffic Manager needs new RBAC permissions

The Traffic Manager requires RBAC permissions to list Nodes, Pods, and to create a dummy Service in the manager's namespace.

Change: Reduced developer RBAC requirements

The on-laptop client no longer requires RBAC permissions to list the Nodes in the cluster or to create Services, as that functionality has been moved to the Traffic Manager.

Bug Fix: Able to detect subnets

Telepresence will now detect the Pod CIDR ranges even if they are not listed in the Nodes.

Bug Fix: Dynamic IP ranges

The list of cluster subnets that the virtual network interface will route is now configured dynamically and will follow changes in the cluster.

Bug Fix: No duplicate subnets

Subnets fully covered by other subnets are now pruned internally and thus never superfluously added to the laptop's routing table.
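Pruning covered subnets can be sketched with Python's standard ipaddress module. The function name prune_covered is hypothetical and the approach is only one straightforward way to do it, not necessarily the one Telepresence uses: subnets are visited from widest to narrowest, and a subnet is kept only if no already-kept subnet contains it.

```python
import ipaddress


def prune_covered(cidrs):
    """Drop duplicate subnets and subnets fully contained in another one."""
    networks = sorted(
        {ipaddress.ip_network(c) for c in cidrs},
        # Widest prefixes first, so a covering subnet is seen before its children.
        key=lambda n: (n.version, n.prefixlen, int(n.network_address)),
    )
    kept = []
    for net in networks:
        # subnet_of requires matching IP versions, hence the version check.
        covered = any(
            net.version == k.version and net.subnet_of(k) for k in kept
        )
        if not covered:
            kept.append(net)
    return [str(n) for n in kept]
```

With the input 10.0.1.0/24, 10.0.0.0/16, and 192.168.0.0/24, only 10.0.0.0/16 and 192.168.0.0/24 remain, because 10.0.1.0/24 lies entirely inside 10.0.0.0/16.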

Change: Change in default timeout

The trafficManagerAPI timeout default has changed from 5 seconds to 15 seconds, in order to accommodate the extended time it takes for the traffic-manager to do its initial discovery of cluster info as a result of the above bugfixes.

Bug Fix: Removal of DNS config files on macOS

On macOS, files generated under /etc/resolver/ as the result of using include-suffixes in the cluster config are now properly removed on quit.

Bug Fix: Large file transfers

Telepresence no longer erroneously terminates connections early when sending a large HTTP response from an intercepted service.

Bug Fix: Race condition in shutdown

When shutting down the user-daemon or root-daemon on the laptop, telepresence quit and related commands no longer return early before everything is fully shut down. By the time the command returns, all of the side-effects on the laptop have been cleaned up.

Version 2.3.1 (June 14, 2021)

Feature: DNS Resolver Configuration

Telepresence now supports per-cluster configuration for custom DNS behavior, which enables users to determine which local and remote resolvers to use and which suffixes should be ignored or included.

Feature: AlsoProxy Configuration

Telepresence now supports also proxying user-specified subnets so that users can access external services that are only reachable from the cluster while connected to Telepresence. These can be configured on a per-cluster basis, and each subnet is added to the TUN device so that requests are routed to the cluster for IPs that fall within that subnet.

Feature: Mutating Webhook for Injecting Traffic Agents

The Traffic Manager now contains a mutating webhook to automatically add an agent to pods that have the telepresence.getambassador.io/traffic-agent: enabled annotation. This enables Telepresence to work well with GitOps CD platforms that rely on higher level Kubernetes objects matching what is stored in git. For workloads without the annotation, Telepresence will add the agent the way it has in the past.

Change: Traffic Manager Connect Timeout

The trafficManagerConnect timeout default has changed from 20 seconds to 60 seconds, in order to accommodate the extended time it takes to apply everything needed for the mutator webhook.

Bug Fix: Fix for large file transfers

Fixed a TUN-device bug where large transfers from services on the cluster would sometimes hang indefinitely.

Change: Brew Formula Changed

Now that the Telepresence rewrite is the main version of Telepresence, you can install it via Brew like so: brew install datawire/blackbird/telepresence.

Version 2.3.0 (June 01, 2021)

Feature: Brew install Telepresence

Telepresence can now be installed via brew on macOS, which makes it easier for users to stay up-to-date with the latest telepresence version. To install via brew, you can use the following command: brew install datawire/blackbird/telepresence2.

Feature: TCP and UDP routing via Virtual Network Interface

Telepresence will now perform routing of outbound TCP and UDP traffic via a Virtual Network Interface (VIF). The VIF is a layer 3 TUN-device that exists while Telepresence is connected. It makes the subnets in the cluster available to the workstation and will also route DNS requests to the cluster and forward them to intercepted pods. This means that pods with custom DNS configuration will work as expected. Prior versions of Telepresence would use firewall rules and were only capable of routing TCP.

Change: SSH is no longer used

All traffic between the client and the cluster is now tunneled via the traffic manager gRPC API. This means that Telepresence no longer uses ssh tunnels and that the manager no longer has an sshd installed. Volume mounts are still established using sshfs but it is now configured to communicate using the sftp-protocol directly, which means that the traffic agent also runs without sshd. A desired side effect of this is that the manager and agent containers no longer need a special user configuration.

Feature: Running in a Docker container

Telepresence can now be run inside a Docker container. This can be useful for avoiding side effects on a workstation's network, establishing multiple sessions with the traffic manager, or working with different clusters simultaneously.

Feature: Configurable Log Levels

Telepresence now supports configuring the log level for Root Daemon and User Daemon logs. This provides control over the nature and volume of information that Telepresence generates in daemon.log and connector.log.

For a detailed list of all the changes in past releases, please consult the CHANGELOG.