Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make GetActiveNetworkForNamespace use a controller #4662

Merged
merged 2 commits into from
Sep 20, 2024

Conversation

trozet
Copy link
Contributor

@trozet trozet commented Aug 29, 2024

Leverages the NAD Controller to also track primary networks. This should be faster than listing all NADs in a namespace and iterating through them to find the primary network for a namespace.

@trozet trozet requested a review from a team as a code owner August 29, 2024 01:54
@trozet trozet requested a review from npinaeva August 29, 2024 01:54
@github-actions github-actions bot added the area/unit-testing Issues related to adding/updating unit tests label Aug 29, 2024
@trozet trozet force-pushed the fix_active_network_tracking branch 2 times, most recently from 16c7a46 to 4765fad Compare August 29, 2024 03:14
@trozet
Copy link
Contributor Author

trozet commented Aug 29, 2024

I should probably add some unit tests for network attachment definition controller, but can at least run this through scale and see if its an improvement.

@tssurya @jcaamano @dceara fyi

@trozet trozet changed the title [WIP] Make GetActiveNetworkForNamespace use a controller Make GetActiveNetworkForNamespace use a controller Aug 29, 2024
@tssurya tssurya added area/scale&performance All issues related to scale & performance feature/user-defined-network-segmentation All PRs related to User defined network segmentation labels Aug 30, 2024
@github-actions github-actions bot added the feature/services&endpoints All issues related to the Servces/Endpoints API label Sep 3, 2024
@trozet trozet force-pushed the fix_active_network_tracking branch 4 times, most recently from aefe4f8 to 716d844 Compare September 5, 2024 02:21
Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just comments on the NAD controller first.

@qinqon
Copy link
Contributor

qinqon commented Sep 12, 2024

Hey @trozet maybe labeling the NAD when it's created by UserDefinedNetwork controller with role=primary alow for fast listing instead of iterating ?.

@trozet
Copy link
Contributor Author

trozet commented Sep 12, 2024

Hey @trozet maybe labeling the NAD when it's created by UserDefinedNetwork controller with role=primary alow for fast listing instead of iterating ?.

not sure what iterating you are referring to

@jcaamano
Copy link
Contributor

Full review, so no more comments.

@jcaamano
Copy link
Contributor

I have one extra comment, sorry.

I think it would be useful to have a method NetInfo.GetNads(namespace string) []string that guarantees it only returns a single NAD for primary networks, for the network controllers to use.

It would be directly usable there, because we already have a NetInfo, to be aware of what is the NAD serving in that namespace. Otherwise we would have to use GetActiveNetworkForNamespace just to get a NetInfo with the filtered NADs, which means we would need to be passing around NAD controller to more places, including the family of utility functions PodNadNames, GetPodIPsOfNetwork, ...

The alternative implementation for those methods would be to parse the pod annotation and find the matching NAD from NetInfo.GetNads() but would a pity if we already have that info in memory.

@jcaamano
Copy link
Contributor

jcaamano commented Sep 13, 2024

I have one extra comment, sorry.

I think it would be useful to have a method NetInfo.GetNads(namespace string) []string that guarantees it only returns a single NAD for primary networks, for the network controllers to use.

It would be directly usable there, because we already have a NetInfo, to be aware of what is the NAD serving in that namespace. Otherwise we would have to use GetActiveNetworkForNamespace just to get a NetInfo with the filtered NADs, which means we would need to be passing around NAD controller to more places, including the family of utility functions PodNadNames, GetPodIPsOfNetwork, ...

The alternative implementation for those methods would be to parse the pod annotation and find the matching NAD from NetInfo.GetNads() but would a pity if we already have that info in memory.

Wait, there is an alternative implementation which is to parse the namespace from the nad namespaced names from GetNads() and match the that to the pod namespace. I still think NetInfo.GetNads(namespace string) []string woudl be useful but not as important. However, all this depends on some consistency I asked for in #4662 (comment)

@trozet trozet force-pushed the fix_active_network_tracking branch 3 times, most recently from b38aeb3 to be9d053 Compare September 19, 2024 01:42
@trozet
Copy link
Contributor Author

trozet commented Sep 19, 2024

flake?


24-09-19T02:03:14.1070652Z �[91m�[1m[Fail] �[0m�[90mHybrid SDN Master Operations �[0m�[91m�[1m[It] handles a HO node is switched to a OVN node �[0m
2024-09-19T02:03:14.1072499Z �[37m/home/runner/work/ovn-kubernetes/ovn-kubernetes/go-controller/pkg/ovn/hybrid_test.go:1387�[0m

2024-09-19T01:58:18.6956889Z �[91m�[1m• Failure [2.579 seconds]�[0m
2024-09-19T01:58:18.6957661Z Hybrid SDN Master Operations
2024-09-19T01:58:18.6958864Z �[90m/home/runner/work/ovn-kubernetes/ovn-kubernetes/go-controller/pkg/ovn/hybrid_test.go:134�[0m
2024-09-19T01:58:18.6960288Z   �[91m�[1mhandles a HO node is switched to a OVN node [It]�[0m
2024-09-19T01:58:18.6961821Z   �[90m/home/runner/work/ovn-kubernetes/ovn-kubernetes/go-controller/pkg/ovn/hybrid_test.go:1218�[0m
2024-09-19T01:58:18.6962761Z 
2024-09-19T01:58:18.6963039Z   �[91mTimed out after 2.001s.
2024-09-19T01:58:18.6963554Z   Expected
2024-09-19T01:58:18.6964212Z       <[]*nbdb.LogicalRouterStaticRoute | len:2, cap:2>: [
2024-09-19T01:58:18.6964925Z           {
2024-09-19T01:58:18.6965906Z               UUID: "8a84a82f-eaf2-4a41-8ffe-3af1e693fef0",
2024-09-19T01:58:18.6966747Z               BFD: nil,
2024-09-19T01:58:18.6967295Z               ExternalIDs: {
2024-09-19T01:58:18.6968519Z                   "name": "hybrid-subnet-node1:node-windows",
2024-09-19T01:58:18.6969211Z               },
2024-09-19T01:58:18.6969788Z               IPPrefix: "10.1.3.0/24",
2024-09-19T01:58:18.6970484Z               Nexthop: "10.1.1.3",
2024-09-19T01:58:18.6971114Z               Options: nil,
2024-09-19T01:58:18.6971719Z               OutputPort: nil,
2024-09-19T01:58:18.6972307Z               Policy: nil,
2024-09-19T01:58:18.6972901Z               RouteTable: "",
2024-09-19T01:58:18.6973402Z           },
2024-09-19T01:58:18.6973786Z           {
2024-09-19T01:58:18.6974602Z               UUID: "4a94b693-d276-40b6-b645-403474ae8b4d",
2024-09-19T01:58:18.6975312Z               BFD: nil,
2024-09-19T01:58:18.6975871Z               ExternalIDs: {
2024-09-19T01:58:18.6976766Z                   "name": "hybrid-subnet-node1-gr",
2024-09-19T01:58:18.6977423Z               },
2024-09-19T01:58:18.6978155Z               IPPrefix: "10.1.3.0/24",
2024-09-19T01:58:18.6978874Z               Nexthop: "100.64.0.1",
2024-09-19T01:58:18.6979669Z               Options: nil,
2024-09-19T01:58:18.6980281Z               OutputPort: nil,
2024-09-19T01:58:18.6980947Z               Policy: nil,
2024-09-19T01:58:18.6981536Z               RouteTable: "",
2024-09-19T01:58:18.6982058Z           },
2024-09-19T01:58:18.6982435Z       ]
2024-09-19T01:58:18.6982916Z   to have length 0�[0m
2024-09-19T01:58:18.6983203Z 
2024-09-19T01:58:18.6984071Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/go-controller/pkg/ovn/hybrid_test.go:1387

2024-09-19T01:58:16.8105818Z E0919 01:58:16.724456   35874 obj_retry.go:671] Failed to update *v1.Node, old=node-windows, new=node-windows, error: [macAddress annotation not found for node node-windows; error: could not find "k8s.ovn.org/node-mgmt-port-mac-addresses" annotation, failed to set up hybrid overlay logical switch port for node-windows: cannot set up hybrid overlay ports, distributed router ip is nil, k8s.ovn.org/l3-gateway-config annotation not found for node "node-windows"]

https://github.com/ovn-org/ovn-kubernetes/actions/runs/10932941999/job/30350584438?pr=4662

Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments

@@ -72,6 +75,7 @@ func NewCNIServer(factory factory.NodeWatchFactory, kclient kubernetes.Interface
}

if util.IsNetworkSegmentationSupportEnabled() {
s.nadController = nadController
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still have a use for the nad lister below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think so. I'll see if I can remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

go-controller/pkg/factory/factory.go Show resolved Hide resolved
go-controller/pkg/libovsdb/ops/switch.go Show resolved Hide resolved
}

// NetAttachDefinitionController handles namespaced scoped NAD events and
// manages cluster scoped networks defined in those NADs. NADs are mostly
// referred from pods to give them access to the network. Different NADs can
// define the same network as long as those definitions are actually equal.
// Unexpected situations are handled on best effort basis but improper NAD
// adminstration can lead to undefined behavior in referred from running pods.
// administration can lead to undefined behavior in referred from running pods.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/in/if

👼

Comment on lines 345 to 356
for _, nadName := range networkNADs {
nadNs, _, err := cache.SplitMetaNamespaceKey(nadName)
if err != nil {
klog.Errorf("Cannot parse NAD reference: %q on network info: %q", nadName, netName)
continue
}
if nadNs == namespace {
filteredNADs = append(filteredNADs, nadName)
}
}
n := util.CopyNetInfo(network)
n.SetNADs(filteredNADs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we already had the NADs per namespace in NetInfo, which is an information we already come by naturally in sync method, we wouldn't have to do this search every time GetActiveNetworkForNamespace is called

Copy link
Contributor

@jcaamano jcaamano Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering...should we disallow multiple NADs per namespace on a primary network?

So there could be multiple NADs per namespace for a primary network as long as only one of them is flagged as primary. That would be an interesting use case if, let's say, selected pods want to customize how they attach to the primary network, for example use sr-iov. The one flagged as primary would be used by default. The others would be used through the network selection element annotation.

We currently disallow the network selection element annotation for primary networks so there's actually no use for that. However we can't prevent the mere existence of multiple NADs per namespace for a primary network. So we either have a way to tell around which of the NADs is really primary or we don't make anyone aware of those NADs since they have no use for us.

Copy link
Contributor

@jcaamano jcaamano Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To disallow, we would just have to check against the ensured network instead of the nad network:

if ensureNetwork.IsPrimaryNetwork() && primaryNAD != "" && primaryNAD != key

Then in this method you would just have to do

n.SetNADs(primaryNAD)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we have a way to really tell which is the active one, since we have the primaryNADs map? Why don't i just use that?

Copy link
Contributor

@jcaamano jcaamano Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GetActiveNetworkForNamespace that is enough.

But we also have the network controllers which have a NetInfo potentially containing references for multiple NADs per namespace for a primary network.

If GetActiveNetworkForNamespace just returns the one, the NetInfo that network controllers are using should just contain the one for the namespace, for consistency I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make that change and then we can avoid having to loop. We can just return that primary NAD until someone complains ;)

klog.Warningf("Skipping nad '%s/%s' as active network after failing parsing it with %v", nad.Namespace, nad.Name, err)
continue
}
func NewUnknownActiveNetworkError(namespace string) *UnknownActiveNetworkError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not anymore after we guarantee the controller only renders a single primary network.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

func (a *PodAllocator) getActiveNetworkForPod(pod *corev1.Pod) (util.NetInfo, error) {
activeNetwork, err := util.GetActiveNetworkForNamespace(pod.Namespace, a.nadLister)
activeNetwork, err := a.nadController.GetActiveNetworkForNamespace(pod.Namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to flag some use cases where, if we had per-namespace NAD information in NetInfo, we probably wouldn't need access to the nad controller to query this information. This is one of them.

We would lose the protection of actually querying UDNs, true.

Just for awareness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this was intentional for some reason:

func (a *PodAllocator) GetNetworkRole(pod *corev1.Pod) (string, error) {
	if !util.IsNetworkSegmentationSupportEnabled() {
		// if user defined network segmentation is not enabled
		// then we know pod's primary network is "default" and
		// pod's secondary network is NOT its primary network
		if a.netInfo.IsDefault() {
			return types.NetworkRolePrimary, nil
		}
		return types.NetworkRoleSecondary, nil
	}
	activeNetwork, err := a.getActiveNetworkForPod(pod)
	if err != nil {
		return "", err
	}
	if activeNetwork.GetNetworkName() == a.netInfo.GetNetworkName() {
		return types.NetworkRolePrimary, nil
	}
	if a.netInfo.IsDefault() {
		// if default network was not the primary network,
		// then when UDN is turned on, default network is the
		// infrastructure-locked network forthis pod
		return types.NetworkRoleInfrastructure, nil
	}
	return types.NetworkRoleSecondary, nil
}

It is comparing the network name of the network found for the pod and the netinfo held in the allocator. @qinqon maybe you wrote the original code and can explain why it is like this? We are wondering why you cant just use the NetInfo on the allocator itself to know whether or not this is a primary or secondary role?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trozet activeNetwork is the pod's namespace primary network so this checks if the pod allocator network is the pod's namespace primary network (default or UDN) and terturn Primary according to that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

However this can be equally achieved if we could query the controller NetInfo for the namespaces served by the network and checking if the one we are concerned about is among them.

Anyway, I will follow up with this on a different PR maybe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense too, thanks @jcaamano

return util.GetActiveNetworkForNamespace(namespace, nadLister)
// and is a wrapper around GetActiveNetworkForNamespace
func (bnc *BaseNetworkController) getActiveNetworkForNamespace(namespace string) (util.NetInfo, error) {
return bnc.nadController.GetActiveNetworkForNamespace(namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to flag some use cases where, if we had per-namespace NAD information in NetInfo, we probably wouldn't need access to the nad controller to query this information. This is one of them.

We would lose the protection of actually querying UDNs, true.

Just for awareness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a result of me changing this function to be a receiver for bnc rather than the common network controller (that doesn't have netInfo). However, it looks like this function is used to determine pod wiring for the default controller, so I think makes sense to have the UDN protection here.

trozet and others added 2 commits September 19, 2024 15:09
Leverages the NAD Controller to also track primary networks. This should
be faster than listing all NADs in a namespace and iterating through
them to find the primary network for a namespace.

Refactors network manager and nad controller. Changes the network cache to
only be stored in network manager, rather than having a duplicate cache in
NAD controller.

NAD Controller is modified to only hold references to network names for
NADs, and then queries network manager to get the networks. NAD Controller
and Network Manager are now thread safe.

Tests are also modified:
 - NAD Controller now implemented as an interface so that unit tests can
   implement a fake NAD controller for getting active network
 - Fully functioning OVN controller unit tests still leverage a fully
   functioning NAD controller. NCM may be faked out in these cases.
 - Fake entities moved to testing/nad package

Co-authored-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
Previously the function would check the informer cache which provides
(in practical terms) real time access to if there is a primary network
for this namespace as a NAD. This did not cover the case where a UDN
could be generating the NAD, but the NAD has not been created yet. This
commit addresses it by checking for a UDN (only when a NAD is not found
for the namespace).

This should help in UDN cases (the case we really care about) to ensure
that the function returns an accurate result. Especially during times of
delay where perhaps a pod handler and UDN handler are firing almost
simultaneously.

Right now the function will return an error indicating there is a
primary network that hasn't been processed yet, letting the caller know
to retry. We have seen in the past with time sensitive handlers that
they may give up after 5 or so retries before the dependent controller
has finished. In those cases, we usually add a wait in the handler
itself and have around a 200-300ms timeout while we wait for the
dependent handler (in this case the NAD handler) to finish. This commit
may be expanded on in the future to accomodate that where needed.

Signed-off-by: Tim Rozet <trozet@redhat.com>
@trozet
Copy link
Contributor Author

trozet commented Sep 19, 2024

rebased

@qinqon
Copy link
Contributor

qinqon commented Sep 20, 2024

Hey @trozet maybe labeling the NAD when it's created by UserDefinedNetwork controller with role=primary alow for fast listing instead of iterating ?.

not sure what iterating you are referring to

In case this PR is about improving performance when looking for the namespace's primary UDN, we can label de NAD to improve listing performance.

@jcaamano jcaamano merged commit 1bc6214 into ovn-org:master Sep 20, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/scale&performance All issues related to scale & performance area/unit-testing Issues related to adding/updating unit tests feature/services&endpoints All issues related to the Servces/Endpoints API feature/user-defined-network-segmentation All PRs related to User defined network segmentation
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants