OK, so Docker for Mac’s kind doesn’t do anything Spegel-y; there’s a registry mirror, but that just pulls one-way from the host’s cache. Building and loading on one Node won’t make it available to the other.
Let’s get back to setting image pulls as normal, and try to detect the case where we load the wrong arch.
OK, so going forward I think we do the following:
- create NixBuilds in enbi’s namespace, reinstate the owner ref (it’s no longer cross-namespace then).
- don’t watch objects in other namespaces.
- annotation controller can automatically create the NixBuild in enbi; maybe add an annotation to the Deployment to track the latest one (or an Event or something).
- we’ll want to generate random names for them much like replicaset pods etc.
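For the random names, GenerateName does what ReplicaSets do for Pods. A minimal sketch from inside Reconcile, assuming r is the controller-runtime client and deploy is the annotated Deployment; the API group and label key are placeholders:

    // Let the API server append a random suffix (e.g. nixbuild-x7k2p).
    // enbiv1alpha1 and the label key are placeholders for whatever we settle on.
    build := &enbiv1alpha1.NixBuild{
        ObjectMeta: metav1.ObjectMeta{
            GenerateName: "nixbuild-",
            Namespace:    "enbi",
            Labels:       map[string]string{"enbi.example/deployment": deploy.Name},
        },
        // Spec filled in from the Deployment's annotations.
    }
    if err := r.Create(ctx, build); err != nil {
        return ctrl.Result{}, err
    }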
XXX! Not setting the Job owner means we don’t automatically reconcile the NixBuild when the Job completes.
Few things to work out here. Nonetheless, the build and load was a success :)
Cross-namespace owner references are disallowed, so we can’t put the build job
in enbi if we want it to belong to its NixBuild.
We’ll just create them in the target ns for now.
… OK but the same applies to the PVC. (I was happy to just give up and let a thousand ConfigMaps bloom, but this is more serious.)
Do we make NixBuilds cluster-scoped? Do we tell the user they must create the
NixBuilds in the enbi ns? That’s kind of meh.
Or do we perhaps just not set the owner ref? We’ll need to do teardown of any dependents ourselves, but that’s OK, we can do that.
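What the teardown-ourselves route might look like with a finalizer, sketched inside Reconcile; the finalizer name, the TargetNamespace field, and the label we’d stamp on dependents are all made up:

    const nixBuildFinalizer = "enbi.example/cleanup" // placeholder name

    // If the NixBuild is being deleted, remove the dependents we couldn't attach
    // owner refs to (the Job and PVC in the target namespace), then drop the finalizer.
    if !build.DeletionTimestamp.IsZero() {
        if controllerutil.ContainsFinalizer(build, nixBuildFinalizer) {
            sel := client.MatchingLabels{"enbi.example/nixbuild": build.Name}
            ns := client.InNamespace(build.Spec.TargetNamespace)
            if err := r.DeleteAllOf(ctx, &batchv1.Job{}, ns, sel,
                client.PropagationPolicy(metav1.DeletePropagationBackground)); err != nil {
                return ctrl.Result{}, err
            }
            if err := r.DeleteAllOf(ctx, &corev1.PersistentVolumeClaim{}, ns, sel); err != nil {
                return ctrl.Result{}, err
            }
            controllerutil.RemoveFinalizer(build, nixBuildFinalizer)
            if err := r.Update(ctx, build); err != nil {
                return ctrl.Result{}, err
            }
        }
        return ctrl.Result{}, nil
    }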
OK, last two steps:
- Finish baseline tests.
What can we meaningfully test in an environment without nodes, PV assignment, etc.?
I think I want to work this out a bit later; maybe e2e tests on a real deployment will have to do — or we just check that objects are progressed through their intended states and we manipulate Nodes, PVs etc. around them as expected.
- [-] Deployable.
We have a Nix build ready to go, so it’s really just a matter of deploying. We’ll need to stick the CRDs in a Kustomization with a dependency on them etc.
- updating the status via HTTP is not necessarily straightforward: the controller (when just make run on seraphim) isn’t accessible to the build pod. it’s nice to be able to run locally; developing with that being an impossibility seems Bad. is there a hacky in-dev way we can support as well, or a better structure overall?
- a lot of the advice online boiled down to “stick results in S3”, which at least solves this issue (by virtue of the store being configurable).
unless i can think of a better way by the time we’ve made dinner, might as well give it a shot!
- ideas:
- ok, put it in a configurable S3-like store
- we could also have the controller embed an S3-like store (interface) so that in prod it can point to itself
- or, the pod itself could listen, and the controller connect to it!
pretty easy to run whatever on it given we have nix run available!
- actually that’s still kinda messed up in kind on mac; it’s not trivial to connect to a pod (have to kubectl port-forward or whatever.)
- looks like we’re doing it the boring way!
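For the record, the configurable-store idea boils down to an interface shaped roughly like this; everything here is hypothetical (a prod implementation would wrap an S3-compatible client, a dev one could live inside the controller):

    package store

    import (
        "context"
        "io"
    )

    // ResultStore is where a build pod drops its output and where the controller
    // later reads it from. Hypothetical; nothing with this name exists yet.
    type ResultStore interface {
        Put(ctx context.Context, key string, r io.Reader) error
        Get(ctx context.Context, key string) (io.ReadCloser, error)
    }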
- envtest doesn’t have Nodes, most likely won’t provision PVs, etc.
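One thing envtest does allow: it’s just kube-apiserver + etcd, so we can create Node objects by hand (nothing will ever run on them, but node-selection logic can still be exercised). Sketch, assuming the scaffolded suite’s k8sClient and Ginkgo/Gomega; the node name and labels are arbitrary:

    // A Node in envtest is just an API object we create ourselves.
    node := &corev1.Node{
        ObjectMeta: metav1.ObjectMeta{
            Name:   "fake-node-1",
            Labels: map[string]string{"kubernetes.io/arch": "arm64"},
        },
    }
    Expect(k8sClient.Create(ctx, node)).To(Succeed())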
That works!
Last steps for fully usable MVP:
- The build/load distinction is a bit false since we’re streaming into load anyway. Collapse.
- Update status with results when done, including image size etc.
- Ensure the resulting image is tagged the way we want; we let the NixBuild creator specify the tag, but we’re not applying that ourselves anywhere, and are just using whatever is in the image.
- [-] Tests.
- Make it deployable, test on cassax.
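For the “update status with results” item above, the mechanics are just a Status().Update once the Job reports back; the fields here (Phase, ImageRef, ImageSizeBytes) are invented placeholders for whatever we end up recording:

    // After the build/load Job succeeds: record results on the NixBuild.
    build.Status.Phase = "Succeeded"
    build.Status.ImageRef = taggedRef       // the tag we applied at import time
    build.Status.ImageSizeBytes = imageSize // as reported by the build/load pod
    if err := r.Status().Update(ctx, build); err != nil {
        return ctrl.Result{}, err
    }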
After that we try moving to “annotate a Deployment” and get rid of the CRD.
OK but how do we find the ctr?
On kind: /usr/local/bin/ctr on the node.
On k3s: /nix/store/rnyp7q13cnxw9a7v0ckrxij0vc0bbd7c-k3s-1.33.1+k3s1/bin/ctr or whatever.
Let’s just pull one from Nixpkgs. k3s and containerd both have it.
Whichever’s smaller.
containerd.
Now we need to set --address correctly, which depends a little on the node.
We can just mount the whole host /run in somewhere nice, then check.
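The “check” can be as dumb as stat-ing the candidate paths under wherever we mount the host’s /run (the /host/run mount point below is our choice, not a convention); whether this runs in the loader image or in the controller deciding what to pass as --address, the idea is the same:

    import (
        "fmt"
        "os"
    )

    // findContainerdSocket probes for the containerd socket under the host's /run,
    // hostPath-mounted at /host/run in this sketch.
    func findContainerdSocket() (string, error) {
        candidates := []string{
            "/host/run/containerd/containerd.sock",     // kind / plain containerd
            "/host/run/k3s/containerd/containerd.sock", // k3s
        }
        for _, p := range candidates {
            if _, err := os.Stat(p); err == nil {
                return p, nil
            }
        }
        return "", fmt.Errorf("no containerd socket found under /host/run")
    }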
deploy-image doesn’t do anything for us.
crictl can’t load images, so we’ll use the unsupported ctr command.
ctr --namespace k8s.io images import RESULT (or -).
We still can’t do this in the controller, since, as before, the controller may not be running on the node in question. It’s another Job, albeit one that’ll need access to the containerd socket at a minimum.
On kind: /run/containerd/containerd.sock.
On k3s: /run/k3s/containerd/containerd.sock.
How are these defaults getting found?
On k3s, ctr is the k3s binary, so it’s presumably compiled with a good
default. The path isn’t directly locatable in the binary; I’m guessing it’s put
together by string formatting.
On kind, its path is visible in the binary.
We can probably assume that running the node’s ctr will locate its socket
automatically. Can we run in the host namespace?
Privileged container which mounts a hostPath volume.
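Roughly what the loader Job’s pod spec could look like, pinned to the node that needs the image; the image, the targetNode/resultPVC variables, and the /host/run mount point are all assumptions, not settled:

    // Privileged loader pod: host /run mounted in, ctr talks to whichever
    // containerd socket it finds there, result tarball comes from a PVC.
    privileged := true
    hostPathDir := corev1.HostPathDirectory
    podSpec := corev1.PodSpec{
        NodeName:      targetNode, // the node we're loading onto
        RestartPolicy: corev1.RestartPolicyNever,
        Containers: []corev1.Container{{
            Name:  "load",
            Image: loaderImage, // placeholder: anything with containerd's ctr in it
            Command: []string{"sh", "-c",
                `ctr --address "$(find /host/run -name containerd.sock | head -n1)" ` +
                    `--namespace k8s.io images import /result/image.tar`},
            SecurityContext: &corev1.SecurityContext{Privileged: &privileged},
            VolumeMounts: []corev1.VolumeMount{
                {Name: "host-run", MountPath: "/host/run"},
                {Name: "result", MountPath: "/result", ReadOnly: true},
            },
        }},
        Volumes: []corev1.Volume{
            {Name: "host-run", VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{Path: "/run", Type: &hostPathDir},
            }},
            {Name: "result", VolumeSource: corev1.VolumeSource{
                PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{ClaimName: resultPVC},
            }},
        },
    }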
Does the entire load take place over the socket?
We might be able to use local-overlay-store to have a writable store in our
PVC which doesn’t interfere/replace the one that comes with the workload image.
Alternatively, a chroot store would be faster to get going.
Yeaaaah :):):)
THOUGHTS:
If we want to get rid of the CRD: can use annotations on deployments/pods/whatever to signal how to build the target image/from where.
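The annotation shape might be something like this; the keys and the helper are invented, nothing is settled (assumes appsv1 from k8s.io/api/apps/v1):

    // Hypothetical annotation keys on a Deployment, read by the annotation
    // controller to decide whether and what to build.
    const (
        annotFlake = "enbi.example/flake"     // e.g. "github:someone/app#image"
        annotTag   = "enbi.example/image-tag" // optional tag to apply at import time
    )

    func buildRequest(deploy *appsv1.Deployment) (flakeRef, tag string, ok bool) {
        a := deploy.GetAnnotations()
        flakeRef, ok = a[annotFlake]
        tag = a[annotTag]
        return flakeRef, tag, ok
    }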
We don’t need to do a separate clone phase in its own PV; we can just nix build TARGETURL#blah and put the cache (and/or store) in the PV instead!
Given this, do we still care to support using the system daemon? I think we
can and should where possible — otherwise we’ll double up some fairly large
stores. We should check if there’ll be a problem using the Nix binary provided
by the nixos/nix image with a daemon from a different distribution/version.
/root/.cache/nix/ contains good stuff (e.g. ./gitv3/blah has git fetches).
NEXT STEPS:
Look into deploy-image, start reading its code to see if it does something like we want wrt. deploying the actual workload container for clone/update/build:
See also the sample project that uses it:
https://github.com/kubernetes-sigs/kubebuilder/blob/master/testdata/project-v4-with-plugins/PROJECT
Sure whatever:
https://book.kubebuilder.io/plugins/available/deploy-image-plugin-v1-alpha
Look into ENVTEST.
Open questions:
- Do we want to build on-node? This lets us reuse the system Nix daemon, store, cache, etc., but also requires the node to have Nix installed. (Hard to test on seraphim, since kind’s nodes won’t have Nix.)
The controller will, eventually, be running in its own pod. It’ll need to access the Nix daemon somehow. If we wanted to do the build in a pod, it’d need to run a Nix daemon in there? Maybe?
The thing is, we’re only running one controller per cluster, or at most per control plane node (?); e.g. if enbi is running on cass only, then we can’t do any of the build steps in the controller itself anyway; we’ll want to be building on kala too!
So we do need to run in a pod; either we pass a socket to the docker daemon
in, or run the build inside the container entirely. A socket would be nicer;
it may be enough to just mount /nix (RO) and run with NIX_REMOTE=daemon.
Forward NIX_PATH, and add the (resolved) path of the local nix binary to the
container to find it in the store.
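Sketched as a container spec, under the assumption that a read-only hostPath mount of /nix is enough for the client to reach the daemon socket under /nix/var; nixBinPath, hostNixPath, and flakeRef are placeholders the controller would resolve:

    // On-node build container: reuse the host's store and daemon rather than
    // shipping Nix in the image. The image only needs a shell; nix comes from /nix.
    hostNix := corev1.Volume{
        Name: "nix",
        VolumeSource: corev1.VolumeSource{
            HostPath: &corev1.HostPathVolumeSource{Path: "/nix"},
        },
    }
    buildCtr := corev1.Container{
        Name:  "build",
        Image: "busybox", // placeholder base; the real nix binary is nixBinPath from the mounted store
        Env: []corev1.EnvVar{
            {Name: "NIX_REMOTE", Value: "daemon"},  // talk to the host's nix-daemon socket under /nix/var
            {Name: "NIX_PATH", Value: hostNixPath}, // forwarded from the host
        },
        Command:      []string{nixBinPath, "build", flakeRef},
        VolumeMounts: []corev1.VolumeMount{{Name: "nix", MountPath: "/nix", ReadOnly: true}},
    }
    // hostNix goes into the PodSpec's Volumes, buildCtr into its Containers.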
This still won’t work in our development environment on kind. Is supporting
both foolish? Hope not. We can use nixos/nix to do the build without Nix
on-node.
Let’s start with the dev one; first we need to acquire the source.
First: where/how do we do the clone? Need to spawn a pod which does the clone; then we need to store it somewhere. Controller probably needs its own PV, to keep clones in long-term.
Ideally we can have multiple builds on the same host, targeting the same clone, and one just waits while the other does the clone.
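One way to get the “one waits” behaviour from the controller side: label clone Jobs by repo and requeue while one is in flight. Sketch only; the label key, repoHash helper, and enbiNamespace are made up:

    // Is a clone Job for this repo already running? If so, don't start another;
    // requeue and check again later.
    var jobs batchv1.JobList
    if err := r.List(ctx, &jobs,
        client.InNamespace(enbiNamespace),
        client.MatchingLabels{"enbi.example/clone-of": repoHash(repoURL)},
    ); err != nil {
        return ctrl.Result{}, err
    }
    for _, j := range jobs.Items {
        if j.Status.Succeeded == 0 && j.Status.Failed == 0 {
            return ctrl.Result{RequeueAfter: 15 * time.Second}, nil // clone in flight; wait
        }
    }
    // ...otherwise create the clone Job (or skip straight to the build if the
    // clone already exists on the PV).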
enbi
// TODO(user): Add simple overview of use/purpose
Description
// TODO(user): An in-depth paragraph about your project and overview of use
Getting Started
Prerequisites
- go version v1.23.0+.
- docker version 17.03+.
- kubectl version v1.11.3+.
- Access to a Kubernetes v1.11.3+ cluster.
To Deploy on the cluster
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/enbi:tag
NOTE: This image must be published to the registry you specified, and the cluster you deploy to must have access to pull it. Make sure you have the proper permissions for the registry if the above commands don’t work.
Install the CRDs into the cluster:
make install
Deploy the Manager to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/enbi:tag
NOTE: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
Create instances of your solution. You can apply the samples (examples) from config/samples:
kubectl apply -k config/samples/
NOTE: Ensure that the samples have default values to test them out.
To Uninstall
Delete the instances (CRs) from the cluster:
kubectl delete -k config/samples/
Delete the APIs (CRDs) from the cluster:
make uninstall
UnDeploy the controller from the cluster:
make undeploy
Project Distribution
The following are the options for releasing and providing this solution to users.
By providing a bundle with all YAML files
- Build the installer for the image built and published in the registry:
make build-installer IMG=<some-registry>/enbi:tag
NOTE: The makefile target mentioned above generates an ‘install.yaml’ file in the dist directory. This file contains all the resources built with Kustomize, which are necessary to install this project without its dependencies.
- Using the installer
Users can just run kubectl apply -f with the published install.yaml to install the project, i.e.:
kubectl apply -f https://raw.githubusercontent.com/<org>/enbi/<tag or branch>/dist/install.yaml
By providing a Helm Chart
- Build the chart using the optional helm plugin
kubebuilder edit --plugins=helm/v1-alpha
- A chart is generated under ‘dist/chart’, and users can obtain this solution from there.
NOTE: If you change the project, you need to update the Helm Chart using the same command above to sync the latest changes. Furthermore, if you create webhooks, you need to use the above command with the ‘--force’ flag and manually re-apply any custom configuration previously added to ‘dist/chart/values.yaml’ or ‘dist/chart/manager/manager.yaml’ afterwards.
Contributing
// TODO(user): Add detailed information on how you would like others to contribute to this project
NOTE: Run make help for more information on all potential make targets
More information can be found via the Kubebuilder Documentation
License
Copyright 2025.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.