WEBVTT

00:00.000 --> 00:11.720
Okay, we're going to start. Unfortunately we have to kick people out, it's not our choice,

00:11.720 --> 00:18.160
but the next one is by this guy, I don't know him personally, maybe you know him, and it's

00:18.160 --> 00:22.240
about native OCI container support in systemd.

00:22.240 --> 00:29.880
Okay, hi, I'm Lennart Poettering, I work at a startup called Amutable, and yeah, I want

00:29.880 --> 00:35.280
to talk about native OCI support in systemd. As you might know, systemd is the

00:35.280 --> 00:41.680
service manager that is kind of what most distributions use these days. OCI, well, I'm

00:41.680 --> 00:49.800
in the containers devroom, I don't think I have to explain what that is, but yeah, to elaborate

00:49.800 --> 00:55.720
a little bit though, it's three different things. First of all, there is the OCI image

00:55.720 --> 01:01.880
format, which is officially specified in the OCI image specification. There's

01:01.880 --> 01:06.840
the runtime format, which is defined in the OCI runtime specification, and then there's

01:06.840 --> 01:13.400
the invocation interface, right? So if people just say OCI, they usually mean, depending

01:13.400 --> 01:17.560
on the context, one of these things, but there are these three things, and particularly

01:17.560 --> 01:23.720
the last one is kind of annoying because there's no specification of it, but yeah, it also

01:23.720 --> 01:30.040
exists. Something I wanted to say is that OCI support already exists in systemd, and not many people know

01:30.040 --> 01:38.720
this, but basically there has been for a while systemd-nspawn --oci-bundle=, and you can specify

01:38.720 --> 01:44.280
an OCI bundle. An OCI bundle is usually not the thing that most people who deal with OCI and

01:44.280 --> 01:49.200
Docker-style containers come in contact with, because that is the runtime spec thing, but it

01:49.280 --> 01:53.840
has been there for years actually, and you can make it work, but we never advertised this,

01:53.840 --> 02:00.240
and yeah, so nobody really knows it. It only covers the second of these specs. Yeah,

02:00.240 --> 02:07.920
not well known, not much used, but yeah, it's also not too useful, because it won't help you

02:07.920 --> 02:13.040
with actually acquiring the image that you can then run this way. So in the chain

02:13.040 --> 02:18.000
of things needed to get a container to run, it's like the middle thing, but we left

02:18.080 --> 02:28.800
the first thing open. Yeah, so it already exists, not too useful, but yeah, it's probably not

02:28.800 --> 02:32.560
even the right place to expose it, because OCI containers are mostly understood as being the

02:32.560 --> 02:40.160
single-service thing, right, like where you run one daemon inside of an OCI container, and that's

02:40.160 --> 02:45.520
it. systemd-nspawn is this container tool which is mostly focused on running

02:45.680 --> 02:50.960
entire systems, right, like not just one service, but many, like the kind where you SSH

02:50.960 --> 02:56.880
in, the kind that systemd runs in. I mean, this is not strictly that way, this is more

02:56.880 --> 03:01.360
by convention, because that's the intended use case, you could also do the single service thing,

03:01.360 --> 03:06.080
but yeah, so it's philosophically the wrong place, it's not technically the wrong place.

03:06.960 --> 03:17.040
So, why even do this? Well, I'm not a fan of the OCI format, like I think it's like the way

03:17.040 --> 03:23.200
how this is all put together, this is just tarballs, and the reproducibility and

03:23.200 --> 03:30.400
the cryptographic semantics are very, let's say, uninspired, like even when it was created,

03:30.480 --> 03:36.560
it was like, I don't know. I live in this world where we care about attestation,

03:36.560 --> 03:43.440
where we have verified everything, where you have offline-secure images, where, yeah,

03:43.440 --> 03:48.960
dm-verity and these kinds of things. So, if I look at OCI, I don't really see much

03:48.960 --> 03:54.800
interesting, but then again, it's certainly widely used, right, like everybody who does something

03:54.800 --> 04:00.400
with IT these days comes into contact with OCI sooner or later. So, it might not be a

04:00.400 --> 04:06.800
great format, but it's certainly widely used. In a way, it's, you know, in systemd, we deal with services,

04:06.800 --> 04:11.600
with system services, and they are mostly written, like you have a unit file, and then you have

04:11.600 --> 04:15.440
some files on disk, and that's it. So OCI images are an alternative service format, if you will.

04:17.440 --> 04:24.080
systemd in many ways already does most of the hard parts, like the more complex parts

04:24.160 --> 04:28.560
that you need to run a container, because service management and container management are

04:29.280 --> 04:34.160
not that different, right? Like, you also end up creating namespaces all the time,

04:34.160 --> 04:38.320
you need to put everything in a cgroup, you do resource management for them.

04:39.040 --> 04:44.400
There is, like, the difference between service management, which is probably more a local

04:44.480 --> 04:54.880
thing, and container management, but it's a blurry distinction. Hence, yeah, actually,

04:54.880 --> 05:03.440
yeah, all the tough stuff has already been addressed anyway. By making OCI stuff natively,

05:03.440 --> 05:08.240
supported in systemd, we get better integration with all these things, right? Because

05:08.240 --> 05:14.160
you can just use systemd, and then you can deploy OCI stuff, and you can immediately use systemctl

05:14.160 --> 05:18.480
and these kinds of things to actually enumerate the containers, and they're just going to be

05:18.480 --> 05:25.520
services like everything else. So, I also see it as a stepping stone to do new stuff that

05:25.520 --> 05:30.080
does not exist in the OCI world, or at least is not common in the OCI world so far.

05:30.080 --> 05:34.480
Because in systemd, we are big on doing measurements of the stuff we do, like TPM stuff,

05:35.200 --> 05:42.560
getting event logs, and things like this. So, systemd generally defines the semantics, like

05:42.640 --> 05:47.600
into which PCRs things are being measured, and how we do the measurements in the first place,

05:47.600 --> 05:53.520
and if we have OCI containers as a native concept in systemd, all of that opens up

05:53.520 --> 06:02.160
completely naturally. We can measure the OCI services as they are started, and can have it in the TPM

06:02.160 --> 06:07.280
event log, and it's just there, right? Like, nobody has to think about this.

06:07.280 --> 06:10.880
There are a couple of other things, like, you know, the way how OCI containers currently

06:10.960 --> 06:16.320
do user namespacing. I'm not going to explain what user namespacing is. I hope some of

06:16.320 --> 06:21.360
you at least know it. Let's just say, I'm not a big fan of how this is currently done, because it

06:21.360 --> 06:27.760
involves setuid binaries and static allocations of UID ranges. I don't think it's a scalable way to do

06:27.760 --> 06:32.960
these things, and because they don't dynamically allocate these things,

06:33.680 --> 06:38.400
the individual containers are not as separated, not as isolated, as I think they should be.

06:39.040 --> 06:46.160
In systemd, we have come up with different concepts around this, like the foreign UID space,

06:46.160 --> 06:52.240
basically, where you can have the ownership of a container on disk be exclusively by these foreign

06:52.240 --> 06:57.680
UIDs, and then when you actually spawn the container, you get a transient runtime UID

06:57.680 --> 07:04.480
range assigned, and we map between those two, which in my view is much nicer, because it basically

07:04.560 --> 07:10.800
means that the UIDs, the transient ones, are never persisted to disk, and you have full isolation

07:12.240 --> 07:17.840
of the runtime objects, like the way how it should be, because after all, UID-based isolation

07:17.840 --> 07:23.040
is kind of the most fundamental of isolations that we have on Unix, and hence there's a lot of value

07:23.040 --> 07:30.400
in being properly able to isolate containers with that as well. Also, for deployments,

07:30.480 --> 07:34.240
it's kind of relevant that you can have this be more dynamic, right? Because you might

07:34.240 --> 07:38.800
want to be able to pack a lot of containers onto the same node, but still make sure that they're

07:38.800 --> 07:45.840
all nicely isolated, and do this in a scalable fashion. So, anyway, this is the reason why

07:47.040 --> 07:56.560
I think this makes a lot of sense, right? Integration, isolation, and yeah, also that

07:56.560 --> 07:59.680
it acts as a stepping stone for doing so much more, like measurements and things like this.

08:01.120 --> 08:06.480
By the way, I have very little time, so ideally, I always do my talks so that we can do questions

08:06.480 --> 08:10.400
right away, but with 20 minutes, I'm not sure how we're going to do this. I'm going to rush through

08:10.400 --> 08:17.760
this, and maybe we can do the questions afterwards in the hallway. So, yeah, as mentioned,

08:17.760 --> 08:21.440
some parts of OCI we have been doing for years, but nobody knows about them.

08:22.400 --> 08:27.920
Something that I have implemented recently is support for downloading OCI images, like support

08:27.920 --> 08:35.440
for the OCI image format. The intention is to get this merged in the current

08:35.440 --> 08:41.440
cycle, so that it shows up in the next version. What it basically does is it allows you to download

08:41.440 --> 08:47.040
OCI images and have them dropped into a directory, and then you can spawn them via nspawn or as a

08:47.040 --> 08:52.880
systemd service. It works unprivileged, and all these kinds of nice things. Yeah, so it's very close

08:52.880 --> 08:58.640
to being merged. What will this feel like? There's this importctl tool, it has been there

08:58.640 --> 09:03.600
for a while. You just specify the container name, the way you would do it for Docker, and then it ends up

09:03.600 --> 09:10.480
there, and you can just run it. On the lower level, it turns the layering stuff that

09:10.480 --> 09:17.760
OCI is built from into something we call .mstack. I would love to explain what that is,

09:17.760 --> 09:22.160
but I don't think we have the time here. Let's just say it's a powerful new feature. It's

09:22.160 --> 09:29.120
independent of OCI, and it allows you to really nicely put together overlayfs stuff and encode it

09:29.120 --> 09:35.440
in the file system itself. Then there's the other thing, the OCI runtime format. This is, as mentioned,

09:35.440 --> 09:40.560
the thing that systemd-nspawn does already, but again, probably not at the right place.

09:42.000 --> 09:46.880
It's about the stuff once it has landed on disk: you have the bundles then, and now you want to run them.

09:47.520 --> 09:54.880
As mentioned, nspawn can do this. It works, it sets everything up, nspawn works,

09:54.880 --> 09:59.840
but there's a scope mismatch. The idea is that we're going to have a new tool called

09:59.920 --> 10:05.840
systemd-oci. It's just going to read these bundles and turn them into systemd services

10:05.840 --> 10:11.520
natively. First, it's importctl: you download the thing, and then with systemd-oci you

10:11.520 --> 10:16.960
just run it as a regular service like any other. Then the third part that I listed earlier

10:16.960 --> 10:22.880
was the invocation interface, like runc command-line compatibility. Same thing:

10:22.880 --> 10:27.360
systemd-oci is supposed to be a multi-call binary. systemd-oci does not exist yet.

10:27.360 --> 10:32.960
I have not put together a PR, but ultimately it's going to be trivial. Well, it's not

10:32.960 --> 10:37.200
going to be trivial. It's going to be relatively simple because we already have, first of all,

10:37.200 --> 10:42.800
nspawn, which can already parse all this, and then we have systemd-run, which already

10:42.800 --> 10:47.360
can run this stuff. So it's just about putting things we already have together in a different way.

10:47.360 --> 10:52.480
So putting these three things together, we have complete OCI support, right? Like you can

10:52.560 --> 11:00.800
download it, you can run it and it shows up as a systemd service, you can do resource management,

11:00.800 --> 11:04.800
logging, all of this will be integrated with the rest. The next thing would then be hooking it up

11:04.800 --> 11:09.920
with Kubernetes, but there is no time for this. So I actually managed to go through this

11:09.920 --> 11:17.680
pretty quickly, so I actually do have time for questions. So yeah, let's do questions. I'm kind of

11:17.680 --> 11:27.680
amazed that this was so quick.

11:27.680 --> 11:33.360
The EROFS work that's going on in OCI looks quite relevant to your interests, so are you

11:33.360 --> 11:36.640
going to be looking at supporting EROFS images?

11:36.640 --> 11:40.000
Okay, so the question was, I don't have to repeat this now, right? Because you actually

11:40.080 --> 11:47.920
have a mic. So yeah, this is definitely interesting to us, but I think we want something

11:47.920 --> 11:53.440
even stronger because we want, like for our case, we want the verity stuff, right? Like we

11:53.440 --> 12:00.320
care about offline security and the like. Putting EROFS into the OCI stuff is nice

12:01.760 --> 12:07.280
and it's pinned by the hash already, but we kind of want also that it can be pinned by the

12:07.360 --> 12:11.840
verity root hash that the kernel then understands, because that is useful, because then the

12:11.840 --> 12:16.320
kernel can do the measurements and things like this and we get the the progress it up. So yeah,

12:16.320 --> 12:22.080
I think it's going in the right direction. I don't think it's going far enough. The way I see it

12:22.080 --> 12:27.040
is that ultimately we probably want something that we call DDIs, which are basically disk images

12:27.040 --> 12:37.040
that carry the EROFS, carry a verity data thing, and carry a little JSON that has the signature

12:37.120 --> 12:44.240
of it, and that's kind of what I want to focus on. For OCI downloads, will you support all the

12:44.240 --> 12:50.400
authentication plugins? It's not in the spec, but you need to have all of these binaries for every

12:50.400 --> 13:01.120
cloud to be able to authenticate. Yeah. That's trivial, yeah.

13:02.080 --> 13:06.080
That's it.

13:17.520 --> 13:23.200
There seems to be some... there seems to be some kind of overlap with Podman. Is there some

13:23.200 --> 13:27.600
synergy or some joint work between the Podman team and what you are doing on systemd?

13:28.560 --> 13:33.120
So I think I have trouble understanding everything, but my understanding was that there was a

13:33.120 --> 13:41.280
question about Podman, and if there is some synergy with Podman. Yeah. Well, Podman exists.

13:41.280 --> 13:45.760
Certainly Podman does different things, right? Like it's a Docker-like interface and a wrapper

13:45.760 --> 13:51.760
around runc. The way I see it is mostly that whatever we're doing here is kind of a replacement

13:51.840 --> 13:57.280
for the runc part, so that ideally you could run Podman on top of it if you care about

13:57.280 --> 14:02.240
Docker-like semantics. So, if that is an explanation. I don't think we care about Docker-like

14:02.240 --> 14:06.320
semantics at all here. Sorry. I don't think we care about Docker-like semantics at all here.

14:06.320 --> 14:12.800
But are you going to replace the kubelet, basically, in the future? Is that on your roadmap?

14:14.880 --> 14:20.400
I'm not a Kubernetes person. I'm a low-level OS person. So there's definitely

14:20.400 --> 14:24.400
going to be a hook-up to this. What this precisely looks like, we'll have to see. I'm not going to

14:24.400 --> 14:28.240
work on this because I'm not a Kubernetes person. Let's just say we want to make sure this is

14:28.240 --> 14:32.240
nicely integrated in the end, right? So the end goal definitely is that you have systemd, you can

14:32.240 --> 14:35.440
run the containers at the lower level, and then you put Kubernetes on top and you can use classic

14:35.440 --> 14:40.720
Kubernetes on top, but you don't need anything else, right? Like you have those two components and

14:40.720 --> 14:46.720
that's it. Follow-up on that then: do you have any plans for integration with SPIFFE and/or

14:46.880 --> 14:52.960
SPIRE? No framework. We founded a company, as you might know, and this is certainly

14:52.960 --> 14:57.760
a topic that has come up a lot, so we will have something there, I guess. But I'm not going to talk

14:57.760 --> 15:02.560
too much about our plans, but we're very well aware of these kinds of things and we think

15:03.840 --> 15:08.480
they should be something that we can do in the OS itself, because they are concepts that are

15:08.480 --> 15:14.720
not specific to containers and things like this, but apply to the whole OS. So I've been finding

15:14.720 --> 15:22.640
myself in the situation of rebuilding a lot of these tools. I worked on Docker early on. I am now

15:23.520 --> 15:28.880
building container management atop systemd, but systemd is optional, which means I'm now

15:29.440 --> 15:33.440
reproducing all the parts of systemd, and the main reason is that I don't want to be tied to Linux.

15:33.440 --> 15:38.160
So if you're going to support virtual machines and you're also going to support containers,

15:39.200 --> 15:43.520
what are your thoughts on decoupling systemd from Linux? No.

15:45.680 --> 15:50.800
Like, I don't know, systemd is using Linux APIs, and that's the only reason why it can do what

15:50.800 --> 16:01.520
it can do, like use cgroups. And, in the second row: I'm working with him all the time, he's

16:01.520 --> 16:08.960
the Linux kernel guy, he implements all the wishes that I have. This is not going to happen with any other

16:09.040 --> 16:16.320
operating system. It's just not actually reasonable. Yeah. So anyway, so no.

16:24.400 --> 16:30.000
I mean there are illusions in the wider community that you could do all this stuff completely abstracted

16:30.000 --> 16:35.360
that POSIX or whatever is sufficient for this. I'm not a believer in this, right? Like, the

16:36.320 --> 16:42.480
shortcomings of POSIX, I mean, the basic concepts are just so terrible, like a PID.

16:44.080 --> 16:48.640
No, that's just, yeah, you can't do this. Anyway, something else?

17:06.320 --> 17:11.520
What about networking for the containers? Can you repeat this?

17:11.520 --> 17:16.080
So yeah, I have a question: how will the networking side work inside the containers?

17:17.280 --> 17:21.760
Okay, so the question was regarding the networking side. That's a good question.

17:23.680 --> 17:30.720
We're going to add a little bit of infrastructure there. I mean, we founded this company and we

17:30.800 --> 17:34.880
talked about this recently. So we'll have something there. Let's just say

17:36.480 --> 17:40.960
I mean, in systemd you already have some kind of integration, so that you can have

17:40.960 --> 17:45.280
your network namespace and things like this and hook it up with certain things. It's not nice right

17:45.280 --> 17:51.840
now to hook this up with networking that is independent of the container, because you have to do a lot

17:51.840 --> 17:56.160
of manual steps right now. We will certainly make this cleaner, so that there is a

17:57.120 --> 18:00.880
concept for it, because very recently I was looking at it.

18:02.160 --> 18:06.000
Okay, my time is up. Thank you everyone, and if you have further questions, let's just do that, okay.

