WEBVTT

00:00.000 --> 00:11.440
Perfect. Hello everyone. Glad you're all here. I'm Jacob Coffee. I work at the Python Software

00:11.440 --> 00:17.680
Foundation, on the infrastructure. Today, we're going to talk about my fake bread business

00:17.680 --> 00:24.560
called the bakery. The best I could do. But really, we're going to talk about PEP 810,

00:24.560 --> 00:28.720
which I know nothing about. But someone told me the best way to learn something is to teach

00:28.720 --> 00:37.360
it. So here we are. Hang on. We're going to cover what the problem is. The start-up time

00:37.360 --> 00:44.880
is pretty rough with Python when it imports the world. We're going to explain 810 like the

00:44.880 --> 00:50.000
syntax. Probably not a live demo because I didn't realize there would be so many people here and

00:50.000 --> 00:56.160
now you made me very nervous. So real world impact measured and then use cases beyond a silly

00:56.160 --> 01:10.080
bread business CLI tool and then how to migrate. So the problem is that when you run anything in Python,

01:10.080 --> 01:17.520
it's going to import the world. So it's the eager import model, loads everything, and we can see

01:17.520 --> 01:24.800
that it loads, it takes 234 milliseconds, which isn't great. If you've used Go or Rust CLI tools,

01:24.800 --> 01:31.840
they're very snappy. There's also problems with memory bloat. There's also a cold start tax,

01:31.840 --> 01:37.200
which is bad when you're doing serverless things. You probably don't want to use Python for serverless

01:37.200 --> 01:44.640
because of these reasons. And then you have nasty hacks like if TYPE_CHECKING. So what is

01:44.640 --> 01:52.480
PEP 810? It's explicit lazy imports for Python, and it's basically going to say instead of

01:52.640 --> 02:00.400
bringing in the world, we have this module over here that we only call once a year or once however long,

02:00.400 --> 02:05.600
and we don't pull it in until it's first accessed, not just when it reads the import statements.

02:06.960 --> 02:17.680
Good news, it was accepted. So it's going to be in Python 3.15. It says it's 50 to 70% faster for

02:17.680 --> 02:24.320
certain workloads. 30 to 40% less memory, which is good because we like trees, and it's a nice

02:24.320 --> 02:31.520
explicit syntax. So no surprises when it happens. You can easily document it. The syntax,

02:31.520 --> 02:39.280
normally we have just from breadctl, which is my bread business, it's very official, import,

02:39.280 --> 02:45.040
and we have baker, delivery and inventory, and it's just going to import the world, even if you

02:45.040 --> 02:50.480
don't run anything. So even if I do CLI tool, dash dash help, it's still going to import the

02:50.480 --> 02:56.240
world, it's going to take forever. It's really annoying. But with this, the new thing, we lazy import

02:56.240 --> 03:04.480
the thing, and then you only do it when it is invoked. So that is going to be a big improvement.

03:04.480 --> 03:09.360
So if you run dash dash help, you don't access baker, delivery, or inventory, or whatever,

03:09.440 --> 03:14.720
and now it just loads only when you run the one thing. So how is it going to work?

03:16.560 --> 03:24.320
So when Python sees the lazy import, like lazy import json or httpx, it doesn't actually import

03:24.320 --> 03:32.320
httpx, it's going to use a proxy object, it's going to fill the void there, and this proxy goes into the

03:33.280 --> 03:39.040
sys.modules namespace, I think I have that right, where the httpx import would normally be.

03:40.480 --> 03:45.040
But there's no file, I think that's correct, and then no code execution happens yet.

03:46.240 --> 03:51.600
So we parse it, we say lazy import httpx, it creates that proxy, then you have a lazy module

03:51.600 --> 03:58.640
proxy, then we do the waiting until we want to use it. Until you do like an httpx

03:59.280 --> 04:07.520
dot get or whatever module you're calling. So your startup stays fast, your help commands are quick,

04:07.520 --> 04:20.240
all that. Okay, so the moment you actually use HTTBX, it does the real import, the proxy object

04:20.240 --> 04:24.800
is transparently replaced with the real module, I believe I had this correct, it's the reification

04:24.880 --> 04:30.160
process of that, and your code never knows the difference. So if it's never accessed, it

04:30.160 --> 04:34.880
never loads, and that's the whole trick, it sounds like it's this big thing, but it's just

04:34.880 --> 04:44.240
super simple. So boom, httpx is loaded, and we're all happy. So that's the three sort of phases,

04:44.240 --> 04:48.400
very, very dumbed down by someone that wanted to learn it, and then share it with you all, this

04:48.480 --> 04:54.880
parse, wait, and access phase. We do the proxy, the proxy is dormant, and then we do the real import.
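
NOTE
A minimal sketch of the proxy-then-real-import idea described above, runnable on
current Python. It uses importlib.util.LazyLoader from the standard library as a
stand-in; this is an assumption for illustration, not how PEP 810 is implemented.
The heavy_demo module and the lazy_module helper are hypothetical names.
```python
import importlib.util
import sys
import tempfile
from pathlib import Path
def lazy_module(name, path):
    """Return a module object whose body runs on first attribute access."""
    spec = importlib.util.spec_from_file_location(name, path)
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)  # arms the lazy module; its body has not run yet
    return module
with tempfile.TemporaryDirectory() as tmp:
    # heavy_demo is a hypothetical stand-in for an expensive dependency.
    path = Path(tmp) / "heavy_demo.py"
    path.write_text("def get():\n    return 'response'\n")
    heavy = lazy_module("heavy_demo", path)
    before = type(heavy).__name__  # placeholder class, nothing executed yet
    result = heavy.get()           # first access runs the real module body
    after = type(heavy).__name__   # now an ordinary module
sys.modules.pop("heavy_demo", None)  # tidy up the demo entry
```
Comparing before and after shows the placeholder being swapped for a plain module
once an attribute is touched, which mirrors the parse, wait, access phases.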

04:54.880 --> 05:04.720
So boom, it's not super complicated. We're going to see maybe if we can do the live demo,

05:05.440 --> 05:13.360
I'm terrified of this. Actually, look at this cow, it's in Scotland, it's very pretty.

05:14.080 --> 05:20.960
We're not going to do the live demo, like we should. Oh, how are we going? Because I did this

05:20.960 --> 05:27.440
earlier to make sure that I didn't look like a goober up here, but basically on the left side,

05:28.480 --> 05:34.640
we have, I think I flipped around one of these, yeah, okay, so left side is the normal

05:35.280 --> 05:43.280
running, and then you have the right side with this dash click, which is, you just look in the

05:43.280 --> 05:48.160
code and in the slides, I have some links. You have the same thing running, it's a little bit slower,

05:48.160 --> 05:56.000
it's like seven times, or seven point five times faster. And then I think I used Claude to help

05:56.000 --> 06:01.840
me because I'm bad at math. We can see some like real numbers here. So 30 to 40 milliseconds,

06:01.840 --> 06:05.920
faster is what the robot says, but we don't, we don't really always trust the robot, do we?

06:08.640 --> 06:13.600
So that's all you're going to get for live demo. But yeah, you can see just from like that one

06:14.720 --> 06:20.880
four letter change, we have a much faster thing. This is a very silly demo, but you can see like,

06:20.880 --> 06:25.200
for example, I maintain or help maintain Litestar, which is a web framework, you could do the

06:25.200 --> 06:32.320
same thing for Flask or FastAPI. But when you have this ginormous app, and you do some

06:32.320 --> 06:37.760
CLI command for it, it's going to take what feels like five, ten seconds to load. It's not really

06:37.760 --> 06:45.360
that long, I hope. So you have other problems. But with the lazy imports explicitly, you can now do this

06:45.360 --> 06:50.160
without hacking around it. I mean, you could lazy import things now: you could stuff things

06:50.160 --> 06:56.560
inside of functions, so they're only called when the function's invoked. That's fine. But this is

06:57.200 --> 07:01.120
part of the language now. It's official. Nice non-hacky way to do that.
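
NOTE
A small sketch of the pre-PEP 810 workaround mentioned above: putting the import
inside the function so it only runs when that command is invoked. The breadctl
program name, the bake command, and json standing in for a heavy dependency are
all assumptions for illustration.
```python
import argparse
def bake(loaves):
    # Deferred import: json stands in for a heavy dependency and is only
    # imported when the bake command actually runs.
    import json
    return json.dumps({"loaves": loaves})
def main(argv):
    parser = argparse.ArgumentParser(prog="breadctl")
    sub = parser.add_subparsers(dest="command")
    bake_cmd = sub.add_parser("bake")
    bake_cmd.add_argument("loaves", type=int)
    args = parser.parse_args(argv)
    if args.command == "bake":
        return bake(args.loaves)
    # No command given: return the help text without touching the heavy import.
    return parser.format_help()
```
Calling main(["bake", "3"]) triggers the deferred import, while main([]) only
builds the parser and returns the help text.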

07:05.840 --> 07:10.880
Okay, so we have some benchmark results here just from my silly examples. This is going to use

07:10.880 --> 07:15.920
the cappa CLI framework. That's just one that I like, but actually if you,

07:16.880 --> 07:23.600
it's supposed to be faster, but if you compare against CLI and some others, I think cappa is not

07:23.600 --> 07:31.520
the fastest, but I like the syntax. So just for me, throwing in this lazy import, I got a 23% speedup

07:31.520 --> 07:38.240
for the help command, module import time is 26% faster, and then my inventory command, which does

07:38.240 --> 07:42.240
a whole lot, did not change. So that's the thing, you're not going to always see an improvement,

07:42.240 --> 07:48.240
so it's not like you should go and just find and replace import this with lazy import this,

07:48.240 --> 07:56.080
so I think that's a good solution. Some real world impact: Meta. They have their own

07:56.080 --> 08:03.280
fork of CPython, and then they do lazy imports and they have a 70% startup reduction

08:03.280 --> 08:11.200
and 40% memory savings, which is what PEP 810, if you look at the docs, says is what could

08:11.280 --> 08:17.520
be expected for your own projects. Same thing for HRT, they have module-level lazy imports,

08:17.520 --> 08:23.520
and for PySide, they have Qt bindings, 35% startup improvement. So that's pretty significant

08:24.240 --> 08:30.080
for your end users or yourself, whichever. And these are not experiments, there's like real

08:30.080 --> 08:37.840
world, ginormous companies that do cool things or not cool things, but they have big production

08:37.920 --> 08:48.080
companies using the code, so it works. So PEP 810 is not just for CLI tools, that's just one

08:48.080 --> 08:57.520
use case, it's my very silly example for that. So we have type checking, so if you ever want to do

08:59.520 --> 09:04.560
if TYPE_CHECKING and that whole block where it's just like this ugly little block, you can just

09:04.560 --> 09:12.800
instead do lazy import from whatever type thing you're doing and the TYPE_CHECKING block goes away.
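
NOTE
A short sketch of the if TYPE_CHECKING idiom being discussed, as it works on
current Python. The format_price function is a hypothetical example; the PEP 810
spelling is shown only as a comment because it is a syntax error before 3.15.
```python
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    # Only type checkers evaluate this branch; at runtime it is skipped.
    from decimal import Decimal
def format_price(amount: "Decimal") -> str:
    """Format a price; the string annotation means Decimal is not needed at runtime."""
    return f"{amount:.2f}"
# Under PEP 810 the guarded import above could instead be the single line
#     lazy from decimal import Decimal
# so the name exists at runtime but the module loads only if it is used.
```
The guarded block keeps the import cost out of runtime today, at the price of
string annotations and an extra conditional that linters have to understand.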

09:14.800 --> 09:22.320
I think it's error prone and a lot of our linters yell at us for leaving things out or things

09:22.320 --> 09:28.240
that maybe should go in the type checking, so it's very confusing. So and there's also some things

09:28.240 --> 09:32.800
with like I said earlier, the serverless and cold start environment, so every Lambda cold,

09:32.800 --> 09:39.280
every Lambda cold start costs money and uses your users' patience. So if we're seeing, for

09:39.280 --> 09:48.080
example, Meta saw the 50 to 70% speedup, then that's something that will really help

09:48.080 --> 09:55.600
save some money and make users happy for your serverless runs. And also for memory constrained

09:55.600 --> 10:03.040
environments. So these are all good things. There's a simple swap out for the things that you can

10:04.240 --> 10:14.160
identify, need this and I don't know, it's just simple, it's good. But how I do it, so I can

10:14.160 --> 10:20.240
take you through this. So you want to go through and profile, probably better than I have profiled

10:20.320 --> 10:31.520
here, but profile your application, whatever it may be, look for heavy things like numpy or

10:32.160 --> 10:39.840
pandas or httpx or anything like greater than 50 milliseconds to start with and then apply the lazy

10:39.840 --> 10:48.960
import selectively. The module-level init is going to happen on the first access and you should see

10:48.960 --> 10:53.920
an improvement, you could test this with hyperfine. All of the benchmarks that I have done have been

10:53.920 --> 10:59.840
with hyperfine, and it's just a Rust CLI tool for benchmarking. It's very fast. And then

11:00.400 --> 11:06.800
your Python can finally go almost as fast as Go. Don't quote me on that, but probably.
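
NOTE
One way the profiling step above could be sketched: run the interpreter's
built-in -X importtime option in a subprocess and list imports at or above a
threshold. The import_costs helper and its 50 ms default are assumptions taken
from the rule of thumb in the talk, not an official tool.
```python
import subprocess
import sys
def import_costs(module, threshold_ms=50.0):
    """Return (name, cumulative_ms) rows at or above threshold_ms, slowest first."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True, text=True, check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3 or not parts[1].strip().isdigit():
            continue  # skips the column header row
        # Columns are self time, cumulative time (both microseconds), module name.
        rows.append((parts[2].strip(), int(parts[1]) / 1000))
    return sorted((row for row in rows if row[1] >= threshold_ms), key=lambda row: -row[1])
```
Anything this reports is a candidate for a lazy import. End-to-end timing can then
be compared with hyperfine, for example hyperfine --warmup 3 'breadctl --help',
where breadctl stands for whatever command is being measured.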

11:08.800 --> 11:15.760
So some caveats and gotchas. Import-time side effects are deferred, so you may want to like

11:16.480 --> 11:23.680
note or document that things that might configure logging at import time might need to be

11:23.680 --> 11:29.760
not lazy imported. That's the don't just blindly lazy import the world. I can't wait to see

11:29.760 --> 11:35.920
some people do that, though, and get some error reports. So other things, type checkers still

11:35.920 --> 11:43.600
need updates, I think last night, I don't know if anyone has them, ty does not, has an issue open,

11:44.560 --> 11:49.920
Ruff for the linting part, and then mypy and Pyright, I have not checked, but I assume they do not.

11:50.640 --> 12:02.320
Also, let's see, import errors move. So if before you had some optional dependency,

12:02.320 --> 12:11.120
like I had an optional one in my pyproject.toml that was Sentry, and nothing happened when I ran

12:12.080 --> 12:17.040
the app, but I forgot to do the right flag or incantation or whatever to add

12:17.040 --> 12:22.560
Sentry as an optional dependency. You're not going to know about it until the app is running,

12:22.560 --> 12:27.920
because we're not going to get an error, it seems simple, but you just have to kind of think about

12:27.920 --> 12:34.240
these things, until Sentry is called and then throws an error because no module found.
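
NOTE
A sketch of the moved-error problem described above, using a function-level
import as today's stand-in for a lazy import. The module name
sentry_sdk_typo_demo is hypothetical and deliberately nonexistent, and
missing_modules is an assumed helper showing one way to fail fast at startup.
```python
import importlib.util
def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]
def report_error(message):
    # Deferred import: a missing or misspelled package only fails here, on first call.
    import sentry_sdk_typo_demo  # hypothetical, deliberately nonexistent module
    return sentry_sdk_typo_demo.capture_message(message)
```
Defining report_error succeeds even though the package is absent; the
ModuleNotFoundError appears only when it is called. Running a check like
missing_modules at startup surfaces the problem before the app is serving users.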

12:34.240 --> 12:41.200
Like it's not always faster, so it's best for conditional and optional imports,

12:42.080 --> 12:48.160
and then some tips, just test with simple things, like your help, if you're actually running a

12:48.160 --> 12:56.320
CLI, or optional dependencies, please keep your eager imports. Don't just lazy import the

12:56.320 --> 13:02.000
things that you rely on, like logging configuration, and then document things for your co-workers,

13:02.080 --> 13:08.080
they will love you for this. So they know this thing is lazy-loaded. When this error happens,

13:08.080 --> 13:13.600
you don't have to Google around or ask Claude because Claude's not going to be up to date on this

13:13.600 --> 13:20.800
for a while, probably. So one thing, oh, you can't, there's some docs in the official PEP also.

13:21.520 --> 13:28.400
You can't lazy import inside functions. You can't import inside if TYPE_CHECKING, I don't know why you do

13:28.720 --> 13:34.480
that, but you cannot do that. We can go over some caveats actually here.

13:37.200 --> 13:41.520
Yeah, can't do inside functions. Can't lazy import inside classes, because then you kind of like,

13:42.080 --> 13:46.240
it's only going to be in the global namespace at the top, but I mean, for the reason,

13:46.960 --> 13:52.960
because these were, I guess, implicit, lazily imported. This is what people do currently to

13:52.960 --> 13:59.360
try and speed up and get around this. And then this ugly thing that people like to do,

13:59.360 --> 14:04.640
sometimes, where we import star, we should not do that. So I'm glad that this is a syntax error.

14:06.160 --> 14:18.880
So, awesome, reset me. So it's not yet available. There's a reference implementation available now

14:18.880 --> 14:26.880
through the CPython, through a CPython fork, and there's a link in my, in the notes that I'll

14:26.880 --> 14:35.040
share out, that's the Lazy Imports Cabal. I think it's the GitHub organization for that.

14:35.840 --> 14:39.680
The Zen of Python, I think this is good, because in the Zen of Python explicit is better than implicit.

14:40.400 --> 14:47.600
That's, that's the most lived by. And this manifests that for us all. And lets us make good choices,

14:47.600 --> 14:54.080
not have some footgun that, well, maybe you can footgun if you do some bad things. But for the most

14:54.080 --> 15:00.800
part, it is good. No surprises here. And it's local. It's only going to affect the one thing that

15:00.800 --> 15:07.520
you tag lazy on. So it's not going to, say, mess up like import sys, and then lazy import json.

15:07.520 --> 15:13.200
It's not going to mess up sys, or any kind of example. And you have that granular level of control,

15:13.280 --> 15:21.760
which is very nice. Back on this, you can make some lazy and eager,

15:21.760 --> 15:26.960
easily freely if you were wondering. So there's no limitations around that. And then some resources.

15:26.960 --> 15:33.440
There's the PEP, peps.python.org, PEP 810. There's this very bad demo page for breadctl,

15:33.440 --> 15:38.560
my very official bread business. There's the CPython fork, github.com/LazyImportsCabal,

15:38.560 --> 15:47.200
and then hyperfine, which I've done all my benchmarking with. That's a thank you page.

15:47.200 --> 15:57.840
But I wanted to go over, let's see. So more like in in-depth things that I'm glad

15:57.840 --> 16:03.520
that I have time for. So I mentioned reification, I glanced over it, but we have this lazy object.

16:03.520 --> 16:10.480
It needs to be reified or like made real. So right now we have lazy import foo. Right now foo

16:10.480 --> 16:17.840
is just like a placeholder in sys.modules. Until we call that or do something that

16:17.840 --> 16:23.440
invokes it, if you call type on foo, then that would bring it into sys.modules.

16:25.600 --> 16:34.480
Here we have an example of this. We don't we don't have a man. Thank you.

16:34.560 --> 16:44.560
Here we have a this is a silly example where we are not going to have this issue.

16:45.360 --> 16:51.120
It has a typo and the error occurs only on the first use. That is a bad thing. You could typo

16:51.120 --> 16:57.120
other thing, lazy import it, and it's not going to know that it never exists because it doesn't

16:58.080 --> 17:04.720
technically until it reifies and it exists. So that is something to kind of think about.

17:06.880 --> 17:10.720
Perfect. Now thank you. Okay, I'm done.

17:17.840 --> 17:23.120
If you have any questions like I said, I did this to learn it. So I'm not going to be the expert in the room here.

17:23.760 --> 17:28.320
We do have a bunch of CPython core developers. So if you ask me, I'm probably going to point

17:28.320 --> 17:35.680
to them, but question away. Okay, so you can ask the questions on the chat or just raise your hand

17:35.680 --> 17:45.520
and I'll give you the mic. Any questions? Two. Two questions from the chat.

17:46.480 --> 17:51.600
Yeah. Okay. Two questions from the chat. Okay. First question. Does it solve the

17:51.600 --> 18:00.800
problem with circular imports? Circular imports? Yeah. Well, I think that might still be a problem,

18:00.800 --> 18:07.680
but only a problem when the lazy import is reified and it becomes a real object. Does that

18:07.840 --> 18:12.480
sound right? You do? Yeah. You're still going to have the circular import, but it's not going

18:12.480 --> 18:23.200
to throw the exception until that is accessed. Okay. So good question. What stops me from using lazy

18:23.200 --> 18:32.720
imports everywhere because I'm lazy? Nothing. Yeah. You can, you can try it. There are probably

18:32.720 --> 18:41.440
some workloads where that might work. I mean, my example was if you did log like logging and

18:41.440 --> 18:47.280
for some reason, you lazy imported your logging.py file and that configured all of your logging

18:47.280 --> 18:52.720
things. We're not going to have anything done until that first call. So you might miss out on

18:52.720 --> 18:58.720
some things. I don't know if I have that quite right. This one? I'm sorry. And the back? Yeah.

18:58.720 --> 19:13.040
Can we? Oh. Where's the question? Thank you for the talk and for anybody here that

19:13.040 --> 19:17.600
contributes to the PEP, sounds very good. Can we expect like cascading improvements? There's

19:17.600 --> 19:22.800
really expensive imports that you suggested like pandas and numpy. Can we expect that as they put

19:22.800 --> 19:27.760
in a lazy import internally? Yeah. We'll also get improvements even if we don't lazy import those

19:27.920 --> 19:34.160
branches. Yeah. So with my example, that would be Litestar. We provide a CLI with the

19:34.160 --> 19:41.440
web framework that you can call. So for that example, and for yours, for when NumPy, they bring this in

19:42.320 --> 19:49.680
when I guess 3.15 is, well, when 3.15 is out and people are only using 3.15 in their project. Yeah,

19:49.680 --> 19:56.400
you can expect to see improvements just by pip updating their requirements. And then, you know,

19:56.400 --> 20:03.760
they'll have the speedup, right? Following on from that, is there a way to feature test this if

20:03.760 --> 20:08.400
you're still supporting older Python versions that wouldn't have it? Yeah. So I did this

20:10.080 --> 20:16.560
the initial work here. This is hard to talk in. But of course, Hugo here already wrote a blog

20:16.560 --> 20:22.720
as soon as I finished my talk, I realized that Hugo had basically this. So he tested this

20:22.880 --> 20:31.840
basically by line. So you can bring this in to your project by adding this reference implementation

20:33.040 --> 20:43.280
line to your tools about which Python you're doing. And this will be a link in my repo for this

20:43.280 --> 20:50.960
talk. And basically here we're saying that I'm not Python 3.15. I'm Python 3.14 because if you try this

20:51.040 --> 20:55.040
now and do like ruff lint or ruff check or whatever, it's going to yell at you because they don't

20:55.040 --> 21:03.280
support 3.15. And so you compile this special fancy version, then you can pip install the packages

21:03.280 --> 21:11.120
or your project and then you can test it yourself. Hugo also did some hyperfine benchmarking

21:11.120 --> 21:18.240
and then went on to do some fully lazy benchmarking. It saw even greater import performance.

21:18.240 --> 21:24.400
So almost three times faster. So you can test it now. Just not very easily.

21:28.400 --> 21:35.200
First, thank you to all the people who actually did contribute to this. I have a question

21:35.200 --> 21:40.080
which is about the overhead. Let's suppose that you replace all your imports by the lazy import.

21:41.040 --> 21:50.240
Like the gentleman on the chat said, would that have any drawback? Like would there be any overhead?

21:50.240 --> 21:55.120
Let's suppose that it works, right? I don't think there's any runtime overhead from

21:55.120 --> 22:03.120
doing it, no? I think there's like zero, right? Yeah, I'm pretty sure it's zero runtime

22:03.120 --> 22:08.640
overhead on that. Okay, and also for the circular imports, I might imagine that it would delay

22:09.280 --> 22:16.880
discovering that, oh, I have a bug like much later into the runtime. Yeah. Yeah, so it like it could be bad.

22:16.880 --> 22:24.240
That's why you should selectively do the lazy imports instead of doing a find and replace and then

22:24.240 --> 22:31.920
massively replace it in your whole project. You have more questions? Yeah, out of time. Unfortunately,

22:32.080 --> 22:36.960
if you have more questions, then please just comment and then talk to Dr. Director at Dr. Yeah.

22:36.960 --> 22:42.880
Or put it maybe into the chat. Thank you all. Thank you!

