WEBVTT

00:00.000 --> 00:16.480
Hello everybody, I hope there is still enough oxygen in the room for you to focus on this

00:16.480 --> 00:17.480
next talk.

00:17.480 --> 00:23.280
I will be talking about the GIL, or the global interpreter lock, and its impact on API

00:23.280 --> 00:28.040
performance, talking about the past, the present and the free-threaded future of

00:28.040 --> 00:29.920
Python.

00:29.920 --> 00:34.920
My name is Ruben Hiels, I am a software engineer at Techwell, which is a skills intelligence

00:34.920 --> 00:41.000
scale-up based in Kent, and I also have a little side project, flowed-up depth.

00:41.000 --> 00:45.960
A quick refresher, since it is quite crucial to the rest of this talk:

00:45.960 --> 00:51.720
threads versus processes. A lot of you might already know everything about this, but a process

00:51.720 --> 00:57.560
is an independent program with its own, isolated memory space, which

00:57.560 --> 01:03.560
is safer but also incurs higher overhead, and processes can communicate with each other over IPC.

01:03.560 --> 01:08.120
Next to that, we have threads, which are lightweight execution units contained within

01:08.120 --> 01:13.600
these processes. They share the memory space of the process, which is efficient

01:13.600 --> 01:21.240
but can lead to issues regarding data consistency and race conditions, and they can

01:21.240 --> 01:25.640
communicate directly via this shared data. A process can have either one thread

01:25.640 --> 01:28.120
or multiple threads.

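To make this refresher concrete, here is a minimal sketch (my illustration, not code from the talk) of threads sharing one memory space inside a process:

```python
import threading

# Shared state: all threads in a process see the same list object.
results = []

def worker(n):
    # Each thread appends directly to the shared list -- no IPC needed.
    results.append(n * n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 4, 9]
```

With processes, each worker would get its own copy of `results`, and the values would have to travel back over IPC instead.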
01:28.120 --> 01:31.880
So now let's get into it: the GIL, what is it really?

01:31.880 --> 01:37.080
Well, the name speaks for itself: it's a lock that allows only one thread to execute

01:37.080 --> 01:42.600
Python bytecode at a time, a lock on the interpreter that is global to the whole

01:42.600 --> 01:45.240
process.

01:45.240 --> 01:49.440
Many see it as a core part of Python that everyone needs to work around, but it's

01:49.440 --> 01:53.920
not actually a feature of Python; the language can work without it. It's an

01:54.240 --> 02:00.560
implementation detail of CPython, and forks of Python don't necessarily have it. It is

02:00.560 --> 02:06.240
there to protect the interpreter's internal data structures, very specifically around garbage collection

02:06.240 --> 02:08.480
and reference counting.

02:08.480 --> 02:13.120
Here we see, for example, a world where we didn't have something like the global interpreter

02:13.120 --> 02:17.760
lock. Here you might have two threads trying to access the same variable. They both read

02:17.760 --> 02:22.480
the current reference count, let's say three in this case, they both

02:22.560 --> 02:25.280
increment it, they both say it's four, and they both write it.

02:25.280 --> 02:31.120
Now in our state the reference count is four, but in reality it should be five, and this

02:31.120 --> 02:35.360
can obviously lead to many issues.

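The lost update described here can be simulated in a few lines of plain Python (a hand-rolled illustration of the interleaving, not CPython's actual reference-counting code):

```python
# Simulate two threads doing a non-atomic refcount increment:
# each one reads, then increments, then writes back -- interleaved.
refcount = 3

# Thread A and thread B both read the current value first...
read_a = refcount
read_b = refcount

# ...then each writes back its own incremented copy.
refcount = read_a + 1  # thread A writes 4
refcount = read_b + 1  # thread B also writes 4, clobbering A's update

print(refcount)  # 4, but two increments should have produced 5
```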
02:35.360 --> 02:41.200
Now with the GIL we don't have this, since only one thread can execute at a time,

02:41.200 --> 02:46.800
and it has to finish writing before it can hand access to the next one,

02:46.800 --> 02:50.240
and here we end up with the correct result.

02:50.240 --> 02:55.680
I even went back in time to find the very commit where the global interpreter lock was introduced.

02:55.680 --> 03:00.720
I might not have been born yet, but Guido van Rossum was already writing Python on the good day of

03:00.720 --> 03:07.280
August 4th, 1992. I even went to find the specific code. It might be slightly too small,

03:07.280 --> 03:10.800
but it's actually very simple: you launch a thread, you check whether you have the lock or

03:10.800 --> 03:12.800
not, and either you execute or you don't.

03:15.200 --> 03:18.640
So you might ask yourself, why do we even have threads then?

03:18.640 --> 03:23.440
If only one can execute at the same time, we might as well just live in a single-threaded world.

03:24.160 --> 03:30.560
Well, holding the global interpreter lock is the default, but the CPython implementation

03:30.560 --> 03:34.720
can still choose to release this lock in specific cases.

03:35.520 --> 03:40.640
One of these is I/O, or input/output: network calls, file reads, etc.

03:41.200 --> 03:46.960
Especially back in the day, hard drives were very different; disks were still spinning,

03:47.920 --> 03:53.440
so we had read speeds of 0.5, maybe 2 megabytes per second, network latency of hundreds of

03:53.440 --> 03:58.720
milliseconds, and bandwidth of 14 or 40 kilobytes per second. So this is a really significant

03:58.720 --> 04:03.360
chunk of your application's execution time, even still today. So if you can have other threads

04:03.360 --> 04:08.800
executing during this time, you still get a big speed-up. Secondly, there were hardware limits.

04:09.440 --> 04:14.400
Multi-threading wasn't necessarily a thing, because the Compaq Deskpro of the time,

04:14.880 --> 04:20.480
one of the pretty top-of-the-line machines, only had one hardware thread, one core;

04:20.480 --> 04:24.560
multi-core machines were only introduced way later and weren't common.

04:25.920 --> 04:31.520
They existed only in some high-performance contexts. And finally, one of the things that makes

04:31.520 --> 04:36.960
Python what it is today is its extensive ecosystem. And having a global interpreter lock makes

04:36.960 --> 04:41.680
it way easier to write C extensions, like for example NumPy, that so many of us rely on today.

04:42.480 --> 04:45.440
So maybe we wouldn't have had those if that had not been the case.

04:47.280 --> 04:53.520
Now here you see a list of numbers; you can already tell they are years. Does anybody have a guess

04:53.520 --> 04:55.760
what they might signify?

05:00.000 --> 05:06.800
That's very true. These were all attempts to remove the GIL. And I was even lying: in 2015 there

05:06.800 --> 05:12.800
were two attempts. So it's not new that people want to get rid of it and really use the full

05:12.800 --> 05:20.320
performance of the computer that is available to them. And if you looked closely, the GIL was only introduced

05:20.320 --> 05:26.800
in 1992; it's 1996 and they are already attempting to remove it. But why did so many attempts,

05:26.800 --> 05:30.960
that many attempts, fail? Why wasn't the GIL removed a long time ago?

05:31.840 --> 05:38.560
Well, the initial attempt in 1996 did this with fine-grained locks. So instead of one

05:38.560 --> 05:46.960
big lock, every data object had its own lock. And this did work: you could have multiple

05:46.960 --> 05:52.400
threads executing at the same time. But single-threaded performance was about three times worse.

05:52.400 --> 05:56.720
So this was really not a sensible trade-off to make for the majority of applications.

05:56.720 --> 06:04.800
The same happened in 2007 and 2015; they tried other things like atomic operations, etc.

06:05.680 --> 06:14.000
But finally, the year is 2021, and it's beautiful. What happened here? A new fork of CPython was introduced:

06:14.000 --> 06:18.640
the nogil fork by Sam Gross, who was a software engineer at Meta at the time.

06:19.840 --> 06:25.760
He introduced the concept of biased reference counting. Biased reference counting is quite simple if you

06:25.840 --> 06:32.480
think about it, and it uses the property that, while every thread can theoretically access every object,

06:33.200 --> 06:39.840
realistically a lot of the variables will only be accessed by one thread. So there is quite a

06:39.840 --> 06:47.200
lot of locality regarding the thread. And he exploited this to say: this variable is owned by this thread,

06:47.200 --> 06:52.480
which can access it very quickly. And if other threads want to access it they can, but that will

06:52.560 --> 06:58.720
incur a performance penalty. He also introduced many other things, like immortal objects,

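As an illustration, biased reference counting can be modelled roughly like this in pure Python (a toy model I wrote; the real mechanism lives in CPython's C internals and uses atomic instructions rather than a lock):

```python
import threading

class BiasedRef:
    """Toy model of biased reference counting (not CPython's C code)."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local = 1       # owner-only counter: plain, unsynchronized ops
        self.shared = 0      # counter used by all other threads
        self._lock = threading.Lock()  # stands in for atomic instructions

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1            # fast path: no synchronization cost
        else:
            with self._lock:           # slow path: pay for contention
                self.shared += 1

    def total(self):
        with self._lock:
            return self.local + self.shared

obj = BiasedRef()
obj.incref()                             # owner thread: fast path
t = threading.Thread(target=obj.incref)  # other thread: slow path
t.start(); t.join()
print(obj.total())  # 3
```

The point is the asymmetry: the common case (owner-only access) stays cheap, and only cross-thread access pays the synchronization penalty.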
06:58.720 --> 07:04.080
a thread-safe memory allocator also developed by Microsoft, mimalloc, and many other small tweaks.

07:04.080 --> 07:10.160
But this was really the core innovation, and it allowed the fork to have only minimal overhead, I think

07:10.160 --> 07:18.000
10 or so percent, compared to the implementation at the time. And after this it got officially accepted

07:18.000 --> 07:24.720
as a Python Enhancement Proposal (PEP 703). And there it was, I think, set in stone that should a version

07:24.720 --> 07:31.280
be made with less than 10% overhead over the current single-threaded implementation, the one with the

07:31.280 --> 07:38.480
GIL, then it would become mainstream Python. So, how we got here: the GIL is not a feature,

07:38.480 --> 07:44.240
it's a CPython implementation detail to protect these data structures; many removal attempts failed, mainly

07:44.240 --> 07:51.440
due to performance penalties. And Python 3.13 finally introduced free-threading as an experimental

07:51.440 --> 07:57.680
mode, mainly through the innovations of Sam Gross. And you will always have to make a trade-off:

07:58.240 --> 08:02.320
you accept a small single-threaded slowdown in order to enable this true parallelism.

08:03.120 --> 08:09.600
Another question, maybe: how do we actually use this? Quite simple, you add a "t" after your

08:09.680 --> 08:16.880
Python version. This is the new standard: if you add the "t", since 3.14 you automatically get

08:16.880 --> 08:24.160
the free-threaded version, and uv also has built-in support for this. But realistically, let's say,

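If you want to verify which build you are actually running on, a small snippet like this works (`sys._is_gil_enabled()` only exists on 3.13+, hence the `hasattr` guard):

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 on free-threaded ("t") builds, 0 or None otherwise.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
print("free-threaded build:", free_threaded_build)

# On 3.13+, sys._is_gil_enabled() reports whether the GIL is active right now;
# a free-threaded build can still re-enable it for incompatible extensions.
if hasattr(sys, "_is_gil_enabled"):
    print("GIL currently enabled:", sys._is_gil_enabled())
```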
08:24.160 --> 08:29.280
what does this do to your actual applications? Here you can see, I think, about the most basic

08:29.280 --> 08:37.520
version of a multi-threaded program: you just do some basic CPU-bound calculations over a set of

08:37.520 --> 08:44.880
threads. Here, in the normal non-free-threaded version, so with the global interpreter lock enabled,

08:44.880 --> 08:51.120
we see an execution time of about 2.3 seconds, and now in the free-threaded version we see that

08:51.120 --> 08:57.920
it goes down to 770 milliseconds, which is about a 3x speed-up. By the way, all the benchmarks here were

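A benchmark along these lines might look like the following sketch; `do_work` and the iteration counts are my stand-ins, not the speaker's exact script:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def do_work(n: int) -> int:
    # Pure-Python CPU-bound work: with the GIL, threads serialize on this;
    # on a free-threaded build they can run truly in parallel.
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(do_work, [200_000] * 4))
elapsed = time.perf_counter() - start

print(f"{elapsed:.3f}s, checksum={sum(results)}")
```

Running the same script under a standard and a "t" build of the same Python version is the quickest way to reproduce the comparison on your own machine.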
08:57.920 --> 09:06.720
executed on this MacBook M3, for reference. Okay, let's now look at the single-threaded performance,

09:07.200 --> 09:12.880
because we see, okay, there is a real, significant speed-up possible with this new free-threaded

09:12.880 --> 09:17.760
mode, but what is the penalty we pay? Because that's the reason it was not implemented for so long.

09:18.640 --> 09:25.600
Well, in the normal version we see that it's about 2.34 seconds, and then in the free-threaded version

09:26.320 --> 09:33.280
2.37, which is really only a 1.2% overhead in this case. Note that this will of course also

09:33.280 --> 09:39.680
highly depend on your actual use case; some people have seen up to around a 10%

09:39.680 --> 09:45.280
increase, but it has really come down a lot, and that's also what this graph indicates. This was

09:45.280 --> 09:51.600
a benchmark run by Miguel Grinberg, where he looked across Python versions

09:52.160 --> 09:58.160
at how the performance changed, and what I want to highlight here is the difference in

09:58.160 --> 10:03.520
performance penalty between the standard version and the free-threaded version, between 3.13 and

10:03.520 --> 10:11.040
3.14. For example, in the leftmost bar, with Linux on 3.13, we see maybe 4 seconds of additional

10:11.040 --> 10:17.520
time spent, but in 3.14 it's minimal, not even a second. So this really shows the hard work

10:17.520 --> 10:26.960
the CPython developers have been doing to make this a reality. So what does it mean? For CPU-

10:27.040 --> 10:33.360
bound applications, it is now possible to have actual, true CPU parallelism, with about a 3x

10:33.360 --> 10:38.480
speed-up found on some machines, but of course the more cores you have, the higher the potential

10:38.480 --> 10:45.680
for parallelism. It's also best for embarrassingly parallel workloads, as they are often called,

10:45.680 --> 10:51.280
so fully independent calculations, but this is not specific to Python; this has been the case for all

10:51.360 --> 10:57.360
multi-threaded applications. The single-threaded overhead is now minimal for a lot of applications;

10:57.360 --> 11:02.320
once again, please test it out with your own workload before you push anything to production and

11:02.320 --> 11:09.840
everything starts breaking down. Thread overhead matters: I didn't explicitly highlight

11:09.840 --> 11:14.160
this, but the multi- and single-threaded versions did the same things, yet there was a

11:14.160 --> 11:21.120
performance penalty to be paid. Now, for the API developers: this was something I was personally

11:21.120 --> 11:27.920
very interested in, and I think many of us are. Some might do real scientific calculations, but

11:27.920 --> 11:34.240
even then, a lot of scientific calculation is done in NumPy these days. But you often hear: okay, why are we

11:34.240 --> 11:40.480
even investing so much effort into removing this global interpreter lock if it doesn't necessarily

11:40.480 --> 11:47.280
affect us that much? All our applications are I/O-bound anyway; it's only

11:47.280 --> 11:53.520
for academic reasons that we might want to remove this. So that's when I got curious. In the current

11:53.520 --> 11:58.720
day and age, Python APIs scale by handling concurrent requests, of course, with something like

11:58.720 --> 12:05.440
Gunicorn. So they require parallel execution; they can do their I/O, etc., but at some point the

12:05.440 --> 12:11.360
GIL will block their thread-based parallelism if they really get to CPU-bound work. So

12:11.360 --> 12:17.200
the current solution today is to spawn multiple worker processes, fully independent processes,

12:17.200 --> 12:24.480
not threads, which operate fully independently, each with its own memory space and all of that. But

12:24.480 --> 12:31.200
processes have their trade-offs; as we said, a process is a way more heavyweight thing than a thread. For one,

12:31.200 --> 12:36.640
processes don't share memory space. Let's say you have your multiple worker processes; you

12:36.640 --> 12:42.800
initialize your API and it's about 500 megabytes, because you have to load in some data to be able to

12:42.800 --> 12:49.920
serve things. Once you scale up your processes, you can see your memory requirements go through the roof,

12:49.920 --> 12:57.280
and your cloud bill will go up as well. Next to that, data sharing is quite hard. Let's say you

12:57.280 --> 13:02.560
just have a local dict which is your cache or something: you can easily do that with threads;

13:02.560 --> 13:10.000
with processes it's way harder, requiring serialization, IPC, etc., as well as just being slower in general.

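A tiny illustration of why sharing that cache dict across processes is hard (my example, using `pickle` to stand in for what inter-process communication does under the hood):

```python
import pickle

cache = {"user:1": {"name": "ada"}}

# Threads could read `cache` directly; a worker process cannot, so the data
# must be serialized, shipped over a pipe or socket, and deserialized again.
wire_bytes = pickle.dumps(cache)      # what IPC actually transmits
received = pickle.loads(wire_bytes)   # the other process's reconstructed copy

print(received == cache)   # True: equal in value...
print(received is cache)   # False: ...but a separate copy, not shared memory
```

Every update to the cache would need this round trip, which is exactly the serialization and IPC overhead the talk mentions.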
13:11.840 --> 13:17.680
But, and it's one big but, a process doesn't have this global interpreter lock contention. So does that make it

13:17.680 --> 13:25.520
worth it or not? This is where we needed to do some benchmarks. I tried quite a lot of configurations,

13:25.520 --> 13:32.720
both specifically to test free-threading but also out of my own interest. On the traditional side, I looked at

13:32.720 --> 13:39.200
Gunicorn and traditional thread scaling. I also took Granian into consideration, which is a

13:39.200 --> 13:44.960
web server similar to Gunicorn, but it's Rust-based and it offloads some of the network-level

13:44.960 --> 13:53.120
processing to Rust threads instead of doing it in the Python world. Next, how does this work with things

13:53.120 --> 14:01.120
like async processing, FastAPI, as well as gevent? And then finally, let's take

14:01.120 --> 14:08.000
free-threading into account with threads, and how does it compare to multiprocessing? Since the

14:08.000 --> 14:13.680
question I really want to answer is: with free-threading enabled, can it match

14:13.680 --> 14:19.360
multiprocess performance, or even exceed it? Either way, it will reduce the overhead

14:19.360 --> 14:29.120
that multiprocessing incurs. First, quite a simple benchmark. We'll be talking about CPU-bound APIs. What this

14:29.120 --> 14:37.040
really is: I basically took the do_work function and put it behind an API, and here we see

14:37.040 --> 14:44.080
similar characteristics. So we see the maximum RPS it can reach with the global interpreter lock:

14:44.080 --> 14:49.760
it's quite static across thread counts; it's just CPU work, so no parallelism to be found there,

14:49.760 --> 14:56.400
quite as expected. And then once we go to the free-threaded version, we see of course that

14:56.400 --> 15:02.000
we can scale much further, and here we basically see the same characteristics we saw earlier

15:02.000 --> 15:09.280
with the raw, normal Python execution. But to come back to my initial point about the people saying:

15:09.280 --> 15:15.040
oh, we don't need this anyway, we're just waiting for this slow API, we're just waiting for our

15:15.040 --> 15:23.280
database. Well, that's why I created more of a mixed workload. I simulated an API; I looked at

15:23.280 --> 15:31.280
some research, and usually most real production applications have around a 70/30 to 90/

15:31.280 --> 15:40.160
10 split between I/O waiting time and actual CPU work. CPU work here is often parsing, doing

15:40.160 --> 15:47.600
calculations, some of that type of logic. So we compared the different scaling strategies,

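A mixed-workload handler of this shape could be simulated like this (illustrative sleep and loop sizes I picked, not the talk's actual benchmark code):

```python
import time

def handle_request():
    # Simulated mixed endpoint: roughly a 70/30 split between I/O waiting
    # time and CPU work (parsing, small calculations), as in the benchmark.
    time.sleep(0.007)                             # pretend to await a database
    payload = sum(i * i for i in range(30_000))   # a bit of pure-Python CPU work
    return payload

start = time.perf_counter()
result = handle_request()
elapsed = time.perf_counter() - start
print(f"one request: {elapsed * 1000:.1f} ms, payload={result}")
```

Under the GIL, the sleeps of many such requests interleave fine, but the CPU slices all queue up on one core; that queueing is what the following graphs show.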
15:48.160 --> 15:54.960
and we varied the thread or worker count between 1 and 32 and saw how it went. The main

15:54.960 --> 16:00.640
measure I wanted to see here was: what is the maximum requests per second we can realistically serve,

16:01.600 --> 16:07.920
as this is what often matters most for API performance, and how efficiently can you run your

16:07.920 --> 16:16.000
API given a certain load. So this is where we start; this is the very baseline. We see

16:16.000 --> 16:21.120
Gunicorn with gthread workers, which is the basic threading implementation in Gunicorn.

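For reference, a thread-based Gunicorn setup like the baseline here is configured roughly like this (standard Gunicorn settings; the values are illustrative, not the talk's exact configuration):

```python
# gunicorn.conf.py -- illustrative settings, tune for your own service.
import multiprocessing

bind = "0.0.0.0:8000"
worker_class = "gthread"               # thread-based workers, as in the baseline
workers = multiprocessing.cpu_count()  # extra processes sidestep the GIL
threads = 8                            # threads per worker to overlap I/O waits
```

The `workers`/`threads` split is exactly the trade-off discussed above: threads share memory cheaply but contend on the GIL, workers scale CPU but multiply memory.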
16:21.760 --> 16:27.920
Initially we see that it is basically increasing linearly, until it clearly reaches some

16:28.240 --> 16:33.680
saturation point. If you compare that to Granian, we see basically the same characteristics,

16:33.680 --> 16:40.720
except Granian can go a bit further, because it does some of this network I/O stuff in the

16:40.720 --> 16:48.400
Rust world. Then you might ask yourself: how does this compare to an async framework like FastAPI

16:48.400 --> 16:54.560
or any other ASGI framework? Well, here we see that it basically caps at the exact same

16:54.880 --> 17:02.400
point, which is where the GIL gets saturated; at that point

17:02.400 --> 17:10.880
you're just always waiting for time on the CPU. To illustrate this a bit: let's say you have two

17:10.880 --> 17:17.040
threads, which is still in the linear scaling mode; you just have some I/O waits and some CPU time,

17:17.040 --> 17:23.600
but they can easily be interleaved and you don't really have any issues. But let's say you now have four threads;

17:24.640 --> 17:30.400
now you can see the actual CPU time getting quite overloaded, there is already some

17:30.400 --> 17:38.400
overlap. But once you go to 16 threads, you are basically all the time

17:38.400 --> 17:43.760
waiting for the GIL, and that's the saturation we see; adding more threads doesn't matter at this point.

17:43.760 --> 17:49.520
At this point you're just waiting for the GIL to do your real Python processing and handle the

17:49.600 --> 17:57.120
request. So how can we break through? Well, as we said, the traditional way to break through with these things

17:57.120 --> 18:06.960
is through multiprocessing. So once we add our workers, we see that we can scale quite a lot

18:06.960 --> 18:12.640
further. Once again, you can see it's also flattening off, which is similar to what we see

18:13.280 --> 18:20.160
when your actual cores get saturated. Now, this is where free-threading comes into the picture. This is

18:20.160 --> 18:28.400
where I really wanted to see: does free-threading actually follow this line? So first I tried it: I ran Gunicorn

18:28.400 --> 18:33.760
in free-threaded mode, and it gets very close to this line. The one thing you can clearly see

18:33.760 --> 18:40.160
is that it breaks through this barrier of, in this case, about 400 requests per second, which is where

18:40.240 --> 18:47.840
the GIL gives up, or at least becomes the main bottleneck of your API. Now if you compare

18:47.840 --> 18:54.400
this to Granian, you might think: oh, where is the purple line? Well, it's exactly on top of it. So

18:54.400 --> 19:01.200
actually it matched the full multiprocessing line perfectly, and this is how we really

19:01.200 --> 19:06.800
can see that right now, if you can run your application in free-threaded mode and you don't have any

19:07.120 --> 19:12.800
extensions that depend on the GIL still being there, you can do this: you can just run

19:12.800 --> 19:21.200
in free-threaded mode and it should just work. A little bonus I added, since I have not heard that

19:21.200 --> 19:27.200
much about it online; it's a bit small, but there's a library called gevent. It's something

19:27.200 --> 19:35.840
I personally find quite interesting: it uses the concept of greenlets to handle your I/O. So what is this?

19:36.160 --> 19:44.240
You can use it in a fully WSGI application, your Django for example, but it actually

19:44.240 --> 19:52.720
monkey-patches a lot of your I/O to get to a similar performance. So I think here this was just a WSGI

19:52.720 --> 20:02.400
server with one thread; I just enabled gevent and it was able to get to 93% of a full ASGI implementation,

20:02.480 --> 20:09.200
which is quite impressive, especially when you look at Django: sure, you can do async stuff,

20:09.200 --> 20:15.200
but going full async mode also has real downsides, and this gives you 93% of the

20:15.200 --> 20:22.560
advantages. I will give a note here: we did try this, and sometimes things like connection pools,

20:22.560 --> 20:27.840
like DB connection pools, break. Once again, please try it first locally, in a safe way, before you

20:34.640 --> 20:41.200
push it to production. So what are the things I wanted to give you?

20:41.200 --> 20:47.440
The global interpreter lock creates a scaling ceiling regardless of your server choice. You

20:47.440 --> 20:54.560
can have the top of the line, your Granian, very modern, but at some point your CPU will get

20:54.560 --> 21:01.680
saturated and every thread will be waiting for the lock. Async helps with I/O concurrency, but still can't

21:01.680 --> 21:07.920
escape the CPU-bound GIL limits. That's one thing I've seen mentioned a lot:

21:07.920 --> 21:14.000
just go async and you can scale to the moon. Well, real APIs do still have actual CPU work, and that's

21:14.000 --> 21:21.520
also what we see here. Free-threading enables this linear scaling, for example in this case on this machine

21:21.600 --> 21:25.920
it's the 2.5x improvement at 32 threads. But for I/O-heavy APIs, if you really are fully I/O-heavy

21:25.920 --> 21:34.000
and you don't want to go free-threaded yet, or you still depend on some libraries which don't

21:34.000 --> 21:38.560
support it, asyncio and gevent remain excellent choices. Also, it was quite interesting to see

21:38.560 --> 21:43.280
modern web servers like Granian that are leveraging these threads. If you're interested in

21:43.280 --> 21:50.320
Granian: we're currently running it in production, and Sentry, the monitoring company you

21:51.040 --> 21:57.040
might know, is also currently switching to it, which might be interesting. So, as conclusions:

21:57.040 --> 22:03.200
I think the GIL was the right trade-off in 1997, but the hardware and workloads have changed,

22:03.200 --> 22:09.680
so it's time to get rid of it. Free-threading delivers true parallelism without much

22:09.680 --> 22:16.320
memory overhead, with 2.5x scaling for mixed workloads in production-like environments. For new

22:16.720 --> 22:24.080
projects, consider free-threading. I know some libraries still have issues with it; I think

22:24.080 --> 22:30.480
uvloop, for example, still has some issues. But for existing

22:30.480 --> 22:35.280
projects that do depend on these, async and multiprocessing work well enough. But I do believe the

22:35.440 --> 22:49.440
future of Python performance is multi-threaded, so I encourage you to try it out and check whether

22:50.000 --> 22:51.120
Do we have any questions?

22:51.360 --> 23:05.120
For the third-party packages, are there any requirements for them to be updated to be able to work with the free-

23:05.120 --> 23:12.800
threading mode? So, pure Python packages should just work with it; I think it's mainly

23:12.800 --> 23:19.440
the compiled ones, the actual C extensions, that need to say: hey, we are actually

23:19.760 --> 23:27.120
compatible and we do allow this to be used in free-threaded mode. But usually when you install

23:27.120 --> 23:31.440
the packages you will see an error; if you can't install one, that's how you find out.

23:32.320 --> 23:49.840
Hey, just as how the GIL was, well, an implementation

23:49.920 --> 24:00.400
detail of CPython, is there anything governing the way that variables get

24:00.400 --> 24:05.120
allocated to threads for reference counting, or is that just an implementation detail

24:05.120 --> 24:10.640
that there is no control over, and Python, or CPython, just does its best to

24:10.640 --> 24:16.720
guess which thread is most likely to need fast versus slow access to a given variable?

24:17.680 --> 24:23.200
It's a very interesting question. I personally didn't do any research into that; I haven't seen

24:23.200 --> 24:28.000
a syntax where you can control in which thread it gets allocated. I think it's mainly about where

24:28.000 --> 24:33.200
it gets created, probably, but it's an interesting question; sadly, I don't have an answer for you.

24:38.400 --> 24:43.200
So, one question we got was: does enabling free-threaded mode require any more implementation on the

24:43.200 --> 24:49.040
code side, or is it as simple as changing your version, meaning all threaded packages

24:49.040 --> 24:54.240
will be free-threaded by default? Indeed, you don't need to make any changes on the code side.

24:54.800 --> 24:59.760
You need to install a new Python version, and when installing a new Python version

24:59.760 --> 25:05.120
you need to reinstall your packages anyway, and then it will automatically install the correct

25:05.120 --> 25:08.400
free-threaded version, if those packages allow for it, of course.

25:08.480 --> 25:16.320
Then we see: which high-profile Python modules are not working with the GIL disabled? What would

25:16.320 --> 25:24.640
prevent us from using 3.14t all the time? I don't have an exhaustive list, sadly. The one

25:24.640 --> 25:31.760
I personally encountered was, I think, uvloop, which had some issues, at least on 3.14, but it's

25:31.760 --> 25:37.680
really something you have to try out for yourself. I think most of the real high-profile ones, like

25:37.760 --> 25:44.000
NumPy, will definitely have evolved by now; it's more so the lesser-known ones, or those

25:44.000 --> 25:48.160
deeper down in the dependency graph, that will be affected.

25:50.400 --> 25:55.120
Is there a future where the "t" builds are not special but the default, and what timeline would this require?

25:56.480 --> 26:02.880
The answer is, I think, yes, but I don't remember the full timeline. I think in

26:03.840 --> 26:09.200
the PEP where they considered switching to free-threaded mode, they set out when it would

26:09.200 --> 26:15.200
become the full default, but you can look that up yourself. And then the last question:

26:15.840 --> 26:20.240
when you enable this, won't you directly run into race conditions with your own code if it's not

26:20.240 --> 26:27.520
explicitly protected against them? So, Python protects you in the Python code. The main issue is

26:28.160 --> 26:33.200
if you're writing C extensions, like NumPy, and really high-performance code; that's where

26:33.200 --> 26:39.120
it matters: then you do actually need to think about this and see whether it works with this new version

26:39.840 --> 26:44.880
as well. But as long as you're writing pure Python, you shouldn't need to care about this.

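If you do share mutable state between your own threads, the usual explicit protection looks like this (a generic example of mine, not tied to free-threading specifically; the same pattern is good practice on GIL builds too):

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # A read-modify-write like `counter += 1` is not atomic across
        # threads, so guard it explicitly instead of relying on the GIL.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```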
26:47.280 --> 26:48.880
thank you

26:49.440 --> 26:51.440
thank you again


