WEBVTT

00:00.000 --> 00:12.000
This is going to be a talk about using NixOS for

00:12.000 --> 00:16.000
deterministic distributed system benchmarking.

00:16.000 --> 00:19.000
And the speaker here is Bruce Gain.

00:19.000 --> 00:21.000
It's a really cool topic.

00:21.000 --> 00:24.000
I think there are a lot of applications

00:24.000 --> 00:27.000
Nix can be leveraged for, and this is a particularly

00:27.000 --> 00:31.000
interesting one, so take it away.

00:31.000 --> 00:32.000
Okay.

00:32.000 --> 00:37.000
A lot of applause for this speaker, please.

00:37.000 --> 00:39.000
Yeah, hi.

00:39.000 --> 00:42.000
As Martin just said, I'm Bruce Gain.

00:42.000 --> 00:45.000
I'm an analyst with a consulting firm called

00:45.000 --> 00:47.000
Revcom.

00:47.000 --> 00:51.000
And what we do among other things is benchmark testing.

00:51.000 --> 00:55.000
And we try to really drill down into the performance characteristics

00:55.000 --> 00:59.000
of certain software packages and tools.

00:59.000 --> 01:03.000
And we look at this on a runtime basis as much as possible.

01:03.000 --> 01:07.000
And as you can probably understand, that's quite difficult to do,

01:07.000 --> 01:12.000
especially when trying to benchmark commercial applications.

01:12.000 --> 01:16.000
And by the way, about myself, I'm a huge Linux advocate.

01:16.000 --> 01:22.000
I've been using Linux for over 25 years and I love open source.

01:22.000 --> 01:28.000
And we ran into a wall recently when really trying to gauge the runtime

01:28.000 --> 01:30.000
performance of certain applications,

01:30.000 --> 01:32.000
as I just mentioned.

01:32.000 --> 01:35.000
And we looked at a few things.

01:35.000 --> 01:37.000
A few alternatives to do that.

01:37.000 --> 01:39.000
And we saw Guix.

01:39.000 --> 01:40.000
I don't know.

01:40.000 --> 01:42.000
Is anybody here familiar with Guix?

01:42.000 --> 01:43.000
Guix?

01:43.000 --> 01:44.000
Yeah, cool.

01:44.000 --> 01:45.000
Yeah, great.

01:45.000 --> 01:46.000
I love Guix.

01:46.000 --> 01:49.000
But right now, we're not getting it to function as we'd like.

01:49.000 --> 01:51.000
It's really hard for us.

01:51.000 --> 01:55.000
So what we started out with was Nix.

01:55.000 --> 01:57.000
We're doing a lot of work with Nix now.

01:57.000 --> 02:01.000
And we were initially looking at trying to gauge the performance

02:01.000 --> 02:05.000
of ScyllaDB and Cassandra.

02:05.000 --> 02:08.000
You know, these database applications and platforms.

02:08.000 --> 02:09.000
Who's here?

02:09.000 --> 02:12.000
Who here is familiar with ScyllaDB?

02:12.000 --> 02:13.000
That's a database.

02:13.000 --> 02:15.000
OK, Cassandra, probably.

02:15.000 --> 02:16.000
Yeah, Cassandra.

02:16.000 --> 02:18.000
Yeah, almost everybody knows Cassandra, right?

02:18.000 --> 02:26.000
So it became notoriously hard, for reasons

02:26.000 --> 02:28.000
that might be obvious,

02:28.000 --> 02:36.000
to compare ScyllaDB versus Cassandra for performance based on the different

02:36.000 --> 02:42.000
benchmarks for that, latency, et cetera.

02:42.000 --> 02:46.000
It was just very skewed.

02:46.000 --> 02:48.000
That's not the right word.

02:48.000 --> 02:52.000
But as far as the performance benchmarks of ScyllaDB go,

02:52.000 --> 02:56.000
those on file show the performance of ScyllaDB,

02:56.000 --> 03:00.000
according to those benchmarks, at like 2 to 3x.

03:00.000 --> 03:03.000
And we'll get to the specific benchmarks later.

03:03.000 --> 03:07.000
That's compared to Cassandra.

03:07.000 --> 03:12.000
So what we wanted to do is just compare the two in a more apples-to-apples

03:12.000 --> 03:15.000
way on Nix, or sorry, excuse me,

03:15.000 --> 03:18.000
Cassandra versus ScyllaDB.

03:18.000 --> 03:23.000
And it's difficult because right now we don't have access

03:23.000 --> 03:29.000
to a ScyllaDB package to put on Nix.

03:29.000 --> 03:33.000
So we're trying to get the ScyllaDB folks to help us with that.

03:33.000 --> 03:36.000
So hopefully we'll be able to do that one day soon.

03:36.000 --> 03:39.000
But in the meantime, we were looking at how to do this, how to gauge

03:39.000 --> 03:42.000
the benchmarks with Cassandra.

03:42.000 --> 03:46.000
And we considered this with, for example,

03:46.000 --> 03:49.000
Docker. Oh yeah, the other issue is we wanted to share our work,

03:49.000 --> 03:53.000
to share our benchmarks so you can reproduce those and see for yourself.

03:53.000 --> 04:00.000
The problem with Docker... I love Docker as much as everybody else.

04:00.000 --> 04:02.000
We use it every day.

04:02.000 --> 04:06.000
But reproducing that with Docker is problematic.

04:07.000 --> 04:09.000
It's not accurate, as you know.

04:09.000 --> 04:15.000
The performance of, say, Cassandra differs according to the,

04:15.000 --> 04:18.000
you know, the runtime, the operating system, et cetera.

04:18.000 --> 04:24.000
I mean, you just can't directly replicate those environments

04:24.000 --> 04:25.000
with Docker.

04:25.000 --> 04:26.000
We know that.

04:26.000 --> 04:28.000
So Nix, I think, as everybody

04:28.000 --> 04:33.000
here knows, one of the beautiful things that Nix does

04:34.000 --> 04:36.000
is that reproducibility aspect.

04:36.000 --> 04:40.000
And I find it quite amazing actually.

04:40.000 --> 04:43.000
So yeah, here are the benchmarks I was referring to.

04:43.000 --> 04:45.000
Cassandra versus ScyllaDB.

04:45.000 --> 04:48.000
These are provided by ScyllaDB.

04:48.000 --> 04:51.000
Now, you know, these different, you know,

04:51.000 --> 04:56.000
latency, et cetera, the read/write frequency, et cetera.

04:56.000 --> 05:00.000
Yeah, Cassandra gets killed.

05:00.000 --> 05:03.000
But, you know, this is probably.

05:03.000 --> 05:04.000
Yeah, I mean, I don't know.

05:04.000 --> 05:06.000
I would like to put this on Nix.

05:06.000 --> 05:08.000
That's what I want to do.

05:08.000 --> 05:10.000
And hopefully we can make that happen.

05:10.000 --> 05:12.000
Make that comparison happen.

05:12.000 --> 05:14.000
And going back to the Docker issue, you know,

05:14.000 --> 05:19.000
the leaky abstractions, the containers on the host, et cetera.

05:19.000 --> 05:22.000
You know, those are, you know,

05:22.000 --> 05:25.000
and Docker is just not... we're often told,

05:25.000 --> 05:26.000
okay, just look at Docker.

05:26.000 --> 05:27.000
That's how we share our work.

05:27.000 --> 05:29.000
We'll put it in a Docker container.

05:30.000 --> 05:32.000
No, that's not really.

05:32.000 --> 05:35.000
That doesn't work as far as reproducibility goes.

05:35.000 --> 05:38.000
You know, for a number of different reasons.

05:38.000 --> 05:41.000
I mean, it's even contingent on the, you know, operating system.

05:41.000 --> 05:44.000
Of course, your laptop, whatever.

05:44.000 --> 05:47.000
Docker, for, you know,

05:47.000 --> 05:50.000
reproducing results or reproducing environments,

05:50.000 --> 05:53.000
is just not cut out for that.

05:53.000 --> 05:58.000
So, again, that's what I was just mentioning

05:58.000 --> 06:02.000
at the beginning: that, you know, we were frustrated

06:02.000 --> 06:03.000
in our journey.

06:03.000 --> 06:06.000
I hate that word, but for lack of a better word,

06:06.000 --> 06:08.000
our journey to figure out, you know,

06:08.000 --> 06:11.000
how we are going to accurately, you know, compare

06:11.000 --> 06:15.000
not just Cassandra and ScyllaDB,

06:15.000 --> 06:19.000
but, you know, the runtime performance of other, different applications.

06:19.000 --> 06:23.000
And, you know, for right now, it just looks like Nix

06:23.000 --> 06:24.000
is the way to go for that.

06:24.000 --> 06:26.000
If anybody has any alternatives,

06:26.000 --> 06:29.000
I don't know, Guix shows promise, but right now,

06:29.000 --> 06:31.000
I can't think of anything better than Nix.

06:31.000 --> 06:34.000
And it's pretty fun to set up.

06:34.000 --> 06:37.000
I mean, but we'll get to that in a second.

06:37.000 --> 06:43.000
I think most people are probably familiar already

06:43.000 --> 06:47.000
with, you know, the Nix functionality,

06:47.000 --> 06:49.000
you know, how that works.

06:49.000 --> 06:53.000
You know, I've learned recently that, oh, yeah,

06:53.000 --> 06:57.000
sorry, to go back to one thing about Nix

06:57.000 --> 07:01.000
I've learned recently: Debian, actually,

07:01.000 --> 07:06.000
for a while, was trying to get over that reproducibility hump

07:06.000 --> 07:10.000
with, you know, with Docker, you know,

07:10.000 --> 07:12.000
to solve that Docker issue with that, quote,

07:12.000 --> 07:13.000
drift.

07:13.000 --> 07:15.000
It's another word I hate using, but, you know,

07:15.000 --> 07:18.000
adding, you know, reproducibility to something like

07:18.000 --> 07:20.000
Docker, and Debian stopped that.

07:20.000 --> 07:22.000
In fact, they started using Nix.

07:22.000 --> 07:23.000
I learned that recently.

07:23.000 --> 07:24.000
I thought it was interesting.

07:24.000 --> 07:26.000
Everybody here knows Debian Linux?

07:26.000 --> 07:27.000
Yep.

07:27.000 --> 07:28.000
Yeah.

07:28.000 --> 07:29.000
I don't know.

07:29.000 --> 07:34.000
Anyway, so, going back to Nix, you know,

07:34.000 --> 07:37.000
the reproducibility, which I found was fascinating,

07:37.000 --> 07:40.000
it was, you know, on the level with the output,

07:40.000 --> 07:43.000
it was the hash functionality of, you know,

07:43.000 --> 07:47.000
the flake configuration, F-L-A-K-E,

07:48.000 --> 07:53.000
and that, for me, just on a computational level,

07:53.000 --> 07:57.000
it's fascinating, I thought, because that,

07:57.000 --> 08:02.000
the reproducibility hinges on that hash functionality,

08:02.000 --> 08:08.000
where if that hash sees one single digit in the code

08:08.000 --> 08:13.000
that is different from the build, it will not function.

08:13.000 --> 08:14.000
It just stops.

08:14.000 --> 08:16.000
It just will not read that.

08:16.000 --> 08:20.000
And so, that's what I thought was, you know,

08:20.000 --> 08:22.000
a computationally interesting

08:22.000 --> 08:25.000
main aspect of Nix, which I found fascinating.

08:25.000 --> 08:31.000
I guess you could call it the power of hash, if you'd like.
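
NOTE
Editor's sketch, not shown in the talk: the point that a single changed character invalidates the hash can be illustrated in a few lines of Python. This is not how Nix computes flake or store hashes. Plain SHA-256 over length-prefixed strings is an assumption chosen for illustration, and `fingerprint` and `verify` are hypothetical helper names.
```python
import hashlib
def fingerprint(inputs):
    """Return a SHA-256 hex digest over an ordered list of input strings."""
    digest = hashlib.sha256()
    for text in inputs:
        data = text.encode("utf-8")
        # Length-prefix each input so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
def verify(inputs, pinned):
    """Return True if the inputs match the pinned fingerprint, else raise."""
    actual = fingerprint(inputs)
    if actual != pinned:
        raise ValueError(f"hash mismatch: expected {pinned}, got {actual}")
    return True
```
Pinning a fingerprint once and calling `verify` before each run mimics the idea described here: any changed input is rejected rather than silently accepted.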

08:31.000 --> 08:36.000
So, so again, you know, we put this through, you know,

08:36.000 --> 08:39.000
set this up with, well, an engineer did it at first,

08:39.000 --> 08:42.000
one of the engineers on our team,

08:42.000 --> 08:45.000
who's credited at the end of this; he's in the US.

08:45.000 --> 08:49.000
And, you know, this configuration

08:49.000 --> 08:51.000
involved, you know, Cassandra, as I mentioned. You know,

08:51.000 --> 08:56.000
we loaded up Nix, we, you know, looked at,

08:56.000 --> 08:58.000
we pulled it off GitHub, you know,

08:58.000 --> 09:00.000
we did the standard thing of putting Nix onto the machine,

09:00.000 --> 09:04.000
getting that flake file and integrating the package,

09:04.000 --> 09:08.000
the, um, the Cassandra package with Nix.

09:08.000 --> 09:12.000
And that proved a little difficult sometimes.

09:12.000 --> 09:18.000
You know, the, um, sometimes the, you know,

09:18.000 --> 09:22.000
it was at one point, it said that we,

09:22.000 --> 09:25.000
if I remember correctly, the cache was not

09:25.000 --> 09:28.000
configured correctly, or Java was reading

09:28.000 --> 09:31.000
into the wrong file or the wrong place.

09:31.000 --> 09:35.000
And, um, it kept failing.

09:35.000 --> 09:39.000
So, um, I had to look and dig into the documentation;

09:39.000 --> 09:42.000
that took a few hours, just that one part.

09:42.000 --> 09:45.000
But we figured it out, um, and just to be

09:45.000 --> 09:48.000
quite candid, I just cut and pasted from the documentation

09:48.000 --> 09:49.000
and it works now.

09:49.000 --> 09:50.000
All right.

09:50.000 --> 09:51.000
I hope it will

09:51.000 --> 09:52.000
when I do the demo.

09:52.000 --> 09:54.000
So again, going back to the, you know,

09:54.000 --> 09:58.000
summary of the, um, you know,

09:58.000 --> 10:00.000
you know, for, you know, how Nix works with

10:00.000 --> 10:02.000
Cassandra particularly, or just in general,

10:02.000 --> 10:05.000
you know, you have to take the hash, um,

10:05.000 --> 10:07.000
it's calculating, you know, ensuring that the,

10:07.000 --> 10:11.000
the code, um, has not changed at all.

10:11.000 --> 10:14.000
It's immutable, as the term goes, uh,

10:14.000 --> 10:15.000
it checks everything.

10:15.000 --> 10:17.000
You can put, you know, thoughts, say, some fridge.

10:17.000 --> 10:20.000
Download the, the pre-built environment,

10:20.000 --> 10:22.000
which I did once.

10:22.000 --> 10:25.000
So at the end of this, I'll show you the GitHub link.

10:25.000 --> 10:29.000
You should just be able to clone the GitHub repository

10:30.000 --> 10:31.000
and run this benchmark.

10:31.000 --> 10:33.000
Uh, it'll probably save you a lot of time;

10:33.000 --> 10:35.000
according to my engineer,

10:35.000 --> 10:37.000
he spent 10 hours getting this set up.

10:37.000 --> 10:39.000
So that way you can just do that.

10:39.000 --> 10:41.000
Um, I hope.

10:41.000 --> 10:43.000
Let me know if it doesn't work.

10:43.000 --> 10:45.000
Or put in a pull request on GitHub,

10:45.000 --> 10:46.000
if you like.

10:46.000 --> 10:48.000
So anyway, going back to this workflow, uh,

10:48.000 --> 10:50.000
you know, you download the environment,

10:50.000 --> 10:53.000
and you spin it up and you start looking at your benchmark.

10:53.000 --> 10:55.000
Uh, that's, that's essentially it.

10:55.000 --> 10:58.000
Um, you know, until now,

10:58.000 --> 11:00.000
let's see if anyone has any questions right now.

11:00.000 --> 11:02.000
Nope.

11:02.000 --> 11:03.000
Nope.

11:03.000 --> 11:04.000
Okay.

11:04.000 --> 11:05.000
Great.

11:05.000 --> 11:06.000
Okay.

11:06.000 --> 11:08.000
I see how this works.

12:12.000 --> 12:13.000
So, that's like what I'm getting.

12:13.000 --> 12:14.000
I see you.

12:14.000 --> 12:15.000
Okay.

12:15.000 --> 12:16.000
I apologize.

12:16.000 --> 12:18.000
We had a problem with that cable.

12:18.000 --> 12:19.000
Yeah.

12:19.000 --> 12:21.000
Is well, the old cable.

12:21.000 --> 12:25.000
I'm not sure what we can do about it.

12:26.000 --> 12:28.000
Well, excuse me, sir.

12:39.000 --> 12:41.000
I apologize.

12:41.000 --> 12:42.000
Okay.

12:42.000 --> 12:44.000
Actually we already had this one cable.

12:44.000 --> 12:45.000
Okay.

12:45.000 --> 12:46.000
Okay.

12:50.000 --> 12:51.000
Okay.

12:51.000 --> 12:52.000
Yeah.

12:52.000 --> 12:53.000
Yeah.

12:53.000 --> 12:54.000
That seems to be a bit flaky.

12:54.000 --> 12:58.000
Did it start working when I was pulling it or when it was...?

13:04.000 --> 13:06.000
I don't have a different screen.

13:07.000 --> 13:09.000
Oh yeah.

13:24.000 --> 13:50.000
I understand the impulse to, you know, chat a bit, but let's not lean into that too much, because the room gets really, you know, a bit too energetic.

13:51.000 --> 13:52.000
Can I go?

14:04.000 --> 14:05.000
Okay.

14:05.000 --> 14:06.000
Sorry.

14:06.000 --> 14:09.000
So we got that figured out.

14:12.000 --> 14:14.000
Alright, everybody, please quiet down.

14:14.000 --> 14:18.000
We're continuing after just a bit of technical difficulty.

14:19.000 --> 14:21.000
Okay, now it's not going to work.

14:21.000 --> 14:22.000
It's not going to work.

14:24.000 --> 14:29.000
Anyway, so if you pull this down, I'm assuming you have Python.

14:29.000 --> 14:31.000
You have Java.

14:31.000 --> 14:34.000
You have what's necessary, you know, to do this.

14:34.000 --> 14:37.000
Just pull it off GitHub, and I'll show you the link at the end of this.

14:37.000 --> 14:40.000
And once you're in, you just go into the directory

14:40.000 --> 14:45.000
you cloned into, and it'll just, hopefully, work.

14:49.000 --> 14:50.000
Alright.

14:57.000 --> 15:00.000
Just takes like 20, 15, 20 seconds.

15:09.000 --> 15:11.000
Anybody have any questions so far?

15:11.000 --> 15:12.000
Nope.

15:12.000 --> 15:19.000
It's always great to have talks with live demos.

15:19.000 --> 15:21.000
I'm sorry.

15:21.000 --> 15:26.000
It's always great to have talks where you really see something happening, mate.

15:28.000 --> 15:30.000
Tick, tick, tick, tick.

15:30.000 --> 15:33.000
Alright, there we go.

15:33.000 --> 15:36.000
Okay, let's run the benchmark.

15:43.000 --> 15:48.000
Here we go.

15:48.000 --> 15:49.000
That's it.

15:49.000 --> 15:51.000
We got our benchmarks.

15:51.000 --> 15:55.000
Yep.

15:55.000 --> 15:58.000
That's not the easy part actually.

15:58.000 --> 15:59.000
There are two things.

15:59.000 --> 16:04.000
If you look at the statistics, so I mean, if you look at the other benchmarks we did,

16:04.000 --> 16:07.000
there's a variation of 20 to 30%.

16:08.000 --> 16:14.000
So you would say we failed, but we didn't, because this is running on my laptop.

16:14.000 --> 16:20.000
So as this is scaled, if we scaled to thousands of, you know,

16:20.000 --> 16:26.000
to thousands of X, that variation would be maybe one to two percent.

16:26.000 --> 16:31.000
So given the fact I'm doing this on my laptop, there are a lot of parasitics, or parasites,

16:31.000 --> 16:36.000
which would contribute to that 20 to 30% difference between the different benchmarks we ran.
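
NOTE
Editor's sketch, not shown in the talk: the 20 to 30% run-to-run variation mentioned here can be quantified in more than one way, and the talk does not say which measure was used. A minimal Python sketch, assuming the runs are available as a list of positive throughput or latency numbers. `relative_spread` and `coefficient_of_variation` are hypothetical helper names.
```python
import statistics
def relative_spread(samples):
    """Return (max - min) / mean for a list of positive measurements."""
    if len(samples) < 2:
        raise ValueError("need at least two benchmark runs")
    return (max(samples) - min(samples)) / statistics.fmean(samples)
def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two benchmark runs")
    return statistics.stdev(samples) / statistics.fmean(samples)
```
Reporting either number alongside the raw results makes it easier to compare a noisy laptop run against a run on dedicated hardware.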

16:37.000 --> 16:39.000
So that's it.

16:39.000 --> 16:43.000
And then what I found particularly interesting too,

16:46.000 --> 16:51.000
is that we just, it's very, you know,

16:51.000 --> 16:57.000
stateless, I don't think that's the right word, but we just, you know,

16:57.000 --> 17:00.000
kill it all and start over.

17:00.000 --> 17:04.000
That's it.

17:04.000 --> 17:05.000
It's done.

17:05.000 --> 17:08.000
It's gone.

17:08.000 --> 17:15.000
That's, that's out of the hard part.

17:15.000 --> 17:25.000
So hopefully my slide will be working again.

17:26.000 --> 17:29.000
Okay, so here's the shout outs to, you know,

17:29.000 --> 17:33.000
here's, if you want to, if you want to use the, you know, do this.

17:33.000 --> 17:37.000
Again, the easy part should be just pulling this off GitHub and cloning it.

17:37.000 --> 17:41.000
You can start doing your benchmarks on Cassandra.

17:41.000 --> 17:42.000
Hmm.

17:42.000 --> 17:45.000
The, you know, the acknowledgements: obviously Nix, NixOS.

17:45.000 --> 17:48.000
And then Shaiid Khan, he's based in the US, he was working for us.

17:48.000 --> 17:50.000
Now he's working for Deloitte.

17:50.000 --> 17:53.000
And that's not so great, but anyway.

17:53.000 --> 17:56.000
And come give us a shout if you like.

17:56.000 --> 17:58.000
We love doing science and testing.

17:58.000 --> 17:59.000
That's what we like to do.

17:59.000 --> 18:00.000
Like to do.

18:00.000 --> 18:01.000
Thank you.

18:01.000 --> 18:03.000
Thank you very much, Bruce Gain.

