Async, Sync, and Node.js streams

This video addresses how to make your own Node.js stream without a file and without an HTTP stream. It also talks about consuming a Node.js stream via HTTP.

The goal is to compare and contrast using plain asynchronous promises with using Node.js streams for large amounts of data.

Transcript:

Okay, here we are. Jordan here. I’ve just been tinkering a little bit with some streams. I didn’t know how to use Node.js streams very well, so I went through them and kind of played around. Now, my goal with this was that I wanted to be able to send a bulk amount of stuff to a function.

And then, as that function completed each piece, I wanted it to handle it and send it back to me. So the function is over there, and as that stuff came through, it would send it out. Okay. Now, my conclusion is that if I am trying to do that, it’s better just to use promises and handle them asynchronously, and I’ll show how to do that.

So we’re going to go over async and synchronous promises and then streams. Okay. So here we go. This is the problem, right? Let’s say we have this function over here. Let me get that even bigger so it’s easy to read. Okay. Now, if I run this right here, what we have is that we handle this individual, and this is something that finishes at a random time, right?

So it’s going to make some random number and it’s going to wait between half a second and 10 seconds. It’s going to go through here and it’s going to log this stuff out. See, it finished number zero, and that took six seconds. The second one only took two seconds. There we go. See what I’m saying?
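The in-order behavior described here can be sketched roughly as follows. The code from the video is not shown in the transcript, so `handleIndividual` and `runSequentially` are hypothetical names, and the delay parameters are assumptions that mirror the half-second-to-ten-second range mentioned above.

```javascript
// Hypothetical stand-in for the function in the video: resolves after a
// random delay between minMs and maxMs (the video uses roughly 0.5 s to 10 s).
function handleIndividual(index, minMs = 500, maxMs = 10000) {
  const delay = minMs + Math.floor(Math.random() * (maxMs - minMs));
  return new Promise((resolve) => {
    setTimeout(() => resolve({ index, delay }), delay);
  });
}

// Sequential version: each item waits for the previous one to finish.
async function runSequentially(count, minMs, maxMs) {
  const finished = [];
  for (let i = 0; i < count; i++) {
    const result = await handleIndividual(i, minMs, maxMs);
    console.log(`Finished number ${result.index} after ${result.delay} ms`);
    finished.push(result.index);
  }
  return finished; // always in index order, because nothing overlaps
}

// Short delays here so the sketch finishes quickly.
runSequentially(5, 10, 50);
```

Because every `await` blocks the loop, the total time is the sum of all the individual delays.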

So this is going through in order. If we had 50 of these, this could possibly take a long time. Now watch the difference when we handle these asynchronously. This one is going in order, right? We do an await on each one. Now, if I go over here and do this asynchronously, all it’s going to do is feed each one in and then handle it in the .then, which means it’s not blocking the rest of them.

Now, watch, this is cool. I like this. This is kind of fun stuff right here.

Okay. They’re all finishing as they complete. So now pretty much the longest you’re going to wait is about 10 seconds, because that’s our longest one. See: 9, 17, 32, 25, 29, 0, 19. See how that’s just going through, and as they complete, you do your work? And I kind of feel like this is the better way to do it, with a few exceptions.
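A minimal sketch of the non-blocking version, under the same assumptions as before: `handleIndividual` and `runConcurrently` are hypothetical names, not the code from the video. The `Promise.all` at the end is an addition that is only there so the caller can tell when everything has completed.

```javascript
// Hypothetical stand-in for the function in the video: resolves after a
// random delay between minMs and maxMs.
function handleIndividual(index, minMs = 500, maxMs = 10000) {
  const delay = minMs + Math.floor(Math.random() * (maxMs - minMs));
  return new Promise((resolve) => {
    setTimeout(() => resolve({ index, delay }), delay);
  });
}

// Start every item right away and do the work in .then as each one completes.
function runConcurrently(count, minMs, maxMs) {
  const completionOrder = [];
  const pending = [];
  for (let i = 0; i < count; i++) {
    const promise = handleIndividual(i, minMs, maxMs).then((result) => {
      console.log(`Finished number ${result.index} after ${result.delay} ms`);
      completionOrder.push(result.index);
    });
    pending.push(promise);
  }
  // Promise.all is used only to know when the last item has finished.
  return Promise.all(pending).then(() => completionOrder);
}

// Short delays here so the sketch finishes quickly.
runConcurrently(5, 10, 50).then((order) => {
  console.log('Completion order:', order);
});
```

Here the total time is roughly the longest single delay rather than the sum, and the completion order varies from run to run.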

For handling a large amount of data, like a big array of data, this I think is the best way to do it. Now let’s go through streams and talk a little bit about how they work. With a stream, the main goal is to handle a large amount of data. And so I was like, oh, this is going to be perfect. But really what it’s for is one large chunk of data, too much to load into memory. As in, if you have an array that’s so big it can’t even fit, or a file that’s like a gigabyte, then you probably want to stream it, because otherwise it’s just too much for your memory to handle.
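To illustrate the memory point, here is a small sketch that reads a file chunk by chunk with Node's built-in `fs.createReadStream`. The `countBytes` helper and the temporary demo file are assumptions for illustration. A real case would be a file much larger than available memory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Count the bytes in a file without ever holding the whole file in memory.
function countBytes(filePath) {
  return new Promise((resolve, reject) => {
    let total = 0;
    fs.createReadStream(filePath)
      .on('data', (chunk) => { total += chunk.length; }) // one chunk at a time
      .on('end', () => resolve(total))
      .on('error', reject);
  });
}

// Demo on a small temporary file (hypothetical path, created just for this).
const demoPath = path.join(os.tmpdir(), `stream-demo-${process.pid}.txt`);
fs.writeFileSync(demoPath, 'x'.repeat(200000));
countBytes(demoPath).then((total) => {
  console.log(`Read ${total} bytes in chunks`);
  fs.unlinkSync(demoPath);
});
```

Only one chunk is held at a time, so memory use stays flat no matter how big the file is.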

So it’s really to protect your memory. If you’ve got like 10,000 or 20,000 records, that’s a lot to go through, but it’s not too much to load into memory, so let’s just handle them asynchronously and they can go at the same time. But anyway, here’s how to do it. This took me a little bit to figure out: how to build a stream based on just an array. I found some Stack Overflow stuff, whatever.

So here it is. We call this function stream stuff. We come over here and create a readable. And in this readable stream, what we’re going to do is move through the array, and as we go through, we push the data into the stream. So we do the same thing where we handle the individual, and we do it asynchronously because we want that same behavior where it’s random which one goes in at what time.

And that’s because we don’t want to stop. We want to be able to return the stream immediately. We don’t want it to wait for all of this to finish. We want the stream to be returned out here so that we can start watching for events. Once we have the stream, we can watch for specific events, and the main ones are data, end, and close. I don’t know the difference between end and close.
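The stream described above might look something like this sketch. The names `streamStuff` and `handleIndividual` follow the transcript's wording, but the bodies are assumptions, as is the use of `objectMode` so that plain objects can be pushed.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for the function in the video: resolves after a
// random delay between minMs and maxMs.
function handleIndividual(index, minMs = 500, maxMs = 10000) {
  const delay = minMs + Math.floor(Math.random() * (maxMs - minMs));
  return new Promise((resolve) => {
    setTimeout(() => resolve({ index, delay }), delay);
  });
}

// Build a readable stream from an array and return it immediately.
function streamStuff(items, minMs, maxMs) {
  const stream = new Readable({
    objectMode: true, // assumption: lets us push objects instead of buffers
    read() {}         // no-op: data is pushed from the promise callbacks below
  });
  items.forEach((item, index) => {
    // Not awaited, so the loop does not block and results arrive in random order.
    handleIndividual(index, minMs, maxMs).then((result) => {
      stream.push({ item, ...result });
    });
  });
  return stream; // returned before any item has finished
}

// Short delays here so the sketch finishes quickly.
streamStuff(['a', 'b', 'c'], 10, 50).on('data', (data) => {
  console.log('data event:', data);
});
```

This version never signals that it is finished, which is the open question raised a little further on.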

So far as I can tell, they kind of happen at the same time, and it’s when you push null in there, but we can talk about that in a second. I don’t want the video to be too long. So it’s going to go through here. See, the first one finished at eight. Oh, sorry, that was the second one that finished.

There’s the first, see? So it did it just the same kind of way, but it was a lot more complicated than just handling it the other way. The cool thing is we can return this stream, but I don’t know how we know it has finished, because if we put this in here, it will push immediately, before these finish.

And we can’t wait for all of these to finish, because then the stream won’t return and we can’t subscribe to the events. Now again, I think the main purpose of streaming is to handle a huge amount of data that you don’t want to load into your memory all at once. That’s when you’d want to use this.
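One possible way around the "how do we know it finished" problem, offered as a sketch rather than what the video does: collect the promises, return the stream right away, and push `null` only after `Promise.all` resolves. It also shows the two events side by side. In Node.js, `end` fires when there is no more data to consume, and `close` fires afterwards, once the stream has been destroyed. `streamStuffWithEnd` is a hypothetical name.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for the function in the video.
function handleIndividual(index, minMs = 500, maxMs = 10000) {
  const delay = minMs + Math.floor(Math.random() * (maxMs - minMs));
  return new Promise((resolve) => {
    setTimeout(() => resolve({ index, delay }), delay);
  });
}

function streamStuffWithEnd(items, minMs, maxMs) {
  const stream = new Readable({ objectMode: true, read() {} });
  const pending = items.map((item, index) =>
    handleIndividual(index, minMs, maxMs).then((result) => {
      stream.push({ item, ...result });
    })
  );
  // Not awaited, so the stream is still returned immediately.
  // Pushing null tells consumers that no more data is coming.
  Promise.all(pending).then(() => stream.push(null));
  return stream;
}

// Short delays here so the sketch finishes quickly.
const demoStream = streamStuffWithEnd(['a', 'b', 'c'], 10, 50);
demoStream.on('data', (data) => console.log('data:', data));
demoStream.on('end', () => console.log('end: no more data to read'));
demoStream.on('close', () => console.log('close: stream destroyed'));
```

The caller can subscribe to events immediately and still gets a reliable finished signal from `end`.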

And I have used them before for that, and it worked really well. It just wasn’t right for the use I had, which was, oh, we can just have this function that’ll send stuff back as it finishes. And a stream can do that, but why don’t I, in the function, just handle this asynchronously like this and handle the data inside that?

Okay. That’s the thing. I just think there’s a better way to do it. Now, while I was here, though, I did go through and test an HTTP stream, because I thought that was kind of a cool idea. I was like, okay, well, how does that work? Even in my mind, I didn’t really know how to picture that. Like, what does that look like from a server to a client? And so I kind of built a little Express app right here. I’m going to boot it up. We’re going to start it here. It’s an Express app that’s just waiting, listening on a route. And when it goes right here, it, uh, it broke.

Let’s try it again. Maybe. I don’t know why that broke.

No. I’m going to give it another port, I guess. Please work. It worked a bunch of times before it broke again. Okay.

2001. This worked like crazy yesterday. I’m not sure why. I left it on all night. Maybe some connection didn’t close.

Okay. Dang it. This Rosebank video though.

I don’t want to debug this right now. Okay. Well, you’ll just have to trust me. So let’s say we have our server hosted here. In fact, I have examples over here. Okay. So we have our Express server hosted here and you make a call to it. So we call test HTTP stream, which comes over here, and in here we send the request. But look at what it’s returning here.

I used fetch here in this case, and fetch by default returns a stream. It says response, but it’s really a stream. So you go to response.body, and this is the stream there, a Node.js readable stream. And then we can do the same thing: we can say on data. And on the server, what it’s returning is response.write, right? That’s writing to the stream, and then it just ends at the end. We call response.end here, and it ends right there. So what happens is that we automatically get the stream, and as the server writes through, it’ll wait a second between each one of these and it’ll run through.

So it would loop 500 times, run through here, and the data would just be logged out over here. Okay. So essentially it connected, it made a tunnel, and then it just keeps sending data through this thing. Now that’s cool. But you still have the problem that HTTP requests are going to time out, especially if you’re on AWS API Gateway, where you have 29 seconds before time’s up. And so you can’t handle a piece of data that takes more than 29 seconds to get to you. So if it’s huge, it’s still not the right way to do it. You’ve got to do it a different way. It’s kind of cool. I think it’s a cool idea if you control both sides. I mean, in theory you could maybe have it last forever, but having a connection open that long is a scary thing to me.
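The server-to-client flow described above can be sketched without any dependencies. This is not the code from the video: it swaps in Node's built-in `http` module for both Express and fetch so it runs without installing anything. The `write` and `end` calls are the same ones an Express response exposes. `startServer`, `consume`, and `demo` are hypothetical names, and the chunk count and delay are shortened from the 500 writes at one-second intervals mentioned in the video.

```javascript
const http = require('http');

// Server: writes one chunk at a time, pausing between writes, then ends.
function startServer(chunks, delayMs) {
  const server = http.createServer(async (req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    for (let i = 0; i < chunks; i++) {
      res.write(`chunk ${i}\n`); // sent to the client right away
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    res.end(); // closes the response, which triggers 'end' on the client
  });
  // Port 0 asks the OS for any free port.
  return new Promise((resolve) => {
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

// Client: the response object is a Node.js readable stream.
function consume(port) {
  return new Promise((resolve, reject) => {
    const options = { host: '127.0.0.1', port, path: '/', agent: false };
    http.get(options, (response) => {
      let body = '';
      response.setEncoding('utf8');
      response.on('data', (piece) => {
        console.log('received:', piece.trim());
        body += piece;
      });
      response.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

async function demo(chunks = 5, delayMs = 20) {
  const server = await startServer(chunks, delayMs);
  try {
    return await consume(server.address().port);
  } finally {
    server.close();
  }
}

demo();
```

The connection stays open for the whole loop, so the total run time has to fit inside whatever timeout sits between the client and the server.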

Now, if you use something like a socket, it’s going to handle that and take care of it. But anyway, so there’s streams. This is the cool part to me. I didn’t know how to create my own stream like this. For some reason, you have to create your own read method. Otherwise it throws some error. But after that, as soon as you push, this causes the data event, and then whoever’s subscribed to the stream like this, see, they’re listening for the data event, it’ll just pop out.
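A small sketch of the read method requirement. In recent Node.js versions, a `Readable` created without a `read` implementation reports an error with the code `ERR_METHOD_NOT_IMPLEMENTED` once something starts consuming it. Supplying an empty `read` is enough when the data is pushed from elsewhere. The helper names are hypothetical.

```javascript
const { Readable } = require('stream');

// Without read(): subscribing to 'data' makes the stream try to read,
// which surfaces as an 'error' event on the stream.
function tryStreamWithoutRead() {
  return new Promise((resolve) => {
    const broken = new Readable({ objectMode: true }); // no read() supplied
    broken.on('error', (err) => resolve(err));
    broken.on('data', () => {}); // subscribing starts the stream flowing
  });
}

// With a no-op read(): pushing data triggers the 'data' event for subscribers.
function makePushStream() {
  return new Readable({ objectMode: true, read() {} });
}

tryStreamWithoutRead().then((err) => {
  console.log('Without read():', err.code);
});

const working = makePushStream();
working.on('data', (data) => console.log('With read():', data));
working.push('hello');
```

The empty `read` works here because this stream is push-driven. A stream that pulls from a source on demand would do its fetching inside `read` instead.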

But still, I think if you’re handling a large array of data that’s not too big to fit in your memory, you just go asynchronously and let them complete. You just execute whatever you want to do in the .then instead of using await. So that’s it: async, sync, and streams. Done.