2 Beginner Puppeteer waitFors to Avoid

The worst part about Puppeteer is handling the timing.

Here I list some Puppeteer timing functions that I love and some that I hate (but still sometimes have to use).

Puppeteer docs – https://pptr.dev/#?product=Puppeteer&version=v13.4.1&show=api-pagewaitfornavigationoptions

New York Secretary of State Testing Page – https://apps.dos.ny.gov/publicInquiry/EntityListDisplay

Oklahoma Secretary of State Testing Page – https://www.sos.ok.gov/corp/corpInquiryFind.aspx

Transcription:

Hello there, I’m Jordan Hanson from Cobalt Intelligence. Today we’re going to talk about Puppeteer waitFors. Sometimes they’re really great, and sometimes they don’t work as you expect. I want to go through some of those things, and maybe how to do them better or understand a little better how they work: which ones are my bread and butter that I use all the time, and which ones can be useful.

And sometimes they’re just a pain too, so maybe not worth the effort. So I’m going to talk about three or four main ones right now. This is the documentation. The main one I use all the time is waitForSelector. See, I’ve got it set up in this little test project.

I use waitForSelector. It’s my number one. It is the most reliable. It’s going to wait for that selector to show up on the page. Now, you can also add parameters. I’ve talked about this in a previous video: you can say visible: true or visible: false or whatever, so you can make sure everything’s how you want it.
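As a rough illustration of those parameters, here is a minimal sketch. The interface below is a stand-in that covers only the one call used, on the assumption that a real Puppeteer page object can be passed in its place; the function names and the 10-second timeout are made up for the example. The visible, hidden, and timeout options themselves come from the Puppeteer documentation.

```typescript
// Stand-in type covering only the call used here (assumption of this sketch:
// a real Puppeteer Page fits this shape).
interface SelectorWaiter {
  waitForSelector(
    selector: string,
    options?: { visible?: boolean; hidden?: boolean; timeout?: number },
  ): Promise<unknown>;
}

// Wait until the element is in the DOM and actually visible.
async function waitUntilVisible(page: SelectorWaiter, selector: string): Promise<void> {
  await page.waitForSelector(selector, { visible: true, timeout: 10000 });
}

// Wait until the element is hidden or gone, e.g. a loading spinner.
async function waitUntilHidden(page: SelectorWaiter, selector: string): Promise<void> {
  await page.waitForSelector(selector, { hidden: true, timeout: 10000 });
}
```

Without any options, waitForSelector only waits for the element to exist in the DOM, so visible: true is the stricter check.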

But we’re going to start here, and we’re going to go to New York and show how this will work. So we’re going to come over here, we’re going to fill it out, and then we’re going to go to a page. I have a timeout here, so it’s going to look really fast, but it’s going to look something like this. Okay, return to search.

So come over here. It’s going to type in pizza here, it’s going to go like this, and it’s going to go to the page. It’s going to click corporations, click limited liability, it’s going to type in this, and it’s going to click this, and then we want to wait for our data to show up. So we just set up a selector.

We say waitForSelector, and we wait for this table right here, this selector. As soon as it shows up, it says, okay, we’re done waiting right there. So we’ll see this little console log pop up, and then we’ll wait for five seconds, just so we have time to even process it; otherwise it would just be so fast.
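The flow just described (fill in the form, click search, wait for the results table, log) could be sketched roughly like this. It is not the code from the video: the interface is a stand-in for the handful of Puppeteer page methods used, and every selector in the usage note is hypothetical.

```typescript
// Stand-in for the Puppeteer Page methods used below (assumption of this sketch).
interface SearchPage {
  goto(url: string): Promise<unknown>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForSelector(selector: string): Promise<unknown>;
}

// All selectors are supplied by the caller; none are taken from the real site.
interface SearchSelectors {
  nameInput: string;
  searchButton: string;
  resultsTable: string;
}

async function searchAndWait(
  page: SearchPage,
  url: string,
  term: string,
  selectors: SearchSelectors,
): Promise<void> {
  await page.goto(url);
  await page.type(selectors.nameInput, term);
  await page.click(selectors.searchButton);
  // Resolves as soon as the results table appears, however long that takes.
  await page.waitForSelector(selectors.resultsTable);
  console.log('Done waiting');
}
```

A call might look like searchAndWait(page, searchUrl, 'pizza', { nameInput: '#name', searchButton: '#search', resultsTable: 'table' }), with the selectors replaced by whatever the target page actually uses.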

So it goes over here, fills in all the information, done waiting, just like that. Bam. The cool thing about waitForSelector is that it’s probably going to be the most responsive, and the one you’re going to want to use the most. Now there’s these other two right here that to me are kind of tricky.

They sound like they’d be the best: waitForNetworkIdle, waitForNavigation. What does this need? Oh, I see. Well, we won’t talk about that one then; waitForNetworkIdle is a new one. waitForNavigation: this one sounds like it would be amazing, and sometimes it works great. Like here, it works pretty okay.

You can wait for navigation, which means it should just wait for the page to finish loading. That’s what you’d think. It comes over here and, see, it’s done waiting. There, it’s done. It’s waiting its five seconds, then it’ll turn off. Now, where this gets tricky is that websites often make little pings in the background.

So there’ll still be Ajax requests happening back and forth. And so they added this cool thing where you can say waitUntil, and you can pass in a certain parameter: waitUntil networkidle0 or networkidle2. networkidle2 is better, because they’re saying, okay, maybe there’s one ping still happening; with networkidle2 we’ll just handle it.
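A minimal sketch of the waitUntil parameter follows. Per the Puppeteer documentation, networkidle0 means no network connections for at least 500 ms, and networkidle2 means no more than two for at least 500 ms. The interface is a stand-in for the two page methods used and the function name is made up. Registering the navigation wait before the click, via Promise.all, is the usual way to avoid missing a navigation that starts immediately.

```typescript
type WaitUntil = 'load' | 'domcontentloaded' | 'networkidle0' | 'networkidle2';

// Stand-in for the Puppeteer Page methods used below (assumption of this sketch).
interface NavPage {
  click(selector: string): Promise<void>;
  waitForNavigation(options?: { waitUntil?: WaitUntil; timeout?: number }): Promise<unknown>;
}

async function clickAndWaitForIdle(
  page: NavPage,
  selector: string,
  waitUntil: WaitUntil = 'networkidle2',
): Promise<void> {
  await Promise.all([
    // Start listening first so the navigation triggered by the click is not missed.
    page.waitForNavigation({ waitUntil }),
    page.click(selector),
  ]);
}
```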

And again, for this page, it works great. It works really well. And it’s easier, because then you don’t have to find a selector every time, right? You just put this in between every request, every click, and you’re like, oh, this will handle it. But my experience has been that often this will give you more headache than it’s worth. waitForSelector is almost always going to be more reliable. An example of when it can be a dramatic pain is Amazon. We’re going to go over here to Amazon and do that same thing.

We’re going to go over to Amazon, and we’re just going to wait for navigation with networkidle2, meaning, hey, we’re going to wait until there are no more than that many requests happening. And look: not done waiting, like, never. Because look what Amazon does over here in the network tab. When we’re on fetch requests, or even on all, look: there are constantly pings happening, all the time, going back and forth, doing different things.

And so this network is never idle. And WebSockets, anything like that, can trigger it. If there’s activity on this wire right here, this will never be triggered. And this happens more than you’d think. Even when you look through and you say, I don’t see anything that should be moving, it still can be.
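In practice "never" usually shows up as a timeout error, since Puppeteer's navigation waits default to a 30-second timeout. One way to keep a chatty page from stalling a script is to bound the wait and treat a timeout as "not idle". This is only a sketch under assumptions: the interface is a stand-in for the one page method used, the function name is made up, and the check assumes the rejection is an Error whose name is TimeoutError.

```typescript
// Stand-in for the single Puppeteer Page method used below (assumption of this sketch).
interface IdlePage {
  waitForNavigation(options?: {
    waitUntil?: 'networkidle0' | 'networkidle2';
    timeout?: number;
  }): Promise<unknown>;
}

// Returns true if the network went idle in time, false if the wait timed out.
async function becameIdle(page: IdlePage, timeoutMs: number): Promise<boolean> {
  try {
    await page.waitForNavigation({ waitUntil: 'networkidle2', timeout: timeoutMs });
    return true;
  } catch (error) {
    // Assumption: timeouts surface as an Error named 'TimeoutError'.
    if (error instanceof Error && error.name === 'TimeoutError') {
      return false;
    }
    throw error; // Anything else is a real failure.
  }
}
```

A false result is a hint that the page keeps talking in the background and that waiting for a specific selector is the safer choice.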

So that’s why I think waitForSelector should almost always be your go-to. Now there’s also this other one called waitForTimeout. I think this is the worst one. Don’t use it if you can avoid it. I mean, I’m using it here just for pausing so we can look at things, and you’ll probably end up having to use it more often than you want, but if you can avoid it, you always should.

Because the fact of the matter is it’s adding time to your script. It’s going to make it slower, unnecessarily, because you are not going to be able to predict how long that request takes, or how long whatever you’re supposed to wait for takes. So say it takes one and a half seconds to get here. Sometimes it’s going to take two seconds.

Sometimes it’ll take three. You see? So I say, okay, I’ll just say five seconds. Now we’re adding two seconds of time on the longest one, and you’re adding three and a half seconds on the shortest one. So waitForTimeout: I still use it sometimes, because sometimes you just can’t figure out what the freak to wait for while your freaking thing loads.
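The arithmetic above can be worked through in a few lines. This is just an illustration using the load times mentioned (1.5, 2, and 3 seconds) against a fixed 5-second wait; the function name is made up.

```typescript
// For each actual load time, how many milliseconds a fixed wait throws away.
function wastedMs(fixedWaitMs: number, actualLoadMs: number[]): number[] {
  return actualLoadMs.map((actual) => Math.max(0, fixedWaitMs - actual));
}

const loadTimesMs = [1500, 2000, 3000];
console.log(wastedMs(5000, loadTimesMs)); // [ 3500, 3000, 2000 ]
```

The other failure mode is a load that takes longer than the fixed wait: nothing is wasted, but the script moves on before the page is ready.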

And so you have to use waitForTimeout just for a little bit to do it. Try to avoid it. Now, there’s one time when you want to use this other one, called waitForResponse. A really good example of that is over here in Oklahoma. You look over here at pizza, right? We search right here, and this works great.

We’re going to wait for this selector, for this table that shows up. Awesome. waitForSelector works great. Now, we can’t always use waitForSelector when we hit pagination. Why? Anyone? You in the back? Okay, that’s right: because this table is already going to be here. There’s no new selector.

We paginated through. It makes another request right here, and when it’s paging, look, page three, there’s another one. But there’s no new table getting made. And so we can’t wait for a selector here, because the selector is already on the page. If we wait for the selector, it’ll hit right away.

It will satisfy the condition of waiting for the selector before the new page is actually loaded. So this is when we’re going to use waitForResponse. Now, waitForResponse is really cool. It’s really good when you’re doing Ajax requests or things like that in the background. So here we go. We’re going to navigate to the page.

We’re going to click in here, we’re going to type pizza, and then we’re going to hit the search button right there. See? Then we’re going to wait for the selector, which means we’re going to wait for this table, and we log out when we’re done waiting for the search results. We’re going to log out the first business we find, which is the one at the top.

Then we’re going to click the pagination. So the pagination a, nth-of-type one, which will be this one, page two. Because if we refresh here, watch: pizza right there, I’ll come over here, loading. This is going to be nth-of-type two right there. And then we click that. And so what we’re going to do then is waitForResponse, and we’re going to wait for this thing to respond.

Now, you can do other cool things with this. Look at this, waitForResponse here: you can also pass in things like, oh, I want the response status to be 200, so you make sure it’s a successful response. You want to make sure it includes the HTML, all these cool things that are in the documentation that you can also include in your wait. We’re going to do a simple waitForResponse for that thing to be loaded.

And we’ll say, done waiting for the next page, and we’ll find the first business on the next page. Okay. Ready? Let’s try it.
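The pagination step described above could be sketched along these lines. It is not the code from the video: the interfaces are stand-ins for the Puppeteer methods used (waitForResponse with a predicate, and url() and status() on the response), while the function name, the link selector, and the URL fragment are all hypothetical.

```typescript
// Stand-ins for the Puppeteer objects used below (assumption of this sketch).
interface ResponseLike {
  url(): string;
  status(): number;
}

interface PagingPage {
  click(selector: string): Promise<void>;
  waitForResponse(
    predicate: (response: ResponseLike) => boolean | Promise<boolean>,
    options?: { timeout?: number },
  ): Promise<ResponseLike>;
}

async function goToNextPage(
  page: PagingPage,
  linkSelector: string,
  urlFragment: string,
): Promise<ResponseLike> {
  const [response] = await Promise.all([
    // Only a successful response from the expected URL counts as "the page loaded".
    page.waitForResponse(
      (res) => res.url().includes(urlFragment) && res.status() === 200,
    ),
    page.click(linkSelector),
  ]);
  return response;
}
```

After goToNextPage resolves, the table on the page should hold the next page of results, so reading the first business at that point avoids logging the old row again.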

Coming to that. Yeah, I did. Okay, so it should pop up. Here we go. It’s pizza. It’s searching right now. See, it’s holding. Okay, bam. Now, what happens if we come over here and say... I don’t want to be proven wrong here. I know, let’s just put in this waitForSelector. We’ll say, oh, well, let’s wait for this page three.

Then now this should fail immediately. It’ll come down to this and it’ll try to resolve, and it should fail, or it should get the same thing: it should log the same pizza business twice, because it won’t wait for the next page to finish loading. Now, the caveat is that this loads pretty fast, so maybe it’s quick enough, but I don’t think so. Look, see the problem there? It went to the second page, but because this was done loading immediately, it didn’t wait for that pagination to happen. And so it just looked at the same business twice, versus this one, where it got one pizza business and then a different one. There we go.

So those are the waitFors that I use. So again, waitForSelector is my bread and butter. I recommend it. waitForResponse is good for certain situations.

waitForTimeout: avoid it whenever you can. waitForNetworkIdle, or rather waitForNavigation: use it with care. Most of the time, it doesn’t work as happily as you want. Same with waitForResponse; these can be trickier. So your bread and butter should be waitForSelector.

waitForTimeout when you have to, and you hate it every minute. And then waitForResponse when you really need it. waitForNavigation: so many times it has not worked how I want. So, okay, that’s it. Those are the waitFors. I think there’s also more with waitForSelector; we’ll talk about that another time.

There we go. Those are Puppeteer’s waitFors: which ones I think you should use, and which ones I think you should maybe avoid. Thank you.