To the Missouri Secretary of State business search! https://bsd.sos.mo.gov/BusinessEntity/BESearch.aspx?SearchType=0

Today we get document information for Missouri. And it was a toughie! The documents are paginated and so not only do we have to get the information but we also need to handle pagination.



Hello there. My name is Jordan Hanson. I’m here with Cobalt Intelligence. We’re working on getting Secretary of State data. We already have a setup for Missouri, and today we’re going to try to get document information, as in a business’s filings. I think this one might be tricky. I think there’s pagination involved.

So if I go over here and I search for pizza, well, you still have that.

So I don’t use the sizing habit pass small, but zoomed in as well. Oh my gosh. I don’t know if this is me or this is the internet everywhere. There’s a lot of stuff that seems like it’s down. I’ve already started my modem and speed tests feels fine. Look, I go over here and it’s like

crazy fast. Maybe, maybe not look at this right here. I would like one that doesn’t have this weird, like this one let’s go to normal on Camry. Like this be. Oh yeah. Okay. So we go over here to filings. What the heck?

Okay. I think it’s the upload speed test. Okay. Maybe it’s me. I don’t know what’s going on. Anyway, look at that bounce. Okay. We’re going to come over here and we’ll get these now. What I’m afraid of is that if it’s an older business, there are going to be a lot of them. Anyway, let’s go over here. We’re going to come down here; this is the page we’re on.

Here we’re going to go over here, where it’s getting all the business stuff. Now we get documents.

And I think this is what’s going to happen first. We need to get that.

Right here. So I think this is what I need to do. If you click the document, it opens it up here and it gets the document ID. So I think we can get the document ID, put it in wherever we want, and then we can get the document. Cool. Okay.

Alright, there we go. Okay. So I think this is the URL we want, and we’re going to come over here. The only problem is going to be if there are multiple. So for now, we’re going to go like this. We’re going to say const documentRows equals await page.$$, which gets an array of the elements that match a selector. So we’re going to come over here and say table. Yeah, this is the one. Look at that huge ID.

And then we go tbody and then tr. tbody tr, okay. There we go. Then we loop through them. We’ll say for each documentRow in documentRows, and then for each one

we create a document. const document... that’s not what I wanted. Document, index, document, whatever. Okay. The important part is here. And we need the name of it, and we can get that from await documentRow.$eval. And it’s going to be td:nth-of-type, what, three?

We can go check. tr, wait. I need to look in this tr. One, it was this one. Okay. So not three; three is hidden stuff that we may want. One, two is this, 2, 3, 4, 5. Wow. Close. So element, element.textContent. There we go. And then we go date, and I’m going to say await documentRow.

$eval, and it’s going to be td:nth-of-type, what? Like six, maybe. Yeah, six. There, date: element.textContent. And then we need the URL. I don’t think this is going to be as bad as I thought. This is the thing we’ll just replace, this documentId equals. Now the only concern is pagination, and that is a concern.

Okay. So the document ID is what? Was that one like four or something? Let’s take this. Okay. Didn’t do it. The document ID was 4 9, 4, 4 8 1. Where do we find that number?

And nothing there. Okay, so it’s not this number, this hidden one. What is that?

Okay. Not that one. It’s going to be td:nth-of-type(3), and we’re going to have to split that, I think. Okay. We can do that. It’s going to be await documentRow.$eval, and it’s going to be td:nth-of-type(3), I think. Right?

1, 2, 3. Yep. And then we’re going to go into the input. And I’d say element, element.getAttribute, and the attribute is going to be not the ID but onclick, like that.

documentIdOnclick. Okay. I’ve got a const documentIdOnclick, and it’s going to look something like this. So we’re going to take documentIdOnclick and we’re going to split on

that, like that. And then we just get this first part of it, and we split on this, and then we get the last part. Does that make sense? Oh, no. I need to split that on the comma. Okay. Does that make sense? What we’re going to do is just split on this, so there’ll be two parts. This will be one. This will be the other part.

And we just take the second part, index one, right there, and we split on the comma and we get the first part right there. And then we go like this: documentId. So it looks good. Now we go like this: if business.documents already exists, then we’ll push it into business.documents.
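The two-step split described above can be sketched in TypeScript as follows. This is a minimal sketch, not the code from the video: the helper name extractDocumentId, the marker argument, and the sample onclick string are all made up for illustration, because the real attribute value is not shown in the transcript.

```typescript
// Hypothetical helper: pull an ID out of an onclick attribute value.
// Step 1: split on a marker and keep the second part (index 1).
// Step 2: split that part on the comma and keep the first part (index 0).
function extractDocumentId(onclick: string, marker: string): string | null {
  const [, afterMarker] = onclick.split(marker);
  if (afterMarker === undefined) {
    // The marker was not in the string, so there is no second part.
    return null;
  }
  const [id = ""] = afterMarker.split(",");
  const trimmed = id.trim();
  return trimmed.length > 0 ? trimmed : null;
}

// Made-up attribute value; the real one on the site may look different.
const sampleOnclick = "ViewDocument(12345, 'pdf')";
console.log(extractDocumentId(sampleOnclick, "("));
```

Returning null when the marker is missing keeps one oddly shaped row from throwing inside the loop over rows.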

So that’s like the second one, right? We push the document. And if it’s the first one, we’re going to say business.documents equals an array with just that document, like that.
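The push-or-create step can be sketched like this. The interface names and fields (FilingDocument, Business, title) are assumptions for illustration; the transcript only tells us that a document has a name, a date, and a document ID, and that the business has a documents list.

```typescript
// Assumed shapes; the real types in the project are not shown in the video.
interface FilingDocument {
  name: string;
  date: string;
  documentId: string;
}

interface Business {
  title: string;
  documents?: FilingDocument[];
}

// Append to the list if it already exists, otherwise start a new list.
function addDocument(business: Business, doc: FilingDocument): void {
  if (business.documents) {
    business.documents.push(doc);
  } else {
    business.documents = [doc];
  }
}

const business: Business = { title: "Example Pizza LLC" }; // made-up name
addDocument(business, { name: "Articles of Organization", date: "1/2/2015", documentId: "12345" });
addDocument(business, { name: "Annual Report", date: "3/4/2016", documentId: "67890" });
console.log(business.documents?.length);
```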

Okay, now we’ll send it up, and then we’ll look for some examples with more than one page. I’m pretty sure I saw paginated documents. Push to master while this is building.

Okay. So we need to do some more. I happened here. See, upload speed. Okay. How about let’s be zero

software. Oh, look at that bounce that it’s not fun. I’m glad I didn’t build. Okay.

Good standing. All right. Digital Wave. How about Digital Wave? It was filed in this series. What I’m worried about now... What are the chances that all these rows are in there and it just... Nope, it’s got pagination, which means

same page, but a POST.

what was this company named anyway, positive, whatever, right? Yeah.

What’s just happened here.

Yeah. It’s just a POST there with a payload. But where’s our page? Where’s our second page?

I’m afraid we’re going to have to wrap this in a loop that clicks the buttons. Now, does this go away though? Okay, please. I want that too.

Hey, this is always a pain. I’ve done it many a time. Okay. Well, let’s see if it’ll work with this one. I like that name. That’s going out to the SDK so we can test it.

Now I go over here to Missouri. What’s Missouri? MO. Look, right here, right? Yeah. There we go.

oh man, I just close this example.

Almost done. Still building. Okay. Let’s think how we’re going to do this though.

I have to be just like a loop where it goes,

man, look how slow that is. I hate that could be me. I don’t know what’s going on with band net, but it’s not doing well. That’s for sure. Not doing well.

I think we’re going to have to get this. If we assume 10 per page, right, we get this number. Maybe.

That’s not an easy number to get.

I wonder if this is actually done. It’s done. Okay. Let’s go over to here. We’re running this one and we’ll see if we get documents. But if we want to get all the documents, we’ve got to do this. That means paginating through this thing. So we have to get the total number of documents.

Get the total number of documents, so we know how long to paginate.

What if we just went until documentRows.length

is less than 10? Would that work?
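To answer that question, here is a small simulation of the "keep going until a page has fewer than 10 rows" rule. It is a sketch under the assumption that every full page holds exactly 10 rows; the function names are invented for illustration and no browser is involved.

```typescript
// A page is treated as the last one when it has fewer rows than a full page.
function isLastPage(rowCount: number, pageSize = 10): boolean {
  return rowCount < pageSize;
}

// Simulate how many pages the rule visits for a given number of documents.
function pagesVisited(totalDocuments: number, pageSize = 10): number {
  let visited = 0;
  let remaining = totalDocuments;
  while (true) {
    const rowsOnPage = Math.min(remaining, pageSize);
    visited += 1;
    if (isLastPage(rowsOnPage, pageSize)) {
      return visited;
    }
    remaining -= rowsOnPage;
  }
}

console.log(pagesVisited(23)); // pages of 10, 10, and 3 rows
console.log(pagesVisited(20)); // pages of 10 and 10 rows, then one empty attempt
```

The rule works when the last page is short, but when the total is an exact multiple of 10 it asks for one page more than exists, which is one reason to read the total count instead.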

So we’re calling this.

I think in a while actually.

Hey, what happened over here? Say who had document?

Oh, that’s good. Nice. Okay. So it works. This part works great.

We just had to figure out how to do the pagination part.

Now I’m just going to do it this way. So I need to loop through this now, the first one.

Okay. What was it? Like this: const totalDocuments equals await page dot something. We should be able to find it somewhere. Wait, I think the... filings. I didn’t click on filings.

Does that mean it’s just there?

I didn’t click on filings.

Oh my gosh.

Okay. Even though you can’t see it, it finds it.

Okay, hold on. Let’s try something here. What will happen? Let’s try Digital Wave Inc.

Gosh, you see that? That took a long time.

Okay. So we’ve got this one. That’s 11. So look, there’s not more than that. Ooh. An error happened, even. Ooh, it’s exciting. What’s the error? I don’t know what could have failed.

okay. Pages stuff. Okay. Now we need to get how many, well, first we got to click over here. I don’t think that’s. I think it’s silly to not click on.

so I think it’s like this

L I.

That’s it, the error here

paginated and grills. Right. Am I right? It’s got to click here. I’m going to go here and I’m going to see one of these two appear, but they’re going to be there the whole time, which is the hard part.

Can we click this without being on.

Good try. I guess it’s there

in theory, I should be able to find this. Oh gosh, it takes so long. This right here.

Ooh, the pagination row. That’s where it failed. I bet that was where it broke.

Let’s watch it like this. And there’s 11. See, it’s trying to parse this 12th one, or this 11th one. Okay.

Okay, this is good, right? It goes through this now. 11, I think, is ours. No, this is good too. What’s the first one? Then the first one’s a pager row. Okay. So we’ve got to check for that.

I don’t think it has it on the second one. Right? The second row.

It’s a different table.

Okay. So we’re going to be over here.

I don’t know what’s going on here, but the first one is the one I need to worry about. So what I need to do is, if

I just skip it. Oh, if I just go like this.

Yeah. It couldn’t find this because it didn’t exist. Okay. So if it’s the first row,

I guess the pager row, or it’s a pager, a pagination row. It may not always be there. If there’s no pagination, I can’t do that. Okay. That means we’ve got to go here and do this, then we’ve got to see this. Okay. So we have our input. It’s hidden. You can’t see it. So we’re just working by the Force. It’s going to be longer, for sure.

So now we go like this. We say const, then await documentRow.$eval.

I think, I think I’m good with this.

I skip it.
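The skip rule can be sketched as a plain filter. In Puppeteer, $eval throws when the selector matches nothing, while $ resolves to null, which fits the failure seen here on the pagination row. The sketch below assumes each row has already been read into a simple record (the RawRow shape is hypothetical), with onclick set to null when the row has no hidden input.

```typescript
// Hypothetical record for one table row after reading it from the page.
interface RawRow {
  cells: string[];
  onclick: string | null; // null when the row has no input element
}

// Keep only rows that carry the input, so pager rows are skipped.
function keepDocumentRows(rows: RawRow[]): RawRow[] {
  return rows.filter((row) => row.onclick !== null);
}

// Made-up sample data: one pager row followed by two document rows.
const sampleRows: RawRow[] = [
  { cells: ["1 2 Next"], onclick: null },
  { cells: ["Annual Report", "3/4/2016"], onclick: "ViewDocument(67890, 'pdf')" },
  { cells: ["Articles", "1/2/2015"], onclick: "ViewDocument(12345, 'pdf')" },
];
console.log(keepDocumentRows(sampleRows).length);
```

Checking for the input, rather than always skipping the first row, also covers businesses with a single page of filings, where there may be no pager row at all.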

Now let’s comment that out for now. And we’ll run it, spin it back up again, and see if it’ll work for that one. In the meantime, how do I click this thing?

Okay. So I got to find it. That’s right.


And then I go strong and type 4, 5 23.

Now what happens if it doesn’t exist? I think we’ve got to live this.

I think that I’m guessing it’s not gonna exist over here.

It doesn’t find it. Okay. If there are multiple document pages,

get total documents,

as a number. Let me say totalDocuments equals await page.$eval. We go element. I know this is long, probably a drag. This is good, though. You’re just seeing some of the pain you go through sometimes when you’re doing web scraping. Okay. We have our totalDocuments, and it’s going to be a number string

because I already have it here. Yep. That’s going to be a string. We’re going to get the total. But it’s kind of a string. I mean, we’re really going to parseInt this thing.

Okay. So we’ve got total documents.

Yeah, this. We say total pages, and then we’re going to loop, right? Do this. Like this.

Shouldn’t I be updating? Yeah. Okay. Let’s go back to the SDK. Should I run this guy in the meantime? Okay. So now we know how many pages, right? We want to go through this. And we go through totalDocuments divided by three.

That should work, even if it gives a decimal, who cares. So, not divided by three; divided by 10, because there are going to be 10 on each page.

So if we had 23, we divide it by 10, if I could type 10, and get 2.3, so this is going to go... well, that’s fine. The page index will increment every time. And now I have to go in here, but I have to know

what if there’s not

Okay. I’m just going to put this in its own thing. Refactor. Come on.

async function handleDocuments.

What I’m going to need is page and business.

Now, do we have any errors? What are the errors now?

It doesn’t like something here.

What does it not like? You can’t see because I’m so zoomed in.

And then I go... I think I’m going to put this inside there, if totalDocuments exists, like that. And then we say await handleDocument, page, business, like that. It should be handleDocuments.

I think that’s it. Now else, await handleDocuments. We just do this once, right? page, business, I think. Yeah. Baby, look at that.

How many do we have? 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. All right. Okay. So let’s walk through this really quick. We’re going to get total pages. We’re going to check whether there are multiple pages. And if there are, then we’re going to get total documents. And now I can go like this

and we’re going to get the total amount there are. So it would get something like 23 as a string, parseInt it, and then we divide it by 10, which means we’re going to loop through this three times: zero, one, and two, which will all be less than 2.3. And then we’ll handle documents for each one. It’ll send in the page.
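The arithmetic in that walkthrough can be checked with a short sketch. The function names are invented for illustration, and it assumes the text read from the page holds only the count (possibly with thousands separators) and that each page shows 10 documents.

```typescript
// Turn the count text from the page into a number.
// Assumption: the text holds just the count, e.g. "23" or "1,204".
function parseTotalDocuments(text: string): number {
  const parsed = parseInt(text.replace(/,/g, "").trim(), 10);
  return Number.isNaN(parsed) ? 0 : parsed;
}

// Same loop condition as in the walkthrough: pageIndex < total / pageSize.
// A fractional bound such as 2.3 still yields the indexes 0, 1, and 2.
function pageIndexes(totalDocuments: number, pageSize = 10): number[] {
  const indexes: number[] = [];
  for (let pageIndex = 0; pageIndex < totalDocuments / pageSize; pageIndex++) {
    indexes.push(pageIndex);
  }
  return indexes;
}

const total = parseTotalDocuments("23");
console.log(pageIndexes(total)); // one index per page to visit
```

Looping while pageIndex is below the fractional quotient visits the same number of pages as Math.ceil(total / 10), so there is no need to round first.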

Oh, I haven’t clicked anything. I need to do that here: await page.click. And we should wait, also. That’s going to be the hard part, the waiting part.

And can we click things?

Fuck this. What is this even is huge ID. Oh my gosh.

I don’t know about that.

I just want to click this one: title, Next page. That’s the click.

Like that: title, Next page. I’m going to click that one. And how do I know how long it’s going to take to load in there? Go to filings. How is it going to know?

I do await page.waitForNavigation. Can I just do that, please? It never works, but I’m going to try. We handle documents. We click to go to the next page. So we get the first 10. Then we click, then we get the next 10. Then we click. Then we get the next, however many there are, which is three. And actually we don’t handle documents.

It’s going to be on here. It’s going to check for the pager row, so it shouldn’t get it. And then it’ll click again. And who cares? It’ll click the next page, but we’re not going to loop anymore, so we’re not going to get any more. All right. So in theory, this should work. Not the easiest one we’ve ever done, for sure, but in theory it should work.
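The overall loop, handle a page, click next, wait, repeat, might look roughly like the sketch below. It is not the code from the video. PageLike is a minimal stand-in for the two Puppeteer Page methods used here (click and waitForNavigation), collectAllPages and its parameters are invented names, and the selector in the usage comment is a guess based on the "Next page" title mentioned above. Unlike the version described, it skips the click after the last page.

```typescript
// Minimal stand-in for the parts of a Puppeteer Page this sketch relies on.
interface PageLike {
  click(selector: string): Promise<void>;
  waitForNavigation(): Promise<unknown>;
}

// Collect items from every page: handle the current page, then click
// "next" and wait for the load, until all pages have been handled.
async function collectAllPages<T>(
  page: PageLike,
  totalDocuments: number,
  handlePage: (page: PageLike) => Promise<T[]>,
  nextSelector: string,
  pageSize = 10,
): Promise<T[]> {
  const totalPages = Math.ceil(totalDocuments / pageSize);
  const all: T[] = [];
  for (let pageIndex = 0; pageIndex < totalPages; pageIndex++) {
    all.push(...(await handlePage(page)));
    if (pageIndex < totalPages - 1) {
      // Start waiting before clicking so a fast navigation is not missed.
      await Promise.all([page.waitForNavigation(), page.click(nextSelector)]);
    }
  }
  return all;
}

// Hypothetical usage with a real Puppeteer page and a row-parsing function:
// const documents = await collectAllPages(page, 23, readRows, "[title='Next page']");
```

If the site swaps the table in place instead of navigating, waitForNavigation would time out, which may be why it "never works"; waiting for a selector or for the network response would be the alternative to try.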

This is kind of crummy pagination stuff. That kind of sucks, right? But the cool thing is we’re getting documents. Okay. See, I’ve definitely got some internet problems. I don’t like that.

Is it me? It can’t be the world, right? It’s not everywhere. It’s got to be me.

Let’s try again. Okay. Struggle. I can’t even do it. Okay. Well, I guess we’re done. We’re going to assume I got it right, because right now I’m not going to make you sit here and wait while my internet doesn’t push. So I can’t test that, which is frustrating. Okay. That’s it. I feel pretty confident about where we’re at.

There might be a few bugs we have to fix, but I think the general premise is good. And

Hey, it popped up. Let’s try it one more time.

Hey, there it went. So apparently my connection is going in and out. I don’t know, whatever. I’m done. It’s taken too long.