1 (edited by v_pozidis 2021-02-15 06:15:24)

Topic: Search through internet and get results in our mvbd

How can I take the ISBN of a book typed into a text box, search for it on the internet, and get the results back in our MVD software, so we can build a database of our books? If it's possible, an example would help.

Re: Search through internet and get results in our mvbd

What you're looking for is called "web scraping".

You will need to send an HTTP request to perform the web search (a POST or a GET, depending on the site) and read the data the web server returns. Once you have the data as text, you can start parsing it for the book information. RegEx is usually used for this, but if the text is not too complex, you can use the string functions to extract the pieces you need.

Having said that, I'm not sure how to do this in MVD, but this may help you get started and look for solutions coded in Pascal Script.

Good Luck.

Re: Search through internet and get results in our mvbd

Can someone give an example??

Re: Search through internet and get results in our mvbd

Hello v_pozidis,

There is an old post of mine that might help you a little to get started.
Of course, this post from 2015 is a little outdated and full of things I would not do like that today.


In general :
1- An HTTPGet request gives you the source code of the page: this is just text
2- Using specific HTML tags that you will have to identify, you look for the info you want inside the web page
3- Using Pos, Copy and so on, you extract the data you need and save it into variables
4- You can also use RegExp to find your data, but it's a little tricky if you are not used to it, and I advise you not to use RegExp on large chunks of text, because that can have a significant impact on performance.
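
Steps 2 and 3 above could be sketched like this. It is only a minimal sketch in plain Pascal: ExtractBetween is a hypothetical helper name (not an MVD built-in), it uses only Pos, Copy and Length, and it assumes the markers you pass appear literally in the page source.

```pascal
// Hypothetical helper: returns the text found between StartTag and EndTag
// in Source, or an empty string when either marker is missing.
function ExtractBetween(Source, StartTag, EndTag: String): String;
var
    StartPos, EndPos: Integer;
    Rest: String;
begin
    Result := '';
    StartPos := Pos(StartTag, Source);
    if StartPos = 0 then Exit;
    // Keep only what follows the opening marker
    Rest := Copy(Source, StartPos + Length(StartTag), Length(Source));
    // Look for the closing marker in the remaining text
    EndPos := Pos(EndTag, Rest);
    if EndPos = 0 then Exit;
    Result := Copy(Rest, 1, EndPos - 1);
end;
```

For example, ExtractBetween(PageSource, '<title>', '</title>') would return the page title, where PageSource is the text returned by HTTPGet. Only the first match is returned, so pick markers that are specific enough.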



My old post on webscraping : http://myvisualdatabase.com/forum/viewtopic.php?id=1851
A good online regular expression tester : https://rubular.com



A STEP FURTHER
Most of the big websites, like Goodreads, Google Books and so on, use a REST API to deliver data to their own pages and to third-party applications.


Using the REST API, you can request book data and get it from the server without all the HTML garbage you have to deal with when you scrape a web page.
REST APIs are just URLs that you query with parameters and that send you data in return, generally in XML or JSON format.
This is way faster and safer than web scraping because:
- the amount of data you receive is small
- the data is organised as a tree
- you don't have to worry about the web page format changing and ruining all your scraping procedures.


As an example of a free API, here is the one I tried:
https://restcountries.eu/rest/v2/name/france

Here, the parameter is the country name, and I got data about my home country in return. (Try it and see what is displayed on screen when you follow the URL - it is way shorter than a full web page.)


All APIs work the same way. With a book API, you pass author names, titles or ISBN numbers in the URL as parameters and get your results sorted by relevance. (Most of them require you to register to obtain a personal token as an identifier, but it is free most of the time.)


I strongly advise you to look in this direction, because you will save a lot of time and build a system that does not depend on the web page format, which can change without notice.


I don't know if Dmitry has implemented the JSON unit in MVD, but it is very easy to parse and get data from the JSON format with Delphi and, if it is not available yet in MVD, it could be a great addition.


Hope this helped a little


Cheers

Math

I'm a very good housekeeper !
Each time I get a divorce, I keep the house

Zaza Gabor

Re: Search through internet and get results in our mvbd

Thank you Math. The truth is that I cannot understand the routine. I tried it but had no luck. Anyway, thank you.

6 (edited by mathmathou 2021-02-20 03:01:15)

Re: Search through internet and get results in our mvbd

OK, let's do a little example for you :


On a brand new form put :

  • an edit box

  • a button

  • a richedit

NOTE: the RichEdit could be replaced with a Memo. I use it just to preserve the structure of the response you get from the server: the Memo will output just one line of text, whereas the RichEdit will keep the nested structure of the data you receive.


Create the OnClick event of the button and put this code in it:


procedure Form1_Button1_OnClick (Sender: TObject; var Cancel: boolean);
var
    ISBN : String;
    APIResponse : String;
begin
    ISBN := Form1.Edit1.Text;
    //APIResponse := HTTPGet('https://www.googleapis.com/books/v1/volumes?q=isbn:'+ISBN,True);
    Form1.RichEdit1.Text := APIResponse;
end;

Remove the comment marker (//) in front of the second line of code. I only added it because the forum keeps trying to put a url tag around the address.


You will see the data in the RichEdit after a short while (a few seconds).


Notice the tree-like structure of the data? What you received is JSON, but I am not sure MVD is equipped to parse JSON, so you will have to use regular expressions or the Pos and Copy functions to extract what you need.
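
As a rough illustration of the Pos and Copy route, here is a minimal sketch that pulls one string value out of the JSON text. ExtractJsonString is a hypothetical helper, not an MVD function, and it is not a real JSON parser: it assumes the key appears as "key": "value", it returns only the first occurrence, and it handles neither escaped quotes inside the value nor values that are numbers or lists. The "title" key used in the usage note is an assumption about the response layout, so check it against what you actually see in the RichEdit.

```pascal
// Hypothetical helper: returns the first string value stored under Key
// in a JSON text, or an empty string when nothing is found.
// Naive on purpose: escaped quotes and non-string values are not handled.
function ExtractJsonString(Json, Key: String): String;
var
    P: Integer;
    Rest: String;
begin
    Result := '';
    // Find the quoted key
    P := Pos('"' + Key + '"', Json);
    if P = 0 then Exit;
    Rest := Copy(Json, P + Length(Key) + 2, Length(Json));
    // Skip the colon that separates key and value
    P := Pos(':', Rest);
    if P = 0 then Exit;
    Rest := Copy(Rest, P + 1, Length(Rest));
    // Skip the opening quote of the value
    P := Pos('"', Rest);
    if P = 0 then Exit;
    Rest := Copy(Rest, P + 1, Length(Rest));
    // The value ends at the next quote
    P := Pos('"', Rest);
    if P = 0 then Exit;
    Result := Copy(Rest, 1, P - 1);
end;
```

With the handler above, something like ExtractJsonString(APIResponse, 'title') should give you the book title if the response contains a "title" entry.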


NOTE: enter the ISBN without the dashes (-), and watch out, because if you "hammer" the API server (too many requests per second) you will get banned.
This is just a simple example to show you how it works. Other APIs might use more parameters and even need a personal key that you get by registering. Some are free, others cost money ($5 per year, for example).
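
If you would rather not rely on the user typing the ISBN without dashes, the input can be cleaned before the URL is built. This is only a sketch: CleanIsbn and BuildIsbnUrl are hypothetical helper names, and the URL is the same Google Books address used in the button code above.

```pascal
// Hypothetical helper: keeps only digits and the 'X' check character
// (used by ISBN-10), dropping dashes, spaces and anything else.
function CleanIsbn(Raw: String): String;
var
    i: Integer;
begin
    Result := '';
    for i := 1 to Length(Raw) do
        if ((Raw[i] >= '0') and (Raw[i] <= '9')) or (Raw[i] = 'X') or (Raw[i] = 'x') then
            Result := Result + Raw[i];
end;

// Hypothetical helper: builds the query URL for a given ISBN
function BuildIsbnUrl(Raw: String): String;
begin
    Result := 'https://www.googleapis.com/books/v1/volumes?q=isbn:' + CleanIsbn(Raw);
end;
```

In the button handler you would then call HTTPGet(BuildIsbnUrl(Form1.Edit1.Text), True) instead of building the URL by hand.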


Once you get it working, do not output the whole result to the RichEdit; output only the parts you extracted.


Hope this helps


Cheers


Mathias


Re: Search through internet and get results in our mvbd

Thanks, Mathias. I'll give it a try...