Website data scraping?

I’m not sure if something like this is feasible and how complicated it is, but let’s see if anyone can help.

I’d like to know whether it’s possible to write a script that takes a link from one website (always the same one) to a product’s details page, collects all the information about that product, saves it in my database, and possibly translates it from English into the local language.
I know about HTML scraping, but their HTML seems a bit complicated for that, so is there another solution?

Most websites that allow others to use their data provide an API (Application Programming Interface) that returns JSON-encoded data.
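
If the site does offer such an API, the "save it in my database" part can be fairly short. Here is a minimal sketch using only the Python standard library. The response body and its field names (`id`, `name`, `price`, `description`) are made up for illustration, since the real ones depend entirely on the site's API, and SQLite stands in for whatever database you actually use.

```python
import json
import sqlite3

# Hypothetical API response; real field names depend on the site's API.
SAMPLE_RESPONSE = (
    '{"id": 101, "name": "Garden Chair", "price": 49.9, '
    '"description": "Foldable wooden chair."}'
)


def save_product(conn, raw_json):
    """Decode one product from JSON and insert or update it in the products table."""
    product = json.loads(raw_json)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(id INTEGER PRIMARY KEY, name TEXT, price REAL, description TEXT)"
    )
    # Named placeholders are filled from the decoded dict.
    conn.execute(
        "INSERT OR REPLACE INTO products (id, name, price, description) "
        "VALUES (:id, :name, :price, :description)",
        product,
    )
    conn.commit()
    return product


conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
save_product(conn, SAMPLE_RESPONSE)
print(conn.execute("SELECT name, price FROM products").fetchall())
```

In a real script the JSON string would come from an HTTP request to the API instead of a constant.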

No matter how “complicated” the HTML might be, if the browser can parse it and display a page, a script can do that as well.
The only requirement is that the output structure is roughly the same for all products.
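
To illustrate, here is a rough sketch with Python's built-in `html.parser`. The sample markup and the class names (`product-title`, `product-price`, `product-description`) are assumptions for the example; you would replace them with whatever the real product pages use. It only works because every product page is assumed to share that structure.

```python
from html.parser import HTMLParser

# Hypothetical markup; the class names are assumptions for illustration.
SAMPLE_HTML = """
<div class="product">
  <h1 class="product-title">Garden Chair</h1>
  <span class="product-price">49.90</span>
  <p class="product-description">Foldable <b>wooden</b> chair.</p>
</div>
"""


class ProductParser(HTMLParser):
    """Collect the text of elements whose class attribute names a known field."""

    FIELDS = {
        "product-title": "title",
        "product-price": "price",
        "product-description": "description",
    }
    # Tags without a closing tag; they must not affect the nesting depth.
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.data = {}
        self._field = None  # field currently being captured, if any
        self._depth = 0     # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self._field is not None:
            self._depth += 1  # nested tag inside a captured element
            return
        for cls in (dict(attrs).get("class") or "").split():
            if cls in self.FIELDS:
                self._field = self.FIELDS[cls]
                self._depth = 1
                self.data[self._field] = ""
                break

    def handle_endtag(self, tag):
        if tag in self.VOID or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the captured element has closed
            self.data[self._field] = self.data[self._field].strip()
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self.data[self._field] += data


def parse_product(html):
    """Return a dict of the fields found in one product page."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.data


print(parse_product(SAMPLE_HTML))
```

Third-party libraries can make the selection step shorter, but the idea is the same: find the elements that hold each field and read their text.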

For machine translation you can either use a professional (paid) API such as Google or Yandex, or take a look at the open-source LibreTranslate project.

However, all of those translators are mostly trained on longer text and may produce strange results with short product names or attributes.
Test the quality before you commit to a paid plan.
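
For a quick quality test with LibreTranslate, something like the sketch below could work. It posts JSON to the `/translate` endpoint and reads `translatedText` from the reply. The base URL `http://localhost:5000` assumes a self-hosted instance on its default port, the target language `de` is only a placeholder, and the helper names are my own.

```python
import json
import urllib.request


def build_translate_request(text, source="en", target="de",
                            base_url="http://localhost:5000", api_key=None):
    """Build a POST request for a LibreTranslate /translate endpoint."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:  # only needed on instances that require a key
        payload["api_key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, **kwargs):
    """Send the request and return the translated string (needs a running instance)."""
    request = build_translate_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))["translatedText"]
```

Running `translate("Garden Chair")` and `translate("Foldable wooden chair.")` against a few real product names and descriptions should show fairly quickly whether short strings come out usable.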
