QlikView - Web File
QlikView can process web files in HTML format and extract data from the HTML tables they contain. Given the URL of the web file as input, QlikView fetches both the structure and the content of the page. It then analyzes the structure of the page and extracts the relevant data from the HTML tables present in it. To begin, we choose the Web Files option from the Data from Files section under the Data tab of the script editor.
Give the URL as Input
On selecting the Web Files option, we get a new window in which to give the URL as input. In this example, we choose the Wikipedia page List of sovereign states and dependent territories in Asia as the input page. Enter the URL and click Next.
Select the Table from the Web File
On opening the selected web file, the window shown below comes up. Here we can see the various tables present in the webpage, labeled @1, @2, @3 and so on. Choose the first table and click Next twice.
Select the Columns of the Table
From the above table, we can choose only the columns we need by removing the unwanted columns using the cross sign.
The file is loaded into QlikView through the load script, which can be seen in the screenshot given below. The generated script can be edited by hand to suit the source, for example to read a different table from the page.
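For readers without access to the screenshot, a load script for a web file typically has the shape sketched here. This is an illustration under stated assumptions, not the exact script the wizard produces: the URL is inferred from the Wikipedia page title, and LOAD * stands in for the explicit field list that the wizard would normally write out.

```
// Sketch only. Assumptions: the URL below is inferred from the page title,
// and LOAD * replaces the explicit field names the wizard usually generates.
LOAD *
FROM
[https://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_in_Asia]
(html, codepage is 1252, embedded labels, table is @1);
```

The table is @1 item selects the first table on the page, so changing it to @2 would read the second one. Columns removed in the wizard are simply left out of the field list after LOAD. If non-English characters look wrong after loading, replacing codepage is 1252 with utf8 in the format specification may help, depending on how the page is encoded.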
The script wizard now prompts us to save the document with the *.qvw file extension and asks for a location in which to save it. Click "Next step" to proceed. Now it is time to see the data that is loaded from the web file. We use a Table Box sheet object to display this data.
Create Table Box
The Table Box is a sheet object that displays the available data as a table. It is invoked from the menu Layout → New Sheet Object → Table Box.
On clicking Next, we get the option to choose the fields to display in the Table Box. You can use the Promote or Demote buttons to rearrange the fields.
Table Box Data
On completing the above step, the Table Box sheet object appears, showing the data that is read from the web file. Note the non-English characters in the loaded data.