Save HTML table into CSV

Download an HTML table as a CSV file.

Source code and partial documentation

Status: The plugin is in an experimental stable state.

Todo:

Requirement

Resources

Standards:

Other resources:

Feature overview

This feature request has three parts:

  1. Provide a UI (user interface) that lets the user initiate the download of the HTML table in CSV format.
    • Add a new action to the action manager. This seems the easiest and fastest solution, and it allows web publishers to use the doaction plugin to trigger the action.
  2. Create the CSV file.
    • Collect the tabular information from the HTML table and/or from the data table.
    • Generate the content of the CSV file.
  3. Add a download functionality.
    • It will be limited to the browsers supported by WET.
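The "generate the content of the CSV file" step can be sketched as below. This is a minimal sketch, not the plugin's actual code: the function names escapeCsvField and toCsv are hypothetical, and the quoting rules follow RFC 4180 on the assumption that this is the CSV dialect wanted.

```javascript
// Hypothetical helper: quote a field only when it contains a comma,
// a double quote, or a line break, doubling any embedded double quotes.
function escapeCsvField(value) {
  var text = String(value == null ? "" : value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: rows is an array of arrays of cell text,
// with the heading row first. Records are separated by CRLF.
function toCsv(rows) {
  return rows.map(function (row) {
    return row.map(escapeCsvField).join(",");
  }).join("\r\n");
}
```

For example, toCsv([["Name", "Note"], ["Bob", "a,b"]]) puts the second record's last field in double quotes because it contains a comma.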

Structure

The new action will be named “tocsv”. The “tocsv” action will be responsible for collecting the information from the HTML table, structuring the CSV file, and then initiating the download. The download itself will live outside the action, as it is very likely to be reused by other components for other purposes, such as saving a user state.
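The separation described above could look like the following sketch. It only illustrates the idea of keeping the download outside the action; createToCsvAction and its three parameters are hypothetical names, not part of the WET action manager API.

```javascript
// Hypothetical factory: the action receives its collaborators as
// arguments, so the download function stays reusable by other components.
function createToCsvAction(collect, toCsv, download) {
  return function tocsv(source, filename) {
    var rows = collect(source);   // 1. collect the tabular information
    var csv = toCsv(rows);        // 2. structure the CSV content
    download(csv, filename, "text/csv;charset=utf-8"); // 3. delegate the download
    return csv;
  };
}
```

Passing the download function in, rather than defining it inside the action, is one way to leave the door open for moving it elsewhere later.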

Collecting the tabular information

Gathering the information from the HTML table is pretty straightforward, but not from the data table plugin. The challenge with the data table plugin is that its HTML form only shows a subset of the tabular data, although the full tabular data can be retrieved through API calls. Integrating support for data table further hurts the readability of the code, because a completely different technique is needed to access equivalent data. In addition, the array of rows taken from data table does not include the heading row, whereas the rows property of the <table> element does.
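The heading row difference between the two sources can be illustrated with a small sketch. The function names are hypothetical, and the data table side assumes the heading labels and the row data have already been retrieved as plain arrays; how they are fetched from the data table API is left out here.

```javascript
// Rows from a <table> element: table.rows already includes the heading row.
function rowsFromTable(table) {
  return Array.prototype.map.call(table.rows, function (row) {
    return Array.prototype.map.call(row.cells, function (cell) {
      return cell.textContent.trim();
    });
  });
}

// Rows from a data table: the row array has no heading row, so the
// heading labels (assumed to be supplied separately) are prepended.
function rowsFromDataTable(headings, dataRows) {
  return [headings].concat(dataRows);
}
```

Both functions return the same shape, an array of arrays with the heading row first, so the CSV generation does not need to know which source was used.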

When iterating through rows with the data table API, not every row has an associated HTML element, so the cell information is retrieved through the cell( rowIndex, colIndex ).data() API. A test is then required to know whether the content of that cell is HTML: the CSV must only provide textual content, not HTML markup, unless that markup is explicitly displayed on screen.
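A rough sketch of that loop follows. It assumes an api object exposing cell( rowIndex, colIndex ).data() as described above, and takes the row and column counts as plain arguments. The helpers looksLikeHtml, stripTags, cellText and collectRows are hypothetical, and the regular expressions are deliberately naive stand-ins; a browser implementation would more likely parse the markup with the DOM and read its text content.

```javascript
// Naive test: treat a string as HTML when it contains something tag-like.
function looksLikeHtml(value) {
  return typeof value === "string" && /<[a-z][\s\S]*>/i.test(value);
}

// Naive stand-in for HTML-to-text conversion (does not decode entities).
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Return the textual content of one cell value.
function cellText(value, toText) {
  if (value == null) {
    return "";
  }
  return looksLikeHtml(value) ? toText(value) : String(value);
}

// Walk every cell through the API, since a row may have no HTML element.
function collectRows(api, rowCount, colCount, toText) {
  var rows = [], row, r, c;
  for (r = 0; r < rowCount; r += 1) {
    row = [];
    for (c = 0; c < colCount; c += 1) {
      row.push(cellText(api.cell(r, c).data(), toText));
    }
    rows.push(row);
  }
  return rows;
}
```

The conversion function is passed in as toText so that the naive stripTags can be swapped for a DOM-based one without touching the loop.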

Complex table support

The collection of tabular data is limited to simple tables. Complex tables, like the ones supported by the WET table parser, are considered out of scope for now. Supporting them would require defining several strategies to cover each variation, such as grouping and reversing the axes, which might end up creating a complicated configuration file for the “tocsv” action. Also, at the time of writing, it is noted that the WET table parser needs to be fully reviewed and rewritten. It was proposed to follow a structure similar to Data Cube.

Download

The download function was inspired by the FileSaver.js code, but I removed all the support for browsers that WET does not support, which reduced the code considerably. Also, we did not implement the fix that forces a BOM (byte order mark) onto text files, as no file in WET includes a BOM.
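A reduced download function along those lines might look like the sketch below. It is not the plugin's actual code: downloadText and its env parameter are hypothetical, and it assumes a browser that supports Blob, URL.createObjectURL and the download attribute on anchors. No BOM is prepended, in line with the note above.

```javascript
// Hypothetical download helper. env exists only so the browser globals
// can be replaced in tests; it defaults to the real ones.
function downloadText(content, filename, mimeType, env) {
  env = env || { document: document, URL: URL, Blob: Blob };

  // The content is written as-is: no BOM is added in front of it.
  var blob = new env.Blob([content], { type: mimeType });
  var url = env.URL.createObjectURL(blob);

  // A temporary anchor with a download attribute triggers the save.
  var link = env.document.createElement("a");
  link.href = url;
  link.download = filename;
  env.document.body.appendChild(link);
  link.click();
  env.document.body.removeChild(link);

  // Release the object URL once the click has been dispatched.
  setTimeout(function () {
    env.URL.revokeObjectURL(url);
  }, 0);

  return blob;
}
```

In a page, a call such as downloadText( csv, "table.csv", "text/csv;charset=utf-8" ) would be enough, which is part of why the function is a candidate for reuse outside the action.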

Consider moving the download function into the WET core.

Developer notes