This repository was archived by the owner on Jan 2, 2019. It is now read-only.

HTML file validation fails on valid page #225

Open
@mzeddd

Hi,

I'm trying to fetch data from an HTML page and read it as an Excel table.

This is what I have (code combined from several examples):


$url = "http://paraplan.ru/forum/jonatan.php";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);          // URL to fetch
curl_setopt($ch, CURLOPT_FAILONERROR, 1);     // treat HTTP status >= 400 as a failure
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the response as a string
curl_setopt($ch, CURLOPT_HEADER, 0);          // leave response headers out of the result
curl_setopt($ch, CURLOPT_TIMEOUT, 3);         // time out after 3s
$result = curl_exec($ch);                     // run the request
if ($result === false) {
    // curl_exec() returns false on failure; stop instead of writing an empty file
    die('cURL error: ' . curl_error($ch));
}
curl_close($ch);

$outputFile = 'var/counter.html';

$fp = fopen($outputFile, 'w');
$test = fwrite($fp, $result);
fclose($fp);

$objReader = PHPExcel_IOFactory::createReader('HTML');
$objPHPExcel = $objReader->load($outputFile);


With this code, PHPExcel stops loading the file because it does not consider it a valid HTML file.

I think the validation function should be checked. It may be failing because of the page's Russian (Cyrillic) content.

For now I have disabled the validation in my code so I can continue with my experiments.
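
One way to test the Cyrillic hypothesis without disabling validation might be to normalize the fetched page to UTF-8 before saving it. The sketch below is only an illustration: the helper name `toUtf8` is hypothetical, the source charset `Windows-1251` is an assumption (common for Russian pages, but not confirmed for this one), and it needs the `mbstring` extension. Whether this actually satisfies the PHPExcel HTML reader's check has not been verified.

```php
<?php
/**
 * Hypothetical helper: return $html as UTF-8.
 * If the input is already valid UTF-8 it is returned unchanged;
 * otherwise it is converted from $assumedCharset, which is a guess
 * (Windows-1251 here), not something read from the page itself.
 */
function toUtf8($html, $assumedCharset = 'Windows-1251')
{
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html; // already valid UTF-8, nothing to do
    }
    return mb_convert_encoding($html, 'UTF-8', $assumedCharset);
}
```

In the code above this would be applied just before writing the file, for example `fwrite($fp, toUtf8($result));`. If the reader still rejects the converted file, the encoding is probably not the cause.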
