HTML file validation fails on valid page #225
Description
Hi,
I'm trying to fetch data from an HTML page and read it as an Excel table.
This is what I have (code combined from several examples):
$url = "http://paraplan.ru/forum/jonatan.php";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); // set URL to fetch
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);// allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); // return into a variable
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 3); // times out after 3s
$result = curl_exec($ch); // run the whole process
curl_close($ch);
$outputFile = 'var/counter.html';
$fp = fopen($outputFile, 'w');
$test = fwrite($fp, $result);
fclose($fp);
$objReader = PHPExcel_IOFactory::createReader('HTML');
$objPHPExcel = $objReader->load($outputFile);
With this code, PHPExcel refuses to load the file because it does not consider it a valid HTML file.
I think the validation function should be checked. It may be failing because the page content is in Russian (Cyrillic).
For now I have disabled the validation in my copy of the code so that I can continue with my experiments.
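
Before pointing at the validation itself, it may be worth ruling out simpler causes. The code above never checks the return value of curl_exec(), which is false on failure (for example when the 3-second timeout is hit), and in that case an empty file gets written. The following is a minimal diagnostic sketch under stated assumptions: `diagnoseHtmlFile` is a hypothetical helper and not part of PHPExcel, and the 2048-byte sample and the UTF-8 check are guesses at what could matter, not a description of what the reader actually does.

```php
<?php
// Hypothetical helper (not part of PHPExcel): reports why a saved download
// might not look like HTML. Returns one of:
// 'missing', 'empty', 'no-tags', 'not-utf8', 'ok'.
function diagnoseHtmlFile($path)
{
    if (!is_file($path)) {
        return 'missing';
    }

    $data = file_get_contents($path);
    if ($data === false || $data === '') {
        // A failed curl_exec() returns false, and writing false produces an empty file.
        return 'empty';
    }

    // Look for markup in the first 2 KB only (sample size is an assumption).
    $head = substr($data, 0, 2048);
    if (strlen($head) === strlen(strip_tags($head))) {
        return 'no-tags';
    }

    // With the /u modifier, preg_match() fails on bytes that are not valid UTF-8.
    if (preg_match('//u', $data) !== 1) {
        return 'not-utf8';
    }

    return 'ok';
}
```

Calling `diagnoseHtmlFile('var/counter.html')` right after the download would show which case applies. A result of 'empty' points at the fetch rather than the reader. A result of 'not-utf8' would suggest the page is served in a legacy encoding such as Windows-1251 (an assumption; the encoding has not been checked here), in which case converting the content to UTF-8 before loading might be worth trying.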