feature:low memory to export Excel #2524

Closed
124 changes: 124 additions & 0 deletions docs/topics/low-memory-to-export-excel.md
@@ -0,0 +1,124 @@
# Low-memory way to export Excel

Exporting a large Excel file usually takes a lot of memory. The existing memory-saving method
reduces memory use, but it needs external storage and takes more time.
The low-memory way described here lets you create an Excel file with very little memory, in about the same time
as exporting with `save()`, or sometimes less.
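
The general idea behind a low-memory export can be sketched with PHP's built-in `XMLWriter`: rows are written to a file as they are produced, and the buffer is flushed at intervals instead of holding the whole sheet in memory. This is only an illustration under assumptions. The function names `streamRows()` and `demoRows()` and the simplified XML layout are hypothetical and are not the writer's actual implementation or file format.

```php
<?php
// Sketch only: stream rows to a file and flush the XMLWriter buffer every
// $flushEvery rows. The element names are simplified for illustration.
function streamRows(string $path, iterable $rows, int $flushEvery = 1000): int
{
    $xml = new XMLWriter();
    if ($xml->openUri($path) === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }
    $xml->startDocument('1.0', 'UTF-8');
    $xml->startElement('sheetData');

    $count = 0;
    foreach ($rows as $values) {
        $xml->startElement('row');
        foreach ($values as $value) {
            $xml->writeElement('c', (string) $value);
        }
        $xml->endElement(); // row

        if (++$count % $flushEvery === 0) {
            $xml->flush(); // write buffered XML to the file and empty the buffer
        }
    }

    $xml->endElement(); // sheetData
    $xml->endDocument();
    $xml->flush();

    return $count;
}

// A generator yields one row at a time, so the full data set is never held in memory.
function demoRows(int $n): Generator
{
    for ($r = 1; $r <= $n; ++$r) {
        yield [$r, $r * 2];
    }
}
```

Calling `streamRows('/tmp/demo.xml', demoRows(50000))` would write 50,000 rows while buffering at most 1,000 of them at a time.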

## How to use

Control when data is saved
--

```
$objPHPExcel = new Spreadsheet();
$objWriter = new Writer\Xlsx($objPHPExcel);

// Before you load any data or set styles, comments, images, etc.,
// call prepareBeforeSave() with the output path.
$objWriter->prepareBeforeSave('/path/to/save.xlsx');

// Call saveSheetHeader() once for every new sheet.
$objActSheet = $objWriter->saveSheetHeader();

// Now you can load your data. Row numbers start at 1.
for ($r = 1; $r <= 5; ++$r) {
    $objActSheet->setCellValue('C' . $r, $r);
    $objActSheet->setCellValue('D' . $r, $r);
    // This is the key to saving memory: call saveSheetFormData()
    // whenever you want to release memory.
    $objActSheet = $objWriter->saveSheetFormData();
}
// You can still set other cells, as long as the row is not below
// the rows that have already been saved.
$objActSheet->setCellValue('C6', 'some data');

// You can call saveSheetFormData() at any point to release memory right away,
// but the call is optional.
$objActSheet = $objWriter->saveSheetFormData();
...
...

// When a sheet has no more data to load, call afterSaveSheetFormData().
// Note: saveSheetHeader() and afterSaveSheetFormData() can each be called only once per sheet.
$objWriter->afterSaveSheetFormData();

// When everything is done and you want to write out the Excel file, call finish().
$objWriter->finish();
```

Create multiple sheets
--
Create the new sheet after calling `afterSaveSheetFormData()`, like this:
```
$objWriter->afterSaveSheetFormData();

$objPHPExcel->createSheet(1);
$objPHPExcel->setActiveSheetIndex(1);
// Call saveSheetHeader() again for the new sheet.
$objActSheet = $objWriter->saveSheetHeader();

// Start the loop for this sheet.
for ($r = 1; $r <= 5; ++$r) {
    $objActSheet->setCellValue('C' . $r, $r);
    $objActSheet->setCellValue('D' . $r, $r);
    // Call saveSheetFormData() as before.
    $objActSheet = $objWriter->saveSheetFormData();
}
// Close the sheet in the same way.
$objWriter->afterSaveSheetFormData();

$objWriter->finish();
```
In short, you repeat the calls to `saveSheetHeader()`, `saveSheetFormData()` and `afterSaveSheetFormData()` for each sheet.
This is slightly more involved than `save()`, but it is still straightforward.

You can also write to several sheets in turn:
```
// Write to sheet 0 first.
$objPHPExcel->setActiveSheetIndex(0);
$objActSheet = $objPHPExcel->getActiveSheet();
$objActSheet->setCellValue('E4', 1);

// Then write to sheet 1.
$objPHPExcel->setActiveSheetIndex(1);
$objActSheet = $objPHPExcel->getActiveSheet();
$objActSheet->setCellValue('E4', 1);

$objWriter->afterSaveSheetFormData(); // close sheet 1

// Write to sheet 0 again.
$objPHPExcel->setActiveSheetIndex(0);
$objActSheet = $objPHPExcel->getActiveSheet();
$objActSheet->setCellValue('E5', 1);
$objWriter->afterSaveSheetFormData(); // close sheet 0
```
As long as you have not called `afterSaveSheetFormData()` for a sheet, it stays open, and you can switch back to it with `setActiveSheetIndex()` and keep writing.


## Things to watch out for
Cell values must be set in row order.
```
$objActSheet->setCellValue('E4', 1);
$objActSheet->setCellValue('E5', 10);
$objActSheet = $objWriter->saveSheetFormData();

// After saveSheetFormData(), setting a cell whose row number is lower than
// the highest row you have already set has no effect.
$objActSheet->setCellValue('E4', 10); // E4 is still 1
$objActSheet->setCellValue('A4', 10); // has no effect
```
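
The row-order rule can be modelled with a small guard that remembers the highest row already flushed. This is a hypothetical sketch, not the writer's code: the class `RowOrderGuard` and its methods are invented for illustration, and it assumes that a write to the highest flushed row itself is still accepted, which the text above does not state either way.

```php
<?php
// Hypothetical model of the row-order rule described above.
final class RowOrderGuard
{
    private int $highestRow = 0;      // highest row number set so far
    private int $highestFlushed = 0;  // highest row already written out

    /** Returns true if the write takes effect, false if it is ignored. */
    public function set(int $row): bool
    {
        if ($row < $this->highestFlushed) {
            return false; // row lies below data that was already flushed
        }
        $this->highestRow = max($this->highestRow, $row);

        return true;
    }

    /** Stands in for saveSheetFormData(): rows set so far are written out. */
    public function flush(): void
    {
        $this->highestFlushed = $this->highestRow;
    }
}
```

A guard like this could be used in application code to detect out-of-order writes early instead of discovering missing values in the exported file.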

You must call `saveSheetHeader()` and `afterSaveSheetFormData()` in pairs; otherwise the generated file is malformed.
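
One way to catch unpaired calls during development is a small call-order checker. The sketch below is hypothetical: the class `SheetCallOrder` is not part of the library, and the explicit sheet index parameter is an assumption added so that several sheets can be open at once.

```php
<?php
// Hypothetical call-order checker; method names mirror the ones described
// above, but the int $sheet parameter is an assumption for this sketch.
final class SheetCallOrder
{
    /** @var array<int, string> sheet index => 'open' | 'closed' */
    private array $state = [];

    public function saveSheetHeader(int $sheet): void
    {
        if (isset($this->state[$sheet])) {
            throw new LogicException("saveSheetHeader() already called for sheet {$sheet}");
        }
        $this->state[$sheet] = 'open';
    }

    public function afterSaveSheetFormData(int $sheet): void
    {
        if (($this->state[$sheet] ?? null) !== 'open') {
            throw new LogicException("Sheet {$sheet} is not open");
        }
        $this->state[$sheet] = 'closed';
    }

    /** Returns the number of completed sheets, or throws if any sheet is still open. */
    public function finish(): int
    {
        $open = array_keys($this->state, 'open', true);
        if ($open !== []) {
            throw new LogicException('Unclosed sheets: ' . implode(', ', $open));
        }

        return count($this->state);
    }
}
```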

To keep memory use low, call `saveSheetFormData()` before too many cells accumulate.<br>
On the other hand, you shouldn't call `saveSheetFormData()` after every few cells, because each call costs extra time.<br>
Fortunately, the overhead is only a few seconds.
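
The trade-off above amounts to flushing once per batch of rows rather than after every row. A minimal sketch, assuming hypothetical callbacks: `$setRow` stands in for the `setCellValue()` calls of one row, and `$flush` stands in for `saveSheetFormData()`. The function `writeInBatches()` is not part of the library.

```php
<?php
// Sketch: call $flush once per $batchSize rows and return the number of flushes.
function writeInBatches(int $totalRows, int $batchSize, callable $setRow, callable $flush): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }

    $flushes = 0;
    for ($row = 1; $row <= $totalRows; ++$row) {
        $setRow($row);
        if ($row % $batchSize === 0) {
            $flush();
            ++$flushes;
        }
    }
    if ($totalRows % $batchSize !== 0) {
        $flush(); // flush the final partial batch
        ++$flushes;
    }

    return $flushes;
}
```

With 50,000 rows and a batch size of 1,000, this makes 50 flush calls instead of 50,000. A suitable batch size depends on row width and available memory, so it is worth measuring.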

## Performance comparison
Hardware:<br>
MacBook Pro (16-inch, 2019)<br>
2.6GHz 6-core Intel Core i7 processor<br>
16 GB 2667 MHz DDR4

Test: create an Excel file with 50000 rows in sheet 0 and 100 rows in sheet 1.

1. Old way using `save()`: memory allocated 191369216 bytes, time 54 seconds
1. New low-memory way: memory allocated 10485760 bytes, time 47.473015069962 seconds
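
For scale, 191369216 bytes is about 182.5 MiB and 10485760 bytes is exactly 10 MiB, so the low-memory run allocated roughly 18 times less memory. Figures like these could be collected with a helper along the following lines. It is a sketch, not the script used for the numbers above, and it assumes that "memory allocated" means the value reported by `memory_get_peak_usage(true)`.

```php
<?php
// Sketch of a measurement helper; measure() and toMiB() are illustrative names.
function measure(callable $export): array
{
    $start = microtime(true);
    $export();

    return [
        'seconds' => microtime(true) - $start,
        // Peak memory allocated from the system so far in this process, in bytes.
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}

function toMiB(int $bytes): float
{
    return $bytes / (1024 * 1024);
}
```

Because peak usage is tracked per process, each export variant should be measured in a separate PHP process so that one run does not inflate the other's peak.
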
18 changes: 15 additions & 3 deletions src/PhpSpreadsheet/Shared/XMLWriter.php
@@ -2,6 +2,8 @@

namespace PhpOffice\PhpSpreadsheet\Shared;

use PhpOffice\PhpSpreadsheet\Writer\Exception as WriterException;

class XMLWriter extends \XMLWriter
{
public static $debugEnabled = false;
@@ -17,13 +19,20 @@ class XMLWriter extends \XMLWriter
*/
private $tempFileName = '';

/**
* Whether the temporary file will be unlinked when the object is destructed.
*
* @var bool
*/
public $needUnlink = true;

/**
* Create a new XMLWriter instance.
*
* @param int $temporaryStorage Temporary storage location
* @param string $temporaryStorageFolder Temporary storage folder
*/
public function __construct($temporaryStorage = self::STORAGE_MEMORY, $temporaryStorageFolder = null)
public function __construct($temporaryStorage = self::STORAGE_MEMORY, $temporaryStorageFolder = null, bool $specifyPath = false)
{
// Open temporary storage
if ($temporaryStorage == self::STORAGE_MEMORY) {
@@ -33,7 +42,10 @@ public function __construct($temporaryStorage = self::STORAGE_MEMORY, $temporary
if ($temporaryStorageFolder === null) {
$temporaryStorageFolder = File::sysGetTempDir();
}
$this->tempFileName = @tempnam($temporaryStorageFolder, 'xml');
$this->tempFileName = $specifyPath ? $temporaryStorageFolder : @tempnam($temporaryStorageFolder, 'xml');
if (empty($this->tempFileName)) {
throw new WriterException('can not open an empty file');
}

// Open storage
if ($this->openUri($this->tempFileName) === false) {
@@ -54,7 +66,7 @@ public function __construct($temporaryStorage = self::STORAGE_MEMORY, $temporary
public function __destruct()
{
// Unlink temporary files
if ($this->tempFileName != '') {
if ($this->tempFileName != '' && $this->needUnlink) {
@unlink($this->tempFileName);
}
}
9 changes: 6 additions & 3 deletions src/PhpSpreadsheet/Spreadsheet.php
@@ -342,9 +342,9 @@ public function setRibbonBinObjects($BinObjectsNames, $BinObjectsData): void
* List of unparsed loaded data for export to same format with better compatibility.
* It has to be minimized when the library start to support currently unparsed data.
*
* @internal
*
* @return array
*
* @internal
*/
public function getUnparsedLoadedData()
{
@@ -653,14 +653,17 @@ public function addSheet(Worksheet $worksheet, $sheetIndex = null)
*
* @param int $sheetIndex Index position of the worksheet to remove
*/
public function removeSheetByIndex($sheetIndex): void
public function removeSheetByIndex($sheetIndex, bool $disconnectedAll = false): void
{
$numSheets = count($this->workSheetCollection);
if ($sheetIndex > $numSheets - 1) {
throw new Exception(
"You tried to remove a sheet by the out of bounds index: {$sheetIndex}. The actual number of sheets is {$numSheets}."
);
}
if ($disconnectedAll) {
$this->workSheetCollection[$sheetIndex]->__destruct();
}
array_splice($this->workSheetCollection, $sheetIndex, 1);

// Adjust active sheet index if necessary
5 changes: 5 additions & 0 deletions src/PhpSpreadsheet/Worksheet/AutoFilter.php
@@ -57,6 +57,11 @@ public function __construct(string $range = '', ?Worksheet $worksheet = null)
$this->workSheet = $worksheet;
}

public function disconnectWorksheet(): void
{
$this->workSheet = null;
}

/**
* Get AutoFilter Parent Worksheet.
*
32 changes: 32 additions & 0 deletions src/PhpSpreadsheet/Worksheet/Worksheet.php
@@ -399,6 +399,8 @@ public function __destruct()

$this->disconnectCells();
$this->rowDimensions = [];
// Make sure all references to the worksheet are disconnected; otherwise they cause a memory leak.
$this->autoFilter->disconnectWorksheet();
}

/**
@@ -543,6 +545,16 @@ public function getDrawingCollection()
return $this->drawingCollection;
}

/**
* Set collection of drawings.
*
* @param ArrayObject<int, BaseDrawing> $drawingCollection
*/
public function setDrawingCollection($drawingCollection): void
{
$this->drawingCollection = $drawingCollection;
}

/**
* Get collection of charts.
*
@@ -553,6 +565,16 @@ public function getChartCollection()
return $this->chartCollection;
}

/**
* Set collection of charts.
*
* @param ArrayObject<int, Chart> $chartCollection
*/
public function setChartCollection($chartCollection): void
{
$this->chartCollection = $chartCollection;
}

/**
* Add chart.
*
@@ -2960,6 +2982,16 @@ public function getHyperlinkCollection()
return $this->hyperlinkCollection;
}

/**
* Set collection of hyperlinks.
*
* @param Hyperlink[] $hyperlinkCollection
*/
public function setHyperlinkCollection($hyperlinkCollection): void
{
$this->hyperlinkCollection = $hyperlinkCollection;
}

/**
* Get data validation.
*
25 changes: 25 additions & 0 deletions src/PhpSpreadsheet/Writer/BaseWriter.php
@@ -35,6 +35,13 @@ abstract class BaseWriter implements IWriter
*/
private $diskCachingDirectory = './';

/**
* disk stored path.
*
* @var string
*/
private $fileStorePath = '';

/**
* @var resource
*/
@@ -74,6 +81,14 @@ public function getUseDiskCaching()
return $this->useDiskCaching;
}

/**
* @return string
*/
public function getFileStorePath()
{
return $this->fileStorePath;
}

public function setUseDiskCaching($useDiskCache, $cacheDirectory = null)
{
$this->useDiskCaching = $useDiskCache;
@@ -89,6 +104,16 @@ public function setUseDiskCaching($useDiskCache, $cacheDirectory = null)
return $this;
}

/**
* @param bool $pValue
* @param string $pFilePath
*/
public function setUserDiskFileStore($pValue, $pFilePath): void
{
$this->useDiskCaching = $pValue;
$this->fileStorePath = $pFilePath;
}

public function getDiskCachingDirectory()
{
return $this->diskCachingDirectory;