Japanese or Chinese (UTF-8) text is not read correctly from .doc files  #1817

Closed
@Tarun-developer

Description

Japanese and Chinese (UTF-8) text is not read correctly from .doc files.
Example 1: にっぽんこく、にほんこ
Example 2: 監警會決定不在司法覆核裁決前發布首份階段性報告
Text in these languages comes out garbled when it is read from the .doc format.

Steps to Reproduce

function phpReader($source) {
    // Load the legacy binary .doc file with the MsDoc reader.
    $phpWord = \PhpOffice\PhpWord\IOFactory::load($source, 'MsDoc');
    $txt = ExtractText($phpWord);
    return $txt;
}
function ExtractText($obj, $nested = 0) {
    $txt = "";
    if (method_exists($obj, 'getSections')) {
        foreach ($obj->getSections() as $section) {
            $txt .= " " . ExtractText($section, $nested + 1);
        }
    } else if (method_exists($obj, 'getElements')) {
        foreach ($obj->getElements() as $element) {
            $txt .= " " . ExtractText($element, $nested + 1);
        }
    } else if (method_exists($obj, 'getText')) {
        $txt .= $obj->getText();
    } else if (method_exists($obj, 'getRows')) {
        foreach ($obj->getRows() as $row) {
            $txt .= " " . ExtractText($row, $nested + 1);
        }
    } else if (method_exists($obj, 'getCells')) {
        foreach ($obj->getCells() as $cell) {
            $txt .= " " . ExtractText($cell, $nested + 1);
        }
    } else if (get_class($obj) != "PhpOffice\PhpWord\Element\TextBreak") {
        $txt .= "(" . get_class($obj) . ")"; # unknown object, you need to add it
    }
    return $txt;
}

The response for example 1 is "k0c0}0�0S0 O k { � S "
The response for example 2 is " ãvf��gzl�[ N(WøSÕl��8hÁ �z"
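
The garbled output above looks like the UTF-16LE bytes of the original text being treated as single-byte characters: に is U+306B, whose little-endian bytes are 0x6B 0x30, which print as "k0", and 監 is U+76E3, whose bytes 0xE3 0x76 print as "ãv". The sketch below only illustrates that observation for the first five characters of example 1. It assumes the mbstring extension is available, it does not call PHPWord, and it is not a confirmed diagnosis of the MsDoc reader.

```php
<?php
// Assumption: the mbstring extension is available.
// First five characters of example 1, as a UTF-8 source literal.
$original = 'にっぽんこ';

// Encode as UTF-16LE, the two-byte form in which a .doc file can store text.
$utf16 = mb_convert_encoding($original, 'UTF-16LE', 'UTF-8');

// Each character becomes two bytes, low byte first:
// に (U+306B) -> 0x6B 0x30, which displays as "k0" when read byte by byte.
echo bin2hex($utf16), PHP_EOL; // 6b3063307d3093305330

// Decoding the same bytes as UTF-16LE recovers the original text.
$roundTrip = mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE');
var_dump($roundTrip === $original); // bool(true)
```

Read one byte at a time, those ten bytes display as "k0c0}0�0S0", which matches the start of the example 1 response (0x93 is not printable ASCII, hence the replacement character).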

  • PHPWord Version: 2020

The $phpWord object returned for example 2 looks like this (truncated):

```

PhpOffice\PhpWord\PhpWord Object
(
    [sections:PhpOffice\PhpWord\PhpWord:private] => Array
        (
            [0] => PhpOffice\PhpWord\Element\Section Object
                (
                    [container:protected] => Section
                    [style:PhpOffice\PhpWord\Element\Section:private] => PhpOffice\PhpWord\Style\Section Object
                        (
                            [orientation:PhpOffice\PhpWord\Style\Section:private] => portrait
                            [paper:PhpOffice\PhpWord\Style\Section:private] => PhpOffice\PhpWord\Style\Paper Object
                                (
                                    [sizes:PhpOffice\PhpWord\Style\Paper:private] => Array
                                        (
                                            [A3] => Array
                                                (
                                                    [0] => 297
                                                    [1] => 420
                                                    [2] => mm
                                                )

                                            [A4] => Array
                                                (
                                                    [0] => 210
                                                    [1] => 297
                                                    [2] => mm
                                                )

                                            [A5] => Array
                                                (
                                                    [0] => 148
                                                    [1] => 210
                                                    [2] => mm
                                                )

                                            [B5] => Array
                                                (
                                                    [0] => 176
                                                    [1] => 250
                                                    [2] => mm
                                                )

                                            [Folio] => Array
                                                (
                                                    [0] => 8.5
                                                    [1] => 13
                                                    [2] => in
                                                )

                                            [Legal] => Array
                                                (
                                                    [0] => 8.5
                                                    [1] => 14
                                                    [2] => in
                                                )

                                            [Letter] => Array
                                                (
                                                    [0] => 8.5
                                                    [1] => 11
                                                    [2] => in
                                                )

                                        )

                                    [size:PhpOffice\PhpWord\Style\Paper:private] => A4
                                    [width:PhpOffice\PhpWord\Style\Paper:private] => 11905.511811024
                                    [height:PhpOffice\PhpWord\Style\Paper:private] => 16837.795275591
                                    [styleName:protected] => 
                                    [index:protected] => 
                                    [aliases:protected] => Array
                                        (
                                        )

                                    [isAuto:PhpOffice\PhpWord\Style\AbstractStyle:private] => 
                                )

                            [pageSizeW:PhpOffice\PhpWord\Style\Section:private] => 11905.511811024
                            [pageSizeH:PhpOffice\PhpWord\Style\Section:private] => 16837.795275591
                            [marginTop:PhpOffice\PhpWord\Style\Section:private] => 1440
                            [marginLeft:PhpOffice\PhpWord\Style\Section:private] => 1440
                            [marginRight:PhpOffice\PhpWord\Style\Section:private] => 1440
                            [marginBottom:PhpOffice\PhpWord\Style\Section:private] => 1440
                            [gutter:PhpOffice\PhpWord\Style\Section:private] => 0
                            [headerHeight:PhpOffice\PhpWord\Style\Section:private] => 720
                            [footerHeight:PhpOffice\PhpWord\Style\Section:private] => 720
                            [pageNumberingStart:PhpOffice\PhpWord\Style\Section:private] => 
                            [colsNum:PhpOffice\PhpWord\Style\Section:private] => 1
                            [colsSpace:PhpOffice\PhpWord\Style\Section:private] => 720
                            [breakType:PhpOffice\PhpWord\Style\Section:private] => 
                            [lineNumbering:PhpOffice\PhpWord\Style\Section:private] => 
                            [vAlign:PhpOffice\PhpWord\Style\Section:private] => 
                            [borderTopSize:protected] => 
                            [borderTopColor:protected] => 
                            [borderTopStyle:protected] => 
                            [borderLeftSize:protected] => 
                            [borderLeftColor:protected] => 
                            [borderLeftStyle:protected] => 
                            [borderRightSize:protected] => 
                            [borderRightColor:protected] => 
                            [borderRightStyle:protected] => 
                            [borderBottomSize:protected] => 
                            [borderBottomColor:protected] => 
                            [borderBottomStyle:protected] => 
                            [styleName:protected] => 
                            [index:protected] => 
                            [aliases:protected] => Array
                                (
                                )

                            [isAuto:PhpOffice\PhpWord\Style\AbstractStyle:private] => 
                        )

                    [headers:PhpOffice\PhpWord\Element\Section:private] => Array
                        (
                        )

                    [footers:PhpOffice\PhpWord\Element\Section:private] => Array
                        (
                        )

                    [footnoteProperties:PhpOffice\PhpWord\Element\Section:private] => 
                    [elements:protected] => Array
                        (
                            [0] => PhpOffice\PhpWord\Element\Text Object
                                (
                                    [text:protected] => ãvf��gzl�[
                                    [fontStyle:protected] => PhpOffice\PhpWord\Style\Font Object
                                        (
                                            [aliases:protected] => Array
                                                (
                                                    [line-height] => lineHeight
                                                    [letter-spacing] => spacing
                                                )

                                            [type:PhpOffice\PhpWord\Style\Font:private] => text
                                            [name:PhpOffice\PhpWord\Style\Font:private] => Songti SC
                                            [hint:PhpOffice\PhpWord\Style\Font:private] => 
                                            [size:PhpOffice\PhpWord\Style\Font:private] => 
                                            [color:PhpOffice\PhpWord\Style\Font:private] => 000000
                                            [bold:PhpOffice\PhpWord\Style\Font:private] => 1
                                            [italic:PhpOffice\PhpWord\Style\Font:private] => 
                                            [underline:PhpOffice\PhpWord\Style\Font:private] => none
                                            [superScript:PhpOffice\PhpWord\Style\Font:private] => 
                                            [subScript:PhpOffice\PhpWord\Style\Font:private] => 
                                            [strikethrough:PhpOffice\PhpWord\Style\Font:private] => 
                                            [doubleStrikethrough:PhpOffice\PhpWord\Style\Font:private] => 
                                            [smallCaps:PhpOffice\PhpWord\Style\Font:private] => 
                                            [allCaps:PhpOffice\PhpWord\Style\Font:private] => 
                                            [fgColor:PhpOffice\PhpWord\Style\Font:private] => 
                                            [scale:PhpOffice\PhpWord\Style\Font:private] => 
                                            [spacing:PhpOffice\PhpWord\Style\Font:private] => 
                                            [kerning:PhpOffice\PhpWord\Style\Font:private] => 
                                            [paragraph:PhpOffice\PhpWord\Style\Font:private] => 





```
