Wednesday, August 29, 2012

How can XDocReport (to PDF) supply the Chinese character

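This post continues the previous one on using the XDocReport API to convert a Word template into Word or PDF, and focuses on how Chinese text is rendered when converting to PDF. The project site has a discussion of this topic, Issue 81: how can xdocreport supply the Chinese character., but it remains unresolved at the time of writing. This post describes a modification to the API and the tests run after the change. I do not recommend building a system on this modification: because time was limited, the change is hard-coded and has no complete interface. For real use, stick with the official release.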

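The project site has a page, XWPFDocument 2 PDF, explaining that the conversion uses the operating system's fonts and encoding, yet Chinese text still fails to display. The sample is again DocxProjectWithVelocity2PDF.java from the previous post, so the steps for running it are not repeated here. The title of this post is taken directly from Issue 81.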



Preparing the environment

Environment used for testing
  • JDK (Java Development Kit) version 1.6+
  • Apache Maven 2.2.1+
  • XDocReport 0.9.8
  • org.apache.poi.xwpf.converter-0.9.8-sources

Download org.apache.poi.xwpf.converter-0.9.8-sources.jar. The directory structure from the previous post is reused here, so extract the downloaded jar into docxandvelocity.converters-0.9.8\src\main\java.


Build & Test Apache POI XWPF Converter



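  1. Modify pom.xml. The dependencies from line 49 onward are newly added to compile the Apache POI XWPF Converter API, and those from line 91 onward are for running the sample. Because the sample must import the API modified in the next step, it now references the individual libraries, and the Apache POI XWPF Converter API dependency is removed (commented out) at line 162.

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"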
      xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
    
      <groupId>docxConverters</groupId>
      <artifactId>docxConverters</artifactId>
      <version>1.0-SNAPSHOT</version>
      <name>docxConverters</name>
      <url>http://maven.apache.org</url>
    
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
                        <source>1.6</source>
                        <target>1.6</target>
                        <encoding>UTF-8</encoding>
                    </configuration>
                </plugin>
    
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-resources-plugin</artifactId>
                    <configuration>
                        <encoding>UTF-8</encoding>
                    </configuration>
                </plugin>
                <!--            -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <configuration>
                        <skipTests>true</skipTests>
                    </configuration>
                </plugin>
            </plugins>
    
            <resources>
                <resource>
                    <directory>src/main/resources</directory>
                </resource>
            </resources>
    
        </build>
    
      <dependencies>
        <!--     compile org.apache.poi.xwpf.converter-0.9.8       -->
        <dependency>
            <groupId>bouncycastle</groupId>
            <artifactId>bcmail-jdk14</artifactId>
            <version>139</version>
        </dependency>
        <dependency>
            <groupId>bouncycastle</groupId>
            <artifactId>bcprov-jdk14</artifactId>
            <version>140</version>
        </dependency>
        <dependency>
            <groupId>bouncycastle</groupId>
            <artifactId>bctsp-jdk14</artifactId>
            <version>138</version>
        </dependency>
        <dependency>
            <groupId>commons-fileupload</groupId>
            <artifactId>commons-fileupload</artifactId>
            <version>1.2.2</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.9</version>
        </dependency>
        <dependency>
            <groupId>org.osgi</groupId>
            <artifactId>org.osgi.core</artifactId>
            <version>4.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tomcat</groupId>
            <artifactId>servlet-api</artifactId>
            <version>6.0.29</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.6.4</version>
        </dependency>
    
        <!--     docxandvelocity.converters-sample-0.9.8       -->
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.5</version>
        </dependency>
        <dependency>
            <groupId>commons-collections</groupId>
            <artifactId>commons-collections</artifactId>
            <version>3.2.1</version>
        </dependency>
        <dependency>
            <groupId>commons-lang</groupId>
            <artifactId>commons-lang</artifactId>
            <version>2.4</version>
        </dependency>
        <dependency>
            <groupId>dom4j</groupId>
            <artifactId>dom4j</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.converter.docx.xwpf</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.converter</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.core</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.document.docx</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.document</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.itext.extension</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.template.velocity</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>fr.opensagres.xdocreport.template</artifactId>
            <version>0.9.8</version>
        </dependency>
        <dependency>
            <groupId>com.lowagie</groupId>
            <artifactId>itext</artifactId>
            <version>2.1.7</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>ooxml-schemas</artifactId>
            <version>1.1</version>
        </dependency>
        <!-- dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>org.apache.poi.xwpf.converter</artifactId>
            <version>0.9.8</version>
        </dependency -->
        <dependency>
           <groupId>oro</groupId>
           <artifactId>oro</artifactId>
           <version>2.0.8</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>3.8</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.8</version>
        </dependency>
        <dependency>
            <groupId>stax</groupId>
            <artifactId>stax-api</artifactId>
            <version>1.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.velocity</groupId>
            <artifactId>velocity</artifactId>
            <version>1.7</version>
        </dependency>
        <dependency>
            <groupId>xml-apis</groupId>
            <artifactId>xml-apis</artifactId>
            <version>1.0.b2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.xmlbeans</groupId>
            <artifactId>xmlbeans</artifactId>
            <version>2.3.0</version>
        </dependency>
    
      </dependencies>
    </project>
    


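  2. Modify src\main\java\org\apache\poi\xwpf\converter\internal\itext\PDFMapper.java. Only the modified parts are excerpted below: line 9 imports BaseFont, and lines 58 to 63 load a font file from the operating system and rebuild the font with the original size, style, and color.

    ....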
    
    import com.lowagie.text.Chunk;
    import com.lowagie.text.Element;
    import com.lowagie.text.Font;
    import com.lowagie.text.Image;
    import com.lowagie.text.Rectangle;
    import com.lowagie.text.pdf.PdfPCell;
    import com.lowagie.text.pdf.BaseFont;
    
     ....
    
    
        @Override
        protected void visitRun( XWPFRun run, IITextContainer pdfContainer )
            throws Exception
        {
            CTR ctr = run.getCTR();
            // Get family name
            // Get CTRPr from style+defaults
            CTString rStyle = getRStyle( run );
            CTRPr runRprStyle = getRPr( super.getXWPFStyle( rStyle != null ? rStyle.getVal() : null ) );
            CTRPr rprStyle = getRPr( super.getXWPFStyle( run.getParagraph().getStyleID() ) );
            CTRPr rprDefault = getRPr( defaults );
    
            // Font family
            String fontFamily = getFontFamily( run, rprStyle, rprDefault );
    
            // Get font size
            float fontSize = run.getFontSize();
    
            // Get font style
            int fontStyle = Font.NORMAL;
            if ( isBold( run, runRprStyle, rprStyle, rprDefault ) )
            {
                fontStyle |= Font.BOLD;
            }
            if ( isItalic( run, runRprStyle, rprStyle, rprDefault ) )
            {
                fontStyle |= Font.ITALIC;
            }
    
            // Process color
            Color fontColor = null;
            String hexColor = getFontColor( run, runRprStyle, rprStyle, rprDefault );
            if ( StringUtils.isNotEmpty( hexColor ) )
            {
                if ( hexColor != null && !"auto".equals( hexColor ) )
                {
                    fontColor = ColorRegistry.getInstance().getColor( "0x" + hexColor );
                }
            }
            // Get font
            Font font =
                XWPFFontRegistry.getRegistry().getFont( fontFamily, options.getFontEncoding(), fontSize, fontStyle,
                                                        fontColor );
    
            // Traditional Chinese: load a Unicode font file from the OS (hard-coded Windows path)
            BaseFont bfChinese = BaseFont.createFont("c:/Windows/Fonts/arialuni.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            Font fontChinese = new Font(bfChinese, font.getSize(), font.getStyle(), font.getColor());
            if (fontFamily != null)
               fontChinese.setFamily(fontFamily);
            font = fontChinese;
    
            UnderlinePatterns underlinePatterns = run.getUnderline();
    
            boolean singleUnderlined = false;
            switch ( underlinePatterns )
            {
                case SINGLE:
                    singleUnderlined = true;
                    break;
    
                default:
                    break;
            }
    
            List<CTBr> brs = ctr.getBrList();
            for ( @SuppressWarnings( "unused" )
            CTBr br : brs )
            {
                pdfContainer.addElement( Chunk.NEWLINE );
            }
    
            List<CTText> texts = run.getCTR().getTList();
            for ( CTText ctText : texts )
            {
    
                Chunk aChunk = new Chunk( ctText.getStringValue(), font );
                if ( singleUnderlined )
                    aChunk.setUnderline( 1, -2 );
    
                pdfContainer.addElement( aChunk );
            }
    
            super.visitPictures( run, pdfContainer );
    
            // <w:lastRenderedPageBreak />
            List<CTEmpty> lastRenderedPageBreakList = ctr.getLastRenderedPageBreakList();
            if ( lastRenderedPageBreakList != null && lastRenderedPageBreakList.size() > 0 )
            {
                // IText Document#newPage must be called to generate page break.
                // But before that, CTSectPr must be retrieved to compute pageSize,
                // margins...
                // The CTSectPr <w:pPr><w:sectPr w:rsidR="00AA33F7" w:rsidSect="00607077"><w:pgSz w:h="11906" w:orient="landscape" w:w="16838">...
                Stack<CTSectPr> sectPrStack = getSectPrStack();
                if ( sectPrStack != null && !sectPrStack.isEmpty() )
                {
                    CTSectPr sectPr = sectPrStack.pop();
                    applySectPr( sectPr );
                }
                for ( CTEmpty lastRenderedPageBreak : lastRenderedPageBreakList )
                {
                    pdfDocument.newPage();
                }
            }
        }
    
    ....
    
    


  3. Modify DocxProjectWithVelocity2PDF.java so that the template variables are filled with Chinese text, and modify DocxProjectWithVelocity.docx as shown in the figure below.


  4. Switch to the docxandvelocity.converters-0.9.8 directory and run the following commands:

    mvn compile

    mvn exec:java -Dexec.mainClass=fr.opensagres.xdocreport.samples.docxandvelocity.DocxProjectWithVelocity2PDF
    


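  5. When the run succeeds, the output file DocxProjectWithVelocity_Out.pdf is produced, as shown in the figure below. Comparing it with the template shows that some fonts are not rendered correctly. This is a side effect of the modified API; the official version renders those fonts correctly.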
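
The modified PDFMapper.java above hard-codes c:/Windows/Fonts/arialuni.ttf, so it only works on a machine where that exact file exists. A slightly less brittle variant could resolve the font file at run time. The following is a minimal sketch and not part of XDocReport or iText: the class name FontPathResolver, the pdf.chinese.font system property, and the macOS candidate path are all assumptions made for illustration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not an XDocReport or iText class) that picks a font file to load. */
class FontPathResolver {

    /** Hypothetical system property that lets the caller override the font location. */
    static final String FONT_PROPERTY = "pdf.chinese.font";

    /** Candidate files; the first is the path used in this post, the second is an assumption. */
    static final List<String> DEFAULT_CANDIDATES = Arrays.asList(
        "c:/Windows/Fonts/arialuni.ttf",
        "/Library/Fonts/Arial Unicode.ttf");

    /**
     * Returns the override if one is given, otherwise the first candidate
     * that exists as a regular file, otherwise null.
     */
    static String resolve(String override, List<String> candidates) {
        if (override != null && override.trim().length() > 0) {
            return override;
        }
        for (String candidate : candidates) {
            if (new File(candidate).isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = resolve(System.getProperty(FONT_PROPERTY), DEFAULT_CANDIDATES);
        System.out.println(path != null ? "Font file: " + path : "No candidate font file found");
    }
}
```

In PDFMapper.java the resolved path would replace the string literal passed to BaseFont.createFont, and a null result would mean keeping the font returned by XWPFFontRegistry instead of swapping it.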


Notes on related issues

The test results for the XDocReport API so far are summarized below (not exhaustive):

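  1. Word to Word: document formatting, fonts, and Chinese text are all merged correctly.
  2. Word to PDF: table cells with vertical text are not rendered correctly, and Chinese text is not displayed. With the modified API, Chinese text is displayed, but fonts are converted to a standard typeface.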
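
The font substitution noted in item 2 happens because the modified visitRun replaces the font of every run, whether or not it contains Chinese. One possible way to limit that side effect, which I have not tested against XDocReport, is to swap in the Unicode font only for text that actually contains CJK characters. The sketch below uses only the JDK; the class name CjkText is hypothetical, and the set of Unicode blocks checked is an assumption about what counts as "Chinese" here.

```java
/** Hypothetical helper (not part of XDocReport) that detects CJK characters in a string. */
class CjkText {

    /** Returns true if the text contains at least one character from a CJK-related block. */
    static boolean containsCjk(String text) {
        if (text == null) {
            return false;
        }
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
            if (block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || block == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || block == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                // Full-width punctuation and letters also need a CJK-capable font
                || block == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS) {
                return true;
            }
            // Advance by one code point (two chars for supplementary characters)
            i += Character.charCount(codePoint);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsCjk("XDocReport \u4e2d\u6587"));
        System.out.println(containsCjk("XDocReport"));
    }
}
```

Inside visitRun, the check would be applied to ctText.getStringValue() before building each Chunk, so runs without Chinese keep the font resolved by XWPFFontRegistry.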



For related settings, see:
XDocReport API
Issue 81: how can xdocreport supply the Chinese character.
XWPFDocument 2 PDF
org.apache.poi.xwpf.converter-0.9.8-sources
Chinese-character issues in Google JMesa, Flying Saucer, and iText

2 comments:

  1. XDocReport 1.0.0 SNAPSHOT has been changed so that fonts can be defined by the user. It has not been released yet, so you have to download the source code to test it. It took quite a detour to get the test working; the steps are recorded below:
    1. Git clone the source code to the local machine
    a. https://code.google.com/p/xdocreport/
    b. https://code.google.com/p/xdocreport.samples/

    2. cd %XDOCREPORT_HOME%/1.0.0/
    3. mvn install -Dmaven.test.skip=true
    4. Test docx and odt chinese characters for pdf converter (reference Issue 81)
    a. cd %XDOCREPORT_HOME%/1.0.0/thirdparties-extension\org.apache.poi.xwpf.converter.pdf
    b. mvn package
    5. After the previous step, run PdfConvertChineseTestCase.java to see the docx to pdf conversion result for Chinese text

  2. XDocReport v1.0.0 has resolved most of these problems; see http://acai-hsieh.blogspot.tw/2013/02/test-xdocreport-v100.html
