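1. Introduction
In the previous chapter we explained that converting markdown into a well-formatted Word document works best as a two-step process:
- markdown → html
- html → OOXML (Office Open XML, the XML format that Word documents are built on)
The previous chapter used flexmark to convert markdown to HTML, which completes the first step. This chapter covers converting HTML to OOXML.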
2. Environment
To stay compatible with as many environments as possible, the examples deliberately avoid newer SDK versions:
Java: 8
Docx4j: 8.3.10
3. Maven
<properties>
    <docx4j.version>8.3.10</docx4j.version>
    <jaxb2.version>1.11.1</jaxb2.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-Internal</artifactId>
        <version>${docx4j.version}</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-ImportXHTML</artifactId>
        <version>${docx4j.version}</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-MOXy</artifactId>
        <version>${docx4j.version}</version>
    </dependency>
    <dependency>
        <groupId>org.jvnet.jaxb2_commons</groupId>
        <artifactId>jaxb2-basics</artifactId>
        <version>${jaxb2.version}</version>
    </dependency>
</dependencies>
4. HTML to Docx
import lombok.SneakyThrows;
import org.docx4j.Docx4J;
import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.wml.Body;

import java.io.File;

/**
 * html 2 docx
 *
 * @author ludangxin
 * @since 2025/10/14
 */
public class HtmlToDocx {
    @SneakyThrows
    public static void convertHtmlToDocx(String htmlContent, String outputFilePath) {
        // Create the Word document package
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        MainDocumentPart mainDocumentPart = wordMLPackage.getMainDocumentPart();
        // Set up the XHTML importer
        XHTMLImporterImpl importer = new XHTMLImporterImpl(wordMLPackage);
        // Import the HTML content into the Word document body
        Body body = mainDocumentPart.getJaxbElement().getBody();
        body.getContent().addAll(importer.convert(htmlContent, null));
        // Save the Word document
        Docx4J.save(wordMLPackage, new File(outputFilePath), Docx4J.FLAG_NONE);
    }

    public static void main(String[] args) {
        String html = "<html><head></head><body><h2>嘉文四世</h2>\n"
            + "<blockquote>\n"
            + "<p>德瑪西亞</p>\n"
            + "</blockquote>\n"
            + "<p><strong>給我找些更強的敵人!</strong></p>\n"
            + "<table>\n"
            + "<thead>\n"
            + "<tr><th>列1</th><th>列2</th></tr>\n"
            + "</thead>\n"
            + "<tbody>\n"
            + "<tr><td>數據1</td><td>數據2</td></tr>\n"
            + "</tbody>\n"
            + "</table>\n"
            + "</body></html>";
        convertHtmlToDocx(html, "demo.docx");
    }
}
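One thing to keep in mind: as its name suggests, `XHTMLImporterImpl` expects XHTML, that is, markup that is also well-formed XML, so an unclosed tag such as a bare `<br>` is likely to make `convert` fail. The following is a minimal sketch of a pre-check you could run before conversion, using only the JDK's XML parser. The names `XhtmlCheck` and `isWellFormed` are hypothetical and not part of docx4j.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not docx4j API): checks that markup parses as XML.
public class XhtmlCheck {

    public static boolean isWellFormed(String html) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(html)));
            return true;
        }
        catch (Exception e) {
            // Any parse failure means the markup is not well-formed XML
            return false;
        }
    }
}
```

A caller could run this check first and log or repair the HTML before handing it to `convertHtmlToDocx`.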
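Test results:
A docx file is generated in the project root, with the following content:
The generated document is styled, but fonts, tables and the like all use the default styles.
If the project requires a standardized document or a template file, with specific style requirements for text, headings, or even tables, more advanced techniques are needed.
5. Custom styles
There are essentially two ways to customize the styling of the output:
- Start from the content: the OOXML is converted from HTML, so you can add CSS to the HTML and let docx4j render it, provided the CSS is fairly simple.
- Start from Word: Word files have built-in styles and also support custom ones, so you can define the styles in a template file first and then map the input content to them.
5.1 Adding CSS to the HTML
For example, to give tables borders and some basic styling, use CSS like this: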
table{border-collapse:collapse;border-spacing:0;width:100%;margin:1em 0;background-color:transparent;}table th{background-color:#f7f7f7;border:1px solid #ddd;padding:8px 12px;text-align:left}table td{border:1px solid #ddd;padding:8px 12px}
public static void main(String[] args) {
String html = "<html><head><style>table{border-collapse:collapse;border-spacing:0;width:100%;margin:1em 0;background-color:transparent;}table th{background-color:#f7f7f7;border:1px solid #ddd;padding:8px 12px;text-align:left}table td{border:1px solid #ddd;padding:8px 12px}</style></head><body><h2>嘉文四世</h2>\n" + "<blockquote>\n" + "<p>德瑪西亞</p>\n" + "</blockquote>\n" + "<p><strong>給我找些更強的敵人!</strong></p>\n" + "<table>\n" + "<thead>\n" + "<tr><th>列1</th><th>列2</th></tr>\n" + "</thead>\n" + "<tbody>\n" + "<tr><td>數據1</td><td>數據2</td></tr>\n" + "</tbody>\n" + "</table>\n" + "</body></html>";
convertHtmlToDocx(html, "demo.docx");
}
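Test results:
At this point, if all you need is good-looking output, defining CSS is usually enough. But a standardized document has strict requirements on line spacing, character spacing, fonts, headings and so on, and expressing all of that in CSS gets tedious, especially since the document already defines those styles itself. In that case the following approach is a better fit.
5.2 Mapping HTML to Word style IDs
First, take a look at Word's built-in styles. I'm using Office for Mac here; Windows and WPS differ slightly.
As the screenshot above shows, Word ships with many built-in styles, lets you create new ones, and lets you filter the list at the bottom.
The styles we usually pick from the quick style gallery come from this list.
First, manually add a custom style:
Then use docx4j to list every style in the Word file: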
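import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart;
import org.docx4j.wml.Style;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

import java.io.File;
import java.util.List;

@Slf4j
public class DocxStyleTest {
    private static WordprocessingMLPackage wordMLPackage;

    @BeforeAll
    @SneakyThrows
    public static void init_mainDocumentPart() {
        // Load the docx that contains the manually added style
        File templateFile = new File("demo.docx");
        wordMLPackage = WordprocessingMLPackage.load(templateFile);
    }

    @Test
    @SneakyThrows
    public void given_doc_template_when_extract_style_then_return_style_list() {
        final StyleDefinitionsPart sdp = wordMLPackage.getMainDocumentPart().getStyleDefinitionsPart();
        List<Style> styles = sdp.getContents().getStyle();
        log.info("docx styles length: {}", styles.size());
        // Print the id, display name and type of every style in the document
        for (Style style : styles) {
            String styleId = style.getStyleId();
            String name = style.getName().getVal();
            final String type = style.getType();
            log.info("styleId: {}, name: {}, type: {}", styleId, name, type);
        }
    }
}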
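Test results are shown below. Besides the built-in styles, such as heading 1 with styleId=1, the last two entries are the custom style we just added.
Why does adding one style by hand produce two records? Probably because the style type selected was "Linked (paragraph and character)", which creates a paragraph style plus a linked character style.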
[main] INFO html2docx.DocxStyleTest -- docx styles length: 25
[main] INFO html2docx.DocxStyleTest -- styleId: a, name: Normal, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: 1, name: heading 1, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: 2, name: heading 2, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: 3, name: heading 3, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: 4, name: heading 4, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: a0, name: Default Paragraph Font, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: a1, name: Normal Table, type: table
[main] INFO html2docx.DocxStyleTest -- styleId: a2, name: No List, type: numbering
[main] INFO html2docx.DocxStyleTest -- styleId: a3, name: header, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: a4, name: 頁眉 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: 10, name: 標題 1 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: 20, name: 標題 2 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: 30, name: 標題 3 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: 40, name: 標題 4 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: a5, name: Normal Indent, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: a6, name: Subtitle, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: a7, name: 副標題 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: a8, name: Title, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: a9, name: 標題 字符, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: aa, name: Emphasis, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: ab, name: Hyperlink, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: ac, name: Table Grid, type: table
[main] INFO html2docx.DocxStyleTest -- styleId: ad, name: caption, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: customBodyText, name: customBodyText, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: customBodyText0, name: customBodyText 字符, type: character
While testing I noticed something odd. Without the manually added style, the output looks like this:
[main] INFO html2docx.DocxStyleTest -- docx styles length: 22
[main] INFO html2docx.DocxStyleTest -- styleId: Normal, name: Normal, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: Heading1, name: heading 1, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: Heading2, name: heading 2, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: Heading3, name: heading 3, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: Heading4, name: heading 4, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: DefaultParagraphFont, name: Default Paragraph Font, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Header, name: header, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: HeaderChar, name: Header Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Heading1Char, name: Heading 1 Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Heading2Char, name: Heading 2 Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Heading3Char, name: Heading 3 Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Heading4Char, name: Heading 4 Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: NormalIndent, name: Normal Indent, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: Subtitle, name: Subtitle, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: SubtitleChar, name: Subtitle Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Title, name: Title, type: paragraph
[main] INFO html2docx.DocxStyleTest -- styleId: TitleChar, name: Title Char, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Emphasis, name: Emphasis, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: Hyperlink, name: Hyperlink, type: character
[main] INFO html2docx.DocxStyleTest -- styleId: TableGrid, name: Table Grid, type: table
[main] INFO html2docx.DocxStyleTest -- styleId: TableNormal, name: Normal Table, type: table
[main] INFO html2docx.DocxStyleTest -- styleId: Caption, name: caption, type: paragraph
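The built-in style IDs differ between the two runs: before the custom style was added, heading 1 had the ID "Heading1"; afterwards it became "1". Presumably the freshly generated document still uses English style IDs, and once it is edited and saved in Word they are rewritten to the localized (Chinese) ones.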
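Since the IDs change while paragraph style names such as `heading 1` and `Title` are the same in both listings, one option is to look IDs up by name instead of hard-coding them. The sketch below shows the idea on plain data, assuming the (styleId, name) pairs have already been read from the template, for example with `getStyleId()` and `getName().getVal()` as in the test above. `StyleLookup` and its methods are hypothetical names used for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: resolves a Word styleId from the style's display name.
public class StyleLookup {
    private final Map<String, String> idByName = new LinkedHashMap<>();

    // Register one (styleId, name) pair; the first registration of a name wins.
    public StyleLookup register(String styleId, String name) {
        idByName.putIfAbsent(normalize(name), styleId);
        return this;
    }

    // Look up the styleId for a display name, ignoring case and surrounding spaces.
    public Optional<String> styleIdFor(String name) {
        return Optional.ofNullable(idByName.get(normalize(name)));
    }

    private static String normalize(String name) {
        return name == null ? "" : name.trim().toLowerCase(Locale.ROOT);
    }
}
```

With a lookup like this, the HTML class for a heading can be derived from the name `heading 1` regardless of whether the template stores it as `Heading1` or `1`.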
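OK, the Word styles are now defined.
Next, apply the custom Word styles through docx4j. The idea is to map the class attribute of HTML tags to Word styleIds.
First add class attributes to the HTML, as in the html string below: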
// Part of the DocxStyleTest class; additionally needs imports for Docx4J,
// XHTMLImporterImpl, FormattingOption and MainDocumentPart.
private static WordprocessingMLPackage wordMLPackage;

@BeforeAll
@SneakyThrows
public static void init_mainDocumentPart() {
    File templateFile = new File("demo.docx");
    wordMLPackage = WordprocessingMLPackage.load(templateFile);
}

@Test
@SneakyThrows
public void given_doc_template_and_class_when_mapping_custom_style_then_render_doc() {
    final String html = "<html><head><style>"
        + "table{border-collapse:collapse;border-spacing:0;width:100%;margin:1em 0;background-color:transparent}"
        + "table th{background-color:#f7f7f7;border:1px solid #ddd;padding:8px 12px;text-align:left}"
        + "table td{border:1px solid #ddd;padding:8px 12px}"
        + "</style></head><body>"
        + "<h2 class=\"1\">嘉文四世</h2>"
        + "<blockquote><p class=\"customBodyText\">德瑪西亞</p></blockquote>"
        + "<p class=\"customBodyText\"><strong>給我找些更強的敵人!</strong></p>"
        + "<table><thead><tr><th>列1</th><th>列2</th></tr></thead>"
        + "<tbody><tr><td>數據1</td><td>數據2</td></tr></tbody></table>"
        + "</body></html>";
    final MainDocumentPart mainDocumentPart = wordMLPackage.getMainDocumentPart();
    XHTMLImporterImpl importer = new XHTMLImporterImpl(wordMLPackage);
    // CLASS_TO_STYLE_ONLY: only the class is honored; style attributes and tags such as <strong>/<em> are ignored (purely class-driven styling)
    // CLASS_PLUS_OTHER: the class is the base style; style attributes and inline tags add to or override it (class style plus local tweaks)
    // IGNORE_CLASS: the class attribute is ignored
    importer.setParagraphFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
    importer.setRunFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
    importer.setTableFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
    // Convert the HTML to OOXML
    final List<Object> docxContent = importer.convert(html, null);
    final List<Object> docxOldContent = mainDocumentPart.getContent();
    // Clear the template content and add the new content
    docxOldContent.clear();
    docxOldContent.addAll(docxContent);
    Docx4J.save(wordMLPackage, new File("newDemo.docx"), Docx4J.FLAG_NONE);
}
Test results:
Both the heading 1 style and the custom style are mapped correctly.
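Adding `class` attributes by hand does not scale when the HTML is generated from markdown. Below is a minimal sketch of injecting them automatically from a tag-to-styleId map. It relies on a simple regular expression rather than a real HTML parser, so it is only suitable for simple, generated markup, and `StyleClassInjector` and `addStyleClasses` are hypothetical names, not docx4j API.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: adds class="<styleId>" to opening tags listed in a map.
public class StyleClassInjector {
    // Opening tag: group 1 = tag name, group 2 = attributes (null when absent)
    private static final Pattern OPEN_TAG =
        Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)(\\s[^<>]*)?>");

    public static String addStyleClasses(String html, Map<String, String> styleIdByTag) {
        Matcher matcher = OPEN_TAG.matcher(html);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String tag = matcher.group(1);
            String attrs = matcher.group(2) == null ? "" : matcher.group(2);
            String styleId = styleIdByTag.get(tag.toLowerCase(Locale.ROOT));
            String replacement = matcher.group();
            // Leave the tag untouched if it is unmapped or already has a class
            if (styleId != null && !attrs.toLowerCase(Locale.ROOT).contains("class=")) {
                replacement = "<" + tag + " class=\"" + styleId + "\"" + attrs + ">";
            }
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

For instance, mapping `h2` to `1` and `p` to `customBodyText` would turn `<h2>` into `<h2 class="1">` while leaving tags that already carry a class unchanged.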
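6. Placeholder replacement
With the features above, HTML can already be rendered into Word nicely. But the project has a new requirement: besides markdown → html → word, the content recognized by a large language model must also be written into the template file. The final output therefore has two parts:
- Personal information extracted by the LLM, such as name, job and address
- A description of the person summarized by the LLM (markdown)
The second part is already covered by docx4j. What remains is writing the basic personal information into Word through placeholders, which is straightforward with poi-tl.
6.1 Maven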
<dependency>
<groupId>com.deepoove</groupId>
<artifactId>poi-tl</artifactId>
<version>1.12.0</version>
</dependency>
6.2 Implementation
@Test
public void given_template_doc_and_content_when_replace_then_replace() {
    final Configure templateEngineConfigure = Configure.builder().build();
    File templateFile = new File("demo.docx");
    File outputFile = new File("newDemo.docx");
    Map<String, Object> data = new HashMap<>();
    data.put("user", "嘉文四世");
    data.put("summoner", "張鐵牛");
    data.put("position", "打野");
    data.put("dialogue", "給我找些更強的敵人");
    try (XWPFTemplate template = XWPFTemplate.compile(templateFile, templateEngineConfigure)) {
        template.render(data).writeToFile(outputFile.getAbsolutePath());
    }
    catch (IOException e) {
        log.error("failed to replace template word placeholder", e);
        throw new RuntimeException(e);
    }
}
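Template content:
Test results:
The placeholders are replaced, and the formatting applied to each placeholder is preserved as well, which is very convenient.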
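To make the `{{name}}` tag syntax concrete, here is a plain-string sketch of the substitution idea. It is not how poi-tl is implemented: poi-tl works on the Word document itself, which is why the formatting survives, whereas this sketch only handles a string. `PlaceholderSketch` and `fill` are hypothetical names used purely for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: replaces {{key}} tags in a plain string with map values.
public class PlaceholderSketch {
    // Group 1 captures the key between "{{" and "}}"
    private static final Pattern TAG =
        Pattern.compile("\\{\\{\\s*([A-Za-z0-9_]+)\\s*\\}\\}");

    public static String fill(String template, Map<String, Object> data) {
        Matcher matcher = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Object value = data.get(matcher.group(1));
            // Tags without a matching key are left in place
            String replacement = value == null ? matcher.group() : String.valueOf(value);
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```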
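7. Wrapping it in a utility class
To make docx4j and poi-tl easier to call, we can wrap them in a utility class: pass in a map and have it replace the placeholders and render the markdown content automatically, ideally through a chained call that does everything in one line. Here it is: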
import com.deepoove.poi.XWPFTemplate;
import com.deepoove.poi.config.Configure;
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.docx4j.convert.in.xhtml.FormattingOption;
import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.wml.Body;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;
/**
* Utility class for working with Word documents
*
* @author ludangxin
* @since 2025/10/14
*/
@Slf4j
public class Docs {
@SneakyThrows
public static DocBuilder builder() {
return new DocBuilder().wordMLPackage(WordprocessingMLPackage.createPackage());
}
@SneakyThrows
public static DocBuilder builder(File file) {
return new DocBuilder().templateInputStream(Files.newInputStream(file.toPath()))
.wordMLPackage(WordprocessingMLPackage.load(file));
}
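@SneakyThrows
public static DocBuilder builder(InputStream inputStream) {
// A stream can only be read once, so buffer it and hand out two independent copies:
// one for poi-tl (templateInputStream) and one for docx4j (WordprocessingMLPackage.load)
java.io.ByteArrayOutputStream buffer = new java.io.ByteArrayOutputStream();
byte[] chunk = new byte[8192];
int read;
while ((read = inputStream.read(chunk)) != -1) {
buffer.write(chunk, 0, read);
}
final byte[] bytes = buffer.toByteArray();
return new DocBuilder().templateInputStream(new java.io.ByteArrayInputStream(bytes))
.wordMLPackage(WordprocessingMLPackage.load(new java.io.ByteArrayInputStream(bytes)));
}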
@SneakyThrows
public static DocBuilder builder(String filePath) {
return new DocBuilder().templateInputStream(Files.newInputStream(new File(filePath).toPath()))
.wordMLPackage(WordprocessingMLPackage.load(new File(filePath)));
}
public static class DocBuilder {
private InputStream templateInputStream;
private WordprocessingMLPackage wordMLPackage;
private XHTMLImporterImpl importer;
private FormattingOption paragraphFormatting;
private FormattingOption runFormatting;
private FormattingOption tableFormatting;
private String staticResourceBaseUri;
private String[] placeHolderPreSuffix = new String[]{"{{", "}}"};
private Configure templateEngineConfigure;
private boolean useHtmlDefaultStyle = true;
private boolean autoCloseStream = true;
private String globalCss = "table{border-collapse:collapse;border-spacing:0;width:100%;margin:1em 0;background-color:transparent;}table th{background-color:#f7f7f7;border:1px solid #ddd;padding:8px 12px;text-align:left}table td{border:1px solid #ddd;padding:8px 12px}";
/**
* Optional hook applied to the HTML before conversion:
* (htmlContent, htmlKey) -> resultHtmlContent
*/
private BiFunction<String, String, String> htmlContentProcessor;
private DocBuilder templateInputStream(InputStream templateInputStream) {
this.templateInputStream = templateInputStream;
return this;
}
private DocBuilder wordMLPackage(WordprocessingMLPackage wordMLPackage) {
this.wordMLPackage = wordMLPackage;
return this;
}
public DocBuilder importer(XHTMLImporterImpl importer) {
this.importer = importer;
return this;
}
public DocBuilder paragraphFormatting(FormattingOption paragraphFormatting) {
this.paragraphFormatting = paragraphFormatting;
return this;
}
public DocBuilder runFormatting(FormattingOption runFormatting) {
this.runFormatting = runFormatting;
return this;
}
public DocBuilder tableFormatting(FormattingOption tableFormatting) {
this.tableFormatting = tableFormatting;
return this;
}
public DocBuilder useHtmlDefaultStyle(boolean useHtmlDefaultStyle) {
this.useHtmlDefaultStyle = useHtmlDefaultStyle;
return this;
}
public DocBuilder staticResourceBaseUri(String staticResourceBaseUri) {
this.staticResourceBaseUri = staticResourceBaseUri;
return this;
}
public DocBuilder placeHolderPreSuffix(String placeHolderPrefix, String placeHolderSuffix) {
this.placeHolderPreSuffix = new String[]{placeHolderPrefix, placeHolderSuffix};
return this;
}
public DocBuilder templateEngineConfigure(Configure templateEngineConfigure) {
this.templateEngineConfigure = templateEngineConfigure;
return this;
}
public DocBuilder autoCloseStream(boolean autoCloseStream) {
this.autoCloseStream = autoCloseStream;
return this;
}
public DocBuilder globalCss(String globalCss) {
this.globalCss = globalCss;
return this;
}
public DocBuilder htmlContentProcessor(BiFunction<String, String, String> htmlContentProcessor) {
this.htmlContentProcessor = htmlContentProcessor;
return this;
}
public List<Object> buildWordML(String html) {
return this.buildWordML(html, null);
}
public void buildWord(String html, String outputFile) {
this.buildWord(html, new File(outputFile));
}
public void buildWord(String html, File outputFile) {
try {
this.getMainContent()
.addAll(this.buildWordML(html));
wordMLPackage.save(outputFile);
}
catch (Exception e) {
log.error("failed to build word file", e);
throw new RuntimeException(e);
}
}
public void buildWord(String html, OutputStream outputStream) {
try {
this.getMainContent()
.addAll(this.buildWordML(html));
wordMLPackage.save(outputStream);
}
catch (Exception e) {
log.error("failed to build word file", e);
throw new RuntimeException(e);
}
finally {
try {
if (autoCloseStream) {
outputStream.close();
}
}
catch (IOException ignored) {
}
}
}
public void buildWord(Map<String, Object> placeHolderData, OutputStream outputStream) {
try {
// Classify the data once: 1 = plain values only, 2 = HTML only, 3 = both
final int dataType = this.checkPlaceHolderDataType(placeHolderData);
// Replace the plain placeholders in the template
if (dataType == 1) {
this.replacePlaceHolder(placeHolderData, outputStream);
}
// Replace the HTML placeholders in the template
if (dataType == 2) {
this.replaceHtmlPlaceHolder(placeHolderData, outputStream);
}
// Replace both: render the HTML into a temp file first, then fill in the plain placeholders
if (dataType == 3) {
final File tempDocFile = DocUtils.createTempDocFile();
this.replaceHtmlPlaceHolder(placeHolderData, tempDocFile);
this.replacePlaceHolder(placeHolderData, tempDocFile, tempDocFile);
DocUtils.writeAndDeleteFile(tempDocFile, outputStream);
}
}
catch (Exception e) {
log.error("failed to build word file", e);
throw new RuntimeException(e);
}
finally {
try {
if (autoCloseStream) {
outputStream.close();
}
}
catch (IOException ignored) {
}
}
}
public void buildWord(Map<String, Object> placeHolderData, File outputFile) {
// Classify the data once: 1 = plain values only, 2 = HTML only, 3 = both
final int dataType = this.checkPlaceHolderDataType(placeHolderData);
// Replace the plain placeholders in the template
if (dataType == 1) {
this.replacePlaceHolder(placeHolderData, outputFile);
}
// Replace the HTML placeholders in the template
if (dataType > 1) {
this.replaceHtmlPlaceHolder(placeHolderData, outputFile);
}
// Then fill in the remaining plain placeholders in the generated file
if (dataType == 3) {
this.replacePlaceHolder(placeHolderData, outputFile, outputFile);
}
}
private List<Object> buildWordML(String html, String htmlKey) {
final XHTMLImporterImpl importer = this.getImporterOrDefault();
try {
if (globalCss != null && !globalCss.isEmpty()) {
html = DocUtils.addHtmlStyles(html, globalCss);
}
if (htmlContentProcessor != null) {
html = htmlContentProcessor.apply(html, htmlKey);
}
return importer.convert(html, staticResourceBaseUri);
}
catch (Exception e) {
log.error("failed to convert HTML to XHTML", e);
throw new RuntimeException(e);
}
}
private void replaceHtmlPlaceHolder(Map<String, Object> placeHolderData, File outputFile) {
this.doReplaceHtmlPlaceHolder(placeHolderData);
try {
// Save the document with the HTML placeholders replaced
wordMLPackage.save(outputFile);
}
catch (Docx4JException e) {
log.error("failed to build word file", e);
throw new RuntimeException(e);
}
}
private void replaceHtmlPlaceHolder(Map<String, Object> placeHolderData, OutputStream outputStream) {
this.doReplaceHtmlPlaceHolder(placeHolderData);
try {
// Save the document with the HTML placeholders replaced
wordMLPackage.save(outputStream);
}
catch (Docx4JException e) {
log.error("failed to build word file", e);
throw new RuntimeException(e);
}
finally {
try {
if (autoCloseStream) {
outputStream.close();
}
}
catch (IOException ignored) {
}
}
}
private void doReplaceHtmlPlaceHolder(Map<String, Object> placeHolderData) {
final List<Object> mainContent = this.getMainContent();
List<Object> newContent = new ArrayList<>();
for (Object p : mainContent) {
String text = DocUtils.extractText(p);
Optional<String> matchedKey = placeHolderData.keySet()
.stream()
.filter(key -> DocUtils.matchPlaceHolder(text, key, placeHolderPreSuffix[0], placeHolderPreSuffix[1]))
.findFirst();
if (matchedKey.isPresent()) {
String key = matchedKey.get();
Object value = placeHolderData.get(key);
if (DocUtils.isHtml(value)) {
final List<Object> wordFragment = this.buildWordML((String) value, key);
newContent.addAll(wordFragment);
}
else {
newContent.add(p);
}
}
else {
newContent.add(p);
}
}
// Replace the template content
mainContent.clear();
mainContent.addAll(newContent);
}
private void replacePlaceHolder(Map<String, Object> data, File templateFile, File outputFile) {
final Configure templateEngineConfigure = this.getTemplateEngineConfigureOrDefault();
try (XWPFTemplate template = XWPFTemplate.compile(templateFile, templateEngineConfigure)){
template.render(data)
.writeToFile(outputFile.getAbsolutePath());
}
catch (IOException e) {
log.error("failed to replace template word placeholder", e);
throw new RuntimeException(e);
}
}
public void replacePlaceHolder(Map<String, Object> data, File outputFile) {
this.replacePlaceHolder(data, outputFile.getAbsolutePath());
}
public void replacePlaceHolder(Map<String, Object> data, String outputFileAbsolutePath) {
final Configure templateEngineConfigure = this.getTemplateEngineConfigureOrDefault();
if (templateInputStream == null) {
throw new NullPointerException("template file can not be null");
}
// try-with-resources closes the template even if rendering fails
try (XWPFTemplate template = XWPFTemplate.compile(templateInputStream, templateEngineConfigure)) {
template.render(data)
.writeToFile(outputFileAbsolutePath);
}
catch (IOException e) {
log.error("failed to replace template word placeholder", e);
throw new RuntimeException(e);
}
}
public void replacePlaceHolder(Map<String, Object> data, OutputStream outputStream) {
final Configure templateEngineConfigure = this.getTemplateEngineConfigureOrDefault();
if (templateInputStream == null) {
throw new NullPointerException("template file can not be null");
}
// try-with-resources closes the template even if rendering fails
try (XWPFTemplate template = XWPFTemplate.compile(templateInputStream, templateEngineConfigure)) {
template.render(data)
.write(outputStream);
}
catch (IOException e) {
log.error("failed to replace template word placeholder", e);
throw new RuntimeException(e);
}
finally {
try {
if (autoCloseStream) {
outputStream.close();
}
}
catch (IOException ignored) {
}
}
}
private XHTMLImporterImpl getImporterOrDefault() {
if (importer == null) {
if (paragraphFormatting != null || runFormatting != null || tableFormatting != null) {
XHTMLImporterImpl importer = new XHTMLImporterImpl(wordMLPackage);
importer.setParagraphFormatting(paragraphFormatting == null ? FormattingOption.CLASS_PLUS_OTHER : paragraphFormatting);
importer.setRunFormatting(runFormatting == null ? FormattingOption.CLASS_PLUS_OTHER : runFormatting);
importer.setTableFormatting(tableFormatting == null ? FormattingOption.CLASS_PLUS_OTHER : tableFormatting);
return importer;
}
else {
return this.defaultImporter();
}
}
else {
return this.importer;
}
}
private Configure getTemplateEngineConfigureOrDefault() {
if (templateEngineConfigure == null) {
return this.defaultTemplateEngineConfigure();
}
else {
return this.templateEngineConfigure;
}
}
private Configure defaultTemplateEngineConfigure() {
return Configure.builder()
.buildGramer(placeHolderPreSuffix[0], placeHolderPreSuffix[1])
.build();
}
private XHTMLImporterImpl defaultImporter() {
XHTMLImporterImpl importer = new XHTMLImporterImpl(wordMLPackage);
if (useHtmlDefaultStyle) {
importer.setParagraphFormatting(FormattingOption.CLASS_PLUS_OTHER);
importer.setRunFormatting(FormattingOption.CLASS_PLUS_OTHER);
importer.setTableFormatting(FormattingOption.CLASS_PLUS_OTHER);
}
else {
importer.setParagraphFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
importer.setRunFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
importer.setTableFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
}
return importer;
}
private List<Object> getMainContent() {
MainDocumentPart mainDocumentPart = wordMLPackage.getMainDocumentPart();
if (globalCss != null && !globalCss.isEmpty()) {
mainDocumentPart.getStyleDefinitionsPart()
.setCss(globalCss);
}
Body body = mainDocumentPart.getJaxbElement()
.getBody();
return body.getContent();
}
/**
* Determines what kind of values the placeholder data contains
*
* @param placeHolderData placeholder data
* @return 1: no HTML values, 2: only HTML values, 3: both kinds
*/
private int checkPlaceHolderDataType(Map<String, Object> placeHolderData) {
boolean hasHtmlValFlag = false;
boolean hasCommonValFlag = false;
for (Object value : placeHolderData.values()) {
if (DocUtils.isHtml(value)) {
hasHtmlValFlag = true;
}
else {
hasCommonValFlag = true;
}
}
if (!hasHtmlValFlag && hasCommonValFlag) {
return 1;
}
if (hasHtmlValFlag && !hasCommonValFlag) {
return 2;
}
return 3;
}
}
}
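The class above relies on a `DocUtils` helper that is not listed in this post. As a rough idea of what its string-level helpers might look like, here is a hypothetical stand-in: the method names mirror the calls above, but the bodies are assumptions and may well differ from the author's actual implementation. The docx4j-dependent helpers (`extractText`, `createTempDocFile`, `writeAndDeleteFile`) are left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the string helpers of DocUtils (assumed behavior).
public class DocUtilsSketch {
    // Heuristic: anything that looks like an opening or self-closing tag
    private static final Pattern TAG =
        Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)(\\s[^<>]*)?/?>");
    private static final Pattern HEAD_OPEN =
        Pattern.compile("<head(\\s[^<>]*)?>", Pattern.CASE_INSENSITIVE);
    private static final Pattern HTML_OPEN =
        Pattern.compile("<html(\\s[^<>]*)?>", Pattern.CASE_INSENSITIVE);

    // Assumption: a value counts as HTML when it is a string containing a tag.
    public static boolean isHtml(Object value) {
        return value instanceof String && TAG.matcher((String) value).find();
    }

    // Assumption: a paragraph matches when its trimmed text is exactly prefix + key + suffix.
    public static boolean matchPlaceHolder(String text, String key, String prefix, String suffix) {
        if (text == null || key == null) {
            return false;
        }
        return text.trim().equals(prefix + key + suffix);
    }

    // Inserts a <style> block into <head>, creating the surrounding tags if needed.
    public static String addHtmlStyles(String html, String css) {
        String style = "<style>" + css + "</style>";
        Matcher head = HEAD_OPEN.matcher(html);
        if (head.find()) {
            return html.substring(0, head.end()) + style + html.substring(head.end());
        }
        Matcher root = HTML_OPEN.matcher(html);
        if (root.find()) {
            return html.substring(0, root.end()) + "<head>" + style + "</head>" + html.substring(root.end());
        }
        // A bare fragment: wrap it in a full document
        return "<html><head>" + style + "</head><body>" + html + "</body></html>";
    }
}
```

Matching only paragraphs that consist solely of the placeholder is a deliberate simplification here, since `doReplaceHtmlPlaceHolder` replaces the whole paragraph with the converted HTML.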
8. Test example
import lombok.SneakyThrows;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import java.io.File;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;
/**
* docs test
*
* @author ludangxin
* @since 2025/11/4
*/
public class DocxTest {
private static final File TEMPLATE_FILE = new File("demo.docx");
private static final File OUTPUT_FILE = new File("output.docx");
private static final Map<String, Object> DATA = new HashMap<>();
@BeforeAll
public static void given_data() {
DATA.put("user", "嘉文四世");
DATA.put("summoner", "張鐵牛");
DATA.put("position", "打野");
DATA.put("dialogue", "給我找些更強的敵人");
final String markdownContent = "- **背景故事**:嘉文四世是德瑪西亞國王嘉文三世的獨生子,其母凱瑟琳女士因難產而死。嘉文在宮廷中長大,接受了良好的德瑪西亞式教育,並結識了趙信,向其學習戰爭藝術。他與蓋倫年齡相仿,結為好兄弟。嘉文曾率軍前往邊境對抗諾克薩斯,卻因戰力分散而戰敗,幸得希瓦娜相救。後來,德瑪西亞國內搜魔人兵團搜捕魔法師引發起義,嘉文三世慘遭弒殺,嘉文四世接掌了議會,之後他登基成為德瑪西亞國王。\n" + "\n" + "- **角色定位**:在遊戲中,嘉文四世的定位是坦克、戰士,他常常需要帶頭衝入敵方陣地,因此相比輸出更加需要增強防禦能力。\n" + "\n" + "- 技能介紹\n" + " :\n" + " - **被動技能 - 戰爭律動**:普攻命中時,會對目標造成 8% 當前生命值的額外物理傷害,該效果作用於同一目標的冷卻時間為 6 秒。\n" + " - **一技能 - 巨龍撞擊**:用長矛穿透路徑上的敵人,對其造成物理傷害,並減少其護甲,持續 3 秒。若長矛觸及 “德邦軍旗”,嘉文四世會被引向軍旗,並擊飛沿途敵人 0.75 秒。\n" + " - **二技能 - 黃金聖盾**:釋放出一道帝王光環,使周圍敵人減速,持續 2 秒,同時提供一個可以吸收傷害的護盾,持續 5 秒,附近每多一名敵方英雄,吸收傷害增加。\n" + " - **三技能 - 德邦軍旗**:投擲一柄軍旗,對敵人造成魔法傷害,並將軍旗置於原地 8 秒,使附近隊友獲得攻擊速度加成。在 “德邦軍旗” 附近再次點擊施放該技能,將會朝軍旗施放 “巨龍撞擊”。\n" + " - **終極技能 - 天崩地裂**:躍向敵方英雄,對目標及其附近的敵人造成物理傷害,並在目標周圍形成環形障礙,持續 3.5 秒,再次點擊施放可使障礙倒塌。\n" + "\n" + "- **皮膚信息**:嘉文四世擁有多款皮膚,包括孤膽英豪、暗星、福牛守護者等。";
// markdown 2 html (covered in the previous chapter)
final String htmlContent = Markdowns.builder(markdownContent)
.buildHtmlContent();
DATA.put("description", htmlContent);
}
@Test
public void given_template_doc_and_content_when_replace_then_complete() {
Docs.builder(TEMPLATE_FILE).buildWord(DATA, OUTPUT_FILE);
}
@Test
@SneakyThrows
public void given_template_doc_and_content_when_replace_and_output_stream_then_complete() {
final OutputStream fileOutputStream = Files.newOutputStream(OUTPUT_FILE.toPath());
// Write the result to the output stream; autoCloseStream(true) closes it afterwards
Docs.builder(TEMPLATE_FILE).autoCloseStream(true).buildWord(DATA, fileOutputStream);
}
}
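Template content:
Test results:
9. Summary
This chapter used docx4j and poi-tl to render plain placeholder values and HTML content into Word, showed how to use their features for custom style rendering, and finished with a chainable utility class and its unit tests. Combined with the previous chapter, content in various forms can be rendered into Word with a single line of code.
10. Source code
All the code from the tests has been uploaded to GitHub; likes and stars are welcome. Repository: https://github.com/ludangxin/markdown2docx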