Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xlsx读取时和展示看到的一样,部分数据有随机前缀 #3823

Closed
skyline-liu opened this issue Jun 5, 2024 · 7 comments · May be fixed by #3861
Closed

xlsx读取时和展示看到的一样,部分数据有随机前缀 #3823

skyline-liu opened this issue Jun 5, 2024 · 7 comments · May be fixed by #3861
Assignees
Labels
help wanted Extra attention is needed

Comments

@skyline-liu
Copy link

触发场景描述

读取xlsx内容时,护照号码部分数据与文件显示值不一致
1、读取用户提供的excel时,护照号码,部分数据与文件显示值不一致
2、拷贝用户提供的excel文件,不做任何修改直接保存,读取内容与显示值一致
image
image

工具版本信息

JDK:1.8.0_361
SpringBoot:2.7.18
easyexcel版本: 3.2.1 / 3.3.3
客户excel 版本: wps表格版本号【12.1.0.16929】
本地excel 版本:office 2021,wps 2024 16929
Hutool:5.8.25

触发Bug的代码

   public class EasyExcelTest {
    public static void main(String[] args) {
        Console.log("============= 客户文件读取内容 ================");
        String filename2 = "C:\\6-3 E83307--6.xlsx";
        easyExcelRead(filename2);
        Console.log("============= 拷贝保存后读取内容 ===============");
        String filename21 = "C:\\6-3 E83307--6 - 副本.xlsx";
        easyExcelRead(filename21);
    }

    protected static void easyExcelRead(String filename) {
        EasyExcel.read(filename, DemoData.class, new AnalysisEventListener<DemoData>() {
            @Override
            public void invoke(DemoData data, AnalysisContext context) {
                if (data.ignoreEmpty()) {
                    return;
                }

                data.verify();

                Console.log("读取到旅客信息: {}", JSON.toJSONString(data));
            }

            @Override
            public void doAfterAllAnalysed(AnalysisContext context) {

            }
        }).sheet().doRead();
    }

    @Getter
    @Setter
    @Accessors(chain = false)
    @EqualsAndHashCode
    public static class DemoData {
        private String num;
        private String name;
        private String country;
        private String sex;
        private String licence;
        private String packageNum;
        private String packageWeight;
        private String notes;

        public boolean ignoreEmpty() {
            return CharSequenceUtil.isAllBlank(name, country, sex, licence);
        }
    }
}

excel 文件

客户提供文件 :

6-3 E83307--6.xlsx

本地拷贝保存后文件:

6-3 E83307--6 - 副本.xlsx

@skyline-liu skyline-liu added the bug Something isn't working label Jun 5, 2024
@psxjoy
Copy link
Collaborator

psxjoy commented Jun 6, 2024

It is not a bug, change to the help wanted tag. @zhuangjiaju

@psxjoy
Copy link
Collaborator

psxjoy commented Jun 6, 2024

原始文件本身格式有问题,让客户,或者你自己提供一个正常的表格文件即可。

@skyline-liu
Copy link
Author

原始文件本身格式有问题,可以详说下什么原因/什么插件导致的吗?

@skyline-liu
Copy link
Author

通过历史数据查询,4,5个用户都有这个问题,使用的WPS12.1.0.16929

让客户更换WPS,不知是否可以解决此类问题吗?

@skyline-liu
Copy link
Author

POI SXSSF方式导入 发现 E2 列 输出两次值

E2-11
E2-EJ7336793
E3-759363084
E3-E55155204
public static void excelReadSXSSFTest(String filename) throws Exception {
        //1.创建工作簿,使用excel能操作的这边都看看操作
        OPCPackage opcPackage = OPCPackage.open(filename);
        XSSFReader xssfReader = new XSSFReader(opcPackage);
        StylesTable stylesTable = xssfReader.getStylesTable();
        ReadOnlySharedStringsTable sharedStringsTable = new ReadOnlySharedStringsTable(opcPackage);
        // 创建XMLReader,设置ContentHandler
        XMLReader xmlReader = SAXHelper.newXMLReader();
        xmlReader.setContentHandler(new XSSFSheetXMLHandler(stylesTable, sharedStringsTable, new XSSFSheetXMLHandler.SheetContentsHandler() {
            @Override
            public void startRow(int rowNum) {
            }

            @Override
            public void endRow(int rowNum) {
            }

            @Override
            public void cell(String cellReference, String formattedValue, XSSFComment comment) {
                if (ListUtil.of("E2", "E3").contains(cellReference)) {
                    Console.log("{}-{}", cellReference, formattedValue);
                }
            }
        }, false));
        // 解析每个Sheet数据
        Iterator<InputStream> sheetsData = xssfReader.getSheetsData();
        while (sheetsData.hasNext()) {
            try (InputStream inputStream = sheetsData.next();) {
                xmlReader.parse(new InputSource(inputStream));
            }
        }
    }

@psxjoy
Copy link
Collaborator

psxjoy commented Jun 6, 2024

通过历史数据查询,4,5个用户都有这个问题,使用的WPS12.1.0.16929

让客户更换WPS,不知是否可以解决此类问题吗?
Snipaste_2024-06-06_17-15-34

E2列被设置成了“文本”格式。看了下excel的xml文件,有一部分隐藏了数字,但是解析的时候按照EMC规范会解析出来。
最简单的就是把E2列设置成“常规”。

@skyline-liu
Copy link
Author

@psxjoy 好的,我修改下导入模板E2列格式,后续再观察下~

@psxjoy psxjoy added help wanted Extra attention is needed and removed bug Something isn't working labels Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants