POI:
1.userModel模式
一种是使用最多的,像用的HSSFWorkBook、XSSFWorkBook、SXSSFWorkBook,这里我们称它为(也是内存消耗较大的模式)。为什么内存占用大呢?直接看读取源码,拿XSSFWorkbook加载为例,最终会调用这个方法:
public ZipInputStreamZipEntrySource(ThresholdInputStream inp) throws IOException {
zipEntries = new ArrayList<FakeZipEntry>();
boolean going = true;
while(going) {
ZipEntry zipEntry = inp.getNextEntry();
if(zipEntry == null) {
going = false;
} else {
FakeZipEntry entry = new FakeZipEntry(zipEntry, inp);
inp.closeEntry();
zipEntries.add(entry);
}
}
inp.close();
}
这里就可以看到,会一直读流,每行都会生成一个对象,加载到zipEntries中去,这就是内存占用大的根本原因。
2.eventModel模式
也就是SAX模式,easyexcel就是重写了poi的这个方法,达到更小的内存占用。通过将流一行行的读取,加载到内存,达到节约内存的目的。
EasyExcel:
XlsxSaxAnalyser将数据读取成inputStream流,缓存到了sheetMap
public XlsxSaxAnalyser(AnalysisContext analysisContext, InputStream decryptedStream) throws Exception {
...
//将每sheet数据读取成inputStream流,缓存sheetMap
XSSFReader xssfReader = new XSSFReader(pkg);
analysisUse1904WindowDate(xssfReader, readWorkbookHolder);
stylesTable = xssfReader.getStylesTable();
sheetList = new ArrayList<ReadSheet>();
sheetMap = new HashMap<Integer, InputStream>();
XSSFReader.SheetIterator ite = (XSSFReader.SheetIterator)xssfReader.getSheetsData();
int index = 0;
if (!ite.hasNext()) {
throw new ExcelAnalysisException("Can not find any sheet!");
}
while (ite.hasNext()) {
InputStream inputStream = ite.next();
sheetList.add(new ReadSheet(index, ite.getSheetName()));
sheetMap.put(index, inputStream);
index++;
}
}
拿到一行数据后,会去调用dealData方法,dealData方法会去调用之初始化的监听器,执行业务处理,readListener.invoke方法。
public class RowTagHandler extends AbstractXlsxTagHandler {
@Override
public void startElement(XlsxReadContext xlsxReadContext, String name, Attributes attributes) {
XlsxReadSheetHolder xlsxReadSheetHolder = xlsxReadContext.xlsxReadSheetHolder();
int rowIndex = PositionUtils.getRowByRowTagt(attributes.getValue(ExcelXmlConstants.ATTRIBUTE_R),
xlsxReadSheetHolder.getRowIndex());
Integer lastRowIndex = xlsxReadContext.readSheetHolder().getRowIndex();
while (lastRowIndex + 1 < rowIndex) {
// 每次都会拿新的一行
xlsxReadContext.readRowHolder(new ReadRowHolder(lastRowIndex + 1, RowTypeEnum.EMPTY,
xlsxReadSheetHolder.getGlobalConfiguration(), new LinkedHashMap<Integer, Cell>()));
// 调用invoke方法
xlsxReadContext.analysisEventProcessor().endRow(xlsxReadContext);
xlsxReadSheetHolder.setColumnIndex(null);
xlsxReadSheetHolder.setCellMap(new LinkedHashMap<Integer, Cell>());
lastRowIndex++;
}
xlsxReadSheetHolder.setRowIndex(rowIndex);
}
@Override
public void endRow(AnalysisContext analysisContext) {
if (RowTypeEnum.EMPTY.equals(analysisContext.readRowHolder().getRowType())) {
if (LOGGER.isDebugEnabled()) {
LOGGER.warn("Empty row!");
}
if (analysisContext.readWorkbookHolder().getIgnoreEmptyRow()) {
return;
}
}
dealData(analysisContext);
}
private void dealData(AnalysisContext analysisContext) {
ReadRowHolder readRowHolder = analysisContext.readRowHolder();
Map<Integer, ReadCellData<?>> cellDataMap = (Map)readRowHolder.getCellMap();
readRowHolder.setCurrentRowAnalysisResult(cellDataMap);
int rowIndex = readRowHolder.getRowIndex();
int currentHeadRowNumber = analysisContext.readSheetHolder().getHeadRowNumber();
boolean isData = rowIndex >= currentHeadRowNumber;
// Last head column
if (!isData && currentHeadRowNumber == rowIndex + 1) {
buildHead(analysisContext, cellDataMap);
}
// Now is data
for (ReadListener readListener : analysisContext.currentReadHolder().readListenerList()) {
try {
if (isData) {
readListener.invoke(readRowHolder.getCurrentRowAnalysisResult(), analysisContext);
... ...
}