MySQL LOAD XML LOCAL slows down after first 100k entries

334
June 29, 2017, at 05:17 AM

I am trying to import a large XML (arround 700MB) into a MySQL database. It is working so far but the thing is, it is terribly slow. The XML contains arround 300k entries. The first 100k entries are quite fast, from then on every +1000 entries need about 5min.

Schema of the corresponding MySQL table:

CREATE TABLE IF NOT EXISTS `articles` (
`PHAR` varchar(10) NOT NULL,
`PRDNO` varchar(10),
`SMCAT` varchar(5),
`DSCRLONGD` varchar(200),
`DSCRLONGF` varchar(200),
`QTYUD` varchar(10),
`QTYUF` varchar(10),
`IMG2` varchar(50),
`DSCRPACKD` varchar(200),
`DSCRPACKF` varchar(200),
`NOPCS` int(11),
`WEIGHT` varchar(50)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

The xml which I am trying to import:

<ARTICLE CREATION_DATETIME="2017-06-27T13:23:04.2814446+02:00" PROD_DATE="2017-06-27T00:00:00" VALID_DATE="2017-06-27T00:00:00" xmlns="http://www.hcisolutions.ch/index">
  <ART DT="2016-08-08T00:00:00+02:00">
    <PHAR>0020244</PHAR>
    <PHARMACODE>20244</PHARMACODE>
    <GTIN>7680316440115</GTIN>
    <ARTNO>20244</ARTNO>
    <GRPCD>M1</GRPCD>
    <CDSO1>03.00.00.00</CDSO1>
    <PRDNO>24538</PRDNO>
    <SMCAT>C</SMCAT>
    <SMNO>31644011</SMNO>
    <HOSPCD>N</HOSPCD>
    <CLINCD>N</CLINCD>
    <ARTTYP>0</ARTTYP>
    <VAT>2</VAT>
    <SALECD>N</SALECD>
    <INSLIM>N</INSLIM>
    <LIMPTS>0</LIMPTS>
    <GRDFR>0</GRDFR>
    <TEMP>15/25</TEMP>
    <BG>N</BG>
    <EXP>60</EXP>
    <QTY>30</QTY>
    <DSCRD>FERRO-GRADUMET Depottabl 30 Stk</DSCRD>
    <DSCRF>FERRO-GRADUMET cpr dépôt 30 pce</DSCRF>
    <DSCRLONGD>Ferro-Gradumet Depottabl 30 Stk</DSCRLONGD>
    <DSCRLONGF>Ferro-Gradumet cpr dépôt 30 pce</DSCRLONGF>
    <SORTD>FERRO-GRADUMET DEPOTTABL 30 STK</SORTD>
    <SORTF>FERRO-GRADUMET CPR DÉPÔT 30 PCE</SORTF>
    <QTYUD>Stk</QTYUD>
    <QTYUF>pce</QTYUF>
    <MULT>1</MULT>
    <NOPCS>30</NOPCS>
    <MINI>14</MINI>
    <DEPCD>N</DEPCD>
    <LOACD>N</LOACD>
    <STTOX>N</STTOX>
    <GGL>N</GGL>
    <SMDAT>1967-06-22T00:00:00+02:00</SMDAT>
    <BAGDOSSIER>10696</BAGDOSSIER>
    <DEL>false</DEL>
    <ARTCOMP>
      <COMPNO>1836</COMPNO>
      <ROLE>H</ROLE>
      <ARTNO1>685230</ARTNO1>
      <ARTNO2>685230</ARTNO2>
      <ARTNO3>685230</ARTNO3>
    </ARTCOMP>
    <ARTCOMP>
      <COMPNO>1836</COMPNO>
      <ROLE>V</ROLE>
      <ARTNO1>685230</ARTNO1>
      <ARTNO2>685230</ARTNO2>
      <ARTNO3>685230</ARTNO3>
    </ARTCOMP>
    <ARTCOMP>
      <COMPNO>5360</COMPNO>
      <ROLE>L</ROLE>
      <ARTNO1>685230</ARTNO1>
      <ARTNO2>685230</ARTNO2>
      <ARTNO3>685230</ARTNO3>
    </ARTCOMP>
    <ARTBAR>
      <CDTYP>E13</CDTYP>
      <BC>7680316440115</BC>
      <BCSTAT>A</BCSTAT>
    </ARTBAR>
    <ARTCH>
      <PHAR2>4204981</PHAR2>
      <CHTYPE>SDO</CHTYPE>
      <LINENO>1</LINENO>
      <NOUNITS>30</NOUNITS>
    </ARTCH>
    <ARTPRI>
      <VDAT>1999-01-01T00:00:00+01:00</VDAT>
      <PTYP>PPUB</PTYP>
      <PRICE>10.9</PRICE>
    </ARTPRI>
    <ARTPRI>
      <VDAT>2005-07-22T00:00:00+02:00</VDAT>
      <PTYP>PEXF</PTYP>
      <PRICE>7.92</PRICE>
    </ARTPRI>
    <ARTINS>
      <VDAT>2004-07-01T00:00:00+02:00</VDAT>
      <INCD>3</INCD>
      <NINCD>30</NINCD>
    </ARTINS>
  </ART>
  <ART>...</ART
  </ARTICLE>

I try to import it using a the follwing MySQL command:

LOAD XML LOCAL INFILE '~/Desktop/HCI/Article.xml' INTO TABLE articles ROWS IDENTIFIED BY '';

Is there anything I can do to speed it up? Other XMLs (same size, maybe less tags/attributes) are much faster (they need arround 5mins in total).

I would appreciate it to get some help from you.

My machine: Apple Macbook Pro i7, 3.1Ghz, 16GB Ram, 1TB SSD.

Thanks in advance guys and girls.

Answer 1

How is the ARTCOMP etc (so the more indented part of the data) processed? FK relations? Because I think you just ran out of memory and MySQL decided to use tmp, or the system started swapping.

Rent Charter Buses Company
READ ALSO
Why do I not get a response from JSON post request unless the number of rows is limited?

Why do I not get a response from JSON post request unless the number of rows is limited?

I am using a JSON post request to return rows from a mySQL databaseMy server-side language is php and unless I limit the results in sql to about 45 I don't get a response

239
How to convert mySQL time to python time?

How to convert mySQL time to python time?

I'm trying to retrieve time variable from mySQL database using pythonBelow shows the python code I use to retrieve the time variable from mySQL database

384
Laravel query Builder with DATE mysql function

Laravel query Builder with DATE mysql function

Hi I have a query in my laravel project which should retrieve dates between rangesMy query is:

452