Bug 1217688 - mariadb: a table is missing inserted entries from concurrent processes
Summary: mariadb: a table is missing inserted entries from concurrent processes
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Other (show other bugs)
Version: Current
Hardware: Other Other
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Antonio Teixeira
QA Contact: E-mail List
URL: https://jira.mariadb.org/browse/MDEV-...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-30 09:13 UTC by Jiri Slaby
Modified: 2024-07-08 06:46 UTC (History)
0 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
db inserter (3.75 KB, text/plain)
2023-11-30 09:13 UTC, Jiri Slaby
Details
c reproducer, incl. fork() (3.00 KB, text/plain)
2023-11-30 12:46 UTC, Jiri Slaby
Details
cr.sql to prepare the db, user, and the table (296 bytes, text/plain)
2023-11-30 12:46 UTC, Jiri Slaby
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jiri Slaby 2023-11-30 09:13:31 UTC
Created attachment 871082 [details]
db inserter

With mariadb: 11.0.2, 11.1.2 and 11.1.3 (from server:database):

From my c++ code, using connector-c, I do:

SET autocommit=0
CREATE TABLE IF NOT EXISTS sources(id INTEGER AUTO_INCREMENT PRIMARY KEY, src VARCHAR(1024) NOT NULL UNIQUE)
INSERT IGNORE INTO sources(src) VALUES (?)
START TRANSACTION

And then a lot of:

INSERT IGNORE INTO sources(src) VALUES ('src-003431-00001')

src is composed of pid and values from 0 to 49. So they are all unique.
Finally, I do:

COMMIT

If I run 10 instances in parallel like:

for aa in {1..10} ; do ./db & done ; wait

I see:

select count(id), substr(src,5,6) as sss from sources group by sss;
+-----------+--------+
| count(id) | sss    |
+-----------+--------+
|        50 | 003430 |
|        50 | 003431 |
|        50 | 003432 |
|        40 | 003433 |
|        50 | 003434 |
|        50 | 003435 |
|        50 | 003436 |
|        50 | 003437 |
|        50 | 003438 |
|        50 | 003439 |
+-----------+--------+

I.e. Some of pids stored only last 40 values, numbers 0..9 are missing in the table:

select src from sources where src like '%003433%' order by src ;

+------------------+
| src              |
+------------------+
| src-003433-00010 |
| src-003433-00011 |
| src-003433-00012 |
| src-003433-00013 |
| src-003433-00014 |
| src-003433-00015 |
| src-003433-00016 |
| src-003433-00017 |
| src-003433-00018 |
| src-003433-00019 |
| src-003433-00020 |
| src-003433-00021 |
| src-003433-00022 |
| src-003433-00023 |
| src-003433-00024 |
| src-003433-00025 |
| src-003433-00026 |
| src-003433-00027 |
| src-003433-00028 |
| src-003433-00029 |
| src-003433-00030 |
| src-003433-00031 |
| src-003433-00032 |
| src-003433-00033 |
| src-003433-00034 |
| src-003433-00035 |
| src-003433-00036 |
| src-003433-00037 |
| src-003433-00038 |
| src-003433-00039 |
| src-003433-00040 |
| src-003433-00041 |
| src-003433-00042 |
| src-003433-00043 |
| src-003433-00044 |
| src-003433-00045 |
| src-003433-00046 |
| src-003433-00047 |
| src-003433-00048 |
| src-003433-00049 |
+------------------+

Is this a bug in my program (attached) or the db? If I look into the general log, I see the inserts which apparently did not get into the table:

grep src-003433 /tmp/glog

                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00000')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00000')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00000')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00000')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00000')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00001')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00002')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00003')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00004')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00005')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00006')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00007')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00008')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00009')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00010')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00010')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00011')
                    24 Execute  INSERT IGNORE INTO sources(src) VALUES ('src-003433-00012')

Note also the repeated inserts. They come from a deadlock, so I repeat the exec in a loop (see the code).

Dockers with

    mariadb:latest (11.2.2)
    mariadb:11.1.3
    mariadb:11.1.2

all run FINE too.
Comment 1 Jiri Slaby 2023-11-30 09:15:41 UTC
Disabling LTO does not help.
Comment 2 Danilo Spinella 2023-11-30 10:39:54 UTC
Hello Jiri, thank you for the detailed report with a reproducer.

By my understanding, this is an issue that only happens with mariadb on TW. Any version of the official mariadb images works fine. Can you please confirm?
Comment 3 Jiri Slaby 2023-11-30 11:12:15 UTC
(In reply to Danilo Spinella from comment #2)
> By my understanding, this is an issue that only happens with mariadb on TW.

Yeah, I haven't tried anything else than TW so far.

> Any version of the official mariadb images works fine. Can you please
> confirm?

Concur. I wonder, if we build mariadb with some different flags. But I failed to find what is used for building the docker images (it's apparently mariadb on Ubuntu 22.04).
Comment 4 Danilo Spinella 2023-11-30 11:24:47 UTC
(In reply to Jiri Slaby from comment #3)
> (In reply to Danilo Spinella from comment #2)
> > By my understanding, this is an issue that only happens with mariadb on TW.
> 
> Yeah, I haven't tried anything else than TW so far.
> 
> > Any version of the official mariadb images works fine. Can you please
> > confirm?
> 
> Concur. I wonder, if we build mariadb with some different flags. But I
> failed to find what is used for building the docker images (it's apparently
> mariadb on Ubuntu 22.04).

One flag that usually causes problem is -DFORTIFY_SOURCE=2. lto is also often the cause of many issues, but it looks like it's not the problem here.
Comment 5 Jiri Slaby 2023-11-30 12:24:04 UTC
(In reply to Jiri Slaby from comment #3)
> (In reply to Danilo Spinella from comment #2)
> > By my understanding, this is an issue that only happens with mariadb on TW.
> 
> Yeah, I haven't tried anything else than TW so far.
> 
> > Any version of the official mariadb images works fine. Can you please
> > confirm?
> 
> Concur. I wonder, if we build mariadb with some different flags. But I
> failed to find what is used for building the docker images (it's apparently
> mariadb on Ubuntu 22.04).

Scratch this. I temporarily dropped the column size from varchar(1024) to varchar(256) in the docker images (when testing mysql-community where 1024 is not allowed for unique columns).

When varchar(1024) is used, it's broken also in the docker. Let me update the upstream issue.
Comment 6 Jiri Slaby 2023-11-30 12:46:06 UTC
Created attachment 871086 [details]
c reproducer, incl. fork()
Comment 7 Jiri Slaby 2023-11-30 12:46:58 UTC
Created attachment 871087 [details]
cr.sql to prepare the db, user, and the table
Comment 8 Jiri Slaby 2023-12-01 07:31:44 UTC
It happens in 15sp6 too:
$ grep VERSION= /etc/os-release
VERSION="15-SP6"
$ rpm -q mariadb
mariadb-10.11.5-150600.2.6.x86_64
+-----------+--------+
| count(id) | sss    |
+-----------+--------+
|        50 | 001656 |
|        50 | 001657 |
|        50 | 001658 |
|        30 | 001659 |
|        50 | 001660 |
|        46 | 001661 |
|        50 | 001662 |
|        50 | 001663 |
|        50 | 001664 |
|        50 | 001665 |
+-----------+--------+
Comment 9 Jiri Slaby 2023-12-01 07:50:30 UTC
In mysql:
> mysql> ALTER TABLE sources CHANGE COLUMN src src VARCHAR(769) NOT NULL UNIQUE;
> ERROR 1071 (42000): Specified key was too long; max key length is 3072 bytes

The maximum length of UNIQUE column in mysql is 3072 bytes, i.e. 768 utf8mb4 chars.

So if I do this in mariadb:
> MariaDB [structs_66]> ALTER TABLE sources CHANGE COLUMN src src VARCHAR(768) NOT NULL UNIQUE;
everything works. If I enlarge to 769, it breaks:
> MariaDB [structs_66]> ALTER TABLE sources CHANGE COLUMN src src VARCHAR(769) NOT NULL UNIQUE;
Comment 10 Jiri Slaby 2023-12-01 07:51:28 UTC
So it might be that the only problem is that no error (max key length is 3072 bytes) is emitted and the table is miscreated.
Comment 11 Jiri Slaby 2023-12-01 07:59:49 UTC
(In reply to Jiri Slaby from comment #10)
> So it might be that the only problem is that no error (max key length is
> 3072 bytes) is emitted and the table is miscreated.

No, support for > 3072 was added to mariadb in 10.5 via:
https://jira.mariadb.org/browse/MDEV-371

So it's likely a bug in the hashing over the virtual table.
Comment 12 Jiri Slaby 2024-07-08 06:46:36 UTC
BTW, the upstream issue:
https://jira.mariadb.org/browse/MDEV-32871

FWIW this was fixed with 11.2.2->11.2.3 update a a by-product by some other change.