Web lists-archives.com

Re: Optimize fails due to duplicate rows error but no duplicates found




(please do not top post - see my answer below)

On 2/13/2018 4:00 PM, Machiel Richards wrote:
ok, so we have managed to get an id out of the errors etc... however
when we look in the table that id does not even exist at all.


no idea what is going on here though.


------------------------------------------------------------------------
*From:* shawn l.green <shawn.l.green@xxxxxxxxxx>
*Sent:* 13 February 2018 09:51:33 PM
*To:* mysql@xxxxxxxxxxxxxxx
*Subject:* Re: Optimize fails due to duplicate rows error but no
duplicates found
Hello Machiel,

On 2/13/2018 3:02 AM, Machiel Richards wrote:
Good day guys,


     I am hoping this mail finds you well.


    I am at a bit of a loss here...


     We are trying to run optimize against a table in order to reclaim disk space from archived data which has been removed.


     However, after running for over an hour , the optimize fails stating there is a duplicate entry in the table.



      We have now spent 2 days using various methods but we are unable to find any duplicates in the primary key and also nothing on the unique key fields.


Any idea on why optimize would still be failing ?



Regards


Is it possible that the duplicate keys were the result of
re-partitioning your data where one of the "older" copies was in the
wrong partition as part of an upgrade from an earlier version?

See the entry in
https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html that
start with...
Incompatible Change; Partitioning: Changes in the KEY partitioning hashing functions used with numeric, date and time, ENUM, and SET columns in MySQL 5.5 makes tables using partitioning or subpartitioning by KEY on any of the affected column types and created  on a MySQL 5.5 or later server incompatible with a MySQL 5.1 server.
This is because the partition IDs as calculated by a MySQL 5.5 or later
server almost certainly differ from those calculated by a MySQL 5.1
server for the same table definition and data as a result of the changes
in these functions.

A normal indexed lookup against a partitioned table will use
(particularly for a PK value) "partition pruning" .  To see all of your
PK values regardless of which partition they are in, you need to scan
the table and avoid all indexes.

example:
# for a numeric PK column
CREATE TABLE myPK_list SELECT pk from sourcetable WHERE pk+0 > 0;

Then you can check the list in the generated table to find any duplicate
values.

Then you can modify a SELECT command to search each partition or
subpartition individually until you find the rows that are in the wrong
spots.
https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html


Yours,
--
Shawn Green
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Integrated Cloud Applications & Platform Services
Office: Blountville, TN

Become certified in MySQL! Visit https://www.mysql.com/certification/
for details.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql


Then another thing to consider is that someone (during the lifetime of this table) changed the character set of your table (possibly changing it from using a case-sensitive collation to a case-insensitive collation) without actually converting the data on the table to use the new character set.

Is the key being duplicated numeric or character-based?

If numeric, is the value being reported as the duplicate at the high end of the permitted range of values for that column?

Regards,

--
Shawn Green
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Integrated Cloud Applications & Platform Services
Office: Blountville, TN

Become certified in MySQL! Visit https://www.mysql.com/certification/ for details.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql