Web lists-archives.com

Re: Estimate mysqldump size




Hi Olivier,

On 28/11/2018 8:00 pm, Olivier wrote:
Hello,

Is there a way that gives an estimate of the size of a mysqldump such a
way that it would always be larger than the real size?

So far, I have found:

    mysql -s -u root -e "SELECT SUM(data_length) Data_BB FROM
       information_schema.tables WHERE table_schema NOT IN
       ('information_schema','performance_schema','mysql');

but the result may be smaller than the real size.

In the above example, you also need to account for index_length, eg

mysql>  select round(SUM(data_length+index_length)/POWER(1024,2),1) Total_MB,round(SUM(data_length)/POWER(1024,2),1) data_MB,round(SUM(index_length)/POWER(1024,2),1) index_MB  FROM information_schema.tables where TABLE_SCHEMA not in ( "information_schema", "performance_schema", "mysql") ;
+----------+---------+----------+
| Total_MB | data_MB | index_MB |
+----------+---------+----------+
|   4546.0 |  4093.7 |    452.2 |
+----------+---------+----------+
1 row in set (0.00 sec)

However, this doesn't 100% map to OS file size ( if using innodb file per table ) and will likely never be 100% accurate to what the OS reports, due to fragmentation etc.


I am writting a program that takes the result of mysqldump and pipe it
in a tar file.

A typical global mysqldump ( ie taken with -A ) will be a single file.  Why are you then wanting to pipe this to a tar archive?

Its also common for mysqldump to be compressed via a pipe due to the nature of the output file created ( eg text files compress *very* well ) , to then be sent across the network , eg via ssh

mysqldump -u.. -p -A | gzip > schema.sql.gz


Aside from your stated goal of piping to tar, if we can step back a level briefly - what are you trying to achieve here?

Tar file format has the size in the header, before the
data and if the size of the dump is bigger than the size declared in the
header, tar does not like that (if the size of the dump is smaller than
the actual size, it can be padded with spaces).

So, the estimate must be larger than the actual dump, how to acheive
that?

It wont be anything other than an estimate , however it should still be reasonably close if you arent doing a *lot* of dml on it.

You could artificially inflate the expected size by ,eg multiplying by 1.1x or 1.2x , however there will always be an edge case table which will be greater still..


Regards

Ronan McGlue

MySQL Support




Thanks in advance,

Olivier



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql