`

mysql slave 备库延迟是怎么得到的

阅读更多

    在mysql的备库的监控中有一项很重要的指标:Seconds_Behind_Master,这个值是怎么得到的呢?下面从5.1.58的代码中分析一下:

    mysql的replication中有2个比较重要的class:Master_info(rpl_mi.h), Relay_log_info(rpl_rli.h),他们分别对应于master,info文件和slave.info文件;很显然,Master_info是io_thread需要的,Relay_log_info是sql_thread需要的。Master_info中有一个变量 clock_diff_with_master,这个值记录着mysql的主库和备库的时间差,可以理解为主备的主机时间差。clock_diff_with_master变量的定义如下:

  /*
     The difference in seconds between the clock of the master and the clock of
     the slave (second - first). It must be signed as it may be <0 or >0.
     clock_diff_with_master is computed when the I/O thread starts; for this the
     I/O thread does a SELECT UNIX_TIMESTAMP() on the master.
     "how late the slave is compared to the master" is computed like this:
     clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master

  */
  long clock_diff_with_master;

     这个变量的注释直接提到了Seconds_Behind_Master的计算方法:clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master。clock_of_slave是slave的当前时间--执行show slave status的当前时间。

     先看一下clock_diff_with_master的计算:(slave.cc)。执行”start slave;“/“start slave io_thread;”后,会执行start_slave_threads来启动io thread,io thread启动后首先做的就是获取主库的mysql版本和主库的当前时间(mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()"))),获取到主库的当前时间后,用备库的当前时间减去主库的时间,得到clock_diff_with_master。

int start_slave_threads(bool need_slave_mutex, bool wait_for_start,
                        Master_info* mi, const char* master_info_fname,
                        const char* slave_info_fname, int thread_mask)
{
...
  if (thread_mask & SLAVE_IO)
    error=start_slave_thread(handle_slave_io,lock_io,lock_cond_io,
                             cond_io,
                             &mi->slave_running, &mi->slave_run_id,
                             mi, 1); //high priority, to read the most possible
...
}
...
/**
  Slave IO thread entry point.

  @param arg Pointer to Master_info struct that holds information for
  the IO thread.

  @return Always 0.
*/
pthread_handler_t handle_slave_io(void *arg)
{
...
  thd_proc_info(thd, "Checking master version");
  ret= get_master_version_and_clock(mysql, mi);
...
}
...
static int get_master_version_and_clock(MYSQL* mysql, Master_info* mi)
{
...
  master_res= NULL;
  if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
      (master_res= mysql_store_result(mysql)) &&
      (master_row= mysql_fetch_row(master_res)))
  {
    mi->clock_diff_with_master=
      (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
  }
  else if (is_network_error(mysql_errno(mysql)))
  {
    mi->report(WARNING_LEVEL, mysql_errno(mysql),
               "Get master clock failed with error: %s", mysql_error(mysql));
    goto network_err;
  }
  else 
  {
    mi->clock_diff_with_master= 0; /* The "most sensible" value */
    sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, "
                      "do not trust column Seconds_Behind_Master of SHOW "
                      "SLAVE STATUS. Error: %s (%d)",
                      mysql_error(mysql), mysql_errno(mysql));
  }
...
}

     last_timestamp_executed_by_SQL_thread则是sql slave thread最近一次执行的event的时间戳(rpl.rli.cc中的last_event_start_time变量)。而这个变量的更新则是在stmt_done()函数中完成的(rpl.rli.cc),相关定义如下:

  /*
    Used by row-based replication to detect that it should not stop at
    this event, but give it a chance to send more events. The time
    where the last event inside a group started is stored here. If the
    variable is zero, we are not in a group (but may be in a
    transaction).
   */
  time_t last_event_start_time;

  /**
    Helper function to do after statement completion.

    This function is called from an event to complete the group by
    either stepping the group position, if the "statement" is not
    inside a transaction; or increase the event position, if the
    "statement" is inside a transaction.

    @param event_log_pos
    Master log position of the event. The position is recorded in the
    relay log info and used to produce information for <code>SHOW
    SLAVE STATUS</code>.

    @param event_creation_time
    Timestamp for the creation of the event on the master side. The
    time stamp is recorded in the relay log info and used to compute
    the <code>Seconds_behind_master</code> field.
  */
  void stmt_done(my_off_t event_log_pos,
                 time_t event_creation_time);

     具体的逻辑则是,sql thread启动后,读取relaylog(netxt_event()),apply & update pos(apply_event_and_update_pos()),update_pos的时候判断是否执行到了事务的结束位置,如果执行到了,则调用stmt_done(),stmt_done()会将last_master_timestamp更新为最近一次event的创建时间(event_creation_time)。因此如果在主备基本无延迟的时候,主库执行了一个大事务,你会发现备库延迟突然很大,然后又没了,"延迟跳跃"。

/**
  Slave SQL thread entry point.

  @param arg Pointer to Relay_log_info object that holds information
  for the SQL thread.

  @return Always 0.
*/
pthread_handler_t handle_slave_sql(void *arg)
{
...
    if (exec_relay_log_event(thd,rli))
...
}
...
static int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{
...
  Log_event * ev = next_event(rli);
...
    exec_res= apply_event_and_update_pos(ev, thd, rli);
...
}
...
int apply_event_and_update_pos(Log_event* ev, THD* thd, Relay_log_info* rli)
{
...
    exec_res= ev->apply_event(rli);
...
    int error= ev->update_pos(rli);
...
}
....log_event.cc
int
Rows_log_event::do_update_pos(Relay_log_info *rli)
{
...
  if (get_flags(STMT_END_F))
  {
    /*
      Indicate that a statement is finished.
      Step the group log position if we are not in a transaction,
      otherwise increase the event log position.
    */
    rli->stmt_done(log_pos, when);
   }
...
}
...rpl_rli.cc
void Relay_log_info::stmt_done(my_off_t event_master_log_pos,
                                  time_t event_creation_time)
{
...
  if ((sql_thd->options & OPTION_BEGIN) && opt_using_transactions)
    inc_event_relay_log_pos();
  else
  {
    inc_group_relay_log_pos(event_master_log_pos);
    flush_relay_log_info(this);
    /*
      Note that Rotate_log_event::do_apply_event() does not call this
      function, so there is no chance that a fake rotate event resets
      last_master_timestamp.  Note that we update without mutex
      (probably ok - except in some very rare cases, only consequence
      is that value may take some time to display in
      Seconds_Behind_Master - not critical).
    */
#ifndef DBUG_OFF
    if (!(event_creation_time == 0 && debug_not_change_ts_if_art_event > 0))
#else
      if (event_creation_time != 0)
#endif
        last_master_timestamp= event_creation_time;
  }
}

    另外一种可能会出现延迟跳跃的情况是:主库的网络有问题,io thread拉binlog的速度很慢,备库实际有很大的延迟,而sql thread执行的时候永远是hot log,这个时候会出现,备库一直有很大的延迟,突然一下延迟没了(=0),接下去延迟又变得很大。这种情况的原因为:sql thread读取relay log是在next_event(slave.cc)中完成的,而next_event中会判断当前的读取的relay log是否是“热的”--正在被io thread使用,如果是"热的",并且读取到当前relay log的最后一个event 或io thread没有再更新relay log的时候,会将last_master_time_stamp置为0,并new一个 Rotate_log_event。而如果在这个时候执行"show slave status",延迟就是0了。这个地方需要补充一点的是“show slave status”和前面提到的逻辑,唯一有争用的锁是&rli->data_lock,但是next_event在恢复last_master_time_stamp前就释放了这个锁:

...slave.cc...
static Log_event* next_event(Relay_log_info* rli)
{
...
  safe_mutex_assert_owner(&rli->data_lock);
...
    if (!cur_log->error) /* EOF */
    {
      /*
        On a hot log, EOF means that there are no more updates to
        process and we must block until I/O thread adds some and
        signals us to continue
      */
      if (hot_log)
      {
        time_t save_timestamp= rli->last_master_timestamp;
        rli->last_master_timestamp= 0;
...
        /*
          We can, and should release data_lock while we are waiting for
          update. If we do not, show slave status will block
        */
        pthread_mutex_unlock(&rli->data_lock);
...
        rli->last_master_timestamp= save_timestamp;
...
       }
...
     }
...
}
....
/**
  Execute a SHOW SLAVE STATUS statement.

  @param thd Pointer to THD object for the client thread executing the
  statement.

  @param mi Pointer to Master_info object for the IO thread.

  @retval FALSE success
  @retval TRUE failure
*/
bool show_master_info(THD* thd, Master_info* mi)
{
...
    pthread_mutex_lock(&mi->run_lock);
    protocol->store(mi->io_thd ? mi->io_thd->proc_info : "", &my_charset_bin);
    pthread_mutex_unlock(&mi->run_lock);

    pthread_mutex_lock(&mi->data_lock);
    pthread_mutex_lock(&mi->rli.data_lock);
    pthread_mutex_lock(&mi->err_lock);
    pthread_mutex_lock(&mi->rli.err_lock);
...
    /*
      Seconds_Behind_Master: if SQL thread is running and I/O thread is
      connected, we can compute it otherwise show NULL (i.e. unknown).
    */
    if ((mi->slave_running == MYSQL_SLAVE_RUN_CONNECT) &&
        mi->rli.slave_running)
    {
      long time_diff= ((long)(time(0) - mi->rli.last_master_timestamp)
                       - mi->clock_diff_with_master);
      /*
        Apparently on some systems time_diff can be <0. Here are possible
        reasons related to MySQL:
        - the master is itself a slave of another master whose time is ahead.
        - somebody used an explicit SET TIMESTAMP on the master.
        Possible reason related to granularity-to-second of time functions
        (nothing to do with MySQL), which can explain a value of -1:
        assume the master's and slave's time are perfectly synchronized, and
        that at slave's connection time, when the master's timestamp is read,
        it is at the very end of second 1, and (a very short time later) when
        the slave's timestamp is read it is at the very beginning of second
        2. Then the recorded value for master is 1 and the recorded value for
        slave is 2. At SHOW SLAVE STATUS time, assume that the difference
        between timestamp of slave and rli->last_master_timestamp is 0
        (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result.
        This confuses users, so we don't go below 0: hence the max().

        last_master_timestamp == 0 (an "impossible" timestamp 1970) is a
        special marker to say "consider we have caught up".
      */
      protocol->store((longlong)(mi->rli.last_master_timestamp ?
                                 max(0, time_diff) : 0));
    }
    else
    {
      protocol->store_null();
    }
...
}

     到这里Seconds_Behind_Master的计算就结束了。总结起来就是Seconds_Behind_Master的计算有3部分组成,备库当前时间,sql thread最近一次执行的event的时间,主备系统的时间差。备库出现延迟跳跃,可能是大事务引起,也可能是网络问题。(ps:这2种情况,都被我遇到过:))

 

链接:

http://www.mysqlperformanceblog.com/2013/09/16/possible-reasons-when-mysql-replication-lag-is-flapping-between-0-and-xxxxx/

分享到:
评论

相关推荐

    mysql同步问题之Slave延迟很大优化方法

    一般而言,slave相对master延迟较大,其根本原因就是slave上的复制线程没办法真正做到并发。简单说,在master上是并发模式(以InnoDB引擎为主)完成事务提交的,而在slave上,复制线程只有一个sql thread用于binlog的...

    MySQL中slave监控的延迟情况分析

    在MySQL复制环境中,我们通常只根据 Seconds_Behind_Master 的值来判断SLAVE的延迟。这么做大部分情况下尚可接受,但并不够准确,而应该考虑更多因素。 首先,我们先看下SLAVE的状态: 代码如下:yejr@imysql.com [...

    新版 MySQL DBA 高级视频 基于MySQL 5.7 MySQL 8.0版本.rar

    │ 6_MySQL复制状态和延迟复制.mp4 │ 7_MySQL半同步复制.mp4 │ 作业.docx │ ├─新版MySQL DBA综合实战班 第10天 │ │ 1_课后作业讲解.mp4 │ │ 2_MySQL传统复制手动切换和GTID复制原理及切换.mp4 │ │ 3_...

    解决MySQL中的Slave延迟问题的基本教程

    一般而言,slave相对master延迟较大,其根本原因就是slave上的复制线程没办法真正做到并发。简单说,在master上是并发模式(以InnoDB引擎为主)完成事务提交的,而在slave上,复制线程只有一个sql thread用于binlog的...

    淘宝物流MySQL slave复制数据丢失问题的个人整理

    对于这几天微博上较火的关于淘宝物流MySQLslave复制数据丢失问题,我自己也比较关注,然后根据沃趣科技的一篇分析文章算是大概明白了其中的明细,现在再来根据我自己的理解理一下思路,顺便加深自己的理解。...

    部署MySQL延迟从库的好处小结

    前言 MySQL的主从复制(replication)关系,不太...本文要讨论的延迟从库,也是在5.6之后才有的功能,在这之前需要用percona pt-slave-delay 工具来变相实现。 另外,从5.6版本开始增加了并行复制,不过这时还是基于sch

    MySQL中从库延迟状况排查的一则案例

    今天给一个客户巡检的情况下发从库没有业务的情况mysqld的cpu的一个core占用100%.查主库慢查询也没有关于写的SQL. 可以说是典的单进程复制把一个cpu占满造成的.知道原因了,就好分析了. 分析一下binlog中写的什么...

    深入mysql主从复制延迟问题的详解

    MySQL的复制(replication)是一个异步的复制,从一个MySQL instace(称之为Master)复制到另一个MySQL instance(称之Slave)。实现整个复制操作主要由三个进程完成的,其中两个进程在Slave(Sql进程和IO进程),...

    MYSQL主从不同步延迟原理分析及解决方案

    1. MySQL数据库主从同步延迟原理。要说延时原理,得从mysql的数据库主从复制原理说起,mysql的主从复制都是单线程的操作,主库对所有DDL和DML产生binlog,binlog是顺序写,所以效率很高,slave的Slave_IO_Running...

    MySQL主从延迟现象及原理分析详解

    凌晨对线上一张表添加索引,表数据量太大(1亿+数据,数据量50G以上),造成主从延迟几个小时,各个依赖从库的系统无法查询数据,最终影响业务。 现在就梳理下主从延迟的原理。 二、原理 根据 MySQL 官方文档 MySQL ...

    MySQL主从复制延迟原因以及解决方案

    2、主库会有binlog dump线程实时监测binlog的变更并将这些新的events推给从库(Master has sent all binlog to slave; waiting for more updates) 3、从库的IO Thread接收这些events,并将其记录入re

    深入探索MySQL主从架构与读写分离:提升数据安全和性能的实战指南

    此外,还探讨了MySQL高可用方案和分库分表的实用性,说明了这些技术在处理大数据量时如何提升数据库性能和可扩展性。总之,本文为开发者提供了一套完整的MySQL主从架构与读写分离的实战指南,旨在帮助他们有效地解决...

    MySQL多线程同步MySQL-Transfer介绍

    其主要目的是为了解决原生版本的主从同步里,从库是单线程apply主库的binlog,导致的延迟。最近完成测试的版本将multi-master(byP.Linux)合并到Transfer中并针对支付宝的应用需求做了定制性能改进。这里做一个已经...

    MySQL数据延迟跳动的问题解决

    首先在高可用检测中,有一套环境的检测时断时续,经过排查发现是数据库产生了延迟,在登录到从库show slave status查看,会发现Seconds_behind_master的值是不断跳动的,即从0~39~0~39这样的频率不断跳动,让人很搓...

    Mysql 1864 主从错误解决方法

    从字面意思看了一下是因为slave_pending_jobs_size_max默认值为16777216(16MB),但是slave接收到的slave_pending_jobs_size_max为17085453(17M); 解决方案 从库执行如下SQL mysql&gt;stop slave; mysql&gt;set ...

    详解mysql集群:一主多从架构实现

    实验环境: 1、三台CentOS 7 服务器 2、mysql5.7.26(三台都通过yum...4、还能通过及时增加从库来减少读库压力 5、主库单点故障 6、数据一致性问题(同步延迟造成) 7、一但主机宕机就不可以进行写操作 二、搭建集群

    MySQL 主从复制搭建,基于日志(binlog)

     简单来说,是保证主SQL(Master)和从SQL(Slave)的数据是一致性的,向Master插入数据后,Slave会自动从Master把修改的数据同步过来(有一定的延迟),通过这种方式来保证数据的一致性,是主从复制。  MySQL...

Global site tag (gtag.js) - Google Analytics