>首页  >数据库开发  >查看内容

MySQL5.7X Plugin:流水线技术vs.并行查询技术【1】

2016/10/12 16:09:07|发布者: 艾悠乐|查看: 993|评论:0

本文将以MySQL 5.7 X Plugin为例,对比分析流水线(pipelining)和并行查询技术。

另一篇博文《MySQL 5.7 X Plugin支持异步查询》(Asynchronous Query Execution with MySQL 5.7 X Plugin),介绍了运行MySQL 5.7 X Plugin的方法:

  • Hash分区
  • 开放MySQL的CPU内核数连接

由于5.7 X Plugin只支持流水线技术(缩短往返延时),且不支持MySQL连接复用(MySQL在执行单项查询时,不启动CPU多核),所以需要手动操作(包括最终的结果排序)。

TL:DR; 版本

本文将分析MySQL 5.7 X Plugin、X协议和文档存储,并加以总结:

  1. 5.7 X Plugin不支持MySQL连接/会话复用。类似初始协议,每条X插件的连接都会打开一次MySQL会话;
  2. 一条5.7 X Plugin指令(在库支持的情况下)会立刻返回,且查询终止之前,仍可继续操作(异步调用)。MySQL以队列形式运行;
  3. 5.7 X Plugin没有附加的服务器级持久性配置。若不进行核查或等待服务器确认(异步),则不能确保数据写入MySQL(“fire and forget”模式)。

同时,X协议适用于以下情况:

  • 在MySQL表锁定时,想要实现一个异步客户端(不希望限制网络通信,如下载或API调用);
  • 想要使用MySQL队列,缩短往返延时。

Benchmark评价结果:流水线vs.并行vs.单项查询

多组测试对比流水线、并行处理和单项查询,结论如下:

1. 并行查询支持NodeJS:
$ time node async_wikistats.js ... All done! Total: 17753 ... real    0m30.668s
user    0m0.256s
sys     0m0.028s
2. 流水线技术支持NojeJS:
$ time node async_wikistats_pipeline.js ... All done! Total: 17753 ... real 5m39.666s
user 0m0.212s
sys 0m0.024s

测试中针对NojeJS环境下运行的流水线,复用相同连接(并且不对线程开放新连接)。

3. 直接查询——分区表:
mysql> select sum(tot_visits) from wikistats.wikistats_by_day_spark_part where url like ‘%postgresql%’;
+-----------------+ | sum(tot_visits) |
+-----------------+ | 17753           |
+-----------------+ 1 row in set (5 min 31.44 sec)
4. 直接查询——非分区表
mysql> select sum(tot_visits) from wikistats.wikistats_by_day_spark where url like ‘%postgresql%’;
+-----------------+ | sum(tot_visits) |
+-----------------+ | 17753           |
+-----------------+ 1 row in set (4 min 38.16 sec)

5.7 X Plugin流水线技术的优点:

尽管5.7 X Plugin的流水线技术没有增加太多查询响应时间(可以降低总延迟),其仍适用于某些场合,例如从网上下载东西的时候,需要缩减下载进程和文档元数据,这时就可使用youtube-dl进行搜索并下载YouTube的视频元数据。然后,将原数据JSON存入MySQL 5.7文本库。代码如下:

var mysqlx = require('mysqlx'); # This is the same as running $ youtube-dl -j -i ytsearch100:"mysql 5.7" const spawn = require('child_process').spawn; const yt = spawn('youtube-dl', ['-j', '-i', 'ytsearch100:"mysql 5.7"'], {maxBuffer: 1024 * 1024 * 128}); var mySession =
mysqlx.getSession({ host: 'localhost', port: 33060, dbUser: 'root', dbPassword: '' });
yt.stdout.on('data', (data) => { try {
                dataObj = JSON.parse(data); console.log(dataObj.fulltitle);
                mySession.then(session => {
                                                session.getSchema("yt").getCollection("youtube").add(  dataObj  )
                                                .execute(function (row) {
                                                }).catch(err => { console.log(err);
                                                })
                                                .then( function (notices) { console.log("Wrote to MySQL: " + JSON.stringify(notices))  });
                                }).catch(function (err) { console.log(err);
                                              process.exit();
                                });
        } catch (e) { console.log(" --- Can't parse json" + e );
        }
}); yt.stderr.on('data', (data) => { console.log("Error receiving data");
}); yt.on('close', (code) => { console.log(`child process exited with code ${code}`);
  mySession.then(session => {session.close() } );
});

如上例所示,执行二进制youtube-dl(需要提前安装)来搜索MySQL 5.7相关视频。无需下载视频,只需获取JSON格式的元数据(“-j”flag)。因为是JSON格式,所以可以保存至MySQL文本库。表架构如下:

CREATE TABLE `youtube` ( `doc` json DEFAULT NULL, `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL, UNIQUE KEY `_id` (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

执行案例:

$ node yt.js
What's New in MySQL 5.7 Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["3f312c3b-b2f3-55e8-0ee9-b706eddf"]}}
MySQL 5.7: MySQL JSON data type example Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["88223742-9875-59f1-f535-f1cfb936"]}}
MySQL Performance Tuning: Part 1. Configuration (Covers MySQL 5.7)
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["c377e051-37e6-8a63-bec7-1b81c6d6"]}}
Dave Stokes — MySQL 5.7 - New Features and Things That Will Break — php[world] 2014 Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["96ae0dd8-9f7d-c08a-bbef-1a256b11"]}}
MySQL 5.7 & JSON: New Opportunities for Developers - Thomas Ulin - Forum PHP 2015 Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["ccb5c53e-561c-2ed5-6deb-1b325739"]}}
Cara Instal MySQL 5.7.10 NoInstaller pada Windows Manual Part3
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["95efbd79-8d79-e7b6-a535-271640c8"]}}
MySQL 5.7 Install and Configuration on Ubuntu 14.04 Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["b8cfe132-aca4-1eba-c2ae-69e48db8"]}}

有趣的是:当NodeJS + X Plugin = Asynchronous + Pipelining时,一旦表被锁定,程序操作不会停止。现已打开两个会话:

  • session 1: $ node yt.js > test_lock_table.log
  • session 2:
mysql> lock table youtube read; select sleep(10); unlock tables; Query OK, 0 rows affected (0.00 sec)
+-----------+ | sleep(10) |
+-----------+ |         0 |
+-----------+ 1 row in set (10.01 sec)
Query OK, 0 rows affected (0.00 sec)

结果:

... Upgrade MySQL Server from 5.5 to 5.7 ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["d4d62a8a-fbfa-05ab-2110-2fd5cf6d"]}}
OSC15 - Georgi Kodinov - Secure Deployment Changes Coming in MySQL 5.7 ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["8ac1cdb9-1499-544c-da2a-5db1ccf5"]}}
MySQL 5.7: Create JSON string using mysql
FreeBSD 10.3 - Instalación de MySQL 5.7 desde Código Fuente - Source Code
Webinar replay: How To Upgrade to MySQL 5.7 - The Best Practices - part 1 How to install MySQL Server on Mac OS X Yosemite - ltamTube
Webinar replay: How To Upgrade to MySQL 5.7 - The Best Practices - part 4 COMO INSTALAR MYSQL VERSION 5.7.13 MySQL and JSON
MySQL 5.7: Merge JSON data using MySQL ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["a11ff369-6f23-11e9-187b-e3713e6e"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["06143a61-4add-79da-0e1d-c2b52cf6"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["1eb94ef4-db63-cb75-767e-e1555549"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e25f15b5-8c19-9531-ed69-7b46807a"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["02b5a4c9-6a21-f263-90d5-cd761906"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e0bef958-10af-b181-81cd-5debaaa0"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["f48fa635-fa63-7481-0668-addabbac"]}} ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["557fa5c5-3c8a-fe01-c17c-549c557e"]}}
MySQL 5.7 Install and Configuration on Ubuntu 14.04 ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["456b11d8-ba03-0aec-8e06-9517c6e1"]}}
MySQL WorkBench 6.3 installation on Ubuntu 14.04 ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["0b651987-9b23-b5e0-f8f7-49b8ba5c"]}}
Going through era of IoT with MySQL 5.7 - FOSSASIA 2016 ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e133746c-836c-a7e0-3893-292a7429"]}}
MySQL 5.7: MySQL JSON operator example ... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["4d13830d-7b30-5b31-d068-c7305e0a"]}}

由此可见,前两项是即时写入。随后,表被锁定,且无MySQL查询。同时,下载进程(此处为速度最慢的环节)继续进行,没有锁定(上述titles没有附加lines“… => wrote to MySQL:”)。表解锁后,大量等待查询通过。

这有助于下载进程的运行,而网络则成了瓶颈。传统同步查询中,表锁定则应用程序锁定(包括网络通信)。但如果有NodeJS和5.7 X Plugin,则MySQL做队列运行,下载环节继续。

流水线持久性

为了研究流水线的持久性,或者说连接中断后会发生的事,做了一些测试。这一次,在开始NodeJS之前锁定表格,切断连接,最后再解锁表格。结果如下:

Session 1:
---------- mysql> truncate table youtube_new;
Query OK, 0 rows affected (0.25 sec)
mysql> lock table youtube_new read;
Query OK, 0 rows affected (0.00 sec) mysql> select count(*) from youtube_new;
+----------+ | count(*) |
+----------+ |        0 |
+----------+ 1 row in set (0.00 sec) Session 2:
---------- (when table is locked)
$ node yt1.js
11 03  MyISAM
Switching to InnoDB from MyISAM
tablas InnoDB a MyISAM
MongoDB vs MyISAM (MariaDB/MySQL)
MySQL Tutorial 35 - Foreign Key Constraints for the InnoDB Storage Engine
phpmyadmin foreign keys myisam innodb
Convert or change database manual from Myisam to Innodb ... >100 other results omited ...
^C Session 1:
---------- mysql> select count(*) from youtube_new;
+----------+ | count(*) |
+----------+ |        0 |
+----------+ 1 row in set (0.00 sec)  Id: 4916  User: root  Host: localhost:33221  db: NULL Command: Query  Time: 28  State: Waiting for table metadata lock  Info: PLUGIN: INSERT INTO `iot`.`youtube_new` (doc) VALUES ('{"upload_date":"20140319","protocol":" mysql> unlock table;
Query OK, 0 rows affected (0.00 sec) mysql> select count(*) from youtube_new;
+----------+ | count(*) |
+----------+ |        2 |
+----------+ 1 row in set (0.00 sec) mysql>  select json_unquote(doc->'$.title') from youtube_new;
+---------------------------------+ | json_unquote(doc->'$.title')    |
+---------------------------------+ | 11 03  MyISAM                   | | Switching to InnoDB from MyISAM |
+---------------------------------+ 2 rows in set (0.00 sec)

注意:上例中,没有出现MySQL服务器发出的确认字符。代码接收MySQL回复,列出指令“Wrote to MySQL: {“_state”:{“rows_affected”:1,”doc_ids”:[“…”]}}”。此外,切断连接后,MySQL进程还在表锁等待。

有趣的是此处只有两行被插入文本库。X协议的作者Jan Kneschke被问及此处是否有“history length”或是其它可扩展的缓存(buffer)时,回答如下:

  • Q:是否有history length或缓存,而且我们能够加以调控吗? 
    • A:此处完全不存在“history”或“缓存”,且其都处于连接器层面。
  • Q:为什么最终插入两行? 
    • A:对于这个问题,我已经连接了tcpdump与port 33060(X协议),如下:

这条信息非常重要,要记住异步流水线的配置不是一尘不变的:如果应用程序出错且存在pending writes,则writes可能丢失(或被写入)。

为了充分了解该协议的运行方式,截获tcpdump(并在Jan Kneschke的帮助下进行分析):

tcpdump -i lo -s0 -w tests/node-js-pipelining.pcap "tcp port 33060"

(tcpdump可视化更新如下)

  • 点击CTRL+C,NodeJS关闭连接。由于表仍锁定,MySQL不能写,且无法返回插入结果。
  • 表解锁后,尽管连接已锁定,仍可开始第一个statement。随后确认第一项插入,并开始下一项。
  • 然而,此时脚本(客户端)已经关闭连接,且最后的数据包(此处是id)被否定。5.7 X Plugin由此发现客户端关闭连接并停止执行流水线。

最新评论

热门资讯

排行榜