Multi-line parsing supports SIMD optimization #1872

zhongyuankai · 2024-11-11T12:15:15Z

Use SIMD to speed up newline scanning during multi-line parsing.
The following are performance comparison results under 150 tasks and 180 MB/s of traffic:
Before optimization:

After optimization:

After optimization, overall performance improves by about 8%. When the GetNextLine method is benchmarked in isolation, it is about twice as fast.

CLAassistant · 2024-11-11T12:15:22Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

zhongyuankai does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

linrunqi08 · 2024-11-12T05:34:17Z

@zhongyuankai Can you share the performance comparison before and after this PR?

linrunqi08 · 2024-11-12T05:36:10Z

@zhongyuankai Is it convenient to communicate on WeChat or DingTalk?

zhongyuankai · 2024-11-12T09:06:40Z

@linrunqi08 Thank you for your reply. I have updated the comment. I am happy to communicate with you. How can I contact you on WeChat?

linrunqi08 · 2024-11-13T02:44:53Z

@zhongyuankai You can add my WeChat ID: linrunqi08

yyuuttaaoo · 2024-12-17T11:55:14Z

core/file_server/reader/LogFileReader.cpp

+    const int vecSize = 32;
+    __m256i newlineVec = _mm256_set1_epi8('\n');
+
+    for (int32_t pos = end - vecSize; pos >= 0; pos -= vecSize) {


What happens if the length is not an integer multiple of 32? It looks like the remaining bytes are not checked?

Multi-line parsing supports SIMD optimization

46e9a4c

zhongyuankai force-pushed the add_simd_support branch from e1fdb43 to 46e9a4c on November 12, 2024 01:11

yyuuttaaoo reviewed Dec 17, 2024
