
chain() makes collect() very slow

While working on a SO question, we wondered whether chain() would give acceptable speed. After some digging and benchmarking, we came to the conclusion that collect() is slow because it uses while let. I don't really understand why that makes collect() so slow, but it is a fact.

However, we saw that the for_each() implementation of chain() (probably thanks to fold()) doesn't have this problem and is a lot faster.
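To illustrate why fold() might help, here is a minimal sketch of a chain-like adapter. `MyChain` and `my_chain` are hypothetical names invented for this example; this is a simplified stand-in, not the standard library's actual `Chain` code. It shows that `next()` has to re-check which half is active on every call, while `fold()` can drain each half in its own loop. The default `for_each()` is built on `fold()`, so it takes the second path.

```rust
// Hypothetical, simplified stand-in for std's Chain adapter.
// It is NOT the real implementation, only an illustration.
struct MyChain<A, B> {
    a: Option<A>, // None once the first half is exhausted
    b: B,
}

fn my_chain<A, B>(a: A, b: B) -> MyChain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    MyChain { a: Some(a), b }
}

impl<A, B> Iterator for MyChain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    // External iteration: every call branches on the adapter's state.
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(a) = self.a.as_mut() {
            if let Some(item) = a.next() {
                return Some(item);
            }
            self.a = None;
        }
        self.b.next()
    }

    // Internal iteration: each half is consumed by its own fold,
    // so there is no per-item state check.
    fn fold<Acc, F>(self, init: Acc, mut f: F) -> Acc
    where
        F: FnMut(Acc, Self::Item) -> Acc,
    {
        let mut acc = init;
        if let Some(a) = self.a {
            acc = a.fold(acc, &mut f);
        }
        self.b.fold(acc, f)
    }
}
```

With this sketch, `my_chain(1..3, 3..5).for_each(...)` goes through `fold()`, whereas a `while let Some(x) = it.next()` loop goes through `next()` for every item. Whether this fully explains the numbers measured in this issue is an assumption.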

#![feature(test)]
extern crate test;

use either::Either; // 1.5.2
use std::iter;

#[derive(Debug, Default)]
pub struct Data<X, Y> {
    head: Option<Y>,
    pairs: Vec<(X, Y)>,
    tail: Option<X>,
}

impl<X, Y> Data<X, Y> {
    pub fn iter(&self) -> impl Iterator<Item = Either<&X, &Y>> {
        let head = self.head.iter().map(Either::Right);

        let pairs = self.pairs.iter().flat_map(|(a, b)| {
            let a = iter::once(Either::Left(a));
            let b = iter::once(Either::Right(b));
            a.chain(b)
        });

        let tail = self.tail.iter().map(Either::Left);

        head.chain(pairs).chain(tail)
    }
}

#[derive(Debug)]
struct AData(usize);
#[derive(Debug)]
struct BData(usize);

#[cfg(test)]
mod tests {
    use crate::{AData, BData, Data};
    use test::Bencher;

    #[bench]
    fn test_for_each(b: &mut Bencher) {
        b.iter(|| {
            let data = Data {
                head: Some(BData(84)),
                pairs: std::iter::repeat_with(|| (AData(42), BData(84)))
                    .take(20998)
                    .collect(),
                tail: Some(AData(42)),
            };

            let mut data_bis = Vec::with_capacity(21000);
            data.iter().for_each(|x| data_bis.push(x));
        });
    }

    #[bench]
    fn test_collect(b: &mut Bencher) {
        b.iter(|| {
            let data = Data {
                head: Some(BData(84)),
                pairs: std::iter::repeat_with(|| (AData(42), BData(84)))
                    .take(20998)
                    .collect(),
                tail: Some(AData(42)),
            };

            let _: Vec<_> = data.iter().collect();
        });
    }
}
test tests::test_collect  ... bench:   1,682,529 ns/iter (+/- 2,157,023)
test tests::test_for_each ... bench:     609,031 ns/iter (+/- 750,944)

So, should we change the implementation of collect() to use for_each()? Note that a for loop doesn't solve the problem; for this to be optimized, we need to use for_each().
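Until something like that lands, a caller can get the same effect by hand. The helper below is only a sketch: `collect_with_for_each` is a hypothetical name, not a standard library API, and it mirrors what the test_for_each benchmark does, except that it reserves capacity from the lower bound of size_hint() instead of a hard-coded 21000.

```rust
/// Hypothetical helper (not part of std): builds a Vec through
/// internal iteration (for_each) instead of repeated next() calls.
pub fn collect_with_for_each<I: Iterator>(iter: I) -> Vec<I::Item> {
    // Reserve at least the lower bound reported by the iterator.
    let (lower, _) = iter.size_hint();
    let mut out = Vec::with_capacity(lower);
    // for_each lets adapters such as chain() drive the loop themselves.
    iter.for_each(|item| out.push(item));
    out
}
```

In the benchmark above, this would replace `data.iter().collect()` with `collect_with_for_each(data.iter())`. A for loop would not be equivalent here, because it desugars to repeated next() calls.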

This issue has been assigned to @hbina.

rust-lang/rust

mati865

@Stargateur I compiled Rust with https://github.com/rust-lang/rust/issues/63340#issuecomment-518919638, but it gave me slightly worse results on the windows-gnu toolchain (the linux-gnu toolchain doesn't reproduce the issue for me, most likely because I have the latest glibc).

@hbina if you still want to work on this issue, you will have to create an environment where you can reproduce the slowness (maybe by running old distros in Docker?).
