If you are wondering where the data of this site comes from, please visit https://api.github.com/users/BurntSushi/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.
Andrew Gallant BurntSushi @salesforce Marlborough, MA https://burntsushi.net I love to code.

BurntSushi/byteorder 678

Rust library for reading/writing numbers in big-endian and little-endian.

BurntSushi/aho-corasick 488

A fast implementation of Aho-Corasick in Rust.

BurntSushi/bstr 390

A string type for Rust that is not required to be valid UTF-8.

BurntSushi/advent-of-code 372

Rust solutions to AoC 2018

BurntSushi/chan 366

Multi-producer, multi-consumer concurrent channel for Rust.

BurntSushi/cargo-benchcmp 264

A small utility to compare Rust micro-benchmarks.

BurntSushi/chan-signal 126

Respond to OS signals with channels.

BurntSushi/critcmp 120

A command line tool for comparing benchmarks run by Criterion.

BurntSushi/clibs 90

A smattering of miscellaneous C libraries. Includes sane argument parsing, a thread-safe multi-producer/multi-consumer queue, and implementation of common data structures (hashmaps, vectors and linked lists).

BurntSushi/blog 30

My blog.

issue closed BurntSushi/ripgrep

Ripgrep should ignore all the `--glob !{paths}` defined in the `$RIPGREP_CONFIG_PATH`.

What version of ripgrep are you using?

ripgrep 13.0.0
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

brew install --formula ripgrep

What operating system are you using ripgrep on?

macOS 11.6

Describe your bug.

Ripgrep is not ignoring the --glob !{paths} defined in the $RIPGREP_CONFIG_PATH.

What are the steps to reproduce the behavior?

In a clean temporary directory:

❯ cat << EOF > /Users/me/.ripgreprc
--smart-case
--no-ignore
--hidden
--glob
!{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}
--max-columns
512
EOF
❯ export RIPGREP_CONFIG_PATH=/Users/me/.ripgreprc
❯ git clone https://github.com/zimfw/utility.git
❯ rg --debug --files

What is the actual behavior?

Ripgrep is not ignoring the .git path as expected. This is the output with debug enabled:

DEBUG|rg::config|crates/core/config.rs:40: /Users/me/.ripgreprc: arguments loaded from config file: ["--smart-case", "--no-ignore", "--hidden", "--glob", "!{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}", "--max-columns", "512"]
DEBUG|rg::args|crates/core/args.rs:543: final argv: ["rg", "--smart-case", "--no-ignore", "--hidden", "--glob", "!{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}", "--max-columns", "512", "--debug", "--files"]
DEBUG|globset|crates/globset/src/lib.rs:416: glob converted to regex: Glob { glob: "{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}", re: "(?-u)^(/Volumes|/System/Volumes|cdk\\.out|\\.svn|\\.idea|\\.git|\\.direnv)$", opts: GlobOptions { case_insensitive: false, literal_separator: true, backslash_escape: true }, tokens: Tokens([Alternates([Tokens([Literal('/'), Literal('V'), Literal('o'), Literal('l'), Literal('u'), Literal('m'), Literal('e'), Literal('s')]), Tokens([Literal('/'), Literal('S'), Literal('y'), Literal('s'), Literal('t'), Literal('e'), Literal('m'), Literal('/'), Literal('V'), Literal('o'), Literal('l'), Literal('u'), Literal('m'), Literal('e'), Literal('s')]), Tokens([Literal('c'), Literal('d'), Literal('k'), Literal('.'), Literal('o'), Literal('u'), Literal('t')]), Tokens([Literal('.'), Literal('s'), Literal('v'), Literal('n')]), Tokens([Literal('.'), Literal('i'), Literal('d'), Literal('e'), Literal('a')]), Tokens([Literal('.'), Literal('g'), Literal('i'), Literal('t')]), Tokens([Literal('.'), Literal('d'), Literal('i'), Literal('r'), Literal('e'), Literal('n'), Literal('v')])])]) }
DEBUG|globset|crates/globset/src/lib.rs:421: built glob set; 0 literals, 0 basenames, 0 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 1 regexes
utility/.git/packed-refs
utility/.git/index
utility/.git/refs/remotes/origin/HEAD
utility/.git/refs/heads/master
utility/.git/hooks/push-to-checkout.sample
utility/.git/hooks/update.sample
utility/.git/hooks/pre-push.sample
utility/.git/hooks/pre-applypatch.sample
utility/.git/hooks/pre-merge-commit.sample
utility/.git/hooks/post-update.sample
utility/.git/hooks/prepare-commit-msg.sample
utility/.git/hooks/pre-receive.sample
utility/.git/hooks/fsmonitor-watchman.sample
utility/.git/hooks/applypatch-msg.sample
utility/.git/hooks/pre-commit.sample
utility/.git/hooks/pre-rebase.sample
utility/.git/hooks/commit-msg.sample
utility/.git/description
utility/.git/logs/refs/remotes/origin/HEAD
utility/.git/logs/refs/heads/master
utility/.git/logs/HEAD
utility/.git/info/exclude
utility/.git/HEAD
utility/.git/objects/pack/pack-d8be03228476b6d44a25ce74da3c45ca2bd200b2.idx
utility/.git/objects/pack/pack-d8be03228476b6d44a25ce74da3c45ca2bd200b2.pack
utility/.git/config
utility/functions/mkpw
utility/functions/mkcd
utility/init.zsh
utility/.gitignore
utility/README.md
utility/LICENSE

What is the expected behavior?

Ripgrep should have ignored all the --glob !{paths} defined in the $RIPGREP_CONFIG_PATH.

closed time in a day

ericbn

issue comment BurntSushi/ripgrep

Ripgrep should ignore all the `--glob !{paths}` defined in the `$RIPGREP_CONFIG_PATH`.

It's not that. It's actually a semantic of gitignore here that I myself forgot about. Basically, if you have a glob that contains a /, then the glob has to match from the beginning of the path:

// If there is a literal slash, then this is a glob that must match the
// entire path name. Otherwise, we should let it match anywhere, so use
// a **/ prefix.
if !is_absolute && !line.chars().any(|c| c == '/') {
    // ... but only if we don't already have a **/ prefix.
    if !glob.has_doublestar_prefix() {
        glob.actual = format!("**/{}", glob.actual);
    }
}

You can see this in action when you split the {...} part into separate globs:

$ rg --files --hidden -g '!{.git,/quux}'
.test2
utility/LICENSE
utility/.gitignore
utility/.git/objects/pack/pack-d8be03228476b6d44a25ce74da3c45ca2bd200b2.idx
utility/.git/objects/pack/pack-d8be03228476b6d44a25ce74da3c45ca2bd200b2.pack
utility/.git/description
utility/.git/refs/remotes/origin/HEAD
utility/.git/refs/heads/master
utility/.git/packed-refs
utility/.git/logs/refs/remotes/origin/HEAD
utility/.git/logs/HEAD
utility/.git/logs/refs/heads/master
utility/.git/config
utility/.git/HEAD
utility/.git/hooks/pre-commit.sample
utility/.git/hooks/fsmonitor-watchman.sample
utility/.git/info/exclude
utility/.git/hooks/post-update.sample
utility/.git/hooks/pre-rebase.sample
utility/.git/hooks/pre-merge-commit.sample
utility/.git/hooks/update.sample
utility/.git/hooks/pre-push.sample
utility/.git/hooks/prepare-commit-msg.sample
utility/.git/hooks/pre-receive.sample
utility/.git/hooks/applypatch-msg.sample
utility/.git/hooks/commit-msg.sample
utility/.git/hooks/push-to-checkout.sample
utility/.git/hooks/pre-applypatch.sample
utility/.git/index
utility/README.md
utility/init.zsh
test1
.test3/foo
.test3/.git/foo
utility/functions/mkpw
utility/functions/mkcd

$ rg --files --hidden -g '!.git' -g '!/quux'
.test2
utility/LICENSE
utility/.gitignore
utility/functions/mkcd
utility/functions/mkpw
utility/README.md
utility/init.zsh
.test3/foo
test1

So you just need to split your glob into things that contain a / and things that don't.
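
As a rough illustration of the rule quoted above, the sketch below reimplements only the prefixing decision on plain strings. `effective_glob` is a hypothetical helper, not part of ripgrep or the ignore crate, and it ignores everything else the real parser does (negation, escaping, trailing slashes).

```rust
/// Hypothetical helper (an assumption for illustration, not ripgrep's API).
/// Returns the glob that would actually be matched under the rule above:
/// a pattern without any '/' may match at any depth, so it gains a `**/`
/// prefix; a pattern with a '/' must match from the start of the path.
fn effective_glob(line: &str) -> String {
    let is_absolute = line.starts_with('/');
    let has_slash = line.chars().any(|c| c == '/');
    let has_doublestar_prefix = line == "**" || line.starts_with("**/");

    if !is_absolute && !has_slash && !has_doublestar_prefix {
        format!("**/{}", line)
    } else {
        line.to_string()
    }
}
```

Under this model, `.git` becomes `**/.git` and matches at any depth, while `{.git,/quux}` contains a `/` and stays anchored to the root, which is consistent with `utility/.git` slipping through in the first command above.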

ericbn

comment created time in a day

issue comment BurntSushi/ripgrep

Ripgrep should ignore all the `--glob !{paths}` defined in the `$RIPGREP_CONFIG_PATH`.

I can't reproduce your problem. Please provide an actual reproduction. I should be able to run precisely your commands on some precise input and get an output that matches yours. Otherwise, from what I can see, --glob !{paths} is working just fine:

$ tree -a
.
├── .git
│   └── foo
├── test1
├── .test2
└── .test3
    └── foo

2 directories, 4 files

$ RIPGREP_CONFIG_PATH=/tmp/ripgreprc-i2005 rg --files
.test2
.test3/foo
test1

$ RIPGREP_CONFIG_PATH=/tmp/ripgreprc-i2005 rg --files -g '!.test2'
.test3/foo
test1

$ RIPGREP_CONFIG_PATH=/tmp/ripgreprc-i2005 rg --files -g '!{.test2,.test3}'
test1

$ RIPGREP_CONFIG_PATH=/tmp/ripgreprc-i2005 rg --files
.test2
.test3/foo
test1

$ cat /tmp/ripgreprc-i2005
--smart-case
--no-ignore
--hidden
--glob
!{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}
--max-columns
512

$ vim /tmp/ripgreprc-i2005

$ cat /tmp/ripgreprc-i2005
--smart-case
--no-ignore
--hidden
--glob
!{.direnv,.git,.idea,.svn,cdk.out,/System/Volumes,/Volumes}
--max-columns
512
--glob
!{.test2,.test3}

$ RIPGREP_CONFIG_PATH=/tmp/ripgreprc-i2005 rg --files
test1
ericbn

comment created time in a day

issue closed BurntSushi/ripgrep

Returns multi-line error for escaped \\n searches

What version of ripgrep are you using?

ripgrep 12.1.1
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

apt-get

What operating system are you using ripgrep on?

PopOS (a flavour of Ubuntu)

Describe your bug.

ripgrep returns the error "the literal '"\n"' is not allowed in a regex" for the regex \\n because it thinks it should be a multi-line search. Yet the \n is actually escaped, so it doesn't match a literal newline character; it matches \n literally. The error message should only appear for \n, not \\n.

What are the steps to reproduce the behavior?

Run rg \\n

What is the actual behavior?

$ rg \\n
the literal '"\n"' is not allowed in a regex

Consider enabling multiline mode with the --multiline flag (or -U for short).
When multiline mode is enabled, new line characters can be matched.

What is the expected behavior?

Not return an error for regexes containing \\n

closed time in a day

Mattsi-Jansky

issue comment BurntSushi/ripgrep

Returns multi-line error for escaped \\n searches

The behavior here is correct. It's your escaping that is not right. Try, for example, rg \n. That doesn't search for a newline. It searches for n. It's rg \\n that searches for a newline. To do it your way, I believe you would need rg \\\\n.

To be honest, I confess that I do not quite grok the full unquoted escaping rules of shell, so I can't give you a full explanation here. Instead, I can offer you some wisdom: when using escape sequences or special characters, consider putting them in single quotes. For example, rg '\\n' behaves how you want here.

Note that ripgrep does not detect newline characters in regexes with a simple search. Its detection is precisely done via the regex parser itself. So if the regex parser gets it wrong, then either something is very very very wrong (unlikely) or your input isn't the input you think it is (much more likely).
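
To make the escaping concrete, here is a minimal sketch of what an unquoted shell word turns into before ripgrep ever sees it. It assumes the basic POSIX rule that an unquoted backslash is dropped and the next character is kept literally; `unquoted_shell_word` is a hypothetical name, and the model deliberately ignores quoting, expansion and line continuation.

```rust
/// Simplified model (an assumption, not a full shell parser) of how a
/// POSIX shell processes backslashes in an unquoted word: each backslash
/// is removed and the character after it is kept as-is.
fn unquoted_shell_word(arg: &str) -> String {
    let mut out = String::new();
    let mut chars = arg.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                // Keep the escaped character literally.
                Some(next) => out.push(next),
                // A trailing lone backslash is kept here for simplicity.
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

In this model, typing `\n` hands the regex engine `n`, typing `\\n` hands it the two characters `\n` (which the regex parser reads as a newline), and typing `\\\\n` hands it `\\n` (an escaped backslash followed by `n`). Single-quoting the argument skips this step entirely.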

Mattsi-Jansky

comment created time in a day

issue comment rust-lang/rust

Decide whether `asm!` and/or `global_asm!` should be exported from the prelude.

I personally did not tick my box because I'm not really convinced at all that either of these macros should be in the prelude, now that we have the ability to namespace macros. I just do not think inline assembly is anywhere near common enough to be deserving of it. I do not think comparisons to Vec or Option are particularly good here, because the frequency of use across all domains is nowhere near the same.

I've stopped short of registering a blocking concern because I don't know that I feel strongly enough to go be a lone voice against this if everyone else wants it.

bstrie

comment created time in 2 days

issue comment BurntSushi/ripgrep

[globset] supporting paths with Windows separators

I think there might be a ticket for this already.

In general, I'm not opposed to this, but I don't know when it's going to happen. The globbing/gitignore code is pretty hairy and one of the more common sources of ripgrep bugs. So even just reviewing changes to it is difficult.

sunshowers

comment created time in 2 days

issue comment rust-lang/rust

Please reconsider endorsing `dirs` and deprecating `std::env::home_dir`

I use $HOME a lot: https://github.com/BurntSushi/dotfiles/search?q=%24HOME

A lot of those are shell scripts, but it's not that uncommon for shell scripts to "evolve" into programs in other languages (like Rust).

Xaeroxe

comment created time in 2 days

issue closed rust-lang/regex

Significant performance drop when used in multithreading environment

What version of regex are you using?

1.5.4

Describe the bug at a high level.

Sharing a Regex object between threads could cause significant performance drop.

What are the steps to reproduce the behavior?

Create a new binary cargo project, then add the following dependencies:

  • criterion
  • once_cell
  • regex

Replace the content of src/main.rs with:

use criterion::Criterion;
use once_cell::sync::Lazy;
use regex::Regex;
use std::thread;

fn new_regex() -> Regex {
    Regex::new("a{1000}").unwrap()
}

fn lazy_static_regex(f: impl FnOnce(&Regex)) {
    static REGEX: Lazy<Regex> = Lazy::new(new_regex);

    f(&REGEX);
}

fn thread_local_regex(f: impl FnOnce(&Regex)) {
    thread_local! {
        static REGEX: Lazy<Regex> = Lazy::new(new_regex);
    }

    REGEX.with(|regex| f(regex));
}

fn run_tests(regex: &Regex) {
    for _ in 0..10000 {
        regex.is_match("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
    }
}

fn run_in_threads(f: impl FnOnce() + Copy + Send + 'static) {
    let threads = (0..4).map(|_| thread::spawn(f)).collect::<Vec<_>>();

    for thread in threads {
        thread.join().unwrap();
    }
}

fn benchmark(c: &mut Criterion) {
    let mut benchmark_group = c.benchmark_group("regex");

    benchmark_group.bench_function("lazy static", |b| {
        b.iter(|| run_in_threads(|| lazy_static_regex(run_tests)))
    });

    benchmark_group.bench_function("thread local", |b| {
        b.iter(|| run_in_threads(|| thread_local_regex(run_tests)))
    });
}

criterion::criterion_group!(benches, benchmark);

criterion::criterion_main!(benches);

Execute cargo run --release -- --bench.

What is the actual behavior?

I noticed that the lazy static version is 20 times slower than the thread local one on my PC:

    Finished release [optimized] target(s) in 0.07s
     Running `target/release/parallel-regex --bench`
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Gnuplot not found, using plotters backend
regex/lazy static       time:   [8.1570 ms 8.1814 ms 8.2064 ms]                              
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
regex/thread local      time:   [396.71 us 399.19 us 401.81 us]                               
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

What is the expected behavior?

They should perform roughly the same.

closed time in 2 days

EFanZh

issue comment rust-lang/regex

Significant performance drop when used in multithreading environment

This is documented here: https://github.com/rust-lang/regex/blob/master/PERFORMANCE.md#using-a-regex-from-multiple-threads

The relevant code that is impacting things here is: https://github.com/rust-lang/regex/blob/d6bc7a4c3b58e1d618024aaededa722df32fa6e8/src/pool.rs

Specifically, if you profile your "lazy static" benchmark, you'll see that a good chunk of the time is being spent locking and unlocking a mutex.

I don't have any specific plans to improve this, so I'm going to close it, but ideas/patches for improving things here are welcome.
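
The contention described above can be modelled with a toy type that uses only the standard library. This is a sketch under loud assumptions: `ToyRegex` and `count_matches` are hypothetical and do not reflect the regex crate's actual pool implementation. They only show why threads sharing one object serialize on a mutex, and why giving each thread its own clone (the approach the linked performance notes discuss, as far as I can tell) avoids that.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy stand-in for a compiled regex (NOT the regex crate's real types):
/// an immutable pattern plus mutable scratch space guarded by a mutex.
struct ToyRegex {
    pattern: Arc<str>,
    scratch: Mutex<Vec<usize>>,
}

impl ToyRegex {
    fn new(pattern: &str) -> ToyRegex {
        ToyRegex { pattern: Arc::from(pattern), scratch: Mutex::new(Vec::new()) }
    }

    /// Every search locks the scratch space, so threads that share one
    /// ToyRegex take turns on this mutex.
    fn is_match(&self, haystack: &str) -> bool {
        let mut scratch = self.scratch.lock().unwrap();
        scratch.clear();
        scratch.extend(haystack.match_indices(&*self.pattern).map(|(i, _)| i));
        !scratch.is_empty()
    }
}

impl Clone for ToyRegex {
    /// A clone shares the immutable pattern but gets fresh scratch space.
    fn clone(&self) -> ToyRegex {
        ToyRegex { pattern: Arc::clone(&self.pattern), scratch: Mutex::new(Vec::new()) }
    }
}

/// Runs `searches` lookups on each of `threads` threads and returns the
/// total number of matches. With `clone_per_thread`, each thread searches
/// with a private clone instead of contending on the shared one.
fn count_matches(
    re: &Arc<ToyRegex>,
    threads: usize,
    searches: usize,
    clone_per_thread: bool,
) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = Arc::clone(re);
            thread::spawn(move || {
                let local = if clone_per_thread {
                    Some(ToyRegex::clone(&shared))
                } else {
                    None
                };
                let re: &ToyRegex = local.as_ref().unwrap_or(&*shared);
                (0..searches).filter(|_| re.is_match("xxabxx")).count()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Both modes return the same answer; only the amount of lock traffic differs. The thread-local variant in the benchmark above achieves the same separation by building one regex per thread.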

EFanZh

comment created time in 2 days

issue comment BurntSushi/ripgrep

is it possible to transfer the reconciliation of 2 bases to CUDA NVidia?

but it could have become several tens or hundreds of times even better

Citation needed.

zh76internetru

comment created time in 3 days

issue comment BurntSushi/ripgrep

is it possible to transfer the reconciliation of 2 bases to CUDA NVidia?

Why did you close the question?

Because I answered the question.

In the next versions, you may be able to work on a video card?

Nope. I don't do GPU programming. It's too much of a niche. I can pretty safely say I'll ~never work on GPU. Maybe something in that niche will change in the years to come that will draw me in, but there's no point in tracking something that is unlikely to ever happen.

Like I said, go out and build it. If you aren't a programmer, then you might have to pay someone to do it or find someone who is specifically interested in GPU programming and regex searching.

zh76internetru

comment created time in 3 days

issue closed BurntSushi/ripgrep

is it possible to transfer the reconciliation of 2 bases to CUDA NVidia?

Hello. I am comparing two key databases stored as two text files. The first file is a template with 1 million keys; the second has several million keys, among which I need to find similar ones. Note: since the files are large, I have to split both the template and the checked file into several parts, and then check each checked part against each part of the template. I use a bat file for this. In principle, it is tolerable. About processing speed: a template piece has 65,000 lines, and a piece of the checked file also has 65,000 lines. The CPU checks them in 5 seconds. My request for speeding up the process: is it possible to transfer the reconciliation of 2 bases to CUDA NVidia? It would be even faster. Thanks.

closed time in 3 days

zh76internetru

issue comment BurntSushi/ripgrep

is it possible to transfer the reconciliation of 2 bases to CUDA NVidia?

and this process, is it possible to do such processing using a video card, and not a CPU?

I don't know. Maybe. I'm not the right person to ask. Sorry.

zh76internetru

comment created time in 3 days

issue comment BurntSushi/ripgrep

is it possible to transfer the reconciliation of 2 bases to CUDA NVidia?

I don't know what you're asking. I don't know what "is it possible to transfer the reconciliation of 2 bases to CUDA NVidia" means. Why not build a proof of concept and test it?

zh76internetru

comment created time in 4 days

issue comment BurntSushi/memchr

Potential use of `core::hint::unreachable_unchecked` to avoid bounds checks for all users

Sorry, I don't quite understand the request here. I would say there are a few dimensions to it:

  1. Do you have a real benchmark where eliding the check improves things? If so, I'd love to see it, because it is somewhat surprising to me that it would help much. In particular, memchr is always going to result in at least a pointer load and a function call, so a single branch in addition to that probably isn't adding much overhead.
  2. More importantly, it's unclear to me why you have this conditional at all. Why not just elide if i >= s.len() { completely? Maybe what you're saying---without actually saying it---is that the conditional provides a way for other code to avoid bounds checks. If so, then I'd really appreciate more detail, particularly with respect to (1).
  3. How would you go about moving the hint inside this crate?
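
For context, the pattern under discussion might look like the sketch below. This is a guess at the shape of the request, not code from the issue or from memchr: `find_byte` and `split_at_byte` are hypothetical, and `Iterator::position` stands in for `memchr::memchr` so the example needs only the standard library.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical wrapper; `position` stands in for `memchr::memchr`.
/// The extra branch tells the optimizer that a returned index is always
/// in bounds.
fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    let i = haystack.iter().position(|&b| b == needle)?;
    if i >= haystack.len() {
        // SAFETY: `position` only returns the index of an element that
        // exists, so `i < haystack.len()` always holds and this branch
        // is never taken.
        unsafe { unreachable_unchecked() }
    }
    Some(i)
}

/// A caller whose bounds check (inside `split_at`) the optimizer may be
/// able to drop once `find_byte` is inlined. That outcome is not
/// guaranteed and would need to be confirmed with a benchmark.
fn split_at_byte(needle: u8, haystack: &[u8]) -> Option<(&[u8], &[u8])> {
    let i = find_byte(needle, haystack)?;
    Some(haystack.split_at(i))
}
```

Whether the hint pays for itself is exactly what question (1) above asks; the sketch only shows where such a hint would sit.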
yescallop

comment created time in 4 days

issue comment BurntSushi/bstr

Please declare experiment a success :-)

Note that I did leave the warning about making bstr a public dependency. I'll remove that once 1.0 is released.

ijackson

comment created time in 5 days

issue comment BurntSushi/bstr

Please declare experiment a success :-)

OK, so I went to take a look at how the word was being used. I agree that it's no longer an appropriate word to use, so I just removed it and did some light re-wording.

Fixed in 0.2.17.

ijackson

comment created time in 5 days

created tag BurntSushi/bstr

tag 0.2.17

A string type for Rust that is not required to be valid UTF-8.

created time in 5 days

push event BurntSushi/bstr

Andrew Gallant

commit sha e38e7a7ca986f9499b30202f49d79e531d14d192

0.2.17


push time in 5 days

push event BurntSushi/bstr

Andrew Gallant

commit sha 602c147daa9b303d9370c614e46c205771d52d15

doc: remove use of the word 'experiment'

While bstr is not yet at 1.0, it is not really an experiment any more. Its API has and will remain stable for many years to come (with only very minor breaking changes planned for 1.0, which should happen soon).

In general, the docs used the word "experiment" to refer to two different things:

1. The idea that "UTF-8 by convention" is more useful in some cases than "required valid UTF-8."
2. That dependency trees should be kept to a minimum, i.e., providing a cohesive crate rather than a big tree of micro-crates.

(1) seems to be very clearly settled. There are folks using bstr in big projects precisely because it makes their life a lot easier.

(2) is perhaps still questionable, but it doesn't really make sense to call it an "experiment" and scare folks away. I think it makes more sense to just declare "small dependency tree" as a goal that we strive for.

Closes #99


push time in 5 days

issue closed BurntSushi/bstr

Please declare experiment a success :-)

Hi. I wasn't previously aware of bstr, but a correspondent suggested that I ought to recommend it in my (soon-to-be-published) guide Rust for the Polyglot Programmer.

Well, speaking personally, I wish I had known about this crate sooner. I haven't yet tried it, but I was doing something just last week that I think it would really have helped with.

So, I would really like to point my readers to bstr from the Libraries section of my guide. But I'm a bit concerned about the fact that it's described very prominently as "experimental". The crates.io download statistics suggest the experiment has been a success, and you're still on 0.2.x which suggests the API is really quite stable.

Would you care to remove the word "experimental" from the README and crate docs? :-)

closed time in 5 days

ijackson

issue comment BurntSushi/bstr

Please declare experiment a success :-)

See https://github.com/BurntSushi/bstr/issues/40 for the 1.0 release. TL;DR - Only minor breaking changes are planned.

ijackson

comment created time in 5 days

issue comment BurntSushi/bstr

Please declare experiment a success :-)

Aye. I'll do this as part of the 1.0 release, which I hope to do "soon." Where "soon" is probably "before 2022."

I would say that, IMO, so far, it is a mild success in terms of the number of people using it. But the folks that do wind up using it do seem to feel like it is making their lives a lot easier.

ijackson

comment created time in 5 days

issue comment BurntSushi/termcolor

How to prevent StandardStream from writing to stdout in test?

Unfortunately, this is not a termcolor problem. It's a problem with how cargo test captures output. See https://github.com/env-logger-rs/env_logger/issues/107 for a related problem. Here's a minimal example that shows the problem that doesn't involve termcolor at all:

#[cfg(test)]
mod test {
    use std::io::Write;

    #[test]
    fn test() {
        println!("this is captured");
        writeln!(&mut std::io::stdout(), "this is NOT captured").unwrap();
    }
}

And its output:

$ cargo test
   Compiling i51 v0.1.0 (/home/andrew/tmp/issues/termcolor/i51)
    Finished test [unoptimized + debuginfo] target(s) in 0.27s
     Running unittests (target/debug/deps/i51-fd2c172cfeac607c)

running 1 test
this is NOT captured
test test::test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

   Doc-tests i51

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
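
Since the example above shows that `println!` is captured while direct writes to `std::io::stdout()` are not, one possible workaround in tests is to route output through the `print!` macro. The sketch below is an assumption-laden illustration, not a termcolor feature: `PrintWriter` is a hypothetical type, and it implements only `std::io::Write`, so it is not a drop-in replacement for `StandardStream` and carries no color support.

```rust
use std::io::{self, Write};

/// Hypothetical adapter (not part of termcolor): forwards every write
/// through `print!`, which goes through the same path as `println!`
/// and is therefore subject to the test harness's output capture.
struct PrintWriter;

impl Write for PrintWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Lossy conversion keeps the sketch simple; a multi-byte UTF-8
        // sequence split across two writes would be mangled.
        print!("{}", String::from_utf8_lossy(buf));
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        io::stdout().flush()
    }
}
```

A logger could then hold a `Box<dyn Write>` and use `PrintWriter` under `#[cfg(test)]` and the real stream otherwise.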
schneems

comment created time in 5 days

issue closed BurntSushi/termcolor

How to prevent StandardStream from writing to stdout in test?

When I run tests that use StandardStream:

use std::io::Write;

use termcolor::{
    Buffer,
    StandardStream,
};
use std::cell::RefCell;
pub struct Logger {
    pub stream: StandardStream,
    pub capture_stream: Buffer,
}

impl Logger {
    pub fn new() -> Self {
        Logger {
            stream: StandardStream::stdout(termcolor::ColorChoice::Always),
            capture_stream: Buffer::ansi(),
        }
    }

    pub fn write(&mut self, message: impl AsRef<str>) {
        let message = message.as_ref();
        writeln!(&mut self.stream, "{}", message).unwrap();
        writeln!(&mut self.capture_stream, "{}", message).unwrap();
    }

    #[allow(dead_code)]
    pub fn as_str(&mut self) -> String {
        std::str::from_utf8(self.capture_stream.as_slice())
            .unwrap()
            .to_string()
    }

    #[allow(dead_code)]
    pub fn assert_contains(&mut self, substring: impl AsRef<str>) {
        let body = self.as_str();
        let substring = substring.as_ref();
        assert!(
            body.contains(substring),
            "Expected log to contain '{}' but it did not.\nLog contents:\n{}",
            substring,
            body,
        );
    }
}

// https://riptutorial.com/rust/example/25994/thread-local-objects
thread_local! {
    static LOGGER: RefCell<Logger> = RefCell::new(Logger::new());
}

fn write_log(message: impl AsRef<str>) {
    let message = message.as_ref();
    LOGGER.with(|log_ref| {
        let logger = &mut *log_ref.borrow_mut();
        logger.write(message);
    })
}

fn assert_log_contains(substring: impl AsRef<str>) {
    let substring = substring.as_ref();
    LOGGER.with(|log_ref| {
        let logger = &mut *log_ref.borrow_mut();
        logger.assert_contains(substring);
    })
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_lol() {
        write_log("lol");
        assert_log_contains("lol");
    }

    #[test]
    fn test_hahaha() {
        write_log("hahahaha");
        assert_log_contains("hahahaha");
    }

}

The output is not captured by Rust and ends up in my tests:

$ cargo test
   Compiling procfile-buildpack v0.1.0 (/Users/rschneeman/Documents/projects/work/buildpacks/procfile_rust)
    Finished test [unoptimized + debuginfo] target(s) in 1.16s
     Running unittests (target/debug/deps/procfile_buildpack-f90d2f6985da90de)

running 2 tests
lol
hahahaha
test global_logger::test::test_lol ... ok
test global_logger::test::test_hahaha ... ok

(note the lol and hahahaha in the output). Is there a way to prevent StandardStream from showing up in test output?

Related: https://github.com/heroku/libherokubuildpack/issues/29

closed time in 5 days

schneems

issue closed BurntSushi/ripgrep

Parse errors in .gitignore do not sanitize some terminal escape sequences before printing to stderr

What version of ripgrep are you using?

ripgrep 11.0.2
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

apt install ripgrep in an Ubuntu 20.04 WSL2 install.

What operating system are you using ripgrep on?

Ubuntu 20.04 in a WSL2

Describe your bug.

The line printed from a .gitignore in case of a parse error might contain terminal escape sequences that mess up the terminal state.

What are the steps to reproduce the behavior?

I'm not sure which kind of terminal escape sequences these are, so in case this is terminal emulator specific:

  • Be on windows
  • Use the new "Windows Terminal"
  • Be logged into a WSL2 instance with it (bash or fish)
  • Have a .gitignore containing the following bytes: \x5b\x5d\xc2\x96\xc2\x97\xc2\x98\xc2\x99\xc2\x9a\x0a
  • Run rg foo or any other command that causes it to parse .gitignore files.
  • You should now be able to observe that some terminal control sequences got sent to the terminal

What is the actual behavior?

> hexdump -C .gitignore
00000000  5b 5d c2 96 c2 97 c2 98  c2 99 c2 9a 0a           |[]...........|
0000000d
> rg foo
./.gitignore: line 1: error parsing glob '[]': unclosed character class; missing ']'
^[[?1;0c
> [?1;0c

What is the expected behavior?

Any terminal escape sequences should get sanitized out from the line, at least if printed to an interactive terminal.

closed time in 5 days

Kimundi

issue comment BurntSushi/ripgrep

Parse errors in .gitignore do not sanitize some terminal escape sequences before printing to stderr

I can't seem to reproduce this:

[screenshot: no-color-mangling]

Either way, I appreciate the bug report, but I think I'm going to mark this as wontfix. ripgrep passes through content it reads, as-is, pretty much everywhere. The same would happen if you were searching a file and ripgrep printed a match with ANSI escape codes, for example. Similarly for file names and probably other stuff. I'm not really sure it's ripgrep's place to get super paranoid about this and start cleansing everything it prints. It certainly doesn't seem like other tools do it.
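
For readers who do want to scrub such lines in their own tooling, a minimal sketch follows. It is not something ripgrep does, and `escape_control_chars` is a hypothetical helper. The bytes `\xc2\x96` through `\xc2\x9a` from the report decode as UTF-8 to U+0096 through U+009A, which are C1 control characters, so a check based on `char::is_control` covers them.

```rust
/// Hypothetical helper: replaces every control character (including the
/// C1 range U+0080..=U+009F) with a visible `\u{..}` escape so that it
/// cannot be interpreted by a terminal. Note that this also escapes
/// tabs and newlines.
fn escape_control_chars(line: &str) -> String {
    let mut out = String::with_capacity(line.len());
    for c in line.chars() {
        if c.is_control() {
            out.extend(c.escape_unicode());
        } else {
            out.push(c);
        }
    }
    out
}
```

Applied to the first line of the reported .gitignore, this would print the brackets unchanged followed by five visible escapes instead of raw control bytes.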

Kimundi

comment created time in 5 days

issue comment mariomka/regex-benchmark

Compile time regex for C++

@straywriter I think it would be good to share precisely how you got your measurements. e.g., git clone ... && ./run-benchmark or something to that effect.

bstaletic

comment created time in 8 days

issue comment rust-lang/api-guidelines

Mention the need to define a reasonable MSRV

If we were to add something like this, it should probably start with something like, "If you'd like to define an MSRV, then ..."

estebank

comment created time in 8 days