Alexander (shamatar) | Matter Labs | SNARKs vs SONICs

Open source implementation of zkSTARKs in pure Rust

Implementation of various primitives for bellman using CUDA (WIP)

C++ implementation of EIP 1962

Specification documents for EIP1962

Fuzz testing of various EIP1962 implementations

Benchmark of precompiles in OpenEthereum

CUDA long arithmetic benches

Fuzzer for elliptic curve crypto

Alternative tree bookkeeping construction for more efficient on-chain privacy

issue comment matter-labs/bellman

Here is an example of how to define a custom gate that implements a 5th-degree non-linearity: https://github.com/matter-labs/franklin-crypto/blob/770156e665d2d866f080e241876b358136ccc9f1/src/plonk/circuit/custom_rescue_gate.rs#L31

DreamWuGit

comment created time in 12 hours

issue comment matter-labs/bellman

Yes, such a gate requires an extra polynomial in the setup (the coefficients in front of D_next), but this is also not too large a price because it's amortized by linearization

DreamWuGit

comment created time in 3 days

issue comment matter-labs/bellman

Your understanding of the witness is correct, it's just that I've used different naming.

"Witness" in my terms is not used anywhere, but it can be used e.g. to make custom affine transform + non-linearity gate for Poseidon hash like the following: A + B + C = W_0; W_0 ^4 = W_1; W_0 * W_1 = D. Effectively we get that D = (A + B + C)^5 and it's what we want, but used extra witness/advice columns (that are effectively discarded) for some extra bookkeeping and to have a degree of the constraints low (this example is quite artificial, but should give you a hint).

Halo2 uses other types of gates that, as far as I remember, allow one to access values in the previous and next rows, but with Plonk you can design many relationships using either relative or absolute row addressing.

Regarding D_on_next_row: you can do it like this: a1 + b1 + c1 + d1 + const - d2 = 0 and -a2 + b2 + c2 + d2 = 0, effectively folding a long linear combination into a2 (I've zeroed the coefficient for D_next in row number 2)
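As a sanity check, the two rows can be evaluated with plain integers (the concrete values and the variable names a1, d2, etc. are illustrative):

```rust
// Two "main gate" rows chained through D_next acting as a carry:
//   row 1: a1 + b1 + c1 + d1 + const - d2 = 0
//   row 2: -a2 + b2 + c2 + d2 = 0
// Together they fold a seven-term linear combination into a2.
fn main() {
    let (a1, b1, c1, d1, konst) = (1i64, 2, 3, 4, 10);
    let d2 = a1 + b1 + c1 + d1 + konst; // value carried to the next row
    let (b2, c2) = (5i64, 6);
    let a2 = b2 + c2 + d2;
    // both gate equations are satisfied:
    assert_eq!(a1 + b1 + c1 + d1 + konst - d2, 0);
    assert_eq!(-a2 + b2 + c2 + d2, 0);
    // and a2 holds the full linear combination:
    assert_eq!(a2, a1 + b1 + c1 + d1 + konst + b2 + c2);
    println!("{}", a2); // 31
}
```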

DreamWuGit

comment created time in 3 days

issue comment matter-labs/bellman

Ok, so:

• STATE_WIDTH is the number of polynomials to which the copy-permutation check applies. In the original Plonk paper there are 3 of them; in our case there are usually 4, because the extra one has marginal cost in the prover but allows for smaller circuits. To look at it another way: when you allocate values here you can use the term "variable", which is unique in a circuit and always has provably the same value (much like in Groth16)
• WITNESS_WIDTH is currently 0, but represents the number of additional polynomials that you can address through some gate (I will give an example below). E.g. in Halo2 those are called "advice" polynomials
• CAN_ACCESS_NEXT_TRACE_STEP determines whether any relation in the circuit (we usually call such relationships "gates") can access only the current row of the trace or also the next row. In the original Plonk paper the gate has the form (up to coefficients) AB + A + B + C + constant = 0, where the capitals are the state polynomials' names, and it touches only a single row of the trace. In our work we use a modified version of the "main" gate, AB + A + B + C + D + D_on_the_next_row + constant = 0, which allows better chaining of long linear combinations (D_on_the_next_row acts like a carry)
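The modified main gate can be sketched as a plain polynomial identity with selector coefficients. The selector names (q_m, q_a, ...) below follow the usual Plonk convention and are not necessarily bellman's identifiers:

```rust
// Evaluate q_m*A*B + q_a*A + q_b*B + q_c*C + q_d*D + q_dnext*D_next + q_const;
// a row satisfies the gate when this evaluates to zero.
fn main_gate(q: [i64; 7], a: i64, b: i64, c: i64, d: i64, d_next: i64) -> i64 {
    q[0] * a * b + q[1] * a + q[2] * b + q[3] * c + q[4] * d + q[5] * d_next + q[6]
}

fn main() {
    // a multiplication gate A*B - C = 0: set q_m = 1, q_c = -1, rest 0
    let mul_q = [1, 0, 0, -1, 0, 0, 0];
    assert_eq!(main_gate(mul_q, 3, 4, 12, 0, 0), 0);
    // an addition gate A + B - C = 0: set q_a = q_b = 1, q_c = -1
    let add_q = [0, 1, 1, -1, 0, 0, 0];
    assert_eq!(main_gate(add_q, 3, 4, 7, 0, 0), 0);
    println!("ok");
}
```

Picking the selector vector per row is what turns the single universal gate into a multiplication, an addition, or a linear-combination step.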
DreamWuGit

comment created time in 3 days

issue comment matter-labs/bellman

I should answer you tomorrow when back to the laptop with a normal keyboard :)

DreamWuGit

comment created time in 4 days

issue comment matter-labs/bellman

The "num_inputs" is a property of the statement, but I agree that in the proof your are ok to have only a vector of input values and a separate "num_inputs" is redundant because the same information (num_inputs) is also located in the verification key, so at any point in time you will have both for actual verification.

I hope it clarifies my previous answer a little

DreamWuGit

comment created time in 7 days

issue comment matter-labs/bellman

Well, they have slightly different meanings from my perspective:

• num_inputs determines a property of the proven statement (for various proofs acceptable by this circuit)
• input_values are concrete inputs into the given proof

But in practice you can indeed just keep the vector only
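A hypothetical sketch of that split (the struct and field names are illustrative, not bellman's actual types): the statement property lives with the verification key, and the proof carries only concrete values.

```rust
// num_inputs is a property of the statement, kept in the verification key;
// the proof carries only the concrete public input values.
struct VerificationKey { num_inputs: usize /* plus commitments, etc. */ }
struct Proof { input_values: Vec<u64> /* plus openings, etc. */ }

fn inputs_match(vk: &VerificationKey, proof: &Proof) -> bool {
    proof.input_values.len() == vk.num_inputs
}

fn main() {
    let vk = VerificationKey { num_inputs: 2 };
    let good = Proof { input_values: vec![7, 9] };
    let bad = Proof { input_values: vec![7] };
    assert!(inputs_match(&vk, &good));
    assert!(!inputs_match(&vk, &bad)); // length mismatch: reject early
    println!("ok");
}
```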

DreamWuGit

comment created time in 9 days

pull request comment matter-labs/bellman

• add a function like "create tasks" that takes a closure returning a vector of subworks
• some scoped context should be available to such a closure so it can form a hierarchical structure (inherit priority, etc.)
• information about the potential upper limit of available resources should be passed to "create tasks" too, and in general provided by the worker
• each task should declare how many resources it wants (minimal, optimal, and max), and when the task is invoked it should be told how many resources were actually allocated for it
• the thread pool should also monitor the total amount of resources of a certain type that all the tasks in the queues want, and be clever enough to give more than the minimum if, e.g., there are no other tasks that want that type of resource
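One way the minimal/optimal/max idea could look, as a sketch (all names here are hypothetical, not the PR's API):

```rust
// A task declares its resource appetite; the pool grants an allocation
// clamped to [min, max], or defers the task when even `min` is unavailable.
struct ResourceRequest { min: usize, optimal: usize, max: usize }

fn allocate(req: &ResourceRequest, available: usize) -> Option<usize> {
    if available < req.min {
        return None; // not enough resources: keep the task queued
    }
    // give as much as possible up to `max`; a smarter pool would weigh
    // `optimal` against the demands of other queued tasks
    Some(available.min(req.max))
}

fn main() {
    let req = ResourceRequest { min: 2, optimal: 4, max: 8 };
    assert_eq!(allocate(&req, 16), Some(8)); // capped at max
    assert_eq!(allocate(&req, 3), Some(3));  // between min and max
    assert_eq!(allocate(&req, 1), None);     // below min: defer
    println!("ok");
}
```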
slumber

comment created time in 12 days

Pull request review comment matter-labs/bellman

```rust
use std::collections::VecDeque;
use std::sync::Arc;

use parking_lot::{Condvar, Mutex};

use crossbeam::channel::{self, Receiver};

type Task = Box<dyn FnOnce() + Send + 'static>;

#[derive(Default)]
struct TaskQueue(VecDeque<Task>, VecDeque<Task>);

#[derive(Default)]
struct State(Mutex<TaskQueue>, Condvar);

#[derive(Clone)]
pub struct Worker {
    pub(crate) cpus: usize,
    shared_state: Arc<State>,
}

impl Worker {
    fn spawn_in_pool(shared_state: Arc<State>) {
        std::thread::spawn(move || {
            let State(ref mutex, ref cv) = &*shared_state;
            let mut guard = mutex.lock();

            loop {
                while let Some(task) = guard.0.pop_front() {
                    drop(guard);
                    task();
                    guard = mutex.lock();
                }

                match guard.1.pop_front() {
                    Some(task) => {
                        drop(guard);
                        task();
                        guard = mutex.lock();
                    }
                    None => cv.wait(&mut guard),
                }
            }
        });
    }

    fn start_workers(cpus: usize, shared_state: &Arc<State>) {
        for _ in 0..cpus {
            Self::spawn_in_pool(shared_state.clone());
        }
    }

    pub fn new_with_cpus(cpus: usize) -> Worker {
        assert!(cpus > 0);

        let shared_state = Arc::default();
        Self::start_workers(cpus, &shared_state);
        Worker { cpus, shared_state }
    }

    pub fn new() -> Worker {
        Self::new_with_cpus(num_cpus::get())
    }

    pub fn compute<F, T, E>(&self, f: F) -> Receiver<Result<T, E>>
    where
        F: FnOnce() -> Result<T, E> + Send + 'static,
```

In the same way, F doesn't have any access to the resource-constraining capabilities. It should get some "spawner", or "scope", or whatever, or at least immediately supply the number of CPU cores it wants to utilize

slumber

comment created time in 12 days

Pull request review comment matter-labs/bellman

```rust
use std::collections::VecDeque;
use std::sync::Arc;

use parking_lot::{Condvar, Mutex};

use crossbeam::channel::{self, Receiver};

type Task = Box<dyn FnOnce() + Send + 'static>;

#[derive(Default)]
struct TaskQueue(VecDeque<Task>, VecDeque<Task>);

#[derive(Default)]
struct State(Mutex<TaskQueue>, Condvar);

#[derive(Clone)]
pub struct Worker {
    pub(crate) cpus: usize,
    shared_state: Arc<State>,
}

impl Worker {
    // ... (worker-pool setup as in the hunk above) ...

    pub fn compute<F, T, E>(&self, f: F) -> Receiver<Result<T, E>>
    where
        F: FnOnce() -> Result<T, E> + Send + 'static,
        T: Send + 'static,
        E: Send + 'static,
    {
        let State(ref mutex, ref cv) = &*self.shared_state;
        let (sender, receiver) = channel::bounded(1);

        let boxed_fn = Box::new(move || {
            let result = f();
            sender.send(result).unwrap();
        });

        {
            let mut guard = mutex.lock();
            guard.0.push_back(boxed_fn);
```

You push a function that returns the result onto the back of the priority queue, so the result will not become available until all the scheduled priority tasks finish; that is not an option

slumber

comment created time in 12 days

Pull request review comment matter-labs/bellman

```rust
use std::collections::VecDeque;
use std::sync::Arc;

use parking_lot::{Condvar, Mutex};

use crossbeam::channel::{self, Receiver};

type Task = Box<dyn FnOnce() + Send + 'static>;

#[derive(Default)]
struct TaskQueue(VecDeque<Task>, VecDeque<Task>);

#[derive(Default)]
struct State(Mutex<TaskQueue>, Condvar);

#[derive(Clone)]
pub struct Worker {
    pub(crate) cpus: usize,
    shared_state: Arc<State>,
}

impl Worker {
    fn spawn_in_pool(shared_state: Arc<State>) {
        std::thread::spawn(move || {
            let State(ref mutex, ref cv) = &*shared_state;
            let mut guard = mutex.lock();

            loop {
                while let Some(task) = guard.0.pop_front() {
                    drop(guard);
                    task();
```

What stops me from spawning an infinite number of non-cooperative threads in the task?

slumber

comment created time in 12 days


pull request comment rust-lang/rust

@rustbot label: +S-waiting-on-review -S-waiting-for-author

shamatar

comment created time in 17 days

pull request comment rust-lang/rust

It is good in its current form. Is the question about performance run results?

shamatar

comment created time in 17 days

Pull request review comment rust-lang/rust

```
+         nop;                             // scope 4 at $DIR/simplify_try.rs:21:19: 21:33
+         nop;                             // scope 0 at $DIR/simplify_try.rs:21:32: 21:33
+         _5 = discriminant(_0);           // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
-          goto -> bb1;                     // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
+          switchInt(move _5) -> [0_isize: bb1, otherwise: bb2]; // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
       }

       bb1: {
-         _0 = move _3;                    // scope 1 at $DIR/simplify_try.rs:25:5: 25:10
-         StorageDead(_3);                 // scope 0 at $DIR/simplify_try.rs:24:6: 24:7
+         nop;                             // scope 1 at $DIR/simplify_try.rs:25:5: 25:10
+         nop;                             // scope 0 at $DIR/simplify_try.rs:24:6: 24:7
+          StorageDead(_2);                 // scope 0 at $DIR/simplify_try.rs:26:1: 26:2
+          return;                          // scope 0 at $DIR/simplify_try.rs:26:2: 26:2
+      }
+
+      bb2: {
+-         StorageLive(_6);                 // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+-         _6 = ((_3 as Err).0: i32);       // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+-         StorageLive(_8);                 // scope 2 at $DIR/simplify_try.rs:22:37: 22:50
+-         StorageLive(_9);                 // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+-         _9 = _6;                         // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+-         _8 = _9;                         // scope 5 at $DIR/simplify_try.rs:22:37: 22:50
+-         StorageDead(_9);                 // scope 2 at $DIR/simplify_try.rs:22:49: 22:50
+-         ((_0 as Err).0: i32) = _8;       // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
+          nop;                             // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+          nop;                             // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+          nop;                             // scope 2 at $DIR/simplify_try.rs:22:37: 22:50
+          nop;                             // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+          nop;                             // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+          nop;                             // scope 5 at $DIR/simplify_try.rs:22:37: 22:50
+          nop;                             // scope 2 at $DIR/simplify_try.rs:22:49: 22:50
+          nop;                             // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
+          discriminant(_0) = 1;            // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
+-         StorageDead(_8);                 // scope 2 at $DIR/simplify_try.rs:22:50: 22:51
+-         StorageDead(_6);                 // scope 0 at $DIR/simplify_try.rs:22:50: 22:51
+-         StorageDead(_3);                 // scope 0 at $DIR/simplify_try.rs:24:6: 24:7
```

What would be the reason for such a large cumulative effect? As far as I can see it should be something like `((_0 as Err).0: i32) = ((_3 as Err).0: i32);`, if I correctly understood that this pass only works within basic blocks and not over the full body

comment created time in 19 days


pull request comment rust-lang/rust

I've made this pass because it's certainly much simpler than full destination propagation, and it's easier to make it sound.

Regarding the logic, it's simple:

• visit every MIR statement
• if it's of the form `_2 = _1` (any Place can stand in for `_1`), then we mark `_2` as "interesting" and keep track of its state, as well as track `_1`. Any copy statement like `_2 = _1` starts a new generation of `_2`
• if `_2` (the destination) is mutated or used in any way other than a copy or move, we discard the destination from the potential replacement candidates
• if we see a mutable use of `_1` (the source), then we discard from the "interesting" list any destinations that use this source
• otherwise, if `_2` is still interesting and we see a use of it, we record the "coordinates" of this use for future replacement of `_2` by its source
• if `_2` is not dead at the end of the block, then we do not mark the last "generation" as suitable for replacement

This allows handling cases like

```
_2 = _1;
_3 = use(_2);
```

which is rewritten to

```
_2 = _1;
_3 = use(_1);
```

that is later simplified by SimplifyLocals, as well as

```
_2 = _1;
_3 = use(_2);
_2 = _4;
...
```

which is rewritten only as

```
_2 = _1;
_3 = use(_1);
_2 = _4;
...
```
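The bookkeeping described above can be illustrated with a toy, within-block copy propagation (a simplification of the real MIR pass; the statement enum and the generation handling are illustrative):

```rust
use std::collections::HashMap;

// A tiny statement language: either a copy `dst = src`
// or a use `result = use(arg)` that also writes `result`.
#[derive(Debug, PartialEq)]
enum Stmt {
    Copy { dst: u32, src: u32 },
    Use { result: u32, arg: u32 },
}

// Replace uses of a copied-to local with its source, invalidating the
// mapping whenever the destination or the result local is written again.
fn propagate(stmts: &mut Vec<Stmt>) {
    let mut map: HashMap<u32, u32> = HashMap::new();
    for s in stmts.iter_mut() {
        match s {
            Stmt::Copy { dst, src } => {
                let root = *map.get(src).unwrap_or(src);
                let d = *dst;
                map.retain(|_, v| *v != d); // writing `d` kills it as a source
                map.insert(d, root);        // new generation of `d`
            }
            Stmt::Use { result, arg } => {
                if let Some(&root) = map.get(arg) {
                    *arg = root; // rewrite the use to the original source
                }
                let r = *result;
                map.retain(|_, v| *v != r); // `r` is mutated here
                map.remove(&r);
            }
        }
    }
}

fn main() {
    let mut block = vec![
        Stmt::Copy { dst: 2, src: 1 },
        Stmt::Use { result: 3, arg: 2 },
    ];
    propagate(&mut block);
    // `_3 = use(_2)` became `_3 = use(_1)`; a cleanup pass like
    // SimplifyLocals can then drop the now-dead `_2 = _1`
    assert_eq!(block[1], Stmt::Use { result: 3, arg: 1 });
    println!("ok");
}
```

Note the toy version skips the liveness check at the end of the block, which the real pass needs for soundness.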
shamatar

comment created time in 24 days

Pull request review comment rust-lang/rust

```
          discriminant(_2) = 1;            // scope 0 at $DIR/issue-73223.rs:2:23: 2:30
          StorageLive(_3);                 // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
          _3 = ((_2 as Some).0: i32);      // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
-          _1 = _3;                         // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
+          _1 = ((_2 as Some).0: i32);      // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
```

Oh, I think there is a pass that does what you say (not the pass from this PR), but it's under opt-level = 4 and also behind "unsound MIR optimizations"

shamatar

comment created time in 24 days


Pull request review comment rust-lang/rust

```
          discriminant(_2) = 1;            // scope 0 at $DIR/issue-73223.rs:2:23: 2:30
          StorageLive(_3);                 // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
          _3 = ((_2 as Some).0: i32);      // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
-          _1 = _3;                         // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
+          _1 = ((_2 as Some).0: i32);      // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
```

I actually did see a result of the pass (SimplifyLocals, I think) that does what you say: if there is a variable that is just made live, copy-assigned, and then dead, it gets removed. The problem is that such a pattern is not present in MIR by default and only emerges, e.g., as a result of this pass

shamatar

comment created time in 24 days


I'm OK to merge it, but ideally it should also update the CocoaPods podspec version, and I cannot republish the updated package on CocoaPods for the next few weeks.

I don't know if just merging is enough for your purposes (for SPM I think it's OK), but if it's really required then let's merge (need comments on it)

sche

comment created time in 24 days

pull request comment rust-lang/rust

Looks like neutral or small gains. Maybe I should write an artificial benchmark that stress-tests const generics

shamatar

comment created time in a month

push event matter-labs/rescue-poseidon

update var length hashing. bring sponge back

commit sha d64125c40fc84d017ebc773ab943f7a12361b0e7

Move hash params to function signatures for generic sponge

commit sha b9a89340922e070d39ca81cf5fd298e1deacecc9

update franklin dependency

commit sha 0e68aeeaae3ccabe479d8f0ba37279318d7bd40e

rework sboxes

commit sha cb3afd3a0efa4890c869c24dbb2f783b778a4daa

update tests and add helper functions which return nums

add multiple custom gate support to sbox

commit sha 23a191071cf40b1f07d0f5d0506916e8cc333aa1

commit sha 594bffa32e8d86c012a0684f6c7c2f8a7017e35b

better handle partial state during sbox application

commit sha 456096a13c2a6856ca28a096a8ec37e7df791d87

Add helper methods to parameters for initialization with custom gates

simply matrix vector product for circuit

commit sha b20ea19bb62afbd5994b911527c6dfac48e277fb

use 8 rounds by default for bn256

commit sha 6597ac94af21dcd98cfe3242a530fef34d86878b

Merge branch 'dev' of https://github.com/matter-labs/rescue-poseidon into dev

commit sha 64c25eab3c3abd939d657ea20b2dd6e52986bd6a

warnings and reexports

commit sha 992032a78c378b5dcbaf8af3e70c2cfd354e2f3b

take all deps from git

commit sha 7a07d1f31ca39e2a08483139a38f95abf7be75a7

Merge pull request #2 from matter-labs/dev Custom gate support and backward compatible Sponge API

push time in a month

PR merged matter-labs/rescue-poseidon

+2117 -1463

0 comments

30 changed files

saitima

pr closed time in a month

pull request comment rust-lang/rust

I cannot even access the experiment data; it looks like an internal error

shamatar

comment created time in a month

pull request comment rust-lang/rust

I've added early returns based on the number of blocks/locals (for now, set above the counts in keccak) and some extra analysis that returns early if there are no arrays in scope at all. @oli-obk Can you make a performance run?

I'm also not committing the keccak MIR test because it's a lot of code and not too relevant for the future

shamatar

comment created time in a month

push event shamatar/rust

commit sha 3fd74a81bf2c68785045e15ee756f7d8bff7ed95

try to optimize and add early returns

push time in a month

push event matter-labs/rescue-poseidon

commit sha 992032a78c378b5dcbaf8af3e70c2cfd354e2f3b

take all deps from git

push time in a month

push event matter-labs/franklin-crypto

commit sha 3343470405f96dcc50884b7215bc23e018dfd96e

take all deps from git for supply chain

push time in a month