Alexander (shamatar) · Matter Labs · SNARKs vs SONICs

matter-labs/hodor 57

Open source implementation of zkSTARKs in pure Rust

matter-labs-archive/belle_cuda 25

Implementation of various primitives for bellman using CUDA (WIP)

matter-labs-archive/eip1962_cpp 9

C++ implementation of EIP 1962

matter-labs/eip1962_specs 6

Specification documents for EIP1962

matter-labs-archive/eip1962_fuzzing 6

Fuzz testing of various EIP1962 implementations

shamatar/bench_precompiles 4

Benchmark of precompiles in OpenEthereum

Konstantce/CUDA-arithmetic 3

CUDA long arithmetic benchmarks

shamatar/algebraic_fuzzer 3

Fuzzer for elliptic curve crypto

matter-labs-archive/MerkleShrubs 2

Alternative tree bookkeeping construction for more efficient on-chain privacy

issue comment: matter-labs/bellman

what does field STATE_WIDTH and CAN_ACCESS_NEXT_TRACE_STEP mean of PlonkConstraintSystemParams?

Here is an example of how to define a custom gate that applies a 5th-degree non-linearity: https://github.com/matter-labs/franklin-crypto/blob/770156e665d2d866f080e241876b358136ccc9f1/src/plonk/circuit/custom_rescue_gate.rs#L31

DreamWuGit

comment created time in 12 hours

issue comment: matter-labs/bellman

what does field STATE_WIDTH and CAN_ACCESS_NEXT_TRACE_STEP mean of PlonkConstraintSystemParams?

Yes, such a gate requires an extra polynomial in the setup (the coefficients in front of D_next), but this is not too large a price because it is amortized by linearization

DreamWuGit

comment created time in 3 days

issue comment: matter-labs/bellman

what does field STATE_WIDTH and CAN_ACCESS_NEXT_TRACE_STEP mean of PlonkConstraintSystemParams?

Your understanding of witness is correct; it's just that I've used different naming.

"Witness" in my terms is not used anywhere, but it can be used e.g. to make custom affine transform + non-linearity gate for Poseidon hash like the following: A + B + C = W_0; W_0 ^4 = W_1; W_0 * W_1 = D. Effectively we get that D = (A + B + C)^5 and it's what we want, but used extra witness/advice columns (that are effectively discarded) for some extra bookkeeping and to have a degree of the constraints low (this example is quite artificial, but should give you a hint).

Halo2 uses other types of gates that, as far as I remember, allow one to access values in the previous and next rows; with Plonk you can design many relationships using either relative or absolute row addressing.

Regarding D_on_next_row: you can do something like a1 + b1 + c1 + d1 + const - d2 = 0 and -a2 + b2 + c2 + d2 = 0, effectively folding a long linear combination into a2 (I've zeroed the coefficient for D_next in row number 2)
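A numeric sketch of this two-row chaining (hypothetical helper; plain signed integers stand in for field elements):

```rust
// Row 1 enforces a1 + b1 + c1 + d1 + k - d2 = 0, so d2 carries the partial
// sum to the next row; row 2 enforces -a2 + b2 + c2 + d2 = 0, folding all
// seven terms into a2 using only two rows.
fn fold_two_rows(a1: i64, b1: i64, c1: i64, d1: i64, k: i64, b2: i64, c2: i64) -> i64 {
    let d2 = a1 + b1 + c1 + d1 + k; // row 1 solved for D_next
    let a2 = b2 + c2 + d2;          // row 2 solved for a2
    // both row constraints hold by construction:
    assert_eq!(a1 + b1 + c1 + d1 + k - d2, 0);
    assert_eq!(-a2 + b2 + c2 + d2, 0);
    a2
}

fn main() {
    // a2 ends up equal to the full 7-term linear combination
    assert_eq!(fold_two_rows(1, 2, 3, 4, 5, 6, 7), 1 + 2 + 3 + 4 + 5 + 6 + 7);
    println!("ok");
}
```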

DreamWuGit

comment created time in 3 days

issue comment: matter-labs/bellman

what does field STATE_WIDTH and CAN_ACCESS_NEXT_TRACE_STEP mean of PlonkConstraintSystemParams?

Ok, so:

  • STATE_WIDTH is the number of polynomials to which the copy-permutation check applies. In the original Plonk paper there are 3 of them; in our case there are usually 4, because the extra one has marginal cost in the prover but allows for smaller circuits. To look at it another way: when you allocate values here you can use the term "variable", which is unique in a circuit and provably always has the same value (much like in Groth16)
  • WITNESS_WIDTH is currently 0, but represents the number of additional polynomials that you can address through some gate (I will give an example below). In Halo2, e.g., those are called "advice" polynomials
  • CAN_ACCESS_NEXT_TRACE_STEP determines whether a relation in the circuit (we usually call such relationships "gates") can access only the current row of the trace, or also the next row. In the original Plonk paper the gate has the form (up to coefficients) AB + A + B + C + constant = 0, where the capitals are the names of the state polynomials, and it touches only a single row of the trace. In our work we use a modified version of the "main" gate, AB + A + B + C + D + D_on_the_next_row + constant = 0, which allows chaining long linear combinations more efficiently (D_on_the_next_row acts like a carry)
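As a rough sketch of the shape of these parameters (a hypothetical trait following this discussion, not the verbatim bellman API):

```rust
// Hypothetical sketch of a PlonkConstraintSystemParams-style trait, following
// the description above; the real bellman trait may differ in details.
trait PlonkConstraintSystemParams {
    const STATE_WIDTH: usize;               // polynomials under the copy-permutation check
    const WITNESS_WIDTH: usize;             // extra "advice"-style polynomials
    const CAN_ACCESS_NEXT_TRACE_STEP: bool; // gates may also read the next row
}

// Parameters matching the setup described above: width-4 state, no witness
// columns, main gate allowed to touch D on the next row.
struct Width4WithNextStep;

impl PlonkConstraintSystemParams for Width4WithNextStep {
    const STATE_WIDTH: usize = 4;
    const WITNESS_WIDTH: usize = 0;
    const CAN_ACCESS_NEXT_TRACE_STEP: bool = true;
}

fn main() {
    assert_eq!(Width4WithNextStep::STATE_WIDTH, 4);
    assert_eq!(Width4WithNextStep::WITNESS_WIDTH, 0);
    assert!(Width4WithNextStep::CAN_ACCESS_NEXT_TRACE_STEP);
    println!("params ok");
}
```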
DreamWuGit

comment created time in 3 days

issue comment: matter-labs/bellman

what does field STATE_WIDTH and CAN_ACCESS_NEXT_TRACE_STEP mean of PlonkConstraintSystemParams?

I'll answer you tomorrow when I'm back at a laptop with a normal keyboard :)

DreamWuGit

comment created time in 4 days

issue comment: matter-labs/bellman

Question about plonk proof input fields

The "num_inputs" is a property of the statement, but I agree that in the proof your are ok to have only a vector of input values and a separate "num_inputs" is redundant because the same information (num_inputs) is also located in the verification key, so at any point in time you will have both for actual verification.

I hope it clarifies my previous answer a little

DreamWuGit

comment created time in 7 days

issue comment: matter-labs/bellman

Question about plonk proof input fields

Well, from my perspective they have slightly different meanings:

  • num_inputs determines a property of the proven statement (shared by all proofs acceptable to this circuit)
  • input_values are the concrete inputs to the given proof

But in practice you can indeed just keep the vector alone
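A minimal sketch of this split (hypothetical types, not the actual bellman structures):

```rust
// Hypothetical sketch: num_inputs lives with the statement (verification key),
// while the concrete input values travel with each individual proof.
struct VerificationKey {
    num_inputs: usize, // property of the circuit/statement
    // ... commitments etc. would live here
}

struct Proof {
    input_values: Vec<u64>, // concrete public inputs for this proof
    // ... proof elements would live here
}

// At verification time both are in hand, so a separate num_inputs field in
// the proof would be redundant; a length check against the key is enough.
fn check_inputs(vk: &VerificationKey, proof: &Proof) -> bool {
    proof.input_values.len() == vk.num_inputs
}

fn main() {
    let vk = VerificationKey { num_inputs: 2 };
    let proof = Proof { input_values: vec![3, 4] };
    assert!(check_inputs(&vk, &proof));
    println!("inputs consistent");
}
```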

DreamWuGit

comment created time in 9 days

pull request comment: matter-labs/bellman

[WIP] New multicore task executor

What about this:

  • add a function like "create tasks" that takes a closure returning a vector of subworks
  • some scoped context should be available to such a closure so that it can have a hierarchical structure (inherit priority, etc.)
  • information about the potential upper limit of available resources should be passed to "create task" too, and in general be provided by the worker
  • each task should declare how many resources it wants (minimal, optimal, and max), and when the task is invoked it should be told how many resources were actually allocated to it
  • the thread pool should also monitor the total amount of resources of a certain type that all the tasks in the queues want, and be clever enough to give more than the minimum if, e.g., no other task wants that type of resource
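A rough sketch of what such a resource-aware API could look like (all names and the grant policy here are hypothetical, not the PR's actual design):

```rust
// Hypothetical sketch of a resource-aware task API: a task declares
// min/optimal/max resource wants, the pool grants an allocation and passes
// it to the task when it is invoked.
#[derive(Clone, Copy)]
struct ResourceRequest {
    min: usize,
    optimal: usize,
    max: usize,
}

struct Pool {
    available_cpus: usize,
}

impl Pool {
    // Naive policy: try the optimal amount, cap by max and by what's free,
    // floor at min. A real scheduler would also look at what other queued
    // tasks want before granting more than the minimum.
    fn grant(&self, req: ResourceRequest) -> usize {
        req.optimal.min(req.max).min(self.available_cpus).max(req.min)
    }

    // A real pool would enqueue the task; here we invoke it synchronously,
    // telling it how many resources it actually got.
    fn create_task<F: FnOnce(usize)>(&self, req: ResourceRequest, f: F) {
        let granted = self.grant(req);
        f(granted);
    }
}

fn main() {
    let pool = Pool { available_cpus: 8 };
    pool.create_task(ResourceRequest { min: 1, optimal: 4, max: 16 }, |got| {
        assert!(got >= 1 && got <= 8);
        println!("running with {} cpus", got);
    });
}
```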
slumber

comment created time in 12 days

Pull request review comment: matter-labs/bellman

[WIP] New multicore task executor

+use std::collections::VecDeque;
+use std::sync::Arc;
+
+use parking_lot::{Condvar, Mutex};
+
+use crossbeam::channel::{self, Receiver};
+
+type Task = Box<dyn FnOnce() + Send + 'static>;
+
+#[derive(Default)]
+struct TaskQueue(VecDeque<Task>, VecDeque<Task>);
+
+#[derive(Default)]
+struct State(Mutex<TaskQueue>, Condvar);
+
+#[derive(Clone)]
+pub struct Worker {
+    pub(crate) cpus: usize,
+    shared_state: Arc<State>,
+}
+
+impl Worker {
+    fn spawn_in_pool(shared_state: Arc<State>) {
+        std::thread::spawn(move || {
+            let State(ref mutex, ref cv) = &*shared_state;
+            let mut guard = mutex.lock();
+
+            loop {
+                while let Some(task) = guard.0.pop_front() {
+                    drop(guard);
+                    task();
+                    guard = mutex.lock();
+                }
+
+                match guard.1.pop_front() {
+                    Some(task) => {
+                        drop(guard);
+                        task();
+                        guard = mutex.lock();
+                    }
+                    None => cv.wait(&mut guard),
+                }
+            }
+        });
+    }
+
+    fn start_workers(cpus: usize, shared_state: &Arc<State>) {
+        for _ in 0..cpus {
+            Self::spawn_in_pool(shared_state.clone());
+        }
+    }
+
+    pub fn new_with_cpus(cpus: usize) -> Worker {
+        assert!(cpus > 0);
+
+        let shared_state = Arc::default();
+        Self::start_workers(cpus, &shared_state);
+        Worker { cpus, shared_state }
+    }
+
+    pub fn new() -> Worker {
+        Self::new_with_cpus(num_cpus::get())
+    }
+
+    pub fn compute<F, T, E>(&self, f: F) -> Receiver<Result<T, E>>
+    where
+        F: FnOnce() -> Result<T, E> + Send + 'static,

In the same way, F doesn't have any access to the resource-constraining capabilities. It should get some "spawner", or "scope", or whatever, or at least immediately supply the number of CPU cores it wants to utilize

slumber

comment created time in 12 days

Pull request review comment: matter-labs/bellman

[WIP] New multicore task executor

+use std::collections::VecDeque;
+use std::sync::Arc;
+
+use parking_lot::{Condvar, Mutex};
+
+use crossbeam::channel::{self, Receiver};
+
+type Task = Box<dyn FnOnce() + Send + 'static>;
+
+#[derive(Default)]
+struct TaskQueue(VecDeque<Task>, VecDeque<Task>);
+
+#[derive(Default)]
+struct State(Mutex<TaskQueue>, Condvar);
+
+#[derive(Clone)]
+pub struct Worker {
+    pub(crate) cpus: usize,
+    shared_state: Arc<State>,
+}
+
+impl Worker {
+    fn spawn_in_pool(shared_state: Arc<State>) {
+        std::thread::spawn(move || {
+            let State(ref mutex, ref cv) = &*shared_state;
+            let mut guard = mutex.lock();
+
+            loop {
+                while let Some(task) = guard.0.pop_front() {
+                    drop(guard);
+                    task();
+                    guard = mutex.lock();
+                }
+
+                match guard.1.pop_front() {
+                    Some(task) => {
+                        drop(guard);
+                        task();
+                        guard = mutex.lock();
+                    }
+                    None => cv.wait(&mut guard),
+                }
+            }
+        });
+    }
+
+    fn start_workers(cpus: usize, shared_state: &Arc<State>) {
+        for _ in 0..cpus {
+            Self::spawn_in_pool(shared_state.clone());
+        }
+    }
+
+    pub fn new_with_cpus(cpus: usize) -> Worker {
+        assert!(cpus > 0);
+
+        let shared_state = Arc::default();
+        Self::start_workers(cpus, &shared_state);
+        Worker { cpus, shared_state }
+    }
+
+    pub fn new() -> Worker {
+        Self::new_with_cpus(num_cpus::get())
+    }
+
+    pub fn compute<F, T, E>(&self, f: F) -> Receiver<Result<T, E>>
+    where
+        F: FnOnce() -> Result<T, E> + Send + 'static,
+        T: Send + 'static,
+        E: Send + 'static,
+    {
+        let State(ref mutex, ref cv) = &*self.shared_state;
+        let (sender, receiver) = channel::bounded(1);
+
+        let boxed_fn = Box::new(move || {
+            let result = f();
+            sender.send(result).unwrap();
+        });
+
+        {
+            let mut guard = mutex.lock();
+            guard.0.push_back(boxed_fn);

You push a function that returns a result onto the back of the priority queue, so the result will not become available before all the scheduled priority tasks finish; that is not an option

slumber

comment created time in 12 days

Pull request review comment: matter-labs/bellman

[WIP] New multicore task executor

+use std::collections::VecDeque;
+use std::sync::Arc;
+
+use parking_lot::{Condvar, Mutex};
+
+use crossbeam::channel::{self, Receiver};
+
+type Task = Box<dyn FnOnce() + Send + 'static>;
+
+#[derive(Default)]
+struct TaskQueue(VecDeque<Task>, VecDeque<Task>);
+
+#[derive(Default)]
+struct State(Mutex<TaskQueue>, Condvar);
+
+#[derive(Clone)]
+pub struct Worker {
+    pub(crate) cpus: usize,
+    shared_state: Arc<State>,
+}
+
+impl Worker {
+    fn spawn_in_pool(shared_state: Arc<State>) {
+        std::thread::spawn(move || {
+            let State(ref mutex, ref cv) = &*shared_state;
+            let mut guard = mutex.lock();
+
+            loop {
+                while let Some(task) = guard.0.pop_front() {
+                    drop(guard);
+                    task();

What stops me from spawning an infinite number of non-cooperative threads in the task?

slumber

comment created time in 12 days

PullRequestReviewEvent

pull request comment: rust-lang/rust

Array `.len()` MIR optimization pass

@rustbot label: +S-waiting-on-review -S-waiting-for-author

shamatar

comment created time in 17 days

pull request comment: rust-lang/rust

Array `.len()` MIR optimization pass

It is good in its current form. Is the question about the performance run results?

shamatar

comment created time in 17 days

Pull request review comment: rust-lang/rust

Local copy propagation

 +         nop;                             // scope 4 at $DIR/simplify_try.rs:21:19: 21:33
 +         nop;                             // scope 0 at $DIR/simplify_try.rs:21:32: 21:33
 +         _5 = discriminant(_0);           // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
-          goto -> bb1;                     // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
+          switchInt(move _5) -> [0_isize: bb1, otherwise: bb2]; // scope 0 at $DIR/simplify_try.rs:22:9: 22:15
       }

       bb1: {
 -         _0 = move _3;                    // scope 1 at $DIR/simplify_try.rs:25:5: 25:10
 -         StorageDead(_3);                 // scope 0 at $DIR/simplify_try.rs:24:6: 24:7
 +         nop;                             // scope 1 at $DIR/simplify_try.rs:25:5: 25:10
 +         nop;                             // scope 0 at $DIR/simplify_try.rs:24:6: 24:7
+          StorageDead(_2);                 // scope 0 at $DIR/simplify_try.rs:26:1: 26:2
+          return;                          // scope 0 at $DIR/simplify_try.rs:26:2: 26:2
+      }
+
+      bb2: {
+-         StorageLive(_6);                 // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+-         _6 = ((_3 as Err).0: i32);       // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
+-         StorageLive(_8);                 // scope 2 at $DIR/simplify_try.rs:22:37: 22:50
+-         StorageLive(_9);                 // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+-         _9 = _6;                         // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
+-         _8 = _9;                         // scope 5 at $DIR/simplify_try.rs:22:37: 22:50
+-         StorageDead(_9);                 // scope 2 at $DIR/simplify_try.rs:22:49: 22:50
+-         ((_0 as Err).0: i32) = _8;       // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
++         nop;                             // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
++         nop;                             // scope 0 at $DIR/simplify_try.rs:22:13: 22:14
++         nop;                             // scope 2 at $DIR/simplify_try.rs:22:37: 22:50
++         nop;                             // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
++         nop;                             // scope 2 at $DIR/simplify_try.rs:22:48: 22:49
++         nop;                             // scope 5 at $DIR/simplify_try.rs:22:37: 22:50
++         nop;                             // scope 2 at $DIR/simplify_try.rs:22:49: 22:50
++         nop;                             // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
+          discriminant(_0) = 1;            // scope 6 at $DIR/simplify_try.rs:22:26: 22:51
+-         StorageDead(_8);                 // scope 2 at $DIR/simplify_try.rs:22:50: 22:51
+-         StorageDead(_6);                 // scope 0 at $DIR/simplify_try.rs:22:50: 22:51
+-         StorageDead(_3);                 // scope 0 at $DIR/simplify_try.rs:24:6: 24:7

What would be the reason for such a large cumulative effect? As far as I can see it should be something like ((_0 as Err).0: i32) = ((_3 as Err).0: i32);, if I correctly understood that this pass only works within blocks and not over the full body

tmiasko

comment created time in 19 days

PullRequestReviewEvent

pull request comment: rust-lang/rust

MIR per-block copy elimination

I've made this pass because it's certainly much more trivial than full destination propagation, and it's easier to make sound.

The logic is simple:

  • visit every MIR statement
  • if it's of the form _2 = _1 (any Place can stand in place of _1), then we mark _2 as "interesting" and keep track of its state, as well as tracking _1. Any copy statement like _2 = _1 starts a new generation of _2
  • if _2 (the destination) is mutated or used in any way other than a copy or move, we discard the destination from the potential replacement candidates
  • if we see a mutable use of _1 (the source), then we discard from the "interesting" list any destinations that use this source
  • otherwise, if _2 is still interesting and we see a use of it, we record the "coordinates" of this use for future replacement of _2 by its source
  • if _2 is not dead at the end of the block, we do not mark the last "generation" as suitable for replacement

This allows handling cases like

_2 = _1;
_3 = use(_2);
StorageDead(_2);

which is turned into

_2 = _1;
_3 = use(_1);
StorageDead(_2);

which is later simplified by SimplifyLocals, as well as

_2 = _1;
_3 = use(_2);
_2 = _4;
...

which is rewritten only to

_2 = _1;
_3 = use(_1);
_2 = _4;
...
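The steps above can be mocked up on a toy IR (a hypothetical sketch of the idea, not the actual MIR pass):

```rust
use std::collections::HashMap;

// Toy per-block copy propagation. Statements are either a copy `_dst = _src`
// or a use `_dst = use(_src)`; locals are plain integers.
#[derive(Debug, PartialEq)]
enum Stmt {
    Copy(u32, u32), // _dst = _src
    Use(u32, u32),  // _dst = use(_src)
}

// Single forward scan over one block: remember the last copy into each
// destination and rewrite later uses of the destination to its source.
// A fresh copy into the same destination starts a new "generation"; the
// end-of-block liveness check from the real pass is omitted here.
fn propagate(block: &mut [Stmt]) {
    let mut copies: HashMap<u32, u32> = HashMap::new();
    for stmt in block.iter_mut() {
        match stmt {
            Stmt::Copy(dst, src) => {
                let (d, s) = (*dst, *src);
                // writing d invalidates anything previously copied *from* d
                copies.retain(|_, v| *v != d);
                copies.insert(d, s); // new generation of d
            }
            Stmt::Use(dst, src) => {
                if let Some(&orig) = copies.get(src) {
                    *src = orig; // replace use of the copy with its source
                }
                let d = *dst;
                copies.retain(|_, v| *v != d);
                copies.remove(&d);
            }
        }
    }
}

fn main() {
    // _2 = _1; _3 = use(_2); _2 = _4  ==>  only the middle use is rewritten
    let mut block = vec![Stmt::Copy(2, 1), Stmt::Use(3, 2), Stmt::Copy(2, 4)];
    propagate(&mut block);
    assert_eq!(block, vec![Stmt::Copy(2, 1), Stmt::Use(3, 1), Stmt::Copy(2, 4)]);
    println!("ok");
}
```

The now-unused copy `_2 = _1` would then be cleaned up by a later pass, as SimplifyLocals does in the real pipeline.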
shamatar

comment created time in 24 days

Pull request review comment: rust-lang/rust

MIR per-block copy elimination

           discriminant(_2) = 1;            // scope 0 at $DIR/issue-73223.rs:2:23: 2:30
           StorageLive(_3);                 // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
           _3 = ((_2 as Some).0: i32);      // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
-          _1 = _3;                         // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
+          _1 = ((_2 as Some).0: i32);      // scope 2 at $DIR/issue-73223.rs:3:20: 3:21

Oh, I think there is a pass that does what you say (not the pass from this PR), but it's under opt-level = 4 and also under "unsound MIR optimizations"

shamatar

comment created time in 24 days

PullRequestReviewEvent

Pull request review comment: rust-lang/rust

MIR per-block copy elimination

           discriminant(_2) = 1;            // scope 0 at $DIR/issue-73223.rs:2:23: 2:30
           StorageLive(_3);                 // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
           _3 = ((_2 as Some).0: i32);      // scope 0 at $DIR/issue-73223.rs:3:14: 3:15
-          _1 = _3;                         // scope 2 at $DIR/issue-73223.rs:3:20: 3:21
+          _1 = ((_2 as Some).0: i32);      // scope 2 at $DIR/issue-73223.rs:3:20: 3:21

I actually did see the result of a pass (SimplifyLocals, I think) that does what you say: if there is a variable that is just Live, copy-assigned, then Dead, it gets removed. The problem is that such a pattern is not present in MIR by default and only emerges, e.g., as a result of this pass

shamatar

comment created time in 24 days

PullRequestReviewEvent

pull request comment: shamatar/EthereumAddress

update CryptoSwift version

I'm OK to merge it, but ideally it should also update the cocoapod spec version, and I can not republish the updated package in CocoaPods for the next few weeks.

I don't know if just merging is enough for your purposes (for SPM I think it's OK), but if it's really required then let's merge (let me know in the comments)

sche

comment created time in 24 days

pull request comment: rust-lang/rust

Array `.len()` MIR optimization pass

Looks like neutral or small gains. Maybe I should write an artificial benchmark that stress-tests const generics

shamatar

comment created time in a month

push event: matter-labs/rescue-poseidon

Sait Imamoglu

commit sha cffb10b7d9269d1ad4b4a4f387cf6312339d0db9

update var length hashing. bring sponge back

view details

Sait Imamoglu

commit sha d64125c40fc84d017ebc773ab943f7a12361b0e7

Move hash params to function signatures for generic sponge

view details

Sait Imamoglu

commit sha b9a89340922e070d39ca81cf5fd298e1deacecc9

update franklin dependency

view details

Sait Imamoglu

commit sha 0e68aeeaae3ccabe479d8f0ba37279318d7bd40e

rework sboxes

view details

Sait Imamoglu

commit sha cb3afd3a0efa4890c869c24dbb2f783b778a4daa

update tests and add helper functions which return nums

view details

Sait Imamoglu

commit sha d618d6788270b607127a9817adef08a22f337538

add multiple custom gate support to sbox

view details

Sait Imamoglu

commit sha 23a191071cf40b1f07d0f5d0506916e8cc333aa1

add boilerplate tests for additional custom gates

view details

Sait Imamoglu

commit sha 594bffa32e8d86c012a0684f6c7c2f8a7017e35b

better handle partial state during sbox application

view details

Sait Imamoglu

commit sha 456096a13c2a6856ca28a096a8ec37e7df791d87

Update readme and examples

view details

Sait Imamoglu

commit sha bc8cd762291e61ecf15f01d66a6ad61c608ae859

Add helper methods to parameters for initialization with custom gates

view details

Sait Imamoglu

commit sha 229f72706cbc731adf54383dcccca5edcec54883

simply matrix vector product for circuit

view details

Alex Vlasov

commit sha b20ea19bb62afbd5994b911527c6dfac48e277fb

use 8 rounds by default for bn256

view details

Alex Vlasov

commit sha 6597ac94af21dcd98cfe3242a530fef34d86878b

Merge branch 'dev' of https://github.com/matter-labs/rescue-poseidon into dev

view details

Alex Vlasov

commit sha 64c25eab3c3abd939d657ea20b2dd6e52986bd6a

add conditional sponge for completeness

view details

Alex Vlasov

commit sha f8adc502d4608f8703b6666ccb20119eac0d4d80

warnings and reexports

view details

Alex Vlasov

commit sha 992032a78c378b5dcbaf8af3e70c2cfd354e2f3b

take all deps from git

view details

Alexander

commit sha 7a07d1f31ca39e2a08483139a38f95abf7be75a7

Merge pull request #2 from matter-labs/dev Custom gate support and backward compatible Sponge API

view details

push time in a month

pull request comment: rust-lang/rust

Array `.len()` MIR optimization pass

I can not even access the experiment data; it looks like an internal error

shamatar

comment created time in a month

pull request comment: rust-lang/rust

Array `.len()` MIR optimization pass

I've added early returns based on the number of blocks/locals (for now, larger than in keccak) and some extra analysis to return early if there are no arrays in scope at all. @oli-obk can you start a performance run?

I'm also not committing the keccak MIR test because it's a lot of code and it's not too relevant for the future

shamatar

comment created time in a month

push event: shamatar/rust

Alex Vlasov

commit sha 3fd74a81bf2c68785045e15ee756f7d8bff7ed95

try to optimize and add early returns

view details

push time in a month

push event: matter-labs/rescue-poseidon

Alex Vlasov

commit sha 992032a78c378b5dcbaf8af3e70c2cfd354e2f3b

take all deps from git

view details

push time in a month

push event: matter-labs/franklin-crypto

Alex Vlasov

commit sha 3343470405f96dcc50884b7215bc23e018dfd96e

take all deps from git for supply chain

view details

push time in a month