nvartolomei (Cloudflare). Opinions expressed belong solely to the author, and not necessarily to the author's employer, organization or other group.

nvartolomei/dotfiles 4

nv ♥️ ~/

nvartolomei/ant-design 0

:ant: One design language

nvartolomei/awesome-bigdata 0

A curated list of awesome big data frameworks, resources and other awesomeness.

nvartolomei/awesome-data-engineering 0

A curated list of data engineering tools for software developers

nvartolomei/awesome-interview-questions 0

:octocat: A curated awesome list of lists of interview questions. Feel free to contribute! :mortar_board:

nvartolomei/backoff 0

The exponential backoff algorithm in Go (Golang).

nvartolomei/bimg 0

Small Go package for fast high-level image processing using libvips via C bindings

nvartolomei/clef-employee-handbook 0

An employee handbook built for inclusion

nvartolomei/ClickHouse 0

ClickHouse is a free analytic DBMS for big data.

nvartolomei/clickhouse-grafana 0

Clickhouse datasource for grafana

started olivernn/lunr.js

started time in 7 days

started jbesomi/texthero

started time in 7 days

issue comment ClickHouse/ClickHouse

JSON structured logging

@alex-zaitsev is this really an option for log capturing including crashes?

nvartolomei

comment created time in 7 days

issue opened ClickHouse/ClickHouse

JSON structured logging

Today logs look like this:

2020.07.08 15:48:16.153765 [ 19602083 ] {} <Debug> TCPHandler: Connected Golang SQLDriver version 1.1.0, revision: 54213, database: r0, user: boo.
2020.07.08 15:48:16.153938 [ 19602083 ] {} <Error> ServerErrorHandler: Code: 192, e.displayText() = DB::Exception: Unknown user boo, e.what() = DB::Exception, Stack trace:
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x91de916]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x1f) [0x334594f]
2. /usr/bin/clickhouse-server(DB::SecurityManager::authorizeAndGetUser(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Poco::Net::IPAddress const&) const+0x4d5) [0x7064f85]
3. /usr/bin/clickhouse-server(DB::Context::setUser(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Poco::Net::SocketAddress const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x66) [0x6fca626]
4. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0x5ab) [0x334eefb]
5. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1cd) [0x335189d]
6. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x1c) [0x335264c]
7. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x945aa9f]
8. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0xe5) [0x945b195]
9. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x81) [0x99c9c01]
10. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x99c61a8]
11. /usr/bin/clickhouse-server() [0x9a9a20f]
12. /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4) [0x7fd9d437c4a4]
13. /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fd9d39aed0f]
2020.07.08 15:48:19.692329 [ 19644371 ] {58c31f1f-b6bd-47bb-82ad-9bf31a63eaaa} <Debug> executeQuery: Query pipeline:
Aggregating
 Concat
  Expression
   Filter
    One

which is not a very convenient format for forwarding to something like Kibana for analysis.

What do you think about adding an option to print logs in JSON format? At least like this:

{
    "timestamp": "2020.07.08 15:48:16.153765",
    "thread_id": "19602083",
    "query_id": "",
    "level": "Debug",
    "source": "TCPHandler",
    "message": "Connected Golang SQLDriver version 1.1.0, revision: 54213, database: r0, user: boo."
}
{
    "timestamp": "2020.07.08 15:48:16.153938",
    "thread_id": "19602083",
    "query_id": "",
    "level": "Error",
    "source": "ServerErrorHandler",
    "message": "Code: 192, e.displayText() = DB::Exception: Unknown user boo, e.what() = DB::Exception, Stack trace:
                0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x91de916]
                1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x1f) [0x334594f]
                2. /usr/bin/clickhouse-server(DB::SecurityManager::authorizeAndGetUser(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Poco::Net::IPAddress const&) const+0x4d5) [0x7064f85]
                3. /usr/bin/clickhouse-server(DB::Context::setUser(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Poco::Net::SocketAddress const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x66) [0x6fca626]
                4. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0x5ab) [0x334eefb]
                5. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1cd) [0x335189d]
                6. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x1c) [0x335264c]
                7. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x945aa9f]
                8. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0xe5) [0x945b195]
                9. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x81) [0x99c9c01]
                10. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x99c61a8]
                11. /usr/bin/clickhouse-server() [0x9a9a20f]
                12. /lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4) [0x7fd9d437c4a4]
                13. /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fd9d39aed0f]"
}

(newlines inside the message shown unescaped for readability; the actual output would escape them properly)

A workaround may be a custom parser, but it will break whenever the log format changes, e.g. if https://github.com/ClickHouse/ClickHouse/pull/12171 is merged.
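To make the fragility concrete, here is what such a workaround parser might look like. This is a hypothetical Python sketch; the regex and field names are assumptions derived from the sample lines above, not anything shipped with ClickHouse:

```python
import json
import re

# Hypothetical pattern for the current text format:
#   "<timestamp> [ <thread_id> ] {<query_id>} <Level> <Source>: <message>"
# It illustrates exactly why such a parser is fragile: any format change breaks it.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
    r" \[ (?P<thread_id>\d+) \]"
    r" \{(?P<query_id>[^}]*)\}"
    r" <(?P<level>\w+)> (?P<source>[^:]+): (?P<message>.*)$"
)

def line_to_json(line):
    """Convert one log line to a JSON object string.

    Returns None for continuation lines (e.g. stack trace frames); a real
    parser would have to append those to the previous record's message.
    """
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return json.dumps(m.groupdict())
```

If the server emitted JSON natively, none of this guesswork would be needed.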

created time in 8 days

started fbdesignpro/sweetviz

started time in 9 days

started feenkcom/gtoolkit

started time in 12 days

delete branch nvartolomei/ClickHouse

delete branch: nv/set-index-tuple-types

delete time in 13 days

issue comment ClickHouse/ClickHouse

Ddl queries are not working and hanging

https://github.com/ClickHouse/ClickHouse/blob/8513e1ec7498ca5cb2dc162ebd642cadfc708c6e/src/Interpreters/DDLWorker.cpp#L323-L346

thyn

comment created time in 13 days

issue comment ClickHouse/ClickHouse

Ddl queries are not working and hanging

The DDL worker tries to resolve addresses like ch%2Ds1%2Dr1:9000 (from the hosts list in the ZK task) and checks whether any of them are bound to local network interfaces, to determine whether the task is intended for it. It looks like those aren't resolving to any local interfaces?
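For reference, the %2D in those host names is just a percent-encoded hyphen, which can be checked with Python's standard library:

```python
from urllib.parse import unquote

# The hosts in the ZK task list are percent-encoded; '%2D' is just '-'.
host = unquote("ch%2Ds1%2Dr1")  # -> "ch-s1-r1"

# The DDL worker would then resolve this name and check whether any of the
# resulting addresses is bound to a local network interface.
print(host)
```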

thyn

comment created time in 13 days

push event nvartolomei/ClickHouse

Nicolae Vartolomei

commit sha 64bbccb42e918424a61453c8518ea3239b2edc5d

Add force_primary_key to a pk in tuple test

view details

Nicolae Vartolomei

commit sha f8ceca6942a201fa547b2db329ce9ca891a687c8

Remove const specifier to allow auto-move (clangtidy)

view details

Nicolae Vartolomei

commit sha 3854ce6d8467ca29b7261f9420e25d8551a19bd3

Rewrite Set lookup to make it more readable

view details

push time in 15 days

push event nvartolomei/ClickHouse

Nicolae Vartolomei

commit sha c95d09aed056f10bba531e5f3e44723f951d17de

Add a test to cover non-const tuple elemenets (just in case)

view details

push time in 15 days

push event nvartolomei/ClickHouse

Nicolae Vartolomei

commit sha b1d2d55cba7bbb357a5cc5dd43c2248cf4c1cd35

Add explicit test for a case where AST hashes collide for different prepared sets

view details

push time in 15 days

create branch nvartolomei/ClickHouse

branch: nv/set-index-tuple-types

created branch time in 16 days

PR opened ClickHouse/ClickHouse

Try fix pk in tuple performance

Possible approach for fixing #10574

The problem is that prepared sets are built correctly: it is a hash map of key -> set, where the key is a hash of the AST and a list of data types (when we have a list of tuples of literals).

However, when the key is built from the index to check whether a matching prepared set exists, it looks for the data types of the primary key (see how data_types is populated). Because the primary key has only one field (v in my example), it cannot find the prepared set.

The patch looks for any prepared sets whose data types match on the subset of fields found in the primary key; we are not interested in the other fields anyway for the purpose of primary key pruning.
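As a rough illustration of the matching rule described above (all names here are invented for illustration; the real logic lives in ClickHouse's C++ code):

```python
# Illustrative model only: prepared sets are keyed by
# (AST hash, tuple of element data types).
prepared_sets = {
    ("hash_of_in_ast", ("UInt64", "UInt8")): "set built from ((1, 0))",
}

def find_prepared_set(ast_hash, primary_key_types):
    """Match a prepared set on the subset of tuple elements covered by the
    primary key, ignoring the data types of the remaining elements."""
    for (h, set_types), prepared in prepared_sets.items():
        if h == ast_hash and set_types[: len(primary_key_types)] == tuple(primary_key_types):
            return prepared
    return None
```

With the exact-match lookup, a primary key of just ("UInt64",) would never find the ("UInt64", "UInt8") entry; the subset match above does.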

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature
  • Bug Fix
  • Improvement
  • Performance Improvement
  • Backward Incompatible Change
  • Build/Testing/Packaging Improvement
  • Documentation (changelog entry is not required)
  • Other
  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

...

Detailed description / Documentation draft:

...

By adding documentation, you'll allow users to try your new feature immediately, not when someone else will have time to document it later. Documentation is necessary for all features that affect user experience in any way. You can add brief documentation draft above, or add documentation right into your patch as Markdown files in docs folder.

If you are doing this for the first time, it's recommended to read the lightweight Contributing to ClickHouse Documentation guide first.

+59 -5

0 comment

3 changed files

pr created time in 16 days

issue comment ClickHouse/ClickHouse

Index not used for IN operator with literals

Caused by https://github.com/ClickHouse/ClickHouse/commit/1168ba53157ab59ffc9c38f1d7c962c6c5d7a1f9 from https://github.com/ClickHouse/ClickHouse/pull/4099

akuzm

comment created time in 16 days

issue comment ClickHouse/ClickHouse

Index not used for IN operator with literals

Simple test case:

drop database if exists test;
create database test;
use test;
create table z(v UInt64) engine = MergeTree() order by (v) settings index_granularity = 1;
insert into z select number from numbers(10000000);
select count() from z where (v, 0) IN ((1, 0));

Last good version available on Docker Hub: v19.1.5; first bad version available on Docker Hub: v19.3.3.

akuzm

comment created time in 16 days

started tianon/gosu

started time in 18 days

started p8952/bocker

started time in 18 days

started kiali/kiali

started time in 19 days

issue opened ClickHouse/ClickHouse

Long running Distributed DDL fail with `Cannot execute replicated DDL query on leader`

Hello,

I'm running into the following issue with some Distributed DDL queries, e.g. ALTER MODIFY COLUMN, which are long running due to the need to rewrite data. They fail with a Cannot execute replicated DDL query on leader error, but they don't have to.

The issue is here: https://github.com/ClickHouse/Clickhouse/blob/42b8ed3ec64d7077422afb898db174edf6c191b0/src/Interpreters/DDLWorker.cpp#L776. Replicas wait only 20 seconds (20 tries with a 1 second sleep after each) when checking the status of the migration; since those migrations take longer than that, it fails with the above-mentioned error.

Maybe we should wait as long as the execution lock is held by a leader instead?
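The proposed change could be sketched like this (a simplified model with hypothetical function names, not the actual DDLWorker code):

```python
import time

def wait_for_ddl_task(task_finished, leader_holds_lock, max_tries=20, sleep_s=1.0):
    """Wait for a distributed DDL task to finish.

    Current behaviour: give up after `max_tries` checks (~20 seconds).
    Proposed behaviour: keep waiting while a leader still holds the
    execution lock, i.e. while the migration is actually in progress.
    """
    tries = 0
    while not task_finished():
        tries += 1
        if tries >= max_tries and not leader_holds_lock():
            return False  # nobody is working on the task any more
        time.sleep(sleep_s)
    return True
```

The bounded retry budget still applies when no leader is working on the task, so a genuinely stuck task still fails instead of hanging forever.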

created time in 23 days

push event nvartolomei/awesome-data-engineering

nvartolomei

commit sha bf534518d693c51ac7c2b17a12e4ef7b558eb301

Update ClickHouse description

view details

push time in 24 days

push event nvartolomei/awesome-data-engineering

nvartolomei

commit sha 3b699b0f80fb437acdbaa164f17f0634476de4a2

Add ClickHouse to Columnar Databases

view details

push time in 24 days

fork nvartolomei/awesome-data-engineering

A curated list of data engineering tools for software developers

fork in 24 days

started w9jds/firebase-action

started time in 24 days

pull request comment ClickHouse/ClickHouse

Add number of errors to ignore while choosing replicas

I think a better way to balance things would be to track the rate of errors (instead of an absolute value) with a sliding window and allow a small variability in the rate of errors between replicas.

The ignore value added here seems to help only in the beginning, before the error counter reaches that value; after that, the behaviour will be exactly the same as it currently is, until the counters come back to values less than this min constant.
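A minimal sketch of that sliding-window idea (illustrative only, not the ClickHouse connection pool code; the window size and tolerance are arbitrary):

```python
import time
from collections import deque

class ErrorRateWindow:
    """Track errors per replica over a sliding time window, so replicas are
    compared by recent error rate instead of absolute lifetime counters."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.errors = deque()  # timestamps of recent errors

    def record_error(self, now=None):
        self.errors.append(time.monotonic() if now is None else now)

    def rate(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop errors that have aged out of the window.
        while self.errors and now - self.errors[0] > self.window_s:
            self.errors.popleft()
        return len(self.errors) / self.window_s  # errors per second

def pick_replica(windows, now=None, tolerance=0.01):
    """Pick the replica with the lowest recent error rate, allowing a small
    variability (`tolerance`) between replicas before preferring one."""
    rates = {name: w.rate(now) for name, w in windows.items()}
    best = min(rates.values())
    candidates = [n for n, r in rates.items() if r - best <= tolerance]
    return candidates[0]
```

Unlike an absolute counter, old errors age out automatically, so a replica that misbehaved an hour ago is not penalized forever.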

azat

comment created time in 25 days

issue comment ClickHouse/ClickHouse

Extend load_balancing first_or_random to first_2th_or_random

This gets complicated, but a more flexible solution might be nested replica lists/groups.

<shard>
  <internal_replication>true</internal_replication>
  <replica_list load_balancing="in_order">
    <replica_list load_balancing="random">
      <replica>
        <host>AZ_A_shard1_replicas1</host>
        <port>9000</port>
      </replica>
      <replica>
        <host>AZ_A_shard1_replicas2</host>
        <port>9000</port>
      </replica>
    </replica_list>
    <replica_list load_balancing="random">
      <replica>
        <host>AZ_B_shard1_replicas1</host>
        <port>9000</port>
      </replica>
      <replica>
        <host>AZ_B_shard1_replicas2</host>
        <port>9000</port>
      </replica>
    </replica_list>
  </replica_list>
</shard>

When picking a replica for a shard, you go to the first replica_list, look at its balancing policy, and pick a nested replica/replica_list based on that. Apply this recursively until you have picked a replica.

This also solves another problem with first_or_random: when you have a circular replication topology with 3 replicas, one of them dies, and you want to remove it from the topology. Currently first_or_random degrades to the in_order policy, and the hack is to put an unavailable host in place of the first replica. With nested pools you could do this:

<shard>
  <internal_replication>true</internal_replication>
  <replica_list load_balancing="first_or_random">
    <replica_list>
      <replica>
        <host>replica1</host>
        <port>9000</port>
      </replica>
    </replica_list>
    <replica_list load_balancing="random">
      <replica>
        <host>replica2</host>
        <port>9000</port>
      </replica>
      <replica>
        <host>replica3</host>
        <port>9000</port>
      </replica>
    </replica_list>
  </replica_list>
</shard>

Removing replica1 from the list will work as expected.
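The recursive selection described above can be sketched as follows (a toy model of the proposal; the data structure and names are assumptions, and availability checks/failover are omitted):

```python
import random

def pick_replica(node):
    """Recursively pick a replica from nested replica lists.

    A node is either a host name (a leaf <replica>) or a dict with an
    optional "load_balancing" policy and a list of "children".
    """
    if isinstance(node, str):
        return node
    children = node["children"]
    policy = node.get("load_balancing", "random")
    if policy in ("in_order", "first_or_random"):
        # first_or_random would fall back to random among the rest on failure
        child = children[0]
    else:  # "random"
        child = random.choice(children)
    return pick_replica(child)

# The first configuration above, expressed as nested lists:
shard = {
    "load_balancing": "in_order",
    "children": [
        {"load_balancing": "random",
         "children": ["AZ_A_shard1_replicas1", "AZ_A_shard1_replicas2"]},
        {"load_balancing": "random",
         "children": ["AZ_B_shard1_replicas1", "AZ_B_shard1_replicas2"]},
    ],
}
```

With in_order at the top level, traffic stays in AZ_A while it is healthy, spread randomly between its two replicas.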

qixiaogang

comment created time in a month

issue comment ProseMirror/prosemirror

Deleting lines inside code blocks leaves void space

Minimal reproducible example: https://codesandbox.io/s/interesting-kare-0ls2i

Put the caret on line 2 and delete it; the void space remains there. Interestingly, if you press enter inside this vanilla contenteditable and then delete the characters, it behaves as you would expect. Puzzled.

<p
  id="test"
  style="white-space: pre-wrap; border: 1px solid"
  contenteditable="true"
/>
<script>
  document.getElementById("test").innerHTML = "line1\nline2\nline3";
</script>
nvartolomei

comment created time in a month

issue comment ProseMirror/prosemirror

Deleting lines inside code blocks leaves void space

@marijnh disabled all extensions, still reproducible. Re: the GitHub issue editor textarea, it actually seems to be ProseMirror, but integrated in a peculiar way (let's ignore this fact).

nvartolomei

comment created time in a month

issue opened ProseMirror/prosemirror

Deleting lines inside code blocks leaves void space

Issue details

<!-- Please provide issue details here. -->

Steps to reproduce

<!-- Please provide necessary steps to reproduce the issue. For convenience, you can use this Glitch demo to start with. https://glitch.com/edit/#!/remix/prosemirror-demo-basic -->

  1. Enter code_block mode by typing ```
  2. Write 3 lines of code
  3. Go to second line and press backspace until you delete it completely and cursor jumps to the end of the first line
  4. Observe that in place of the second line, which we just tried to delete, there is now a void space

ProseMirror version

<!-- Please provide which version of ProsemMirror you're running. -->

Latest on npm, latest on the ProseMirror website.

Affected platforms

<!-- Please provide specific version of affected browsers or platforms. -->

  • [ ] Chrome
  • [ ] Firefox
  • [ ] Internet Explorer
  • [x] Other (Safari Version 13.1.1 (15609.2.9.1.2))

Screenshots / Screencast (Optional)

Screen Recording 2020-06-10 at 23 08 04

The oddest thing is that I can trigger somewhat similar behaviour in this GitHub issue editor textarea.

created time in a month

Pull request review comment ClickHouse/ClickHouse

Distributed alter query(delete/update/drop partition) on cross replication clusters

void DDLWorker::parseQueryAndResolveHost(DDLTask & task)
{
    if (found_exact_match)
    {
-        throw Exception("There are two exactly the same ClickHouse instances " + address.readableString()
-            + " in cluster " + task.cluster_name, ErrorCodes::INCONSISTENT_CLUSTER_DEFINITION);
+        if (default_database == address.default_database)
+        {
+            throw Exception(
+                "There are two exactly the same Snowball instances " + address.readableString() + " in cluster "
+                    + task.cluster_name, ErrorCodes::INCONSISTENT_CLUSTER_DEFINITION);
+        }
+        else  /// circular replication is used.

https://www.altinity.com/blog/2018/5/10/circular-replication-cluster-topology-in-clickhouse

etah000

comment created time in a month

started cytoscape/cytoscape.js

started time in a month

started quantopian/zipline

started time in a month

push event nvartolomei/gonja

Nicolae Vartolomei

commit sha 52becceeccb0a050aee8fa6b31bee7fdb8470753

Add support for slice concatenation

view details

push time in 2 months

pull request comment noirbizarre/gonja

Add support for slice concatenation

The failure seems unrelated to me; there is a race condition somewhere, but it is also present on the master branch.

nvartolomei

comment created time in 2 months

PR opened noirbizarre/gonja

Add support for slice concatenation
+21 -0

0 comment

4 changed files

pr created time in 2 months

push event nvartolomei/gonja

Nicolae Vartolomei

commit sha 4fc4eb71165eb1afd36b6b8435501d34223dc6c6

Add support for slice concatenation

view details

push time in 2 months

fork nvartolomei/gonja

Jinja-like syntax template-engine for Go

fork in 2 months

issue opened ClickHouse/ClickHouse

Weird behaviour of function IPv4CIDRToRange/IPv6CIDRToRange which seems to be triggered by some query pipeline optimizations

How to reproduce

  • Which ClickHouse server version to use: version 20.4.4
DROP TABLE IF EXISTS test;
CREATE TABLE test Engine = Memory
AS SELECT '1.1.1.1/24' as address;
WITH '1.1.1.1' AS addr
SELECT
    address,
    splitByChar('/', address) AS prefix,
    prefix[1] AS base,
    toUInt8(prefix[2]) AS mask
FROM test
WHERE (IPv4CIDRToRange(toIPv4(addr), mask).1) = toIPv4(base)

Received exception from server (version 20.4.4):
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Illegal column Const(UInt32) of argument of function IPv4CIDRToRange.

Expected behaviour is this (here the mask is hardcoded to 24):

WITH '1.1.1.1' AS addr
SELECT
    address,
    splitByChar('/', address) AS prefix,
    prefix[1] AS base,
    toUInt8(prefix[2]) AS mask
FROM test
WHERE (IPv4CIDRToRange(toIPv4(addr), 24).1) = toIPv4(base)

┌─address────┬─prefix───────────┬─base────┬─mask─┐
│ 1.1.1.0/24 │ ['1.1.1.0','24'] │ 1.1.1.0 │   24 │
└────────────┴──────────────────┴─────────┴──────┘

1 rows in set. Elapsed: 0.005 sec.

created time in 2 months

Pull request review comment ClickHouse/ClickHouse

[WIP] Add DatabaseReplicated database engine

void DatabaseReplicated::propose(const ASTPtr & query)
{
    current_zookeeper->createOrUpdate(zookeeper_path + "/last_entry", std::to_string(current_log_entry_n), zkutil::CreateMode::Persistent);

    lock->unlock();
-    // write to metastore the last entry?
+    saveState();
+}
+
+void DatabaseReplicated::createSnapshot() {
+    current_zookeeper->createAncestors(zookeeper_path + "/snapshot/huy");

🤦‍♂️😁

ValBaturin

comment created time in 2 months

started binaryaffairs/a-la-mode

started time in 2 months

issue comment ClickHouse/ClickHouse

Is it a good Idea to put a high cardinality column in primary key ?

https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/#selecting-the-primary-key

High cardinality is good: just by looking at the index (which is in memory), a large set of rows can be discarded.
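The pruning effect can be illustrated with a toy model of a sparse primary index (granule min/max marks; not ClickHouse's actual implementation):

```python
def matching_granules(index_marks, key_value):
    """Toy model of a sparse primary index: each granule is summarized by the
    (min, max) of its primary key column. Granules whose range cannot contain
    the predicate value are skipped without reading any rows."""
    return [
        i for i, (lo, hi) in enumerate(index_marks)
        if lo <= key_value <= hi
    ]

# High-cardinality key: narrow, mostly disjoint ranges, so most granules drop out.
marks = [(0, 9), (10, 19), (20, 29), (30, 39)]
```

With a low-cardinality key the per-granule ranges overlap heavily, and far fewer granules can be skipped.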

hoseiney

comment created time in 2 months

push event nvartolomei/qmk_firmware

Nicolae Vartolomei

commit sha cf55cc73d90b939ca153bcb78d97cebc91a46144

Remove layer 1 toggle from lhs

view details

push time in 2 months

pull request comment ClickHouse/ClickHouse

Added multiple query formatting on clickhouse-format

A test would be useful, see https://github.com/ClickHouse/ClickHouse/blob/97f2a2213e754ba25dabba4bc8ddf507cd67660c/tests/queries/0_stateless/00948_format_in_with_single_element.sh together with https://github.com/ClickHouse/ClickHouse/blob/97f2a2213e754ba25dabba4bc8ddf507cd67660c/tests/queries/0_stateless/00948_format_in_with_single_element.reference for an example.

dgrr

comment created time in 2 months

fork nvartolomei/templates

A set of standard document templates.

fork in 2 months

startedmingrammer/diagrams

started time in 2 months

issue comment ClickHouse/ClickHouse

Uniform tracking for queries and background processes.

#10316

akuzm

comment created time in 2 months

push event nvartolomei/qmk_firmware

Nicolae Vartolomei

commit sha 84a0dbc085e9a7a4b69db7533ec42d5183781de8

ergodox_ez/keymaps/nvartolomei: Remove mapping for jumping to first layer with the same key which is used for escape

view details

push time in 2 months

started lalrpop/lalrpop

started time in 3 months

started pest-parser/pest

started time in 3 months

issue comment ClickHouse/ClickHouse

Allow querying Distributed/Merge tables that have supertype columns of underlying tables

@alexey-milovidov I see SELECT v FROM d ORDER BY v; -- { clientError 36 } in the test and this issue closed; are you saying that enums work as expected and that you are not planning to change their behaviour? Do you have suggestions for workarounds on how to do zero-downtime ALTERs in this case?

nvartolomei

comment created time in 3 months

delete branch nvartolomei/intellij-community

delete branch : nv/IDEA-194379-auto-hide-editor-scrollbar

delete time in 3 months

started canonical/dqlite

started time in 3 months

fork nvartolomei/handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

fork in 3 months
