profile
viewpoint

hashicorp/packer 11477

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.

hashicorp/serf 5047

Service orchestration and management tool.

hyperium/tonic 2783

A native gRPC client & server implementation with async/await support.

bitgn/fdb-cloud-test 22

Packer + Terraform setup to experiment with FDB clusters in the cloud.

cbednarski/mkdeb 9

Go library for creating Debian packages (.deb)

jen20/awspolicyequivalence 4

A Go library for determining the equivalence of AWS IAM policies written in JSON.

jen20/AggregateSource 2

Lightweight infrastructure for doing eventsourcing using aggregates

jen20/action-docker-build 1

Build, push and tag Docker containers in GitHub Actions.

jen20/build-woftam 1

Visual C++ should be shot in the face.

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. +    * Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.+    * Load balance: Each data exists on k storage servers, where k is the replication factor. To balance the load across the k replicas, client has a load balancing algorithm to balance the number of requests to each replica.+    * Transaction succeed: If the storage server has the data at the read timestamp, the client will receive the data and return succeed. +    * Transaction too old error: If the read request’s timestamp is older than 5 seconds, storage server may have already flushed the data from its in-memory multi-version data structure to its on-disk single-version data structure. This means storage server does not have the data older than 5 seconds. So client will receive transaction too old error. The client will retry with a new timestamp. One scenario that can lead to the error is when it takes too long for a client to send the read request after it gets the timestamp.+    * Future transaction error: Each storage server pulls data in increasing order of data’s timestamp from the queuing system. Let’s define a storage server’s timestamp as the largest timestamp of data the storage server has. If the read request’s timestamp is larger than the storage server’s timestamp, the storage server will reply future-transaction-error to the client. The client will retry. One scenario that can lead to the error is when the connection between the SS and the queuing system is slow.+    * Wrong shard error: If keys in the request or result depend on data outside this storage server OR if a large selector offset prevents all data from being read in one range read. Client will invalidate its locality cache for the key and retry the read request at the failed key.++Implementation of FDB read path+------------------------------------------++* **Step 1 (Timestamp request)**: +    * Each read request tries to get a timestamp if its transaction has not got one: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2104+    * Client batches the get-timestamp requests: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3172+    * Dynamic batching algorithm: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3101-L3104+* **Step 2 (Get latest commit version)**: Contacting (n-1) proxies for commit version: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1196+* **Step 3 (Confirm proxy’s liveness)**:+    * We typically set our clusters’ knob ALWAYS_CAUSAL_READ_RISKY to 1 to skip this step+    * Proxy confirm queuing system is alive: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1199+    * How is confirmEpochLive(..) implemented for the above item. https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/TagPartitionedLogSystem.actor.cpp#L1216-L1225+* **Step 4 (Locality request)**: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1312-L1313+* **Step 5 (Get data request)**:+    * Logics of handling get value request: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1306-L1396+    * Load balance algorithm: The loadBalance() at https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1342-L1344++++Write path of a transaction+================================++Suppose a client has a write-only transaction. Fig. 2 below shows the write path in a non-HA cluster. We will discuss how a transaction with both read and write works in the next section.++.. image:: /images/FDB_write_path.png++To simplify the explanation, the steps below do not include transaction batching on proxy, which is a typical database technique to increase transaction throughput.++* **Step 1 (Client buffers write mutations):** Client buffers all writes in a transaction until commit is called on the transaction. In the rest of document, a write is also named as a mutation.+    * Client is a fat client that preprocess transactions: (a) For atomic operations, if client knows the key value, it will convert atomic operations to set operations; (b) For version stamp atomic operations, client adds extra bytes to key for the version stamp; (c) If a key has multiple operations, client coalesces them to one operation, etc.
    * Client is a fat client that preprocess transactions: (a) For atomic operations, if client knows the key value, it will convert atomic operations to set operations; (b) For version stamp atomic operations, client adds extra bytes to key or value for the version stamp; (c) If a key has multiple operations, client coalesces them to one operation whenever possible.
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. +    * Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.+    * Load balance: Each data exists on k storage servers, where k is the replication factor. To balance the load across the k replicas, client has a load balancing algorithm to balance the number of requests to each replica.+    * Transaction succeed: If the storage server has the data at the read timestamp, the client will receive the data and return succeed. +    * Transaction too old error: If the read request’s timestamp is older than 5 seconds, storage server may have already flushed the data from its in-memory multi-version data structure to its on-disk single-version data structure. This means storage server does not have the data older than 5 seconds. So client will receive transaction too old error. The client will retry with a new timestamp. One scenario that can lead to the error is when it takes too long for a client to send the read request after it gets the timestamp.+    * Future transaction error: Each storage server pulls data in increasing order of data’s timestamp from the queuing system. Let’s define a storage server’s timestamp as the largest timestamp of data the storage server has. If the read request’s timestamp is larger than the storage server’s timestamp, the storage server will reply future-transaction-error to the client. The client will retry. One scenario that can lead to the error is when the connection between the SS and the queuing system is slow.+    * Wrong shard error: If keys in the request or result depend on data outside this storage server OR if a large selector offset prevents all data from being read in one range read. Client will invalidate its locality cache for the key and retry the read request at the failed key.++Implementation of FDB read path+------------------------------------------++* **Step 1 (Timestamp request)**: +    * Each read request tries to get a timestamp if its transaction has not got one: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2104+    * Client batches the get-timestamp requests: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3172+    * Dynamic batching algorithm: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3101-L3104+* **Step 2 (Get latest commit version)**: Contacting (n-1) proxies for commit version: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1196+* **Step 3 (Confirm proxy’s liveness)**:+    * We typically set our clusters’ knob ALWAYS_CAUSAL_READ_RISKY to 1 to skip this step+    * Proxy confirm queuing system is alive: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1199+    * How is confirmEpochLive(..) implemented for the above item. https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/TagPartitionedLogSystem.actor.cpp#L1216-L1225+* **Step 4 (Locality request)**: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1312-L1313+* **Step 5 (Get data request)**:+    * Logics of handling get value request: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1306-L1396+    * Load balance algorithm: The loadBalance() at https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1342-L1344++++Write path of a transaction+================================++Suppose a client has a write-only transaction. Fig. 2 below shows the write path in a non-HA cluster. We will discuss how a transaction with both read and write works in the next section.++.. image:: /images/FDB_write_path.png++To simplify the explanation, the steps below do not include transaction batching on proxy, which is a typical database technique to increase transaction throughput.++* **Step 1 (Client buffers write mutations):** Client buffers all writes in a transaction until commit is called on the transaction. In the rest of document, a write is also named as a mutation.+    * Client is a fat client that preprocess transactions: (a) For atomic operations, if client knows the key value, it will convert atomic operations to set operations; (b) For version stamp atomic operations, client adds extra bytes to key for the version stamp; (c) If a key has multiple operations, client coalesces them to one operation, etc.+    * How client buffers mutations: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2345-L2361+* **Step 2 (Client commits the transaction):** When a client calls commit(), it performs several operations:+    * **Step 2-1**: Add extra conflict ranges that are added by user but cannot be calculated from mutations. +    * **Step 2-2**: Get a timestamp as the transaction’s start time. The timestamp does not need causal consistency because the transaction has no read. +        * This request goes to one of proxies. The proxy will contact all other (n-1) proxies to get the most recent commit version as it does in read path. The proxy does not need to contact log systems to confirm its activeness because it does not need causal consistency.+    * **Step 2-3**:  Sends the transaction’s information to a proxy. Load balancer in client decides which proxy will be used to handle a transaction. A transaction’s information includes:+        * All of its mutations;+        * Read and write conflict range;+        * Transaction options that control a transaction’s behavior. For example, should the transaction write when the DB is locked? Shall the transaction uses the first proxy in the proxy list to commit? +    * Implementation:+        * Transaction commit function: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2895-L2899+        * Major work of commit in client side is done at here: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2784-L2868+        * Step 2-1: Add extra conflict ranges: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2826-L2828+        * Step 2-2: getReadVersion at commit which does not need external consistency because we do not have read in the transaction: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2822-L2823+        * Step 2-3: Send transaction to a proxy via RPC: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2691-L2700+* When a proxy receives clients’ transactions, it commits the transaction on behalf of clients with Step 3 - 9.+* **Step 3 (Proxy gets commit timestamp)**: The proxy gets the timestamp of the transaction’s commit time from the time oracle through an RPC call.+    * To improve transaction throughput and reduce network communication overhead, each proxy dynamically batch transactions and process transactions in batches. A proxy keeps batching transactions until the batch time exceeds a configurable timeout value or until the number of transactions exceed a configurable value or until the total bytes of the batch exceeds a dynamically calculated desired size. +    * The network overhead is 1 network communication per batch of commit transactions;+    * How is the dynamically calculated batch size calculated: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1770-L1774+    * How commit transactions are batched: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L416-L486+    * How each transaction batch is handled: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L523-L1174+    * Where does proxy sends commit timestamp request to the timestamp generator:  https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L586-L587+* **Step 4 (Proxy builds transactions’ conflict ranges)**: Because the concurrency control component may have multiple processes, each of which is responsible for resolving conflicts in a key range, the proxy needs to build one transaction-conflict-resolution request for each concurrency control process: For each transaction, the proxy splits its read and write conflict ranges based on concurrency control process’ responsible ranges. The proxy will create k conflict resolution requests for each transaction, where k is the number of processes in the concurrency control component. +    * Implementation: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbserver/MasterProxyServer.actor.cpp#L607-L618+* **Step 5 (Proxy sends conflict resolution requests to concurrency control)**: Each concurrency control process is responsible for checking conflicts in a key range. Each process checks if the transaction has conflicts with other transactions in its  key-range. Each process returns the conflict checking result back to the proxy.+    * What is conflict range? +        * A transaction’s write conflict range includes any key and key-ranges that are modified in the transactions. +        * A transaction’s read conflict range includes any key and key-ranges that are read in the transaction. +        * Client can also use transaction options to add explicit read-conflict-range or write-conflict-range. Example: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2634-L2635+    * **Piggy-back metadata change**. If the transaction changes database’s metadata, such as locking the database, the change is considered as a special mutation and also checked for conflicts by the concurrency control component.  The primary difference between metadata mutation and normal mutations is that the metadata change must be propagated to all proxies so that all proxies have a consistent view of database’s metadata. This is achieved by piggy-backing metadata change in the reply from resolver to proxies.  +    * Implementation+        * Create conflict resolution requests for a batch of transactions: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L607-L618+        * Metadata mutations are sent from proxy to concurrency control processes: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L366-L369+* **Step 6 (Resolve conflicts among concurrent transactions)**: Each concurrency control process checks conflicts among transactions based on the theory in [1]. In a nutshell, it checks for read-write conflicts. Suppose two transactions operates on the same key. If a write transaction’s time overlaps between another read-write transaction’s start time and commit time, only one transaction can commit: the one that arrives first at all concurrency control processes will commit.+    * Implementation+        * Proxy sends conflict checking request: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L626-L629+        * Concurrency control process handles the request: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/Resolver.actor.cpp#L320-L322+* **Step 7 (Proxy’s post resolution processing)**: Once the proxy receives conflict-resolution replies from all concurrency control processes, it performs three steps+    * **Step 7-1 (Apply metadata effect caused by other proxies)**: As mentioned above, when a proxy changes database’s metadata, the metadata mutations will be propagated via the concurrency control component to other proxies. So the proxy needs to first compute and apply these metadata mutations onto the proxy’s local states. Otherwise, the proxy will operate in a different view of database’s metadata.+        * For example, if one proxy locks the database in a committed transaction at time t1, all other proxies should have seen the lock immediately after t1. Since another proxy may have transactions in flight already at t1, the proxy must first apply the “lock“ effect before it can process its in-flight transactions.+        * How metadata effect is applied in implementation:  https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L678-L719+    * **Step 7-2 (Determine which transactions are committed)**: Proxy combines results from all concurrency control processes. Only if all concurrency control processes say a transaction is committed, will the transaction be considered as committed by the proxy. +        * Implementation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L721-L757+    * **Step 7-3 (Apply metadata effect caused by this proxy)**: For each committed transaction, this proxy applies its metadata mutations to the proxy’s local state. +        * Note: These metadata mutations are also sent to concurrency control processes and propagated to other proxies at Step 5. This step is to apply metadata effect on its own proxy’s states.+        * Implementation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L763-L777+    * **Step 7-4 (Assign mutations to storage servers and serialize them)**: In order to let the rest of system (the queuing system and storage system) know which process a mutation should be routed to, the proxy needs to add tags to mutations. The proxy serializes mutations with the same tag into the same message and sends the serialized message to the queuing system.+        * Implementation of adding tags and serializing mutations into messages: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L800-L910+        * The lines that add tags to a mutation and serialize it: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L846-L847+    * **Step 7-5 (Duplicate and serialize mutations to backup system keyspace)**:  When backup or disaster recovery (DR) is enabled, each proxy captures mutation streams into a dedicated system keyspace. Mutations in a transaction batch are serialized as a single mutation in a dedicated system keyspace. +        * How mutations are duplicated for backup and DR: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L912-L986+        * Note: FDB will have a new backup system that avoids duplicating mutations to the system keyspace. Its design is similar to database’s Change Data Capture (CDC) design. The new backup system is not production-ready yet.+* **Step 8 (Make mutation messages durable in the queuing system)**: Proxy sends serialized mutation messages to the queuing system. The queuing system will append the mutation to an append-only file, fsync it, and send the respnose back. Each message has a tag, which decides which process in the queuing system the message should be sent to. The queuing system returns to the proxy the minimum known committed version, which is the smallest commit version among all proxies. The minimum known commit version is used when the system recovers from fault.+    * Sending messages to the queuing system is abstracted into a push() operation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1045+    * The minimum known committed version is called minKnownCommittedVersion. It is updated for each commit: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1067+* **Step 9 (Reply to client)**: Proxy replies the transaction’s result to client. If the transaction fails (say due to transaction conflicts), proxy sends the error message to the client. +    * Reply to clients based on different transaction’s results: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1117-L1138+* **Step 10 (Storage systems pull data from queuing system)**: Storage system asynchronously pulls data from queuing system and indexes data for read path. +    * Each SS has a primary process (called primary tLog) in the queuing system to pull data from the SS’s data from the queuing system. Each SS only gets in-ordered streams of mutations that are owned by the SS. +    * In failure scenario when a SS cannot reach the primary tLog, the SS will pull data from different tLogs that have part of the SS’s data. The SS will then merge the stream of data from different tLogs.+    * Each SS does not make its pulled data durable to disk until the data becomes at least 5 seconds older than the most recent data the SS has pulled. This allows each SS to roll back at least 5 seconds of mutations. +    * Why do we need roll back feature for SS? This comes from an optimization used in FDB. To make a mutation available in a SS as soon as possible, a SS may fetch a mutation from the queuing system that has not been fully replicated. The mutation’s transaction may be aborted in rare situations, such as when FDB has to recover from faults and decides to throw away the last few non-fully-durable transactions. SSes must throw away data in the aborted transactions. +    * Why does SS not make data durable until 5 seconds later? This is because today’s SS does not support rolling back data that has already been made durable on disk. To support roll back, SS keeps data that might be rolled back in memory. When roll-back is needed, SS just throws away the in-memory data. This simplifies the SS implementation.+++    * Each storage process pulls data from the queuing system: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/storageserver.actor.cpp#L3593-L3599++Read write path of a transaction+====================================++This section uses an example transaction to describe how a transaction with both read and write operation works in FDB.++Suppose application creates the following transaction, where *Future<int>* is an object that holds an asynchronous call and becomes ready when the async call returns, and *wait()* is a synchronous point when the code waits for futures to be ready.+The following code reads key k1 and k2 from database,  increases k1’s value by 1 and write back k1’s new value into database.++**Example Transaction** ::++    Line1: Transaction tr;+    Line2: Future<int> fv1 = tr.get(k1);+    Line3: Future<int> fv2 = tr.get(k2);+    Line4: v1 = wait(fv1);+    Line5: v2 = wait(fv2);+    Line6: tr.set(v1+v2);+    Line7: tr.commit();++The transaction starts with the read path:++* When tr.get() is called, FDB client issues a timestamp request to proxies *if* the transaction has not set its start timestamp. The logic is the Step 1 in the read path;+* Batching timestamp requests. When another tr.get() is called, it will try to get a timestamp as well. If we let every get request to follow the Step 1 in the read path, the performance overhead (especially network communication) will be a lot. In addition, this is not necessary because a transaction has only one start timestamp. To solve this problem, client chooses to batch timestamp requests from the same transaction and only issues one timestamp request when the transaction size reaches a preconfigured threshold or when the transaction duration reaches the batching timeout threshold. +    * Timestamp requests are batched: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbclient/NativeAPI.actor.cpp#L3185+    * Thresholds for client to send the timestamp request: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbclient/NativeAPI.actor.cpp#L3095-L3098
    * Thresholds for client to send the timestamp request: https://github.com/apple/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbclient/NativeAPI.actor.cpp#L3095-L3098
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. +    * Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.+    * Load balance: Each data exists on k storage servers, where k is the replication factor. To balance the load across the k replicas, client has a load balancing algorithm to balance the number of requests to each replica.+    * Transaction succeed: If the storage server has the data at the read timestamp, the client will receive the data and return succeed. +    * Transaction too old error: If the read request’s timestamp is older than 5 seconds, storage server may have already flushed the data from its in-memory multi-version data structure to its on-disk single-version data structure. This means storage server does not have the data older than 5 seconds. So client will receive transaction too old error. The client will retry with a new timestamp. One scenario that can lead to the error is when it takes too long for a client to send the read request after it gets the timestamp.+    * Future transaction error: Each storage server pulls data in increasing order of data’s timestamp from the queuing system. Let’s define a storage server’s timestamp as the largest timestamp of data the storage server has. If the read request’s timestamp is larger than the storage server’s timestamp, the storage server will reply future-transaction-error to the client. The client will retry. One scenario that can lead to the error is when the connection between the SS and the queuing system is slow.+    * Wrong shard error: If keys in the request or result depend on data outside this storage server OR if a large selector offset prevents all data from being read in one range read. Client will invalidate its locality cache for the key and retry the read request at the failed key.++Implementation of FDB read path+------------------------------------------++* **Step 1 (Timestamp request)**: +    * Each read request tries to get a timestamp if its transaction has not got one: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2104+    * Client batches the get-timestamp requests: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3172+    * Dynamic batching algorithm: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3101-L3104+* **Step 2 (Get latest commit version)**: Contacting (n-1) proxies for commit version: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1196+* **Step 3 (Confirm proxy’s liveness)**:+    * We typically set our clusters’ knob ALWAYS_CAUSAL_READ_RISKY to 1 to skip this step+    * Proxy confirm queuing system is alive: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1199+    * How is confirmEpochLive(..) implemented for the above item. https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/TagPartitionedLogSystem.actor.cpp#L1216-L1225+* **Step 4 (Locality request)**: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1312-L1313+* **Step 5 (Get data request)**:+    * Logics of handling get value request: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1306-L1396+    * Load balance algorithm: The loadBalance() at https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1342-L1344++++Write path of a transaction+================================++Suppose a client has a write-only transaction. Fig. 2 below shows the write path in a non-HA cluster. We will discuss how a transaction with both read and write works in the next section.++.. image:: /images/FDB_write_path.png++To simplify the explanation, the steps below do not include transaction batching on proxy, which is a typical database technique to increase transaction throughput.++* **Step 1 (Client buffers write mutations):** Client buffers all writes in a transaction until commit is called on the transaction. In the rest of document, a write is also named as a mutation.+    * Client is a fat client that preprocess transactions: (a) For atomic operations, if client knows the key value, it will convert atomic operations to set operations; (b) For version stamp atomic operations, client adds extra bytes to key for the version stamp; (c) If a key has multiple operations, client coalesces them to one operation, etc.+    * How client buffers mutations: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2345-L2361+* **Step 2 (Client commits the transaction):** When a client calls commit(), it performs several operations:+    * **Step 2-1**: Add extra conflict ranges that are added by user but cannot be calculated from mutations. +    * **Step 2-2**: Get a timestamp as the transaction’s start time. The timestamp does not need causal consistency because the transaction has no read. +        * This request goes to one of proxies. The proxy will contact all other (n-1) proxies to get the most recent commit version as it does in read path. The proxy does not need to contact log systems to confirm its activeness because it does not need causal consistency.+    * **Step 2-3**:  Sends the transaction’s information to a proxy. Load balancer in client decides which proxy will be used to handle a transaction. A transaction’s information includes:+        * All of its mutations;+        * Read and write conflict range;+        * Transaction options that control a transaction’s behavior. For example, should the transaction write when the DB is locked? Shall the transaction uses the first proxy in the proxy list to commit? +    * Implementation:+        * Transaction commit function: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2895-L2899+        * Major work of commit in client side is done at here: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2784-L2868+        * Step 2-1: Add extra conflict ranges: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2826-L2828+        * Step 2-2: getReadVersion at commit which does not need external consistency because we do not have read in the transaction: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2822-L2823+        * Step 2-3: Send transaction to a proxy via RPC: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2691-L2700+* When a proxy receives clients’ transactions, it commits the transaction on behalf of clients with Step 3 - 9.+* **Step 3 (Proxy gets commit timestamp)**: The proxy gets the timestamp of the transaction’s commit time from the time oracle through an RPC call.+    * To improve transaction throughput and reduce network communication overhead, each proxy dynamically batch transactions and process transactions in batches. A proxy keeps batching transactions until the batch time exceeds a configurable timeout value or until the number of transactions exceed a configurable value or until the total bytes of the batch exceeds a dynamically calculated desired size. +    * The network overhead is 1 network communication per batch of commit transactions;+    * How is the dynamically calculated batch size calculated: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1770-L1774+    * How commit transactions are batched: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L416-L486+    * How each transaction batch is handled: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L523-L1174+    * Where does proxy sends commit timestamp request to the timestamp generator:  https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L586-L587+* **Step 4 (Proxy builds transactions’ conflict ranges)**: Because the concurrency control component may have multiple processes, each of which is responsible for resolving conflicts in a key range, the proxy needs to build one transaction-conflict-resolution request for each concurrency control process: For each transaction, the proxy splits its read and write conflict ranges based on concurrency control process’ responsible ranges. The proxy will create k conflict resolution requests for each transaction, where k is the number of processes in the concurrency control component. +    * Implementation: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbserver/MasterProxyServer.actor.cpp#L607-L618+* **Step 5 (Proxy sends conflict resolution requests to concurrency control)**: Each concurrency control process is responsible for checking conflicts in a key range. Each process checks if the transaction has conflicts with other transactions in its  key-range. Each process returns the conflict checking result back to the proxy.+    * What is conflict range? +        * A transaction’s write conflict range includes any key and key-ranges that are modified in the transactions. +        * A transaction’s read conflict range includes any key and key-ranges that are read in the transaction. +        * Client can also use transaction options to add explicit read-conflict-range or write-conflict-range. Example: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2634-L2635+    * **Piggy-back metadata change**. If the transaction changes database’s metadata, such as locking the database, the change is considered as a special mutation and also checked for conflicts by the concurrency control component.  The primary difference between metadata mutation and normal mutations is that the metadata change must be propagated to all proxies so that all proxies have a consistent view of database’s metadata. This is achieved by piggy-backing metadata change in the reply from resolver to proxies.  +    * Implementation+        * Create conflict resolution requests for a batch of transactions: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L607-L618+        * Metadata mutations are sent from proxy to concurrency control processes: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L366-L369+* **Step 6 (Resolve conflicts among concurrent transactions)**: Each concurrency control process checks conflicts among transactions based on the theory in [1]. In a nutshell, it checks for read-write conflicts. Suppose two transactions operates on the same key. If a write transaction’s time overlaps between another read-write transaction’s start time and commit time, only one transaction can commit: the one that arrives first at all concurrency control processes will commit.+    * Implementation+        * Proxy sends conflict checking request: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L626-L629+        * Concurrency control process handles the request: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/Resolver.actor.cpp#L320-L322+* **Step 7 (Proxy’s post resolution processing)**: Once the proxy receives conflict-resolution replies from all concurrency control processes, it performs three steps+    * **Step 7-1 (Apply metadata effect caused by other proxies)**: As mentioned above, when a proxy changes database’s metadata, the metadata mutations will be propagated via the concurrency control component to other proxies. So the proxy needs to first compute and apply these metadata mutations onto the proxy’s local states. Otherwise, the proxy will operate in a different view of database’s metadata.+        * For example, if one proxy locks the database in a committed transaction at time t1, all other proxies should have seen the lock immediately after t1. Since another proxy may have transactions in flight already at t1, the proxy must first apply the “lock“ effect before it can process its in-flight transactions.+        * How metadata effect is applied in implementation:  https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L678-L719+    * **Step 7-2 (Determine which transactions are committed)**: Proxy combines results from all concurrency control processes. Only if all concurrency control processes say a transaction is committed, will the transaction be considered as committed by the proxy. +        * Implementation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L721-L757+    * **Step 7-3 (Apply metadata effect caused by this proxy)**: For each committed transaction, this proxy applies its metadata mutations to the proxy’s local state. +        * Note: These metadata mutations are also sent to concurrency control processes and propagated to other proxies at Step 5. This step is to apply metadata effect on its own proxy’s states.+        * Implementation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L763-L777+    * **Step 7-4 (Assign mutations to storage servers and serialize them)**: In order to let the rest of system (the queuing system and storage system) know which process a mutation should be routed to, the proxy needs to add tags to mutations. The proxy serializes mutations with the same tag into the same message and sends the serialized message to the queuing system.+        * Implementation of adding tags and serializing mutations into messages: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L800-L910+        * The lines that add tags to a mutation and serialize it: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L846-L847+    * **Step 7-5 (Duplicate and serialize mutations to backup system keyspace)**:  When backup or disaster recovery (DR) is enabled, each proxy captures mutation streams into a dedicated system keyspace. Mutations in a transaction batch are serialized as a single mutation in a dedicated system keyspace. +        * How mutations are duplicated for backup and DR: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L912-L986+        * Note: FDB will have a new backup system that avoids duplicating mutations to the system keyspace. Its design is similar to database’s Change Data Capture (CDC) design. The new backup system is not production-ready yet.+* **Step 8 (Make mutation messages durable in the queuing system)**: Proxy sends serialized mutation messages to the queuing system. The queuing system will append the mutation to an append-only file, fsync it, and send the respnose back. Each message has a tag, which decides which process in the queuing system the message should be sent to. The queuing system returns to the proxy the minimum known committed version, which is the smallest commit version among all proxies. The minimum known commit version is used when the system recovers from fault.+    * Sending messages to the queuing system is abstracted into a push() operation: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1045+    * The minimum known committed version is called minKnownCommittedVersion. It is updated for each commit: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1067+* **Step 9 (Reply to client)**: Proxy replies the transaction’s result to client. If the transaction fails (say due to transaction conflicts), proxy sends the error message to the client. +    * Reply to clients based on different transaction’s results: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/MasterProxyServer.actor.cpp#L1117-L1138+* **Step 10 (Storage systems pull data from queuing system)**: Storage system asynchronously pulls data from queuing system and indexes data for read path. +    * Each SS has a primary process (called primary tLog) in the queuing system to pull data from the SS’s data from the queuing system. Each SS only gets in-ordered streams of mutations that are owned by the SS. +    * In failure scenario when a SS cannot reach the primary tLog, the SS will pull data from different tLogs that have part of the SS’s data. The SS will then merge the stream of data from different tLogs.+    * Each SS does not make its pulled data durable to disk until the data becomes at least 5 seconds older than the most recent data the SS has pulled. This allows each SS to roll back at least 5 seconds of mutations. +    * Why do we need roll back feature for SS? This comes from an optimization used in FDB. To make a mutation available in a SS as soon as possible, a SS may fetch a mutation from the queuing system that has not been fully replicated. The mutation’s transaction may be aborted in rare situations, such as when FDB has to recover from faults and decides to throw away the last few non-fully-durable transactions. SSes must throw away data in the aborted transactions. +    * Why does SS not make data durable until 5 seconds later? This is because today’s SS does not support rolling back data that has already been made durable on disk. To support roll back, SS keeps data that might be rolled back in memory. When roll-back is needed, SS just throws away the in-memory data. This simplifies the SS implementation.+++    * Each storage process pulls data from the queuing system: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbserver/storageserver.actor.cpp#L3593-L3599++Read write path of a transaction+====================================++This section uses an example transaction to describe how a transaction with both read and write operation works in FDB.++Suppose application creates the following transaction, where *Future<int>* is an object that holds an asynchronous call and becomes ready when the async call returns, and *wait()* is a synchronous point when the code waits for futures to be ready.+The following code reads key k1 and k2 from database,  increases k1’s value by 1 and write back k1’s new value into database.++**Example Transaction** ::++    Line1: Transaction tr;+    Line2: Future<int> fv1 = tr.get(k1);+    Line3: Future<int> fv2 = tr.get(k2);+    Line4: v1 = wait(fv1);+    Line5: v2 = wait(fv2);+    Line6: tr.set(v1+v2);+    Line7: tr.commit();++The transaction starts with the read path:++* When tr.get() is called, FDB client issues a timestamp request to proxies *if* the transaction has not set its start timestamp. The logic is the Step 1 in the read path;+* Batching timestamp requests. When another tr.get() is called, it will try to get a timestamp as well. If we let every get request to follow the Step 1 in the read path, the performance overhead (especially network communication) will be a lot. In addition, this is not necessary because a transaction has only one start timestamp. To solve this problem, client chooses to batch timestamp requests from the same transaction and only issues one timestamp request when the transaction size reaches a preconfigured threshold or when the transaction duration reaches the batching timeout threshold. +    * Timestamp requests are batched: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbclient/NativeAPI.actor.cpp#L3185
    * Timestamp requests are batched: https://github.com/apple/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbclient/NativeAPI.actor.cpp#L3185
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. +    * Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.+    * Load balance: Each data exists on k storage servers, where k is the replication factor. To balance the load across the k replicas, client has a load balancing algorithm to balance the number of requests to each replica.+    * Transaction succeed: If the storage server has the data at the read timestamp, the client will receive the data and return succeed. +    * Transaction too old error: If the read request’s timestamp is older than 5 seconds, storage server may have already flushed the data from its in-memory multi-version data structure to its on-disk single-version data structure. This means storage server does not have the data older than 5 seconds. So client will receive transaction too old error. The client will retry with a new timestamp. One scenario that can lead to the error is when it takes too long for a client to send the read request after it gets the timestamp.+    * Future transaction error: Each storage server pulls data in increasing order of data’s timestamp from the queuing system. Let’s define a storage server’s timestamp as the largest timestamp of data the storage server has. If the read request’s timestamp is larger than the storage server’s timestamp, the storage server will reply future-transaction-error to the client. The client will retry. One scenario that can lead to the error is when the connection between the SS and the queuing system is slow.+    * Wrong shard error: If keys in the request or result depend on data outside this storage server OR if a large selector offset prevents all data from being read in one range read. Client will invalidate its locality cache for the key and retry the read request at the failed key.++Implementation of FDB read path+------------------------------------------++* **Step 1 (Timestamp request)**: +    * Each read request tries to get a timestamp if its transaction has not got one: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L2104+    * Client batches the get-timestamp requests: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3172+    * Dynamic batching algorithm: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L3101-L3104+* **Step 2 (Get latest commit version)**: Contacting (n-1) proxies for commit version: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1196+* **Step 3 (Confirm proxy’s liveness)**:+    * We typically set our clusters’ knob ALWAYS_CAUSAL_READ_RISKY to 1 to skip this step+    * Proxy confirm queuing system is alive: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1199+    * How is confirmEpochLive(..) implemented for the above item. https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/TagPartitionedLogSystem.actor.cpp#L1216-L1225+* **Step 4 (Locality request)**: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1312-L1313+* **Step 5 (Get data request)**:+    * Logics of handling get value request: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1306-L1396+    * Load balance algorithm: The loadBalance() at https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbclient/NativeAPI.actor.cpp#L1342-L1344++++Write path of a transaction+================================++Suppose a client has a write-only transaction. Fig. 2 below shows the write path in a non-HA cluster. We will discuss how a transaction with both read and write works in the next section.++.. image:: /images/FDB_write_path.png++To simplify the explanation, the steps below do not include transaction batching on proxy, which is a typical database technique to increase transaction throughput.++* **Step 1 (Client buffers write mutations):** Client buffers all writes in a transaction until commit is called on the transaction. In the rest of document, a write is also named as a mutation.+    * Client is a fat client that preprocess transactions: (a) For atomic operations, if client knows the key value, it will convert atomic operations to set operations; (b) For version stamp atomic operations, client adds extra bytes to key for the version stamp; (c) If a key has multiple operations, client coalesces them to one operation, etc.+    * How client buffers mutations: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2345-L2361+* **Step 2 (Client commits the transaction):** When a client calls commit(), it performs several operations:+    * **Step 2-1**: Add extra conflict ranges that are added by user but cannot be calculated from mutations. +    * **Step 2-2**: Get a timestamp as the transaction’s start time. The timestamp does not need causal consistency because the transaction has no read. +        * This request goes to one of proxies. The proxy will contact all other (n-1) proxies to get the most recent commit version as it does in read path. The proxy does not need to contact log systems to confirm its activeness because it does not need causal consistency.+    * **Step 2-3**:  Sends the transaction’s information to a proxy. Load balancer in client decides which proxy will be used to handle a transaction. A transaction’s information includes:+        * All of its mutations;+        * Read and write conflict range;+        * Transaction options that control a transaction’s behavior. For example, should the transaction write when the DB is locked? Shall the transaction uses the first proxy in the proxy list to commit? +    * Implementation:+        * Transaction commit function: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2895-L2899+        * Major work of commit in client side is done at here: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2784-L2868+        * Step 2-1: Add extra conflict ranges: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2826-L2828+        * Step 2-2: getReadVersion at commit which does not need external consistency because we do not have read in the transaction: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2822-L2823+        * Step 2-3: Send transaction to a proxy via RPC: https://github.com/apple/foundationdb/blob/07e354c499158630d760283aa845440cbeaaa1ca/fdbclient/NativeAPI.actor.cpp#L2691-L2700+* When a proxy receives clients’ transactions, it commits the transaction on behalf of clients with Step 3 - 9.+* **Step 3 (Proxy gets commit timestamp)**: The proxy gets the timestamp of the transaction’s commit time from the time oracle through an RPC call.+    * To improve transaction throughput and reduce network communication overhead, each proxy dynamically batch transactions and process transactions in batches. A proxy keeps batching transactions until the batch time exceeds a configurable timeout value or until the number of transactions exceed a configurable value or until the total bytes of the batch exceeds a dynamically calculated desired size. +    * The network overhead is 1 network communication per batch of commit transactions;+    * How is the dynamically calculated batch size calculated: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L1770-L1774+    * How commit transactions are batched: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L416-L486+    * How each transaction batch is handled: https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L523-L1174+    * Where does proxy sends commit timestamp request to the timestamp generator:  https://github.com/apple/foundationdb/blob/4b0fba6ea89b51b82df7868ca24b81f6997db4e4/fdbserver/MasterProxyServer.actor.cpp#L586-L587+* **Step 4 (Proxy builds transactions’ conflict ranges)**: Because the concurrency control component may have multiple processes, each of which is responsible for resolving conflicts in a key range, the proxy needs to build one transaction-conflict-resolution request for each concurrency control process: For each transaction, the proxy splits its read and write conflict ranges based on concurrency control process’ responsible ranges. The proxy will create k conflict resolution requests for each transaction, where k is the number of processes in the concurrency control component. +    * Implementation: https://github.com/xumengpanda/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbserver/MasterProxyServer.actor.cpp#L607-L618
    * Implementation: https://github.com/apple/foundationdb/blob/4086e3a2750b776cc8bfb0f0e463fe00ac905595/fdbserver/MasterProxyServer.actor.cpp#L607-L618
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.

Instead of saying 6.4+, maybe use post FDB 6.3, because we might just release 7.0.

xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. 
* Store the configuration of the transaction system, which includes information such as generations of queuing systems and their processes. The system used to be zookeeper. FDB later replaced it with its own implementation. 

Information of stateless processes such as proxies and resolvers, is not stored.

xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. 
* **Step 5 (Get data request)**: The client uses the location information from step 4 to directly query keys from corresponding storage servers. 
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.
* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still a valid proxy (i.e., not replaced by a newer generation proxy process). This is based on the FDB property that at most one active queuing system is available at any given time.
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.++**Proxies.** It is a subsystem that acts like reverse proxies to serve clients’ requests. Its main purposes is:++* Serve for read request by (1) serving the logical time to client; and (2) providing which storage server has data for a key;+* Process write transactions on behalf of clients and return the results;++Each proxy has the system’s metadata, called transaction state store (txnStateStore). The metadata decides: (1) which key should go to which storage servers in the storage system; (2) which key should go to which processes in the durable queuing system; (3) is the database locked; etc. ++The metadata on all proxies are consistent at any given timestamp.  To achieve that, when a proxy has a metadata mutation that changes the metadata at the timestamp V1, the mutation is propagated to all proxies (through the concurrency control component) and its effect is applied on all proxies before any proxy can process transactions after the timestamp V1.++**Durable queuing system.** It is a queuing system for write traffic. Its producers are proxies that send transaction mutation data for durability purpose. Its consumers are storage systems that index data and serve read request. The queuing system is partitioned for the key-space. A shard (i.e., key-range) is mapped to *k* log processes in the queuing system, where *k* is the replication factor. The mapping between shard and storage servers decides the mapping between shard and log processes.++**Storage system.** It is a collection of storage servers (SS), each of which is a sqlite database running on a single thread. It indexes data and serves read requests. Each SS has an in-memory p-tree data structure that stores the past 5-second mutations and an on-disk sqlite data. The in-memory data structure can serve multiple versions of key-values in the past 5 seconds. Due to memory limit, the in-memory data cannot hold more than 5 seconds’ multi-version key-values, which is the root cause why FDB’s transactions cannot be longer than 5 seconds. The on-disk sqlite data has only the most-recent key-value. ++**Zookeeper like system.** The system solves two main problems:++* Store the configuration of the transaction system, which includes information such as which processes are proxies, which processes belong to the queuing system, and which processes belong to the concurrency control, etc.  The system used to be zookeeper. FDB later replaced it with its own implementation. +* Service discovery. Processes in the zookeeper-like system serve as well-known endpoints for clients to connect to the cluster. These well-known endpoint returns the list of proxies to clients.++++Read path of a transaction+==================================++Fig. 1 above shows a high-level view of the read path. An application uses FDB client library to read data. It creates a transaction and calls its read() function. The read() operation will lead to several steps.++* **Step 1 (Timestamp request)**: The read operation needs a timestamp. The client initiates the timestamp request through an RPC to proxy. The request will trigger Step 2 and Step 3;+    * To improve throughput and reduce load on the server side, each client dynamically batches the timestamp requests. A client keeps adding requests to the current batch until *when* the number of requests in a batch exceeds a configurable threshold or *when* the batching times out at a dynamically computed threshold. Each batch sends only one timestamp request to proxy and all requests in the same batch share the same timestamp.+* **Step 2 ****(Get latest commit version)**: When the timestamp request arrives at a proxy, the proxy wants to get the largest commit version as the return value. So it contacts the rest of (n-1) proxies for their latest commit versions and uses the largest one as the return value for Step 1. +    * O(n^2) communication cost: Because each proxy needs to contact the rest of (n-1) proxies to serve clients’ timestamp request, the communication cost is n*(n-1), where n is the number of proxies;+    * Batching: To reduce communication cost, each proxy batches clients’ timestamp requests for a configurable time period (say 1ms) and return the same timestamp for requests in the same batch.+* **Step 3 (Confirm proxy’s liveness)**: To prevent proxies that are no longer a part of the system (such as due to network partition) from serving requests, each proxy contacts the queuing system for each timestamp request to confirm it is still valid proxies. This is based on the FDB property that at most one active queuing system is available at any given time.+    * Why do we need this step? This is to achieve consensus (i.e., external consistency). Compared to serializable isolation, Strict Serializable Isolation (SSI) requires external consistency. It means the timestamp received by clients cannot decrease. If we do not have step and network partition happens, a set of old proxies that are disconnected from the rest of systems can still serve timestamp requests to clients. These timestamps can be smaller than the new generation of proxies, which breaks the external consistency in SSI.+    * O(n * m) communication cost: To confirm a proxy’s liveness, the proxy has to contact all members in the queuing system to ensure the queuing system is still active. This causes *m* network communication, where *m* is the number of processes in the queuing system. A system with n proxies will have O(n * m) network communications at the step 3. In our deployment, n is typically equal to m;+    * Do FDB production clusters have this overhead? No. Our production clusters disable the external consistency by configuring the knob ALWAYS_CAUSAL_READ_RISKY.+* **Step 4 (Locality request)**: The client gets which storage servers have its requested keys by sending another RPC to proxy. This step returns a set of  *k* storage server interfaces, where k is the replication factor;+    * Client cache mechanism: The key location will be cached in client. Future requests will use the cache to directly read from storage servers, which saves a trip to proxy. If location is stale, read will return error and client will retry and refresh the cache.++* **Step 5 (Get data request)**: The client uses the interfaces from step 4 to directly get its keys from those storage servers. +    * Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.+    * Load balance: Each data exists on k storage servers, where k is the replication factor. To balance the load across the k replicas, client has a load balancing algorithm to balance the number of requests to each replica.+    * Transaction succeed: If the storage server has the data at the read timestamp, the client will receive the data and return succeed. 
    * Successful read: If the storage server has the data at the read timestamp, the client will receive the data successfully.
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.
We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. In the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.
xumengpanda

comment created time in a day

Pull request review commentapple/foundationdb

Documentation: How FDB read and write path works in FDB 6.2

+##############################+FDB Read and Write Path (WiP)+##############################++| Author: Meng Xu+| Reviewer: Evan Tschannen, Jingyu Zhou+| Audience: FDB developers, SREs and expert users.++This document explains how FDB works at high level in database terms without mentioning FDB internal concepts.++We first discuss the read path and the write path separately for a single transaction. We then describe how the read path and write path work together for a read and write transaction. At the last section, we illustrate how multiple outstanding write transactions are processed and *ordered* in FDB. The processing order of multiple transactions is important because it affects the parallelism of transaction processing and the write throughput.++The content is based on FDB 6.2 and is true for FDB 6.3.  A new timestamp proxy role is introduced in FDB 6.4+, which affects the read path. We will discuss the timestamp proxy role in the future version of this document.++.. image:: /images/FDB_read_path.png++Components+=================++FDB is built on top of several key components. The terms below are common database or distributed system terms, instead of FDB specific terms.++**Timestamp generator.** It serves logical time,  which defines happen-before relation: An event at t1 happens before another event at t2, if t1 < t2. The logic time is used to order events in FDB distributed systems and it is used by concurrency control to decide if two transactions have conflicts.  The logical time is the timestamp for a transaction. ++* A read-only transaction has only one timestamp which is assigned when the transaction is created;+* A read-write transaction has one timestamp at the transaction’s creation time and one timestamp at its commit time.+++**Concurrency Control.** It decides if two transactions can be executed concurrently without violating Strict Serializable Isolation (SSI) property.  It uses the Optimistic Concurrency Control (OCC) mechanism described in [SSI] to achieve that. ++**Client.** It is a library, an FDB application (e.g., CK) uses, to access the database. It exposes the transaction concept to applications.  Client in FDB is a *fat* client that does multiple complex operations: (1) It calculates read and write conflict ranges for transactions; (2) it batches transaction requests for better throughput; (3) it automatically retries failed transactions.

s/CK/CloudKit/ or simply remove it.

"batches transaction requests" seems confusing to me. AFAIK, client doesn't batch many transactions into one request. Maybe you mean commit will batch many operations into one request?

Maybe mention more functionality such as: storage/shard location cache, read-your-write cache.

xumengpanda

comment created time in a day

delete branch hairyhenderson/gomplate

delete branch : dependabot/go_modules/github.com/aws/aws-sdk-go-1.35.33

delete time in 2 days

PR closed hairyhenderson/gomplate

Reviewers
chore(deps): bump github.com/aws/aws-sdk-go from 1.35.14 to 1.35.33 dependencies go

Bumps github.com/aws/aws-sdk-go from 1.35.14 to 1.35.33. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/aws/aws-sdk-go/releases">github.com/aws/aws-sdk-go's releases</a>.</em></p> <blockquote> <h1>Release v1.35.33 (2020-11-20)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/appmesh</code>: Updates service API</li> <li><code>service/chime</code>: Updates service API, documentation, and paginators <ul> <li>The Amazon Chime SDK for messaging provides the building blocks needed to build chat and other real-time collaboration features.</li> </ul> </li> <li><code>service/cloudhsmv2</code>: Updates service API and documentation <ul> <li>Added managed backup retention, a feature that enables customers to retain backups for a configurable period after which CloudHSM service will automatically delete them.</li> </ul> </li> <li><code>service/codeguru-reviewer</code>: Updates service API and documentation</li> <li><code>service/cognito-identity</code>: Updates service API and paginators</li> <li><code>service/connect</code>: Updates service API, documentation, and paginators</li> <li><code>service/kafka</code>: Updates service API and documentation</li> <li><code>service/macie2</code>: Updates service API and documentation</li> <li><code>service/s3</code>: Updates service API, documentation, and examples <ul> <li>Add new documentation regarding automatically generated Content-MD5 headers when using the SDK or CLI.</li> </ul> </li> <li><code>service/servicecatalog-appregistry</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.32 (2020-11-19)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/autoscaling</code>: Updates service API and documentation <ul> <li>You can now create Auto Scaling groups with multiple launch templates using a mixed instances policy, making it easy to deploy an AMI with an architecture that is different from the rest of the group.</li> </ul> </li> <li><code>service/ce</code>: Updates service API and documentation</li> <li><code>service/ds</code>: Updates service API and documentation <ul> <li>Adding multi-region replication feature for AWS Managed Microsoft AD</li> </ul> </li> <li><code>service/eventbridge</code>: Updates service API and documentation</li> <li><code>service/events</code>: Updates service API and documentation <ul> <li>EventBridge now supports Resource-based policy authorization on event buses. This enables cross-account PutEvents API calls, creating cross-account rules, and simplifies permission management.</li> </ul> </li> <li><code>service/glue</code>: Updates service API, documentation, and paginators <ul> <li>Adding support for Glue Schema Registry. The AWS Glue Schema Registry is a new feature that allows you to centrally discover, control, and evolve data stream schemas.</li> </ul> </li> <li><code>service/kinesisanalyticsv2</code>: Updates service API and documentation</li> <li><code>service/lambda</code>: Updates service API and documentation <ul> <li>Added the starting position and starting position timestamp to ESM Configuration. Now customers will be able to view these fields for their ESM.</li> </ul> </li> <li><code>service/lex-models</code>: Updates service API and documentation</li> <li><code>service/medialive</code>: Updates service API and documentation <ul> <li>The AWS Elemental MediaLive APIs and SDKs now support the ability to see the software update status on Link devices</li> </ul> </li> <li><code>service/redshift</code>: Updates service API, documentation, and paginators <ul> <li>Amazon Redshift support for returning ClusterNamespaceArn in describeClusters</li> </ul> </li> <li><code>service/runtime.lex</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.31 (2020-11-18)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/backup</code>: Updates service API and documentation</li> <li><code>service/cloudformation</code>: Updates service API and documentation <ul> <li>This release adds ChangeSets support for Nested Stacks. ChangeSets offer a preview of how proposed changes to a stack might impact existing resources or create new ones.</li> </ul> </li> <li><code>service/codebuild</code>: Updates service API and documentation</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/aws/aws-sdk-go/blob/master/CHANGELOG.md">github.com/aws/aws-sdk-go's changelog</a>.</em></p> <blockquote> <h1>Release v1.35.33 (2020-11-20)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/appmesh</code>: Updates service API</li> <li><code>service/chime</code>: Updates service API, documentation, and paginators <ul> <li>The Amazon Chime SDK for messaging provides the building blocks needed to build chat and other real-time collaboration features.</li> </ul> </li> <li><code>service/cloudhsmv2</code>: Updates service API and documentation <ul> <li>Added managed backup retention, a feature that enables customers to retain backups for a configurable period after which CloudHSM service will automatically delete them.</li> </ul> </li> <li><code>service/codeguru-reviewer</code>: Updates service API and documentation</li> <li><code>service/cognito-identity</code>: Updates service API and paginators</li> <li><code>service/connect</code>: Updates service API, documentation, and paginators</li> <li><code>service/kafka</code>: Updates service API and documentation</li> <li><code>service/macie2</code>: Updates service API and documentation</li> <li><code>service/s3</code>: Updates service API, documentation, and examples <ul> <li>Add new documentation regarding automatically generated Content-MD5 headers when using the SDK or CLI.</li> </ul> </li> <li><code>service/servicecatalog-appregistry</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.32 (2020-11-19)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/autoscaling</code>: Updates service API and documentation <ul> <li>You can now create Auto Scaling groups with multiple launch templates using a mixed instances policy, making it easy to deploy an AMI with an architecture that is different from the rest of the group.</li> </ul> </li> <li><code>service/ce</code>: Updates service API and documentation</li> <li><code>service/ds</code>: Updates service API and documentation <ul> <li>Adding multi-region replication feature for AWS Managed Microsoft AD</li> </ul> </li> <li><code>service/eventbridge</code>: Updates service API and documentation</li> <li><code>service/events</code>: Updates service API and documentation <ul> <li>EventBridge now supports Resource-based policy authorization on event buses. This enables cross-account PutEvents API calls, creating cross-account rules, and simplifies permission management.</li> </ul> </li> <li><code>service/glue</code>: Updates service API, documentation, and paginators <ul> <li>Adding support for Glue Schema Registry. The AWS Glue Schema Registry is a new feature that allows you to centrally discover, control, and evolve data stream schemas.</li> </ul> </li> <li><code>service/kinesisanalyticsv2</code>: Updates service API and documentation</li> <li><code>service/lambda</code>: Updates service API and documentation <ul> <li>Added the starting position and starting position timestamp to ESM Configuration. Now customers will be able to view these fields for their ESM.</li> </ul> </li> <li><code>service/lex-models</code>: Updates service API and documentation</li> <li><code>service/medialive</code>: Updates service API and documentation <ul> <li>The AWS Elemental MediaLive APIs and SDKs now support the ability to see the software update status on Link devices</li> </ul> </li> <li><code>service/redshift</code>: Updates service API, documentation, and paginators <ul> <li>Amazon Redshift support for returning ClusterNamespaceArn in describeClusters</li> </ul> </li> <li><code>service/runtime.lex</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.31 (2020-11-18)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/backup</code>: Updates service API and documentation</li> <li><code>service/cloudformation</code>: Updates service API and documentation <ul> <li>This release adds ChangeSets support for Nested Stacks. ChangeSets offer a preview of how proposed changes to a stack might impact existing resources or create new ones.</li> </ul> </li> <li><code>service/codebuild</code>: Updates service API and documentation</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/aws/aws-sdk-go/commit/b6ab7f8d2ef9cce9ffe55475af0aae9445e4ec98"><code>b6ab7f8</code></a> Release v1.35.33 (2020-11-20)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/c55377df3794ec871f3f1489d9f67dc917b279ff"><code>c55377d</code></a> Release v1.35.32 (2020-11-19) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3645">#3645</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/8d7ca442033a15495e5db75c6b04b7990d55f4d2"><code>8d7ca44</code></a> Release v1.35.31 (2020-11-18) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3642">#3642</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/334015214253f83309e2c6a49d46e8b034c9ab1a"><code>3340152</code></a> Update golang.org/x/net dependency (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3638">#3638</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/92126a99ecca63a08a18a15f8469dec2c8d4c319"><code>92126a9</code></a> Release v1.35.30 (2020-11-17) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3639">#3639</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/b252e240226dce08605dc821535339fec46d6de4"><code>b252e24</code></a> Release v1.35.29 (2020-11-16) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3636">#3636</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/92ba103b26312a869efac5d3735a06c26d1c04c7"><code>92ba103</code></a> Release v1.35.28 (2020-11-13) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3635">#3635</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/0a625f5c400ca14ba82a19df32c8474caa4e0c4e"><code>0a625f5</code></a> Release v1.35.27 (2020-11-12) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3634">#3634</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/1ac999299741b437843177406c7a20c58a1769f8"><code>1ac9992</code></a> Release v1.35.26 (2020-11-11) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3633">#3633</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/77c1650830fc9634a499b8f93ad276e706b40a3e"><code>77c1650</code></a> Release v1.35.25 (2020-11-10) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3632">#3632</a>)</li> <li>Additional commits viewable in <a href="https://github.com/aws/aws-sdk-go/compare/v1.35.14...v1.35.33">compare view</a></li> </ul> </details> <br />

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in the .dependabot/config.yml file in this repo:

  • Update frequency
  • Automerge options (never/patch/minor, and dev/runtime dependencies)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)

</details>

+5 -1

1 comment

2 changed files

dependabot-preview[bot]

pr closed time in 2 days

PR opened hairyhenderson/gomplate

chore(deps): bump github.com/aws/aws-sdk-go from 1.35.14 to 1.35.35

Bumps github.com/aws/aws-sdk-go from 1.35.14 to 1.35.35. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/aws/aws-sdk-go/releases">github.com/aws/aws-sdk-go's releases</a>.</em></p> <blockquote> <h1>Release v1.35.35 (2020-11-24)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/appflow</code>: Updates service API and documentation</li> <li><code>service/batch</code>: Updates service API and documentation <ul> <li>Add Ec2Configuration in ComputeEnvironment.ComputeResources. Use in CreateComputeEnvironment API to enable AmazonLinux2 support.</li> </ul> </li> <li><code>service/cloudformation</code>: Updates service API and documentation <ul> <li>Adds support for the new Modules feature for CloudFormation. A module encapsulates one or more resources and their respective configurations for reuse across your organization.</li> </ul> </li> <li><code>service/cloudtrail</code>: Updates service API and documentation <ul> <li>CloudTrail now includes advanced event selectors, which give you finer-grained control over the events that are logged to your trail.</li> </ul> </li> <li><code>service/codebuild</code>: Updates service API and documentation <ul> <li>Adding GetReportGroupTrend API for Test Reports.</li> </ul> </li> <li><code>service/cognito-idp</code>: Updates service API and documentation</li> <li><code>service/comprehend</code>: Updates service API, documentation, and paginators</li> <li><code>service/elasticbeanstalk</code>: Updates service API and documentation <ul> <li>Updates the Integer constraint of DescribeEnvironmentManagedActionHistory's MaxItems parameter to [1, 100].</li> </ul> </li> <li><code>service/fsx</code>: Updates service API and documentation</li> <li><code>service/gamelift</code>: Updates service API and documentation <ul> <li>GameLift FlexMatch is now available as a standalone matchmaking solution. FlexMatch now provides customizable matchmaking for games hosted peer-to-peer, on-premises, or on cloud compute primitives.</li> </ul> </li> <li><code>service/iotsitewise</code>: Updates service API and documentation</li> <li><code>service/lex-models</code>: Updates service API</li> <li><code>service/mediaconvert</code>: Updates service API and documentation <ul> <li>AWS Elemental MediaConvert SDK has added support for Vorbis and Opus audio in OGG/OGA containers.</li> </ul> </li> <li><code>service/mwaa</code>: Adds new service</li> <li><code>service/quicksight</code>: Updates service API and documentation <ul> <li>Support for embedding without user registration. New enum EmbeddingIdentityType. A potential breaking change. Affects code that refers IdentityType enum type directly instead of literal string value.</li> </ul> </li> <li><code>service/states</code>: Updates service API and documentation <ul> <li>This release of the AWS Step Functions SDK introduces support for Synchronous Express Workflows</li> </ul> </li> <li><code>service/timestream-write</code>: Updates service API and documentation</li> <li><code>service/transcribe-streaming</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.34 (2020-11-23)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/application-insights</code>: Updates service API and documentation</li> <li><code>service/autoscaling</code>: Updates service documentation <ul> <li>Documentation updates and corrections for Amazon EC2 Auto Scaling API Reference and SDKs.</li> </ul> </li> <li><code>service/codeartifact</code>: Updates service API and documentation</li> <li><code>service/codestar-connections</code>: Updates service API and documentation</li> <li><code>service/dynamodb</code>: Updates service API and documentation <ul> <li>With this release, you can capture data changes in any Amazon DynamoDB table as an Amazon Kinesis data stream. You also can use PartiQL (SQL-compatible language) to manipulate data in DynamoDB tables.</li> </ul> </li> <li><code>service/ec2</code>: Updates service API and documentation <ul> <li>This release adds support for Multiple Private DNS names to DescribeVpcEndpointServices response.</li> </ul> </li> <li><code>service/ecs</code>: Updates service API and documentation <ul> <li>This release adds support for updating capacity providers, specifying custom instance warmup periods for capacity providers, and using deployment circuit breaker for your ECS Services.</li> </ul> </li> <li><code>service/elasticache</code>: Updates service documentation <ul> <li>Documentation updates for elasticache</li> </ul> </li> <li><code>service/elasticmapreduce</code>: Updates service API, documentation, and paginators</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/aws/aws-sdk-go/blob/master/CHANGELOG.md">github.com/aws/aws-sdk-go's changelog</a>.</em></p> <blockquote> <h1>Release v1.35.35 (2020-11-24)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/appflow</code>: Updates service API and documentation</li> <li><code>service/batch</code>: Updates service API and documentation <ul> <li>Add Ec2Configuration in ComputeEnvironment.ComputeResources. Use in CreateComputeEnvironment API to enable AmazonLinux2 support.</li> </ul> </li> <li><code>service/cloudformation</code>: Updates service API and documentation <ul> <li>Adds support for the new Modules feature for CloudFormation. A module encapsulates one or more resources and their respective configurations for reuse across your organization.</li> </ul> </li> <li><code>service/cloudtrail</code>: Updates service API and documentation <ul> <li>CloudTrail now includes advanced event selectors, which give you finer-grained control over the events that are logged to your trail.</li> </ul> </li> <li><code>service/codebuild</code>: Updates service API and documentation <ul> <li>Adding GetReportGroupTrend API for Test Reports.</li> </ul> </li> <li><code>service/cognito-idp</code>: Updates service API and documentation</li> <li><code>service/comprehend</code>: Updates service API, documentation, and paginators</li> <li><code>service/elasticbeanstalk</code>: Updates service API and documentation <ul> <li>Updates the Integer constraint of DescribeEnvironmentManagedActionHistory's MaxItems parameter to [1, 100].</li> </ul> </li> <li><code>service/fsx</code>: Updates service API and documentation</li> <li><code>service/gamelift</code>: Updates service API and documentation <ul> <li>GameLift FlexMatch is now available as a standalone matchmaking solution. FlexMatch now provides customizable matchmaking for games hosted peer-to-peer, on-premises, or on cloud compute primitives.</li> </ul> </li> <li><code>service/iotsitewise</code>: Updates service API and documentation</li> <li><code>service/lex-models</code>: Updates service API</li> <li><code>service/mediaconvert</code>: Updates service API and documentation <ul> <li>AWS Elemental MediaConvert SDK has added support for Vorbis and Opus audio in OGG/OGA containers.</li> </ul> </li> <li><code>service/mwaa</code>: Adds new service</li> <li><code>service/quicksight</code>: Updates service API and documentation <ul> <li>Support for embedding without user registration. New enum EmbeddingIdentityType. A potential breaking change. Affects code that refers IdentityType enum type directly instead of literal string value.</li> </ul> </li> <li><code>service/states</code>: Updates service API and documentation <ul> <li>This release of the AWS Step Functions SDK introduces support for Synchronous Express Workflows</li> </ul> </li> <li><code>service/timestream-write</code>: Updates service API and documentation</li> <li><code>service/transcribe-streaming</code>: Updates service API and documentation</li> </ul> <h1>Release v1.35.34 (2020-11-23)</h1> <h3>Service Client Updates</h3> <ul> <li><code>service/application-insights</code>: Updates service API and documentation</li> <li><code>service/autoscaling</code>: Updates service documentation <ul> <li>Documentation updates and corrections for Amazon EC2 Auto Scaling API Reference and SDKs.</li> </ul> </li> <li><code>service/codeartifact</code>: Updates service API and documentation</li> <li><code>service/codestar-connections</code>: Updates service API and documentation</li> <li><code>service/dynamodb</code>: Updates service API and documentation <ul> <li>With this release, you can capture data changes in any Amazon DynamoDB table as an Amazon Kinesis data stream. You also can use PartiQL (SQL-compatible language) to manipulate data in DynamoDB tables.</li> </ul> </li> <li><code>service/ec2</code>: Updates service API and documentation <ul> <li>This release adds support for Multiple Private DNS names to DescribeVpcEndpointServices response.</li> </ul> </li> <li><code>service/ecs</code>: Updates service API and documentation <ul> <li>This release adds support for updating capacity providers, specifying custom instance warmup periods for capacity providers, and using deployment circuit breaker for your ECS Services.</li> </ul> </li> <li><code>service/elasticache</code>: Updates service documentation <ul> <li>Documentation updates for elasticache</li> </ul> </li> <li><code>service/elasticmapreduce</code>: Updates service API, documentation, and paginators</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/aws/aws-sdk-go/commit/e8a296c5261dd24ead57cfde86c076608418f8e7"><code>e8a296c</code></a> Release v1.35.35 (2020-11-24)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/8e5531c107d27bb374b420048a65bc0898582be4"><code>8e5531c</code></a> codegen: Ensure API structs don't collide with API client type name (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3651">#3651</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/540127aae7974af9fe5f83ef67dc056c28d79927"><code>540127a</code></a> Release v1.35.34 (2020-11-23) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3650">#3650</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/8e8a0d20ba2ab9ff93d418d28ff3f61c1b626e09"><code>8e8a0d2</code></a> Release v1.35.33 (2020-11-20) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3647">#3647</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/c55377df3794ec871f3f1489d9f67dc917b279ff"><code>c55377d</code></a> Release v1.35.32 (2020-11-19) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3645">#3645</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/8d7ca442033a15495e5db75c6b04b7990d55f4d2"><code>8d7ca44</code></a> Release v1.35.31 (2020-11-18) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3642">#3642</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/334015214253f83309e2c6a49d46e8b034c9ab1a"><code>3340152</code></a> Update golang.org/x/net dependency (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3638">#3638</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/92126a99ecca63a08a18a15f8469dec2c8d4c319"><code>92126a9</code></a> Release v1.35.30 (2020-11-17) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3639">#3639</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/b252e240226dce08605dc821535339fec46d6de4"><code>b252e24</code></a> Release v1.35.29 (2020-11-16) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3636">#3636</a>)</li> <li><a href="https://github.com/aws/aws-sdk-go/commit/92ba103b26312a869efac5d3735a06c26d1c04c7"><code>92ba103</code></a> Release v1.35.28 (2020-11-13) (<a href="https://github-redirect.dependabot.com/aws/aws-sdk-go/issues/3635">#3635</a>)</li> <li>Additional commits viewable in <a href="https://github.com/aws/aws-sdk-go/compare/v1.35.14...v1.35.35">compare view</a></li> </ul> </details> <br />

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in the .dependabot/config.yml file in this repo:

  • Update frequency
  • Automerge options (never/patch/minor, and dev/runtime dependencies)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)

</details>

+5 -1

0 comment

2 changed files

pr created time in 2 days

PR opened hairyhenderson/gomplate

Reviewers
chore(deps): bump github.com/hashicorp/consul/api from 1.7.0 to 1.8.0

Bumps github.com/hashicorp/consul/api from 1.7.0 to 1.8.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/hashicorp/consul/releases">github.com/hashicorp/consul/api's releases</a>.</em></p> <blockquote> <h2>v1.8.0</h2> <h2>1.8.0 (June 18, 2020)</h2> <p>BREAKING CHANGES:</p> <ul> <li>acl: Remove deprecated <code>acl_enforce_version_8</code> option [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7991">GH-7991</a>]</li> </ul> <p>FEATURES:</p> <ul> <li> <p><strong>Terminating Gateway</strong>: Envoy can now be run as a gateway to enable services in a Consul service mesh to connect to external services through their local proxy. Terminating gateways unlock several of the benefits of a service mesh in the cases where a sidecar proxy cannot be deployed alongside services such as legacy applications or managed cloud databases.</p> </li> <li> <p><strong>Ingress Gateway</strong>: Envoy can now be run as a gateway to ingress traffic into the Consul service mesh, enabling a more incremental transition for applications.</p> </li> <li> <p><strong>WAN Federation over Mesh Gateways</strong>: Allows Consul datacenters to federate by forwarding WAN gossip and RPC traffic through Mesh Gateways rather than requiring the servers to be exposed to the WAN directly.</p> </li> <li> <p><strong>JSON Web Token (JWT) Auth Method</strong>: Allows exchanging a signed JWT from a trusted external identity provider for a Consul ACL token.</p> </li> <li> <p><strong>Single Sign-On (SSO) [Enterprise]</strong>: Lets an operator configure Consul to use an external OpenID Connect (OIDC) provider to automatically handle the lifecycle of creating, distributing and managing ACL tokens for performing CLI operations or accessing the UI.</p> </li> <li> <p><strong>Audit Logging [Enterprise]</strong>: Adds instrumentation to record a trail of events (both attempted and authorized) by users of Consul’s HTTP API for purposes of regulatory compliance.</p> </li> <li> <p>acl: add DisplayName field to auth methods [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7769">GH-7769</a>]</p> </li> <li> <p>acl: add MaxTokenTTL field to auth methods [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7779">GH-7779</a>]</p> </li> <li> <p>agent/xds: add support for configuring passive health checks [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7713">GH-7713</a>]</p> </li> <li> <p>cli: Add -config flag to "acl authmethod update/create" [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7776">GH-7776</a>]</p> </li> <li> <p>ui: Help menu to provide further documentation/learn links [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7310">GH-7310</a>]</p> </li> <li> <p>ui: <strong>(Consul Enterprise only)</strong> SSO support [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7742">GH-7742</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7771">GH-7771</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7790">GH-7790</a>]</p> </li> <li> <p>ui: Support for termininating and ingress gateways [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7858">GH-7858</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7865">GH-7865</a>]</p> </li> </ul> <p>IMPROVEMENTS:</p> <ul> <li>acl: change authmethod.Validator to take a logger [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7758">GH-7758</a>]</li> <li>agent: show warning when enable_script_checks is enabled without safety net [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7437">GH-7437</a>]</li> <li>api: Added filtering support to the v1/connect/intentions endpoint. [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7478">GH-7478</a>]</li> <li>auto_encrypt: add validations for auto_encrypt.{tls,allow_tls} [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7704">GH-7704</a>]</li> <li>build: switched to compile with Go 1.14.1 [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7481">GH-7481</a>]</li> <li>config: validate system limits against limits.http_max_conns_per_client [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7434">GH-7434</a>]</li> <li>connect: support envoy 1.12.3, 1.13.1, and 1.14.1. Envoy 1.10 is no longer officially supported. [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7380">GH-7380</a>],[<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7624">GH-7624</a>]</li> <li>connect: add DNSSAN and IPSAN to cache key for ConnectCALeafRequest [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7597">GH-7597</a>]</li> <li>connect: Added a new expose CLI command for ingress gateways [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/8099">GH-8099</a>]</li> <li>license: <strong>(Consul Enterprise only)</strong> Update licensing to align with the current modules licensing structure.</li> <li>logging: catch problems with the log destination earlier by creating the file immediately [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7469">GH-7469</a>]</li> <li>proxycfg: support path exposed with non-HTTP2 protocol [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7510">GH-7510</a>]</li> <li>tls: remove old ciphers [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7282">GH-7282</a>]</li> <li>ui: Show the last 8 characters of AccessorIDs in listing views [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7327">GH-7327</a>]</li> <li>ui: Make all tabs within the UI linkable/bookmarkable and include in history [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7592">GH-7592</a>]</li> <li>ui: Redesign of all service pages [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7605">GH-7605</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7632">GH-7632</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7655">GH-7655</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7683">GH-7683</a>]</li> <li>ui: Show intentions per individual service [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7615">GH-7615</a>]</li> <li>ui: Improved login/logout flow [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7790">GH-7790</a>]</li> <li>ui: Revert search to search as you type, add sort control for the service listing page [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7489">GH-7489</a>]</li> <li>ui: Omit proxy services from the service listing view and mark services as being proxied [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7820">GH-7820</a>]</li> <li>ui: Display proxies in a proxy info tab with the service instance detail page [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7745">GH-7745</a>]</li> <li>ui: Add live updates/blocking queries to gateway listings [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7967">GH-7967</a>]</li> <li>ui: Improved 'empty states' [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7940">GH-7940</a>]</li> <li>ui: Add ability to sort services based on health [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7989">GH-7989</a>]</li> <li>ui: Add explanatory tooltip panels for gateway services [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/8048">GH-8048</a>](<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/8048">hashicorp/consul#8048</a>)</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/hashicorp/consul/blob/master/CHANGELOG.md">github.com/hashicorp/consul/api's changelog</a>.</em></p> <blockquote> <h2>1.8.0 (June 18, 2020)</h2> <p>BREAKING CHANGES:</p> <ul> <li>acl: Remove deprecated <code>acl_enforce_version_8</code> option [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7991">GH-7991</a>]</li> </ul> <p>FEATURES:</p> <ul> <li> <p><strong>Terminating Gateway</strong>: Envoy can now be run as a gateway to enable services in a Consul service mesh to connect to external services through their local proxy. Terminating gateways unlock several of the benefits of a service mesh in the cases where a sidecar proxy cannot be deployed alongside services such as legacy applications or managed cloud databases.</p> </li> <li> <p><strong>Ingress Gateway</strong>: Envoy can now be run as a gateway to ingress traffic into the Consul service mesh, enabling a more incremental transition for applications.</p> </li> <li> <p><strong>WAN Federation over Mesh Gateways</strong>: Allows Consul datacenters to federate by forwarding WAN gossip and RPC traffic through Mesh Gateways rather than requiring the servers to be exposed to the WAN directly.</p> </li> <li> <p><strong>JSON Web Token (JWT) Auth Method</strong>: Allows exchanging a signed JWT from a trusted external identity provider for a Consul ACL token.</p> </li> <li> <p><strong>Single Sign-On (SSO) [Enterprise]</strong>: Lets an operator configure Consul to use an external OpenID Connect (OIDC) provider to automatically handle the lifecycle of creating, distributing and managing ACL tokens for performing CLI operations or accessing the UI.</p> </li> <li> <p><strong>Audit Logging [Enterprise]</strong>: Adds instrumentation to record a trail of events (both attempted and authorized) by users of Consul’s HTTP API for purposes of regulatory compliance.</p> </li> <li> <p>acl: add DisplayName field to auth methods [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7769">GH-7769</a>]</p> </li> <li> <p>acl: add MaxTokenTTL field to auth methods [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7779">GH-7779</a>]</p> </li> <li> <p>agent/xds: add support for configuring passive health checks [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7713">GH-7713</a>]</p> </li> <li> <p>cli: Add -config flag to "acl authmethod update/create" [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7776">GH-7776</a>]</p> </li> <li> <p>serf: allow to restrict servers that can join a given Serf Consul cluster. [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7628">GH-7628</a>]</p> </li> <li> <p>ui: Help menu to provide further documentation/learn links [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7310">GH-7310</a>]</p> </li> <li> <p>ui: <strong>(Consul Enterprise only)</strong> SSO support [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7742">GH-7742</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7771">GH-7771</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7790">GH-7790</a>]</p> </li> <li> <p>ui: Support for termininating and ingress gateways [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7858">GH-7858</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7865">GH-7865</a>]</p> </li> </ul> <p>IMPROVEMENTS:</p> <ul> <li>acl: change authmethod.Validator to take a logger [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7758">GH-7758</a>]</li> <li>agent: show warning when enable_script_checks is enabled without safety net [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7437">GH-7437</a>]</li> <li>api: Added filtering support to the v1/connect/intentions endpoint. [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/7478">GH-7478</a>]</li> <li>auto_encrypt: add validations for auto_encrypt.{tls,allow_tls} [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7704">GH-7704</a>]</li> <li>build: switched to compile with Go 1.14.1 [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7481">GH-7481</a>]</li> <li>config: validate system limits against limits.http_max_conns_per_client [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7434">GH-7434</a>]</li> <li>connect: support envoy 1.12.3, 1.13.1, and 1.14.1. Envoy 1.10 is no longer officially supported. [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7380">GH-7380</a>],[<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7624">GH-7624</a>]</li> <li>connect: add DNSSAN and IPSAN to cache key for ConnectCALeafRequest [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7597">GH-7597</a>]</li> <li>connect: Added a new expose CLI command for ingress gateways [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/8099">GH-8099</a>]</li> <li>license: <strong>(Consul Enterprise only)</strong> Update licensing to align with the current modules licensing structure.</li> <li>logging: catch problems with the log destination earlier by creating the file immediately [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7469">GH-7469</a>]</li> <li>proxycfg: support path exposed with non-HTTP2 protocol [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7510">GH-7510</a>]</li> <li>tls: remove old ciphers [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7282">GH-7282</a>]</li> <li>ui: Show the last 8 characters of AccessorIDs in listing views [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7327">GH-7327</a>]</li> <li>ui: Make all tabs within the UI linkable/bookmarkable and include in history [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7592">GH-7592</a>]</li> <li>ui: Redesign of all service pages [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7605">GH-7605</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7632">GH-7632</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7655">GH-7655</a>] [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7683">GH-7683</a>]</li> <li>ui: Show intentions per individual service [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7615">GH-7615</a>]</li> <li>ui: Improved login/logout flow [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7790">GH-7790</a>]</li> <li>ui: Revert search to search as you type, add sort control for the service listing page [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7489">GH-7489</a>]</li> <li>ui: Omit proxy services from the service listing view and mark services as being proxied [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7820">GH-7820</a>]</li> <li>ui: Display proxies in a proxy info tab with the service instance detail page [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7745">GH-7745</a>]</li> <li>ui: Add live updates/blocking queries to gateway listings [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7967">GH-7967</a>]</li> <li>ui: Improved 'empty states' [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7940">GH-7940</a>]</li> <li>ui: Add ability to sort services based on health [<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/7989">GH-7989</a>]</li> <li>ui: Add explanatory tooltip panels for gateway services [<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/8048">GH-8048</a>](<a href="https://github-redirect.dependabot.com/hashicorp/consul/pull/8048">hashicorp/consul#8048</a>)</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/hashicorp/consul/commit/3111cb8c7df8545abaa0c96347996b5341ff625d"><code>3111cb8</code></a> Release v1.8.0</li> <li><a href="https://github.com/hashicorp/consul/commit/f3b7cf63b3e7778aa664aa93775f09d2b757a5f5"><code>f3b7cf6</code></a> update bindata_assetfs.go</li> <li><a href="https://github.com/hashicorp/consul/commit/4102fb23cb218bc9788ffae9f788a0a82dc4cf0d"><code>4102fb2</code></a> Bump GOLANG_IMAGE to 1.14.4 to get patch for runtime issue</li> <li><a href="https://github.com/hashicorp/consul/commit/874bd29891326d96aed207208d11286119d88e3f"><code>874bd29</code></a> Run make update-vendor and remove submodule references</li> <li><a href="https://github.com/hashicorp/consul/commit/9ce1ace8923d1349319a76d6ae497f6d9a62f09d"><code>9ce1ace</code></a> Bump api and sdk modules</li> <li><a href="https://github.com/hashicorp/consul/commit/4ec1d1dc75908ae69edd2788b5be892517198b9d"><code>4ec1d1d</code></a> Bump sdk module version to 0.5.0</li> <li><a href="https://github.com/hashicorp/consul/commit/d94055313588b266964ffb2f0ad0533da32b4cbc"><code>d940553</code></a> Update CHANGELOG for 1.8.0 GA (<a href="https://github-redirect.dependabot.com/hashicorp/consul/issues/8143">#8143</a>)</li> <li><a href="https://github.com/hashicorp/consul/commit/42a058f560f0aea1dcb95f9aa9b71b70a529b610"><code>42a058f</code></a> Putting source back into Dev Mode</li> <li><a href="https://github.com/hashicorp/consul/commit/44e17c81c319c05e098aa2254ec9be389f98f9bb"><code>44e17c8</code></a> Release v1.8.0-rc1</li> <li><a href="https://github.com/hashicorp/consul/commit/7fa405f85820921edf536b01cff93d59553b4a7c"><code>7fa405f</code></a> update bindata_assetfs.go</li> <li>Additional commits viewable in <a href="https://github.com/hashicorp/consul/compare/v1.7.0...v1.8.0">compare view</a></li> </ul> </details> <br />

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in the .dependabot/config.yml file in this repo:

  • Update frequency
  • Automerge options (never/patch/minor, and dev/runtime dependencies)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)

</details>

+4 -1

0 comment

2 changed files

pr created time in 2 days

delete branch hairyhenderson/gomplate

delete branch : dependabot/docker/consul-1.8.6

delete time in 2 days

PR closed hairyhenderson/gomplate

Reviewers
chore(deps): bump consul from 1.8.5 to 1.8.6 dependencies docker

Bumps consul from 1.8.5 to 1.8.6.

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in the .dependabot/config.yml file in this repo:

  • Update frequency
  • Automerge options (never/patch/minor, and dev/runtime dependencies)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)

</details>

+1 -1

1 comment

1 changed file

dependabot-preview[bot]

pr closed time in 2 days

pull request commenthairyhenderson/gomplate

chore(deps): bump consul from 1.8.5 to 1.8.6

Superseded by #989.

dependabot-preview[bot]

comment created time in 2 days

create barnchhairyhenderson/gomplate

branch : dependabot/docker/consul-1.9.0

created branch time in 2 days

PR opened hairyhenderson/gomplate

chore(deps): bump consul from 1.8.5 to 1.9.0

Bumps consul from 1.8.5 to 1.9.0.

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in the .dependabot/config.yml file in this repo:

  • Update frequency
  • Automerge options (never/patch/minor, and dev/runtime dependencies)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)

</details>

+1 -1

0 comment

1 changed file

pr created time in 2 days

Pull request review commentapple/foundationdb

Proxy rejects long enqueued transactions

 ACTOR Future<Void> commitBatch( 	/////// Phase 1: Pre-resolution processing (CPU bound except waiting for a version # which is separately pipelined and *should* be available by now (unless empty commit); ordered; currently atomic but could yield) 	TEST(self->latestLocalCommitBatchResolving.get() < localBatchNumber-1); // Queuing pre-resolution commit processing  	wait(self->latestLocalCommitBatchResolving.whenAtLeast(localBatchNumber-1));+	double queuingDelay = g_network->timer() - timeStart;+	if (queuingDelay > (double)SERVER_KNOBS->MAX_READ_TRANSACTION_LIFE_VERSIONS / SERVER_KNOBS->VERSIONS_PER_SECOND ||+	    (BUGGIFY && g_network->isSimulated() && deterministicRandom()->random01() < 0.01 && trs.size() > 0 &&

what if a txn that changes system keyspace is queued for longer than (5) seconds? The condition seems to reject those transactions, which may not be what you want.

jzhou77

comment created time in 2 days

PR opened apple/foundationdb

Release 6.2

This PR is resolves #4108

Changes in this PR:

  • If the queuing time already exceeds the MVCC window (i.e., 5s), a proxy will reject the transaction batch immediately, bypassing the rest of commit path

Style

  • [x] All variable and function names make sense.
  • [x] The code is properly formatted (consider running git clang-format).

Performance

  • [ ] All CPU-hot paths are well optimized.
  • [ ] The proper containers are used (for example std::vector vs VectorRef).
  • [ ] There are no new known SlowTask traces.

Testing

  • [x] The code was sufficiently tested in simulation.
  • [ ] If there are new parameters or knobs, different values are tested in simulation.
  • [x] ASSERT, ASSERT_WE_THINK, and TEST macros are added in appropriate places.
  • [ ] Unit tests were added for new algorithms and data structure that make sense to unit-test
  • [ ] If this is a bugfix: there is a test that can easily reproduce the bug.
+24 -1

0 comment

1 changed file

pr created time in 2 days

PR closed apple/foundationdb

Yield while processing ignored pop requests on tlog

This resolves https://github.com/apple/foundationdb/issues/3848

+57 -55

3 comments

2 changed files

sfc-gh-tclinkenbeard

pr closed time in 3 days

pull request commentapple/foundationdb

Yield while processing ignored pop requests on tlog

This replaces https://github.com/apple/foundationdb/pull/3883, applying this optimization to 6.3 instead of 6.2

sfc-gh-tclinkenbeard

comment created time in 3 days

PR opened apple/foundationdb

Yield while processing ignored pop requests on tlog

This PR is resolves https://github.com/apple/foundationdb/issues/3848

Changes in this PR:

  • While playing ignored pop requests on transaction log, yield to prevent a slow task.

General guideline:

  • If this PR is ready to be merged (and all checkboxes below are either ticked or not applicable), make this a regular PR
  • If this PR still needs work, please make this a draft PR
    • If you wish to get feedback/code-review, please add the label RFC to this PR

Please verify that all things listed below were considered and check them. If an item doesn't apply to this type of PR (for example a documentation change doesn't need to be performance tested), you should make a strikethrough (markdown syntax: ~~strikethrough~~). More infos on the guidlines can be found here.

Style

  • [x] All variable and function names make sense.
  • [x] The code is properly formatted (consider running git clang-format).

Performance

  • [x] All CPU-hot paths are well optimized.
  • [x] The proper containers are used (for example std::vector vs VectorRef).
  • [x] There are no new known SlowTask traces.

Testing

  • [x] The code was sufficiently tested in simulation.
  • [x] If there are new parameters or knobs, different values are tested in simulation.
  • [x] ASSERT, ASSERT_WE_THINK, and TEST macros are added in appropriate places.
  • [x] Unit tests were added for new algorithms and data structure that make sense to unit-test
  • [x] If this is a bugfix: there is a test that can easily reproduce the bug.
+45 -49

0 comment

1 changed file

pr created time in 3 days

issue commentapple/foundationdb

Add debug tools to the runtime docker image

@apkar is the idea to always contain these debugging tools in the image or should we release 2 different images e.g. one $FDB-VERSION-debug and the current one?

apkar

comment created time in 4 days

PR opened donnemartin/awesome-aws

AWS SQS & ElasticMQ Message Queue Dashboard with real-time logs from Consumers

Review the Contributing Guidelines

Before submitting a pull request, verify it meets all requirements in the Contributing Guidelines.

Describe Why This Is Awesome

Because you can track registered SQS messages over different consumers. Typical SQS won't let you see how the message is processed. With this dashboard and API, you can track state and asynchronously send logs from the consumer for a real-time overview.

--

Like this pull request? Vote for it by adding a :+1:

+2 -0

0 comment

1 changed file

pr created time in 4 days

more