ESarch - CQRS & Event Sourcing with Spring Boot, Axon and Pivotal Cloud Foundry PAS
cloudfoundry-incubator/credhub-cli 35
CredHub CLI provides a command line interface to interact with CredHub servers
BOSH release of CredHub server
pivotal-cf/oss-pcf-gcp-retail-demo 17
PCF on GCP Retail Demo, Open Source version
Homebrew + RVM + Chef Soloist
cloudfoundry-incubator/mssql-server-broker 13
Cloud Foundry service broker for Microsoft SQL Server
cf-platform-eng/concourse-pypi-resource 10
A Concourse CI resource for Python PyPI packages.
pivotal-cf/spring-cloud-services-cli-plugin 9
A Cloud Foundry CLI plugin for Spring Cloud Services
pivotal/Realtime-scoring-for-MADlib 8
Operationalize AI/ML models built using Apache MADlib and Postgres PL/Python.
PCF Dev cf CLI Plugin
PR opened greenplum-db/gpdb
write_to_gpfdist_timeout controls the timeout value (in seconds) for writing data to the gpfdist server. The default value is 300; the valid range is [1, 7200].
Set CURLOPT_TIMEOUT to write_to_gpfdist_timeout.
On any error, retry with a doubled interval; return a SQL ERROR once write_to_gpfdist_timeout is reached.
Add regression test for GUC writable_external_table_timeout
(cherry picked from commit ab73713211e93a95e696429f4821e893ee3397de)
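As a rough illustration of the retry behavior described above (a minimal Python sketch; the helper name write_chunk is invented, and the real implementation is C code driving libcurl with CURLOPT_TIMEOUT):

import time

def write_with_retry(write_chunk, write_to_gpfdist_timeout=300):
    # Retry write_chunk() with a doubling interval until it succeeds or the
    # total time exceeds write_to_gpfdist_timeout (seconds). write_chunk is a
    # hypothetical callable standing in for one write to the gpfdist server.
    deadline = time.monotonic() + write_to_gpfdist_timeout
    interval = 1
    while True:
        try:
            return write_chunk()
        except Exception as err:          # any error: retry with a doubled interval
            if time.monotonic() + interval > deadline:
                # the C code raises a SQL ERROR at this point
                raise RuntimeError("write to gpfdist timed out") from err
            time.sleep(interval)
            interval *= 2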
Here are some reminders before you submit the pull request
- [ ] Add tests for the change
- [ ] Document changes
- [ ] Communicate in the mailing list if needed
- [ ] Pass make installcheck
- [ ] Review a PR in return to support the community
pr created time in an hour
PR opened greenplum-db/gpdb
REFRESH MATERIALIZED VIEW used to count pgstat on the QD, but the count was always 0 since es_processed was not collected from the segments. In GPDB we should count pgstat on the segments and union the counts on the QD, so that both the segments and the coordinator have pgstat for this relation. This fixes issue: https://github.com/greenplum-db/gpdb/issues/11375
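To make the intent concrete, a toy Python sketch of the aggregation (the actual change is in GPDB's C executor; the numbers are illustrative):

def coordinator_processed_rows(per_segment_es_processed):
    # The QD's own es_processed stays 0 for a dispatched REFRESH, so the
    # reported row count must be the sum of the per-segment counts.
    return sum(per_segment_es_processed)

# e.g. three segments each refreshed part of the materialized view
assert coordinator_processed_rows([4000, 3500, 2500]) == 10000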
Here are some reminders before you submit the pull request
- [ ] Add tests for the change
- [ ] Document changes
- [ ] Communicate in the mailing list if needed
- [ ] Pass make installcheck
- [ ] Review a PR in return to support the community
pr created time in an hour
pull request comment greenplum-db/gpdb
Truncate distributed log when vacuuming database
CI failed with the patch (not a flaky one). FYI.
That's because the fts_error test left an orphaned temp table, and vacuumdb didn't work in that case. I created a PR to fix that first.
comment created time in an hour
Pull request review comment greenplum-db/gpdb
Remove fixme and code for the "ctas with no data"
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 			/* QD-only query, no dispatching required */
 			shouldDispatch = false;
 		}
-
-		/*
-		 * If this is CREATE TABLE AS ... WITH NO DATA, there's no need
-		 * need to actually execute the plan.
-		 *
-		 * GPDB_12_MERGE_FIXME: it would be nice to apply this optimization to
-		 * materialized views as well but then QEs cannot tell the difference
-		 * between CTAS and materialized view when CreateStmt is dispatched to
-		 * QEs (see createas.c). QEs must populate rules for materialized
-		 * views, which doesn't happen if this optimization is applied as is.
-		 */
-		if (queryDesc->plannedstmt->intoClause &&
-			queryDesc->plannedstmt->intoClause->skipData &&
-			queryDesc->plannedstmt->intoClause->viewQuery == NULL)
-		{
-			shouldDispatch = false;
-		}
For a materialized view, intoClause->viewQuery is not NULL.
comment created time in 2 hours
PR opened greenplum-db/gpdb
We have already considered IsBinaryUpgrade in GetNewOrPreassignedOid, so there is no need to deal with this variable in index_create.
Here are some reminders before you submit the pull request
- [ ] Add tests for the change
- [ ] Document changes
- [ ] Communicate in the mailing list if needed
- [ ] Pass make installcheck
- [ ] Review a PR in return to support the community
pr created time in 2 hours
push event greenplum-db/gpdb
commit sha 1c51d8b7bf1ea5a70a610d1fcb829612b97895fb
Fix that pgstat_send_qd_tabstats will report statistics redundantly for Replicated Table
push time in 2 hours
PR merged greenplum-db/gpdb
I think we can introduce a template engine to generate regress tests from a regress test template. The content of the regress test template is as follows (based on Liquid):
select n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, n_mod_since_analyze from pg_stat_all_tables where relid = 'table_for_truncate_abort'::regclass;
{% if 'replicated' in amit_conf_distby %}
{% comment %}
ORCA can't be used in the replicated table, the update of table_for_iud will not be implemented by delete + insert
{% endcomment %}
1600|301|777|0|777|1124|777
{% else %}
1901|0|1078|0|777|1124|777
{% endif %}
{% if 'replicated' not in amit_conf_distby %}
{% comment %}
ERROR: PARTITION BY clause cannot be used with DISTRIBUTED REPLICATED clause
So don't do the following test in DISTRIBUTED REPLICATED
{% endcomment %}
-- Test pgstat table stat for partition table on QD
CREATE TABLE rankpart (id int, rank int, product int)
with ({{amit_conf_opts}})
{{amit_conf_distby}} PARTITION BY RANGE (rank)
( START (1) END (10) EVERY (5),
DEFAULT PARTITION extra );
...
{% endif %}
amit_conf_opts will be replaced by appendonly=false, appendonly=true,orientation=row, or appendonly=true,orientation=column, and amit_conf_distby will be replaced by distributed randomly, distributed replicated, or '' (an empty string, which means the default distributed by). That way we can test pgstat_qd_tabstat on heap/ao_row/ao_column combined with distributed by / distributed replicated / distributed randomly.
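For illustration, a small Python sketch of how such a template could be expanded into per-variant regress tests. It only substitutes the {{...}} placeholders over the cross product of variants; a real implementation would use a Liquid engine to also evaluate the {% if %} blocks, and the usage at the end is hypothetical:

from itertools import product

STORAGE_OPTS = [
    "appendonly=false",
    "appendonly=true,orientation=row",
    "appendonly=true,orientation=column",
]
DIST_POLICIES = ["", "distributed randomly", "distributed replicated"]

def expand_template(template_text):
    # Yield (label, sql_text) for every storage/distribution combination.
    for opts, distby in product(STORAGE_OPTS, DIST_POLICIES):
        label = "%s | %s" % (opts, distby or "distributed by (default)")
        sql = (template_text
               .replace("{{amit_conf_opts}}", opts)
               .replace("{{amit_conf_distby}}", distby))
        yield label, sql

# Hypothetical usage: read the .liquid template text, then write one regress
# test file per (label, sql) pair returned by expand_template().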
test pgstat_qd_tabstat-heap_hash_dist ... ok (83.60 sec) (diff:0.13 sec)
test pgstat_qd_tabstat-heap_rand_dist ... ok (83.65 sec) (diff:0.14 sec)
test pgstat_qd_tabstat-heap_replicated ... ok (63.62 sec) (diff:0.13 sec)
test pgstat_qd_tabstat-aoro_hash_dist ... ok (84.45 sec) (diff:0.13 sec)
test pgstat_qd_tabstat-aoro_rand_dist ... ok (84.58 sec) (diff:0.14 sec)
test pgstat_qd_tabstat-aoro_replicated ... ok (64.32 sec) (diff:0.13 sec)
test pgstat_qd_tabstat-aoco_hash_dist ... ok (86.36 sec) (diff:0.13 sec)
test pgstat_qd_tabstat-aoco_rand_dist ... ok (86.47 sec) (diff:0.14 sec)
test pgstat_qd_tabstat-aoco_replicated ... ok (65.70 sec) (diff:0.25 sec)
pr closed time in 2 hours
pull request comment greenplum-db/gpdb
Fix that pgstat_send_qd_tabstats will report statistics redundantly for Replicated Table.
Let's just merge the current fix for now. For the test template, feel free to open a new PR so we can have more people review it.
comment created time in 2 hours
pull request comment greenplum-db/gpdb
Coerce unknown-type literals to type text instead of cstring
It doesn't need to be dropped. Without this commit, it was dumped as unknown::cstring::date; with this commit it is dumped as unknown::text::date.
comment created time in 3 hours
issue comment greenplum-db/gpdb
pgstat_send_qd_tabstats can not work in REFRESH MATERIALIZED VIEW
@asimrp We do the dispatch here, and es_processed is 0 before we fetch the total processed across segments. PostgreSQL doesn't do dispatch; it keeps increasing es_processed.
comment created time in 3 hours
Pull request review comment greenplum-db/gpdb
Set column processed after infer new predicate on it
+<?xml version="1.0" encoding="UTF-8"?>
+<dxl:DXLMessage xmlns:dxl="http://greenplum.com/dxl/2010/12/">
updated
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
gprecoverseg: don't recover unreachable segments
Feature: gprecoverseg tests
        And the segments are synchronized
        And pg_isready reports all primaries are accepting connections

+    Scenario: recovery skips unreachable segments
+        Given the database is running
+        And all the segments are running
+        And the segments are synchronized
+
+        And the primary on content 0 is stopped
+        And user can start transactions
+        And the primary on content 1 is stopped
+        And user can start transactions
+        And the status of the primary on content 0 should be "d"
+        And the status of the primary on content 1 should be "d"
+
+        And the host for the primary on content 1 is made unreachable
+
+        And the user runs psql with "-c 'CREATE TABLE IF NOT EXISTS foo (i int)'" against database "postgres"
+        And the user runs psql with "-c 'INSERT INTO foo SELECT generate_series(1, 10000)'" against database "postgres"
+
+        When the user runs "gprecoverseg -aF"
+        Then gprecoverseg should print "Not recovering segment \d because invalid_host is unreachable" to stdout
+        And the user runs psql with "-c 'SELECT gp_request_fts_probe_scan()'" against database "postgres"
+        And the status of the primary on content 0 should be "u"
+        And the status of the primary on content 1 should be "d"
+
+        And the user runs psql with "-c 'DROP TABLE foo'" against database "postgres"
+        And the cluster is returned to a good state
Since the test takes ~35 seconds, I added a "table driven" test.
I would ask the team to keep in mind the overhead of adding tests, such as run time, as we add more.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
gprecoverseg: don't recover unreachable segments
+from gppylib import gplog
+from gppylib.commands.base import Command
+from gppylib.commands import base
+from gppylib.gparray import STATUS_DOWN
+
+logger = gplog.get_default_logger()
+
+
+def get_unreachable_segment_hosts(hosts_excluding_master, num_workers):
+    pool = base.WorkerPool(numWorkers=num_workers)
+    try:
+        for host in hosts_excluding_master:
+            cmd = Command(name='check %s is up' % host, cmdStr="ssh %s 'echo %s'" % (host, host))
+            pool.addCommand(cmd)
+        pool.join()
+    finally:
+        pool.haltWork()
+        pool.joinWorkers()
+
+    # There's no good way to map a CommandResult back to its originating Command so instead
+    # of looping through and finding the hosts that errored out, we remove any hosts that
+    # succeeded from the hosts_excluding_master and any remaining hosts will be ones that were unreachable.
+    for item in pool.getCompletedItems():
+        result = item.get_results()
+        if result.rc == 0:
+            host = result.stdout.strip()
+            hosts_excluding_master.remove(host)
+
+    if len(hosts_excluding_master) > 0:
+        logger.warning("One or more hosts are not reachable via SSH. Any segments on those hosts will be marked down")
+        for host in sorted(hosts_excluding_master):
+            logger.warning("Host %s is unreachable" % host)
+        return hosts_excluding_master
+    return None
+
updated to return a new list called unreachable_hosts
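A hypothetical caller sketch for the helper above (host names invented; the real caller lives in gprecoverseg and gets its host list from gparray):

# assumes get_unreachable_segment_hosts() from the module shown above is importable
hosts = ["sdw1", "sdw2", "sdw3"]
unreachable_hosts = get_unreachable_segment_hosts(hosts, num_workers=len(hosts))
if unreachable_hosts:
    for host in unreachable_hosts:
        print("skipping recovery of segments on unreachable host %s" % host)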
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[5x backport]: add support for ipv6
int main(int argc, char** argv)
 		}
 	}

-	if (!serverPort)
-	{
+	if (!serverPort) {
 		fprintf(stdout, "-p port not specified\n");
 		usage();
 		return 1;
 	}

 	receiveBuffer = malloc(SERVER_APPLICATION_RECEIVE_BUF_SIZE);
-	if (!receiveBuffer)
-	{
+	if (!receiveBuffer) {
 		fprintf(stdout, "failed allocating memory for application receive buffer\n");
 		return 1;
 	}

-	socketFd = socket(PF_INET, SOCK_STREAM, 0);
+	rc = setupListen("::0", serverPort, AF_INET6);
+	if (rc != 0)
+		rc = setupListen("0.0.0.0", serverPort, AF_INET);

-	if (socketFd < 0)
-	{
-		perror("Socket creation failed");
-		return 1;
-	}
+	return rc;
+}
+
+static int setupListen(char hostname[], char port[], int protocol)
+{
+	struct addrinfo hints;
+	struct addrinfo *addrs;
+	int s, clientFd, clientPid;
+	int one = 1;
+	int fd = -1;
+	int pid = -1;
+
+	memset(&hints, 0, sizeof(hints));
+	hints.ai_family = protocol;        /* Allow IPv4 or IPv6 */
+	hints.ai_socktype = SOCK_STREAM;   /* Two-way, out of band connection */
+	hints.ai_flags = AI_PASSIVE;       /* For wildcard IP address */
As in 7X I think AI_PASSIVE should be removed here.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[5x backport]: add support for ipv6
int main(int argc, char** argv)
 		}
 	}

-	if (!serverPort)
-	{
+	if (!serverPort) {
 		fprintf(stdout, "-p port not specified\n");
 		usage();
 		return 1;
 	}

 	receiveBuffer = malloc(SERVER_APPLICATION_RECEIVE_BUF_SIZE);
-	if (!receiveBuffer)
-	{
+	if (!receiveBuffer) {
 		fprintf(stdout, "failed allocating memory for application receive buffer\n");
 		return 1;
 	}

-	socketFd = socket(PF_INET, SOCK_STREAM, 0);
+	rc = setupListen("::0", serverPort, AF_INET6);
+	if (rc != 0)
+		rc = setupListen("0.0.0.0", serverPort, AF_INET);

-	if (socketFd < 0)
-	{
-		perror("Socket creation failed");
-		return 1;
-	}
+	return rc;
+}
+
+static int setupListen(char hostname[], char port[], int protocol)
+{
+	struct addrinfo hints;
+	struct addrinfo *addrs;
+	int s, clientFd, clientPid;
+	int one = 1;
+	int fd = -1;
+	int pid = -1;
+
+	memset(&hints, 0, sizeof(hints));
+	hints.ai_family = protocol;        /* Allow IPv4 or IPv6 */
+	hints.ai_socktype = SOCK_STREAM;   /* Two-way, out of band connection */
+	hints.ai_flags = AI_PASSIVE;       /* For wildcard IP address */
+	hints.ai_protocol = IPPROTO_TCP;   /* Any protocol - TCP implied for network use due to SOCK_STREAM */

-	retVal = setsockopt(socketFd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
-	if (retVal)
+	s = getaddrinfo((char *)hostname, port, &hints, &addrs);
+	if (s != 0)
 	{
-		perror("Could not set SO_REUSEADDR on socket");
+		fprintf(stderr, "getaddrinfo says %s", gai_strerror(s));
 		return 1;
 	}

-	memset(&serverSocketAddress, 0, sizeof(struct sockaddr_in));
-	serverSocketAddress.sin_family = AF_INET;
-	serverSocketAddress.sin_addr.s_addr = htonl(INADDR_ANY);
-	serverSocketAddress.sin_port = htons(serverPort);

-	retVal = bind(socketFd,(struct sockaddr *)&serverSocketAddress, sizeof(serverSocketAddress));
-	if (retVal)
+	while (addrs != NULL)
 	{
-		perror("Could not bind port");
-		return 1;
+		{
+			fd = socket(addrs->ai_family, SOCK_STREAM, 0);
+			if (fd < 0)
+			{
+				fprintf(stderr, "socket creation failed\n");
+				addrs = addrs->ai_next;
+				continue;
+			}
+
+			if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, (char *) &one, sizeof(one)) < 0)
+			{
+				fprintf(stderr, "could not set SO_REUSEADDR\n");
+				close(fd);
+				addrs = addrs->ai_next;
+				continue;
+			}
+
+			if (bind(fd, addrs->ai_addr, addrs->ai_addrlen) != -1 &&
+				listen(fd, SOMAXCONN) != -1)
+			{
+				pid = fork();
+				if (pid > 0)
+				{
+					return 0; // we exit the parent cleanly and leave the child process open as a listening server
+				}
+				else if (pid == 0)
+				{
+					break;
+				}
+				else
+				{
+					fprintf(stderr, "failed to fork process");
+					exit(1);
+				}
+			}
+			else
+				close(fd);
+
+			addrs = addrs->ai_next;
+		}
The L81/L123 braces are an unnecessary additional code block. Please remove them.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
Enable autovacuum on catalog tables
AutoVacWorkerMain(int argc, char *argv[])
 {
 	sigjmp_buf	local_sigjmp_buf;
 	Oid			dbid;
-	bool		for_analyze = false;
-	GpRoleValue orig_role;

 	am_autovacuum_worker = true;

 	/* MPP-4990: Autovacuum always runs as utility-mode */
-	Gp_role = GP_ROLE_UTILITY;
-	if (IS_QUERY_DISPATCHER() && AutoVacuumingActive())
-	{
-		/*
-		 * Gp_role for the current autovacuum worker should be determined by wi_for_analyze.
-		 * But we don't know the value of wi_for_analyze now, so we set Gp_role to
-		 * GP_ROLE_DISPATCH first. Gp_role will switch to GP_ROLE_UTILITY as needed
-		 * after we get the wi_for_analyze.
-		 */
+	if (IS_QUERY_DISPATCHER())
Instead of flipping the role based on some operation, I feel it's a much cleaner expectation to remember, code, and test: the coordinator has it in dispatch mode and the segments in utility mode.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[5x backport]: add support for ipv6
int main(int argc, char** argv)
 	{
 		fprintf(stdout, "socket call failed\n");
 		return 1;
-	}
+	}
https://github.com/greenplum-db/gpdb/pull/11350/files#r562979054
remove the original socket creation.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[6X backport]: Add ipv6 support for gpcheckperf
main(int argc, char** argv)
 	{
 		fprintf(stderr, "socket call failed\n");
 		return 1;
-	}
+	}
see https://github.com/greenplum-db/gpdb/pull/11350/files#r562979054...the old socket creation code needs to be removed also.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
main(int argc, char** argv)
 		return 1;
 	}

-	socketFd = socket(PF_INET, SOCK_STREAM, 0);
-	if (socketFd < 0)
-	{
-		fprintf(stderr, "socket call failed\n");
-		return 1;
-	}
(ignore for 7X) creating a comment here for reference in the #11355 issue.
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[6X backport]: Add ipv6 support for gpcheckperf
main(int argc, char** argv)
 		return 1;
 	}

-	socketFd = socket(PF_INET, SOCK_STREAM, 0);
+	rc = setupListen("::0", serverPort, AF_INET6);
+	if (rc != 0)
+		setupListen("0.0.0.0", serverPort, AF_INET);
This line should be:
if (rc != 0)
rc = setupListen("0.0.0.0", serverPort, AF_INET);
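For clarity, the same listen-on-IPv6-then-fall-back-to-IPv4 pattern sketched in Python (an illustration of the approach, not the C code under review):

import socket

def setup_listen(port):
    # Try a wildcard IPv6 listener first, then fall back to IPv4, mirroring
    # setupListen("::0", ...) followed by setupListen("0.0.0.0", ...).
    for host, family in (("::", socket.AF_INET6), ("0.0.0.0", socket.AF_INET)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM,
                                       socket.IPPROTO_TCP, socket.AI_PASSIVE)
            af, socktype, proto, _, sockaddr = infos[0]
            sock = socket.socket(af, socktype, proto)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(sockaddr)
            sock.listen(socket.SOMAXCONN)
            return sock
        except OSError:
            continue  # this address family is unavailable; try the next one
    raise OSError("could not listen on either IPv6 or IPv4")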
comment created time in 2 days
Pull request review comment greenplum-db/gpdb
[6X backport]: Add ipv6 support for gpcheckperf
+@gpcheckperf
+Feature: Tests for gpcheckperf
+
+    @concourse_cluster
+    Scenario: gpcheckperf runs disk and memory tests
+        Given the database is running
+        When the user runs "gpcheckperf -h mdw -h sdw1 -d /data/gpdata/ -r ds"
+        Then gpcheckperf should return a return code of 0
+        And gpcheckperf should print "disk write tot bytes" to stdout
+
+    @concourse_cluster
+    Scenario: gpcheckperf runs runs sequential network test
+        Given the database is running
+        When the user runs "gpcheckperf -h mdw -h sdw1 -d /data/gpdata/ -r n"
+        Then gpcheckperf should return a return code of 0
+        And gpcheckperf should print "avg = " to stdout
super nit: add a newline here.
comment created time in 2 days
pull request comment greenplum-db/gpdb
Enable autovacuum on catalog tables
Thinking more: if there is a genuine inconsistency, the pg_trigger, pg_index, or pg_rules table will complain about inconsistency across nodes. Hence, I feel we can safely avoid checking consistency of these pg_class fields across the coordinator and segments. Locally, on each postgres instance, the consistency of pg_trigger with the relhastriggers value in pg_class should be checked anyway.
comment created time in 2 days
pull request comment greenplum-db/gpdb
Enable autovacuum on catalog tables
Why is gpcheckcat failing in the PR pipeline? The failures look legit.
The reason for the failure is that the relhastriggers=true flag is not cleared from pg_class when triggers are dropped for a table; see the comment in RemoveTriggerById():
/*
* We do not bother to try to determine whether any other triggers remain,
* which would be needed in order to decide whether it's safe to clear the
* relation's relhastriggers. (In any case, there might be a concurrent
* process adding new triggers.) Instead, just force a relcache inval to
* make other backends (and this one too!) rebuild their relcache entries.
* There's no great harm in leaving relhastriggers true even if there are
* no triggers left.
*/
Instead, vac_update_relstats(), invoked during vacuum, updates these fields in pg_class (along with relfrozenxid) to reflect reality.
Hence, for tables where a trigger was added and then removed, relhastriggers will reflect reality only on the nodes where vacuum was triggered. As we are auto-vacuuming only content 0 and the coordinator, and not the other contents, gpcheckcat reports a mismatch, though it is a harmless mismatch since in reality no triggers are present for the table.
relhasindex and relhasrules fall in the same bucket: true doesn't mean an index, rule, or trigger is definitely present (it may have been dropped); only false guarantees these objects are not present for the table.
I am thinking that ideally gpcheckcat shouldn't be checking these fields for consistency across coordinator/segments, as they can legitimately differ. The question is how to detect a genuine mismatch. Maybe we need a slightly more sophisticated mechanism: if a mismatch is found, check whether the underlying object is really present on the nodes that reported true; if it isn't, it's not really an inconsistency. I have no idea how easy or hard it would be to add such sophistication to gpcheckcat. Or perhaps it's okay to simply not check these fields for consistency and be done.
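As a rough sketch of that "verify the object really exists" idea (the query text and helper are hypothetical, not existing gpcheckcat code):

HAS_TRIGGER_SQL = """
SELECT EXISTS (SELECT 1 FROM pg_trigger WHERE tgrelid = %s::regclass)
"""

def mismatch_is_benign(relname, nodes_reporting_true, run_query_on_node):
    # A relhastriggers disagreement is harmless if none of the nodes that
    # report true actually has a trigger on the table.
    # run_query_on_node(node, sql, params) is a hypothetical helper that runs
    # SQL on one coordinator/segment instance and returns the first column.
    return all(not run_query_on_node(node, HAS_TRIGGER_SQL, (relname,))
               for node in nodes_reporting_true)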
comment created time in 2 days
push event greenplum-db/gpdb
commit sha 162a94d7690fd6b520853a8e86d5b58129988c25
gpcheckperf: fix python3 errors in multidd
commit sha 06fc8c49b8a4a10f343976d7b190dbadb974c871
gpcheckperf: update to python3
commit sha 97dc0ae2a2a8c2729d937bde7fb2cbb29a3b2eb7
gpcheckperf: use a default value when reducing
commit sha eb804f2aca43c952169370be65cb8e8dd74d36b2
Support ipv6 for gpnetbenchClient and gpnetbenchServer
push time in 2 days
PR merged greenplum-db/gpdb
Add support for ipv6 for running gpnetbenchClient and gpnetbenchServer
Here are some reminders before you submit the pull request
- [ ] Add tests for the change
- [ ] Document changes
- [ ] Communicate in the mailing list if needed
- [ ] Pass make installcheck
- [ ] Review a PR in return to support the community
pr closed time in 2 days
push event greenplum-db/gpdb
commit sha d42662c02aeb07fd53b08b349618cb71e339e372
gpinitsystem: DEFAULT_LOCALE_SETTING determined by output of locale -a

Rather than hard coding DEFAULT_LOCALE_SETTING to en_US.utf8 on Linux, which is not available on all platforms, use locale -a output.

Authored-by: Brent Doil <bdoil@vmware.com>
push time in 2 days
PR merged greenplum-db/gpdb
Rather than hard coding DEFAULT_LOCALE_SETTING to en_US.utf8 on Linux, which is not available on all platforms, use locale -a output.
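A hypothetical sketch of the idea (gpinitsystem itself is a bash script and its exact selection logic may differ; the preferred-locale list here is invented): prefer en_US.utf8 when locale -a reports it, otherwise fall back to a locale that is actually available.

import subprocess

def default_locale_setting(preferred=("en_US.utf8", "en_US.UTF-8", "C.UTF-8", "C")):
    # Return the first preferred locale that `locale -a` reports as available.
    out = subprocess.run(["locale", "-a"], capture_output=True, text=True, check=True)
    available = set(out.stdout.split())
    for loc in preferred:
        if loc in available:
            return loc
    return "C"  # the POSIX locale is always defined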
pr closed time in 2 days
pull request comment greenplum-db/gpdb
gpinitsystem: DEFAULT_LOCALE_SETTING determined by output of locale -a
What if instead we let initdb pick the default locale (as it already has logic for it) for the coordinator, and then use the same locale for all segment initdbs? (This should reduce the intelligence in the management scripts.)
That's a good suggestion, and we considered making a larger change like that, but at the moment we're trying to make minimally impactful changes to the utilities so we can focus on gpupgrade. We'll keep it in mind when we get to rewriting gpinitsystem as part of the cloud-readiness effort.
comment created time in 2 days
push event greenplum-db/gpdb
commit sha 05b4fa54df44780d96ff65d100cfc78383519a33
Remove 3 temporarily dead tests that don't compile

This trips up Clang Tidy, both locally and in Travis CI, after greenplum-db/gpdb#11336. To quickly turn Travis CI green on master, let's remove the 3 offending files. We can have a longer discussion about whether we should remove *all* tests that are currently not compiled, and whether we should fail unit tests when the test names themselves are invalid.
push time in 2 days
PR merged greenplum-db/gpdb
This trips up Clang Tidy, both locally and in Travis CI, after greenplum-db/gpdb#11336 . To quickly turn Travis CI green on master, let's remove the 3 offending files. We can have a longer discussion about whether we should remove all tests that are currently not compiled, and whether we should fail unit tests when the test names themselves are invalid.
Question for reviewers:
- Should I remove only the 3 files that offend clang-tidy, or all of the test .cpp files that are not built in CMake?
- If "yes" to the previous question, should I also fix the defect where we let a test invocation "succeed" even if there's no such a test? c.f. #11403
Here are some reminders before you submit the pull request
- [ ] Add tests for the change
- [ ] Document changes
- [ ] Communicate in the mailing list if needed
- [ ] Pass make installcheck
- [ ] Review a PR in return to support the community
pr closed time in 2 days