Huang YunKun (htynkn) · @apache · Chengdu · https://www.huangyunkun.com/ · Apache Dubbo PMC

htynkn/DartsShaSha 24

A demo of libgdx with a 2D UI

htynkn/aliyun-serverless-action 2

GitHub Action for deploying functions to the Aliyun serverless platform

htynkn/k8s-vagrant-in-china 1

Vagrant config file for k8s in China

htynkn/bk-ci 0

BlueKing CI platform (蓝鲸CI)

htynkn/cocos-ui-libgdx 0

A UI library for libgdx with CocoStudio (a CocoStudio UI parsing library)

htynkn/docsite 0

An open-source static website generator

htynkn/dubbo 0

Apache Dubbo is a high-performance, Java-based, open-source RPC framework.

htynkn/dubbo-admin 0

The ops and reference implementation for Apache Dubbo

push event htynkn/dubbo

Mercy Ma

commit sha f4af3ded4b305423a79272843c6c8ea0207415b1

Polish /apache/dubbo#5745 : Increasing the stack size in the start.sh (#5753)

view details

push time in 5 hours

push event htynkn/spring-boot

Brian Clozel

commit sha e59d3fbb861560de372f8bd186e199f0c7d5e42a

Clear ProducesRequestCondition cache attribute As of spring-projects/spring-framework#22644, Spring Framework caches the "produces" condition when matching for endpoints in the `HandlerMapping` infrastructure. This has been improved in spring-projects/spring-framework#23091 to prevent side-effects in other implementations. Prior to this commit, the Spring Boot actuator infrastructure for `EndpointHandlerMapping` would not clear the cached attribute, presenting the same issue as Spring Framework's infrastructure. This means that a custom arrangement with custom `HandlerMapping` or `ContentTypeResolver` would not work properly and reuse the cached produces conditions for other, unintended, parts of the handler mapping process. This commit clears the cached data and ensures that other handler mapping implementations are free of that side-effect. Fixes gh-20150

view details

dreis2211

commit sha 76d2bc27eb95b2b2e3de7db53758bc441b812ab5

Explicitly set java home in Maven Plugin integration tests See gh-20193

view details

Andy Wilkinson

commit sha dcbbe20d41c46e6373655d03a11bb868ce8621f3

Merge pull request #20193 from dreis2211 * gh-20193: Explicitly set java home in Maven Plugin integration tests Closes gh-20193

view details

dreis2211

commit sha 4f824bf9ad572af675f8cde21c6337de8467891d

Fix duplicate words See gh-20210

view details

Stephane Nicoll

commit sha c53d4f2bf1900697b5a6986ba63a418b7d7717e8

Merge pull request #20210 from dreis2211 * pr/20210: Fix duplicate words Closes gh-20210

view details

Stephane Nicoll

commit sha 16111f126e707fb5064bab6150df7a77f54850ee

Use query-less datasource validation by default This commit changes DataSourceHealthIndicator to validate the connection rather than issuing a query to the database. If a custom validation query is specified, it uses that as before. Closes gh-17582
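To make the change above concrete, here is a minimal Java sketch (assumed class and method names, not the actual DataSourceHealthIndicator code) contrasting driver-level connection validation with a configured validation query:

```java
import java.sql.Connection;
import java.sql.Statement;
import javax.sql.DataSource;

// Hypothetical sketch of the two validation strategies described above.
class DataSourceValidationSketch {

    // New default: ask the JDBC driver whether the connection is valid (no query issued).
    static boolean validateConnection(DataSource dataSource) throws Exception {
        try (Connection connection = dataSource.getConnection()) {
            return connection.isValid(0); // 0 means no timeout
        }
    }

    // Previous behaviour, still used when a custom validation query is configured:
    // issue the query and treat any exception as "unhealthy".
    static boolean validateWithQuery(DataSource dataSource, String validationQuery) throws Exception {
        try (Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement()) {
            statement.execute(validationQuery);
            return true;
        }
    }
}
```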

view details

dreis2211

commit sha 866c441d70c841afa1fe7e3af44a5afd250fcae1

Upgrade to Gradle 6.2 See gh-20213

view details

Andy Wilkinson

commit sha bca1b926ff17be124e53b34c05ceaf94f5363bf4

Merge pull request #20213 from dreis2211 * gh-20213: Upgrade to Gradle 6.2 Closes gh-20213

view details

Stephane Nicoll

commit sha 2147976c178a4eab7ca02ba99cfb53d767e3d8f1

Do not fall back to embedded configuration if a datasource URL is set This commit makes sure that a fallback embedded datasource is not created if no suitable connection pool is found and a URL has been explicitly registered. This is consistent with EmbeddedDataSourceConfiguration as it is using EmbeddedDatabaseBuilder behind the scenes and the latter does not honour the configured URL anyway. Closes gh-19192

view details

Stephane Nicoll

commit sha 1d60184075a4333ed0f58da4b58bb9cf1ba49829

Merge branch '2.1.x' into 2.2.x Closes gh-20217

view details

Stephane Nicoll

commit sha 287d577aeac516de8b21630b76208d276b7b5fca

Merge branch '2.2.x' Closes gh-20218

view details

Stephane Nicoll

commit sha 4ec30e1145d7221338e420b5727693b5fe073716

Add support for SimpleDriverDataSource This commit makes sure that DataSourceBuilder can configure SimpleDriverDataSource by adding an alias for the driver's class name. Closes gh-20220 Co-authored-by: Dmytro Nosan <dimanosan@gmail.com>
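A hedged sketch of what this change enables: the driver-class-name alias lets DataSourceBuilder populate SimpleDriverDataSource. The URL and credentials below are placeholders, not values from the commit:

```java
import javax.sql.DataSource;

import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;

// Assumed example values; only the builder calls reflect the feature described above.
class SimpleDriverDataSourceExample {

    static DataSource inMemoryDataSource() {
        return DataSourceBuilder.create()
                .type(SimpleDriverDataSource.class)
                .driverClassName("org.h2.Driver") // mapped onto SimpleDriverDataSource's driver class
                .url("jdbc:h2:mem:test")
                .username("sa")
                .password("")
                .build();
    }
}
```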

view details

hbellahc

commit sha d890f1f6d8c763c527d3795923befa51254f1f19

Document missing reference to DataSourceHealthIndicator See gh-20216

view details

Stephane Nicoll

commit sha d3535ca15f77906555ac6610750d0d3f2fe72394

Merge pull request #20216 from hbellahc * pr/20216: Document missing reference to DataSourceHealthIndicator Closes gh-20216

view details

Stephane Nicoll

commit sha 363edfa00cc9427be545abc1b41f39df82c7b17b

Merge branch '2.2.x' Closes gh-20221

view details

Andy Wilkinson

commit sha d8c309a31080f2db548178558522fc11769dd60c

Update gradlew.bat with Gradle 6.2's changes See gh-20213

view details

push time in 5 hours

push event htynkn/spark

Ajith

commit sha 657d151395ca996f8ec0eed695a873abbd63d760

[SPARK-29174][SQL] Support LOCAL in INSERT OVERWRITE DIRECTORY to data source ### What changes were proposed in this pull request? `INSERT OVERWRITE LOCAL DIRECTORY` is supported with ensuring the provided path is always using `file://` as scheme and removing the check which throws exception if we do insert overwrite by mentioning directory with `LOCAL` syntax ### Why are the changes needed? without the modification in PR, ``` insert overwrite local directory <location> using ``` throws exception ``` Error: org.apache.spark.sql.catalyst.parser.ParseException: LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source(line 1, pos 0) ``` which was introduced in https://github.com/apache/spark/pull/18975, but this restriction is not needed, hence dropping the same. Keep behaviour consistent for local and remote file-system in `INSERT OVERWRITE DIRECTORY` ### Does this PR introduce any user-facing change? Yes, after this change `INSERT OVERWRITE LOCAL DIRECTORY` will not throw exception ### How was this patch tested? Added UT Closes #27039 from ajithme/insertoverwrite2. Authored-by: Ajith <ajith2489@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

beliefer

commit sha d8d3ce5c76732a8e3daf10a37621b5ec8a52dcec

[SPARK-30825][SQL][DOC] Update documentation for window functions ### What changes were proposed in this pull request? I checked all the window functions and found that none of them add parameter information and version information to the documentation. This PR makes a supplement. ### Why are the changes needed? Documentation is missing and does not meet the new standards. ### Does this PR introduce any user-facing change? Yes. Users will see the information about parameters and versions. ### How was this patch tested? Existing UTs. Closes #27572 from beliefer/add_since_for_window_function. Authored-by: beliefer <beliefer@163.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Nicholas Chammas

commit sha 4ed9b88996dc0e8f096157659d537d9dd8eaed47

[SPARK-30832][DOCS] SQL function doc headers should link to anchors ### Why are the changes needed? In most of our docs, you can click on a heading to immediately get an anchor link to that specific section of the docs. This is very handy when you are reading the docs and want to share a link to a specific part. The SQL function docs are lacking this. This PR adds this convenience to the SQL function docs. Here's the impact on the generated HTML. Before this PR: ```html <h3 id="array_join">array_join</h3> ``` After this PR: ```html <h3 id="array_join"><a class="toclink" href="#array_join">array_join</a></h3> ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? I built the docs manually and reviewed the results in my browser. Closes #27585 from nchammas/SPARK-30832-sql-doc-headers. Authored-by: Nicholas Chammas <nicholas.chammas@gmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

Liang Zhang

commit sha d8c0599e542976ef70b60bc673e7c9732fce49e5

[SPARK-30791][SQL][PYTHON] Add 'sameSemantics' and 'sementicHash' methods in Dataset ### What changes were proposed in this pull request? This PR added two DeveloperApis to the Dataset[T] class. Both methods are just exposing lower-level methods to the Dataset[T] class. ### Why are the changes needed? They are useful for checking whether two dataframes are the same when implementing dataframe caching in python, and also get a unique ID. It's easier to use if we wrap the lower-level APIs. ### Does this PR introduce any user-facing change? ``` scala> val df1 = Seq((1,2),(4,5)).toDF("col1", "col2") df1: org.apache.spark.sql.DataFrame = [col1: int, col2: int] scala> val df2 = Seq((1,2),(4,5)).toDF("col1", "col2") df2: org.apache.spark.sql.DataFrame = [col1: int, col2: int] scala> val df3 = Seq((0,2),(4,5)).toDF("col1", "col2") df3: org.apache.spark.sql.DataFrame = [col1: int, col2: int] scala> val df4 = Seq((0,2),(4,5)).toDF("col0", "col2") df4: org.apache.spark.sql.DataFrame = [col0: int, col2: int] scala> df1.semanticHash res0: Int = 594427822 scala> df2.semanticHash res1: Int = 594427822 scala> df1.sameSemantics(df2) res2: Boolean = true scala> df1.sameSemantics(df3) res3: Boolean = false scala> df3.semanticHash res4: Int = -1592702048 scala> df4.semanticHash res5: Int = -1592702048 scala> df4.sameSemantics(df3) res6: Boolean = true ``` ### How was this patch tested? Unit test in scala and doctest in python. Note: comments are copied from the corresponding lower-level APIs. Note: There are some issues to be fixed that would improve the hash collision rate: https://github.com/apache/spark/pull/27565#discussion_r379881028 Closes #27565 from liangz1/df-same-result. Authored-by: Liang Zhang <liang.zhang@databricks.com> Signed-off-by: WeichenXu <weichen.xu@databricks.com>

view details

Terry Kim

commit sha 5866bc77d7703939e93c00f22ea32981d4ebdc6c

[SPARK-30814][SQL] ALTER TABLE ... ADD COLUMN position should be able to reference columns being added ### What changes were proposed in this pull request? In ALTER TABLE, a column in ADD COLUMNS can depend on the position of a column that is just being added. For example, for a table with the following schema: ``` root: - a: string - b: long ``` , the following should work: ``` ALTER TABLE t ADD COLUMNS (x int AFTER a, y int AFTER x) ``` Currently, the above statement will throw an exception saying that AFTER x cannot be resolved, because x doesn't exist yet. This PR proposes to fix this issue. ### Why are the changes needed? To fix a bug described above. ### Does this PR introduce any user-facing change? Yes, now ``` ALTER TABLE t ADD COLUMNS (x int AFTER a, y int AFTER x) ``` works as expected. ### How was this patch tested? Added new tests Closes #27584 from imback82/alter_table_pos_fix. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

HyukjinKwon

commit sha 9618806f44196aa2a69061794b34c31d264a4b3c

[SPARK-30847][SQL] Take productPrefix into account in MurmurHash3.productHash ### What changes were proposed in this pull request? This PR proposes to port Scala's bugfix https://github.com/scala/scala/pull/7693 (Scala 2.13) to address https://github.com/scala/bug/issues/10495 issue. In short, it is possible for different product instances having the same children to have the same hash. See: ```scala scala> spark.range(1).selectExpr("id - 1").queryExecution.analyzed.semanticHash() res0: Int = -565572825 scala> spark.range(1).selectExpr("id + 1").queryExecution.analyzed.semanticHash() res1: Int = -565572825 ``` ### Why are the changes needed? It was found during the review of https://github.com/apache/spark/pull/27565. We should better produce different hash for different objects. ### Does this PR introduce any user-facing change? No, it's not identified. Possibly performance related issue. ### How was this patch tested? Manually tested, and unittest was added. Closes #27601 from HyukjinKwon/SPARK-30847. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

David Toneian

commit sha 504b5135d09751c15ba1f515e82d3bb6fe87ea40

[SPARK-30859][PYSPARK][DOCS][MINOR] Fixed docstring syntax issues preventing proper compilation of documentation This commit is published into the public domain. ### What changes were proposed in this pull request? Some syntax issues in docstrings have been fixed. ### Why are the changes needed? In some places, the documentation did not render as intended, e.g. parameter documentations were not formatted as such. ### Does this PR introduce any user-facing change? Slight improvements in documentation. ### How was this patch tested? Manual testing. No new Sphinx warnings arise due to this change. Closes #27613 from DavidToneian/SPARK-30859. Authored-by: David Toneian <david@toneian.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

yi.wu

commit sha 643a480b115d19fdc26a1edb463cf896467f890a

[SPARK-30863][SQL] Distinguish Cast and AnsiCast in toString ### What changes were proposed in this pull request? Prefix by `ansi_` in `toString` if it's a `AnsiCast` or ansi enabled `Cast`. E.g. run `spark.sql("select cast('51' as int)").queryExecution.analyzed` under ansi mode. Before this PR: ``` Project [cast(51 as int) AS CAST(51 AS INT)#0] +- OneRowRelation ``` After this PR: ``` Project [ansi_cast(51 as int) AS CAST(51 AS INT)#0] +- OneRowRelation ``` ### Why are the changes needed? This is useful while comparing `LogicalPlan`s literally. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass Jenkins. Closes #27608 from Ngone51/ansi_cast_tostring. Authored-by: yi.wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

yi.wu

commit sha 68d7edf9497bea2f73707d32ab55dd8e53088e7c

[SPARK-30812][SQL][CORE] Revise boolean config name to comply with new config naming policy ### What changes were proposed in this pull request? Revise below config names to comply with [new config naming policy](http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-naming-policy-of-Spark-configs-td28875.html): SQL: * spark.sql.execution.subquery.reuse.enabled / [SPARK-27083](https://issues.apache.org/jira/browse/SPARK-27083) * spark.sql.legacy.allowNegativeScaleOfDecimal.enabled / [SPARK-30252](https://issues.apache.org/jira/browse/SPARK-30252) * spark.sql.adaptive.optimizeSkewedJoin.enabled / [SPARK-29544](https://issues.apache.org/jira/browse/SPARK-29544) * spark.sql.legacy.property.nonReserved / [SPARK-30183](https://issues.apache.org/jira/browse/SPARK-30183) * spark.sql.streaming.forceDeleteTempCheckpointLocation.enabled / [SPARK-26389](https://issues.apache.org/jira/browse/SPARK-26389) * spark.sql.analyzer.failAmbiguousSelfJoin.enabled / [SPARK-28344](https://issues.apache.org/jira/browse/SPARK-28344) * spark.sql.adaptive.shuffle.reducePostShufflePartitions.enabled / [SPARK-30074](https://issues.apache.org/jira/browse/SPARK-30074) * spark.sql.execution.pandas.arrowSafeTypeConversion / [SPARK-25811](https://issues.apache.org/jira/browse/SPARK-25811) * spark.sql.legacy.looseUpcast / [SPARK-24586](https://issues.apache.org/jira/browse/SPARK-24586) * spark.sql.legacy.arrayExistsFollowsThreeValuedLogic / [SPARK-28052](https://issues.apache.org/jira/browse/SPARK-28052) * spark.sql.sources.ignoreDataLocality.enabled / [SPARK-29189](https://issues.apache.org/jira/browse/SPARK-29189) * spark.sql.adaptive.shuffle.fetchShuffleBlocksInBatch.enabled / [SPARK-9853](https://issues.apache.org/jira/browse/SPARK-9853) CORE: * spark.eventLog.erasureCoding.enabled / [SPARK-25855](https://issues.apache.org/jira/browse/SPARK-25855) * spark.shuffle.readHostLocalDisk.enabled / [SPARK-30235](https://issues.apache.org/jira/browse/SPARK-30235) * spark.scheduler.listenerbus.logSlowEvent.enabled / [SPARK-29001](https://issues.apache.org/jira/browse/SPARK-29001) * spark.resources.coordinate.enable / [SPARK-27371](https://issues.apache.org/jira/browse/SPARK-27371) * spark.eventLog.logStageExecutorMetrics.enabled / [SPARK-23429](https://issues.apache.org/jira/browse/SPARK-23429) ### Why are the changes needed? To comply with the config naming policy. ### Does this PR introduce any user-facing change? No. Configurations listed above are all newly added in Spark 3.0. ### How was this patch tested? Pass Jenkins. Closes #27563 from Ngone51/revise_boolean_conf_name. Authored-by: yi.wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Wenchen Fan

commit sha 1b67d546bd96412943de0c7a3b4295cbde887bd2

revert SPARK-29663 and SPARK-29688 ### What changes were proposed in this pull request? This PR reverts https://github.com/apache/spark/pull/26325 and https://github.com/apache/spark/pull/26347 ### Why are the changes needed? When we do sum/avg, we need a wider type of input to hold the sum value, to reduce the possibility of overflow. For example, we use long to hold the sum of integral inputs, use double to hold the sum of float/double. However, we don't have a wider type of interval. Also the semantic is unclear: what if the days field overflows but the months field doesn't? Currently the avg of `1 month` and `2 month` is `1 month 15 days`, which assumes 1 month has 30 days and we should avoid this assumption. ### Does this PR introduce any user-facing change? yes, remove 2 features added in 3.0 ### How was this patch tested? N/A Closes #27619 from cloud-fan/revert. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: herman <herman@databricks.com>

view details

push time in 5 hours

push event htynkn/fish-redux

push time in 5 hours

push event htynkn/dubbo-admin

push time in 5 hours

push event htynkn/dubbo-samples

push time in 5 hours

push event htynkn/rpc-benchmark

push time in 5 hours

push event htynkn/dubbo

Mercy Ma

commit sha 7b0ae127a96eb508af5c2b980b20041efff01d56

2.7.6 REST Metadata (#5738) * Polish /apache/dubbo#4687 : Remove the duplicated test code in dubbo-config-spring * Polish /apache/dubbo#4674 & /apache/dubbo#4470 * Polish /apache/dubbo#5093 : Revert the previous commit * Polish apache/dubbo#5093 : [Feature] Dubbo Services generate the metadata of REST services * Polish apache/dubbo#5306 : [Migration] Upgrade the @since tags in Javadoc migration cloud native to master * Polish apache/dubbo#5306 : [Migration] Upgrade the @since tags in Javadoc migration cloud native to master * Polish apache/dubbo#5309 : [ISSURE] The beans of Dubbo's Config can't be found on the ReferenceBean's initialization * Polish apache/dubbo#5312 : Resolve the demos' issues of zookeeper and nacos * Polish apache/dubbo#5313 : [Migration] migrate the code in common module from cloud-native branch to master * Polish apache/dubbo#5316 : [Refactor] Replace @EnableDubboConfigBinding Using spring-context-support * Polish apache/dubbo#5317 : [Refactor] Refactor ReferenceAnnotationBeanPostProcessor using Alibaba spring-context-suuport API * Polish apache/dubbo#5321 : Remove BeanFactoryUtils * Polish apache/dubbo#5321 : Remove AnnotatedBeanDefinitionRegistryUtils * Polish apache/dubbo#5321 : Remove AnnotationUtils * Polish apache/dubbo#5321 : Remove ClassUtils * Polish apache/dubbo#5321 : Remove BeanRegistrar * Polish apache/dubbo#5321 : Remove ObjectUtils * Polish apache/dubbo#5321 : Remove PropertySourcesUtils * Polish apache/dubbo#5325 : [Migration] To migrate dubbo-metadata-api from cloud-native branch * Polish apache/dubbo#5326 : [Migration] To migrate dubbo-metadata-processor from cloud-native branch * Polish apache/dubbo#5329 : [Feature] To add the default metadata into ServiceInstance * Polish apache/dubbo#5339 : [Refactor] Refactor the DynamicConfiguration interface * Polish bugfix * Fixes test cases * Merge remote-tracking branch 'upstream/master' into cloud-native-2.7.5 # Conflicts: # dubbo-configcenter/dubbo-configcenter-zookeeper/src/test/java/org/apache/dubbo/configcenter/support/zookeeper/ZookeeperDynamicConfigurationTest.java # dubbo-metadata/dubbo-metadata-api/src/test/java/org/apache/dubbo/metadata/DynamicConfigurationServiceNameMappingTest.java * Merge remote-tracking branch 'upstream/master' into cloud-native-2.7.5 # Conflicts: # dubbo-configcenter/dubbo-configcenter-zookeeper/src/test/java/org/apache/dubbo/configcenter/support/zookeeper/ZookeeperDynamicConfigurationTest.java # dubbo-metadata/dubbo-metadata-api/src/test/java/org/apache/dubbo/metadata/DynamicConfigurationServiceNameMappingTest.java * Polish /apache/dubbo#5721 : [Enhancement] Setting the default IDs for Dubbo's Config Beans * Polish /apache/dubbo#5729 : [Optimization] To remove EnableDubboConfigBinding and EnableDubboConfigBindings * Polish /apache/dubbo#5594 : [Feature] Add the resolver of ServiceRestMetadata based on Java Reflection * Polish /apache/dubbo#5736 : [Feature] Introducing Conversion features * Polish /apache/dubbo#5737 : [Feature] Introducing "dubbo-metadata-processor" module * Polish /apache/dubbo#5594 : Change the Metadata implementation * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases * Polish /apache/dubbo#5594 : Fixed test cases

view details

push time in a day

push event htynkn/spark

Gengliang Wang

commit sha da2ca85cee3960de7a86a21483de1d77767ca060

[SPARK-30703][SQL][DOCS][FOLLOWUP] Declare the ANSI SQL compliance options as experimental ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/27489. It declares the ANSI SQL compliance options as experimental in the documentation. ### Why are the changes needed? The options are experimental. There can be new features/behaviors in future releases. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Generating doc Closes #27590 from gengliangwang/ExperimentalAnsi. Authored-by: Gengliang Wang <gengliang.wang@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Bryan Cutler

commit sha be3cb71e9cb34ad9054325c3122745e66e6f1ede

[SPARK-30834][DOCS][PYTHON] Add note for recommended pandas and pyarrow versions ### What changes were proposed in this pull request? Add doc for recommended pandas and pyarrow versions. ### Why are the changes needed? The recommended versions are those that have been thoroughly tested by Spark CI. Other versions may be used at the discretion of the user. ### Does this PR introduce any user-facing change? No ### How was this patch tested? NA Closes #27587 from BryanCutler/python-doc-rec-pandas-pyarrow-SPARK-30834-3.0. Lead-authored-by: Bryan Cutler <cutlerb@gmail.com> Co-authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Kent Yao

commit sha 0353cbf092e15a09e8979070ecd5b653062b2cb5

[MINOR][DOC] Fix 2 style issues in running-on-kubernetes doc ### What changes were proposed in this pull request? fix style issue in the k8s document, please go to http://spark.apache.org/docs/3.0.0-preview2/running-on-kubernetes.html and search the keyword`spark.kubernetes.file.upload.path` to jump to the error context ### Why are the changes needed? doc correctness ### Does this PR introduce any user-facing change? Nah ### How was this patch tested? Nah Closes #27582 from yaooqinn/k8s-doc. Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Wenchen Fan

commit sha ab07c6300c884e772f88694f4b718659c45dbb33

[SPARK-30799][SQL] "spark_catalog.t" should not be resolved to temp view ### What changes were proposed in this pull request? No v2 command supports temp views and the `ResolveCatalogs`/`ResolveSessionCatalog` framework is designed with this assumption. However, `ResolveSessionCatalog` needs to fallback to v1 commands, which do support temp views (e.g. CACHE TABLE). To work around it, we add a hack in `CatalogAndIdentifier`, which does not expand the given identifier with current namespace if the catalog is session catalog. This works fine in most cases, as temp views should take precedence over tables during lookup. So if `CatalogAndIdentifier` returns a single name "t", the v1 commands can still resolve it to temp views correctly, or resolve it to table "default.t" if temp view doesn't exist. However, if users write `spark_catalog.t`, it shouldn't be resolved to temp views as temp views don't belong to any catalog. `CatalogAndIdentifier` can't distinguish between `spark_catalog.t` and `t`, so the caller side may mistakenly resolve `spark_catalog.t` to a temp view. This PR proposes to fix this issue by 1. remove the hack in `CatalogAndIdentifier`, and clearly document that this shouldn't be used to resolve temp views. 2. update `ResolveSessionCatalog` to explicitly look up temp views first before calling `CatalogAndIdentifier`, for v1 commands that support temp views. ### Why are the changes needed? To avoid releasing a behavior that we should not support. Removing the hack also fixes the problem we hit in https://github.com/apache/spark/pull/27532/files#diff-57b3d87be744b7d79a9beacf8e5e5eb2R937 ### Does this PR introduce any user-facing change? yes, now it's not allowed to refer to a temp view with `spark_catalog` prefix. ### How was this patch tested? new tests Closes #27550 from cloud-fan/ns. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
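A small, assumed Java illustration of the resolution rule described above (not code from the PR); the session setup and the name "t" are placeholders:

```java
import org.apache.spark.sql.SparkSession;

// Assumed, self-contained example; the view/table name "t" is arbitrary.
public class SparkCatalogTempViewExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("spark-catalog-temp-view")
                .getOrCreate();

        spark.range(1).createOrReplaceTempView("t");

        // Resolves the temp view, as before.
        spark.sql("SELECT * FROM t").show();

        // After this change: looked up as a catalog table only
        // (fails if no table named default.t exists), never as the temp view.
        spark.sql("SELECT * FROM spark_catalog.t").show();

        spark.stop();
    }
}
```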

view details

Wenchen Fan

commit sha 619274ed363ea4beb1d2cd10f24988b0dc57ef80

[DOC] add config naming guideline ### What changes were proposed in this pull request? Add docs to describe the config naming guideline. ### Why are the changes needed? To encourage contributors to name configs more consistently. ### Does this PR introduce any user-facing change? no ### How was this patch tested? N/A Closes #27577 from cloud-fan/config. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha 9107f77f15cd0630dc981b6e8a9ca696b79e624f

[SPARK-30843][SQL] Fix getting of time components before 1582 year ### What changes were proposed in this pull request? 1. Rewrite DateTimeUtils methods `getHours()`, `getMinutes()`, `getSeconds()`, `getSecondsWithFraction()`, `getMilliseconds()` and `getMicroseconds()` using Java 8 time APIs. This will automatically switch the `Hour`, `Minute`, `Second` and `DatePart` expressions on Proleptic Gregorian calendar. 2. Remove unused methods and constant of DateTimeUtils - `to2001`, `YearZero `, `toYearZero` and `absoluteMicroSecond()`. 3. Remove unused value `timeZone` from `TimeZoneAwareExpression` since all expressions have been migrated to Java 8 time API, and legacy instance of `TimeZone` is not needed any more. 4. Change signatures of modified DateTimeUtils methods, and pass `ZoneId` instead of `TimeZone`. This will allow to avoid unnecessary conversions `TimeZone` -> `String` -> `ZoneId`. 5. Modify tests in `DateTimeUtilsSuite` and in `DateExpressionsSuite` to pass `ZoneId` instead of `TimeZone`. Correct the tests, to pass tested zone id instead of None. ### Why are the changes needed? The changes fix the issue of wrong results returned by the `hour()`, `minute()`, `second()`, `date_part('millisecond', ...)` and `date_part('microsecond', ....)`, see example in [SPARK-30843](https://issues.apache.org/jira/browse/SPARK-30843). ### Does this PR introduce any user-facing change? Yes. After the changes, the results of examples from SPARK-30843: ```sql spark-sql> select hour(timestamp '0010-01-01 00:00:00'); 0 spark-sql> select minute(timestamp '0010-01-01 00:00:00'); 0 spark-sql> select second(timestamp '0010-01-01 00:00:00'); 0 spark-sql> select date_part('milliseconds', timestamp '0010-01-01 00:00:00'); 0.000 spark-sql> select date_part('microseconds', timestamp '0010-01-01 00:00:00'); 0 ``` ### How was this patch tested? - By existing test suites `DateTimeUtilsSuite`, `DateExpressionsSuite` and `DateFunctionsSuite`. - Add new tests to `DateExpressionsSuite` and `DateTimeUtilsSuite` for 10 year, like: ```scala input = date(10, 1, 1, 0, 0, 0, 0, zonePST) assert(getHours(input, zonePST) === 0) ``` - Re-run `DateTimeBenchmark` using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/11 | Closes #27596 from MaxGekk/localtimestamp-greg-cal. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Max Gekk <max.gekk@gmail.com> Co-authored-by: Ubuntu <ubuntu@ip-172-31-1-30.us-west-2.compute.internal> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Takeshi Yamamuro

commit sha 29b3e427794b04a6113f04189cc503fa5bb18666

[MINOR] Update the PR template for adding a link to the configuration naming guideline ### What changes were proposed in this pull request? This is a follow-up of #27577. This pr intends to add a link to the configuration naming guideline in `.github/PULL_REQUEST_TEMPLATE`. ### Why are the changes needed? For reminding developers to follow the naming rules. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27602 from maropu/pr27577-FOLLOWUP. Authored-by: Takeshi Yamamuro <yamamuro@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Jungtaek Lim (HeartSaVioR)

commit sha 446b2d2653a8ae0c9e35799eb4fc1d9f4dcd7991

[SPARK-28869][DOCS][FOLLOWUP] Add direct relationship between configs for rolling event log ### What changes were proposed in this pull request? This patch addresses the post-hoc review comment linked here - https://github.com/apache/spark/pull/25670#discussion_r373304076 ### Why are the changes needed? We would like to explicitly document the direct relationship before we finish up structuring of configurations. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27576 from HeartSaVioR/SPARK-28869-FOLLOWUP-doc. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Jungtaek Lim (HeartSaVioR)

commit sha 5445fe92887d04513d77af0890db572266136220

[SPARK-30827][DOCS] Document direct relationship among configurations in "spark.history.*" namespace ### What changes were proposed in this pull request? This patch adds direct relationship among configurations under "spark.history" namespace. ### Why are the changes needed? Refer the discussion thread: https://lists.apache.org/thread.html/r43c4e57cace116aca1f0f099e8a577cf202859e3671a04077867b84a%40%3Cdev.spark.apache.org%3E ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Locally ran jekyll and confirmed. Screenshots for the modified spots: <img width="1159" alt="Screen Shot 2020-02-15 at 8 20 14 PM" src="https://user-images.githubusercontent.com/1317309/74587003-d5922b00-5030-11ea-954b-ee37fc08470a.png"> <img width="1158" alt="Screen Shot 2020-02-15 at 8 20 44 PM" src="https://user-images.githubusercontent.com/1317309/74587005-d62ac180-5030-11ea-98fc-98b1c9d83ff4.png"> <img width="1149" alt="Screen Shot 2020-02-15 at 8 19 56 PM" src="https://user-images.githubusercontent.com/1317309/74587002-d1660d80-5030-11ea-84b5-dec3d7f5c97c.png"> Closes #27575 from HeartSaVioR/SPARK-30827. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Liupengcheng

commit sha 5b873420b039f61b85bf349443eb830556fab339

[SPARK-30346][CORE] Improve logging when events dropped ### What changes were proposed in this pull request? Make logging events dropping every 60s works fine, the orignal implementaion some times not working due to susequent events comming and updating the DroppedEventCounter ### Why are the changes needed? Currenly, the logging may be skipped and delayed a long time under high concurrency, that make debugging hard. So This PR will try to fix it. ### Does this PR introduce any user-facing change? No ### How was this patch tested? NA Closes #27002 from liupc/Improve-logging-dropped-events-and-logging-threadDump. Authored-by: Liupengcheng <liupengcheng@xiaomi.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Yuanjian Li

commit sha ab186e36597e16d48f8eb16d57e7a39b6b67d4e4

[SPARK-25829][SQL] Add config `spark.sql.legacy.allowDuplicatedMapKeys` and change the default behavior ### What changes were proposed in this pull request? This is a follow-up for #23124, add a new config `spark.sql.legacy.allowDuplicatedMapKeys` to control the behavior of removing duplicated map keys in build-in functions. With the default value `false`, Spark will throw a RuntimeException while duplicated keys are found. ### Why are the changes needed? Prevent silent behavior changes. ### Does this PR introduce any user-facing change? Yes, new config added and the default behavior for duplicated map keys changed to RuntimeException thrown. ### How was this patch tested? Modify existing UT. Closes #27478 from xuanyuanking/SPARK-25892-follow. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha 06217cfded8d32962e7c54c315f8e684eb9f0999

[SPARK-30793][SQL] Fix truncations of timestamps before the epoch to minutes and seconds ### What changes were proposed in this pull request? In the PR, I propose to replace `%` by `Math.floorMod` in `DateTimeUtils.truncTimestamp` for the `SECOND` and `MINUTE` levels. ### Why are the changes needed? This fixes the issue of incorrect truncation of timestamps before the epoch `1970-01-01T00:00:00.000000Z` to the `SECOND` and `MINUTE` levels. For example, timestamps after the epoch are truncated by cutting off the rest part of the timestamp: ```sql spark-sql> select date_trunc('SECOND', '2020-02-11 00:01:02.123'); 2020-02-11 00:01:02 ``` but seconds in the truncated timestamp before the epoch are increased by 1: ```sql spark-sql> select date_trunc('SECOND', '1960-02-11 00:01:02.123'); 1960-02-11 00:01:03 ``` ### Does this PR introduce any user-facing change? Yes. After the changes, the example above outputs correct result: ```sql spark-sql> select date_trunc('SECOND', '1960-02-11 00:01:02.123'); 1960-02-11 00:01:02 ``` ### How was this patch tested? Added new tests to `DateFunctionsSuite`. Closes #27543 from MaxGekk/fix-second-minute-truc. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Arwin Tio

commit sha 25e9156bc091c087305405c2310e0e85159dc30b

[SPARK-29089][SQL] Parallelize blocking FileSystem calls in DataSource#checkAndGlobPathIfNecessary ### What changes were proposed in this pull request? See JIRA: https://issues.apache.org/jira/browse/SPARK-29089 Mailing List: http://apache-spark-developers-list.1001551.n3.nabble.com/DataFrameReader-bottleneck-in-DataSource-checkAndGlobPathIfNecessary-when-reading-S3-files-td27828.html When using DataFrameReader#csv to read many files on S3, globbing and fs.exists on DataSource#checkAndGlobPathIfNecessary becomes a bottleneck. From the mailing list discussions, an improvement that can be made is to parallelize the blocking FS calls: > - have SparkHadoopUtils differentiate between files returned by globStatus(), and which therefore exist, and those which it didn't glob for -it will only need to check those. > - add parallel execution to the glob and existence checks ### Why are the changes needed? Verifying/globbing files happens on the driver, and if this operations take a long time (for example against S3), then the entire cluster has to wait, potentially sitting idle. This change hopes to make this process faster. ### Does this PR introduce any user-facing change? No ### How was this patch tested? I added a test suite `DataSourceSuite` - open to suggestions for better naming. See [here](https://github.com/apache/spark/pull/25899#issuecomment-534380034) and [here](https://github.com/apache/spark/pull/25899#issuecomment-534069194) for some measurements Closes #25899 from cozos/master. Lead-authored-by: Arwin Tio <Arwin.tio@adroll.com> Co-authored-by: Arwin Tio <arwin.tio@hotmail.com> Co-authored-by: Arwin Tio <arwin.tio@adroll.com> Signed-off-by: Sean Owen <srowen@gmail.com>
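A rough Java sketch of the parallelization idea (an assumed helper using the Hadoop FileSystem API, not Spark's actual DataSource#checkAndGlobPathIfNecessary implementation):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Simplified sketch: run the blocking exists() checks for many paths in parallel
// instead of one by one on the driver.
class ParallelPathCheck {

    static List<Path> existingPaths(List<Path> paths, Configuration hadoopConf) {
        return paths.parallelStream()
                .filter(path -> {
                    try {
                        FileSystem fs = path.getFileSystem(hadoopConf);
                        return fs.exists(path);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .collect(Collectors.toList());
    }
}
```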

view details

zhengruifeng

commit sha 0a4080ec3bac4b99c076db62adc23f23039584ce

[SPARK-30736][ML] One-Pass ChiSquareTest ### What changes were proposed in this pull request? 1, distributedly gather matrix `contingency` of each feature 2, distributedly compute the results and then collect them back to the driver ### Why are the changes needed? existing impl is not efficient: 1, it directly collect matrix `contingency` of partial featues to driver and compute the corresponding result on one pass; 2, a matrix `contingency` of a featues is of size numDistinctValues X numDistinctLabels, so only 1000 matrices can be collected at a time; ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27461 from zhengruifeng/chisq_opt. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

Yuanjian Li

commit sha e4a541b278b47d375252b2d0a8482c053ec3dd3e

[SPARK-30829][SQL] Define LegacyBehaviorPolicy enumeration as the common value for result change configs ### What changes were proposed in this pull request? Define a new enumeration `LegacyBehaviorPolicy` in SQLConf, it will be used as the common value for result change configs. ### Why are the changes needed? During API auditing for the 3.0 release, we found several new approaches that will change the results silently. For these features, we need a common three-value config. ### Does this PR introduce any user-facing change? Yes, original config `spark.sql.legacy.ctePrecedence.enabled` change to `spark.sql.legacy.ctePrecedencePolicy`. ### How was this patch tested? Existing UT. Closes #27579 from xuanyuanking/SPARK-30829. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

wangguangxin.cn

commit sha 0ae3ff60c4679495d376d2418450cfe93cac5590

[SPARK-30806][SQL] Evaluate once per group in UnboundedWindowFunctionFrame ### What changes were proposed in this pull request? We only need to do aggregate evaluation once per group in `UnboundedWindowFunctionFrame` ### Why are the changes needed? Currently, in `UnboundedWindowFunctionFrame.write`,it re-evaluate the processor for each row in a group, which is not necessary in fact which I'll address later. It hurts performance when the evaluation is time-consuming (for example, Percentile's eval need to sort its buffer and do some calculation). In our production, there is a percentile with window operation sql, it costs more than 10 hours in SparkSQL while 10min in Hive. In fact, `UnboundedWindowFunctionFrame` can be treated as `SlidingWindowFunctionFrame` with `lbound = UnboundedPreceding` and `ubound = UnboundedFollowing`, just as its comments. In that case, `SlidingWindowFunctionFrame` also only do evaluation once for each group. The performance issue can be reproduced by running the follow scripts in local spark-shell ``` spark.range(100*100).map(i => (i, "India")).toDF("uv", "country").createOrReplaceTempView("test") sql("select uv, country, percentile(uv, 0.95) over (partition by country) as ptc95 from test").collect.foreach(println) ``` Before this patch, the sql costs **128048 ms**. With this patch, the sql costs **3485 ms**. If we increase the data size to 1000*1000 for example, then spark cannot even produce result without this patch(I'v waited for several hours). ### Does this PR introduce any user-facing change? NO ### How was this patch tested? Existing UT Closes #27558 from WangGuangxin/windows. Authored-by: wangguangxin.cn <wangguangxin.cn@gmail.com> Signed-off-by: herman <herman@databricks.com>

view details

Yuming Wang

commit sha 76ddb6d835e773d87d93fd2dac008ace1f756d3b

[SPARK-30755][SQL] Update migration guide and add actionable exception for HIVE-15167 ### What changes were proposed in this pull request? [HIVE-15167](https://issues.apache.org/jira/browse/HIVE-15167) removed the `SerDe` interface. This may break custom `SerDe` builds for Hive 1.2. This PR update the migration guide for this change. ### Why are the changes needed? Otherwise: ``` 2020-01-27 05:11:20.446 - stderr> 20/01/27 05:11:20 INFO DAGScheduler: ResultStage 2 (main at NativeMethodAccessorImpl.java:0) failed in 1.000 s due to Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 13, 10.110.21.210, executor 1): java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/SerDe 2020-01-27 05:11:20.446 - stderr> at java.lang.ClassLoader.defineClass1(Native Method) 2020-01-27 05:11:20.446 - stderr> at java.lang.ClassLoader.defineClass(ClassLoader.java:756) 2020-01-27 05:11:20.446 - stderr> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) 2020-01-27 05:11:20.446 - stderr> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) 2020-01-27 05:11:20.446 - stderr> at java.net.URLClassLoader.access$100(URLClassLoader.java:74) 2020-01-27 05:11:20.446 - stderr> at java.net.URLClassLoader$1.run(URLClassLoader.java:369) 2020-01-27 05:11:20.446 - stderr> at java.net.URLClassLoader$1.run(URLClassLoader.java:363) 2020-01-27 05:11:20.446 - stderr> at java.security.AccessController.doPrivileged(Native Method) 2020-01-27 05:11:20.446 - stderr> at java.net.URLClassLoader.findClass(URLClassLoader.java:362) 2020-01-27 05:11:20.446 - stderr> at java.lang.ClassLoader.loadClass(ClassLoader.java:418) 2020-01-27 05:11:20.446 - stderr> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) 2020-01-27 05:11:20.446 - stderr> at java.lang.ClassLoader.loadClass(ClassLoader.java:405) 2020-01-27 05:11:20.446 - stderr> at java.lang.ClassLoader.loadClass(ClassLoader.java:351) 2020-01-27 05:11:20.446 - stderr> at java.lang.Class.forName0(Native Method) 2020-01-27 05:11:20.446 - stderr> at java.lang.Class.forName(Class.java:348) 2020-01-27 05:11:20.446 - stderr> at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) ..... ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manual test Closes #27492 from wangyum/SPARK-30755. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

Yuanjian Li

commit sha 5ffc5ff55e86c180ad8a7fafb0e49e268bbfbca5

[SPARK-11150][SQL][FOLLOWUP] Move sql/dynamicpruning to sql/execution/dynamicpruning ### What changes were proposed in this pull request? Follow-up work for #25600. In this PR, we move `sql/dynamicpruning` to `sql/execution/dynamicpruning`. ### Why are the changes needed? Fix the unexpected public APIs in 3.0.0 #27560. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing UT. Closes #27581 from xuanyuanking/SPARK-11150-follow. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

yi.wu

commit sha a1d536cb3e34b74b433d31c475058f38520ab399

[SPARK-15616][FOLLOW-UP][SQL] Sub Optimizer should include super.postHocOptimizationBatches ### What changes were proposed in this pull request? Let sub optimizer's `postHocOptimizationBatches` also includes super's `postHocOptimizationBatches`. ### Why are the changes needed? It's necessary according to the design of catalyst optimizer. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass jenkins. Closes #27607 from Ngone51/spark_15616_followup. Authored-by: yi.wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha afaeb29599593f021c9ea47e52f8c70013a4afef

[SPARK-30808][SQL] Enable Java 8 time API in Thrift server ### What changes were proposed in this pull request? - Set `spark.sql.datetime.java8API.enabled` to `true` in `hiveResultString()`, and restore it back at the end of the call. - Convert collected `java.time.Instant` & `java.time.LocalDate` to `java.sql.Timestamp` and `java.sql.Date` for correct formatting. ### Why are the changes needed? Because of textual representation of timestamps/dates before 1582 year is incorrect: ```shell $ export TZ="America/Los_Angeles" $ ./bin/spark-sql -S ``` ```sql spark-sql> set spark.sql.session.timeZone=America/Los_Angeles; spark.sql.session.timeZone America/Los_Angeles spark-sql> SELECT DATE_TRUNC('MILLENNIUM', DATE '1970-03-20'); 1001-01-01 00:07:02 ``` It must be 1001-01-01 00:**00:00**. ### Does this PR introduce any user-facing change? Yes. After the changes: ```shell $ export TZ="America/Los_Angeles" $ ./bin/spark-sql -S ``` ```sql spark-sql> set spark.sql.session.timeZone=America/Los_Angeles; spark.sql.session.timeZone America/Los_Angeles spark-sql> SELECT DATE_TRUNC('MILLENNIUM', DATE '1970-03-20'); 1001-01-01 00:00:00 ``` ### How was this patch tested? By running hive-thiftserver tests. In particular: ``` ./build/sbt -Phadoop-2.7 -Phive-2.3 -Phive-thriftserver "hive-thriftserver/test:testOnly *SparkThriftServerProtocolVersionsSuite" ``` Closes #27552 from MaxGekk/hive-thriftserver-java8-time-api. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

push time in a day

push event htynkn/dubbo-samples

push time in a day

push event htynkn/spring-boot

dreis2211

commit sha a9dabe13bb4b1bb53256d9f9802c40dc1e4e5005

Remove redundant useJUnitPlatform declarations See gh-20206

view details

Stephane Nicoll

commit sha 6a0cef8015be10de958f7611558e369b414d238f

Merge pull request #20206 from dreis2211 * pr/20206: Remove redundant useJUnitPlatform declarations Closes gh-20206

view details

Mike Smithson

commit sha 020ae2c7baaed85920336fe67db3b9cccfbfe3d9

Revisit PluginXmlParserTests See gh-20190

view details

Stephane Nicoll

commit sha 3ba411e04d2aab74b0dfd39d7f855d8a2fe726a5

Polish "Revisit PluginXmlParserTests" See gh-20190

view details

Stephane Nicoll

commit sha 5cb24c2584dfb67269552ec5969a64b54d71ab3f

Merge pull request #20190 from mikesmithson * pr/20190: Polish "Revisit PluginXmlParserTests" Revisit PluginXmlParserTests Closes gh-20190

view details

Eddú Meléndez

commit sha 407e237f109dc5a34a724c7cf3f39300baaa4578

Add support for configuring Liquibase tag property See gh-19316

view details

Stephane Nicoll

commit sha 4bcf4245d1d6e7b655b516125d3c89821069bf0c

Polish "Add support for configuring Liquibase tag property" See gh-19316

view details

Stephane Nicoll

commit sha d4c7315369e7e9dce6eb1c77e5f23d1e670247c8

Merge pull request #19316 from eddumelendez * pr/19316: Polish "Add support for configuring Liquibase tag property" Add support for configuring Liquibase tag property Closes gh-19316

view details

push time in a day

push event htynkn/dubbo-admin

push time in a day

push event htynkn/fish-redux

push time in a day

push event htynkn/rpc-benchmark

push time in a day

issue opened htynkn/kiwi

Support plugin deletion

created time in 2 days

delete branch htynkn/kiwi

delete branch : issue-15

delete time in 2 days

push event htynkn/kiwi

Huang YunKun

commit sha 2c123a54c8b15722452155d442669577bec5b6d4

Issue 15 (#16)

view details

push time in 2 days

PR merged htynkn/kiwi

Issue 15

close #15

+301 -96

1 comment

15 changed files

htynkn

pr closed time in 2 days

issue closed htynkn/kiwi

Support the SitedD standalone plugin center

http://sited.ka94.com/

closed time in 2 days

htynkn

pull request comment htynkn/kiwi

Issue 15

Codecov Report

:exclamation: No coverage uploaded for pull request base (master@98c0a30). The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #16   +/-   ##
=========================================
  Coverage          ?   64.48%           
=========================================
  Files             ?       20           
  Lines             ?      428           
  Branches          ?        0           
=========================================
  Hits              ?      276           
  Misses            ?      152           
  Partials          ?        0           
Impacted Files Coverage Δ
lib/domain/comic_section_detail.dart 75.00% <0.00%> (ø)
lib/domain/plugin_info.dart 100.00% <0.00%> (ø)
lib/service/simple_logging_service.dart 50.00% <0.00%> (ø)
lib/domain/dao/plugin_db_object.dart 100.00% <0.00%> (ø)
lib/util/decryption_util.dart 70.00% <0.00%> (ø)
lib/domain/enum/js_engine_type.dart 100.00% <0.00%> (ø)
lib/service/adapter/js_engine_adapter.dart 0.00% <0.00%> (ø)
lib/domain/plugin.dart 100.00% <0.00%> (ø)
lib/service/sited_plugin_provider.dart 94.28% <0.00%> (ø)
lib/domain/comic_section.dart 85.71% <0.00%> (ø)
... and 10 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 98c0a30...b4cc01e.

htynkn

comment created time in 2 days

PR opened htynkn/kiwi

Issue 15

close #15

+301 -96

0 comment

15 changed files

pr created time in 2 days

push event htynkn/kiwi

Huang Yunkun

commit sha b4cc01e60fda143e18c5d2520056f0c58efe87b8

Switch plugins

view details

push time in 2 days

push event htynkn/spring-boot

zhangt2333

commit sha e2d87a89d0ab942df53e3029e1fe42411732a81f

Polish See gh-20192

view details

Stephane Nicoll

commit sha 03bee839917ce819b1c4056be4231dd7e815a233

Update copyright date See gh-20192

view details

Stephane Nicoll

commit sha 6fc25a309c1385d08e4ca035d0c3eb554494ef03

Merge pull request #20192 from zhangt2333 * pr/20192: Update copyright date Polish Closes gh-20192

view details

push time in 2 days

push event htynkn/dubbo

push time in 2 days

push event htynkn/spark

Wu, Xiaochang

commit sha f5238ea6cb0d2cfa69ae0488df94b29cc50065e0

[GRAPHX][MINOR] Fix typo setRest => setDest ### What changes were proposed in this pull request? Fix typo def setRest(dstId: VertexId, localDstId: Int, dstAttr: VD, attr: ED) to def setDest(dstId: VertexId, localDstId: Int, dstAttr: VD, attr: ED) ### Why are the changes needed? Typo ### Does this PR introduce any user-facing change? No ### How was this patch tested? N/A Closes #27594 from xwu99/fix-graphx-setDest. Authored-by: Wu, Xiaochang <xiaochang.wu@intel.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

Huaxin Gao

commit sha 0a03e7e679771da8556fae72b35edf21ae71ac44

[SPARK-30691][SQL][DOC][FOLLOW-UP] Make link names exactly the same as the side bar names ### What changes were proposed in this pull request? Make link names exactly the same as the side bar names ### Why are the changes needed? Make doc look better ### Does this PR introduce any user-facing change? before: ![image](https://user-images.githubusercontent.com/13592258/74578603-ad300100-4f4a-11ea-8430-11fccf31eab4.png) after: ![image](https://user-images.githubusercontent.com/13592258/74578670-eff1d900-4f4a-11ea-97d8-5908c0e50e95.png) ### How was this patch tested? Manually build and check the docs Closes #27591 from huaxingao/spark-doc-followup. Authored-by: Huaxin Gao <huaxing@us.ibm.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

Yuanjian Li

commit sha 01cc852982cd065e08f9a652c14a0514f49fb662

[SPARK-30803][DOCS] Fix the home page link for Scala API document ### What changes were proposed in this pull request? Change the link to the Scala API document. ``` $ git grep "#org.apache.spark.package" docs/_layouts/global.html: <li><a href="api/scala/index.html#org.apache.spark.package">Scala</a></li> docs/index.md:* [Spark Scala API (Scaladoc)](api/scala/index.html#org.apache.spark.package) docs/rdd-programming-guide.md:[Scala](api/scala/#org.apache.spark.package), [Java](api/java/), [Python](api/python/) and [R](api/R/). ``` ### Why are the changes needed? The home page link for Scala API document is incorrect after upgrade to 3.0 ### Does this PR introduce any user-facing change? Document UI change only. ### How was this patch tested? Local test, attach screenshots below: Before: ![image](https://user-images.githubusercontent.com/4833765/74335713-c2385300-4dd7-11ea-95d8-f5a3639d2578.png) After: ![image](https://user-images.githubusercontent.com/4833765/74335727-cbc1bb00-4dd7-11ea-89d9-4dcc1310e679.png) Closes #27549 from xuanyuanking/scala-doc. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

zhengruifeng

commit sha 8ebbf85a85f7327a97e0c64315f266459b39dbe7

[SPARK-30772][ML][SQL] avoid tuple assignment because it will circumvent the transient tag ### What changes were proposed in this pull request? it is said in [LeastSquaresAggregator](https://github.com/apache/spark/blob/12e1bbaddbb2ef304b5880a62df6683fcc94ea54/mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LeastSquaresAggregator.scala#L188) that : > // do not use tuple assignment above because it will circumvent the transient tag I then check this issue with Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_241) ### Why are the changes needed? avoid tuple assignment because it will circumvent the transient tag ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27523 from zhengruifeng/avoid_tuple_assign_to_transient. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>

view details

push time in 2 days

push event htynkn/dubbo-admin

push time in 2 days

push event htynkn/dubbo-samples

push time in 2 days

push event htynkn/rpc-benchmark

push time in 2 days

push event htynkn/fish-redux

push time in 2 days

create branch htynkn/kiwi

branch : issue-15

created branch time in 3 days

push event htynkn/dubbo

tangcent

commit sha d30ca86395c40f366ba5e4367be6963da60d5ac3

fix typo `CHARECTER` -> `CHARACTER` (#5744)

view details

push time in 3 days

push event htynkn/spark

Maxim Gekk

commit sha 8b73b92aadd685b29ef3d9b33366f5e1fd3dae99

[SPARK-30826][SQL] Respect reference case in `StringStartsWith` pushed down to parquet ### What changes were proposed in this pull request? In the PR, I propose to convert the attribute name of `StringStartsWith` pushed down to the Parquet datasource to column reference via the `nameToParquetField` map. Similar conversions are performed for other source filters pushed down to parquet. ### Why are the changes needed? This fixes the bug described in [SPARK-30826](https://issues.apache.org/jira/browse/SPARK-30826). The query from an external table: ```sql CREATE TABLE t1 (col STRING) USING parquet OPTIONS (path '$path') ``` created on top of written parquet files by `Seq("42").toDF("COL").write.parquet(path)` returns wrong empty result: ```scala spark.sql("SELECT * FROM t1 WHERE col LIKE '4%'").show +---+ |col| +---+ +---+ ``` ### Does this PR introduce any user-facing change? Yes. After the changes the result is correct for the example above: ```scala spark.sql("SELECT * FROM t1 WHERE col LIKE '4%'").show +---+ |col| +---+ | 42| +---+ ``` ### How was this patch tested? Added a test to `ParquetFilterSuite` Closes #27574 from MaxGekk/parquet-StringStartsWith-case-sens. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

push time in 3 days

push event htynkn/spring-boot

Andy Wilkinson

commit sha bf8ed44453e18a40fe889d55ceb6d7936dedefca

Upgrade to Spring AMQP 2.2.4.RELEASE Closes gh-20106

view details

Andy Wilkinson

commit sha ff4de95b3eecbdba0567a19ff5e0cab83415a51c

Merge branch '2.2.x'

view details

push time in 3 days

push event htynkn/fish-redux

push time in 3 days

push event htynkn/dubbo-admin

push time in 3 days

push event htynkn/dubbo-samples

push time in 3 days

push event htynkn/rpc-benchmark

push time in 3 days

issue closed htynkn/kiwi

0.1.1

closed time in 4 days

htynkn

issue opened htynkn/kiwi

Support the SitedD standalone plugin center

http://sited.ka94.com/

created time in 4 days

issue opened htynkn/kiwi

0.1.1

created time in 4 days

push event htynkn/dubbo

push time in 4 days

push event htynkn/spring-boot

Stephane Nicoll

commit sha 32c1dd45a927e7620e4c2b5b5a7cb27fe39dcd21

Revert "Merge pull request #19926 from xak2000" Closes gh-19926

view details

Stephane Nicoll

commit sha 2ede9e63b9473333be406f103aed15df3121b4b5

Merge branch '2.1.x' into 2.2.x Closes gh-20117

view details

Stephane Nicoll

commit sha e3383b614c0e8f045a1aee74dbffe3de5e74f82d

Merge branch '2.2.x'

view details

Dave Syer

commit sha dcaaf9785f29b7aafe8a1377a89d0f013ee70b04

Remove duplicate auto-configuration class See gh-20168

view details

Stephane Nicoll

commit sha fc410f05722357ab155997317a9ecdf274cd2edd

Merge pull request #20168 from dsyer * pr/20168: Remove duplicate auto-configuration class Closes gh-20168

view details

Stephane Nicoll

commit sha a6fdbdcd80c199d40d1455c6c42a8a1b828a9811

Merge branch '2.2.x' Closes gh-20178

view details

Andy Wilkinson

commit sha c8907d46b40348f47a60055ca4016a6cba95d90c

Fix up-to-date checking of build info properties Closes gh-20135

view details

cbono

commit sha 852734b1296a0df108ac1d571072759412853cbf

Add support for configuring Jetty's backing queue See gh-19494

view details

Andy Wilkinson

commit sha d61b035640d4929fc4331aaa66542afdef9b7e27

Merge branch '2.1.x' into 2.2.x Closes gh-20183

view details

Stephane Nicoll

commit sha b56c4f1a4d89542089b538fefe2d95c2a6372edf

Polish "Add support for configuring Jetty's backing queue" See gh-19494

view details

Stephane Nicoll

commit sha f8173eb76dbf38706346947e6d510a48e026f4f9

Merge pull request #19494 from bono007 * pr/19494: Polish "Add support for configuring Jetty's backing queue" Add support for configuring Jetty's backing queue Closes gh-19494

view details

Juzer Ali

commit sha 30f7f9c9c4db53073cd86418dfc46c71f17c87d2

Document sanitized keys and uri sanitization behavior See gh-20169

view details

Stephane Nicoll

commit sha 40d1727cc53f1498c5e902d4e7cc76033a8c9a14

Polish "Document sanitized keys and uri sanitization behavior" See gh-20169

view details

Stephane Nicoll

commit sha 7a114995b38e707ff2de6f1892de6b4747b9a5f2

Merge pull request #20169 from juzerali * pr/20169: Polish "Document sanitized keys and uri sanitization behavior" Document sanitized keys and uri sanitization behavior Closes gh-20169

view details

Stephane Nicoll

commit sha ddeac66ca28860d3a0d8a03e91c1b411ff578c9c

Merge branch '2.2.x' Closes gh-20186

view details

Andy Wilkinson

commit sha b9c2d775a96f1fde45b4a85c695e07b0fb0f7120

Merge branch '2.2.x' Closes gh-20185

view details

Andy Wilkinson

commit sha e065ee7be2b1497cdde634e91a0e30a56dcef996

Merge branch '2.1.x' into 2.2.x Closes gh-20183

view details

Andy Wilkinson

commit sha d1aa8c02f93ae9096b2f7c44e6b0ae06e48b1066

Merge branch '2.2.x'

view details

Madhura Bhave

commit sha 4ac12660c650d233f8a64e324b3d87b5dfd80fe5

Explicitly enable config properties scan in java release scripts This commit also upgrades the Spring Boot version used by the scripts to 2.2.4 which is why the explicit annotation is required. Closes gh-20174

view details

Madhura Bhave

commit sha 0ec1ed4642633902f3d4fcae6f129e928b2ac448

Merge branch '2.2.x' Closes gh-20188

view details

push time in 4 days

push event htynkn/spark

sarthfrey-db

commit sha 57254c9719f9af9ad985596ed7fbbaafa4052002

[SPARK-30667][CORE] Add allGather method to BarrierTaskContext ### What changes were proposed in this pull request? The `allGather` method is added to the `BarrierTaskContext`. This method contains the same functionality as the `BarrierTaskContext.barrier` method; it blocks the task until all tasks make the call, at which time they may continue execution. In addition, the `allGather` method takes an input message. Upon returning from the `allGather` the task receives a list of all the messages sent by all the tasks that made the `allGather` call. ### Why are the changes needed? There are many situations where having the tasks communicate in a synchronized way is useful. One simple example is if each task needs to start a server to serve requests from one another; first the tasks must find a free port (the result of which is undetermined beforehand) and then start making requests, but to do so they each must know the port chosen by the other task. An `allGather` method would allow them to inform each other of the port they will run on. ### Does this PR introduce any user-facing change? Yes, an `BarrierTaskContext.allGather` method will be available through the Scala, Java, and Python APIs. ### How was this patch tested? Most of the code path is already covered by tests to the `barrier` method, since this PR includes a refactor so that much code is shared by the `barrier` and `allGather` methods. However, a test is added to assert that an all gather on each tasks partition ID will return a list of every partition ID. An example through the Python API: ```python >>> from pyspark import BarrierTaskContext >>> >>> def f(iterator): ... context = BarrierTaskContext.get() ... return [context.allGather('{}'.format(context.partitionId()))] ... >>> sc.parallelize(range(4), 4).barrier().mapPartitions(f).collect()[0] [u'3', u'1', u'0', u'2'] ``` Closes #27395 from sarthfrey/master. Lead-authored-by: sarthfrey-db <sarth.frey@databricks.com> Co-authored-by: sarthfrey <sarth.frey@gmail.com> Signed-off-by: Xiangrui Meng <meng@databricks.com>

view details

Xingbo Jiang

commit sha fa3517cdb163b0589dc02c7d1fefb65be811f65f

Revert "[SPARK-30667][CORE] Add allGather method to BarrierTaskContext" This reverts commit 57254c9719f9af9ad985596ed7fbbaafa4052002.

view details

David Toneian

commit sha 25db8c71a2100c167b8c2d7a6c540ebc61db9b73

[PYSPARK][DOCS][MINOR] Changed `:func:` to `:attr:` Sphinx roles, fixed links in documentation of `Data{Frame,Stream}{Reader,Writer}` This commit is published into the public domain. ### What changes were proposed in this pull request? This PR fixes the documentation of `DataFrameReader`, `DataFrameWriter`, `DataStreamReader`, and `DataStreamWriter`, where attributes of other classes were misrepresented as functions. Additionally, creation of hyperlinks across modules was fixed in these instances. ### Why are the changes needed? The old state produced documentation that suggested invalid usage of PySpark objects (accessing attributes as though they were callable.) ### Does this PR introduce any user-facing change? No, except for improved documentation. ### How was this patch tested? No test added; documentation build runs through. Closes #27553 from DavidToneian/docfix-DataFrameReader-DataFrameWriter-DataStreamReader-DataStreamWriter. Authored-by: David Toneian <david@toneian.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

maryannxue

commit sha 0aed77a0155b404e39bc5dbc0579e29e4c7bf887

[SPARK-30801][SQL] Subqueries should not be AQE-ed if main query is not ### What changes were proposed in this pull request? This PR makes sure AQE is either enabled or disabled for the entire query, including the main query and all subqueries. Currently there are unsupported queries by AQE, e.g., queries that contain DPP filters. We need to make sure that if the main query is unsupported, none of the sub-queries should apply AQE, otherwise it can lead to performance regressions due to missed opportunity of sub-query reuse. ### Why are the changes needed? To get rid of potential perf regressions when AQE is turned on. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Updated DynamicPartitionPruningSuite: 1. Removed the existing workaround `withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")` 2. Added `DynamicPartitionPruningSuiteAEOn` and `DynamicPartitionPruningSuiteAEOff` to enable testing this suite with AQE on and off options 3. Added a check in `checkPartitionPruningPredicate` to verify that the subqueries are always in sync with the main query in terms of whether AQE is applied. Closes #27554 from maryannxue/spark-30801. Authored-by: maryannxue <maryannxue@apache.org> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

David Toneian

commit sha b2134ee73cfad4d78aaf8f0a2898011ac0308e48

[SPARK-30823][PYTHON][DOCS] Set `%PYTHONPATH%` when building PySpark documentation on Windows This commit is published into the public domain. ### What changes were proposed in this pull request? In analogy to `python/docs/Makefile`, which has > export PYTHONPATH=$(realpath ..):$(realpath ../lib/py4j-0.10.8.1-src.zip) on line 10, this PR adds > set PYTHONPATH=..;..\lib\py4j-0.10.8.1-src.zip to `make2.bat`. Since there is no `realpath` in default installations of Windows, I left the relative paths unresolved. Per the instructions on how to build docs, `make.bat` is supposed to be run from `python/docs` as the working directory, so this should probably not cause issues (`%BUILDDIR%` is a relative path as well.) ### Why are the changes needed? When building the PySpark documentation on Windows, by changing directory to `python/docs` and running `make.bat` (which runs `make2.bat`), the majority of the documentation may not be built if pyspark is not in the default `%PYTHONPATH%`. Sphinx then reports that `pyspark` (and possibly dependencies) cannot be imported. If `pyspark` is in the default `%PYTHONPATH%`, I suppose it is that version of `pyspark` – as opposed to the version found above the `python/docs` directory – that is considered when building the documentation, which may result in documentation that does not correspond to the development version one is trying to build. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manual tests on my Windows 10 machine. Additional tests with other environments very welcome! Closes #27569 from DavidToneian/SPARK-30823. Authored-by: David Toneian <david@toneian.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

yi.wu

commit sha 99b8136a86030411e6bcbd312f40eb2a901ab0f0

[SPARK-25990][SQL] ScriptTransformation should handle different data types correctly ### What changes were proposed in this pull request? We should convert Spark InternalRows to hive data via `HiveInspectors.wrapperFor`. ### Why are the changes needed? We may hit below exception without this change: ``` [info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, 192.168.1.6, executor driver): java.lang.ClassCastException: org.apache.spark.sql.types.Decimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal [info] at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveJavaObject(JavaHiveDecimalObjectInspector.java:55) [info] at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:321) [info] at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292) [info] at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247) [info] at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231) [info] at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) [info] at org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread.$anonfun$run$2(ScriptTransformationExec.scala:300) [info] at org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread.$anonfun$run$2$adapted(ScriptTransformationExec.scala:281) [info] at scala.collection.Iterator.foreach(Iterator.scala:941) [info] at scala.collection.Iterator.foreach$(Iterator.scala:941) [info] at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) [info] at org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread.$anonfun$run$1(ScriptTransformationExec.scala:281) [info] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) [info] at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1932) [info] at org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread.run(ScriptTransformationExec.scala:270) ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Added new test. But please note that this test returns different result between Hive1.2 and Hive2.3 due to `HiveDecimal` or `SerDe` difference(don't know the root cause yet). Closes #27556 from Ngone51/script_transform. Lead-authored-by: yi.wu <yi.wu@databricks.com> Co-authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

HyukjinKwon

commit sha 2a270a731a3b1da9a0fb036d648dd522e5c4d5ad

[SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API ### What changes were proposed in this pull request? This PR fixes `DataFrameReader.csv(dataset: Dataset[String])` API to take a `Dataset[String]` originated from a column name different from `value`. This is a long-standing bug started from the very first place. `CSVUtils.filterCommentAndEmpty` assumed the `Dataset[String]` to be originated with `value` column. This PR changes to use the first column name in the schema. ### Why are the changes needed? For `DataFrameReader.csv(dataset: Dataset[String])` to support any `Dataset[String]` as the signature indicates. ### Does this PR introduce any user-facing change? Yes, ```scala val ds = spark.range(2).selectExpr("concat('a,b,', id) AS text").as[String] spark.read.option("header", true).option("inferSchema", true).csv(ds).show() ``` Before: ``` org.apache.spark.sql.AnalysisException: cannot resolve '`value`' given input columns: [text];; 'Filter (length(trim('value, None)) > 0) +- Project [concat(a,b,, cast(id#0L as string)) AS text#2] +- Range (0, 2, step=1, splits=Some(2)) ``` After: ``` +---+---+---+ | a| b| 0| +---+---+---+ | a| b| 1| +---+---+---+ ``` ### How was this patch tested? Unittest was added. Closes #27561 from HyukjinKwon/SPARK-30810. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha 7137a6d065edeaab97bf5bf49ffaca3d060a14fe

[SPARK-30766][SQL] Fix the timestamp truncation to the `HOUR` and `DAY` levels ### What changes were proposed in this pull request? In the PR, I propose to use Java 8 time API in timestamp truncations to the levels of `HOUR` and `DAY`. The problem is in the usage of `timeZone.getOffset(millis)` in days/hours truncations where the combined calendar (Julian + Gregorian) is used underneath. ### Why are the changes needed? The change fix wrong truncations. For example, the following truncation to hours should print `0010-01-01 01:00:00` but it outputs wrong timestamp: ```scala Seq("0010-01-01 01:02:03.123456").toDF() .select($"value".cast("timestamp").as("ts")) .select(date_trunc("HOUR", $"ts").cast("string")) .show(false) +------------------------------------+ |CAST(date_trunc(HOUR, ts) AS STRING)| +------------------------------------+ |0010-01-01 01:30:17 | +------------------------------------+ ``` ### Does this PR introduce any user-facing change? Yes. After the changes, the result of the example above is: ```scala +------------------------------------+ |CAST(date_trunc(HOUR, ts) AS STRING)| +------------------------------------+ |0010-01-01 01:00:00 | +------------------------------------+ ``` ### How was this patch tested? - Added new test to `DateFunctionsSuite` - By `DateExpressionsSuite` and `DateTimeUtilsSuite` Closes #27512 from MaxGekk/fix-trunc-old-timestamp. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

HyukjinKwon

commit sha b343757b1bd5d0344b82f36aa4d65ed34f840606

[SPARK-29748][DOCS][FOLLOW-UP] Add a note that the legacy environment variable to set in both executor and driver ### What changes were proposed in this pull request? This PR address the comment at https://github.com/apache/spark/pull/26496#discussion_r379194091 and improves the migration guide to explicitly note that the legacy environment variable to set in both executor and driver. ### Why are the changes needed? To clarify this env should be set both in driver and executors. ### Does this PR introduce any user-facing change? Nope. ### How was this patch tested? I checked it via md editor. Closes #27573 from HyukjinKwon/SPARK-29748. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

view details

Holden Karau

commit sha d273a2bb0fac452a97f5670edd69d3e452e3e57e

[SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support This PR is based on an existing/previou PR - https://github.com/apache/spark/pull/19045 ### What changes were proposed in this pull request? This changes adds a decommissioning state that we can enter when the cloud provider/scheduler lets us know we aren't going to be removed immediately but instead will be removed soon. This concept fits nicely in K8s and also with spot-instances on AWS / preemptible instances all of which we can get a notice that our host is going away. For now we simply stop scheduling jobs, in the future we could perform some kind of migration of data during scale-down, or at least stop accepting new blocks to cache. There is a design document at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?usp=sharing ### Why are the changes needed? With more move to preemptible multi-tenancy, serverless environments, and spot-instances better handling of node scale down is required. ### Does this PR introduce any user-facing change? There is no API change, however an additional configuration flag is added to enable/disable this behaviour. ### How was this patch tested? New integration tests in the Spark K8s integration testing. Extension of the AppClientSuite to test decommissioning seperate from the K8s. Closes #26440 from holdenk/SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r4. Lead-authored-by: Holden Karau <hkarau@apple.com> Co-authored-by: Holden Karau <holden@pigscanfly.ca> Signed-off-by: Holden Karau <hkarau@apple.com>

view details

DB Tsai

commit sha d0f961476031b62bda0d4d41f7248295d651ea92

[SPARK-30289][SQL] Partitioned by Nested Column for `InMemoryTable` ### What changes were proposed in this pull request? 1. `InMemoryTable` was flatting the nested columns, and then the flatten columns was used to look up the indices which is not correct. This PR implements partitioned by nested column for `InMemoryTable`. ### Why are the changes needed? This PR implements partitioned by nested column for `InMemoryTable`, so we can test this features in DSv2 ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing unit tests and new tests. Closes #26929 from dbtsai/addTests. Authored-by: DB Tsai <d_tsai@apple.com> Signed-off-by: DB Tsai <d_tsai@apple.com>

view details

push time in 4 days

push event htynkn/dubbo-admin

push time in 4 days

push event htynkn/fish-redux

push time in 4 days

push event htynkn/dubbo-samples

push time in 4 days

push event htynkn/rpc-benchmark

push time in 4 days

push event htynkn/dubbo

ken.lj

commit sha 2031110c35f107997260a41b6c97f32eb09b9723

fix misuse, call getConsumerUrl

view details

push time in 5 days

push event htynkn/spring-boot

Russell Scheerer

commit sha d61029a37aaba0a66c23631bb587311a74da1284

Fix Spring Boot version reference in spring-boot-starter-parent See gh-20143

view details

Stephane Nicoll

commit sha 90f7a3fd327901a6d7e4f01e2b5c2bcac979ca6a

Merge pull request #20143 from scheerer * pr/20143: Fix Spring Boot version reference in spring-boot-starter-parent Closes gh-20143

view details

Stephane Nicoll

commit sha cd11b74db7cc7f121ac3eedba7c9c309fd382c7d

Remove usage of Infinispan BOM Closes gh-20154

view details

Stephane Nicoll

commit sha db1c9f4058f817a0cace4d0939f1401e42d58da4

Remove plugin management for infinispan-protocol-parser-generator Closes gh-20155

view details

dreis2211

commit sha e1f743a21b7def869936142495cfb84f882c2266

Upgrade CI to Docker 19.03.5 See gh-20157

view details

Stephane Nicoll

commit sha 41b054460bf1a3e4fcdf18eb9ad41919782bf015

Merge pull request #20157 from dreis2211 * pr/20157: Upgrade CI to Docker 19.03.5 Closes gh-20157

view details

dreis2211

commit sha 23bf948101a83e811e53b1c2a22acdb3421f06ea

Upgrade to Asciidoctor Gradle JVM 3.0.0 See gh-19953

view details

Andy Wilkinson

commit sha f2a1840c88e530fe2786ee258046aa2dc0f339bd

Merge pull request #19953 from dreis2211 * gh-19953: Upgrade to Asciidoctor Gradle JVM 3.0.0 Closes gh-19953

view details

Andy Wilkinson

commit sha 8577a39a964a05f37b68ade0fe2f7a132068c794

Upgrade to Spring Asciidoctor Extensions 0.4.1.RELEASE Closes gh-20158

view details

Andy Wilkinson

commit sha 903a4a48e8a426c7bb4c6abf5c6000c1bc2e6810

Fix configuration property references in the reference docs Previously, the configprop macro was being used in the source but the extension that implements the macro was not available to Asciidoctor. This led to the references not being checked at build time and the macro being left as-is in the rendered documentation. This commit updates the dependencies that are available to Asciidoctor to include the extension and the projects which define the configuration properties referenced in the documentation. Closes gh-20149

view details

Stephane Nicoll

commit sha 06c85e96c3c66fc711f246bbb255872d067f99b8

Merge branch '2.1.x' into 2.2.x Closes gh-20159

view details

Stephane Nicoll

commit sha 475169a80e2c4573f86551ede380a339f57da944

Merge branch '2.2.x' Closes gh-20160

view details

Andy Wilkinson

commit sha 68f59a0d4042c8ded24f77b87386f98b60742a67

Move dependency management for JNA into spring-boot-parent Previously, dependency management for JNA was provided by spring-boot-dependencies so it affected users' applications. It was originally added for Elasticsearch but is no longer needed for that purpose. We use JNA in spring-boot-buildpack-platform which is used by our Gradle and Maven plugins and should not affect an application's use of JNA. This commit moves management of JNA from spring-boot-dependencies into spring-boot-parent. This means that users' applications will now be free to use whatever version of JNA meets their needs while still controlling the version used for image building via Gradle or Maven. Closes gh-20156

view details

Stephane Nicoll

commit sha 362297a010083b655903aec0c4d1976293c31da8

Fix formatting

view details

Stephane Nicoll

commit sha 738e8b39c7cb096a685ca051a557ffa4c1b185b0

Upgrade to spring javaformat 0.0.20

view details

Stephane Nicoll

commit sha b8ccfbafd00f23b3cbc1993dcbc042a49dbf8458

Upgrade to Spring Boot 2.2.4

view details

Stephane Nicoll

commit sha faaf9a7e0c5b33e190ae7ca18985c1d4f83a9525

Upgrade to SendGrid 4.4.4 Closes gh-20092

view details

Stephane Nicoll

commit sha ca4d5b13339c74958c475c918b5012ab2f4f877c

Upgrade to Flyway 6.2.3 Closes gh-20161

view details

Stephane Nicoll

commit sha 5f826cdbd2dddd08748ad3defd046addc5214b03

Upgrade to Hibernate 5.4.11.Final Closes gh-20162

view details

Stephane Nicoll

commit sha be58d1a3100c8f2300e3a2fc1581a339d8079273

Upgrade to Infinispan 10.1.2.Final Closes gh-20163

view details

push time in 5 days

push event htynkn/spark

iRakson

commit sha 926e3a1efe9e142804fcbf52146b22700640ae1b

[SPARK-30790] The dataType of map() should be map<null,null> ### What changes were proposed in this pull request? `spark.sql("select map()")` returns {}. After these changes it will return map<null,null> ### Why are the changes needed? After changes introduced due to #27521, it is important to maintain consistency while using map(). ### Does this PR introduce any user-facing change? Yes. Now map() will give map<null,null> instead of {}. ### How was this patch tested? UT added. Migration guide updated as well Closes #27542 from iRakson/SPARK-30790. Authored-by: iRakson <raksonrakesh@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
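A quick way to observe the reported type from the Scala side (a small sketch based on the description above; the expected output is what the PR states, not something verified here):

```scala
// Per the PR description, the empty map literal should now report its type
// as map<null,null> rather than printing as a bare {}.
val df = spark.sql("SELECT map() AS m")
println(df.schema("m").dataType.simpleString)  // expected: map<null,null>
```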

view details

maryannxue

commit sha 453d5261b22ebcdd5886e65ab9d0d9857051e76a

[SPARK-30528][SQL] Turn off DPP subquery duplication by default ### What changes were proposed in this pull request? This PR adds a config for Dynamic Partition Pruning subquery duplication and turns it off by default due to its potential performance regression. When planning a DPP filter, it seeks to reuse the broadcast exchange relation if the corresponding join is a BHJ with the filter relation being on the build side, otherwise it will either opt out or plan the filter as an un-reusable subquery duplication based on the cost estimate. However, the cost estimate is not accurate and only takes into account the table scan overhead, thus adding an un-reusable subquery duplication DPP filter can sometimes cause perf regression. This PR turns off the subquery duplication DPP filter by: 1. adding a config `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly` and setting it `true` by default. 2. removing the existing meaningless config `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcast` since we always want to reuse broadcast results if possible. ### Why are the changes needed? This is to fix a potential performance regression caused by DPP. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Updated DynamicPartitionPruningSuite to test the new configuration. Closes #27551 from maryannxue/spark-30528. Authored-by: maryannxue <maryannxue@apache.org> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
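The new flag can be toggled like any other SQL conf; a brief usage sketch using the property name quoted in the description:

```scala
// Defaults to true per the PR: DPP filters are planned only when the
// broadcast exchange can be reused. Setting it to false re-enables the
// potentially expensive subquery-duplication filters.
spark.conf.set(
  "spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")
```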

view details

Terry Kim

commit sha a6b4b914f2d2b873b0e9b9d446fda69dc74c3cf8

[SPARK-30613][SQL] Support Hive style REPLACE COLUMNS syntax ### What changes were proposed in this pull request? This PR proposes to support Hive-style `ALTER TABLE ... REPLACE COLUMNS ...` as described in https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns The user now can do the following: ```SQL CREATE TABLE t (col1 int, col2 int) USING Foo; ALTER TABLE t REPLACE COLUMNS (col2 string COMMENT 'comment2', col3 int COMMENT 'comment3'); ``` , which drops the existing columns `col1` and `col2`, and add new columns `col2` and `col3`. ### Why are the changes needed? This is a new DDL statement. Spark currently supports the Hive-style `ALTER TABLE ... CHANGE COLUMN ...`, so this new addition can be useful. ### Does this PR introduce any user-facing change? Yes, adding a new DDL statement. ### How was this patch tested? More tests to be added. Closes #27482 from imback82/replace_cols. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

beliefer

commit sha 04604b9899cc43a9726d671061ff305912fdb85f

[SPARK-30758][SQL][TESTS] Improve bracketed comments tests ### What changes were proposed in this pull request? Although Spark SQL supports bracketed comments, `SQLQueryTestSuite` does not handle them well, so the generated golden files cannot display bracketed comments correctly. This PR improves the treatment of bracketed comments and adds three test cases in `PlanParserSuite`. Spark SQL does not support nested bracketed comments; https://github.com/apache/spark/pull/27495 proposes to support them. ### Why are the changes needed? The golden files do not render correctly. ### Does this PR introduce any user-facing change? No ### How was this patch tested? New UT. Closes #27481 from beliefer/ansi-brancket-comments. Authored-by: beliefer <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
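For reference, a bracketed comment as exercised by these tests looks like the following, shown through the Scala API (the query itself is illustrative):

```scala
// A single-level bracketed comment is supported; nested bracketed comments
// are not (that support was proposed separately in apache/spark#27495).
val df = spark.sql(
  """/* bracketed comment
    |   spanning two lines */
    |SELECT 1 AS one""".stripMargin)
df.show()
```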

view details

Yuming Wang

commit sha fb0e07b08ccaeda50a5121bcb1fab69a1ff749c4

[SPARK-29231][SQL] Constraints should be inferred from cast equality constraint ### What changes were proposed in this pull request? This PR add support infer constraints from cast equality constraint. For example: ```scala scala> spark.sql("create table spark_29231_1(c1 bigint, c2 bigint)") res0: org.apache.spark.sql.DataFrame = [] scala> spark.sql("create table spark_29231_2(c1 int, c2 bigint)") res1: org.apache.spark.sql.DataFrame = [] scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain == Physical Plan == *(2) Project [c1#5L, c2#6L] +- *(2) BroadcastHashJoin [c1#5L], [cast(c1#7 as bigint)], Inner, BuildRight :- *(2) Project [c1#5L, c2#6L] : +- *(2) Filter (isnotnull(c1#5L) AND (c1#5L = 1)) : +- *(2) ColumnarToRow : +- FileScan parquet default.spark_29231_1[c1#5L,c2#6L] Batched: true, DataFilters: [isnotnull(c1#5L), (c1#5L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct<c1:bigint,c2:bigint> +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#209] +- *(1) Project [c1#7] +- *(1) Filter isnotnull(c1#7) +- *(1) ColumnarToRow +- FileScan parquet default.spark_29231_2[c1#7] Batched: true, DataFilters: [isnotnull(c1#7)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct<c1:int> ``` After this PR: ```scala scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain == Physical Plan == *(2) Project [c1#0L, c2#1L] +- *(2) BroadcastHashJoin [c1#0L], [cast(c1#2 as bigint)], Inner, BuildRight :- *(2) Project [c1#0L, c2#1L] : +- *(2) Filter (isnotnull(c1#0L) AND (c1#0L = 1)) : +- *(2) ColumnarToRow : +- FileScan parquet default.spark_29231_1[c1#0L,c2#1L] Batched: true, DataFilters: [isnotnull(c1#0L), (c1#0L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct<c1:bigint,c2:bigint> +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#99] +- *(1) Project [c1#2] +- *(1) Filter ((cast(c1#2 as bigint) = 1) AND isnotnull(c1#2)) +- *(1) ColumnarToRow +- FileScan parquet default.spark_29231_2[c1#2] Batched: true, DataFilters: [(cast(c1#2 as bigint) = 1), isnotnull(c1#2)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct<c1:int> ``` ### Why are the changes needed? Improve query performance. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Unit test. Closes #27252 from wangyum/SPARK-29231. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Liang Zhang

commit sha 82d0aa37ae521231d8067e473c6ea79a118a115a

[SPARK-30762] Add dtype=float32 support to vector_to_array UDF ### What changes were proposed in this pull request? In this PR, we add a parameter in the python function vector_to_array(col) that allows converting to a column of arrays of Float (32bits) in scala, which would be mapped to a numpy array of dtype=float32. ### Why are the changes needed? In the downstream ML training, using float32 instead of float64 (default) would allow a larger batch size, i.e., allow more data to fit in the memory. ### Does this PR introduce any user-facing change? Yes. Old: `vector_to_array()` only take one param ``` df.select(vector_to_array("colA"), ...) ``` New: `vector_to_array()` can take an additional optional param: `dtype` = "float32" (or "float64") ``` df.select(vector_to_array("colA", "float32"), ...) ``` ### How was this patch tested? Unit test in scala. doctest in python. Closes #27522 from liangz1/udf-float32. Authored-by: Liang Zhang <liang.zhang@databricks.com> Signed-off-by: WeichenXu <weichen.xu@databricks.com>

view details

Takeshi Yamamuro

commit sha 3c4044ea77fe3b1268b52744cd4f1ae61f17a9a8

[SPARK-30703][SQL][DOCS] Add a document for the ANSI mode ### What changes were proposed in this pull request? This PR intends to add a document for the ANSI mode (screenshots of the new documentation page omitted). ### Why are the changes needed? For better document coverage and usability. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27489 from maropu/SPARK-30703. Authored-by: Takeshi Yamamuro <yamamuro@apache.org> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>

view details

Wenchen Fan

commit sha a4ceea6868002b88161517b14b94a2006be8af1b

[SPARK-30751][SQL] Combine the skewed readers into one in AQE skew join optimizations ### What changes were proposed in this pull request? This is a followup of https://github.com/apache/spark/pull/26434 This PR uses one special shuffle reader for skew join, so that we only have one join after optimization. In order to do that, this PR 1. adds a very general `CustomShuffledRowRDD` which supports all kinds of partition arrangement. 2. moves the logic of coalescing shuffle partitions to a util function, and calls it during skew join optimization, to totally decouple it from the `ReduceNumShufflePartitions` rule. It's too complicated to have skew join interfere with `ReduceNumShufflePartitions`, as you need to consider the size of split partitions which don't respect the target size already. ### Why are the changes needed? The current skew join optimization has a serious performance issue: the size of the query plan depends on the number and size of skewed partitions. ### Does this PR introduce any user-facing change? no ### How was this patch tested? existing tests; tested the UI manually: ![image](https://user-images.githubusercontent.com/3182036/74357390-cfb30480-4dfa-11ea-83f6-825d1b9379ca.png) explain output ``` AdaptiveSparkPlan(isFinalPlan=true) +- OverwriteByExpression org.apache.spark.sql.execution.datasources.noop.NoopTable$403a2ed5, [AlwaysTrue()], org.apache.spark.sql.util.CaseInsensitiveStringMap1f +- *(5) SortMergeJoin(skew=true) [key1#2L], [key2#6L], Inner :- *(3) Sort [key1#2L ASC NULLS FIRST], false, 0 : +- SkewJoinShuffleReader 2 skewed partitions with size(max=5 KB, min=5 KB, avg=5 KB) : +- ShuffleQueryStage 0 : +- Exchange hashpartitioning(key1#2L, 200), true, [id=#53] : +- *(1) Project [(id#0L % 2) AS key1#2L] : +- *(1) Filter isnotnull((id#0L % 2)) : +- *(1) Range (0, 100000, step=1, splits=6) +- *(4) Sort [key2#6L ASC NULLS FIRST], false, 0 +- SkewJoinShuffleReader 2 skewed partitions with size(max=5 KB, min=5 KB, avg=5 KB) +- ShuffleQueryStage 1 +- Exchange hashpartitioning(key2#6L, 200), true, [id=#64] +- *(2) Project [((id#4L % 2) + 1) AS key2#6L] +- *(2) Filter isnotnull(((id#4L % 2) + 1)) +- *(2) Range (0, 100000, step=1, splits=6) ``` Closes #27493 from cloud-fan/aqe. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: herman <herman@databricks.com>

view details

Dongjoon Hyun

commit sha 859699135cb63b57f5d844e762070065cedb4408

[SPARK-30807][K8S][TESTS] Support Java 11 in K8S integration tests ### What changes were proposed in this pull request? This PR aims to support JDK11 test in K8S integration tests. - This is an update in testing framework instead of individual tests. - This will enable JDK11 runtime test when you didn't installed JDK11 on your local system. ### Why are the changes needed? Apache Spark 3.0.0 adds JDK11 support, but K8s integration tests use JDK8 until now. ### Does this PR introduce any user-facing change? No. This is a dev-only test-related PR. ### How was this patch tested? This is irrelevant to Jenkins UT, but Jenkins K8S IT (JDK8) should pass. - https://github.com/apache/spark/pull/27559#issuecomment-585903489 (JDK8 Passed) And, manually do the following for JDK11 test. ``` $ NO_MANUAL=1 ./dev/make-distribution.sh --r --pip --tgz -Phadoop-3.2 -Pkubernetes $ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --java-image-tag 11-jre-slim --spark-tgz $PWD/spark-*.tgz ``` ``` $ docker run -it --rm kubespark/spark:1318DD8A-2B15-4A00-BC69-D0E90CED235B /usr/local/openjdk-11/bin/java --version | tail -n1 OpenJDK 64-Bit Server VM 18.9 (build 11.0.6+10, mixed mode) ``` Closes #27559 from dongjoon-hyun/SPARK-30807. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

Dongjoon Hyun

commit sha 74cd46eb691be5dc1cb1c496eeeaa2614945bd98

[SPARK-30816][K8S][TESTS] Fix dev-run-integration-tests.sh to ignore empty params ### What changes were proposed in this pull request? This PR aims to fix `dev-run-integration-tests.sh` to ignore empty params correctly. ### Why are the changes needed? The following script runs `mvn` integration test like the following. ``` $ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh ... build/mvn integration-test -f /Users/dongjoon/APACHE/spark/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pscala-2.12 -Pkubernetes -Pkubernetes-integration-tests -Djava.version=8 -Dspark.kubernetes.test.sparkTgz=N/A -Dspark.kubernetes.test.imageTag=N/A -Dspark.kubernetes.test.imageRepo=docker.io/kubespark -Dspark.kubernetes.test.deployMode=minikube -Dtest.include.tags=k8s -Dspark.kubernetes.test.namespace= -Dspark.kubernetes.test.serviceAccountName= -Dspark.kubernetes.test.kubeConfigContext= -Dspark.kubernetes.test.master= -Dtest.exclude.tags= -Dspark.kubernetes.test.jvmImage=spark -Dspark.kubernetes.test.pythonImage=spark-py -Dspark.kubernetes.test.rImage=spark-r ``` After this PR, the empty parameters like the followings will be skipped like the original design. ``` -Dspark.kubernetes.test.namespace= -Dspark.kubernetes.test.serviceAccountName= -Dspark.kubernetes.test.kubeConfigContext= -Dspark.kubernetes.test.master= -Dtest.exclude.tags= ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass the Jenkins K8S integration test. Closes #27566 from dongjoon-hyun/SPARK-30816. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

Ali Afroozeh

commit sha e2d3983de78f5c80fac066b7ee8bedd0987110dd

[SPARK-30798][SQL] Scope Session.active in QueryExecution ### What changes were proposed in this pull request? This PR scopes `SparkSession.active` to prevent problems with processing queries with possibly different spark sessions (and different configs). A new method, `withActive` is introduced on `SparkSession` that restores the previous spark session after the block of code is executed. ### Why are the changes needed? `SparkSession.active` is a thread local variable that points to the current thread's spark session. It is important to note that the `SQLConf.get` method depends on `SparkSession.active`. In the current implementation it is possible that `SparkSession.active` points to a different session which causes various problems. Most of these problems arise because part of the query processing is done using the configurations of a different session. For example, when creating a data frame using a new session, i.e., `session.sql("...")`, part of the data frame is constructed using the currently active spark session, which can be a different session from the one used later for processing the query. ### Does this PR introduce any user-facing change? The `withActive` method is introduced on `SparkSession`. ### How was this patch tested? Unit tests (to be added) Closes #27387 from dbaliafroozeh/UseWithActiveSessionInQueryExecution. Authored-by: Ali Afroozeh <ali.afroozeh@databricks.com> Signed-off-by: herman <herman@databricks.com>
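A hedged sketch of the save-and-restore idea behind `withActive` (an illustration of the behavior described above, not the actual Spark implementation; the helper name here is invented):

```scala
import org.apache.spark.sql.SparkSession

// Run `block` with `session` as the thread's active session, then restore
// whatever session (if any) was active before.
def withActiveSession[T](session: SparkSession)(block: => T): T = {
  val previous = SparkSession.getActiveSession
  SparkSession.setActiveSession(session)
  try block
  finally previous match {
    case Some(s) => SparkSession.setActiveSession(s)
    case None    => SparkSession.clearActiveSession()
  }
}
```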

view details

push time in 5 days

push event htynkn/dubbo-samples

push time in 5 days

push event htynkn/dubbo-admin

push time in 5 days

push event htynkn/rpc-benchmark

push time in 5 days

push event htynkn/fish-redux

push time in 5 days

push event htynkn/dubbo-samples

Huang Yunkun

commit sha d3703250f9579f7e8f3c137277a179de8795bb53

fix

view details

push time in 6 days

push event htynkn/dubbo-samples

Huang Yunkun

commit sha 095432406ea297e7e9c38c82615c92b47da0bbc1

try to use listener2

view details

push time in 6 days

push event htynkn/dubbo

push time in 6 days

push event htynkn/spring-boot

cbono

commit sha e3516059622a09ea2ff439daffaeb87c649d9667

Verify ssl key alias on server startup See gh-19202

view details

Madhura Bhave

commit sha ac91f14f05ccf7d2198f6e313f03a5c73cc1c296

Polish "Verify ssl key alias on server startup" See gh-19202

view details

Madhura Bhave

commit sha 8bce2704454dfa07fce785ba6e51714c3942b236

Merge pull request #19202 from bono007 * pr/19202: Polish "Verify ssl key alias on server startup" Verify ssl key alias on server startup Closes gh-19202

view details

Madhura Bhave

commit sha 98a657fedbd1f16e14cce727bb354f8cdcc18d4d

Merge branch '2.1.x' into 2.2.x Closes gh-20132

view details

Madhura Bhave

commit sha 8d9b1d29886985b476f9614f8e524867cdb25d76

Merge branch '2.2.x' Closes gh-20133

view details

Madhura Bhave

commit sha 97ac6c9de740a09f4b9e4dfe0770599d78552723

Polish

view details

Madhura Bhave

commit sha 4eb71fc424b14f78bc9b8ba678ef613ba8d3e760

Merge branch '2.1.x' into 2.2.x

view details

Madhura Bhave

commit sha fe325c06c210fe525b7e1bfdb5c8bebb708b634b

Merge branch '2.2.x'

view details

Stephane Nicoll

commit sha d19920ae39a2d2b42ca17bdedc339db1d05b3a8c

Upgrade to Cassandra Driver 4.4.0 Closes gh-20064

view details

dreis2211

commit sha 450ef36c7237ca4d3218c10dfbc25b08f0c443fa

Exclude jcl-over-slf4j also on Reactive Cassandra starter See gh-20141

view details

Stephane Nicoll

commit sha 1b4a3dfa77371fffd8849b3d5a6c33cf29840892

Merge pull request #20141 from dreis2211 * pr/20141: Exclude jcl-over-slf4j also on Reactive Cassandra starter Closes gh-20141

view details

Andy Wilkinson

commit sha 5fed25bb43a337eac367f347ac893f054844fa5e

Upgrade to Spring HATEOAS 1.1.0.M2 Closes gh-20142

view details

Andy Wilkinson

commit sha 2f16898c5f9f30fe4d4a506ee00785a18f567a7c

Upgrade to Spring Data Neumann-M3 Closes gh-20103

view details

dreis2211

commit sha 92b4ba6367a26a852250a24de83452dd7fbcc4e7

Fix structuring your code link in multi-page HTML documentation Closes gh-19953

view details

Andy Wilkinson

commit sha 420af175709de8550d44b8cc7d7e2d05fd7ce056

Merge branch '2.2.x' Closes gh-20148

view details

Brian Clozel

commit sha 97af0b2f3a0a39c0eeebbbf0f7a514db1d35eac3

Add actuator specific ObjectMapper Prior to this commit, Actuator endpoints would use the application ObjectMapper instance for serializing payloads as JSON. This was problematic in several cases: * application-specific configuration would change the actuator endpoint output. * choosing a different JSON mapper implementation in the application would break completely some endpoints. Spring Boot Actuator already has a hard dependency on Jackson, and this commit uses that fact to configure a shared `ObjectMapper` instance that will be used by the Actuator infrastructure consistently, without polluting the application context. This `ObjectMapper` is used in Actuator for: * JMX endpoints * Spring MVC endpoints with an HTTP message converter * Spring WebFlux endpoints with an `Encoder` * Jersey endpoints with a `ContextResolver<ObjectMapper>` For all web endpoints, this configuration is limited to the actuator-specific media types such as `"application/vnd.spring-boot.actuator.v3+json"`. Fixes gh-12951

view details

Scott Frederick

commit sha 191dce3f5e7c6aef43f8b9c122758e2aef34cb9b

Set Spring Boot version in ephemeral builder This commit adds a `createdBy` structure to the metadata of the ephemeral builder container image that identifies Spring Boot as the creator of the image, along with the Spring Boot version. See gh-20126

view details

Scott Frederick

commit sha e294d26458a198f29f8562771c60081b49ddfceb

Set ephemeral builder container creation to a fixed date This commit fixes the `Created` date and time of the ephemeral builder container image at the Windows epoch plus one second (1980-01-01T00:00:01Z). This date matches the created date of the builder image and influences the created date of the resulting image. Using a fixed date for images ensures that the digest is consistent for all images with the same version. Fixes gh-20126

view details

Andy Wilkinson

commit sha f22aeda0cefbc714254b85142c61f9331912a1ee

Upgrade to Spring Kafka 2.4.2.RELEASE Closes gh-20107

view details

Andy Wilkinson

commit sha 9860f9705c84a50a6921da593448b0623c503426

Upgrade to Spring AMQP 2.2.4.RELEASE Closes gh-20105

view details

push time in 6 days

push event htynkn/dubbo-admin

push time in 6 days

push event htynkn/spark

herman

commit sha b25359cca3190f6a34dce3c3e49c4d2a80e88bdc

[SPARK-30780][SQL] Empty LocalTableScan should use RDD without partitions ### What changes were proposed in this pull request? This is a small follow-up for https://github.com/apache/spark/pull/27400. This PR makes an empty `LocalTableScanExec` return an `RDD` without partitions. ### Why are the changes needed? It is a bit unexpected that the RDD contains partitions if there is no work to do. It also can save a bit of work when this is used in a more complex plan. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Added test to `SparkPlanSuite`. Closes #27530 from hvanhovell/SPARK-30780. Authored-by: herman <herman@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>
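Assuming an empty DataFrame plans to a `LocalTableScanExec`, the change can be observed from the partition count of the underlying RDD (a sketch; the exact counts are an assumption based on the description above):

```scala
// Before the change an empty local scan still carried one empty partition;
// after it the RDD should report zero partitions.
val empty = spark.emptyDataFrame
println(empty.rdd.getNumPartitions)
```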

view details

HyukjinKwon

commit sha aa6a60530e63ab3bb8b117f8738973d1b26a2cc7

[SPARK-30722][PYTHON][DOCS] Update documentation for Pandas UDF with Python type hints ### What changes were proposed in this pull request? This PR targets to document the Pandas UDF redesign with type hints introduced at SPARK-28264. Mostly self-describing; however, there are few things to note for reviewers. 1. This PR replace the existing documentation of pandas UDFs to the newer redesign to promote the Python type hints. I added some words that Spark 3.0 still keeps the compatibility though. 2. This PR proposes to name non-pandas UDFs as "Pandas Function API" 3. SCALAR_ITER become two separate sections to reduce confusion: - `Iterator[pd.Series]` -> `Iterator[pd.Series]` - `Iterator[Tuple[pd.Series, ...]]` -> `Iterator[pd.Series]` 4. I removed some examples that look overkill to me. 5. I also removed some information in the doc, that seems duplicating or too much. ### Why are the changes needed? To document new redesign in pandas UDF. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing tests should cover. Closes #27466 from HyukjinKwon/SPARK-30722. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Kris Mok

commit sha b4769998efee0f5998104b689b710c11ee0dbd14

[SPARK-30795][SQL] Spark SQL codegen's code() interpolator should treat escapes like Scala's StringContext.s() ### What changes were proposed in this pull request? This PR proposes to make the `code` string interpolator treat escapes the same way as Scala's builtin `StringContext.s()` string interpolator. This will remove the need for an ugly workaround in `Like` expression's codegen. ### Why are the changes needed? The `code()` string interpolator in Spark SQL's code generator should treat escapes like Scala's builtin `StringContext.s()` interpolator, i.e. it should treat escapes in the code parts, and should not treat escapes in the input arguments. For example, ```scala val arg = "This is an argument." val str = s"This is string part 1. $arg This is string part 2." val code = code"This is string part 1. $arg This is string part 2." assert(code.toString == str) ``` We should expect the `code()` interpolator to produce the same result as the `StringContext.s()` interpolator, where only escapes in the string parts should be treated, while the args should be kept verbatim. But in the current implementation, due to the eager folding of code parts and literal input args, the escape treatment is incorrectly done on both code parts and literal args. That causes a problem when an arg contains escape sequences and wants to preserve that in the final produced code string. For example, in `Like` expression's codegen, there's an ugly workaround for this bug: ```scala // We need double escape to avoid org.codehaus.commons.compiler.CompileException. // '\\' will cause exception 'Single quote must be backslash-escaped in character literal'. // '\"' will cause exception 'Line break in literal not allowed'. val newEscapeChar = if (escapeChar == '\"' || escapeChar == '\\') { s"""\\\\\\$escapeChar""" } else { escapeChar } ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Added a new unit test case in `CodeBlockSuite`. Closes #27544 from rednaxelafx/fix-code-string-interpolator. Authored-by: Kris Mok <kris.mok@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

beliefer

commit sha f5026b1ba7c05548d5f271d6d3edf7dfd4c3f9ce

[SPARK-30763][SQL] Fix java.lang.IndexOutOfBoundsException No group 1 for regexp_extract ### What changes were proposed in this pull request? The current implementation of `regexp_extract` throws an unhandled exception, shown below: `SELECT regexp_extract('1a 2b 14m', 'd+')` ``` java.lang.IndexOutOfBoundsException: No group 1 [info] at java.util.regex.Matcher.group(Matcher.java:538) [info] at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) [info] at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) [info] at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) ``` This exception should be handled properly. ### Why are the changes needed? Fix a bug: `java.lang.IndexOutOfBoundsException: No group 1` ### Does this PR introduce any user-facing change? Yes ### How was this patch tested? New UT Closes #27508 from beliefer/fix-regexp_extract-bug. Authored-by: beliefer <beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
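A hedged sketch of the kind of guard this suggests (the helper name and message are assumptions, not necessarily the exact Spark code): validate the requested group index against the pattern's group count before calling `Matcher.group`, so the user gets a meaningful error instead of a raw `IndexOutOfBoundsException`.

```scala
import java.util.regex.Pattern

// Illustrative guard: fail fast with a clear message when the requested
// capturing group does not exist in the pattern.
def checkGroupIndex(pattern: Pattern, groupIndex: Int): Unit = {
  val groupCount = pattern.matcher("").groupCount()
  if (groupIndex < 0 || groupIndex > groupCount) {
    throw new IllegalArgumentException(
      s"Regex group count is $groupCount, but the specified group index is $groupIndex")
  }
}

// The pattern 'd+' has zero capturing groups, so asking for group 1
// should be rejected up front rather than failing inside Matcher.group.
checkGroupIndex(Pattern.compile("d+"), 1)
```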

view details

turbofei

commit sha 8b1839728acaa5e61f542a7332505289726d3162

[SPARK-29542][FOLLOW-UP] Keep the description of spark.sql.files.* in the tuning guide consistent with that in SQLConf ### What changes were proposed in this pull request? This PR is a follow-up of https://github.com/apache/spark/pull/26200. It modifies the description of spark.sql.files.* in sql-performance-tuning.md to keep it consistent with that in SQLConf. ### Why are the changes needed? To keep the tuning guide consistent with the description in SQLConf. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing UTs. Closes #27545 from turboFei/SPARK-29542-follow-up. Authored-by: turbofei <fwang12@ebay.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Maxim Gekk

commit sha c1986204e59f1e8cc4b611d5a578cb248cb74c28

[SPARK-30788][SQL] Support `SimpleDateFormat` and `FastDateFormat` as legacy date/timestamp formatters ### What changes were proposed in this pull request? In the PR, I propose to add legacy date/timestamp formatters based on `SimpleDateFormat` and `FastDateFormat`: - `LegacyFastTimestampFormatter` - uses `FastDateFormat` and supports parsing/formatting in microsecond precision. The code was borrowed from Spark 2.4, see https://github.com/apache/spark/pull/26507 & https://github.com/apache/spark/pull/26582 - `LegacySimpleTimestampFormatter` uses `SimpleDateFormat`, and support the `lenient` mode. When the `lenient` parameter is set to `false`, the parser become much stronger in checking its input. ### Why are the changes needed? Spark 2.4.x uses the following parsers for parsing/formatting date/timestamp strings: - `DateTimeFormat` in CSV/JSON datasource - `SimpleDateFormat` - is used in JDBC datasource, in partitions parsing. - `SimpleDateFormat` in strong mode (`lenient = false`), see https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L124. It is used by the `date_format`, `from_unixtime`, `unix_timestamp` and `to_unix_timestamp` functions. The PR aims to make Spark 3.0 compatible with Spark 2.4.x in all those cases when `spark.sql.legacy.timeParser.enabled` is set to `true`. ### Does this PR introduce any user-facing change? This shouldn't change behavior with default settings. If `spark.sql.legacy.timeParser.enabled` is set to `true`, users should observe behavior of Spark 2.4. ### How was this patch tested? - Modified tests in `DateExpressionsSuite` to check the legacy parser - `SimpleDateFormat`. - Added `CSVLegacyTimeParserSuite` and `JsonLegacyTimeParserSuite` to run `CSVSuite` and `JsonSuite` with the legacy parser - `FastDateFormat`. Closes #27524 from MaxGekk/timestamp-formatter-legacy-fallback. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
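As the description notes, the legacy formatters are gated behind a flag; a minimal usage sketch with the property name quoted above:

```scala
// Opt back into Spark 2.4-style parsing (FastDateFormat for CSV/JSON,
// SimpleDateFormat for date/time SQL functions such as from_unixtime).
spark.conf.set("spark.sql.legacy.timeParser.enabled", "true")
```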

view details

Maxim Gekk

commit sha 61b1e608f07afd965028313c13bf89c19b006312

[SPARK-30759][SQL][TESTS][FOLLOWUP] Check cache initialization in StringRegexExpression ### What changes were proposed in this pull request? Added new test to `RegexpExpressionsSuite` which checks that `cache` of compiled pattern is set when the `right` expression (pattern in `LIKE`) is a foldable expression. ### Why are the changes needed? To be sure that `cache` in `StringRegexExpression` is initialized for foldable patterns. ### Does this PR introduce any user-facing change? No ### How was this patch tested? By running the added test in `RegexpExpressionsSuite`. Closes #27547 from MaxGekk/regexp-cache-test. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Eric Wu

commit sha 5919bd3b8d3ef3c3e957d8e3e245e00383b979bf

[SPARK-30651][SQL] Add detailed information for Aggregate operators in EXPLAIN FORMATTED ### What changes were proposed in this pull request? Currently `EXPLAIN FORMATTED` only report input attributes of HashAggregate/ObjectHashAggregate/SortAggregate, while `EXPLAIN EXTENDED` provides more information of Keys, Functions, etc. This PR enhanced `EXPLAIN FORMATTED` to sync with original explain behavior. ### Why are the changes needed? The newly added `EXPLAIN FORMATTED` got less information comparing to the original `EXPLAIN EXTENDED` ### Does this PR introduce any user-facing change? Yes, taking HashAggregate explain result as example. **SQL** ``` EXPLAIN FORMATTED SELECT COUNT(val) + SUM(key) as TOTAL, COUNT(key) FILTER (WHERE val > 1) FROM explain_temp1; ``` **EXPLAIN EXTENDED** ``` == Physical Plan == *(2) HashAggregate(keys=[], functions=[count(val#6), sum(cast(key#5 as bigint)), count(key#5)], output=[TOTAL#62L, count(key) FILTER (WHERE (val > 1))#71L]) +- Exchange SinglePartition, true, [id=#89] +- HashAggregate(keys=[], functions=[partial_count(val#6), partial_sum(cast(key#5 as bigint)), partial_count(key#5) FILTER (WHERE (val#6 > 1))], output=[count#75L, sum#76L, count#77L]) +- *(1) ColumnarToRow +- FileScan parquet default.explain_temp1[key#5,val#6] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/XXX/spark-dev/spark/spark-warehouse/explain_temp1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<key:int,val:int> ``` **EXPLAIN FORMATTED - BEFORE** ``` == Physical Plan == * HashAggregate (5) +- Exchange (4) +- HashAggregate (3) +- * ColumnarToRow (2) +- Scan parquet default.explain_temp1 (1) ... ... (5) HashAggregate [codegen id : 2] Input: [count#91L, sum#92L, count#93L] ... ... ``` **EXPLAIN FORMATTED - AFTER** ``` == Physical Plan == * HashAggregate (5) +- Exchange (4) +- HashAggregate (3) +- * ColumnarToRow (2) +- Scan parquet default.explain_temp1 (1) ... ... (5) HashAggregate [codegen id : 2] Input: [count#91L, sum#92L, count#93L] Keys: [] Functions: [count(val#6), sum(cast(key#5 as bigint)), count(key#5)] Results: [(count(val#6)#84L + sum(cast(key#5 as bigint))#85L) AS TOTAL#78L, count(key#5)#86L AS count(key) FILTER (WHERE (val > 1))#87L] Output: [TOTAL#78L, count(key) FILTER (WHERE (val > 1))#87L] ... ... ``` ### How was this patch tested? Three tests added in explain.sql for HashAggregate/ObjectHashAggregate/SortAggregate. Closes #27368 from Eric5553/ExplainFormattedAgg. Authored-by: Eric Wu <492960551@qq.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha aa0d13683cdf9f38f04cc0e73dc8cf63eed29bf4

[SPARK-30760][SQL] Port `millisToDays` and `daysToMillis` on Java 8 time API

### What changes were proposed in this pull request?
In the PR, I propose to rewrite the `millisToDays` and `daysToMillis` of `DateTimeUtils` using the Java 8 time API. I removed `getOffsetFromLocalMillis` from `DateTimeUtils` because it is a private method and is not used anymore in Spark SQL.

### Why are the changes needed?
The new implementation is based on the Proleptic Gregorian calendar which is already used by other date-time functions. This change makes `millisToDays` and `daysToMillis` consistent with the rest of the Spark SQL API related to date & time operations.

### Does this PR introduce any user-facing change?
Yes, this might affect behavior for old dates before the year 1582.

### How was this patch tested?
By existing test suites `DateTimeUtilsSuite`, `DateFunctionsSuite`, `DateExpressionsSuite`, `SQLQuerySuite` and `HiveResultSuite`.

Closes #27494 from MaxGekk/millis-2-days-java8-api.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
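For context, here is a minimal Java sketch of a millis-to-days round trip on the Java 8 time API, i.e. on the Proleptic Gregorian calendar. This is not the actual `DateTimeUtils` code; the time zone and timestamp are arbitrary assumptions for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class MillisDaysDemo {
    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("UTC");
        long millis = 1_581_465_600_000L; // 2020-02-12T00:00:00Z

        // millis -> days since the epoch, interpreted in the given time zone
        LocalDate date = Instant.ofEpochMilli(millis).atZone(zone).toLocalDate();
        long days = date.toEpochDay();

        // days -> millis of the local midnight in the same zone
        long backToMillis = LocalDate.ofEpochDay(days)
                .atStartOfDay(zone)
                .toInstant()
                .toEpochMilli();

        System.out.println(days);         // 18304
        System.out.println(backToMillis); // 1581465600000
    }
}
```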

view details

Liang-Chi Hsieh

commit sha 5b76367a9d0aaca53ce96ab7e555a596567e8335

[SPARK-30797][SQL] Set traditional user/group/other permission to ACL entries when setting up ACLs in truncate table

### What changes were proposed in this pull request?
This is a follow-up to the PR #26956. In #26956, the patch proposed to preserve path permission when truncating a table. When setting up the original ACLs, we need to set the user/group/other permissions as ACL entries too, otherwise if the path doesn't have default user/group/other ACL entries, the ACL API will complain with the error `Invalid ACL: the user, group and other entries are required.`.

In short, this change makes sure:
1. Permissions for user/group/other are always kept in the ACLs to work with the ACL API.
2. Other custom ACLs are still kept after TRUNCATE TABLE (#26956 did this).

### Why are the changes needed?
Without this fix, `TRUNCATE TABLE` will get an error when setting up ACLs if there are no default user/group/other ACL entries.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Updated unit test. Manual test on a dev Spark cluster. Set ACLs for a table path without default user/group/other ACL entries:
```
hdfs dfs -setfacl --set 'user:liangchi:rwx,user::rwx,group::r--,other::r--' /user/hive/warehouse/test.db/test_truncate_table
hdfs dfs -getfacl /user/hive/warehouse/test.db/test_truncate_table

# file: /user/hive/warehouse/test.db/test_truncate_table
# owner: liangchi
# group: supergroup
user::rwx
user:liangchi:rwx
group::r--
mask::rwx
other::r--
```
Then run `sql("truncate table test.test_truncate_table")`; it works by normally truncating the table and preserving the ACLs.

Closes #27548 from viirya/fix-truncate-table-permission.

Lead-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
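A minimal Hadoop-API sketch of the ACL rule the fix works around: `FileSystem.setAcl` rejects a spec that lacks the user/group/other entries, so they must be included alongside any named-user ACLs. This is not Spark's code; the path and user name are taken from the example above purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

public class SetAclDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path table = new Path("/user/hive/warehouse/test.db/test_truncate_table");

        List<AclEntry> acls = Arrays.asList(
            // Named-user entry we want to preserve across TRUNCATE TABLE.
            new AclEntry.Builder().setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.USER).setName("liangchi")
                .setPermission(FsAction.ALL).build(),
            // The three "traditional" entries; omitting them triggers
            // "Invalid ACL: the user, group and other entries are required."
            new AclEntry.Builder().setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.USER).setPermission(FsAction.ALL).build(),
            new AclEntry.Builder().setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.GROUP).setPermission(FsAction.READ).build(),
            new AclEntry.Builder().setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.OTHER).setPermission(FsAction.READ).build());

        fs.setAcl(table, acls);
    }
}
```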

view details

Thomas Graves

commit sha 496f6ac86001d284cbfb7488a63dd3a168919c0f

[SPARK-29148][CORE] Add stage level scheduling dynamic allocation and scheduler backend changes

### What changes were proposed in this pull request?
This is another PR for stage level scheduling. In particular, this adds changes to the dynamic allocation manager and the scheduler backend to be able to track which executors are needed per ResourceProfile. Note the API is still private to Spark until the entire feature gets in, so this functionality will be there but only usable by tests for profiles other than the DefaultProfile.

The main changes here are simply tracking things on a ResourceProfile basis as well as sending the executor requests to the scheduler backend for all ResourceProfiles.

I introduce a ResourceProfileManager in this PR that will track all the actual ResourceProfile objects so that we can keep them all in a single place, and just pass around and use the resource profile id in data structures. The resource profile id can be used with the ResourceProfileManager to get the actual ResourceProfile contents.

There are various places in the code that use executor "slots" for things. The ResourceProfile adds functionality to keep that calculation in it. This logic is more complex than it should be due to standalone mode and Mesos coarse-grained mode not setting the executor cores config. They default to all cores on the worker, so calculating slots is harder there. This PR keeps the functionality to make the cores the limiting resource because the scheduler still uses that for "slots" for a few things.

This PR also adds the resource profile id to the Stage and StageInfo classes to be able to test things more easily. That full set of changes will come with the scheduler PR that will follow this one.

The PR stops at the scheduler backend pieces for the cluster manager, and the real YARN support hasn't been added in this PR; that again will be in a separate PR, so this has a few of the API changes up to the cluster manager and then just uses the default profile requests to continue.

The code for the entire feature is here for reference: https://github.com/apache/spark/pull/27053/files although it needs to be upmerged again as well.

### Why are the changes needed?
Needed for the stage level scheduling feature.

### Does this PR introduce any user-facing change?
No user-facing API changes added here.

### How was this patch tested?
Lots of unit tests and manual testing. I tested on YARN, k8s, standalone, and local modes. Ran both failure and success cases.

Closes #27313 from tgravescs/SPARK-29148.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>

view details

push time in 6 days

push eventhtynkn/dubbo-samples

push time in 6 days

push eventhtynkn/fish-redux

push time in 6 days

push eventhtynkn/rpc-benchmark

push time in 6 days

push eventhtynkn/dubbo

Mercy Ma

commit sha b99b3837186b7fb7f68ae106b67cc42deeed18b9

[Optimization] To remove EnableDubboConfigBinding and EnableDubboConfigBindings (#5730) * Polish /apache/dubbo#5721 : [Enhancement] Setting the default IDs for Dubbo's Config Beans * Polish /apache/dubbo#5729 : [Optimization] To remove EnableDubboConfigBinding and EnableDubboConfigBindings

view details

push time in 7 days

push eventhtynkn/spring-boot

Madhura Bhave

commit sha d485708f6814df433aae6b72a97a0fbaa1a60513

Fix 404 when composite contributor is added to a group Fixes gh-19974

view details

Madhura Bhave

commit sha 19b7dc8e4f7381f7b7f532454c809cd99b0c6bc6

Merge branch '2.2.x' Closes gh-20114

view details

Stephane Nicoll

commit sha 85eb279b30f9f6d99f5eaf83bdcd3b4bec22efdc

Reintroduce "Add Gradle Wrapper Validation GitHub Action" Closes gh-19762

view details

Jorge Cordoba

commit sha 547fc30eadef7fc7d505932ad066c06aafdfa032

Fix condition source in OnBeanCondition See gh-19948

view details

Stephane Nicoll

commit sha 66809c6c1e2c36e12981419c2f26aa493a45e658

Polish "Fix condition source in OnBeanCondition" See gh-19948

view details

Stephane Nicoll

commit sha b4d118e0411375a1ef9caca5d3813dffda70edd3

Merge pull request #19948 from jcordoba95 * pr/19948: Polish "Fix condition source in OnBeanCondition" Fix condition source in OnBeanCondition Closes gh-19948

view details

Stephane Nicoll

commit sha 32bd845a7d2502ecede74080f776439c12632030

Merge branch '2.2.x' Closes gh-20116

view details

Ruslan Stelmachenko

commit sha 5f7e1ac4f2c4268da79a000bac88f8cfc40304e4

Remove unnecessary leading slash in changelog locations See gh-19926

view details

Stephane Nicoll

commit sha ec14e82312cc0681f965f9c9438e7e0024e868ea

Polish "Remove unnecessary leading slash in changelog locations" See gh-19926

view details

Stephane Nicoll

commit sha 4c0c0aeb42bf0d86b49d32f7ffaf301aa1fc8572

Merge pull request #19926 from xak2000 * pr/19926: Polish "Remove unnecessary leading slash in changelog locations" Remove unnecessary leading slash in changelog locations Closes gh-19926

view details

Stephane Nicoll

commit sha a425cc1b466cca6dc61daa0d2119f254e3291ba1

Merge branch '2.1.x' into 2.2.x Closes gh-20117

view details

Stephane Nicoll

commit sha 466c1ba251e7b6f24378f1df9fc7eda3bf79989c

Merge branch '2.2.x' Closes gh-20118

view details

Stephane Nicoll

commit sha 25e87620d3527fbda30bfe89447e52883c2b6244

Fix broken smoke test See gh-19926

view details

Stephane Nicoll

commit sha 1e435372125d33f65a52b5885cd345a024ffb870

Fix broken smoke test See gh-19926

view details

Stephane Nicoll

commit sha 1306c9b77eab8b3330fdaeb0e92a9a69a4fb17e0

Merge branch '2.2.x'

view details

Stephane Nicoll

commit sha 2a90c3dea4ff47ecc2bdeab01e1118fa988d75a2

Fix broken smoke test See gh-19926

view details

Stephane Nicoll

commit sha 5f584101c62023dafab0e9590b5d84f7fc3b5a53

Merge branch '2.1.x' into 2.2.x

view details

Stephane Nicoll

commit sha 765b2178d17d6abe1dd67c6e786ec80297ee8b75

Document spring-boot.run.arguments behaviour with multiple arguments Closes gh-19998

view details

Stephane Nicoll

commit sha 322914218845f49f615627deb71a6a6f7978bc8e

Merge branch '2.2.x' Closes gh-20121

view details

dreis2211

commit sha cfc16c2589bca98dddc5e1e18c1c88d64e960c27

Remove redundant jar task configuration See gh-20113

view details

push time in 7 days

push eventhtynkn/spark

Bryan Cutler

commit sha 07a9885f2792be1353f4a923d649e90bc431cb38

[SPARK-30777][PYTHON][TESTS] Fix test failures for Pandas >= 1.0.0 ### What changes were proposed in this pull request? Fix PySpark test failures for using Pandas >= 1.0.0. ### Why are the changes needed? Pandas 1.0.0 has recently been released and has API changes that result in PySpark test failures, this PR fixes the broken tests. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Manually tested with Pandas 1.0.1 and PyArrow 0.16.0 Closes #27529 from BryanCutler/pandas-fix-tests-1.0-SPARK-30777. Authored-by: Bryan Cutler <cutlerb@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

HyukjinKwon

commit sha 2bc765a831d7f15c7971d41c36cfbec1fd898dfd

[SPARK-30756][SQL] Fix `ThriftServerWithSparkContextSuite` on spark-branch-3.0-test-sbt-hadoop-2.7-hive-2.3

### What changes were proposed in this pull request?
This PR tries the approach from #26710 (comment) to fix the test.

### Why are the changes needed?
To make the tests pass.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Jenkins will test it first, and then `spark-branch-3.0-test-sbt-hadoop-2.7-hive-2.3` will test it out.

Closes #27513 from HyukjinKwon/test-SPARK-30756.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 8efe367a4ee862b8a85aee8881b0335b34cbba70)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

HyukjinKwon

commit sha 0045be766b949dff23ed72bd559568f17f645ffe

[SPARK-29462][SQL] The data type of "array()" should be array<null>

### What changes were proposed in this pull request?
This brings https://github.com/apache/spark/pull/26324 back. It was reverted basically because of Hive compatibility and the lack of investigation into other DBMSes and ANSI.

- PostgreSQL seems to coerce a NULL literal to the TEXT type.
- Presto seems to coerce `array() + array(1)` -> array of int.
- Hive seems to coerce `array() + array(1)` -> array of strings.

Given that, the design choices have been made differently for some reasons. If we pick one of both, coercing to array of int seems to make much more sense. Another investigation was made offline internally. ANSI SQL 2011, section 6.5 "<contextually typed value specification>" states:

> If ES is specified, then let ET be the element type determined by the context in which ES appears. The declared type DT of ES is Case:
>
> a) If ES simply contains ARRAY, then ET ARRAY[0].
>
> b) If ES simply contains MULTISET, then ET MULTISET.
>
> ES is effectively replaced by CAST ( ES AS DT )

From reading other related context, doing it to `NullType`. Given the investigation made, choosing `null` seems correct, and we have a reference, Presto, now. Therefore, this PR proposes to bring it back.

### Why are the changes needed?
When an empty array is created, it should be declared as array<null>.

### Does this PR introduce any user-facing change?
Yes, `array()` creates `array<null>`. Now `array(1) + array()` can correctly create `array(1)` instead of `array("1")`.

### How was this patch tested?
Tested manually

Closes #27521 from HyukjinKwon/SPARK-29462.

Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Aman Omer <amanomer1996@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
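A hedged Java sketch of how the new typing can be observed from the DataFrame API (the column aliases are illustrative, and this assumes a Spark build that includes this change):

```java
import static org.apache.spark.sql.functions.array;
import static org.apache.spark.sql.functions.lit;

import org.apache.spark.sql.SparkSession;

public class EmptyArrayTypeDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("empty-array-type-demo")
                .master("local[*]")
                .getOrCreate();

        spark.range(1)
                .select(array().alias("empty_arr"),      // array<null> after this change
                        array(lit(1)).alias("int_arr"))  // array<int>
                .printSchema();

        spark.stop();
    }
}
```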

view details

root1

commit sha b20754d9ee033091e2ef4d5bfa2576f946c9df50

[SPARK-27545][SQL][DOC] Update the Documentation for CACHE TABLE and UNCACHE TABLE ### What changes were proposed in this pull request? Document updated for `CACHE TABLE` & `UNCACHE TABLE` ### Why are the changes needed? Cache table creates a temp view while caching data using `CACHE TABLE name AS query`. `UNCACHE TABLE` does not remove this temp view. These things were not mentioned in the existing doc for `CACHE TABLE` & `UNCACHE TABLE`. ### Does this PR introduce any user-facing change? Document updated for `CACHE TABLE` & `UNCACHE TABLE` command. ### How was this patch tested? Manually Closes #27090 from iRakson/SPARK-27545. Lead-authored-by: root1 <raksonrakesh@gmail.com> Co-authored-by: iRakson <raksonrakesh@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

fuwhu

commit sha f1d0dce4848a53831268c80bf7e1e0f47a1f7612

[MINOR][DOC] Add class document for PruneFileSourcePartitions and PruneHiveTablePartitions ### What changes were proposed in this pull request? Add class document for PruneFileSourcePartitions and PruneHiveTablePartitions. ### Why are the changes needed? To describe these two classes. ### Does this PR introduce any user-facing change? no ### How was this patch tested? no Closes #27535 from fuwhu/SPARK-15616-FOLLOW-UP. Authored-by: fuwhu <bestwwg@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Maxim Gekk

commit sha dc66d57e981ac5108e097d4298fa467f0843ffcf

[SPARK-30754][SQL] Reuse results of floorDiv in calculations of floorMod in DateTimeUtils

### What changes were proposed in this pull request?
In the case of back-to-back calculation of `floorDiv` and `floorMod` with the same arguments, the result of `floorDiv` can be reused in the calculation of `floorMod`. The `floorMod` method is defined as follows in the Java standard library:
```java
public static int floorMod(int x, int y) {
    int r = x - floorDiv(x, y) * y;
    return r;
}
```
If `floorDiv(x, y)` has already been calculated, it can be reused in `x - floorDiv(x, y) * y`.

I propose to modify 2 places in `DateTimeUtils`:
1. `microsToInstant`, which is widely used in many date-time functions. `Math.floorMod(us, MICROS_PER_SECOND)` is just replaced by its definition from the Java Math library.
2. `truncDate`: `Math.floorMod(oldYear, divider) == 0` is replaced by `Math.floorDiv(oldYear, divider) * divider == oldYear` where `floorDiv(...) * divider` is pre-calculated.

### Why are the changes needed?
This reduces the number of arithmetic operations and can slightly improve the performance of date-time functions.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
By existing test suites `DateTimeUtilsSuite`, `DateFunctionsSuite` and `DateExpressionsSuite`.

Closes #27491 from MaxGekk/opt-microsToInstant.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
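A small standalone Java sketch of the micro-optimization described above (illustrative only, not the actual `DateTimeUtils` code; `MICROS_PER_SECOND` and the sample value are assumptions): when both the quotient and the remainder of the same division are needed, compute `floorDiv` once and derive the remainder from it.

```java
public class FloorDivReuseDemo {
    private static final long MICROS_PER_SECOND = 1_000_000L;

    public static void main(String[] args) {
        long us = -1_234_567L; // microseconds since the epoch (example value)

        // Two calls: floorMod internally repeats the floorDiv work.
        long secs1 = Math.floorDiv(us, MICROS_PER_SECOND);
        long micros1 = Math.floorMod(us, MICROS_PER_SECOND);

        // One call: reuse the quotient to get the remainder.
        long secs2 = Math.floorDiv(us, MICROS_PER_SECOND);
        long micros2 = us - secs2 * MICROS_PER_SECOND;

        System.out.println(secs1 == secs2 && micros1 == micros2); // true
    }
}
```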

view details

Yin Huai

commit sha ea626b6acf0de0ff3b0678372f30ba6f84ae2b09

[SPARK-30783] Exclude hive-service-rpc ### What changes were proposed in this pull request? Exclude hive-service-rpc from build. ### Why are the changes needed? hive-service-rpc 2.3.6 and spark sql's thrift server module have duplicate classes. Leaving hive-service-rpc 2.3.6 in the class path means that spark can pick up classes defined in hive instead of its thrift server module, which can cause hard to debug runtime errors due to class loading order and compilation errors for applications depend on spark. If you compare hive-service-rpc 2.3.6's jar (https://search.maven.org/remotecontent?filepath=org/apache/hive/hive-service-rpc/2.3.6/hive-service-rpc-2.3.6.jar) and spark thrift server's jar (e.g. https://repository.apache.org/content/groups/snapshots/org/apache/spark/spark-hive-thriftserver_2.12/3.0.0-SNAPSHOT/spark-hive-thriftserver_2.12-3.0.0-20200207.021914-364.jar), you will see that all of classes provided by hive-service-rpc-2.3.6.jar are covered by spark thrift server's jar. https://issues.apache.org/jira/browse/SPARK-30783 has output of jar tf for both jars. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing tests. Closes #27533 from yhuai/SPARK-30783. Authored-by: Yin Huai <yhuai@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

HyukjinKwon

commit sha 99bd59fe29a87bb70485db536b0ae676e7a9d42e

[SPARK-29462][SQL][DOCS] Add some more context and details in 'spark.sql.defaultUrlStreamHandlerFactory.enabled' documentation ### What changes were proposed in this pull request? This PR adds some more information and context to `spark.sql.defaultUrlStreamHandlerFactory.enabled`. ### Why are the changes needed? It is a bit difficult to understand the documentation of `spark.sql.defaultUrlStreamHandlerFactory.enabled`. ### Does this PR introduce any user-facing change? Nope, internal doc only fix. ### How was this patch tested? Nope. I only tested linter. Closes #27541 from HyukjinKwon/SPARK-29462-followup. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

Maxim Gekk

commit sha 45db48e2d29359591a4ebc3db4625dd2158e446e

Revert "[SPARK-30625][SQL] Support `escape` as third parameter of the `like` function ### What changes were proposed in this pull request? In the PR, I propose to revert the commit 8aebc80e0e67bcb1aa300b8c8b1a209159237632. ### Why are the changes needed? See the concerns https://github.com/apache/spark/pull/27355#issuecomment-584344438 ### Does this PR introduce any user-facing change? No ### How was this patch tested? By existing test suites. Closes #27531 from MaxGekk/revert-like-3-args. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

push time in 7 days

push eventhtynkn/dubbo-admin

push time in 7 days

push eventhtynkn/dubbo-samples

push time in 7 days

push eventhtynkn/fish-redux

push time in 7 days

push eventhtynkn/rpc-benchmark

push time in 7 days

push eventhtynkn/dubbo

qinliujie

commit sha c9f12ca17770d7035f1f9bf0a27c18fd6c4dce5a

The MessageOnlyChannelHandler class should be implemented consistently with the ExecutionChannelHandler class. (#5724) fix #5256

view details

Kleber Tarcísio

commit sha cd509a3b0c3f1016585ada3fc7c8516e50170c33

refactoring to remove feature envy. (#5559)

view details

Mercy Ma

commit sha 60e72bf5c91657dfae8106979cd00106e4764bab

Polish /apache/dubbo#5721 : [Enhancement] Setting the default IDs for Dubbo's Config Beans (#5725)

view details

push time in 8 days

push eventhtynkn/spark

Liang-Chi Hsieh

commit sha 9f8172e96a8ee60cd42545778c01d98b6902161f

Revert "[SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project This reverts commit a0e63b61e7c5d55ae2a9213b95ab1e87ac7c203c. ### What changes were proposed in this pull request? This reverts the patch at #26978 based on gatorsmile's suggestion. ### Why are the changes needed? Original patch #26978 has not considered a corner case. We may need to put more time on ensuring we can cover all cases. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Unit test. Closes #27504 from viirya/revert-SPARK-29721. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Xiao Li <gatorsmile@gmail.com>

view details

Kent Yao

commit sha 58b9ca1e6f7768b23e752dabc30468c06d0e1c57

[SPARK-30592][SQL][FOLLOWUP] Add some round-trip test cases ### What changes were proposed in this pull request? Add round-trip tests for CSV and JSON functions as https://github.com/apache/spark/pull/27317#discussion_r376745135 asked. ### Why are the changes needed? improve test coverage ### Does this PR introduce any user-facing change? no ### How was this patch tested? add uts Closes #27510 from yaooqinn/SPARK-30592-F. Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Terry Kim

commit sha 70e545a94d47afb2848c24e81c908d28d41016da

[SPARK-30757][SQL][DOC] Update the doc on TableCatalog.alterTable's behavior ### What changes were proposed in this pull request? This PR updates the documentation on `TableCatalog.alterTable`s behavior on the order by which the requested changes are applied. It now explicitly mentions that the changes are applied in the order given. ### Why are the changes needed? The current documentation on `TableCatalog.alterTable` doesn't mention which order the requested changes are applied. It will be useful to explicitly document this behavior so that the user can expect the behavior. For example, `REPLACE COLUMNS` needs to delete columns before adding new columns, and if the order is guaranteed by `alterTable`, it's much easier to work with the catalog API. ### Does this PR introduce any user-facing change? Yes, document change. ### How was this patch tested? Not added (doc changes). Closes #27496 from imback82/catalog_table_alter_table. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

jiake

commit sha 5a240603fd920e3cb5d9ef49c31d46df8a630d8c

[SPARK-30719][SQL] Add unit test to verify the log warning printed when intentionally skipping AQE

### What changes were proposed in this pull request?
This is a follow-up to [#27452](https://github.com/apache/spark/pull/27452). Add a unit test to verify whether the log warning is printed when AQE is intentionally skipped.

### Why are the changes needed?
Add unit test

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
adding unit test

Closes #27515 from JkSelf/aqeLoggingWarningTest.

Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Eric Wu

commit sha b2011a295bd78b3693a516e049e90250366b8f52

[SPARK-30326][SQL] Raise exception if analyzer exceed max iterations ### What changes were proposed in this pull request? Enhance RuleExecutor strategy to take different actions when exceeding max iterations. And raise exception if analyzer exceed max iterations. ### Why are the changes needed? Currently, both analyzer and optimizer just log warning message if rule execution exceed max iterations. They should have different behavior. Analyzer should raise exception to indicates the plan is not fixed after max iterations, while optimizer just log warning to keep the current plan. This is more feasible after SPARK-30138 was introduced. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Add test in AnalysisSuite Closes #26977 from Eric5553/EnhanceMaxIterations. Authored-by: Eric Wu <492960551@qq.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

view details

Liang-Chi Hsieh

commit sha acfdb46a60fc06dac0af55951492d74b7073f546

[SPARK-27946][SQL][FOLLOW-UP] Change doc and error message for SHOW CREATE TABLE ### What changes were proposed in this pull request? This is a follow-up for #24938 to tweak error message and migration doc. ### Why are the changes needed? Making user know workaround if SHOW CREATE TABLE doesn't work for some Hive tables. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing unit tests. Closes #27505 from viirya/SPARK-27946-followup. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <liangchi@uber.com>

view details

HyukjinKwon

commit sha 4439b29bd2ac0c3cc4c6ceea825fc797ff0029a3

Revert "[SPARK-30245][SQL] Add cache for Like and RLike when pattern is not static" ### What changes were proposed in this pull request? This reverts commit 8ce7962931680c204e84dd75783b1c943ea9c525. There's variable name conflicts with https://github.com/apache/spark/commit/8aebc80e0e67bcb1aa300b8c8b1a209159237632#diff-39298b470865a4cbc67398a4ea11e767. This can be cleanly ported back to branch-3.0. ### Why are the changes needed? Performance investigation were not made enough and it's not clear if it really beneficial or now. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Jenkins tests. Closes #27514 from HyukjinKwon/revert-cache-PR. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Xiao Li <gatorsmile@gmail.com>

view details

Maxim Gekk

commit sha 3c1c9b48fcca1a714e6c2a3045b512598438d672

[SPARK-30759][SQL] Initialize cache for foldable patterns in StringRegexExpression

### What changes were proposed in this pull request?
In the PR, I propose to fix `cache` initialization in `StringRegexExpression` by changing `case Literal(value: String, StringType)` to `case p: Expression if p.foldable`.

### Why are the changes needed?
Actually, the case doesn't work at all because:
1. Literal values have the type `UTF8String`.
2. It doesn't work for foldable expressions like in the example:
```sql
SELECT '%SystemDrive%\Users\John' _FUNC_ '%SystemDrive%\\Users.*';
```

![Screen Shot 2020-02-08 at 22 45 50](https://user-images.githubusercontent.com/1580697/74091681-0d4a2180-4acb-11ea-8a0d-7e8c65f4214e.png)

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
By the `check outputs of expression examples` test from `SQLQuerySuite`.

Closes #27502 from MaxGekk/str-regexp-foldable-pattern.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
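To illustrate the idea behind the fix in plain Java (this is an illustrative sketch, not Spark's `StringRegexExpression`; the class and method names here are made up): compile the pattern once when it is known to be constant ("foldable"), and fall back to per-call compilation otherwise.

```java
import java.util.regex.Pattern;

public class CachedRegexMatcher {
    private final String regex;
    private final Pattern cached; // non-null only when the pattern is constant

    CachedRegexMatcher(String regex, boolean constantPattern) {
        this.regex = regex;
        // Compile once up front for constant patterns instead of once per match.
        this.cached = constantPattern ? Pattern.compile(regex) : null;
    }

    boolean matches(String input) {
        Pattern p = (cached != null) ? cached : Pattern.compile(regex);
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        CachedRegexMatcher like = new CachedRegexMatcher(".*SystemDrive.*Users.*", true);
        System.out.println(like.matches("%SystemDrive%\\Users\\John")); // true
    }
}
```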

view details

Yuanjian Li

commit sha a6b91d2bf727e175d0e175295001db85647539b1

[SPARK-30556][SQL][FOLLOWUP] Reset the status changed in SQLExecution.withThreadLocalCaptured ### What changes were proposed in this pull request? Follow up for #27267, reset the status changed in SQLExecution.withThreadLocalCaptured. ### Why are the changes needed? For code safety. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing UT. Closes #27516 from xuanyuanking/SPARK-30556-follow. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: herman <herman@databricks.com>

view details

Shixiong Zhu

commit sha e2ebca733ce4366349a5a25fe94a8e31b67d410e

[SPARK-30779][SS] Fix some API issues found when reviewing Structured Streaming API docs ### What changes were proposed in this pull request? - Fix the scope of `Logging.initializeForcefully` so that it doesn't appear in subclasses' public methods. Right now, `sc.initializeForcefully(false, false)` is allowed to called. - Don't show classes under `org.apache.spark.internal` package in API docs. - Add missing `since` annotation. - Fix the scope of `ArrowUtils` to remove it from the API docs. ### Why are the changes needed? Avoid leaking APIs unintentionally in Spark 3.0.0. ### Does this PR introduce any user-facing change? No. All these changes are to avoid leaking APIs unintentionally in Spark 3.0.0. ### How was this patch tested? Manually generated the API docs and verified the above issues have been fixed. Closes #27528 from zsxwing/audit-ss-apis. Authored-by: Shixiong Zhu <zsxwing@gmail.com> Signed-off-by: Xiao Li <gatorsmile@gmail.com>

view details

push time in 8 days

push eventhtynkn/dubbo-admin

push time in 8 days

push eventhtynkn/spring-boot

Stephane Nicoll

commit sha a053d207d636605e076f447e74f9e3cb991e3e9d

Start building against Spring Integration 5.3 M2 snapshots See gh-20104

view details

Stephane Nicoll

commit sha cf06eec1747c984b0cae2d41cda567c91991bc3a

Start building against Spring AMQP 2.2.4 snapshots See gh-20105

view details

Stephane Nicoll

commit sha e8b97dbc75a2fa29e189c0e4ea59a6fa46d1eb41

Start building against Spring Kafka 2.4.2 snapshots See gh-20107

view details

Dmytro Nosan

commit sha 67dd9ad537ccc4ddce061f994541ff3923591c1e

Create HazelCastClient if necessary This commit makes sure to create a HazelcastClient if an instance name is provided in configuration and if no such client already exists. This harmonizes the behaviour with that of the server counterpart. See gh-20109
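A minimal sketch of the described lookup-then-create behavior using the Hazelcast client API (assuming Hazelcast 3.x and a reachable cluster; the instance name is an example, and this is not the actual Spring Boot auto-configuration code):

```java
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class HazelcastClientLookup {
    static HazelcastInstance getOrCreate(ClientConfig config) {
        String name = config.getInstanceName();
        // Reuse an existing client registered under the same instance name, if any.
        HazelcastInstance existing = HazelcastClient.getHazelcastClientByName(name);
        // Otherwise create a new client from the given configuration.
        return existing != null ? existing : HazelcastClient.newHazelcastClient(config);
    }

    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        config.setInstanceName("my-hazelcast-client"); // assumed instance name
        HazelcastInstance client = getOrCreate(config); // requires a running cluster
        System.out.println(client.getName());
    }
}
```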

view details

Stephane Nicoll

commit sha 77bdf992ec3b9dd2ed12e54ed1c9898fac0b686d

Polish "Create HazelCastClient if necessary" See gh-20109

view details

Stephane Nicoll

commit sha 14e54451055ad0c8e1f7f55adee2932c4d37c062

Merge pull request #20109 from nosan * pr/20109: Polish "Create HazelCastClient if necessary" Create HazelCastClient if necessary Closes gh-20109

view details

Stephane Nicoll

commit sha ddcf5966bb5ced47ae594e53ad31fa7b8c7eb4e0

Disable Spring Data Neo4j's open session in view by default Closes gh-20012

view details

dreis2211

commit sha aa0360e1ba427e15c3d26e3ba9efd9fcc54d3323

Fix some deprecation warnings See gh-20108

view details

Stephane Nicoll

commit sha b18db5eea707e33085a0e30f96cde3d3b3744c86

Merge pull request #20108 from dreis2211 * pr/20108: Fix some deprecation warnings Closes gh-20108

view details

Madhura Bhave

commit sha ec42dcd1739317d94359f6a25038ec0cf369fe3d

Fix typo

view details

push time in 8 days

push eventhtynkn/dubbo-samples

push time in 8 days

push eventhtynkn/fish-redux

push time in 8 days

push eventhtynkn/rpc-benchmark

push time in 8 days

push eventhtynkn/spring-boot

Stephane Nicoll

commit sha bde7bd0a1a310f48fb877b9a0d4a05b8d829d6c0

Cleanup cassandra smoke test removal See gh-19588

view details

push time in 9 days

push eventhtynkn/dubbo

Mark

commit sha 6d1afb440768f51caf1a2190a08e182868c5a66f

Fix RpcContext.asyncCall return value when an RpcException is thrown (#5607) Fix #5606

view details

LiosWong

commit sha 5b4816dfa6ae92d0b94af4ed41c07ca6c9d0fcb9

fix: Remove unused variable parameters in AbstractProxyInvoker (#5651)

view details

GungnirLaevatain

commit sha 74e2d6814a77a89ae46539310db3ca2792b6d427

remove duplicate code (#5577)

view details

withthewind

commit sha cb7d9f5ca0249636ea6cef96660633329873e2b3

enhance the java doc of dubbo-common module. (#5578)

view details

withthewind

commit sha f4d97c5ae99f30999a5537758312fe1b03971e67

enhance the java doc of dubbo-common module. (#5575)

view details

ken.lj

commit sha c29bb653e011e38c775bffeaae6cf525b067f274

Fixes the multi-registry subscription load balance strategy not working properly. (#5686) * refactor directory to make cluster and invoker load balance work * add weight property in dubbo.xsd * distinguish usage of getUrl and getConsumerUrl * fix directory related UT * fix ut

view details

withthewind

commit sha cf0af06c9df7bc00fd0b57641b0d1130c34528de

Convert documentation comments in code blocks to multi-line comments (#5573)

view details

withthewind

commit sha c3d57008dd52a2d87cc7d23952f0dbeec7d13f8d

enhance the java doc of dubbo-rpc module. (#5566) fix #3001

view details

weixing1204

commit sha 5a62b55814fcc87e36242036490d29af4cdb2fee

add Sentinel support for RedisRegistry(#5622)

view details

push time in 9 days

push eventhtynkn/spark

Huaxin Gao

commit sha a7ae77a8d83bfbb8de5bb0dc2a8a0485c1486614

[SPARK-30662][ML][PYSPARK] Put back the API changes for HasBlockSize in ALS/MLP ### What changes were proposed in this pull request? Add ```HasBlockSize``` in shared Params in both Scala and Python. Make ALS/MLP extend ```HasBlockSize``` ### Why are the changes needed? Add ```HasBlockSize ``` in ALS, so user can specify the blockSize. Make ```HasBlockSize``` a shared param so both ALS and MLP can use it. ### Does this PR introduce any user-facing change? Yes ```ALS.setBlockSize/getBlockSize``` ```ALSModel.setBlockSize/getBlockSize``` ### How was this patch tested? Manually tested. Also added doctest. Closes #27501 from huaxingao/spark_30662. Authored-by: Huaxin Gao <huaxing@us.ibm.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
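A hedged Java usage sketch of the restored setter (the ratings schema, input path and blockSize value are assumptions, and this requires a Spark build that includes this change):

```java
import org.apache.spark.ml.recommendation.ALS;
import org.apache.spark.ml.recommendation.ALSModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class AlsBlockSizeDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("als-blocksize-demo")
                .master("local[*]")
                .getOrCreate();

        // Hypothetical ratings dataset with userId/movieId/rating columns.
        Dataset<Row> ratings = spark.read().parquet("/tmp/ratings.parquet");

        ALS als = new ALS()
                .setUserCol("userId")
                .setItemCol("movieId")
                .setRatingCol("rating")
                .setBlockSize(2048); // the setter restored by this change

        ALSModel model = als.fit(ratings);
        System.out.println(model.getBlockSize());

        spark.stop();
    }
}
```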

view details

Nicholas Chammas

commit sha 339c0f9a623521acd4d66292b3fe3e6c4ec3b108

[SPARK-30510][SQL][DOCS] Publicly document Spark SQL configuration options ### What changes were proposed in this pull request? This PR adds a doc builder for Spark SQL's configuration options. Here's what the new Spark SQL config docs look like ([configuration.html.zip](https://github.com/apache/spark/files/4172109/configuration.html.zip)): ![Screen Shot 2020-02-07 at 12 13 23 PM](https://user-images.githubusercontent.com/1039369/74050007-425b5480-49a3-11ea-818c-42700c54d1fb.png) Compare this to the [current docs](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql): ![Screen Shot 2020-02-04 at 4 55 10 PM](https://user-images.githubusercontent.com/1039369/73790828-24a5a980-476f-11ea-998c-12cd613883e8.png) ### Why are the changes needed? There is no visibility into the various Spark SQL configs on [the config docs page](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql). ### Does this PR introduce any user-facing change? No, apart from new documentation. ### How was this patch tested? I tested this manually by building the docs and reviewing them in my browser. Closes #27459 from nchammas/SPARK-30510-spark-sql-options. Authored-by: Nicholas Chammas <nicholas.chammas@liveramp.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>

view details

Gengliang Wang

commit sha b877aac14657832d1b896ea57e06b0d0fd15ee01

[SPARK-30684 ][WEBUI][FollowUp] A new approach for SPARK-30684 ### What changes were proposed in this pull request? Simplify the changes for adding metrics description for WholeStageCodegen in https://github.com/apache/spark/pull/27405 ### Why are the changes needed? In https://github.com/apache/spark/pull/27405, the UI changes can be made without using the function `adjustPositionOfOperationName` to adjust the position of operation name and mark as an operation-name class. I suggest we make simpler changes so that it would be easier for future development. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Manual test with the queries provided in https://github.com/apache/spark/pull/27405 ``` sc.parallelize(1 to 10).toDF.sort("value").filter("value > 1").selectExpr("value * 2").show sc.parallelize(1 to 10).toDF.sort("value").filter("value > 1").selectExpr("value * 2").write.format("json").mode("overwrite").save("/tmp/test_output") sc.parallelize(1 to 10).toDF.write.format("json").mode("append").save("/tmp/test_output") ``` ![image](https://user-images.githubusercontent.com/1097932/74073629-e3f09f00-49bf-11ea-90dc-1edb5ca29e5e.png) Closes #27490 from gengliangwang/wholeCodegenUI. Authored-by: Gengliang Wang <gengliang.wang@databricks.com> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>

view details

push time in 9 days

push eventhtynkn/dubbo-admin

push time in 9 days

push eventhtynkn/fish-redux

push time in 9 days

push eventhtynkn/dubbo-samples

push time in 9 days

push eventhtynkn/rpc-benchmark

push time in 9 days

push eventhtynkn/spring-boot

Stephane Nicoll

commit sha aa56a6f6473e46c1e4d8d8341de7da82fde0faff

Add missing mockito dependency Closes gh-20097

view details

Stephane Nicoll

commit sha c5d38e0b5fb06a7d97d3748e179a02967cf20e3c

Upgrade to AppEngine SDK 1.9.78 Closes gh-20060

view details

Stephane Nicoll

commit sha 730683ce26f3db9efdc467277279e5eb67deefa5

Upgrade to Awaitility 4.0.2 Closes gh-20061

view details

Stephane Nicoll

commit sha f883a6cf26bacf039b6158d13697539911029d9c

Upgrade to Byte Buddy 1.10.7 Closes gh-20062

view details

Stephane Nicoll

commit sha f0ac9e13891e074e06c5be9e8afd81a62a04ddb3

Upgrade to Caffeine 2.8.1 Closes gh-20063

view details

Stephane Nicoll

commit sha f8087a6f2979e30ee7e81596da5dd742f245bac9

Upgrade to Couchbase Client 2.7.12 Closes gh-20065

view details

Stephane Nicoll

commit sha ad8fa8f6a0c745aef722a3b2012bdb1af87690c6

Upgrade to Elasticsearch 7.5.2 Closes gh-20066

view details

Stephane Nicoll

commit sha a7249d20d22cf0ced01cace31f4f528ff7c587d1

Upgrade to Flyway 6.2.2 Closes gh-20067

view details

Stephane Nicoll

commit sha 93d34781fcd293841026a7f6b885db4a953f30ae

Upgrade to Groovy 2.5.9 Closes gh-20068

view details

Stephane Nicoll

commit sha 466dd66c8f27b06358124308435597f23c3dad8c

Upgrade to Hazelcast 3.12.6 Closes gh-20069

view details

Stephane Nicoll

commit sha da7dbf085e486acaaec874d2c5625d122f6b5eaf

Upgrade to Hibernate 5.4.10.Final Closes gh-20070

view details

Stephane Nicoll

commit sha f0d2d320c2260176597bce680cf1107cbb3cf38e

Upgrade to Hibernate Validator 6.1.2.Final Closes gh-20071

view details

Stephane Nicoll

commit sha 95c4f1b0c09e7fed99d77b6c9e0c6aa2ec41de59

Upgrade to HikariCP 3.4.2 Closes gh-20072

view details

Stephane Nicoll

commit sha 616a33367f9c0a8ca5317e3c515a39129c372084

Upgrade to HtmlUnit 2.37.0 Closes gh-20073

view details

Stephane Nicoll

commit sha 7996a32129a26febbd4b01ea8aaa77de0e21a4b9

Upgrade to HttpClient 4.5.11 Closes gh-20074

view details

Stephane Nicoll

commit sha e0cd00e0d3e5343fd39616370872de8c91379d4a

Upgrade to HttpCore 4.4.13 Closes gh-20075

view details

Stephane Nicoll

commit sha acbdf0cd2fe5d10bfb36ff71bc9ffda407ea1440

Upgrade to Jetty EL 8.5.49 Closes gh-20076

view details

Stephane Nicoll

commit sha 3e0515da2fd089ce9a95896b500f74fe119a3cb8

Upgrade to Jetty Reactive HTTPClient 1.1.1 Closes gh-20077

view details

Stephane Nicoll

commit sha 2b9934573384a50490d5cca408f73bbc6ca17470

Upgrade to Johnzon 1.2.3 Closes gh-20078

view details

Stephane Nicoll

commit sha 7dccc10803ff4e7ac494b274ea069f9dd23c1182

Upgrade to jOOQ 3.12.4 Closes gh-20079

view details

push time in 10 days

push eventhtynkn/dubbo

AndyXu

commit sha 89e1b94f1017e2094b6022bbbe5fd35e84b7b19d

Fix SelectTelnetHandler.telnet IndexOutOfBoundsException

view details

push time in 10 days

push eventhtynkn/spark

zhengruifeng

commit sha 12e1bbaddbb2ef304b5880a62df6683fcc94ea54

Revert "[SPARK-30642][SPARK-30659][SPARK-30660][SPARK-30662]" ### What changes were proposed in this pull request? Revert #27360 #27396 #27374 #27389 ### Why are the changes needed? BLAS need more performace tests, specially on sparse datasets. Perfermance test of LogisticRegression (https://github.com/apache/spark/pull/27374) on sparse dataset shows that blockify vectors to matrices and use BLAS will cause performance regression. LinearSVC and LinearRegression were also updated in the same way as LogisticRegression, so we need to revert them to make sure no regression. ### Does this PR introduce any user-facing change? remove newly added param blockSize ### How was this patch tested? reverted testsuites Closes #27487 from zhengruifeng/revert_blockify_ii. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>

view details

Yuanjian Li

commit sha 3db3e39f1122350f55f305bee049363621c5894d

[SPARK-28228][SQL] Change the default behavior for name conflict in nested WITH clause

### What changes were proposed in this pull request?
This is a follow-up for #25029; in this PR we throw an AnalysisException when a name conflict is detected in a nested WITH clause. In this way, the config `spark.sql.legacy.ctePrecedence.enabled` should be set explicitly for the expected behavior.

### Why are the changes needed?
The original change might be risky to end-users; it changes behavior silently.

### Does this PR introduce any user-facing change?
Yes, the config `spark.sql.legacy.ctePrecedence.enabled` is changed to be optional.

### How was this patch tested?
New UT.

Closes #27454 from xuanyuanking/SPARK-28228-follow.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

Yuanjian Li

commit sha e1cd4d9dc25ac3abe33c07686fc2a7d1f2b5c122

[SPARK-29587][DOC][FOLLOWUP] Add `SQL` tab in the `Data Types` page ### What changes were proposed in this pull request? Add the new tab `SQL` in the `Data Types` page. ### Why are the changes needed? New type added in SPARK-29587. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Locally test by Jekyll. ![image](https://user-images.githubusercontent.com/4833765/73908593-2e511d80-48e5-11ea-85a7-6ee451e6b727.png) Closes #27447 from xuanyuanking/SPARK-29587-follow. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

view details

push time in 10 days

push eventhtynkn/dubbo-admin

push time in 10 days

push eventhtynkn/dubbo-samples

push time in 10 days

push eventhtynkn/fish-redux

push time in 10 days

push eventhtynkn/rpc-benchmark

push time in 10 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha de658dc282cc905361ca303c29ebe5667517bbd8

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha 337de83be148ff5b0289d8988b3550652900cc5f

fix

view details

Huang Yunkun

commit sha 6f8e171d510f5382dc636f5076837736d81368c0

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha fc20815602256b76bf46a24c065a227435364bed

fix

view details

Huang Yunkun

commit sha bd1bc04e61d58b8bf5452e7e352de1272841f597

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha 95a2da08f9409994116f3a9f6c9baf15f092e5c1

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha 7ae778c01fcb228bd9440b8a21a8f0d27739ad9d

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha 6e4d8a11a1bf663b2a3dc00d7903cd290b29a314

fix

view details

push time in 11 days

push eventhtynkn/dubbo-samples

Huang Yunkun

commit sha 17c27a1a84ed4f9f199512c18b2fbb6943032e28

add github action for integration test

view details

push time in 11 days
