Apache Spark 3.4.0 is the fifth release of the 3.x line. With tremendous contribution from the open-source community, this release managed to resolve in excess of 2,600 Jira tickets.

This release introduces a Python client for Spark Connect, augments Structured Streaming with async progress tracking and Python arbitrary stateful processing, increases Pandas API coverage and provides NumPy input support, simplifies the migration from traditional data warehouses by improving ANSI compliance and implementing dozens of new built-in functions, and boosts development productivity and debuggability with memory profiling.

To download Apache Spark 3.4.0, visit the downloads page. You can consult JIRA for the detailed changes. We have curated a list of high-level changes here, grouped by major modules.

- Python client for Spark Connect (SPARK-39375)
- Implement support for DEFAULT values for columns in tables (SPARK-38334)
- Support TIMESTAMP WITHOUT TIMEZONE data type (SPARK-35662)
- Support "Lateral Column Alias References" (SPARK-27561)
- Harden SQLSTATE usage for error classes (SPARK-41994)
- Enable Bloom filter Joins by default (SPARK-38841)
- Better Spark UI scalability and Driver stability for large applications (SPARK-41053)
- Async Progress Tracking in Structured Streaming (SPARK-39591)
- Python Arbitrary Stateful Processing in Structured Streaming (SPARK-40434)
- Pandas API coverage improvements (SPARK-42882) and NumPy input support in PySpark (SPARK-39405)
- Provide a memory profiler for PySpark user-defined functions (SPARK-40281)
- Implement PyTorch Distributor (SPARK-41589)
- Support IPv6-only environment (SPARK-39457)
- Customized K8s Scheduler (Apache YuniKorn and Volcano) GA (SPARK-42802)
- Add Dataset.as(StructType) (SPARK-39625)
- Support parameterized SQL (SPARK-41271, SPARK-42702)
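To give a feel for why the "Enable Bloom filter Joins by default" item (SPARK-38841) matters, here is a minimal pure-Python sketch of the general idea: build a compact Bloom filter from the join keys of the small side, then use it to discard rows of the large side that cannot possibly match before doing the exact join. This is a toy illustration, not Spark's implementation; the names `BloomFilter` and `bloom_filtered_join`, the bit-array size, and the hashing scheme are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes positions by salting a SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_filtered_join(small, large):
    """Inner-join two lists of (key, value) pairs, pre-filtering the large side.

    Returns the joined rows and the number of large-side rows that
    survived the Bloom filter.
    """
    bloom = BloomFilter()
    lookup = {}
    for key, value in small:
        bloom.add(key)
        lookup.setdefault(key, []).append(value)

    # Rows whose key is definitely absent from the small side are dropped early.
    survivors = [(k, v) for k, v in large if bloom.might_contain(k)]

    # The exact join still compares real keys, so false positives are harmless.
    joined = [(k, sv, lv) for k, lv in survivors for sv in lookup.get(k, [])]
    return joined, len(survivors)
```

Calling `bloom_filtered_join([(1, "a"), (2, "b")], [(i, i * 10) for i in range(1000)])` returns the two matching rows along with a survivor count far smaller than the 1,000 input rows. In a distributed engine the benefit comes from shrinking the large side before it is shuffled; the sketch only shows the filtering logic on a single machine.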