Python Polars 0.19.12
⚠️ Deprecations
- Deprecate `nans_compare_equal` parameter in assert utils (#12019)
- Rename `ljust`/`rjust` to `pad_end`/`pad_start` (#11975)
- Deprecate `shift_and_fill` in favor of `shift` (#11955)
- Deprecate `clip_min`/`clip_max` in favor of `clip` (#11961)
🚀 Performance improvements
- improve parquet downloading (#12061)
- fix regression non-null asof join (#11984)
- drastically improve performance of limit on async parquet datasets (#11965)
✨ Enhancements
- Add supertype for `List`/`Array` (#12016)
- enable eq and neq for array dtype (#12020)
- Expressify n of shift (#12004)
- add dedicated `name` namespace for operations that affect expression names (#11973)
- optimize asof_join and allow null/string keys (#11712)
- limit concurrent downloads in async parquet (#11971)
- sample fraction can take an expr (#11943)
- Add `infer_schema_length` to `pl.read_json` (#11724)
🐞 Bug fixes
- Fix `get_index`/iteration for `Array` types (#12047)
- improved xlsx2csv defaults for `read_excel` (#12081)
- str.concat on empty list (#12066)
- fix issue with invalid `Mapping` objects used as schema being silently ignored (#12027)
- improve ingest from `numpy` scalar values (#12025)
- binary agg should group aware if literal not a scalar (#12043)
- Use Arrow schema for file readers (#12048)
- Error on duplicates in hive partitioning (#12040)
- display fmt for str split (#12039)
- sum_horizontal should not always cast to int (#12031)
- fix apply_to_inner's dtype (#12010)
- Allow inexact checking of nested integers (#12037)
- Fix padding for non-ASCII strings (#12008)
- fix dot visualization of anonymous scans (#12002)
- SQL table aliases (#11988)
- fix streaming multi-column/multi-dtype sort (#11981)
- ensure streaming parquet datasets deal with limits (#11977)
- implement proper hash for identifier in cse (#11960)
- fix take return dtype in group context. (#11949)
- fix panic in format of anonymous scans (#11951)
- sql In should work without specific ops (#11947)
- construct list series from any values subject to dtype (#11944)
🛠️ Other improvements
- minor updates to lint-related dependencies (#12073)
- Add Excel page to user guide (#12055)
- Direct CONTRIBUTING to the docs website (#12042)
- Replace `black` by `ruff format` (#11996)
- Further assert utils refactor (#12015)
- Remove stacklevels checker utility script (#11962)
- Disable type checking for `dataframe_api_compat` dependency (#11997)
- Fix release tag (#11994)
- optimize asof_join and allow null/string keys (#11712)
- Add `Development` and `Releases` sections to the documentation (#11932)
- include the "build" dir when running `make clean` for docs (#11970)
- make cloning `PyExpr` consistent (#11956)
- fix take return dtype in group context. (#11949)
- warn about scan_pyarrow_dataset's limitations and suggest scan_parquet instead (if possible) (#11952)
- Add `set_fmt_table_cell_list_len` to API docs (#11942)
Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Rohxn16, @alexander-beedie, @braaannigan, @brayanjuls, @messense, @nameexhaustion, @orlp, @reswqa, @ritchie46, @squnit, @stinodego and @universalmind303