Python Polars 1.15.0
🚀 Performance improvements
- Reduce the size of row encoding UTF-8 (#19911)
- Memoize duplicates in rolling-gb-dyn (#19939)
- More efficient row encoding for pl.List (#19907)
- Halve the size of Booleans in row encoding (#19927)
- Rolling 'iter_lookbehind' breeze through duplicates (#19922)
- Initially trim leading and trailing filtered rows (#19850)
✨ Enhancements
- Catch use of 'polars' in to_string for non-Duration dtypes and raise an informative error (#19977)
- Add AhoCorasick backed 'find_many' (#19952)
- Allow Python Enums as dtype inputs (#19926)
- Speed up starts_with for small prefixes (#19904)
- Auto-enable hive partitioning if hive_schema was given (#19902)
- Add pl.concat_arr to concatenate columns into an Array column (#19881)
- Support both "iso" and "iso:strict" format options for dt.to_string (#19840)
- Add rounding for Decimal type (#19760)
- Improved array arithmetic support (#19837)
🐞 Bug fixes
- Fix Decimal type fill_null (#19981)
- Fix panic on schema merge for prefiltering (#19972)
- Fix lazy frame join expression (#19974)
- Fix gather_every for Scalar (#19964)
- Toggle 'fast_unique' on new_from_index (#19956)
- Parse uppercase config keys (#19852)
- Raise proper error message when too small interval is passed to datetime_range (#19955)
- Fix scalar object (#19940)
- Raise InvalidOperationError for invalid float to decimal casts (e.g. Inf, NaN) (#19938)
- Address indexing edge-case with numpy arrays (#19895)
- Fix panic with combination of hive and parquet prefiltering (#19905)
- Fix panic when joining with empty frame (debug only) (#19896)
- Fix incorrect result from inequality filter after join on LazyFrame (#19898)
- Misleading ShapeError error message on dataframe creation (#19901)
- Fix panic with empty delta scan, or empty parquet scan with a provided schema (#19884)
- Ensure type object of inputs for cached any-value conversion functions are kept alive (#19866)
- Improve export from 2D Array dtype columns to PyTorch Tensors (to_torch) and Jax Arrays (to_jax) (#19862)
- Fix panic using scan_parquet().with_row_index() with hive partitioning enabled (#19865)
- Improve histogram bin logic (#18761)
- Raise informative error instead of panicking for list arithmetic on some invalid dtypes (#19841)
- Properly handle Zero-Field Structs in row encoding (#19846)
- Incorrect explode schema for LazyFrame.explode() (#19860)
- DataFrame rows_by_key returning key tuples with elements in wrong order (#19486)
- Ensure List element truncation ellipses respect ASCII* table formats (#19835)
📖 Documentation
- Remove duplicate sentence in Series.bottom_k docstring (#19947)
- Complete parameters description and add an example for clip() (#19875)
- Fix some warnings during docs build (#19848)
📦 Build system
- Use public windows runners in python release (#19982)
- Add windows-aarch64 to python binaries (#19966)
🛠️ Other improvements
- Minor non-breaking space (&nbsp;) tweak for HTML rendering (#19864)
- Implement nested row encoding / decoding (#19874)
- Switch back to PyO3 0.22 (#19851)
- Adjust flaky with_columns test (#19844)
- Add proper tests for row encoding (#19843)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @barak1412, @coastalwhite, @etiennebacher, @ion-elgreco, @itamarst, @lukemanley, @mcrumiller, @mhogervo, @nameexhaustion, @orlp, @ritchie46, @stijnherfst and @stinodego