Releases · pola-rs/polars

27 Sep 15:35

github-actions

py-0.19.5

b83bf67

Python Polars 0.19.5

🚀 Performance improvements

remove double memcopy (#11365)
address perf regression (#11354)

🐞 Bug fixes

revert invalid runtime check (#11363)
more cloud urls (#11361)
ensure cloud globbing can deal with spaces (#11360)
recognize more cloud urls (#11357)

🛠️ Other improvements

Disable version warning banner for now (#11359)
Fix error message reference to infer_schema_length (#11358)
Mark some tests as slow (#11350)
improve parametric tests for group_by_rolling by skipping overflowing cases (#11286)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @jonashaag, @orlp, @ritchie46 and @stinodego

27 Sep 10:01

github-actions

py-0.19.4

66f0a6d

Python Polars 0.19.4

🏆 Highlights

support 'hive partitioning' aware readers (#11284)
natively support reading parquet for aws, gcp and azure (#11210)
Add support for Iceberg (#10375)
The great expressification by @reswqa (#11320, #11344, #11313, #11257, #11288, #11275, #11197, #11167, #11155)
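
For readers unfamiliar with the term, a hive-partitioned dataset encodes column values in directory names of the form `key=value`. The sketch below is a pure-Python illustration of that path convention only. It is not the Polars implementation, and the helper name `parse_hive_partitions` and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def parse_hive_partitions(path: str) -> dict[str, str]:
    """Collect key=value directory segments from a hive-style path.

    Hypothetical helper for illustration; not part of Polars.
    """
    partitions: dict[str, str] = {}
    # Only directory components carry partition values, so skip the file name.
    for segment in PurePosixPath(path).parent.parts:
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


example = "dataset/year=2023/month=09/part-0.parquet"  # made-up path
print(parse_hive_partitions(example))
```

A partition-aware reader can expose such keys as columns, which in principle lets a filter on `year` skip whole directories without opening their files.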

⚠️ Deprecations

Add disable_string_cache (#11020)

🚀 Performance improvements

improve dynamic_groupby_iter (#11341)
improve and fix rolling windows by linear scanning (#11326)
faster init from pydantic models that have a small number of fields, and support direct init from SQLModel data (often used with FastAPI) (#11263)
improve outer join materialization (#11241)
use ryu and itoa for primitive serialization (#11193)
use try-binary-elementwise instead of try-binary-elementwise-values in dt_truncate (#11189)
Using cache for str.contains regex compilation (#11183)

✨ Enhancements

introduce 'label' instead of 'truncate' in group_by_dynamic, which can take label='right' (#11337)
Expressify list.shift (#11320)
top_k and bottom_k support passing an expr (#11344)
add "pyxlsb" engine support to read_excel (for excel binary workbook files) (#11248)
support 'hive partitioning' aware readers (#11284)
str.strip_chars supports taking an expr argument (#11313)
sample n can take an expr (#11257)
Add disable_string_cache (#11020)
clip supports expr arguments and physical numeric dtype (#11288)
Introduce list.drop_nulls (#11272)
str.splitn and split_exact can take an expr argument (#11275)
introduce ambiguous option for dt.round (#11269)
Adds NULLIF and COALESCE SQL functions (#11124)
better tree-formatting representation (#11176)
natively support reading parquet for aws, gcp and azure (#11210)
Expressify str.strip_prefix & suffix (#11197)
Add support for Iceberg (#10375)
list.join's separator can be expression (#11167)
argument every of datetime.truncate can be expression (#11155)

🐞 Bug fixes

Fix Series.__contains__ for None values and implement is_in for null Series (#11345)
don't panic on multi-nodes in streaming conversion (#11343)
ensure trailing quote is written for temporal data when CSV quote_style is non-numeric (#11328)
clarify has_validity docstring and fix several cases where the presence of a bitmask was used to incorrectly infer the existence of null values (#11319)
fix empty Series construction edge-case with Struct dtype (#11301)
DataFrame init from collections.namedtuple values (#11314)
Exclude functools wrapper frames in find_stacklevel (#11292)
set partitions independent of thread pool (#11304)
address VSCode issue with autocomplete on selector expressions in editor/console (#11235)
consume duplicates in rolling_by window (#11261)
handle url encoded paths in objectpath creation (#11240)
use POOL when writing csv (#11222)
don't conflate saved Config JSON string with file path (#11098)
is_in for bool evaluated has_false incorrectly (#11217)
improve handling of database drivers that can return arrow data (#11201)
fix nullable filter mask in group_by (#11207)
replace n-th in filter (#11206)
fix translation of Series-nested datetime/date values for scan_pyarrow predicates (#11195)
address unexpected expression name from use of unary - or + operators (#11158)
impl hash for more function expr (#11182)
list.join's separator can be expression (#11167)
Add some missing expr type hint for series (#11171)
consistently use negative every as the default for offset in group_by_dynamic (#11164)
Make pl.struct serializable (#11169)
only raise on actual parameter collision when "dtypes" specified in read_excel "read_csv_options" (#11162)
propagate null value for str/binary starts/ends_with and contains (#11141)

🛠️ Other improvements

simplify/clarify group_by_dynamic examples (#11335)
tighten assert_frame_equal for LazyFrames (don't collect until after the schema has been checked) (#11331)
unify display for namespaced function expr (#11342)
add lazy pivot example (#11325)
Use GITHUB_TOKEN to get contributor information for docs (#11321)
Enable version warning banner (#11322)
cross-reference null_count from has_validity (clarifies the correct way to check for nulls) (#11323)
Pin pydantic in dev requirements <2.4.0 (#11312)
remove default auto-explode for map_many_private (#11270)
Add type alias IntoExprColumn (#11296)
update a few dependencies (#11283)
Properly skip ADBC test (#11282)
Fix some minor Makefile issues (#11276)
update sponsors (#11271)
parametric tests for group_by_rolling (#11262)
Make some list function expr non-anonymous (#11230)
Mention the performant feature only once (#11223)
remove unneeded indirection (#11233)
remove unneeded mutex around object-store (#11224)
clarify every/period/offset in group_by_dynamic (#11175)
Fix read_database batch_size docstring (#11132)

Thank you to all our contributors for making this release possible!
@ByteNybbler, @Cheukting, @Fokko, @Hofer-Julian, @MarcoGorelli, @SeanTroyUWO, @alexander-beedie, @billylanchantin, @jonashaag, @mcrumiller, @orlp, @ptiza, @reswqa, @ritchie46, @stinodego and @universalmind303

17 Sep 16:31

github-actions

rs-0.33.0

7f8cd7d

Rust Polars 0.33.0

🏆 Highlights

implementing sink_csv for LazyFrame (#10682)

💥 Breaking changes

empty product returns identity (#10842)
return f64 for rank when method="average" (#10734)
Rename groupby to group_by (#10654)
Read/write support for IPC streams in DataFrames (#10606)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
remove fixed_seed and add pl.set_random_seed (#10388)
Make arange an alias for int_range (#9983)
date_range/time_range no longer return a List type (#10526)
Remove various functionalities deprecated before 0.18 (#10527)
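
The Kleene logic mentioned for all/any (#10564) is three-valued logic in which a null means "unknown". The sketch below is a plain-Python illustration of those rules, using `None` for null. The function names `kleene_any` and `kleene_all` are hypothetical, and the notes above do not say which behaviour Polars uses by default, so treat this as a description of the logic rather than of the library.

```python
from typing import Iterable, Optional


def kleene_any(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """True if any value is True; otherwise unknown if a null was seen."""
    saw_null = False
    for v in values:
        if v is True:
            return True  # one True decides the result regardless of nulls
        if v is None:
            saw_null = True
    return None if saw_null else False


def kleene_all(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """False if any value is False; otherwise unknown if a null was seen."""
    saw_null = False
    for v in values:
        if v is False:
            return False  # one False decides the result regardless of nulls
        if v is None:
            saw_null = True
    return None if saw_null else True


print(kleene_any([False, None]))  # unknown: the null might have been True
print(kleene_all([False, None]))  # decided: a False is already present
```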

⚠️ Deprecations

Rename is_first/last to is_first/last_distinct (#11130)
Rename count_match to count_matches (#11028)
Rename strip to strip_chars (#10813)
Add datetime_range expression function (#10213)
Rename Series/Expr.rolling_apply to rolling_map (#10750)

🚀 Performance improvements

improve performance of fast projection (#10945)
parse time zones outside of downcast_iter() in replace_time_zone (#10713)
use binary abstraction for atan2 (#10588)
use binary abstraction in pow (#10562)

✨ Enhancements

Expressify str.split argument. (#11117)
Expressify argument of binary contains (#11091)
dt.offset_by supports broadcasting lhs (#11095)
Expressify argument of binary starts_with and ends_with (#11076)
json_extract supports extracting static and string values to list dtype (#11057)
add quote_style="never" option for write_csv (#11015)
add support for nextest (#11048)
Add literal for str count_match (#10996)
More dtypes support cast to list (#11025)
ParquetCloudSink to allow streaming pipelines into remote ObjectStores (#10060)
Add strip_prefix and strip_suffix to the string namespace (#10958)
Add datetime_range expression function (#10213)
add proper cache for Regex compilation (#10934)
implementation of array_to_string (#10839)
apply left side predicate pushdown also to right side if all predicate columns are also join columns (#10841)
accept expr in str.count_match (#10900)
accept expressions in .offset_by (#9967)
implement drop as special case of select (#10885)
Supports is_last operation (#10760)
activate cse for group_by (again) (#10749)
add pairwise float sum implementation (#10756)
implementing sink_csv for LazyFrame (#10682)
Supports series unique & arg_unique & n_unique for list (#10743)
repeat_by should also support broadcasting of LHS (#10735)
deprecate 'use_earliest' argument in favour of 'ambiguous', which can take expressions (#10719)
is_first also supports numeric list type. (#10727)
improve slice pushdown in unions (#10723)
Support min and max strategy for binary & str columns fill null (#10673)
support broadcasting in list set operations (#10668)
add truncate_ragged_lines (#10660)
supports cast to list (#10623)
Rename groupby to group_by (#10654)
preserve whitespace in notebook output (#10644)
Read/write support for IPC streams in DataFrames (#10606)
improve binary (arity) generics (#10622)
propagate null in is_in and more generic array construction (#10614)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
frame-level cast support (#10504)
Add failed column to cast exception (#10507)
Make arange an alias for int_range (#9983)
date_range/time_range no longer return a List type (#10526)
Remove various functionalities deprecated before 0.18 (#10527)
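
As background for the pairwise float sum entry (#10756): pairwise summation splits the input in half recursively and adds the partial sums, which generally keeps rounding error growth lower than a single left-to-right loop. The following is a minimal pure-Python sketch of the general technique, not the Rust code from that PR. The name `pairwise_sum` and the `block_size` cutoff of 128 are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def pairwise_sum(values: Sequence[float], block_size: int = 128) -> float:
    """Sum floats by recursive halving (illustrative sketch only)."""
    n = len(values)
    # Small blocks are summed with a plain loop to limit recursion overhead.
    if n <= max(block_size, 1):
        total = 0.0
        for v in values:
            total += v
        return total
    mid = n // 2
    return pairwise_sum(values[:mid], block_size) + pairwise_sum(values[mid:], block_size)


data = [0.1] * 10_000
# Compare against math.fsum, which tracks exact partial sums.
print(abs(pairwise_sum(data) - math.fsum(data)))
```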

🐞 Bug fixes

Correct hash and fmt for struct expr (#11119)
enforce sortedness of by argument in rolling_* functions (#11002)
Filter on empty objectChunked should not throw error (#11073)
ensure null_count statistics accounts for null array (#11070)
toggle off cse if ext_context is used (#11051)
Correct field dtype of string concat (#11055)
pushed-down expr should be considered when evaluating ExternalContext (#11023)
fix rolling_* functions when "by" has nanosecond resolution (#11005)
Don't reuse member for Selector::Add (#11026)
fix the construction of List<Null> (#10969)
allow singular null in regex pattern (#10948)
compute length of null array in explode (#10946)
Allow exactly one value in start/end for int_range (#10914)
count was falsely tagged as cse in group by (#10917)
Retain original dtype when deserializing an empty list (#10893)
CSE doesn't accept opaque functions (#10905)
Make int_range(s) exclusive on the upper bound when step is negative (#10898)
fix conversion from decimal to float (#10776)
Add broadcasting for list comparisons (#10857)
don't overflow length before checking limit (#10883)
fix bug where datetimes were not parsed in read_csv when pattern had no hour or minute (#10877)
tag amortized iter unsafe and add safe alternatives (#10881)
use pool in dataframe arithmetic (#10864)
remove debug println! from datetime fn (#10862)
repair polars_err string interpolation (#10863)
make count_match docs and extract_all docs/impl consistent around zero matches (#10854)
empty product returns identity (#10842)
never panic in hash/equality doesn't hold in cse (#10836)
Improve bound checks on temporal ranges (#10837)
var/std behavior around few elements (#10828)
Fix divide-by-zero error when reading empty csv in streaming mode (#10819)
fix equality of quantile aggregation node (#10816)
Reading an only-header csv file in streaming mode should not panic (#10810)
get_single_leaf can't handle Expr::Count (#10790)
string to decimal parsing (#10712)
support groupby literal in streaming (#10771)
ORDER BY on unselected columns (#10752)
Fix is_in cannot cast list type for float (#10769)
fix unicode truncation in json parsing (#10761)
Error message of list unique should not display inner type (#10748)
create chunks_mut entry in vtable (#10745)
Prevent panic on sample_n with replacement from empty df (#10731)
only preserve sortedness flag in replace_time_zone when safe (#10738)
Error on value_counts on column named "counts" (#10737)
Build Series from empty Series vector (#10558)
return f64 for rank when method="average" (#10734)
Keep min/max and arg_min/arg_max consistent. (#10716)
Fix bug when providing custom labels and opting for duplicates in qcut (#10686)
Cast small int type when scan csv in streaming mode. (#10679)
Reused input series in rolling_apply should not be orderly (#10694)
re-sort buffer when update window swap the whole buffer (#10696)
Set the correct fast_explode flag for ListUtf8ChunkedBuilder (#10684)
Sorted Utf8Chunked max_str and min_str should consider null value (#10675)
AllHorizontal format string (#10658)
List<null> chunked builder should take care of series name (#10642)
respect 'ignore_errors=False' in csv parser (#10641)
fix rename + projection pushdown (#10624)
fix int/float downcast in is_in (#10620)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
Fix serialization for categorical chunked. (#10609)
join_asof missing tolerance implementation, address edge-cases (#10482)
Take input_schema to create physical expr for Selection (#10571)
fix serialization of empty lists (#10563)
Clear window cache after evaluate predication expr (#10505)
Parsing regex col in Expr::Columns (#10551)
sanitize column naming in boolean ops (#10531)
fix build for wasm (#10536)
remove fixed_seed and add pl.set_random_seed (#10388)
fix build for wasm (#9502)
rollback cse in groupby: python 0.18.15 (#10491)

🛠️ Other improvements

Removed duplicated example (#11109)
Add CODEOWNERS for docs folder (#11107)
Refactor starts_with and ends_with for string (#11085)
Integrate user guide (#11089)
remove feature gate join/groupby in polars-core (#10965)
Add Documentation issue type (#11042)
complete intra-docs in api documentation (#11007)
genericize take implementation (#10976)
genericize PolarsDataType (#10952)
enhance internal crates readme with reference to main crate (#10928)
Add Duration method for checking full days (#10850)
apply with_name in more places (#10899)
never compare opaque functions (#10906)
eliminate repetition in utf8 datetime functions (#10860)
Fix issue templates for bug reports (#10896)
remove LocalProjection (#10886)
request verbose logging output of minimal reproducible examples (#10882)
Reorganize range expression module (#10871)
introduce with_name for Series/ChunkedArray (#10859)
Further refactor temporal range functions (#10844)
Refactor range related functions (#10830)
Fix the un-compile Black box function parts in polars lazy cookbook (#10809)
Fix some broken links / formatting (#10772)
Improve docs for polars-lazy (#10729)
update rustc nightly_2023-08-26 (#10467)
default to rust native flate2 lib (#10733)
Clear GitHub Actions caches weekly (#10715)
move 'is_in' to polars-ops (#10645)
Clean up schema calculation for date_range (#10653)
remove unused apply functions and add fallible generic apply functions (#10621)
Enforce up-to-date Cargo.lock (#10555)
make binary chunkedarray functions DRY (#10607)
bump MSRV to 1.65 (#10568)
genericize chunk implementation (#10506)
use ChunkArray::(try_)from_chunk_iter (#10497)
add VSCode rust-analyzer settings (#10498)
Update URLs for dev documentation (#10495)
Update features for latest flate2 release (#10492)

Thank you to all our contributors for making this release possible!
@Barsik-sus, @I8dNLo, @JulianCologne, @KacpiW, @MarcoGorelli, @Object905, @OndrejSlamecka, @Qqwy, @SeanTroyUWO, @TNieuwdorp, @VasanthakumarV, @alexander-beedie, @aminalaee, @an...

Contributors

jonashaag, OndrejSlamecka, and 45 other contributors

15 Sep 15:32

github-actions

py-0.19.3

e8949ff

Python Polars 0.19.3

🏆 Highlights

Polars plugins (#10924)

⚠️ Deprecations

Rename is_first/last to is_first/last_distinct (#11130)
Rename count_match to count_matches (#11028)
Rename strip to strip_chars (#10813)
Add datetime_range expression function (#10213)

🚀 Performance improvements

optimize _unpack_schema() (#11080)
optimize polars.utils._post_apply_columns() (#11086)
optimize polars.utils._post_apply_columns() (#11041)
optimize _unpack_schema() (#10960)
improve performance of fast projection (#10945)

✨ Enhancements

Expressify str.split argument. (#11117)
Polars plugins (#10924)
better async_collect (#10912)
Expressify argument of binary contains (#11091)
dt.offset_by supports broadcasting lhs (#11095)
Expressify argument of binary starts_with and ends_with (#11076)
add OpenOffice spreadsheet support via new pl.read_ods function (#11011)
json_extract supports extracting static and string values to list dtype (#11057)
add quote_style="never" option for write_csv (#11015)
Add literal for str count_match (#10996)
More dtypes support cast to list (#11025)
Add strip_prefix and strip_suffix to the string namespace (#10958)
improve read_excel table data identification (#10953)
Add from_dataframe fast path and improve typing (#10979)
add openpyxl as a new/optional engine for read_excel (#6183)
Add datetime_range expression function (#10213)

🐞 Bug fixes

Correct hash and fmt for struct expr (#11119)
enforce sortedness of by argument in rolling_* functions (#11002)
Make Series.__getitem__ raise an IndexError (#11061)
Filter on empty objectChunked should not throw error (#11073)
ensure null_count statistics accounts for null array (#11070)
toggle off cse if ext_context is used (#11051)
Correct field dtype of string concat (#11055)
fix partial schema init with read_dicts and reduce latency of small-frame creation (#11047)
pushed-down expr should be considered when evaluating ExternalContext (#11023)
fix rolling_* functions when "by" has nanosecond resolution (#11005)
Don't reuse member for Selector::Add (#11026)
ensure series_equal properly accounts for dtypes when strict=True (#11012)
fix the construction of List<Null> (#10969)
write_excel "hidden_columns" parameter fails when taking a selector (#10987)
allow singular null in regex pattern (#10948)
compute length of null array in explode (#10946)

🛠️ Other improvements

remove low contrast coloring from visited links (#11133)
Ignore matplotlib warning (#11129)
Do not run user guide examples by default (#11128)
Ignore matplotlib mypy warnings (#11126)
Add deprecation message in groupby docs (#11121)
Removed duplicated example (#11109)
Add CODEOWNERS for docs folder (#11107)
Refactor starts_with and ends_with for string (#11085)
Integrate user guide (#11089)
remove mentions of the deprecated random module (#11087)
simplify SchemaDefinition type alias (#11077)
put fetch explanation in a "notes" block to better highlight it in the docs (#11058)
remove feature gate join/groupby in polars-core (#10965)
Add Documentation issue type (#11042)
warn that "by" argument must be sorted for results to be correct in rolling_* functions (#11013)
Adds missing method refs in LazyDataFrame API docs (#11027)
Add lint for boolean trap (#11010)
Add private LazyFrame method for setting sink optimizations (#10988)
Enable a few more ruff lints (#10998)
document polars string duration language in temporal range functions (#10978)
Additional tests for interchange get_data_buffer (#10966)
genericize PolarsDataType (#10952)
Document that filter, drop_nulls, left join preserve order (#10955)
add note about adbc flight sql driver (#10949)
Revert pydantic >= 2.0.0 requirement (#10944)
note that pl.duration represents fixed durations, point to offset_by for non-fixed (#10927)
Test S3 functionality using moto server (#10164)

Thank you to all our contributors for making this release possible!
@I8dNLo, @KacpiW, @MarcoGorelli, @Object905, @Qqwy, @TNieuwdorp, @alexander-beedie, @antoniocali, @bvanelli, @cjackal, @henrikig, @jakob-keller, @mrogowski11, @nameexhaustion, @orlp, @reswqa, @ritchie46, @s-banach, @stinodego, @svaningelgem and @thomasjpfan

05 Sep 14:33

github-actions

py-0.19.2

5aa9d04

Python Polars 0.19.2

🏆 Highlights

Add syntactic sugar for col("foo") -> col.foo (#10874)

⚠️ Deprecations

Rename Expr.is_not() to not_() (#10838)

✨ Enhancements

allow individual Config options to be easily reset to their default value (#10922)
accept expr in str.count_match (#10900)
allow additional glimpse customisation, fix strings repr (#10895)
accept expressions in .offset_by (#9967)
support schema overrides for frames created from databases (#10884)
Add syntactic sugar for col("foo") -> col.foo (#10874)
support negative indexing in set_at_idx (#10891)
implement drop as special case of select (#10885)
raise a more helpful error when non-query statements passed to read_database (#10851)

🐞 Bug fixes

Allow exactly one value in start/end for int_range (#10914)
raise error when function didn't receive any inputs (#8635)
count was falsely tagged as cse in group by (#10917)
CSE doesn't accept opaque functions (#10905)
Make int_range(s) exclusive on the upper bound when step is negative (#10898)
don't overflow length before checking limit (#10883)
fix bug where datetimes were not parsed in read_csv when pattern had no hour or minute (#10877)
use pool in dataframe arithmetic (#10864)
repair polars_err string interpolation (#10863)
make count_match docs and extract_all docs/impl consistent around zero matches (#10854)

🛠️ Other improvements

Set minimum version for pydantic to 2.0.0 (#10923)
fix and clarify docs for Expr.map_elements (#10647)
fix rendering of bullet points in dt.round (#10911)
add test for 10875 (#10913)
apply with_name in more places (#10899)
never compare opaque functions (#10906)
eliminate repetition in utf8 datetime functions (#10860)
Fix issue templates for bug reports (#10896)
request verbose logging output of minimal reproducible examples (#10882)
add a note about read_database connection/cursor behaviour (#10873)
introduce with_name for Series/ChunkedArray (#10859)

Thank you to all our contributors for making this release possible!
@Barsik-sus, @MarcoGorelli, @alexander-beedie, @c-peters, @cmdlineluser, @dependabot, @dependabot[bot], @drgif, @jeroenjanssens, @orlp, @ritchie46, @stinodego and @wdoppenberg

01 Sep 05:31

github-actions

py-0.19.1

ad73217

Python Polars 0.19.1

💥 Breaking changes

empty product returns identity and product ignores nulls (#10842)
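
The rule stated above can be illustrated in plain Python: the identity for multiplication is 1, so a product over no values is 1, and nulls are skipped rather than propagated. This is a sketch of the semantics only, using `None` for null. The helper name `product_ignoring_nulls` is hypothetical and is not a Polars function.

```python
import math
from typing import Iterable, Optional


def product_ignoring_nulls(values: Iterable[Optional[float]]) -> float:
    """Multiply the non-null values; an empty input yields the identity, 1."""
    # math.prod starts from 1, so it returns 1 when nothing is left to multiply.
    return math.prod(v for v in values if v is not None)


print(product_ignoring_nulls([]))            # empty input
print(product_ignoring_nulls([2, None, 3]))  # the null is skipped
```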

✨ Enhancements

add binary, boolean, categorical, date, object, and time selectors (#10806)
Supports is_last operation (#10760)
minor typing improvement for DataFrame.__iter__ (#10825)
Add custom error for allow_copy=False (#10822)

🐞 Bug fixes

empty product returns identity (#10842)
never panic in hash/equality doesn't hold in cse (#10836)
Improve bound checks on temporal ranges (#10837)
var/std behavior around few elements (#10828)
Fix divide-by-zero error when reading empty csv in streaming mode (#10819)
behaviour of reversed(df) (#10823)
fix equality of quantile aggregation node (#10816)
Reading an only-header csv file in streaming mode should not panic (#10810)

🛠️ Other improvements

Refactor range related functions (#10830)
map-related docstring updates (#10779)
Move sink tests to streaming module (#10821)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @orlp, @reswqa, @ritchie46 and @stinodego

30 Aug 14:04

github-actions

py-0.19.0

b1f60cd

Python Polars 0.19.0

An upgrade guide is available on our website.

🏆 Highlights

implementing sink_csv for LazyFrame (#10682)
Support DataFrame init from queries against users' existing database connections (#10649)
Rename groupby to group_by (#10656)

💥 Breaking changes

return f64 for rank when method="average" (#10734)
Update a lot of error types (#10637)
Remove deprecated behavior from vertical aggregations (#10602)
Read/write support for IPC streams in DataFrames (#10606)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
Improve consistency of parsing expression input (#9512)
allow from_arrow to take a generator of RecordBatches, change error type to TypeError (#10529)
remove fixed_seed and add pl.set_random_seed (#10388)
Make arange an alias for int_range (#9983)
date_range/time_range no longer return a List type (#10526)
Remove various functionalities deprecated before 0.18 (#10527)
Improve some error types and messages (#10470)

⚠️ Deprecations

Rename map to map_batches (#10801)
Rename GroupBy.apply to map_groups (#10799)
Rename DataFrame.apply to map_rows (#10797)
Rename Series/Expr.rolling_apply to rolling_map (#10750)
Rename Series/Expr.apply to map_elements (#10678)
Rename groupby to group_by (#10656)
Deprecate some parameters of cut/qcut (#10484)

🚀 Performance improvements

parse time zones outside of downcast_iter() in replace_time_zone (#10713)
use binary abstraction for atan2 (#10588)
use binary abstraction in pow (#10562)

✨ Enhancements

activate cse for group_by (again) (#10749)
implementing sink_csv for LazyFrame (#10682)
Supports series unique & arg_unique & n_unique for list (#10743)
repeat_by should also support broadcasting of LHS (#10735)
deprecate 'use_earliest' argument in favour of 'ambiguous', which can take expressions (#10719)
is_first also supports numeric list type. (#10727)
improve slice pushdown in unions (#10723)
Explicitly implement Protocol for interchange classes (#10688)
Support min and max strategy for binary & str columns fill null (#10673)
support broadcasting in list set operations (#10668)
csv: add schema argument (#10665)
Support DataFrame init from queries against users' existing database connections (#10649)
add truncate_ragged_lines (#10660)
supports cast to list (#10623)
Update a lot of error types (#10637)
preserve whitespace in notebook output (#10644)
Remove deprecated behavior from vertical aggregations (#10602)
support selector usage in write_excel arguments (#10589)
Add LazyFrame.collect_async and pl.collect_all_async (#10616)
Read/write support for IPC streams in DataFrames (#10606)
propagate null in is_in and more generic array construction (#10614)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
frame-level cast support (#10504)
Improve consistency of parsing expression input (#9512)
Add failed column to cast exception (#10507)
allow from_arrow to take a generator of RecordBatches, change error type to TypeError (#10529)
Remove deprecated get_idx_type - use get_index_type instead (#10556)
Make arange an alias for int_range (#9983)
date_range/time_range no longer return a List type (#10526)
Remove various functionalities deprecated before 0.18 (#10527)
Improve some error types and messages (#10470)
suggest str.to_datetime instead of apply and stdlib strptime (#10266)

🐞 Bug fixes

get_single_leaf can't handle Expr::Count (#10790)
support groupby literal in streaming (#10771)
ORDER BY on unselected columns (#10752)
Fix is_in cannot cast list type for float (#10769)
whitespace CSS in Notebook HTML updated to use pre-wrap instead of pre (#10739)
only preserve sortedness flag in replace_time_zone when safe (#10738)
Error on value_counts on column named "counts" (#10737)
return f64 for rank when method="average" (#10734)
Keep min/max and arg_min/arg_max consistent. (#10716)
use time zone from dtype to overwrite output time zone when initialising Series (#10689)
Cast small int type when scan csv in streaming mode. (#10679)
raise exception with invalid on arg type for join_asof (#10690)
Reused input series in rolling_apply should not be orderly (#10694)
re-sort buffer when update window swap the whole buffer (#10696)
Set the correct fast_explode flag for ListUtf8ChunkedBuilder (#10684)
Sorted Utf8Chunked max_str and min_str should consider null value (#10675)
Correctly handle time zones in write_delta (#10633)
fix apply for empty series in threading mode (#10651)
respect 'ignore_errors=False' in csv parser (#10641)
fix rename + projection pushdown (#10624)
fix int/float downcast in is_in (#10620)
Change behavior of all - fix Kleene logic implementation for all/any (#10564)
Fix serialization for categorical chunked. (#10609)
Take input_schema to create physical expr for Selection (#10571)
Clear window cache after evaluate predication expr (#10505)
Parsing regex col in Expr::Columns (#10551)
sanitize column naming in boolean ops (#10531)
Fix write_delta with schema in delta_write_options (#10541)
remove fixed_seed and add pl.set_random_seed (#10388)
respect pl.Config options relating to shape, column names, and types when rendering HTML (#10449)

🛠️ Other improvements

update cargo.lock (#10800)
Create .venv in repo root (#10789)
refactored write_database unit tests to properly separate concerns (#10773)
Fix some broken links / formatting (#10772)
Document chained when-then behaviour more prominently (#10759)
Fix test failing due to new adbc release (#10763)
Unpin connectorx and bump other Python dependencies (#10753)
add note to testing docs about module import (#10741)
Clear GitHub Actions caches weekly (#10715)
Update for new pyarrow 13.0.0 behavior (#10691)
Fix minor issue with sink_parquet docs (#10669)
Remove deprecate_renamed_methods util (#10537)
add "see also" entries to ne/eq_missing and update related examples (#10667)
fix potential memory leak from usage of inspect.currentframe (#10630)
give more relevant example for polars.apply (#10631)
Bump ruff and enable new setting (#10626)
Add docstrings for Expr.meta namespace (#10617)
Enforce up-to-date Cargo.lock (#10555)
deprecate DataFrame.replace (#10600)
ensure that make requirements fully refreshes unpinned packages/deps (#10591)
fix out-of-date explain default parameter (#10566)
Fix expr_dispatch decorator to work on methods with decorators (#10549)
Fix link to source code (#10542)
Add title to index page (#10539)
Disable SIM108 lint (#10519)
Keep versioned docs (#10500)
switch to pyo3/maturin-action (#10503)
Update URLs for dev documentation (#10495)
Skip failing test (#10496)
Add version switcher to API reference (#10488)

Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Object905, @OndrejSlamecka, @SeanTroyUWO, @VasanthakumarV, @alexander-beedie, @aminalaee, @braaannigan, @c-peters, @ion-elgreco, @lorepozo, @marki259, @mcrumiller, @messense, @orlp, @owrior, @rben01, @reswqa, @ritchie46, @sdamashek, @stinodego, @svaningelgem, @titoeb, @trueb2, @washcycle and @zundertj

15 Aug 07:01

github-actions

py-0.18.15

0357177

Python Polars 0.18.15

🐞 Bug fixes

rollback cse in groupby: python 0.18.15 (#10491)

🛠️ Other improvements

Mark import timing check as slow (#10487)
Gather all streaming tests (#10485)
Bump maturin to version 1.2.1 (#10479)

Thank you to all our contributors for making this release possible!
@ritchie46 and @stinodego

14 Aug 13:49

github-actions

rs-0.32.0

ec0c91f

Rust Polars 0.32.0

🏆 Highlights

common subexpression elimination (#9632)

💥 Breaking changes

remove deprecated tz_localize, rename CastTimezone to ReplaceTimeZone (#10070)

⚠️ Deprecations

renaming approx_unique as approx_n_unique (#10290)
remove/deprecate cache and its logic (#10066)
Add date_ranges/time_ranges expression functions (#10005)

🚀 Performance improvements

pre-alloc int_ranges (#10399)
use hash as CSE Identifier (#10385)
re-use regex capture allocation (#10302) (#10335)
don't parallelize literal expressions (#10321)
fix O(n^2) in sorted check during append (#10241)
speedup mode on sorted data (#10084)
speedup boolean apply (#10073)
shrink alp/lp ~2.5x (#10039)
Remove fused arithmetic from expressions with literals (#10011)

✨ Enhancements

quote style option for csv writer (#10422)
add "raise_if_empty" flag to read_excel, read_csv, scan_csv, and read_csv_batched (#10409)
be more permissive on predicate pushdown to left side of left join (#10442)
add use_earliest to to_datetime / strptime (#10426)
{any/all}_horizontal to expression architecture (#10412)
serialize flags (#10140)
allow unaligned pointers in arrow FFI (#10403)
add line_terminator option to write_csv (#10373)
Add is_local and to_local to categorical namespace (#10372)
cse for groupby.agg and reduced cse collisions (#10381)
re-use regex capture allocation (#10302) (#10335)
Add Series.cat.uses_lexical_ordering (#10325)
improve datetime parsing error message (#10332)
allow sequential runners in select/with_columns (#10322)
improve err msg parsing time, date, datetime (#10298)
Add str.extract_groups (#10179)
add extra build profiles (#10268)
Extend datetime expression function with time zone/time unit parameters (#10235)
added gcs to gcp cloud schema in polars-core::cloud #10206. (#10207)
support writing duration type in json (#10112)
inline lit(Series).cast(..) -> lit(Series.cast(..)) (#10092)
Move transpose naming to Rust (#10009)
cse in groupby's (#10062)
Adds sql CASE statement expressions (#10065)
Add date_ranges/time_ranges expression functions (#10005)
comm_subexpr_elim in streaming 'select/with_columns' (#10050)
common subexpression elimination (#9632)
Let qcut create evenly spaced probabilities (#9960)
sorted flag on singletons (#9933)
maintain sorted flag after partition_by (#9944)
keep sorted flag in streaming left join (#9932)
Add cloudpickle for serializing python UDFs (#9921)

🐞 Bug fixes

Fix incorrect handling of VisitRecursion::Skip. (#10452)
fix negative decimal parsing (#10444)
ensure sorted_sink hash equals the default path (#10464)
fix sum agg (#10459)
ensure last aggregation deals with default chunk (#10453)
fix cse input schema (#10450)
fix list groupby of array dtype (#10408)
correct AnyValue::hash (#10391)
finalize cast in partitioned groupby (#10359)
fix oob in 'last' (#10329)
fix categorical lexical sort (#10318)
Fix join validation (#10257)
Set correct dtype for .extract_groups() (#10306)
clear window cache and run windows on proper runners (#10303)
fix sorted fast path in streaming groupby wrt nulls (#10289)
fix nan aggregation in groupby (#10287)
check dtypes of single-column 'by' parameter in asof-join (#10284)
fix pyo3 link errors on macos (#10256)
fix empty streaming parquet file (#10252)
fix logical columns of streaming multi-column sort (#10250)
fix date/datetime parsing for short inputs with exact=False (#10231)
correct agg_sum for ChunkedArray. (#10243)
don't panic in wildcard apply (#10240)
fix cse profile (#10239)
correct struct null counts (#10142)
no cse in groupby until fixed (#10216)
fix is_in on empty series (#10195)
fix cse windows (#10197)
block predicate pushdown is_in and null producing … (#10194)
prevent re-ordering of dict keys inside .apply (#10172)
initialize fixed null values (#10192)
ensure window functions run partitioned when cse is hit (#10170)
adjust for null values in str.replace fast path (#10132)
clear bit settings in list iteration (#10131)
use row-encoded for struct::is_sorted (#10129)
don't run file-caching in streaming mode (#10117)
Allow initialization of pl.Array in DataFrame using schema alone (#10100)
don't panic if masked out values are invalid in temporal kernels (#10114)
Fix struct get field by index out of bounds error. (#10097)
fix ub in simd-json (#10093)
fix invalid access when groupby rolling produces empty sets (#10109)
respect null_on_oob=False in list.take when pa… (#10105)
fix is_sorted for structs (#10099)
add file path to io error in scan_csv (#10076)
fix false positive in parquet stats evaluation (#10087)
fix error message from cast-timezone to replace-time-zone (#10089)
Address .col(regex).exclude() operations not executing. (#10025)
fix Boolean::isin(null values) (#10074)
predicate pushdown #10058 (#10071)
Fix weighted quantile for 0 weights (#10051)
fix incorrect state in projection pushdown with joins (#9987)
don't pass predicates referring to renamed literal… (#9965)
fix regression in regex expansion (#9952)
potential stack overflow in csv infer schema (#9950)
raise on unsupported transpose and object types (#9946)
Fix as-of join when by groups are interleaved (#9938)

🛠️ Other improvements

fix and run polars-plan tests (#10465)
Simplify flag methods (#10429)
match_block_trailing_comma (#10414)
implement ChunkArray::(try_)from_chunk_iter (#10395)
add test for 10401 (#10405)
Bump some dependencies (#10396)
Move dependency version info to workspace level (#10295)
patch reedline until fix released (#10382)
remove wasm-timer dependency (#10347)
write down invariants of ChunkedArray (#10334)
fix typo in lib.rs (#10313)
Exclude examples from workspace default (#10309)
Update CODEOWNERS (#10261)
avoid outputting docs of dependencies (#10292)
Do not keep history in gh-pages branch (#10282)
Use workspace package info / organize dependencies section (#10279)
fix dead links in Rust documentation (#10251)
Fix make pre-commit command (#10205)
Fix make integration-tests command (#10202)
Replace "question" issues with link to Stack Overflow (#10230)
Update dependabot config (#10222)
Fix LICENSE symlink for moved crates (#10150)
Re-organize folder structure for Rust crates (#10141)
update to rustc nightly-2023-07-27 (#10139)
temporarily turn off fail-fast so that ubuntu tests run (#10133)
Refactor when/then/otherwise internals (#9922)
move replace_time_zone to polars-ops (#10078)
remove unneeded branch (#10082)
remove deprecated tz_localize, rename CastTimezone to ReplaceTimeZone (#10070)
fix typo in contribution example (#10038)
correct example in API reference (#10032)
add developer contribution examples (#10013)
Update autolabeler again (#9984)
fix docs build and add to CI (#9904)
Minor makeover for Rust Makefile (#9874)

Thank you to all our contributors for making this release possible!
@0xbe7a, @CanglongCl, @JulianCologne, @MarcoGorelli, @OndrejSlamecka, @OneRaynyDay, @SeanTroyUWO, @StefanBRas, @TLouf, @alexander-beedie, @c-peters, @cjackal, @cmdlineluser, @dependabot, @dependabot[bot], @drgif, @duvenagep, @eltociear, @fsimkovic, @ion-elgreco, @jonashaag, @lfn3, @magarick, @mcrumiller, @orlp, @potzenhotz, @rea1bacon, @reswqa, @rikkaka, @ritchie46, @stinodego, @thomasaarholt, @varunmittal91 and @zundertj

14 Aug 13:48

github-actions

py-0.18.14

ec0c91f

Python Polars 0.18.14

🏆 Highlights

Native implementation of dataframe interchange protocol (#10267)

⚠️ Deprecations

Deprecate behavior of list/tuple inputs for lit (#10461)

🚀 Performance improvements

optimise retrieval of values from df.item (~4-5x speedup) (#10411)
pre-alloc int_ranges (#10399)
use hash as CSE Identifier (#10385)

✨ Enhancements

quote style option for csv writer (#10422)
add "raise_if_empty" flag to read_excel, read_csv, scan_csv, and read_csv_batched (#10409)
add use_earliest to to_datetime / strptime (#10426)
add new "header_format" option for write_excel (#10392)
{any/all}_horizontal to expression architecture (#10412)
Native implementation of dataframe interchange protocol (#10267)
allow unaligned pointers in arrow FFI (#10403)
add line_terminator option to write_csv (#10373)
add explicit selector variants for signed/unsigned integers (#10384)
Add is_local and to_local to categorical namespace (#10372)
enhance selectors expansion function, so it can operate on a schema as well as a frame (#10341)
Order percentiles in describe (#10378)
cse for groupby.agg and reduced cse collisions (#10381)
improve take_every(0) exception (#10352)
add offset and length to get_ptr (#10361)

🐞 Bug fixes

fix pyarrow write_to_dataset wrt check_not_directory parameter (#10471)
fix negative decimal parsing (#10444)
ensure sorted_sink hash equals the default path (#10464)
address inconsistency in init from square numpy arrays with/without an explicit schema (#10445)
ensure last aggregation deals with default chunk (#10453)
fix cse input schema (#10450)
Fix by argument handling in join_asof (#10447)
fix potential OverflowError in testing asserts with huge UInt64 diffs (#10437)
Create delta compatible schema during writing (#10165)
fix list groupby of array dtype (#10408)
correct AnyValue::hash (#10391)
finalize cast in partitioned groupby (#10359)

🛠️ Other improvements

add vertical_relaxed example for pl.concat (#10472)
Run all streaming tests on the same test runner (#10469)
Organize OOC tests (#10463)
add test for 10417 (#10420)
Clean up some Sphinx settings (#10400)
add test for 10401 (#10405)
Address Ruff per file ignores (#10258)
Small improvement for PySeries.get_buffer (#10363)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @OndrejSlamecka, @alexander-beedie, @c-peters, @cmdlineluser, @drgif, @ion-elgreco, @lfn3, @orlp, @potzenhotz, @rea1bacon, @reswqa, @ritchie46, @stinodego and @zundertj
