Releases: pola-rs/polars
Python Polars 0.18.13
⚠️ Deprecations
- Rename `LazyFrame.read/write_json` to `de/serialize` (#10238)
- Add `categorical_as_str` parameter to testing utils (#10350)
🚀 Performance improvements
- don't parallelize literal expressions (#10321)
✨ Enhancements
- support `selectors` in additional frame methods (#10255)
- Add `Series.cat.uses_lexical_ordering` (#10325)
- utility to get buffers and pointers (#10331)
- improve datetime parsing error message (#10332)
- add ptr for small integer types (#10330)
- add offsets utility (#10328)
- allow sequential runners in select/with_columns (#10322)
- warn about inefficient apply json.loads if json is local import (#10310)
- improve err msg parsing `time`, `date`, `datetime` (#10298)
- Add `categorical_as_str` parameter to testing utils
🐞 Bug fixes
- fix oob in 'last' (#10329)
- show inefficient apply warning in ipython (#10312)
- add cse to no_optimization in profile (#10317)
- fix categorical lexical sort (#10318)
- Fix join validation (#10257)
- Set correct dtype for `.extract_groups()` (#10306)
Thank you to all our contributors for making this release possible!
@CanglongCl, @JulianCologne, @MarcoGorelli, @alexander-beedie, @cmdlineluser, @eltociear, @orlp, @ritchie46 and @stinodego
Python Polars 0.18.12
⚠️ Deprecations
- renaming `approx_unique` as `approx_n_unique` (#10290)
- Rename first `qcut` parameter to `quantiles` (#10253)
- Deprecate `avg` alias for `mean` (#10236)
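The renames above follow a common deprecation pattern: the old name keeps working for a while but emits a warning that points to the new one. A minimal plain-Python sketch of that pattern follows. The functions `mean` and `avg` here are hypothetical stand-ins written for illustration only; they are not the Polars implementation.

```python
import warnings


def mean(values):
    """Arithmetic mean of a non-empty sequence (the new, preferred name)."""
    return sum(values) / len(values)


def avg(values):
    """Hypothetical deprecated alias: warn, then forward to `mean`."""
    warnings.warn(
        "`avg` is deprecated; use `mean` instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return mean(values)


# Capture the warning so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = avg([1, 2, 3])
```

Callers of the old name still get the right result, and the recorded warning tells them which name to migrate to.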
🚀 Performance improvements
- fix O(n^2) in sorted check during append (#10241)
✨ Enhancements
- Add `str.extract_groups` (#10179)
- raise `TypeError` for all LazyFrame comparison operators (#10275)
- support bytecode translation to `map_dict` where the lookup key is an expression (#10265)
- add entry point to the Consortium DataFrame API (#10244)
- Extend `datetime` expression function with time zone/time unit parameters (#10235)
- add "batch_size" to `scan_pyarrow_dataset` parameters (#10249)
🐞 Bug fixes
- clear window cache and run windows on proper runners (#10303)
- fix sorted fast path in streaming groupby wrt nulls (#10289)
- Fix interchange protocol allowing copy even when `allow_copy` was set to False (#10262)
- fix nan aggregation in groupby (#10287)
- don't panic on cse if function hasn't implemented __eq__ (#10286)
- fix empty streaming parquet file (#10252)
- fix logical columns of streaming multi-column sort (#10250)
- fix date/datetime parsing for short inputs with exact=False (#10231)
- don't panic in wildcard apply (#10240)
- fix cse profile (#10239)
🛠️ Other improvements
- Update CODEOWNERS (#10261)
- add note about pyarrow partitioning (#10297)
- Do not keep history in `gh-pages` branch (#10282)
- make an explicit note in `read_parquet` and `scan_parquet` about hive-style partitioning (point to `scan_pyarrow_dataset` instead) (#10277)
- Fix typo in error message (#10281)
- Replace "question" issues with link to Stack Overflow (#10230)
- Use sphinx' `maximum_signature_line_length` (#10228)
- add warning about parallel eval of `.then(..)` branches (#10229)
- Update Sphinx to 7.1.1 and bump related dependencies (#10221)
- Update dependabot config (#10222)
Thank you to all our contributors for making this release possible!
@0xbe7a, @MarcoGorelli, @TLouf, @alexander-beedie, @cmdlineluser, @dependabot, @dependabot[bot], @duvenagep, @mcrumiller, @orlp, @reswqa, @ritchie46 and @stinodego
Python Polars 0.18.11
🐞 Bug fixes
- correct struct null counts (#10142)
- no cse in groupby until fixed (#10216)
- avoid false positives from multiple `RETURN_VALUE` ops when checking `apply` lambdas/functions (#10211)
🛠️ Other improvements
- Improve deprecation utils (#10167)
Thank you to all our contributors for making this release possible!
@alexander-beedie, @magarick, @ritchie46, @stinodego and @varunmittal91
Python Polars 0.18.10
✨ Enhancements
- raise a better error message from `read_database` if not passed a string URI (#10191)
- Add pyarrow write_to_dataset to write_parquet function (#9835)
🐞 Bug fixes
- fix `is_in` on empty series (#10195)
- fix cse windows (#10197)
- block predicate pushdown is_in and null producing … (#10194)
- prevent re-ordering of dict keys inside `.apply` (#10172)
- initialize fixed null values (#10192)
- Don't pickle `_scan_impl` (#10175)
- ensure window function run partitioned when cse is hit (#10170)
🛠️ Other improvements
- prepend set_ to set operations on lists (#10182)
- Track version in deprecation utils (#10147)
- Add a simple util `issue_deprecation_warning` (#10146)
- more precise checks for inefficient apply warnings (#10135)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @cjackal, @cmdlineluser, @potzenhotz, @ritchie46 and @stinodego
Python Polars 0.18.9
🏆 Highlights
- common subexpression elimination (#9632)
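As a rough illustration of the idea behind this highlight, the sketch below finds subexpressions that occur more than once across a set of expressions, so that an engine could compute each of them once and reuse the result. Expressions are modelled as nested tuples purely for this example; this is an assumption made for illustration and does not reflect how the Polars optimizer represents or rewrites query plans.

```python
from collections import Counter


def subexpressions(expr):
    """Yield every non-leaf subexpression of a nested-tuple expression.

    An expression is either a leaf (a column name as a string) or a tuple
    of the form (operator, arg1, arg2, ...).
    """
    if isinstance(expr, tuple):
        yield expr
        for arg in expr[1:]:
            yield from subexpressions(arg)


def common_subexpressions(exprs):
    """Return the subexpressions that appear more than once, in first-seen order."""
    counts = Counter(sub for expr in exprs for sub in subexpressions(expr))
    return [sub for sub, n in counts.items() if n > 1]


# (a * b) + c  and  (a * b) - d  share the product a * b.
exprs = [
    ("add", ("mul", "a", "b"), "c"),
    ("sub", ("mul", "a", "b"), "d"),
]
shared = common_subexpressions(exprs)
```

Here `shared` contains only the product of `a` and `b`, which is the one piece of work both expressions have in common.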
⚠️ Deprecations
- Deprecate parsing string inputs as literals for `when-then-otherwise` (#10122)
- deprecate "connection_uri" → "connection" param in read/write database methods (#10134)
- remove/deprecate cache and its logic (#10066)
- Add `date_ranges`/`time_ranges` expression functions (#10005)
🚀 Performance improvements
✨ Enhancements
- suggest map_dict instead of lambda x: DICT[x] (#10123)
- enable "inefficient apply" warnings from `Series` (#10104)
- support writing duration type in json (#10112)
- BytecodeParser can now handle mixed/nested `and/or` control flow (#10085)
- inline `lit(Series).cast(..)` to -> `lit(Series.cast(..))` (#10092)
- Add ArcTan2 to `SQLContext` (#9571)
- cse in groupby's (#10062)
- Adds sql `CASE` statement expressions (#10065)
- Add `date_ranges`/`time_ranges` expression functions (#10005)
- comm_subexpr_elim in streaming 'select/with_columns' (#10050)
- add dataframe.flags property (#10037)
- common subexpression elimination (#9632)
- detect and warn about usage of str/int/float python-based casts with `apply` (#10026)
- detect and warn about usage of `json.loads` in conjunction with `apply` (#10023)
- detect and warn about bare `numpy` functions passed to `apply` (#10021)
- support bytecode identification/mapping of python string-case functions in UDFs (#10007)
- support bytecode identification of `numpy` functions in UDFs that we can map to native expressions (#10003)
🐞 Bug fixes
- adjust for null values in str.replace fast path (#10132)
- clear bit settings in list iteration (#10131)
- use row-encoded for struct::is_sorted (#10129)
- fix(rust, python): don't run file-caching in streaming mode (#10117)
- Allow initialize of pl.Array in Dataframe using schema alone (#10100)
- silence Series.apply inefficient apply warning when calling Expr.apply (#10116)
- don't panic if masked out values are invalid in temporal kernels (#10114)
- Fix struct get field by index out of bounds error. (#10097)
- fix ub in simd-json (#10093)
- fix invalid access when groupby rolling produces empty sets (#10109)
- respect `null_on_oob=False` in `list.take` when pa… (#10105)
- undo regression in scan_parquet from s3 (#10098)
- fix is_sorted for structs (#10099)
- add file path to io error in scan_csv (#10076)
- fix false positive in parquet stats evaluation (#10087)
- Address `.col(regex).exclude()` operations not executing. (#10025)
- address an inadvertently shallow-copy issue on underlying PySeries (#10086)
- fix Boolean::isin(null values) (#10074)
- predicate pushdown #10058 (#10071)
- map 'postgres' URI prefix to ADBC 'postgresql' module (#10018)
- Fix weighted quantile for 0 weights (#10051)
- eager `time_range`/`date_range` dimensions fix (#9996)
🛠️ Other improvements
- get test_udfs running on all python versions again (#10136)
- temporarily turn off fail-fast so that ubuntu tests run (#10133)
- clarify "clones data" in to_numpy (#10095)
- Refactor `when`/`then`/`otherwise` internals (#9922)
- Properly format `Returns` sections of docstrings (#10064)
- much-improved `Instruction` matching for `BytecodeParser` (#10040)
- add pure-python tests and CI for bytecodeparser (#10027)
- split-out expression translation and instruction-rewrite logic from `BytecodeParser` (#10012)
- cleans api sections in docs (#10004)
- Bump some dependencies (#9997)
- Add patchelf extra to maturin (#9995)
- restructure all UDF parsing/translation methods into a new `BytecodeParser` class (#9993)
- Clean up `date_range`/`time_range` (#9985)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @SeanTroyUWO, @alexander-beedie, @c-peters, @cmdlineluser, @jonashaag, @magarick, @mcrumiller, @rikkaka, @ritchie46 and @stinodego
Python Polars 0.18.8
⚠️ Deprecations
🚀 Performance improvements
- Rolling min/max for partially sorted data (#9819)
- Use `pyo3::intern` to avoid needlessly recreating PyString (#9853)
✨ Enhancements
- Name transpose from column (#9846)
- adds `SQRT`, `CBRT`, `PI` functions to `SQLContext` (#9936)
- Let qcut create evenly spaced probabilities (#9960)
- add freeze_panes option to write_excel (#9974)
- initial support for parsing the set of `jump` bytecode instructions required to reconstruct `and/or` logic (#9972)
- suggest more efficient expression if user passes simple lambda to Expr.apply or DataFrame.apply (#9918)
- sorted flag on singletons (#9933)
- maintain sorted flag after partition_by (#9944)
- keep sorted flag in streaming left join (#9932)
- Add cloudpickle for serializing python UDFs (#9921)
- Optional three-valued logic for any/all (#9848)
- Add `Series.extend` (#9901)
- pass through unknown schema in unnest (#9896)
- convenience support for parsing a list of SQL strings with `sql_expr` (#9881)
- respect and allow more options in eager json parsing (#9882)
- allow set_sorted in streaming (#9876)
- Expr.cat.get_categories expression (#9869)
- add `LENGTH` and `OCTET_LENGTH` string functions for SQL (#9860)
- `polars_warn!` macro (#9868)
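The "three-valued logic for any/all" entry above can be pictured with a small plain-Python sketch. It assumes Kleene-style semantics, where `None` stands for an unknown value; the release note itself does not spell out the exact rules, so treat the functions below (`kleene_any`, `kleene_all`, both hypothetical names) as an illustration of the concept rather than as the Polars behaviour.

```python
def kleene_any(values):
    """Three-valued `any`: True beats unknown, unknown beats False."""
    values = list(values)
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None  # no True seen, but an unknown value could still be True
    return False


def kleene_all(values):
    """Three-valued `all`: False beats unknown, unknown beats True."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None  # no False seen, but an unknown value could still be False
    return True
```

Under these assumed rules, `kleene_any([False, None])` is unknown (`None`), while `kleene_any([True, None])` is `True` because one known `True` settles the answer regardless of the unknown.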
🐞 Bug fixes
- fix incorrect state in projection pushdown with joins (#9987)
- don't pass predicates referring to renamed literal… (#9965)
- fix regression in regex expansion (#9952)
- potential SO in csv infer schema (#9950)
- raise on unsupported transpose and object types (#9946)
- Fix as-of join when `by` groups are interleaved (#9938)
- Handle `DataFrame.extend` extending by itself (#9897)
- don't SO on align_frames (#9911)
- respect original series dtype when constructing `LitIter` (#9886)
- Handle `DataFrame.vstack` stacking itself (#9895)
- sum aggregation empty set is 0, not null (#9894)
- preserve expression aliases when parsing SQL with `pl.sql_expr` (#9875)
- fmt unknown dtype (#9872)
🛠️ Other improvements
- Update autolabeler again (#9984)
- use param_name more in udfs for greater defensiveness (#9969)
- fix or/and docstrings to say bitwise, not logical (#9964)
- minor fix for `apply` docstring example text (#9953)
- add note that `collect_all` returns result frames in the same order as input (#9951)
- Improve docstrings for renaming operations (#9942)
- Move `sink_*` methods to IO chapter (#9939)
- Add 'nearest' in Expr.interpolation docstring with an example (#9935)
- fix hyperlinks to pandas (#9937)
- Address ignored Ruff doc rules (#9919)
- improve `weekday`, `day`, `ordinal_day` examples (#9926)
- deprecate `bins` argument and rename to `breaks` in `Series.cut` (#9913)
- Use Pathlib everywhere (#9914)
- Add various unit tests (#9903)
- add big warnings about using apply (#9906)
- Update autolabeler (#9885)
- Workaround for PyCharm deprecation warning (#9907)
- Mention func_horizontal on deprecated func docstrings (#9863)
- note ordering guarantee for groupby (#9879)
- add logo `link` entry to sphinx conf and factor-out website root paths (#9864)
Thank you to all our contributors for making this release possible!
@0xbe7a, @JulianCologne, @MarcoGorelli, @OneRaynyDay, @SeanTroyUWO, @StefanBRas, @alexander-beedie, @c-peters, @fsimkovic, @ion-elgreco, @magarick, @mcrumiller, @messense, @ritchie46, @sorhawell, @stinodego, @thomasaarholt and @zundertj
Rust Polars 0.31.1
🚀 Performance improvements
- Rolling min/max for partially sorted data (#9819)
- use hash set in drop_many (#9807)
- Faster is_sorted when no flag set (#9777)
- optimize n_unique for integers (#9568)
- remove sort columns on multiple-key OOC sort (#9545)
- don't needlessly trigger bitcount (#9561)
- don't initialize memory before row-encoding (#9435)
- reduce page faults in q1 `~-30%` (#9423)
- reduce rayon/idle time in streaming (#9416)
- use row format in streaming join `~15%` (#9379)
- row encode buffer reuse (#9371)
- bytes row format for streaming groupby/unique keys `>3.5x` (#9346)
- push slices down map functions (#9350)
- increase streaming groupby spill size from 256 to 10_000 (#9312)
- perf(rust, python) Improve rolling min and max for nonulls (#9277)
- slightly improve n_unique performance (#9286)
- speed up write_csv for time-zone-aware columns (#9093)
- parallelize rolling_window group materialization (#9095)
✨ Enhancements
- pass through unknown schema in unnest (#9896)
- access `OptState` in `LazyFrame` to unit-test optimization toggle methods. (#9883)
- respect and allow more options in eager json parsing (#9882)
- allow set_sorted in streaming (#9876)
- Expr.cat.get_categories expression (#9869)
- add `LENGTH` and `OCTET_LENGTH` string functions for SQL (#9860)
- `polars_warn!` macro (#9868)
- Add Run-length Encoding functions (#9826)
- add `include_key` parameter to `partition_by` (#9750)
- add `LEFT` string function for SQL (#9836)
- add `REGEXP_LIKE` function for SQL (both two and three parameter version) (#9838)
- add `maintain_order` argument to `sort`/`top_k`/`bottom_k` (#9672)
- add drop_many_amortized (#9814)
- Dedicated horizontal aggregation functions (#9752)
- implement with_row_count as private function (#9810)
- add support for SQL `SUBSTR` function (#9803)
- add SQL support for binary data and expand recognised SQL dtype strings (#9802)
- reworked comfy-table layout constraints, improving table wrapping/repr (#9744)
- allow qcut in window expressions (#9745)
- Improve cut and allow use in expressions (#9580)
- clearer message when stringcache-related errors occur (#9715)
- improve expression formatting (#9704)
- set string cache in window functions (#9705)
- raise on both sides of datetime/str comparison (#9692)
- support deserializing struct json into df (#9688)
- add tree formatter for expressions (#9684)
- add `.list.any()` and `.list.all()` (#9573)
- extend dtype/selector matching for `Datetime` with a "*" wildcard for timezones (#9641)
- add polars::VERSION (#9660)
- add symmetric difference to list set operations (#9655)
- add dt.base_utc_offset (#9636)
- add dt.dst_offset feature (#9629)
- allow to specify index order in `to_numpy` (#9592)
- accept expressions in `repeat` (#9614)
- set operations for list (#9599)
- add drop_first parameter for to_dummies (issue #8246) (#9143)
- raise if window size in rolling functions isn't strictly positive (#9465)
- add infer schema len to json_extract (#9478)
- Adds (Most) Remaining Trig Functions to `SQLContext` (#9453)
- update error handling msg for sql functions (#9474)
- add str.titlecase (#9457)
- raise if period is negative in groupby_rolling (#9445)
- add SQL `round` support (#9330)
- dont error for time-zone-aware parsing if time zone is UTC (#9414)
- support all numeric dtypes in serde (#9393)
- ensure part of the plan is streaming if aggregati… (#9387)
- add relaxed concatenation (#9382)
- add sql DROP TABLE (#9355)
- support ternary expressions in streaming (#9343)
- add decoding support for row format (#9339)
- add SQL support for null-aware equality checks (#9332)
- add SQL support for regular expression operators (`~`, `!~`, `~*`, and `!~*`) (#9327)
- support `//` integer floordiv operator in the SQL engine (#9324)
- serde for 'to_physical' expr (#9294)
- add join cardinality validation (#9278)
- keep sorted flag after Expr::truncate (#9275)
- add "sql_expr" function (#9248)
- rewrite correlation functions to expression architecture (#9258)
- keep sorted flag on `offset_by` (#9253)
- add intersection primitive for selector API (#9240)
- building blocks for expression expansion sets (#9231)
- Add ddof option to rolling_var and rolling_std (#8957)
- immediately flatten nested unions (#9220)
- support float expression on integers (#9210)
- add binary to list<u8> cast (#9161)
- add arr.unique expression (#9159)
- implement explode for DataType::Array (#9157)
- `Decimal` type: `sum`, `min`, `max` aggregations in `select` and `agg` context. (#9135)
- Decimal arithmetic (#9123)
- support decimals as cast types in csv parser (#9121)
- Improve error handling for `repeat` (#9117)
- conversion from `Utf8` to `Decimal`. (#9090)
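To make the "join cardinality validation" entry above more concrete, here is a plain-Python sketch of what such a check might look like: before joining, verify that the key column is unique on whichever side is required to be the "one" side. The function name `check_join_cardinality` and the mode strings `"m:m"`, `"m:1"`, `"1:m"`, `"1:1"` are assumptions made for this example; the release note does not describe the actual interface.

```python
def check_join_cardinality(left_keys, right_keys, validate="m:m"):
    """Raise ValueError if the join keys violate the requested cardinality.

    Hypothetical modes: "m:m" (no check), "m:1" (right keys unique),
    "1:m" (left keys unique), "1:1" (both sides unique).
    """
    if validate not in {"m:m", "m:1", "1:m", "1:1"}:
        raise ValueError(f"unknown validation mode: {validate!r}")

    # A side is unique when no key value repeats.
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)

    if validate in {"1:1", "1:m"} and not left_unique:
        raise ValueError("left join keys are not unique")
    if validate in {"1:1", "m:1"} and not right_unique:
        raise ValueError("right join keys are not unique")


# One customer (left) may have many orders (right): "1:m" passes.
check_join_cardinality([1, 2], [1, 1, 2], validate="1:m")
```

The same inputs would be rejected under `"1:1"`, since the right-hand keys contain a duplicate.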
🐞 Bug fixes
- fix(rust,python) respect original series dtype when constructing `LitIter` (#9886)
- sum aggregation empty set is 0, not null (#9894)
- Allow None as exponent (#9880)
- preserve expression aliases when parsing SQL with `pl.sql_expr` (#9875)
- fmt unknown dtype (#9872)
- fix row-encode of 32 byte payloads (#9843)
- shrink_type on all-null columns (#9811)
- don't go into streaming engine when groupby by list (#9834)
- fix regex + exclude (#9827)
- potential integer overflow in drop_many_amortized (#9829)
- add `maintain_order` argument to `sort`/`top_k`/`bottom_k` (#9672)
- fix array concat and Series::fill_null (#9825)
- dont preserve sortedness in offset_by for tz-aware non-constant durations (#9818)
- Remove stray `arr.eval` references (#9821)
- fix row-encode of null data (#9813)
- allow +00:00 when loading from arrow (#9747)
- fix row-count schema (#9797)
- fix supertype detection (#9787)
- merge rev-maps when building list arrays of categoricals. (#9742)
- Loosen restrictions on cut expressions and add docs (#9730)
- Fix list symmetric difference (#9732)
- Fix list intersection (#9735)
- don't clear rev_map when categorical series is cle… (#9720)
- fix(rust, python) improve glob pattern testing (#9721)
- don't run hstack checks when using cached names (#9709)
- fix result dtype in date_range(..., eager=True) if duration contains "1s1d" (#9670)
- increment seed between samples (#9694)
- fix cse_plan invalid projection removal (#9700)
- fix ne_missing for booleans vs lit (#9693)
- raise if to_datetime would have parsed input incorrectly (#9675)
- respect time_zone in lazy date_range (#8591)
- redo weighted rolling var (#9609)
- Correct weighted rolling quantile definition (#9608)
- clear hashes buffer in generic streaming joins (#9612)
- stable list namespace output when all elements are … (#9610)
- validate time zone in cast and from_arrow operations (#9598)
- make json feature depend on "dtype-struct" feature (#9589)
- fix join suffix collision (#9579)
- fix sum consistency (#9576)
- fix take of array dtype (#9575)
- fix predicate pushdown case before sort (#9574)
- fix lazy schema of temporal_range functions when no alias is provided (#9543)
- change the path parameter from to (#9531)
- fix join validation when swapped (#9534)
- fix race condition in out-of-core sort (#9521)
- unset sortedness for local date and local datetime (#9515)
- maintain sortedness flags on append/extend (#9496)
- fix serde for small integer dtypes (#9495)
- raise if window size in rolling functions isn't strictly positive (#9465)
- groupby rolling with negative offset (#9428)
- date_range with unit microseconds was producing incorrect results (#9413)
- read_csv was parsing dates incorrectly when the dtype was overridden (#9420)
- Compute Spearman rank correlations using average ra… (#9415)
- Fix rolling min/max when window is empty (#9406)
- fix compilation of other rustc versions (#9392)
- list zip with (#9367)
- parquet + categorical (#9363)
- respect startby in groupby_dynamic when every is greater than 1d (#9362)
- raise groupby apply on empty frame (#9360)
- raise more informative error on string arguments (#9352)
- correct assertion (#9320)
- fix rolling weighted mean (#9292)
- raise on invalid sort_by (#9262)
- correct ne/e_missing schema (#9257)
- fix cached reproject offsets (#9254)
- delay opening files in streaming engine (#9251)
- ensure agg(F(lit)) == lit (#9222)
- don't SO on concat(expressions) (#9214)
- clip window_size to length in rolling_apply (#9209)
- rolling_apply window_size == len (#9181)
- respect time zone in strptime/to_datetime when exact=False (#9171)
- make null chunking behavior equal to other dtypes (#9176)
- return single numpy array in Array dtype -> numpy (#9164)
- fix regression in boolean nulls comparison (#9142)
- fix struct null_count if fields are null arrays (#9151)
- categorical construction from null values (#9145)
- let `apply` caller determine if length needs to be checked. (#9140)
- struct `is_in` should upcast numeric types (#9110)
- json_extract on empty series (#9126)
- bubble up dtype when converting from arrow (#9120)
- rolling_groupy was returning incorrect results when offset was positive (#9082)
🛠️ Other improvements
- Rolling quantile and median use DynArgs (#9867)
- Clean up workspace definition (#9861)
- Fix all clippy warnings in the test suite (#9839)
- Refactor failing test (#9823)
- Remove stray `arr.eval` references (#9821)
- fix cut features (#9808)
- cluster file scans in one node (#9799)
- Remove old cut/qcut (#9763)
- Small updates to issue templates (#9789)
- unswap from_tz and to_tz in replace_timezone (#9768)
- More cleanup around `arange` (#9769)
- More cleanup for `arange` (#9681)
- Fix small typo (#9714)
- refactor `arange` and add `int_range`/`int_ranges` (#9666)
- clean up inconsistencies in duration string language (#9551)
- ensure date-range integration test runs in CI (#9554)
- remove some redundancies in sort (#9541)
- Fix some doc examples (#9405)
- Remove outda...
Python Polars 0.18.7
🚀 Performance improvements
- speed up python object to AnyValue construction (#9840)
- use hash set in drop_many (#9807)
- speed up `in series` 10x (#9794)
- Faster is_sorted when no flag set (#9777)
✨ Enhancements
- Add Run-length Encoding functions (#9826)
- add `include_key` parameter to `partition_by` (#9750)
- add `LEFT` string function for SQL (#9836)
- add `REGEXP_LIKE` function for SQL (both two and three parameter version) (#9838)
- add `maintain_order` argument to `sort`/`top_k`/`bottom_k` (#9672)
- Dedicated horizontal aggregation functions (#9752)
- support numpy datetime64 units (from 'ns' to 'D') in polars.from_numpy (#9783)
- implement with_row_count as private function (#9810)
- add support for SQL `SUBSTR` function (#9803)
- add SQL support for binary data and expand recognised SQL dtype strings (#9802)
- add new `duration` selector and improve selector typing (#9772)
- reworked comfy-table layout constraints, improving table wrapping/repr (#9744)
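For readers unfamiliar with the run-length encoding mentioned in the list above, the following plain-Python sketch shows the general technique: consecutive equal values are collapsed into (value, run length) pairs, and the original sequence can be rebuilt from those pairs. The helper names `rle_encode` and `rle_decode` are hypothetical and chosen for this example; the sketch says nothing about the signatures or output format of the functions added in #9826.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(key, sum(1 for _ in run)) for key, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in pairs for _ in range(length)]


data = [1, 1, 2, 3, 3, 3]
encoded = rle_encode(data)      # [(1, 2), (2, 1), (3, 3)]
decoded = rle_decode(encoded)   # round-trips to the original list
```

Note that only adjacent duplicates are merged: a value that reappears later starts a new run.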
🐞 Bug fixes
- fix row-encode of 32 byte payloads (#9843)
- shrink_type on all-null columns (#9811)
- don't go into streaming engine when groupby by list (#9834)
- fix regex + exclude (#9827)
- add `maintain_order` argument to `sort`/`top_k`/`bottom_k` (#9672)
- fix array concat and Series::fill_null (#9825)
- dont preserve sortedness in offset_by for tz-aware non-constant durations (#9818)
- Remove stray `arr.eval` references (#9821)
- fix row-encode of null data (#9813)
- allow +00:00 when loading from arrow (#9747)
- improve/fix `write_database` handling of db schema and quoted table names (#9788)
- fix row-count schema (#9797)
- fix supertype detection (#9787)
- fix import error when writing parquet with pyarrow (#9760)
🛠️ Other improvements
- Refactor failing test (#9823)
- Remove stray `arr.eval` references (#9821)
- Remove old cut/qcut (#9763)
- improve note about the behaviour when converting from ns-precision temporal values to python-native types (#9798)
- Small updates to issue templates (#9789)
- More cleanup around `arange` (#9769)
- add missing `last` entry (#9782)
- Add `rows_by_key` docs (#9766)
Thank you to all our contributors for making this release possible!
@CloseChoice, @MarcoGorelli, @alexander-beedie, @avimallu, @jonashaag, @magarick, @mcrumiller, @ritchie46 and @stinodego
Python Polars 0.18.6
✨ Enhancements
- allow qcut in window expressions (#9745)
🐞 Bug fixes
- merge rev-maps when building list arrays of categoricals. (#9742)
- Loosen restrictions on cut expressions and add docs (#9730)
- Fix list symmetric difference (#9732)
- Fix list intersection (#9735)
Thank you to all our contributors for making this release possible!
@magarick and @ritchie46
Python Polars 0.18.5
🏆 Highlights
- drop Python 3.7 support (#9679)
🚀 Performance improvements
- optimize n_unique for integers (#9568)
- remove sort columns on multiple-key OOC sort (#9545)
- don't needlessly trigger bitcount (#9561)
- optimize `_datetime_to_pl_timestamp` (#9533)
✨ Enhancements
- Improve cut and allow use in expressions (#9580)
- clearer message when stringcache-related errors occur (#9715)
- improve expression formatting (#9704)
- set string cache in window functions (#9705)
- raise on both sides of datetime/str comparison (#9692)
- support deserializing struct json into df (#9688)
- add tree formatter for expressions (#9684)
- streamline `adbc` connectivity, adding snowflake support (#9600)
- improve `selector` utility functions with better docstrings/examples (#9683)
- add `.list.any()` and `.list.all()` (#9573)
- extend dtype/selector matching for `Datetime` with a "*" wildcard for timezones (#9641)
- add symmetric difference to list set operations (#9655)
- Pass through stdin/stderr buffer in to_csv (#9624)
- add dt.base_utc_offset (#9636)
- add dt.dst_offset feature (#9629)
- allow to specify index order in `to_numpy` (#9592)
- accept expressions in `repeat` (#9614)
- set operations for list (#9599)
- make LazyFrame.map pickle (#9597)
- add a new `rows_by_key` method, returning a keyed-dictionary of row data (#9567)
- implement apply object -> struct (#9578)
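The list set operations mentioned above (union, difference, intersection, symmetric difference) can be sketched in plain Python as follows. This version keeps first-seen order and drops duplicates, which is a choice made for the example; the release notes do not state how Polars orders or deduplicates the results, and the helper names are hypothetical.

```python
def _dedupe(items):
    """Remove duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def list_union(a, b):
    """Elements that appear in either list."""
    return _dedupe([*a, *b])


def list_intersection(a, b):
    """Elements of `a` that also appear in `b`."""
    b_set = set(b)
    return [x for x in _dedupe(a) if x in b_set]


def list_difference(a, b):
    """Elements of `a` that do not appear in `b`."""
    b_set = set(b)
    return [x for x in _dedupe(a) if x not in b_set]


def list_symmetric_difference(a, b):
    """Elements that appear in exactly one of the two lists."""
    return list_difference(a, b) + list_difference(b, a)


a, b = [1, 2, 3], [2, 3, 4]
union = list_union(a, b)                        # [1, 2, 3, 4]
intersection = list_intersection(a, b)          # [2, 3]
difference = list_difference(a, b)              # [1]
symmetric = list_symmetric_difference(a, b)     # [1, 4]
```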
🐞 Bug fixes
- don't clear rev_map when categorical series is cle… (#9720)
- fix(rust, python) improve glob pattern testing (#9721)
- don't run hstack checks when using cached names (#9709)
- fix result dtype in date_range(..., eager=True) if duration contains "1s1d" (#9670)
- increment seed between samples (#9694)
- fix cse_plan invalid projection removal (#9700)
- fix ne_missing for booleans vs lit (#9693)
- raise if to_datetime would have parsed input incorrectly (#9675)
- respect time_zone in lazy date_range (#8591)
- Align dependency versions (#9661)
- redo weighted rolling var (#9609)
- Correct weighted rolling quantile definition (#9608)
- clear hashes buffer in generic streaming joins (#9612)
- stable list namespace output when all elements are … (#9610)
- address schema edge-case with scalar-expanded data that resolves to an empty frame (#9593)
- handle dictionary init with unsized iterators that also hits the scalar-expansion fast path (#9594)
- validate time zone in cast and from_arrow operations (#9598)
- ensure `from_dicts` drops columns explicitly omitted from schema (#9581)
- fix join suffix collision (#9579)
- fix sum consistency (#9576)
- fix take of array dtype (#9575)
- fix predicate pushdown case before sort (#9574)
- fix lazy schema of temporal_range functions when no alias is provided (#9543)
- fix join validation when swapped (#9534)
🛠️ Other improvements
- More cleanup for `arange` (#9681)
- Fix some more type hints (#9716)
- Added trivial examples for the aggregation of columns in groupby (#9708)
- Fix some type hints (#9695)
- additional ADBC examples and docstring information for `read_database` (inc snowflake) (#9686)
- drop Python 3.7 support (#9679)
- improve `selector` utility functions with better docstrings/examples (#9683)
- refactor `arange` and add `int_range`/`int_ranges` (#9666)
- Clarify Dataframe.corr operates on columns (#9678)
- remove false "eager=True" from date_range tests (#9663)
- Add examples to .merge_sorted (#9664)
- bump maturin from 1.0.1 to 1.1.0 in /py-polars (#9646)
- remove deprecation warning of already-enforced valid timezones change (#9639)
- fix failing ci test (#9638)
- fix inconsistency in `.list.difference()` example (#9615)
- Clean up doctests for rolling (#9626)
- fix faulty test of `to_numpy` (#9619)
- examples for `.list.union()`, `.list.difference()`, `.list.intersection()` (#9602)
- fix see also broken links (#9607)
- clarify sortedness condition of groupby_dynamic and groupby_rolling (#9606)
- clean up inconsistencies in duration string language (#9551)
- Adding examples to binary functions (#9553)
- Minor cleanup of `arange` (#9544)
- Remove outdated badges from README (#9532)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @borchero, @datapythonista, @dependabot, @dependabot[bot], @eitsupi, @guanqun, @jeroenjanssens, @jorisSchaller, @kljensen, @magarick, @mcrumiller, @messense, @mishpat, @moritzwilksch, @ritchie46, @stinodego, @ttencate, @universalmind303 and @zundertj