Releases: mage-ai/mage-ai
0.9.63 | Halo 👾
What's Changed
🎉 Exciting New Features
🦆 MotherDuck Support
This one's for all the ducklings out there! In addition to supporting DuckDB, Mage now supports MotherDuck destinations!
By specifying a MOTHERDUCK_TOKEN and adding a md: prefix to your DuckDB database, you can read from and write to MotherDuck locations! Check it out and get started here.
by @wangxiaoyou1993 in #4533
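A minimal sketch of the two ingredients described above: the md: prefix on the database name and the MOTHERDUCK_TOKEN environment variable. The helper names `to_motherduck_database` and `motherduck_ready` are hypothetical and are not part of Mage; they only illustrate the naming convention.

```python
import os
from typing import Mapping, Optional

MD_PREFIX = "md:"


def to_motherduck_database(database: str) -> str:
    """Return the database name with the md: prefix, adding it only once."""
    return database if database.startswith(MD_PREFIX) else MD_PREFIX + database


def motherduck_ready(database: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """True when the name carries the md: prefix and a token is available."""
    env = os.environ if env is None else env
    return database.startswith(MD_PREFIX) and bool(env.get("MOTHERDUCK_TOKEN"))


database = to_motherduck_database("analytics")  # "analytics" is a made-up name
# With a valid token in the environment, the DuckDB Python client could then
# open the database with: duckdb.connect(database)
```

Keeping the prefix logic in one place avoids accidentally producing names such as md:md:analytics when a configuration value already carries the prefix.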
🤖 Support Thick mode in OracleDB
We like our OracleDB connections like we like our pancakes, THICK 🥞. By default, Mage's Oracle client runs in "Thin" mode, which connects directly to Oracle Database and does not need Oracle Client libraries. However, some additional functionality is available when those libraries are used.
Now, you can use the "Thick" mode in Mage to connect to OracleDB using the Oracle Client libraries!
Check out our docs to get started or read more about the differences between "Thin" and "Thick" modes here.
by @matrixstone in #4421
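A minimal sketch of switching between the two modes, assuming the underlying driver is python-oracledb (the notes above do not name it). `configure_oracle_mode` and its `mode` argument are hypothetical names; `oracledb.init_oracle_client` is the driver call that loads the Oracle Client libraries and thereby enables Thick mode.

```python
from typing import Optional


def configure_oracle_mode(mode: str = "thin", lib_dir: Optional[str] = None) -> str:
    """Validate the requested mode and enable Thick mode when asked to."""
    mode = mode.lower()
    if mode not in ("thin", "thick"):
        raise ValueError(f"Unknown Oracle mode: {mode!r}")
    if mode == "thick":
        # Imported lazily so Thin-mode users never need the driver's
        # Thick-mode prerequisites (the Oracle Client libraries).
        import oracledb

        # lib_dir points at the Oracle Client libraries; None lets the
        # driver search the system library path.
        oracledb.init_oracle_client(lib_dir=lib_dir)
    return mode


active_mode = configure_oracle_mode("thin")
```

Thick mode has to be initialized before the first connection is created, which is why the sketch treats it as a one-time setup step rather than a per-connection option.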
🗄️ Show unused block files in file browser
This is one of our most requested features and we're excited to announce that it's finally here! 🎉
You can now see which files are not being used in your pipeline! This is a great way to clean up your projects and remove any unnecessary files. Check out the gif below to see it in action!
Head over to our docs to learn more!
by @johnson-mage in #4449
🤐 Import functionality for pipeline zips
Like to share? So do we! You can now import pipelines (via .zip files) in your Mage projects! We're optimistic that this simple improvement will make it easier to share your pipelines or even borrow pipelines from your friends!
by @johnson-mage in #4453
🐛 Bug Fixes
- Prevent add block menu from disappearing by @johnson-mage in #4502
- Fix some minor bugs by @tommydangerous in #4505
- Remove table name helper by @tommydangerous in #4506
- Fix alter table column names cleaning (Postgresql integration exporter) by @arnetkachev in #4493
- Fixed Oauth connection on Salesforce Source by @Luishfs in #4402
- Fix unclickable vertical scrollbar and jumping before panel by @johnson-mage in #4512
- Catch unknown host error by @wangxiaoyou1993 in #4518
- Upgrade snowflake library version and fix datetime column type by @wangxiaoyou1993 in #4524
- Only show files with correct prefix by @dy46 in #4509
- Update submodule sync for ssh auth by @dy46 in #4522
- Update connection url by @dy46 in #4523
- Update how the stale pipeline message modal is displayed by @johnson-mage in #4536
- Fix running dynamic blocks with k8s executor by @wangxiaoyou1993 in #4543
- Widget policy update by @johnson-mage in #4545
- Catch exception of building cache key for block cache by @wangxiaoyou1993 in #4548
- Disable stale pipeline modal by @johnson-mage in #4525
💅 Enhancements & Polish
- MSSQL data integration source - add support for DATETIMEOFFSET type by @hugabora in #4499
- Fix: handle unprivileged user for postgres by @jdbranham in #4357
- Optimize pipeline schedule queries by @dy46 in #4188
- Remove pipeline's updated_at attribute by @johnson-mage in #4521
- Create and view workspaces in different namespaces by @dy46 in #4513
- Select status for pipeline runs that time out by @dy46 in #4519
New Contributors
- @arnetkachev made their first contribution in #4493
- @jdbranham made their first contribution in #4357
- @tanjibpa made their first contribution in #4528
Full Changelog: 0.9.62...0.9.63
0.9.62 | The Beekeeper 🐝
What's Changed
🎉 Exciting New Features
🧵 [Mage IO] Weaviate Integration
🤔 Building AI apps with Mage? Perfect! Now you can use Weaviate as a data source. Weaviate is an open-source, AI-native vector database that helps developers create intuitive and reliable AI applications. With Mage, you can now read from and write to Weaviate databases! Read more in our docs here.
by @matrixstone in #4158
🔍 [Mage IO] Algolia Integration
Like search? Us too! That's why we've added support for Algolia as a data source in Mage. Algolia is a powerful search engine that helps you build fast and accurate search experiences. With Mage, you can now read from and write to Algolia! Read more in our docs here.
by @matrixstone in #4198
💥 Dynamic SQL Blocks
Another big improvement to dynamic blocks this week: SQL Dynamic Blocks! That's right, you can now create dynamic outputs from SQL blocks in Mage. Previously, blocks had to be Python for dynamic outputs, but no more! This is a big step forward in making Mage more flexible and powerful. Give it a shot today. 🎉
by @tommydangerous in #4430
🌊 [Kafka Streaming Sources] Offset & Partitions
For our streaming aficionados, we have a few new Kafka features! You can now specify the offset and partitions for Kafka streaming sources. Offsets can be one of: Beginning, End, Int, & Timestamp. This allows users to set specific positions inside a topic to consume data.
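The four offset kinds can be modeled as a small validated structure. This is only a sketch: the `OffsetConfig` class, its field names, and the rule that Int and Timestamp need a numeric value are assumptions made for illustration, not Mage's actual configuration schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class OffsetType(str, Enum):
    BEGINNING = "Beginning"
    END = "End"
    INT = "Int"
    TIMESTAMP = "Timestamp"


@dataclass
class OffsetConfig:
    type: OffsetType
    value: Optional[int] = None            # position or timestamp (assumed numeric)
    partitions: Optional[List[int]] = None  # None means "all partitions" here

    def __post_init__(self) -> None:
        # Enum lookup raises ValueError for anything outside the four kinds.
        self.type = OffsetType(self.type)
        needs_value = self.type in (OffsetType.INT, OffsetType.TIMESTAMP)
        if needs_value and self.value is None:
            raise ValueError(f"{self.type.value} offsets need a value")
        if not needs_value and self.value is not None:
            raise ValueError(f"{self.type.value} offsets take no value")


replay_all = OffsetConfig("Beginning")
resume_at = OffsetConfig("Int", value=42, partitions=[0, 1])
```

Validating the combination up front means a misconfigured source fails when the pipeline is loaded rather than after the consumer has already started reading.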
⛴️ Configure Kubernetes Affinity & Tolerations
Last, but certainly not least, we've got some nitty-gritty Kubernetes configuration updates! You can now specify affinity and tolerations in your Kubernetes settings. ⚓
Node affinity is a set of rules used by the scheduler to determine where a pod can be placed. The rules are defined using custom labels on nodes and label selectors specified in pods. Node affinity allows a pod to specify an affinity (or anti-affinity) towards a group of nodes it can be placed on.
Tolerations are applied to pods and indicate that the pod can be scheduled on nodes with specific taints.
These configurations should help our Kubernetes power users make the most of Mage! 🧙
by @wangxiaoyou1993 in #4407
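To make the two concepts concrete, here is a sketch of a pod spec fragment plus a simplified version of the toleration-matching rule. The field names follow the Kubernetes pod spec; the label and taint values (`workload`, `dedicated`, `mage`) are made up, and where this fragment goes in Mage's own configuration is not shown here.

```python
from typing import Dict, List

pod_spec_fragment = {
    "affinity": {"nodeAffinity": {"requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [{"matchExpressions": [
            {"key": "workload", "operator": "In", "values": ["etl"]},
        ]}],
    }}},
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "mage", "effect": "NoSchedule"},
    ],
}


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """Simplified matching of one toleration against one node taint."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def schedulable(tolerations: List[Dict[str, str]], taints: List[Dict[str, str]]) -> bool:
    """A pod fits only if every hard taint on the node is tolerated."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


node_taints = [{"key": "dedicated", "value": "mage", "effect": "NoSchedule"}]
fits = schedulable(pod_spec_fragment["tolerations"], node_taints)
```

Note the asymmetry: affinity pulls pods toward nodes with matching labels, while taints push pods away unless a matching toleration is present, so dedicated node pools typically use both.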
🐛 Bug Fixes
- Fix LDAP unpacking by @dy46 in #4399
- Fix reduce output and triple layout saving by @tommydangerous in #4409
- Fix file browser bug and Git push bug by @tommydangerous in #4411
- Fix serializing list and dict when formatting output by @tommydangerous in #4412
- Prevent index out of bounds by @tommydangerous in #4425
- Remove test print statements by @tommydangerous in #4431
- Removing Draft7 validation from Clickhouse destination by @Luishfs in #4424
- Fix global_vars context in pipeline executor by @wangxiaoyou1993 in #4435
- Fix a few global data product bugs by @tommydangerous in #4440
- Fix dynamic blocks OOM round 2 by @tommydangerous in #4445
- Fix incremental sync in chargebee source by @wangxiaoyou1993 in #4450
- Don’t count values if is None by @tommydangerous in #4454
- Fix policy issue updating settings by @tommydangerous in #4456
- Fix keyboard shortcuts when its empty array by @tommydangerous in #4458
- Run submodule sync by @dy46 in #4457
- Fixing Snowflake write_pandas issue by @Luishfs in #4395
- Fix writing to Snowflake with mixed int and str types by @wangxiaoyou1993 in #4460
- Fix some bugs and improve the edit page by @tommydangerous in #4462
- Fix error logging in pipeline executor by @tommydangerous in #4468
- When clicking show file versions in arcane library, show right panel by @tommydangerous in #4472
- Catch BigQuery if it fails to fix table names by @tommydangerous in #4479
- Fix pipeline detail prop passed for fetching files by @johnson-mage in #4485
- Always show edit pipeline button by @dy46 in #4484
- Revert change to PG IO by @tommydangerous in #4486
- Fix io redshift by @wangxiaoyou1993 in #4487
- Fix multi project flag by @wangxiaoyou1993 in #4490
- Fix Bigquery clean column name by @wangxiaoyou1993 in #4500
- Convert datetime type for s3 data integration destination by @wangxiaoyou1993 in #4501
- Fix dynamic child block outputs by @tommydangerous in #4422
- Fix database missing and serializing QueryJob by @tommydangerous in #4428
- Fix incorrect spelling by @ckfear in #4438
- Fix kafka type and doc by @wangxiaoyou1993 in #4466
💅 Enhancements & Polish
- Workspace improvements by @dy46 in #4469
- Scheduler improvements by @wangxiaoyou1993 in #4467
- Prevent unnecessary initial pipeline run by @johnson-mage in #4291
- Speed up monitor stats and reduce calls on overview by @tommydangerous in #4408
- Bump up dependency versions to resolve vulnerabilities by @wangxiaoyou1993 in #4433
- Limit dynamic block output in notebook by @tommydangerous in #4436
- Improve bigquery name parsing to help fill in full name by @tommydangerous in #4447
- Add error logging by @tommydangerous in #4448
- Fix app slowness due to project platform check by @wangxiaoyou1993 in #4483
- Add PVC retention policy by @dy46 in #4491
- E2e test for /pipelines by @edmondwinston in #4306
- Add query decorator for data integration blocks by @tommydangerous in #4465
New Contributors
Full Changelog: 0.9.60...0.9.62
0.9.60 | Yusuke Urameshi
What's Changed
🎉 Exciting New Features
🌊 [Streaming] Google Cloud Storage Destination
🎉 Google Cloud users rejoice! Streaming pipelines just got even better: Mage now supports Google Cloud Storage as a streaming destination! Check out the docs here and get started today!
by @wangxiaoyou1993 in #4340
👷♂️ Overwrite SQL types
For anyone with a data warehouse, listen up! (We assume that's most of you 😅)
You can now specify custom column types when exporting to SQL destinations. This is useful when you want to export a dataframe with a column that has a type that is not supported by the default mapping. You can read more about overwriting types here.
Here's an example of an exporter that overwrites column types for a PostgreSQL destination:
from os import path

from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.postgres import Postgres
from mage_ai.settings.repo import get_repo_path
from pandas import DataFrame

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data_to_postgres(df: DataFrame, **kwargs) -> None:
    schema_name = 'your_schema_name'  # Specify the name of the schema to export data to
    table_name = 'your_table_name'  # Specify the name of the table to export data to
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'
    # Map column names to the SQL types that should replace the default mapping
    overwrite_types = {'column_name': 'VARCHAR(255)'}

    with Postgres.with_config(ConfigFileLoader(config_path, config_profile)) as loader:
        loader.export(
            df,
            schema_name,
            table_name,
            index=False,  # Specifies whether to include index in exported table
            if_exists='replace',  # Specify resolution policy if table name already exists
            allow_reserved_words=True,
            unique_conflict_method='UPDATE',
            unique_constraints=['col'],
            overwrite_types=overwrite_types,
        )
This feature is currently supported for PostgreSQL, Redshift, Trino, MSSQL, MySQL, Clickhouse, and BigQuery.
👨💻 [Command Center] Version Control & Files
The Mage Command Center can now be used for version control commands! You can both configure git and run your favorite version control commands directly from the Mage UI. Additionally, you can manage your files via nav and Mage's new file explorer (🧙 Arcane Library)!
As a reminder, to enable the command center, go to Settings (the wizard icon) and click the Command Center toggle. It can be accessed via ⌘ + . (Mac) or Win + . (Windows).
by @tommydangerous in #4273
👾 [Command Center] Terminal App
Mage LEGEND @tommydangerous is back at it again— he's implemented a full terminal app into the command center. For those of you with this beta feature enabled, you'll now have a terminal at your fingertips at all times.
As a reminder, to enable the command center, go to Settings (the wizard icon) and click the Command Center toggle. It can be accessed via ⌘ + . (Mac) or Win + . (Windows).
by @tommydangerous in #4365
JSON Logging
A huge shoutout to @dy46 for adding JSON logging to Mage! This will make it easier to parse logs and integrate with other logging tools. Just specify SERVER_LOGGING_FORMAT=json to switch the server log output to JSON.
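For readers unfamiliar with structured logging, the sketch below shows the general idea using only Python's standard library. It is not Mage's implementation: the `JsonFormatter` class, the `build_handler` helper, and the field names in the payload are all assumptions, and Mage's real JSON output may contain different keys.

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_handler(logging_format: str) -> logging.Handler:
    """Attach the JSON formatter only when the format is set to json."""
    handler = logging.StreamHandler()
    if logging_format == "json":
        handler.setFormatter(JsonFormatter())
    return handler


# Read the setting the same way the environment variable above is described.
handler = build_handler(os.getenv("SERVER_LOGGING_FORMAT", ""))
```

One JSON object per line is the shape most log shippers expect, since each line can be parsed independently without multi-line stitching.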
🐛 Bug Fixes
- Fix caching issues with block cache and shared pipelines by @johnson-mage in #4338
- Fix SQL blocks by @tommydangerous in #4341
- Prevent error when searching for blocks by @johnson-mage in #4343
- Fix callbacks input data from dynamic child blocks by @tommydangerous in #4342
- Enable command center when user auth not required by @tommydangerous in #4346
- Fix bug when searching for block files by @johnson-mage in #4347
- Fix dynamic child block getting input data by @tommydangerous in #4349
- Fix cron expression conversion when using local midnight time by @johnson-mage in #4359
- Fix block search by @tommydangerous in #4360
- Fix command center hiding by @tommydangerous in #4361
- Fix GDP and add terminal colors by @tommydangerous in #4363
- Minor tweaks to existing apps by @tommydangerous in #4367
- Added custom_fields to freshdesk source by @Luishfs in #4354
- Fix creating widget by @wangxiaoyou1993 in #4375
- Lowercase auth_type enum by @dy46 in #4376
- Update cloud run workspace by @dy46 in #4377
- Fix terminal by @dy46 in #4389
- Fix pipeline run variable overwrite for sql block by @wangxiaoyou1993 in #4390
- Update dynamic block output and input data logic by @tommydangerous in #4388
- Fix dynamic block conditionals in runs and in notebook by @tommydangerous in #4397
- Remove terminal colors by @tommydangerous in #4398
- Fix reduce output block tests by @tommydangerous in #4400
- Removing modified google-ads lib by @Luishfs in #4330
💅 Enhancements & Polish
- Show multiple outputs and fix downstream dynamic child block inputs and outputs by @tommydangerous in #4382
- Improve command center shortcut wording and example by @johnson-mage in #4348
- Add mapping for active directory roles by @dy46 in #4345
- Make block type error more descriptive by @johnson-mage in #4353
- Upgrade app layout behavior by @tommydangerous in #4362
- Support overwriting column types in BigQuery by @wangxiaoyou1993 in #4374
- Add spark jar files to emr_config if using EMR cluster by @johnson-mage in #4379
Full Changelog: 0.9.59...0.9.60
0.9.59 | 🐲 🐉 Year of the Dragon
What's Changed
🎉 Exciting New Features
Note: many new features this week are in beta. You can enable them by navigating to your Mage settings and toggling the beta features there.
🎮 Multi-project Platform [BETA]
We've reworked our support for multiple projects with a new multi-project platform! @tommydangerous is back at it again with this huge feature release, enabling nested projects, custom code paths, cross-project triggering, a split pipeline scheduler, and much more!
If you'd like to try out the multi-project platform, you can check out this repo for a sample structure. Head over to your Mage settings to enable it.
- Support nested projects and custom code paths by @tommydangerous in #4161
- Trigger and run pipelines across projects by @tommydangerous in #4186
- Split pipeline scheduler and schedule models for project platform by @tommydangerous in #4233
- Configure root project preferences and settings by @tommydangerous in #4234
- Fix pipeline schedule creation and repo name by @tommydangerous in #4247
- Don't try to use file source if not exist by @tommydangerous in #4196
- Fix the way we store pipelines by type in the cache by @tommydangerous in #4278
🚀 Command Center [BETA]
Another new and exciting feature this week: the Mage Command Center. The command center is a floating search bar that can open files and pages, perform actions within Mage, interact with the page, and much more!
Enable the Command Center in settings and give it a spin today!
- Command center by @tommydangerous in #4249
- Add command center models by @tommydangerous in #4254
🪣 Bitbucket Version Control
Shout out to @dy46 for continuing to crush the version control integrations!
You can now use Bitbucket as a version control provider! This is a great option for teams that use Bitbucket for their code repositories. To get started, navigate to the Mage Version Control app and select Bitbucket as your provider. You'll be prompted to authenticate with Bitbucket, and then you'll be able to select your repositories. Read more here.
◔ Qdrant integration
Mage now supports Qdrant, an open-source vector search engine. Qdrant is a great tool for similarity search, and it can be used for a variety of use cases, including product recommendations, image search, and more. With this update, you can load/export data from Qdrant sources in your batch pipelines! Read more here.
by @matrixstone in #4081
🧱 dbt DX v2 + dbt Upgrade [BETA]
This release contains a huge dbt overhaul 🤯
Alongside a much-awaited upgrade to dbt 1.7, the dbt developer experience has been completely rebuilt. @tommydangerous has been hard at work crafting a dbt experience that is more intuitive, powerful, and flexible!
- Upgrade dbt to 1.7 by @tommydangerous in #4244
- dbt v2 browser UI by @tommydangerous in #4200
- Add block browser modal to notebook by @tommydangerous in #4246
- Add dbt cache by @tommydangerous in #4193
- Add fields to dbt code block 2.0 for manual entry by @tommydangerous in #4331
- Use absolute paths in dbt block by @tommydangerous in #4307
- Add custom code block tags by @tommydangerous in #4250
- Collapse or expand folders by @tommydangerous in #4203
🐛 Bug Fixes
SQL blocks
- Remove double quotes for postgres by @dy46 in #4170
- Escape BigQuery project name in SQL block by @wangxiaoyou1993 in #4294
- Pass in query_vars as a dict by @dy46 in #4280
Data integration
- Fix syncing MySQL TIME type by @wangxiaoyou1993 in #4275
- Bump Google Ads version from 14 to 15 by @Luishfs in #4289
- Fix incremental sync bug with missing arg by @tommydangerous in #4315
- Fix executing data integration block with ecs executor by @wangxiaoyou1993 in #4322
- Wrap MSSQL table name with double quotes in data integration pipeline by @wangxiaoyou1993 in #4290
Trigger and scheduling
- Use UTC date for trigger start date by @johnson-mage in #4283
- Display correct default start datetime when editing trigger/backfill by @johnson-mage in #4292
- Try preventing creating duplicate pipeline runs in scheduler by @wangxiaoyou1993 in #4311
- Wrap block run initialization logic with lock by @wangxiaoyou1993 in #4296
dbt
- Fix dbt profiles interpolation by @tommydangerous in #4225
- Lazy import for dbt files by @tommydangerous in #4251
- Re-work dbt project path for yaml files by @tommydangerous in #4207
- Error handling for "project not found" in dbt by @tommydangerous in #4334
- Adjust error behavior when adding dbt files by @tommydangerous in #4191
File browser
- Fix files page not opening files by @tommydangerous in #4222
- File browser bug bash by @tommydangerous in #4237
- Fix bug when deleting block from file browser by @tommydangerous in #4243
- Fix requestIdleCallback not supported on Safari by @tommydangerous in #4284
- Fix file browser not refreshing by @tommydangerous in #4223
Dynamic blocks
- Fix dynamic block + dynamic child blocks spawning other blocks by @tommydangerous in #4295
- Fix reduce output bug by @tommydangerous in #4326
Git
- Add actions to GitBranchPolicy by @dy46 in #4213
- Fix Git bugs by @tommydangerous in #4264
- Fix Git submodule sync by @dy46 in #4316
- Fix missing Git module by @tommydangerous in #4206
Other
- Fix silent errors from global hooks by @tommydangerous in #4173
- Fix block sorting bug by @johnson-mage in #4175
- Avoid saving error details in block run DB by @tommydangerous in #4179
- Reset page after applying pipeline filters by @johnson-mage in #4183
- Restrict opentelemetry package versions by @dy46 in #4208
- Fix dataframe validation by @tommydangerous in #4255
- Fix add new button tooltips by @tommydangerous in #4263
- Save statistics.json in correct execution partition folder by @wangxiaoyou1993 in #4271
- Fix dashboard resizing by @tommydangerous in #4282
- Only set the schema in the DB when the server is started by @dy46 in #4293
- Fix several bugs on Pipelines dashboard by @johnson-mage in #4300
- Improve UI in several areas by @tommydangerous in #4309
- Fix dragging and dropping file in browser by @tommydangerous in #4312
- Fix interpolating mage secret in project metadata.yaml by @wangxiaoyou1993 in https://github.com/mag...
0.9.50 | Wonka 🎩
What's Changed
🎉 Exciting New Features
🌊 [Data Integration] Dremio Source
🥳 Dremio users, rejoice! Mage now supports Dremio as a data integration source, meaning you can now build data integration pipelines pulling from data lakes and more!
🏃♂️ Manually run pipeline once in same trigger
This update, courtesy of our frontend engineer, Johnson, is a big one!
First, he added a new button to the Trigger Detail page for running pipelines once in the same trigger! 👀
Next, some quality of life improvements:
- For @once triggers, the trigger does not need to be in active status before running the pipeline once using the Run@once button in the Trigger Detail page. Many users have told us this is confusing... No more!
- The trigger will automatically be updated to active status. However, if the trigger is NOT an @once trigger (e.g. a recurring interval or API trigger), the trigger must be set to active status before manually running the pipeline once from the Trigger Detail page.
Finally, Johnson renamed the "start/pause" trigger control on the Trigger Detail page to "enable/disable" to make its effect clearer. Again, we've heard the old wording was a bit misleading, so we did something about it! 🗣️
by @johnson-mage in #4133
[Streaming] ActiveMQ Sink
Shruti continues her epic tear of contributing magical ✨ pipelines. In this PR, she's added an ActiveMQ streaming sink to Mage. Apache ActiveMQ is an open source message broker written in Java... and now you can write data there via Mage! 💫
by @shrutimantri in #4141
🐛 Bug Fixes
- Fix stale pipeline message not appearing by @johnson-mage in #4138
- Fixed Salesforce Source not running sync by @Luishfs in #4048
- Fixed Salesforce Destination Upsert action by @Luishfs in #4130
- Fix external cloud storage logs for k8s blocks by @dy46 in #4128
- Update DownloadPolicy to allow downloading pipeline zip files by @johnson-mage in #4148
- Fix pagination by @tommydangerous in #4152
- Add tests and improve logs table by @tommydangerous in #4156
- Add tags to pipeline cache by @tommydangerous in #4162
- Disable error and UI limiting dynamic blocks by @tommydangerous in #4164
💅 Enhancements & Polish
- Trigger global hooks on pipeline execution by @tommydangerous in #4147
- Close modal after saving by @dy46 in #4131
- Speed up pipelines list API operation by @tommydangerous in #4132
- Add resource parent in the input data for hooks by @tommydangerous in #4144
- Optimize pipeline schedules LIST api by @dy46 in #4058
- Fix block runs page when there are a large number of block runs by @dy46 in #4109
- Dynamic blocks 2.0 by @tommydangerous in #4157
- Add more tests for pipeline execute global hook by @tommydangerous in #4159
- Update git settings when environment variables are set by @dy46 in #4154
- Update url logic for workspace manager by @dy46 in #4166
New Contributors
Full Changelog: 0.9.48...0.9.50
0.9.48 | The Boy and the Heron
What's Changed
🎉 Exciting New Features
Incremental data integration in batch pipelines
🥳 Data integrations in batch pipelines now support incremental replication! You can read more here to get started!
by @tommydangerous in #4068
[Streaming] RabbitMQ Destination
Another community PR from @shrutimantri adds support for RabbitMQ as a streaming data sink. 🔥
Check it out today with your favorite streaming sources! You can find the configuration reference here.
by @shrutimantri in #4041
Chroma integration
Mage now has a ChromaDB IO Class, meaning you can use data loaders and exporters in your batch pipelines to read/write from Chroma sources. You can read more about configuration here or visit Chroma's site to learn more about their vector database.
by @matrixstone in #4017
Bookmark overrides
🎊 If you're creating a trigger on a data integration, you can now override bookmarks with your own custom values!
by @tommydangerous in #4073
SQL Block environment variable interpolation
For our fans of SQL blocks, you can now interpolate environment variables directly in your queries!
SELECT
'{{ env_var("ENV") }}' AS test
, '{{ variables("test") }}' AS test2
, {{ test }} AS test3
This should allow for much greater flexibility in pipelines with SQL!
by @tommydangerous in #4076
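To show what the query above resolves to, here is a toy interpolation written with regular expressions. It stands in for a real template engine and is not how Mage renders SQL blocks; `render_sql` is a hypothetical helper that only understands the three placeholder forms used in the example.

```python
import os
import re
from typing import Any, Mapping, Optional

PLACEHOLDER = re.compile(r"\{\{(.*?)\}\}")
CALL = re.compile(r"""(env_var|variables)\(\s*["']([^"']+)["']\s*\)""")


def render_sql(sql: str, variables: Mapping[str, Any],
               env: Optional[Mapping[str, str]] = None) -> str:
    """Replace {{ env_var("X") }}, {{ variables("x") }} and {{ x }} placeholders."""
    env = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        expression = match.group(1).strip()
        call = CALL.fullmatch(expression)
        if call:
            # env_var(...) reads the environment, variables(...) reads
            # the pipeline variables.
            source = env if call.group(1) == "env_var" else variables
            return str(source[call.group(2)])
        # A bare name is looked up directly in the pipeline variables.
        return str(variables[expression])

    return PLACEHOLDER.sub(replace, sql)


query = (
    "SELECT '{{ env_var(\"ENV\") }}' AS test, "
    "'{{ variables(\"test\") }}' AS test2, "
    "{{ test }} AS test3"
)
rendered = render_sql(query, variables={"test": 1}, env={"ENV": "dev"})
```

Note that the quotes around a placeholder are part of the SQL, not the template: the first two values end up as string literals while the bare placeholder is inserted as-is.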
Additional upstream dependencies for dynamic children
Love dynamic blocks? 🤔 Their dynamic children can now have additional upstream dependencies!
by @tommydangerous in #4104
Support caching block output in memory
Previously, pipelines with large Spark DataFrames ran into out-of-heap-space errors when persisting block outputs to disk. This PR allows the user to disable persisting output. The feature is only supported in standard batch pipelines (without dynamic blocks) for now.
cache_block_output_in_memory: true
run_pipeline_in_one_process: true
by @wangxiaoyou1993 in #4127
🐛 Bug Fixes
- Backend API for getting information about bookmarks by @tommydangerous in #4070
- Support different operators when comparing bookmark properties by @tommydangerous in #4075
- Update backfill statuses by @johnson-mage in #3994
- Reduce block at any level UI by @tommydangerous in #4067
- Catch execption of empty integration streams in pipeline scheduler by @wangxiaoyou1993 in #4083
- Backfill's date-picker date value mismatch by @edmondwinston in #3972
- Gracefully access dictionaries in the Oauth Policy by @tommydangerous in #4086
- Pass tolerations to job pod by @wangxiaoyou1993 in #4089
- Fix load sample data for integration pipelines by @dy46 in #4034
- Default to using environment variables for git and workspace settings by @dy46 in #4088
- Fixed Google Ads Source by @Luishfs in #4099
- Fix chromadb dependency by @wangxiaoyou1993 in #4107
- Fix chromadb in all package by @wangxiaoyou1993 in #4108
- Update local timezone project setting from header by @johnson-mage in #4111
- Fix runtime variables not showing when creating new trigger by @tommydangerous in #4116
- Fix executing conditional blocks with pipeline executor by @wangxiaoyou1993 in #4120
- Remove itertools groupby by @dy46 in #4103
- Updates/nats add stream fixes by @mfreeman451 in #4113
- Update opentelemetry-exporter-prometheus package version by @dy46 in #4101
- Fix postgres streaming sink when there are no messages by @shrutimantri in #4074
💅 Enhancements & Polish
- Add any runtime variables by @tommydangerous in #4071
- Include message_events_json in Postmark messages_outbound stream by @wangxiaoyou1993 in #4085
- Hide "Unique" and "Key" columns for certain data integration destination blocks by @johnson-mage in #4096
- Add top padding to file code editor by @johnson-mage in #4098
- Display error in UI when variables directories configured incorrectly by @johnson-mage in #4091
- Improve the interface for Chroma class by @wangxiaoyou1993 in #4110
- Add column_header_format option by @dy46 in #4118
- Allow configuring Amplitude host by @wangxiaoyou1993 in #4060
New Contributors
- @andrewgetzdata made their first contribution in #4078
- @suvhotta made their first contribution in #4097
Full Changelog: 0.9.46...0.9.48
0.9.46 | Wish 🪄
What's Changed
🎉 Exciting New Features
⚡️ Spark UI/UX 2.0 for AWS EMR
It's finally here! Mage now comes with a completely revamped custom Spark UI/UX for our AWS EMR users! This is a huge update that comes with a complete overhaul of every element possible for managing your Spark cluster!
Check out the PR for more screenshots and get started today (docs coming soon)!
by @tommydangerous in #3997
🌊 Streaming: ActiveMQ Source
🔥 @shrutimantri is on fire! In another community PR, she adds streaming support for ActiveMQ as a source! If you're an ActiveMQ user, give it a shot today!
by @shrutimantri in #3978
📁 Download Files & Pipelines via the UI
Another community PR, this one from @PopaRares, allows you to download files and pipelines via the right-click menu in the Mage UI.
This will be a game changer for collaborative projects and importing/exporting data from Mage!
by @PopaRares in #3813
✨ Streaming: NATS JetStream Source
A big shoutout to community member @mfreeman451 for adding the NATS JetStream message broker as a Streaming Source in Mage!
by @mfreeman451 in #3985
🎏 Data integration: Kafka destination
@Luishfs is back at it with another destination— this one for data integration. You can now write DI outputs to a Kafka topic. We can't wait to see what y'all cook up with this one!
🛟 Auto-save triggers in code
Last, but certainly not least, Mage is now able to auto-save triggers as code. That means (when enabled) you can update triggers and have them auto-save to your Mage project. This should help you keep track of your trigger changes across projects.
by @tommydangerous in #4009
🐛 Bug Fixes
- Add pyarrow-hotfix to requirements.txt by @wangxiaoyou1993 in #3993
- Fix workspace user management fetching by @dy46 in #3992
- Handle hidden block positions when split view by @tommydangerous in #4005
- Don’t show SSH tunnel option unless kernel is PySpark by @tommydangerous in #4006
- Fix Monoco Editor for base path by @dy46 in #4012
- Update OAuth sign on and fix OAuth sign on with REQUIRE_USER_PERMISSIONS by @dy46 in #4007
- Clarification when applying bookmark to all streams by @johnson-mage in #4036 and #4049
- Allow using block configuration when run_pipeline_in_one_process is true by @wangxiaoyou1993 in #4046
- Only show Backfills in vertical nav for standard (python) pipelines by @johnson-mage in #4050
- Fix files not being selected in notebook by @tommydangerous in #4051
💅 Enhancements & Polish
- Added application/gzip support to API source by @Luishfs in #3990
- Re-enqueue the job if queue is empty by @wangxiaoyou1993 in #3996
- Add better error message for API source by @wangxiaoyou1993 in #4016
- Pass envFrom to job pod by @wangxiaoyou1993 in #4031
New Contributors
- @mfreeman451 made their first contribution in #3985
0.9.45 | Yuji Itadori 👹
What's Changed
🎉 Exciting New Features
🔐 New SSO/OAuth providers
With our latest release, Mage now supports SSO/OAuth from not one, but two providers— Okta & Google. Our engineers also thought ahead, laying the groundwork for supporting more providers in the future, too! Check out the docs— Google & Okta.
🔥 Compute management for Apache Spark blocks
Tommy is back at it with another massive PR, this one adding full support for compute management in Apache Spark blocks. For those of you who leverage Spark, this PR will allow you fine-grained control over your compute. Keep an eye out for completely revamped EMR functionality in the near future!
by @tommydangerous in #3883
🤗 HuggingFace AI Client
Mage now supports using different AI models for interfaces within the application, not just OpenAI! The first we've added is a HuggingFace client... You can now use HuggingFace with Mage's AI functionality! Read more about getting started here.
by @matrixstone in #3850 and #3919
🧱 Azure Databricks Delta Lake Destination
🎉 Mage now supports Azure Databricks Delta Lake as a destination for data integration pipelines! That means you can write all of your favorite sources to the open, parquet-based storage system on Microsoft's cloud infrastructure!
📊 Prometheus Metrics
Prometheus-style metrics are a vendor-neutral standard based on a pull model. Prometheus-enabled servers output time-series data on a route (usually /metrics), which can be scraped. Being an open standard, most monitoring tools know how to interface with Prometheus metrics (OpenTelemetry supports them too).
You can learn more about Prometheus here:
- https://prometheus.io/docs/concepts/data_model/
- https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md
This PR enables the basic built-in metrics, which cover the Tornado server (http metrics) and the Python runtime. More metrics can be added in the future! Check out our docs here.
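To illustrate what a scraper sees on the /metrics route, the sketch below renders and parses a sample in the Prometheus text exposition format. The metric name, help text, and labels are made up and are not the metrics Mage exposes; the parser is deliberately simplified and ignores optional timestamps and label escaping.

```python
from typing import Dict, Optional


def render_metric(name: str, metric_type: str, help_text: str, value: float,
                  labels: Optional[Dict[str, str]] = None) -> str:
    """Render one sample with its HELP and TYPE comment lines."""
    label_text = ""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_text = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} {metric_type}\n"
        f"{name}{label_text} {value}\n"
    )


def parse_samples(text: str) -> Dict[str, float]:
    """Map each series (name plus labels) to its value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


exposition = render_metric(
    "http_requests_total", "counter", "Total HTTP requests.", 3,
    labels={"method": "GET", "status": "200"},
)
scraped = parse_samples(exposition)
```

Because the server only exposes current values and the scraper pulls them on its own schedule, the application never needs to know where its metrics are stored.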
🐛 Bug Fixes
- Inject global and pipeline variables in the keyword arguments for extension blocks by @tommydangerous in #3917
- Fixed Stripe INCREMENTAL run and updated tap by @Luishfs in #3835
- Break while loop when not raising error on failure for pipeline triggered by code by @johnson-mage in #3933
- Make 2nd argument of lambda function optional by @tommydangerous in #3941
- Update git settings permissions by @dy46 in #3935
- Fix saving tokens when creating triggers with code by @dy46 in #3938
- Misc workspace changes by @dy46 in #3931
- Fix MongoDB destination and add unit test by @wangxiaoyou1993 in #3944
- Fix SQL destination reserved words by @wangxiaoyou1993 in #3951
- Fix notebook block ordering for upstream blocks by @tommydangerous in #3955
- Fix Dockerfile and API when using EMR by @tommydangerous in #3960
- Fix unit tests warnings and errors by @wangxiaoyou1993 in #3961
- Clean column name when using batch load in Snowflake destination by @wangxiaoyou1993 in #3968
- Serialize Snowflake dataframe dict column to json if column type is string by @wangxiaoyou1993 in #3969
- Move dbt seed logic to downstream block by @wangxiaoyou1993 in #3953
- Fix project dashboard overview count formatting by @johnson-mage in #3980
- Fix pipeline scheduler for integration pipelines by @dy46 in #3981
- Fix roles getting overwritten when updating profile by @dy46 in #3982
- Fix tree for data integration pipeline by @tommydangerous in #3986
- Minor bug fix in Pinot config on io_config.yaml by @shrutimantri in #3970
💅 Enhancements & Polish
- Support pipeline level EMR config by @wangxiaoyou1993 in #3922
- Add keyboard shortcuts for inserting new scratchpad cell by @anniexcheng in #3926
- Add colour code for pipeline backfills by @edmondwinston in #3904
- Show dependency graph zoom options by @anniexcheng in #3899
- Update how global data products are run by @dy46 in #3872
- Consistent run status colors across tables by @johnson-mage in #3940
- Enable users to cancel in progress runs when disabling a pipeline trigger by @anniexcheng in #3905
- Allow getting instance type from environment variable by @wangxiaoyou1993 in #3949
- Include Monaco Editor in build to avoid fetching from CDN by @johnson-mage in #3916
- Interpolate variables and upstream block output in dbt commands by @tommydangerous in #3945
- Update duckdb version by @dy46 in #3959
- Update backfill variables by @johnson-mage in #3963
- Always show "Overwrite global variables" setting when editing a trigger by @johnson-mage in #3973
- Add pipeline run limit for a pipeline by @dy46 in #3868
- Use personal access token if available by @dy46 in #3974
- Add exception failure message in callbacks by @dy46 in #3952
- Support override assignPublicIp and enableExecuteCommand in EcsConfig by @wangxiaoyou1993 in #3966
- Add kafka api_version to the data loader and data exporter templates by @shrutimantri in #3967
Full Changelog: 0.9.43...0.9.45
0.9.43 | Attack on Titan 💥
What's Changed
🎉 Exciting New Features
🌳 Dependency Tree 2.0
This one is huge— a complete rewrite of our dependency tree functionality! You'll notice an improved appearance and performance in addition to the following:
- A full right click menu
- The ability to add blocks between nodes
- The ability to remove blocks using the menu
- Improved connectivity through dragging lines
- Block groups and subgroups
- The ability to drag and drop blocks to connect
- The ability to drag and drop groups of blocks
- Double click to see all connections
- New block execution animation
This new feature set is so expansive you'll need to upgrade to check it out. See this PR for a walkthrough of all the features or update to the latest version to try it out today!
by @tommydangerous in #3886
🍷 Apache Pinot Data Loader
Pinot users rejoice!
Community member @shrutimantri has added a brand new data loader for the OLAP datastore! Update to the latest version to try it out.
by @shrutimantri in #3898
📊 Google Sheets Data Loader/Exporters
Who doesn't use Google Sheets?
Mage now supports reading and writing to individual Sheets/Tabs natively! This PR also includes some handy loader/exporter templates to make it easy.
Get started with your Sheets data in Mage today!
🧱 Redis dependency added to Mage Helm chart
Thanks to community member @sriniarul for adding a Redis dependency to our Helm chart! Mage now supports multiple replicas. They also added a standalone scheduler and webserver option to the chart. If you use Redis, you'll want to give this one a look!
by @sriniarul in mage-ai/helm-charts#22
🐛 Bug Fixes
- Fix memory leak and warning by @wangxiaoyou1993 in #3881
- Fix permissions policy for owners by @tommydangerous in #3884
- Revert pipeline type when error occurs switching type by @johnson-mage in #3870
- Fix Monaco editor theme by @anniexcheng in #3882
- Update druid.py to have correct method name by @shrutimantri in #3892
- Fix sftp client side exception by @dy46 in #3901
- Fix EMR Spark pipeline issues by @wangxiaoyou1993 in #3894
- Fix pipeline LIST API performance by @wangxiaoyou1993 in #3903
- Fix JSON serializing Timestamp when exporting to BigQuery by @tommydangerous in #3908
- Restrict the version of openai by @wangxiaoyou1993 in #3912
💅 Enhancements & Polish
- Allow sending emails without TLS by @wangxiaoyou1993 in #3880
- Customize k8s executor by @dhia-gharsallaoui in #3535
- Speed up Pipeline UPDATE API by @wangxiaoyou1993 in #3909
- Add test functions for AI class by @matrixstone in #3818
- Add more unit tests to k8s job manager by @wangxiaoyou1993 in #3885
- Add "No tags" filtering option for pipelines by @edmondwinston in #3867
- Add keyboard shortcuts for inserting new scratchpad cell by @anniexcheng in #3889
- Update header for compute management w/ fixes by @tommydangerous in #3893
Full Changelog: 0.9.41...0.9.43
0.9.41 | Halloween 🎃👻
What's Changed
🎉 Exciting New Features
Workspace Lifecycle Management
🎉 Mage now provides support for managing the workspace lifecycles in Kubernetes! That means you can control how Mage is deployed, start-to-finish, with the following options:
- Auto-termination
- Pre-start scripts
- Post-start scripts
Read more about lifecycle management here and give it a shot today!
Elasticsearch Data Integration Destination
Mage now supports writing data to Elasticsearch for all of your search & LLM needs! 🧙🏻‍♂️
Block Detach for Shared Pipelines
Ok, this one is huge— say you have a block in multiple pipelines, but you need to change the logic in a single block instance... That sounds tricky, right? 🤔
Now you can with Block Detach! Simply click # Pipelines on the block, then Detach to create a clone of the block in your current pipeline!
by @johnson-mage in #3816
🐛 Bug Fixes
- Fix disabled keyboard shortcuts due to Pipeline Runs table keyboard nav by @anniexcheng in #3833
- Allow admin users to read attributes on users list by @tommydangerous in #3837
- Fix libodbc conflicts in Dockerfile by @wangxiaoyou1993 in #3840 and #3845
- Fix save block functionality after pipeline execution by @anniexcheng in #3839
- Fix GHE by @dy46 in #3841
- Fix dynamic children not running and its downstream by @tommydangerous in #3847
- Only fetch spark jobs if compute is enabled by @tommydangerous in #3851
- Remove pymssql dependency by @wangxiaoyou1993 in #3859
- Reposition file browser context menu by @edmondwinston in #3819
- Send notification on block run initialization failure by @wangxiaoyou1993 in #3861
- Update authorize_query check by @dy46 in #3846
💅 Enhancements & Polish
- Make current time button transparent by default by @anniexcheng in #3829
- Expanded our vocabulary by @MageKai in #3856
- Added magical nouns by @MageKai in #3857
- Improve app header styling by @anniexcheng in #3849
- Add ctrl/cmd + click keyboard shortcut for selecting pipeline run rows by @anniexcheng in #3843
- Use read_namespaced_job instead of read_namespaced_job_status by @wangxiaoyou1993 in #3863
- Support text/csv response type in API source by @wangxiaoyou1993 in #3864
- Automatically clean up cached data integration files by @wangxiaoyou1993 in #3869
- Bump snowflake-connector-python version by @dy46 in #3871 and #3873
- Improve base DI destination and add unit tests by @wangxiaoyou1993 in #3875
- Add aws_session_token support to get_aws_boto3_client by @nyc-de in #3877
- Add clone action to version control by @dy46 in #3878
- Added test connection and new index naming by @Luishfs in #3848
- Make sure file browser context menu is always fully visible in the viewport when open by @anniexcheng in #3855
Full Changelog: 0.9.38...0.9.41